* [PATCH 0/3] AVX guest implementation
@ 2022-04-18 17:39 Paul Brook
  2022-04-18 17:39 ` [PATCH 1/4] Add AVX_EN hflag Paul Brook
                   ` (45 more replies)
  0 siblings, 46 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-18 17:39 UTC
  To: qemu-devel; +Cc: Eduardo Habkost, Paolo Bonzini, Richard Henderson, Paul Brook

Patch series to implement AVX/AVX2 guest support in TCG.

All the system-level code for this (cpuid, xsave, wider registers, etc.)
already exists; we just need to implement the instruction translation.

The majority of the new 256-bit operations operate on each 128-bit
"lane" independently, so in theory we could use a single set of 128-bit
helpers to implement both widths piecemeal. However, this would further
complicate the already over-long gen_sse function. Instead I chose to
generate a whole new set of 256-bit "ymm" helpers using the framework
already in place for 64/128-bit mm/xmm operations.
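
A minimal sketch of the "one template, several widths" idea described
above. It is not code from this series: SketchReg, SKETCH_DEFINE_PADDB and
the sketch_paddb_* names are hypothetical, and a function-generating macro
stands in for the repeated inclusion of ops_sse.h with a different SHIFT.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register wide enough for MMX (8), XMM (16) and YMM (32) bytes. */
typedef struct {
    uint8_t b[32];
} SketchReg;

/*
 * Stamp out one helper per register width. "shift" plays the role of SHIFT
 * in the series: 0 = 64-bit, 1 = 128-bit, 2 = 256-bit, i.e. 8 << shift bytes.
 */
#define SKETCH_DEFINE_PADDB(suffix, shift)                                \
    static inline void sketch_paddb_##suffix(SketchReg *d,                \
                                             const SketchReg *v,          \
                                             const SketchReg *s)          \
    {                                                                     \
        for (int i = 0; i < (8 << (shift)); i++) {                        \
            d->b[i] = (uint8_t)(v->b[i] + s->b[i]);                       \
        }                                                                 \
    }

SKETCH_DEFINE_PADDB(mmx, 0)
SKETCH_DEFINE_PADDB(xmm, 1)
SKETCH_DEFINE_PADDB(ymm, 2)

/* Usage: the xmm helper touches 16 bytes, the ymm helper all 32. */
static inline void sketch_paddb_usage(void)
{
    SketchReg v, s, d;
    for (int i = 0; i < 32; i++) {
        v.b[i] = (uint8_t)i;
        s.b[i] = 1;
        d.b[i] = 0xAA;
    }
    sketch_paddb_xmm(&d, &v, &s);
    assert(d.b[15] == 16 && d.b[16] == 0xAA);
    sketch_paddb_ymm(&d, &v, &s);
    assert(d.b[31] == 32);
}
```

Because a byte-wise add never crosses a lane boundary, the 256-bit variant
is simply the same loop over twice as many elements; operations that do
depend on lane position would need per-lane offsets instead.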

I've added the tests I used during development to the linux-user
testsuite, and also ran them manually inside a Debian x86-64 guest.

Apologies for the big patch, but I can't think of a good way to split
the bulk of the instruction translation.

Paul Brook (4):
  Add AVX_EN hflag
  TCG support for AVX
  Enable all x86-64 cpu features in user mode
  AVX tests

 linux-user/x86_64/target_elf.h |    2 +-
 target/i386/cpu.c              |    8 +-
 target/i386/cpu.h              |    3 +
 target/i386/helper.c           |   12 +
 target/i386/helper.h           |    2 +
 target/i386/ops_sse.h          | 2606 +++++++++++++-----
 target/i386/ops_sse_header.h   |  364 ++-
 target/i386/tcg/fpu_helper.c   |    4 +
 target/i386/tcg/translate.c    | 1902 ++++++++++---
 tests/tcg/i386/Makefile.target |   10 +-
 tests/tcg/i386/README          |    9 +
 tests/tcg/i386/test-avx.c      |  347 +++
 tests/tcg/i386/test-avx.py     |  352 +++
 tests/tcg/i386/x86.csv         | 4658 ++++++++++++++++++++++++++++++++
 14 files changed, 8988 insertions(+), 1291 deletions(-)
 create mode 100644 tests/tcg/i386/test-avx.c
 create mode 100755 tests/tcg/i386/test-avx.py
 create mode 100644 tests/tcg/i386/x86.csv

-- 
2.35.2




* [PATCH 1/4] Add AVX_EN hflag
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
@ 2022-04-18 17:39 ` Paul Brook
  2022-04-18 17:39 ` [PATCH 2/4] TCG support for AVX Paul Brook
                   ` (44 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-18 17:39 UTC
  To: qemu-devel; +Cc: Eduardo Habkost, Paolo Bonzini, Richard Henderson, Paul Brook

Add a new hflag bit to determine whether AVX instructions are allowed.
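
A standalone sketch of the enable condition this flag caches, assuming
the architectural bit positions CR4.OSXSAVE = bit 18, XCR0.SSE = bit 1 and
XCR0.AVX = bit 2. The SKETCH_* names and sketch_avx_enabled() are
hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions; the macro names are illustrative only. */
#define SKETCH_CR4_OSXSAVE  (UINT64_C(1) << 18) /* OS has enabled XSAVE/XSETBV */
#define SKETCH_XCR0_SSE     (UINT64_C(1) << 1)  /* XMM state enabled */
#define SKETCH_XCR0_AVX     (UINT64_C(1) << 2)  /* upper YMM halves enabled */

/* AVX is usable only when OSXSAVE is set and XCR0 enables both SSE and AVX state. */
static inline bool sketch_avx_enabled(uint64_t cr4, uint64_t xcr0)
{
    const uint64_t need = SKETCH_XCR0_SSE | SKETCH_XCR0_AVX;

    return (cr4 & SKETCH_CR4_OSXSAVE) != 0 && (xcr0 & need) == need;
}

/* Usage: every one of the three bits is required. */
static inline void sketch_avx_enabled_usage(void)
{
    assert(sketch_avx_enabled(SKETCH_CR4_OSXSAVE, 0x7));
    assert(!sketch_avx_enabled(0, 0x7));
    assert(!sketch_avx_enabled(SKETCH_CR4_OSXSAVE, 0x3));
}
```

Since the result depends only on CR4 and XCR0, it only needs recomputing
when one of those two registers is written.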

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/cpu.h            |  3 +++
 target/i386/helper.c         | 12 ++++++++++++
 target/i386/tcg/fpu_helper.c |  1 +
 3 files changed, 16 insertions(+)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 982c532353..0c7162e2fd 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -168,6 +168,7 @@ typedef enum X86Seg {
 #define HF_MPX_EN_SHIFT     25 /* MPX Enabled (CR4+XCR0+BNDCFGx) */
 #define HF_MPX_IU_SHIFT     26 /* BND registers in-use */
 #define HF_UMIP_SHIFT       27 /* CR4.UMIP */
+#define HF_AVX_EN_SHIFT     28 /* AVX Enabled (CR4+XCR0) */
 
 #define HF_CPL_MASK          (3 << HF_CPL_SHIFT)
 #define HF_INHIBIT_IRQ_MASK  (1 << HF_INHIBIT_IRQ_SHIFT)
@@ -194,6 +195,7 @@ typedef enum X86Seg {
 #define HF_MPX_EN_MASK       (1 << HF_MPX_EN_SHIFT)
 #define HF_MPX_IU_MASK       (1 << HF_MPX_IU_SHIFT)
 #define HF_UMIP_MASK         (1 << HF_UMIP_SHIFT)
+#define HF_AVX_EN_MASK       (1 << HF_AVX_EN_SHIFT)
 
 /* hflags2 */
 
@@ -2045,6 +2047,7 @@ void host_cpuid(uint32_t function, uint32_t count,
 
 /* helper.c */
 void x86_cpu_set_a20(X86CPU *cpu, int a20_state);
+void cpu_sync_avx_hflag(CPUX86State *env);
 
 #ifndef CONFIG_USER_ONLY
 static inline int x86_asidx_from_attrs(CPUState *cs, MemTxAttrs attrs)
diff --git a/target/i386/helper.c b/target/i386/helper.c
index fa409e9c44..30083c9cff 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -29,6 +29,17 @@
 #endif
 #include "qemu/log.h"
 
+void cpu_sync_avx_hflag(CPUX86State *env)
+{
+    if ((env->cr[4] & CR4_OSXSAVE_MASK)
+        && (env->xcr0 & (XSTATE_SSE_MASK | XSTATE_YMM_MASK))
+            == (XSTATE_SSE_MASK | XSTATE_YMM_MASK)) {
+        env->hflags |= HF_AVX_EN_MASK;
+    } else {
+        env->hflags &= ~HF_AVX_EN_MASK;
+    }
+}
+
 void cpu_sync_bndcs_hflags(CPUX86State *env)
 {
     uint32_t hflags = env->hflags;
@@ -209,6 +220,7 @@ void cpu_x86_update_cr4(CPUX86State *env, uint32_t new_cr4)
     env->hflags = hflags;
 
     cpu_sync_bndcs_hflags(env);
+    cpu_sync_avx_hflag(env);
 }
 
 #if !defined(CONFIG_USER_ONLY)
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index ebf5e73df9..b391b69635 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -2943,6 +2943,7 @@ void helper_xsetbv(CPUX86State *env, uint32_t ecx, uint64_t mask)
 
     env->xcr0 = mask;
     cpu_sync_bndcs_hflags(env);
+    cpu_sync_avx_hflag(env);
     return;
 
  do_gpf:
-- 
2.35.2




* [PATCH 2/4] TCG support for AVX
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
  2022-04-18 17:39 ` [PATCH 1/4] Add AVX_EN hflag Paul Brook
@ 2022-04-18 17:39 ` Paul Brook
  2022-04-18 19:33   ` Peter Maydell
  2022-04-18 17:39 ` [PATCH 3/4] Enable all x86-64 cpu features in user mode Paul Brook
                   ` (43 subsequent siblings)
  45 siblings, 1 reply; 67+ messages in thread
From: Paul Brook @ 2022-04-18 17:39 UTC
  To: qemu-devel; +Cc: Eduardo Habkost, Paolo Bonzini, Richard Henderson, Paul Brook

Add TCG translation of guest AVX/AVX2 instructions.
This comprises:

* VEX encodings of most (all?) "legacy" SSE operations.
  These typically add an extra source operand, and clear the unused upper
  half of the destination register (SSE encodings leave it unchanged).
  Previously we were incorrectly translating VEX-encoded instructions
  as if they were legacy SSE encodings.
* 256-bit variants of many instructions. AVX adds floating-point
  operations; AVX2 adds integer operations.
* A few new instructions (VBROADCAST, VGATHER, VZERO).
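
To make the first point concrete, here is a minimal sketch contrasting
the legacy and VEX.128 forms of a packed 64-bit add. SketchYmm and the
sketch_* functions are hypothetical illustrations, not helpers from this
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 256-bit register as four 64-bit quads; q[2..3] are bits 255:128. */
typedef struct {
    uint64_t q[4];
} SketchYmm;

/* Legacy SSE form (paddq xmm1, xmm2): d is also a source, upper half preserved. */
static inline void sketch_paddq_sse(SketchYmm *d, const SketchYmm *s)
{
    d->q[0] += s->q[0];
    d->q[1] += s->q[1];
}

/* VEX.128 form (vpaddq xmm1, xmm2, xmm3): extra source v, upper half zeroed. */
static inline void sketch_vpaddq_vex128(SketchYmm *d, const SketchYmm *v,
                                        const SketchYmm *s)
{
    uint64_t lo = v->q[0] + s->q[0];
    uint64_t hi = v->q[1] + s->q[1];

    d->q[0] = lo;
    d->q[1] = hi;
    d->q[2] = 0;
    d->q[3] = 0;
}

/* Usage: same inputs, different treatment of bits 255:128. */
static inline void sketch_vex_usage(void)
{
    SketchYmm a = { { 1, 2, 3, 4 } };
    SketchYmm b = { { 10, 20, 30, 40 } };
    SketchYmm d = a;

    sketch_paddq_sse(&d, &b);
    assert(d.q[1] == 22 && d.q[2] == 3 && d.q[3] == 4);
    sketch_vpaddq_vex128(&d, &a, &b);
    assert(d.q[1] == 22 && d.q[2] == 0 && d.q[3] == 0);
}
```

Translating a VEX-encoded instruction as if it were the legacy form would
leave stale data in the upper half, which is the incorrect behaviour the
commit message refers to.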

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/cpu.c            |    8 +-
 target/i386/helper.h         |    2 +
 target/i386/ops_sse.h        | 2606 ++++++++++++++++++++++++----------
 target/i386/ops_sse_header.h |  364 +++--
 target/i386/tcg/fpu_helper.c |    3 +
 target/i386/tcg/translate.c  | 1902 +++++++++++++++++++------
 6 files changed, 3597 insertions(+), 1288 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index cb6b5467d0..494f01959d 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -625,12 +625,12 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
           CPUID_EXT_SSE41 | CPUID_EXT_SSE42 | CPUID_EXT_POPCNT | \
           CPUID_EXT_XSAVE | /* CPUID_EXT_OSXSAVE is dynamic */   \
           CPUID_EXT_MOVBE | CPUID_EXT_AES | CPUID_EXT_HYPERVISOR | \
-          CPUID_EXT_RDRAND)
+          CPUID_EXT_RDRAND | CPUID_EXT_AVX)
           /* missing:
           CPUID_EXT_DTES64, CPUID_EXT_DSCPL, CPUID_EXT_VMX, CPUID_EXT_SMX,
           CPUID_EXT_EST, CPUID_EXT_TM2, CPUID_EXT_CID, CPUID_EXT_FMA,
           CPUID_EXT_XTPR, CPUID_EXT_PDCM, CPUID_EXT_PCID, CPUID_EXT_DCA,
-          CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER, CPUID_EXT_AVX,
+          CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER,
           CPUID_EXT_F16C */
 
 #ifdef TARGET_X86_64
@@ -653,9 +653,9 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
           CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ADX | \
           CPUID_7_0_EBX_PCOMMIT | CPUID_7_0_EBX_CLFLUSHOPT |            \
           CPUID_7_0_EBX_CLWB | CPUID_7_0_EBX_MPX | CPUID_7_0_EBX_FSGSBASE | \
-          CPUID_7_0_EBX_ERMS)
+          CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_AVX2)
           /* missing:
-          CPUID_7_0_EBX_HLE, CPUID_7_0_EBX_AVX2,
+          CPUID_7_0_EBX_HLE,
           CPUID_7_0_EBX_INVPCID, CPUID_7_0_EBX_RTM,
           CPUID_7_0_EBX_RDSEED */
 #define TCG_7_0_ECX_FEATURES (CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU | \
diff --git a/target/i386/helper.h b/target/i386/helper.h
index ac3b4d1ee3..3da5df98b9 100644
--- a/target/i386/helper.h
+++ b/target/i386/helper.h
@@ -218,6 +218,8 @@ DEF_HELPER_3(movq, void, env, ptr, ptr)
 #include "ops_sse_header.h"
 #define SHIFT 1
 #include "ops_sse_header.h"
+#define SHIFT 2
+#include "ops_sse_header.h"
 
 DEF_HELPER_3(rclb, tl, env, tl, tl)
 DEF_HELPER_3(rclw, tl, env, tl, tl)
diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 6f1fc174b3..9cd7b2875e 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -23,6 +23,7 @@
 #if SHIFT == 0
 #define Reg MMXReg
 #define XMM_ONLY(...)
+#define YMM_ONLY(...)
 #define B(n) MMX_B(n)
 #define W(n) MMX_W(n)
 #define L(n) MMX_L(n)
@@ -35,260 +36,355 @@
 #define W(n) ZMM_W(n)
 #define L(n) ZMM_L(n)
 #define Q(n) ZMM_Q(n)
+#if SHIFT == 1
 #define SUFFIX _xmm
+#define YMM_ONLY(...)
+#else
+#define SUFFIX _ymm
+#define YMM_ONLY(...) __VA_ARGS__
+#endif
+#endif
+
+#if SHIFT == 0
+#define SHIFT_HELPER_BODY(n, elem, F) do {      \
+    d->elem(0) = F(s->elem(0), shift);          \
+    if ((n) > 1) {                              \
+        d->elem(1) = F(s->elem(1), shift);      \
+    }                                           \
+    if ((n) > 2) {                              \
+        d->elem(2) = F(s->elem(2), shift);      \
+        d->elem(3) = F(s->elem(3), shift);      \
+    }                                           \
+    if ((n) > 4) {                              \
+        d->elem(4) = F(s->elem(4), shift);      \
+        d->elem(5) = F(s->elem(5), shift);      \
+        d->elem(6) = F(s->elem(6), shift);      \
+        d->elem(7) = F(s->elem(7), shift);      \
+    }                                           \
+    if ((n) > 8) {                              \
+        d->elem(8) = F(s->elem(8), shift);      \
+        d->elem(9) = F(s->elem(9), shift);      \
+        d->elem(10) = F(s->elem(10), shift);    \
+        d->elem(11) = F(s->elem(11), shift);    \
+        d->elem(12) = F(s->elem(12), shift);    \
+        d->elem(13) = F(s->elem(13), shift);    \
+        d->elem(14) = F(s->elem(14), shift);    \
+        d->elem(15) = F(s->elem(15), shift);    \
+    }                                           \
+    } while (0)
+
+#define FPSRL(x, c) ((x) >> shift)
+#define FPSRAW(x, c) ((int16_t)(x) >> shift)
+#define FPSRAL(x, c) ((int32_t)(x) >> shift)
+#define FPSLL(x, c) ((x) << shift)
 #endif
 
-void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
     int shift;
-
-    if (s->Q(0) > 15) {
+    if (c->Q(0) > 15) {
         d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+        XMM_ONLY(d->Q(1) = 0;)
+        YMM_ONLY(
+                d->Q(2) = 0;
+                d->Q(3) = 0;
+                )
     } else {
-        shift = s->B(0);
-        d->W(0) >>= shift;
-        d->W(1) >>= shift;
-        d->W(2) >>= shift;
-        d->W(3) >>= shift;
-#if SHIFT == 1
-        d->W(4) >>= shift;
-        d->W(5) >>= shift;
-        d->W(6) >>= shift;
-        d->W(7) >>= shift;
-#endif
+        shift = c->B(0);
+        SHIFT_HELPER_BODY(4 << SHIFT, W, FPSRL);
     }
 }
 
-void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
     int shift;
-
-    if (s->Q(0) > 15) {
-        shift = 15;
+    if (c->Q(0) > 15) {
+        d->Q(0) = 0;
+        XMM_ONLY(d->Q(1) = 0;)
+        YMM_ONLY(
+                d->Q(2) = 0;
+                d->Q(3) = 0;
+                )
     } else {
-        shift = s->B(0);
+        shift = c->B(0);
+        SHIFT_HELPER_BODY(4 << SHIFT, W, FPSLL);
     }
-    d->W(0) = (int16_t)d->W(0) >> shift;
-    d->W(1) = (int16_t)d->W(1) >> shift;
-    d->W(2) = (int16_t)d->W(2) >> shift;
-    d->W(3) = (int16_t)d->W(3) >> shift;
-#if SHIFT == 1
-    d->W(4) = (int16_t)d->W(4) >> shift;
-    d->W(5) = (int16_t)d->W(5) >> shift;
-    d->W(6) = (int16_t)d->W(6) >> shift;
-    d->W(7) = (int16_t)d->W(7) >> shift;
-#endif
 }
 
-void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
     int shift;
-
-    if (s->Q(0) > 15) {
-        d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+    if (c->Q(0) > 15) {
+        shift = 15;
     } else {
-        shift = s->B(0);
-        d->W(0) <<= shift;
-        d->W(1) <<= shift;
-        d->W(2) <<= shift;
-        d->W(3) <<= shift;
-#if SHIFT == 1
-        d->W(4) <<= shift;
-        d->W(5) <<= shift;
-        d->W(6) <<= shift;
-        d->W(7) <<= shift;
-#endif
+        shift = c->B(0);
     }
+    SHIFT_HELPER_BODY(4 << SHIFT, W, FPSRAW);
 }
 
-void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
     int shift;
-
-    if (s->Q(0) > 31) {
+    if (c->Q(0) > 31) {
         d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+        XMM_ONLY(d->Q(1) = 0;)
+        YMM_ONLY(
+                d->Q(2) = 0;
+                d->Q(3) = 0;
+                )
     } else {
-        shift = s->B(0);
-        d->L(0) >>= shift;
-        d->L(1) >>= shift;
-#if SHIFT == 1
-        d->L(2) >>= shift;
-        d->L(3) >>= shift;
-#endif
+        shift = c->B(0);
+        SHIFT_HELPER_BODY(2 << SHIFT, L, FPSRL);
     }
 }
 
-void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
     int shift;
-
-    if (s->Q(0) > 31) {
-        shift = 31;
+    if (c->Q(0) > 31) {
+        d->Q(0) = 0;
+        XMM_ONLY(d->Q(1) = 0;)
+        YMM_ONLY(
+                d->Q(2) = 0;
+                d->Q(3) = 0;
+                )
     } else {
-        shift = s->B(0);
+        shift = c->B(0);
+        SHIFT_HELPER_BODY(2 << SHIFT, L, FPSLL);
     }
-    d->L(0) = (int32_t)d->L(0) >> shift;
-    d->L(1) = (int32_t)d->L(1) >> shift;
-#if SHIFT == 1
-    d->L(2) = (int32_t)d->L(2) >> shift;
-    d->L(3) = (int32_t)d->L(3) >> shift;
-#endif
 }
 
-void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
     int shift;
-
-    if (s->Q(0) > 31) {
-        d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+    if (c->Q(0) > 31) {
+        shift = 31;
     } else {
-        shift = s->B(0);
-        d->L(0) <<= shift;
-        d->L(1) <<= shift;
-#if SHIFT == 1
-        d->L(2) <<= shift;
-        d->L(3) <<= shift;
-#endif
+        shift = c->B(0);
     }
+    SHIFT_HELPER_BODY(2 << SHIFT, L, FPSRAL);
 }
 
-void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
     int shift;
-
-    if (s->Q(0) > 63) {
+    if (c->Q(0) > 63) {
         d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+        XMM_ONLY(d->Q(1) = 0;)
+        YMM_ONLY(
+                d->Q(2) = 0;
+                d->Q(3) = 0;
+                )
     } else {
-        shift = s->B(0);
-        d->Q(0) >>= shift;
-#if SHIFT == 1
-        d->Q(1) >>= shift;
-#endif
+        shift = c->B(0);
+        SHIFT_HELPER_BODY(1 << SHIFT, Q, FPSRL);
     }
 }
 
-void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
     int shift;
-
-    if (s->Q(0) > 63) {
+    if (c->Q(0) > 63) {
         d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+        XMM_ONLY(d->Q(1) = 0;)
+        YMM_ONLY(
+                d->Q(2) = 0;
+                d->Q(3) = 0;
+                )
     } else {
-        shift = s->B(0);
-        d->Q(0) <<= shift;
-#if SHIFT == 1
-        d->Q(1) <<= shift;
-#endif
+        shift = c->B(0);
+        SHIFT_HELPER_BODY(1 << SHIFT, Q, FPSLL);
     }
 }
 
-#if SHIFT == 1
-void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+#if SHIFT >= 1
+void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
     int shift, i;
 
-    shift = s->L(0);
+    shift = c->L(0);
     if (shift > 16) {
         shift = 16;
     }
     for (i = 0; i < 16 - shift; i++) {
-        d->B(i) = d->B(i + shift);
+        d->B(i) = s->B(i + shift);
     }
     for (i = 16 - shift; i < 16; i++) {
         d->B(i) = 0;
     }
+#if SHIFT == 2
+    for (i = 0; i < 16 - shift; i++) {
+        d->B(i + 16) = s->B(i + 16 + shift);
+    }
+    for (i = 16 - shift; i < 16; i++) {
+        d->B(i + 16) = 0;
+    }
+#endif
 }
 
-void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
     int shift, i;
 
-    shift = s->L(0);
+    shift = c->L(0);
     if (shift > 16) {
         shift = 16;
     }
     for (i = 15; i >= shift; i--) {
-        d->B(i) = d->B(i - shift);
+        d->B(i) = s->B(i - shift);
     }
     for (i = 0; i < shift; i++) {
         d->B(i) = 0;
     }
+#if SHIFT == 2
+    for (i = 15; i >= shift; i--) {
+        d->B(i + 16) = s->B(i + 16 - shift);
+    }
+    for (i = 0; i < shift; i++) {
+        d->B(i + 16) = 0;
+    }
+#endif
 }
 #endif
 
-#define SSE_HELPER_B(name, F)                                   \
+#define SSE_HELPER_1(name, elem, num, F)                                   \
     void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)   \
     {                                                           \
-        d->B(0) = F(d->B(0), s->B(0));                          \
-        d->B(1) = F(d->B(1), s->B(1));                          \
-        d->B(2) = F(d->B(2), s->B(2));                          \
-        d->B(3) = F(d->B(3), s->B(3));                          \
-        d->B(4) = F(d->B(4), s->B(4));                          \
-        d->B(5) = F(d->B(5), s->B(5));                          \
-        d->B(6) = F(d->B(6), s->B(6));                          \
-        d->B(7) = F(d->B(7), s->B(7));                          \
+        d->elem(0) = F(s->elem(0));                             \
+        d->elem(1) = F(s->elem(1));                             \
+        if ((num << SHIFT) > 2) {                               \
+            d->elem(2) = F(s->elem(2));                         \
+            d->elem(3) = F(s->elem(3));                         \
+        }                                                       \
+        if ((num << SHIFT) > 4) {                               \
+            d->elem(4) = F(s->elem(4));                         \
+            d->elem(5) = F(s->elem(5));                         \
+            d->elem(6) = F(s->elem(6));                         \
+            d->elem(7) = F(s->elem(7));                         \
+        }                                                       \
+        if ((num << SHIFT) > 8) {                               \
+            d->elem(8) = F(s->elem(8));                         \
+            d->elem(9) = F(s->elem(9));                         \
+            d->elem(10) = F(s->elem(10));                       \
+            d->elem(11) = F(s->elem(11));                       \
+            d->elem(12) = F(s->elem(12));                       \
+            d->elem(13) = F(s->elem(13));                       \
+            d->elem(14) = F(s->elem(14));                       \
+            d->elem(15) = F(s->elem(15));                       \
+        }                                                       \
+        if ((num << SHIFT) > 16) {                              \
+            d->elem(16) = F(s->elem(16));                       \
+            d->elem(17) = F(s->elem(17));                       \
+            d->elem(18) = F(s->elem(18));                       \
+            d->elem(19) = F(s->elem(19));                       \
+            d->elem(20) = F(s->elem(20));                       \
+            d->elem(21) = F(s->elem(21));                       \
+            d->elem(22) = F(s->elem(22));                       \
+            d->elem(23) = F(s->elem(23));                       \
+            d->elem(24) = F(s->elem(24));                       \
+            d->elem(25) = F(s->elem(25));                       \
+            d->elem(26) = F(s->elem(26));                       \
+            d->elem(27) = F(s->elem(27));                       \
+            d->elem(28) = F(s->elem(28));                       \
+            d->elem(29) = F(s->elem(29));                       \
+            d->elem(30) = F(s->elem(30));                       \
+            d->elem(31) = F(s->elem(31));                       \
+        }                                                       \
+    }
+
+#define SSE_HELPER_B(name, F)                                   \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
+    {                                                           \
+        d->B(0) = F(v->B(0), s->B(0));                          \
+        d->B(1) = F(v->B(1), s->B(1));                          \
+        d->B(2) = F(v->B(2), s->B(2));                          \
+        d->B(3) = F(v->B(3), s->B(3));                          \
+        d->B(4) = F(v->B(4), s->B(4));                          \
+        d->B(5) = F(v->B(5), s->B(5));                          \
+        d->B(6) = F(v->B(6), s->B(6));                          \
+        d->B(7) = F(v->B(7), s->B(7));                          \
         XMM_ONLY(                                               \
-                 d->B(8) = F(d->B(8), s->B(8));                 \
-                 d->B(9) = F(d->B(9), s->B(9));                 \
-                 d->B(10) = F(d->B(10), s->B(10));              \
-                 d->B(11) = F(d->B(11), s->B(11));              \
-                 d->B(12) = F(d->B(12), s->B(12));              \
-                 d->B(13) = F(d->B(13), s->B(13));              \
-                 d->B(14) = F(d->B(14), s->B(14));              \
-                 d->B(15) = F(d->B(15), s->B(15));              \
+                 d->B(8) = F(v->B(8), s->B(8));                 \
+                 d->B(9) = F(v->B(9), s->B(9));                 \
+                 d->B(10) = F(v->B(10), s->B(10));              \
+                 d->B(11) = F(v->B(11), s->B(11));              \
+                 d->B(12) = F(v->B(12), s->B(12));              \
+                 d->B(13) = F(v->B(13), s->B(13));              \
+                 d->B(14) = F(v->B(14), s->B(14));              \
+                 d->B(15) = F(v->B(15), s->B(15));              \
+                                                        )       \
+        YMM_ONLY(                                               \
+                 d->B(16) = F(v->B(16), s->B(16));              \
+                 d->B(17) = F(v->B(17), s->B(17));              \
+                 d->B(18) = F(v->B(18), s->B(18));              \
+                 d->B(19) = F(v->B(19), s->B(19));              \
+                 d->B(20) = F(v->B(20), s->B(20));              \
+                 d->B(21) = F(v->B(21), s->B(21));              \
+                 d->B(22) = F(v->B(22), s->B(22));              \
+                 d->B(23) = F(v->B(23), s->B(23));              \
+                 d->B(24) = F(v->B(24), s->B(24));              \
+                 d->B(25) = F(v->B(25), s->B(25));              \
+                 d->B(26) = F(v->B(26), s->B(26));              \
+                 d->B(27) = F(v->B(27), s->B(27));              \
+                 d->B(28) = F(v->B(28), s->B(28));              \
+                 d->B(29) = F(v->B(29), s->B(29));              \
+                 d->B(30) = F(v->B(30), s->B(30));              \
+                 d->B(31) = F(v->B(31), s->B(31));              \
                                                         )       \
             }
 
 #define SSE_HELPER_W(name, F)                                   \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)   \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
     {                                                           \
-        d->W(0) = F(d->W(0), s->W(0));                          \
-        d->W(1) = F(d->W(1), s->W(1));                          \
-        d->W(2) = F(d->W(2), s->W(2));                          \
-        d->W(3) = F(d->W(3), s->W(3));                          \
+        d->W(0) = F(v->W(0), s->W(0));                          \
+        d->W(1) = F(v->W(1), s->W(1));                          \
+        d->W(2) = F(v->W(2), s->W(2));                          \
+        d->W(3) = F(v->W(3), s->W(3));                          \
         XMM_ONLY(                                               \
-                 d->W(4) = F(d->W(4), s->W(4));                 \
-                 d->W(5) = F(d->W(5), s->W(5));                 \
-                 d->W(6) = F(d->W(6), s->W(6));                 \
-                 d->W(7) = F(d->W(7), s->W(7));                 \
+                 d->W(4) = F(v->W(4), s->W(4));                 \
+                 d->W(5) = F(v->W(5), s->W(5));                 \
+                 d->W(6) = F(v->W(6), s->W(6));                 \
+                 d->W(7) = F(v->W(7), s->W(7));                 \
+                                                        )       \
+        YMM_ONLY(                                               \
+                 d->W(8) = F(v->W(8), s->W(8));                 \
+                 d->W(9) = F(v->W(9), s->W(9));                 \
+                 d->W(10) = F(v->W(10), s->W(10));              \
+                 d->W(11) = F(v->W(11), s->W(11));              \
+                 d->W(12) = F(v->W(12), s->W(12));              \
+                 d->W(13) = F(v->W(13), s->W(13));              \
+                 d->W(14) = F(v->W(14), s->W(14));              \
+                 d->W(15) = F(v->W(15), s->W(15));              \
                                                         )       \
             }
 
 #define SSE_HELPER_L(name, F)                                   \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)   \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
     {                                                           \
-        d->L(0) = F(d->L(0), s->L(0));                          \
-        d->L(1) = F(d->L(1), s->L(1));                          \
+        d->L(0) = F(v->L(0), s->L(0));                          \
+        d->L(1) = F(v->L(1), s->L(1));                          \
         XMM_ONLY(                                               \
-                 d->L(2) = F(d->L(2), s->L(2));                 \
-                 d->L(3) = F(d->L(3), s->L(3));                 \
+                 d->L(2) = F(v->L(2), s->L(2));                 \
+                 d->L(3) = F(v->L(3), s->L(3));                 \
+                                                        )       \
+        YMM_ONLY(                                               \
+                 d->L(4) = F(v->L(4), s->L(4));                 \
+                 d->L(5) = F(v->L(5), s->L(5));                 \
+                 d->L(6) = F(v->L(6), s->L(6));                 \
+                 d->L(7) = F(v->L(7), s->L(7));                 \
                                                         )       \
             }
 
 #define SSE_HELPER_Q(name, F)                                   \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)   \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
     {                                                           \
-        d->Q(0) = F(d->Q(0), s->Q(0));                          \
+        d->Q(0) = F(v->Q(0), s->Q(0));                          \
         XMM_ONLY(                                               \
-                 d->Q(1) = F(d->Q(1), s->Q(1));                 \
+                 d->Q(1) = F(v->Q(1), s->Q(1));                 \
+                                                        )       \
+        YMM_ONLY(                                               \
+                 d->Q(2) = F(v->Q(2), s->Q(2));                 \
+                 d->Q(3) = F(v->Q(3), s->Q(3));                 \
                                                         )       \
             }
 
@@ -411,30 +507,41 @@ SSE_HELPER_W(helper_pcmpeqw, FCMPEQ)
 SSE_HELPER_L(helper_pcmpeql, FCMPEQ)
 
 SSE_HELPER_W(helper_pmullw, FMULLW)
-#if SHIFT == 0
-SSE_HELPER_W(helper_pmulhrw, FMULHRW)
-#endif
 SSE_HELPER_W(helper_pmulhuw, FMULHUW)
 SSE_HELPER_W(helper_pmulhw, FMULHW)
 
+#if SHIFT == 0
+void glue(helper_pmulhrw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    d->W(0) = FMULHRW(d->W(0), s->W(0));
+    d->W(1) = FMULHRW(d->W(1), s->W(1));
+    d->W(2) = FMULHRW(d->W(2), s->W(2));
+    d->W(3) = FMULHRW(d->W(3), s->W(3));
+}
+#endif
+
 SSE_HELPER_B(helper_pavgb, FAVG)
 SSE_HELPER_W(helper_pavgw, FAVG)
 
-void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    d->Q(0) = (uint64_t)s->L(0) * (uint64_t)d->L(0);
-#if SHIFT == 1
-    d->Q(1) = (uint64_t)s->L(2) * (uint64_t)d->L(2);
+    d->Q(0) = (uint64_t)s->L(0) * (uint64_t)v->L(0);
+#if SHIFT >= 1
+    d->Q(1) = (uint64_t)s->L(2) * (uint64_t)v->L(2);
+#if SHIFT == 2
+    d->Q(2) = (uint64_t)s->L(4) * (uint64_t)v->L(4);
+    d->Q(3) = (uint64_t)s->L(6) * (uint64_t)v->L(6);
+#endif
 #endif
 }
 
-void glue(helper_pmaddwd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pmaddwd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
 
     for (i = 0; i < (2 << SHIFT); i++) {
-        d->L(i) = (int16_t)s->W(2 * i) * (int16_t)d->W(2 * i) +
-            (int16_t)s->W(2 * i + 1) * (int16_t)d->W(2 * i + 1);
+        d->L(i) = (int16_t)s->W(2 * i) * (int16_t)v->W(2 * i) +
+            (int16_t)s->W(2 * i + 1) * (int16_t)v->W(2 * i + 1);
     }
 }
 
@@ -448,34 +555,57 @@ static inline int abs1(int a)
     }
 }
 #endif
-void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     unsigned int val;
 
     val = 0;
-    val += abs1(d->B(0) - s->B(0));
-    val += abs1(d->B(1) - s->B(1));
-    val += abs1(d->B(2) - s->B(2));
-    val += abs1(d->B(3) - s->B(3));
-    val += abs1(d->B(4) - s->B(4));
-    val += abs1(d->B(5) - s->B(5));
-    val += abs1(d->B(6) - s->B(6));
-    val += abs1(d->B(7) - s->B(7));
+    val += abs1(v->B(0) - s->B(0));
+    val += abs1(v->B(1) - s->B(1));
+    val += abs1(v->B(2) - s->B(2));
+    val += abs1(v->B(3) - s->B(3));
+    val += abs1(v->B(4) - s->B(4));
+    val += abs1(v->B(5) - s->B(5));
+    val += abs1(v->B(6) - s->B(6));
+    val += abs1(v->B(7) - s->B(7));
     d->Q(0) = val;
-#if SHIFT == 1
+#if SHIFT >= 1
     val = 0;
-    val += abs1(d->B(8) - s->B(8));
-    val += abs1(d->B(9) - s->B(9));
-    val += abs1(d->B(10) - s->B(10));
-    val += abs1(d->B(11) - s->B(11));
-    val += abs1(d->B(12) - s->B(12));
-    val += abs1(d->B(13) - s->B(13));
-    val += abs1(d->B(14) - s->B(14));
-    val += abs1(d->B(15) - s->B(15));
+    val += abs1(v->B(8) - s->B(8));
+    val += abs1(v->B(9) - s->B(9));
+    val += abs1(v->B(10) - s->B(10));
+    val += abs1(v->B(11) - s->B(11));
+    val += abs1(v->B(12) - s->B(12));
+    val += abs1(v->B(13) - s->B(13));
+    val += abs1(v->B(14) - s->B(14));
+    val += abs1(v->B(15) - s->B(15));
     d->Q(1) = val;
+#if SHIFT == 2
+    val = 0;
+    val += abs1(v->B(16) - s->B(16));
+    val += abs1(v->B(17) - s->B(17));
+    val += abs1(v->B(18) - s->B(18));
+    val += abs1(v->B(19) - s->B(19));
+    val += abs1(v->B(20) - s->B(20));
+    val += abs1(v->B(21) - s->B(21));
+    val += abs1(v->B(22) - s->B(22));
+    val += abs1(v->B(23) - s->B(23));
+    d->Q(2) = val;
+    val = 0;
+    val += abs1(v->B(24) - s->B(24));
+    val += abs1(v->B(25) - s->B(25));
+    val += abs1(v->B(26) - s->B(26));
+    val += abs1(v->B(27) - s->B(27));
+    val += abs1(v->B(28) - s->B(28));
+    val += abs1(v->B(29) - s->B(29));
+    val += abs1(v->B(30) - s->B(30));
+    val += abs1(v->B(31) - s->B(31));
+    d->Q(3) = val;
+#endif
 #endif
 }
 
+#if SHIFT < 2
 void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                   target_ulong a0)
 {
@@ -487,13 +617,18 @@ void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
         }
     }
 }
+#endif
 
 void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t val)
 {
     d->L(0) = val;
     d->L(1) = 0;
-#if SHIFT == 1
+#if SHIFT >= 1
     d->Q(1) = 0;
+#if SHIFT == 2
+    d->Q(2) = 0;
+    d->Q(3) = 0;
+#endif
 #endif
 }
 
@@ -501,114 +636,152 @@ void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t val)
 void glue(helper_movq_mm_T0, SUFFIX)(Reg *d, uint64_t val)
 {
     d->Q(0) = val;
-#if SHIFT == 1
+#if SHIFT >= 1
     d->Q(1) = 0;
+#if SHIFT == 2
+    d->Q(2) = 0;
+    d->Q(3) = 0;
+#endif
 #endif
 }
 #endif
 
+#define SHUFFLE4(F, a, b, offset) do {      \
+    r0 = a->F((order & 3) + offset);        \
+    r1 = a->F(((order >> 2) & 3) + offset); \
+    r2 = b->F(((order >> 4) & 3) + offset); \
+    r3 = b->F(((order >> 6) & 3) + offset); \
+    d->F(offset) = r0;                      \
+    d->F(offset + 1) = r1;                  \
+    d->F(offset + 2) = r2;                  \
+    d->F(offset + 3) = r3;                  \
+    } while (0)
+
 #if SHIFT == 0
 void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int order)
 {
-    Reg r;
+    uint16_t r0, r1, r2, r3;
 
-    r.W(0) = s->W(order & 3);
-    r.W(1) = s->W((order >> 2) & 3);
-    r.W(2) = s->W((order >> 4) & 3);
-    r.W(3) = s->W((order >> 6) & 3);
-    *d = r;
+    SHUFFLE4(W, s, s, 0);
 }
 #else
-void helper_shufps(Reg *d, Reg *s, int order)
+void glue(helper_shufps, SUFFIX)(Reg *d, Reg *v, Reg *s, int order)
 {
-    Reg r;
+    uint32_t r0, r1, r2, r3;
 
-    r.L(0) = d->L(order & 3);
-    r.L(1) = d->L((order >> 2) & 3);
-    r.L(2) = s->L((order >> 4) & 3);
-    r.L(3) = s->L((order >> 6) & 3);
-    *d = r;
+    SHUFFLE4(L, v, s, 0);
+#if SHIFT == 2
+    SHUFFLE4(L, v, s, 4);
+#endif
 }
 
-void helper_shufpd(Reg *d, Reg *s, int order)
+void glue(helper_shufpd, SUFFIX)(Reg *d, Reg *v, Reg *s, int order)
 {
-    Reg r;
+    uint64_t r0, r1;
 
-    r.Q(0) = d->Q(order & 1);
-    r.Q(1) = s->Q((order >> 1) & 1);
-    *d = r;
+    r0 = v->Q(order & 1);
+    r1 = s->Q((order >> 1) & 1);
+    d->Q(0) = r0;
+    d->Q(1) = r1;
+#if SHIFT == 2
+    r0 = v->Q(((order >> 2) & 1) + 2);
+    r1 = s->Q(((order >> 3) & 1) + 2);
+    d->Q(2) = r0;
+    d->Q(3) = r1;
+#endif
 }
 
 void glue(helper_pshufd, SUFFIX)(Reg *d, Reg *s, int order)
 {
-    Reg r;
+    uint32_t r0, r1, r2, r3;
 
-    r.L(0) = s->L(order & 3);
-    r.L(1) = s->L((order >> 2) & 3);
-    r.L(2) = s->L((order >> 4) & 3);
-    r.L(3) = s->L((order >> 6) & 3);
-    *d = r;
+    SHUFFLE4(L, s, s, 0);
+#if SHIFT == 2
+    SHUFFLE4(L, s, s, 4);
+#endif
 }
 
 void glue(helper_pshuflw, SUFFIX)(Reg *d, Reg *s, int order)
 {
-    Reg r;
+    uint16_t r0, r1, r2, r3;
 
-    r.W(0) = s->W(order & 3);
-    r.W(1) = s->W((order >> 2) & 3);
-    r.W(2) = s->W((order >> 4) & 3);
-    r.W(3) = s->W((order >> 6) & 3);
-    r.Q(1) = s->Q(1);
-    *d = r;
+    SHUFFLE4(W, s, s, 0);
+    d->Q(1) = s->Q(1);
+#if SHIFT == 2
+    SHUFFLE4(W, s, s, 8);
+    d->Q(3) = s->Q(3);
+#endif
 }
 
 void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int order)
 {
-    Reg r;
+    uint16_t r0, r1, r2, r3;
 
-    r.Q(0) = s->Q(0);
-    r.W(4) = s->W(4 + (order & 3));
-    r.W(5) = s->W(4 + ((order >> 2) & 3));
-    r.W(6) = s->W(4 + ((order >> 4) & 3));
-    r.W(7) = s->W(4 + ((order >> 6) & 3));
-    *d = r;
+    d->Q(0) = s->Q(0);
+    SHUFFLE4(W, s, s, 4);
+#if SHIFT == 2
+    d->Q(2) = s->Q(2);
+    SHUFFLE4(W, s, s, 12);
+#endif
 }
 #endif
 
-#if SHIFT == 1
+#if SHIFT >= 1
 /* FPU ops */
 /* XXX: not accurate */
 
-#define SSE_HELPER_S(name, F)                                           \
-    void helper_ ## name ## ps(CPUX86State *env, Reg *d, Reg *s)        \
+#define SSE_HELPER_P(name, F)                                           \
+    void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env,          \
+            Reg *d, Reg *v, Reg *s)                                     \
     {                                                                   \
-        d->ZMM_S(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));                  \
-        d->ZMM_S(1) = F(32, d->ZMM_S(1), s->ZMM_S(1));                  \
-        d->ZMM_S(2) = F(32, d->ZMM_S(2), s->ZMM_S(2));                  \
-        d->ZMM_S(3) = F(32, d->ZMM_S(3), s->ZMM_S(3));                  \
+        d->ZMM_S(0) = F(32, v->ZMM_S(0), s->ZMM_S(0));                  \
+        d->ZMM_S(1) = F(32, v->ZMM_S(1), s->ZMM_S(1));                  \
+        d->ZMM_S(2) = F(32, v->ZMM_S(2), s->ZMM_S(2));                  \
+        d->ZMM_S(3) = F(32, v->ZMM_S(3), s->ZMM_S(3));                  \
+        YMM_ONLY(                                                       \
+        d->ZMM_S(4) = F(32, v->ZMM_S(4), s->ZMM_S(4));                  \
+        d->ZMM_S(5) = F(32, v->ZMM_S(5), s->ZMM_S(5));                  \
+        d->ZMM_S(6) = F(32, v->ZMM_S(6), s->ZMM_S(6));                  \
+        d->ZMM_S(7) = F(32, v->ZMM_S(7), s->ZMM_S(7));                  \
+        )                                                               \
     }                                                                   \
                                                                         \
-    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s)        \
+    void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env,          \
+            Reg *d, Reg *v, Reg *s)                                     \
     {                                                                   \
-        d->ZMM_S(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));                  \
-    }                                                                   \
+        d->ZMM_D(0) = F(64, v->ZMM_D(0), s->ZMM_D(0));                  \
+        d->ZMM_D(1) = F(64, v->ZMM_D(1), s->ZMM_D(1));                  \
+        YMM_ONLY(                                                       \
+        d->ZMM_D(2) = F(64, v->ZMM_D(2), s->ZMM_D(2));                  \
+        d->ZMM_D(3) = F(64, v->ZMM_D(3), s->ZMM_D(3));                  \
+        )                                                               \
+    }
+
+#if SHIFT == 1
+
+#define SSE_HELPER_S(name, F)                                           \
+    SSE_HELPER_P(name, F)                                               \
                                                                         \
-    void helper_ ## name ## pd(CPUX86State *env, Reg *d, Reg *s)        \
+    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *v, Reg *s)\
     {                                                                   \
-        d->ZMM_D(0) = F(64, d->ZMM_D(0), s->ZMM_D(0));                  \
-        d->ZMM_D(1) = F(64, d->ZMM_D(1), s->ZMM_D(1));                  \
+        d->ZMM_S(0) = F(32, v->ZMM_S(0), s->ZMM_S(0));                  \
     }                                                                   \
                                                                         \
-    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s)        \
+    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *v, Reg *s)\
     {                                                                   \
-        d->ZMM_D(0) = F(64, d->ZMM_D(0), s->ZMM_D(0));                  \
+        d->ZMM_D(0) = F(64, v->ZMM_D(0), s->ZMM_D(0));                  \
     }
 
+#else
+
+#define SSE_HELPER_S(name, F) SSE_HELPER_P(name, F)
+
+#endif
+
 #define FPU_ADD(size, a, b) float ## size ## _add(a, b, &env->sse_status)
 #define FPU_SUB(size, a, b) float ## size ## _sub(a, b, &env->sse_status)
 #define FPU_MUL(size, a, b) float ## size ## _mul(a, b, &env->sse_status)
 #define FPU_DIV(size, a, b) float ## size ## _div(a, b, &env->sse_status)
-#define FPU_SQRT(size, a, b) float ## size ## _sqrt(b, &env->sse_status)
 
 /* Note that the choice of comparison op here is important to get the
  * special cases right: for min and max Intel specifies that (-0,0),
@@ -625,27 +798,76 @@ SSE_HELPER_S(mul, FPU_MUL)
 SSE_HELPER_S(div, FPU_DIV)
 SSE_HELPER_S(min, FPU_MIN)
 SSE_HELPER_S(max, FPU_MAX)
-SSE_HELPER_S(sqrt, FPU_SQRT)
 
+void glue(helper_sqrtps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    d->ZMM_S(0) = float32_sqrt(s->ZMM_S(0), &env->sse_status);
+    d->ZMM_S(1) = float32_sqrt(s->ZMM_S(1), &env->sse_status);
+    d->ZMM_S(2) = float32_sqrt(s->ZMM_S(2), &env->sse_status);
+    d->ZMM_S(3) = float32_sqrt(s->ZMM_S(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = float32_sqrt(s->ZMM_S(4), &env->sse_status);
+    d->ZMM_S(5) = float32_sqrt(s->ZMM_S(5), &env->sse_status);
+    d->ZMM_S(6) = float32_sqrt(s->ZMM_S(6), &env->sse_status);
+    d->ZMM_S(7) = float32_sqrt(s->ZMM_S(7), &env->sse_status);
+#endif
+}
+
+void glue(helper_sqrtpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    d->ZMM_D(0) = float64_sqrt(s->ZMM_D(0), &env->sse_status);
+    d->ZMM_D(1) = float64_sqrt(s->ZMM_D(1), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_D(2) = float64_sqrt(s->ZMM_D(2), &env->sse_status);
+    d->ZMM_D(3) = float64_sqrt(s->ZMM_D(3), &env->sse_status);
+#endif
+}
+
+#if SHIFT == 1
+void helper_sqrtss(CPUX86State *env, Reg *d, Reg *s)
+{
+    d->ZMM_S(0) = float32_sqrt(s->ZMM_S(0), &env->sse_status);
+}
+
+void helper_sqrtsd(CPUX86State *env, Reg *d, Reg *s)
+{
+    d->ZMM_D(0) = float64_sqrt(s->ZMM_D(0), &env->sse_status);
+}
+#endif
 
 /* float to float conversions */
-void helper_cvtps2pd(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     float32 s0, s1;
 
     s0 = s->ZMM_S(0);
     s1 = s->ZMM_S(1);
+#if SHIFT == 2
+    float32 s2, s3;
+    s2 = s->ZMM_S(2);
+    s3 = s->ZMM_S(3);
+    d->ZMM_D(2) = float32_to_float64(s2, &env->sse_status);
+    d->ZMM_D(3) = float32_to_float64(s3, &env->sse_status);
+#endif
     d->ZMM_D(0) = float32_to_float64(s0, &env->sse_status);
     d->ZMM_D(1) = float32_to_float64(s1, &env->sse_status);
 }
 
-void helper_cvtpd2ps(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_cvtpd2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     d->ZMM_S(0) = float64_to_float32(s->ZMM_D(0), &env->sse_status);
     d->ZMM_S(1) = float64_to_float32(s->ZMM_D(1), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(2) = float64_to_float32(s->ZMM_D(2), &env->sse_status);
+    d->ZMM_S(3) = float64_to_float32(s->ZMM_D(3), &env->sse_status);
+    d->Q(2) = 0;
+    d->Q(3) = 0;
+#else
     d->Q(1) = 0;
+#endif
 }
 
+#if SHIFT == 1
 void helper_cvtss2sd(CPUX86State *env, Reg *d, Reg *s)
 {
     d->ZMM_D(0) = float32_to_float64(s->ZMM_S(0), &env->sse_status);
@@ -655,26 +877,41 @@ void helper_cvtsd2ss(CPUX86State *env, Reg *d, Reg *s)
 {
     d->ZMM_S(0) = float64_to_float32(s->ZMM_D(0), &env->sse_status);
 }
+#endif
 
 /* integer to float */
-void helper_cvtdq2ps(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_cvtdq2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     d->ZMM_S(0) = int32_to_float32(s->ZMM_L(0), &env->sse_status);
     d->ZMM_S(1) = int32_to_float32(s->ZMM_L(1), &env->sse_status);
     d->ZMM_S(2) = int32_to_float32(s->ZMM_L(2), &env->sse_status);
     d->ZMM_S(3) = int32_to_float32(s->ZMM_L(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = int32_to_float32(s->ZMM_L(4), &env->sse_status);
+    d->ZMM_S(5) = int32_to_float32(s->ZMM_L(5), &env->sse_status);
+    d->ZMM_S(6) = int32_to_float32(s->ZMM_L(6), &env->sse_status);
+    d->ZMM_S(7) = int32_to_float32(s->ZMM_L(7), &env->sse_status);
+#endif
 }
 
-void helper_cvtdq2pd(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_cvtdq2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     int32_t l0, l1;
 
     l0 = (int32_t)s->ZMM_L(0);
     l1 = (int32_t)s->ZMM_L(1);
+#if SHIFT == 2
+    int32_t l2, l3;
+    l2 = (int32_t)s->ZMM_L(2);
+    l3 = (int32_t)s->ZMM_L(3);
+    d->ZMM_D(2) = int32_to_float64(l2, &env->sse_status);
+    d->ZMM_D(3) = int32_to_float64(l3, &env->sse_status);
+#endif
     d->ZMM_D(0) = int32_to_float64(l0, &env->sse_status);
     d->ZMM_D(1) = int32_to_float64(l1, &env->sse_status);
 }
 
+#if SHIFT == 1
 void helper_cvtpi2ps(CPUX86State *env, ZMMReg *d, MMXReg *s)
 {
     d->ZMM_S(0) = int32_to_float32(s->MMX_L(0), &env->sse_status);
@@ -709,8 +946,11 @@ void helper_cvtsq2sd(CPUX86State *env, ZMMReg *d, uint64_t val)
 }
 #endif
 
+#endif
+
 /* float to integer */
 
+#if SHIFT == 1
 /*
  * x86 mandates that we return the indefinite integer value for the result
  * of any float-to-integer conversion that raises the 'invalid' exception.
@@ -741,22 +981,37 @@ WRAP_FLOATCONV(int64_t, float32_to_int64, float32, INT64_MIN)
 WRAP_FLOATCONV(int64_t, float32_to_int64_round_to_zero, float32, INT64_MIN)
 WRAP_FLOATCONV(int64_t, float64_to_int64, float64, INT64_MIN)
 WRAP_FLOATCONV(int64_t, float64_to_int64_round_to_zero, float64, INT64_MIN)
+#endif
 
-void helper_cvtps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_cvtps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     d->ZMM_L(0) = x86_float32_to_int32(s->ZMM_S(0), &env->sse_status);
     d->ZMM_L(1) = x86_float32_to_int32(s->ZMM_S(1), &env->sse_status);
     d->ZMM_L(2) = x86_float32_to_int32(s->ZMM_S(2), &env->sse_status);
     d->ZMM_L(3) = x86_float32_to_int32(s->ZMM_S(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_L(4) = x86_float32_to_int32(s->ZMM_S(4), &env->sse_status);
+    d->ZMM_L(5) = x86_float32_to_int32(s->ZMM_S(5), &env->sse_status);
+    d->ZMM_L(6) = x86_float32_to_int32(s->ZMM_S(6), &env->sse_status);
+    d->ZMM_L(7) = x86_float32_to_int32(s->ZMM_S(7), &env->sse_status);
+#endif
 }
 
-void helper_cvtpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_cvtpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     d->ZMM_L(0) = x86_float64_to_int32(s->ZMM_D(0), &env->sse_status);
     d->ZMM_L(1) = x86_float64_to_int32(s->ZMM_D(1), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_L(2) = x86_float64_to_int32(s->ZMM_D(2), &env->sse_status);
+    d->ZMM_L(3) = x86_float64_to_int32(s->ZMM_D(3), &env->sse_status);
+    d->ZMM_Q(2) = 0;
+    d->ZMM_Q(3) = 0;
+#else
     d->ZMM_Q(1) = 0;
+#endif
 }
 
+#if SHIFT == 1
 void helper_cvtps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s)
 {
     d->MMX_L(0) = x86_float32_to_int32(s->ZMM_S(0), &env->sse_status);
@@ -790,33 +1045,64 @@ int64_t helper_cvtsd2sq(CPUX86State *env, ZMMReg *s)
     return x86_float64_to_int64(s->ZMM_D(0), &env->sse_status);
 }
 #endif
+#endif
 
 /* float to integer truncated */
-void helper_cvttps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
-{
-    d->ZMM_L(0) = x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->sse_status);
-    d->ZMM_L(1) = x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->sse_status);
-    d->ZMM_L(2) = x86_float32_to_int32_round_to_zero(s->ZMM_S(2), &env->sse_status);
-    d->ZMM_L(3) = x86_float32_to_int32_round_to_zero(s->ZMM_S(3), &env->sse_status);
+void glue(helper_cvttps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+{
+    d->ZMM_L(0) = x86_float32_to_int32_round_to_zero(s->ZMM_S(0),
+                                                     &env->sse_status);
+    d->ZMM_L(1) = x86_float32_to_int32_round_to_zero(s->ZMM_S(1),
+                                                     &env->sse_status);
+    d->ZMM_L(2) = x86_float32_to_int32_round_to_zero(s->ZMM_S(2),
+                                                     &env->sse_status);
+    d->ZMM_L(3) = x86_float32_to_int32_round_to_zero(s->ZMM_S(3),
+                                                     &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_L(4) = x86_float32_to_int32_round_to_zero(s->ZMM_S(4),
+                                                     &env->sse_status);
+    d->ZMM_L(5) = x86_float32_to_int32_round_to_zero(s->ZMM_S(5),
+                                                     &env->sse_status);
+    d->ZMM_L(6) = x86_float32_to_int32_round_to_zero(s->ZMM_S(6),
+                                                     &env->sse_status);
+    d->ZMM_L(7) = x86_float32_to_int32_round_to_zero(s->ZMM_S(7),
+                                                     &env->sse_status);
+#endif
 }
 
-void helper_cvttpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_cvttpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
-    d->ZMM_L(0) = x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->sse_status);
-    d->ZMM_L(1) = x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->sse_status);
+    d->ZMM_L(0) = x86_float64_to_int32_round_to_zero(s->ZMM_D(0),
+                                                     &env->sse_status);
+    d->ZMM_L(1) = x86_float64_to_int32_round_to_zero(s->ZMM_D(1),
+                                                     &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_L(2) = x86_float64_to_int32_round_to_zero(s->ZMM_D(2),
+                                                     &env->sse_status);
+    d->ZMM_L(3) = x86_float64_to_int32_round_to_zero(s->ZMM_D(3),
+                                                     &env->sse_status);
+    d->ZMM_Q(2) = 0;
+    d->ZMM_Q(3) = 0;
+#else
     d->ZMM_Q(1) = 0;
+#endif
 }
 
+#if SHIFT == 1
 void helper_cvttps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s)
 {
-    d->MMX_L(0) = x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->sse_status);
-    d->MMX_L(1) = x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->sse_status);
+    d->MMX_L(0) = x86_float32_to_int32_round_to_zero(s->ZMM_S(0),
+                                                     &env->sse_status);
+    d->MMX_L(1) = x86_float32_to_int32_round_to_zero(s->ZMM_S(1),
+                                                     &env->sse_status);
 }
 
 void helper_cvttpd2pi(CPUX86State *env, MMXReg *d, ZMMReg *s)
 {
-    d->MMX_L(0) = x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->sse_status);
-    d->MMX_L(1) = x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->sse_status);
+    d->MMX_L(0) = x86_float64_to_int32_round_to_zero(s->ZMM_D(0),
+                                                     &env->sse_status);
+    d->MMX_L(1) = x86_float64_to_int32_round_to_zero(s->ZMM_D(1),
+                                                     &env->sse_status);
 }
 
 int32_t helper_cvttss2si(CPUX86State *env, ZMMReg *s)
@@ -840,8 +1126,9 @@ int64_t helper_cvttsd2sq(CPUX86State *env, ZMMReg *s)
     return x86_float64_to_int64_round_to_zero(s->ZMM_D(0), &env->sse_status);
 }
 #endif
+#endif
 
-void helper_rsqrtps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     uint8_t old_flags = get_float_exception_flags(&env->sse_status);
     d->ZMM_S(0) = float32_div(float32_one,
@@ -856,9 +1143,24 @@ void helper_rsqrtps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     d->ZMM_S(3) = float32_div(float32_one,
                               float32_sqrt(s->ZMM_S(3), &env->sse_status),
                               &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = float32_div(float32_one,
+                              float32_sqrt(s->ZMM_S(4), &env->sse_status),
+                              &env->sse_status);
+    d->ZMM_S(5) = float32_div(float32_one,
+                              float32_sqrt(s->ZMM_S(5), &env->sse_status),
+                              &env->sse_status);
+    d->ZMM_S(6) = float32_div(float32_one,
+                              float32_sqrt(s->ZMM_S(6), &env->sse_status),
+                              &env->sse_status);
+    d->ZMM_S(7) = float32_div(float32_one,
+                              float32_sqrt(s->ZMM_S(7), &env->sse_status),
+                              &env->sse_status);
+#endif
     set_float_exception_flags(old_flags, &env->sse_status);
 }
 
+#if SHIFT == 1
 void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     uint8_t old_flags = get_float_exception_flags(&env->sse_status);
@@ -867,24 +1169,34 @@ void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMReg *s)
                               &env->sse_status);
     set_float_exception_flags(old_flags, &env->sse_status);
 }
+#endif
 
-void helper_rcpps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     uint8_t old_flags = get_float_exception_flags(&env->sse_status);
     d->ZMM_S(0) = float32_div(float32_one, s->ZMM_S(0), &env->sse_status);
     d->ZMM_S(1) = float32_div(float32_one, s->ZMM_S(1), &env->sse_status);
     d->ZMM_S(2) = float32_div(float32_one, s->ZMM_S(2), &env->sse_status);
     d->ZMM_S(3) = float32_div(float32_one, s->ZMM_S(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = float32_div(float32_one, s->ZMM_S(4), &env->sse_status);
+    d->ZMM_S(5) = float32_div(float32_one, s->ZMM_S(5), &env->sse_status);
+    d->ZMM_S(6) = float32_div(float32_one, s->ZMM_S(6), &env->sse_status);
+    d->ZMM_S(7) = float32_div(float32_one, s->ZMM_S(7), &env->sse_status);
+#endif
     set_float_exception_flags(old_flags, &env->sse_status);
 }
 
+#if SHIFT == 1
 void helper_rcpss(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     uint8_t old_flags = get_float_exception_flags(&env->sse_status);
     d->ZMM_S(0) = float32_div(float32_one, s->ZMM_S(0), &env->sse_status);
     set_float_exception_flags(old_flags, &env->sse_status);
 }
+#endif
 
+#if SHIFT == 1
 static inline uint64_t helper_extrq(uint64_t src, int shift, int len)
 {
     uint64_t mask;
@@ -928,113 +1240,213 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, int index, int length)
 {
     d->ZMM_Q(0) = helper_insertq(d->ZMM_Q(0), index, length);
 }
+#endif
 
-void helper_haddps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
-{
-    ZMMReg r;
-
-    r.ZMM_S(0) = float32_add(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(1) = float32_add(d->ZMM_S(2), d->ZMM_S(3), &env->sse_status);
-    r.ZMM_S(2) = float32_add(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(3) = float32_add(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status);
-    *d = r;
+void glue(helper_haddps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    float32 r0, r1, r2, r3;
+
+    r0 = float32_add(v->ZMM_S(0), v->ZMM_S(1), &env->sse_status);
+    r1 = float32_add(v->ZMM_S(2), v->ZMM_S(3), &env->sse_status);
+    r2 = float32_add(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status);
+    r3 = float32_add(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status);
+    d->ZMM_S(0) = r0;
+    d->ZMM_S(1) = r1;
+    d->ZMM_S(2) = r2;
+    d->ZMM_S(3) = r3;
+#if SHIFT == 2
+    r0 = float32_add(v->ZMM_S(4), v->ZMM_S(5), &env->sse_status);
+    r1 = float32_add(v->ZMM_S(6), v->ZMM_S(7), &env->sse_status);
+    r2 = float32_add(s->ZMM_S(4), s->ZMM_S(5), &env->sse_status);
+    r3 = float32_add(s->ZMM_S(6), s->ZMM_S(7), &env->sse_status);
+    d->ZMM_S(4) = r0;
+    d->ZMM_S(5) = r1;
+    d->ZMM_S(6) = r2;
+    d->ZMM_S(7) = r3;
+#endif
 }
 
-void helper_haddpd(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_haddpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    ZMMReg r;
+    float64 r0, r1;
 
-    r.ZMM_D(0) = float64_add(d->ZMM_D(0), d->ZMM_D(1), &env->sse_status);
-    r.ZMM_D(1) = float64_add(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status);
-    *d = r;
+    r0 = float64_add(v->ZMM_D(0), v->ZMM_D(1), &env->sse_status);
+    r1 = float64_add(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status);
+    d->ZMM_D(0) = r0;
+    d->ZMM_D(1) = r1;
+#if SHIFT == 2
+    r0 = float64_add(v->ZMM_D(2), v->ZMM_D(3), &env->sse_status);
+    r1 = float64_add(s->ZMM_D(2), s->ZMM_D(3), &env->sse_status);
+    d->ZMM_D(2) = r0;
+    d->ZMM_D(3) = r1;
+#endif
 }
 
-void helper_hsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
-{
-    ZMMReg r;
-
-    r.ZMM_S(0) = float32_sub(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(1) = float32_sub(d->ZMM_S(2), d->ZMM_S(3), &env->sse_status);
-    r.ZMM_S(2) = float32_sub(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(3) = float32_sub(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status);
-    *d = r;
+void glue(helper_hsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    float32 r0, r1, r2, r3;
+
+    r0 = float32_sub(v->ZMM_S(0), v->ZMM_S(1), &env->sse_status);
+    r1 = float32_sub(v->ZMM_S(2), v->ZMM_S(3), &env->sse_status);
+    r2 = float32_sub(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status);
+    r3 = float32_sub(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status);
+    d->ZMM_S(0) = r0;
+    d->ZMM_S(1) = r1;
+    d->ZMM_S(2) = r2;
+    d->ZMM_S(3) = r3;
+#if SHIFT == 2
+    r0 = float32_sub(v->ZMM_S(4), v->ZMM_S(5), &env->sse_status);
+    r1 = float32_sub(v->ZMM_S(6), v->ZMM_S(7), &env->sse_status);
+    r2 = float32_sub(s->ZMM_S(4), s->ZMM_S(5), &env->sse_status);
+    r3 = float32_sub(s->ZMM_S(6), s->ZMM_S(7), &env->sse_status);
+    d->ZMM_S(4) = r0;
+    d->ZMM_S(5) = r1;
+    d->ZMM_S(6) = r2;
+    d->ZMM_S(7) = r3;
+#endif
 }
 
-void helper_hsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    ZMMReg r;
+    float64 r0, r1;
 
-    r.ZMM_D(0) = float64_sub(d->ZMM_D(0), d->ZMM_D(1), &env->sse_status);
-    r.ZMM_D(1) = float64_sub(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status);
-    *d = r;
+    r0 = float64_sub(v->ZMM_D(0), v->ZMM_D(1), &env->sse_status);
+    r1 = float64_sub(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status);
+    d->ZMM_D(0) = r0;
+    d->ZMM_D(1) = r1;
+#if SHIFT == 2
+    r0 = float64_sub(v->ZMM_D(2), v->ZMM_D(3), &env->sse_status);
+    r1 = float64_sub(s->ZMM_D(2), s->ZMM_D(3), &env->sse_status);
+    d->ZMM_D(2) = r0;
+    d->ZMM_D(3) = r1;
+#endif
 }
 
-void helper_addsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    d->ZMM_S(0) = float32_sub(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
-    d->ZMM_S(1) = float32_add(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
-    d->ZMM_S(2) = float32_sub(d->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
-    d->ZMM_S(3) = float32_add(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
+    d->ZMM_S(0) = float32_sub(v->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
+    d->ZMM_S(1) = float32_add(v->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
+    d->ZMM_S(2) = float32_sub(v->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
+    d->ZMM_S(3) = float32_add(v->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = float32_sub(v->ZMM_S(4), s->ZMM_S(4), &env->sse_status);
+    d->ZMM_S(5) = float32_add(v->ZMM_S(5), s->ZMM_S(5), &env->sse_status);
+    d->ZMM_S(6) = float32_sub(v->ZMM_S(6), s->ZMM_S(6), &env->sse_status);
+    d->ZMM_S(7) = float32_add(v->ZMM_S(7), s->ZMM_S(7), &env->sse_status);
+#endif
 }
 
-void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    d->ZMM_D(0) = float64_sub(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
-    d->ZMM_D(1) = float64_add(d->ZMM_D(1), s->ZMM_D(1), &env->sse_status);
+    d->ZMM_D(0) = float64_sub(v->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
+    d->ZMM_D(1) = float64_add(v->ZMM_D(1), s->ZMM_D(1), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_D(2) = float64_sub(v->ZMM_D(2), s->ZMM_D(2), &env->sse_status);
+    d->ZMM_D(3) = float64_add(v->ZMM_D(3), s->ZMM_D(3), &env->sse_status);
+#endif
 }
 
-/* XXX: unordered */
-#define SSE_HELPER_CMP(name, F)                                         \
-    void helper_ ## name ## ps(CPUX86State *env, Reg *d, Reg *s)        \
-    {                                                                   \
-        d->ZMM_L(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));                  \
-        d->ZMM_L(1) = F(32, d->ZMM_S(1), s->ZMM_S(1));                  \
-        d->ZMM_L(2) = F(32, d->ZMM_S(2), s->ZMM_S(2));                  \
-        d->ZMM_L(3) = F(32, d->ZMM_S(3), s->ZMM_S(3));                  \
-    }                                                                   \
-                                                                        \
-    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s)        \
-    {                                                                   \
-        d->ZMM_L(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));                  \
-    }                                                                   \
-                                                                        \
-    void helper_ ## name ## pd(CPUX86State *env, Reg *d, Reg *s)        \
+#define SSE_HELPER_CMP_P(name, F, C)                                    \
+    void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env,          \
+                                             Reg *d, Reg *v, Reg *s)    \
     {                                                                   \
-        d->ZMM_Q(0) = F(64, d->ZMM_D(0), s->ZMM_D(0));                  \
-        d->ZMM_Q(1) = F(64, d->ZMM_D(1), s->ZMM_D(1));                  \
+        d->ZMM_L(0) = F(32, C, v->ZMM_S(0), s->ZMM_S(0));               \
+        d->ZMM_L(1) = F(32, C, v->ZMM_S(1), s->ZMM_S(1));               \
+        d->ZMM_L(2) = F(32, C, v->ZMM_S(2), s->ZMM_S(2));               \
+        d->ZMM_L(3) = F(32, C, v->ZMM_S(3), s->ZMM_S(3));               \
+        YMM_ONLY(                                                       \
+        d->ZMM_L(4) = F(32, C, v->ZMM_S(4), s->ZMM_S(4));               \
+        d->ZMM_L(5) = F(32, C, v->ZMM_S(5), s->ZMM_S(5));               \
+        d->ZMM_L(6) = F(32, C, v->ZMM_S(6), s->ZMM_S(6));               \
+        d->ZMM_L(7) = F(32, C, v->ZMM_S(7), s->ZMM_S(7));               \
+        )                                                               \
     }                                                                   \
                                                                         \
-    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s)        \
+    void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env,          \
+                                             Reg *d, Reg *v, Reg *s)    \
     {                                                                   \
-        d->ZMM_Q(0) = F(64, d->ZMM_D(0), s->ZMM_D(0));                  \
-    }
-
-#define FPU_CMPEQ(size, a, b)                                           \
-    (float ## size ## _eq_quiet(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPLT(size, a, b)                                           \
-    (float ## size ## _lt(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPLE(size, a, b)                                           \
-    (float ## size ## _le(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPUNORD(size, a, b)                                        \
-    (float ## size ## _unordered_quiet(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPNEQ(size, a, b)                                          \
-    (float ## size ## _eq_quiet(a, b, &env->sse_status) ? 0 : -1)
-#define FPU_CMPNLT(size, a, b)                                          \
-    (float ## size ## _lt(a, b, &env->sse_status) ? 0 : -1)
-#define FPU_CMPNLE(size, a, b)                                          \
-    (float ## size ## _le(a, b, &env->sse_status) ? 0 : -1)
-#define FPU_CMPORD(size, a, b)                                          \
-    (float ## size ## _unordered_quiet(a, b, &env->sse_status) ? 0 : -1)
-
-SSE_HELPER_CMP(cmpeq, FPU_CMPEQ)
-SSE_HELPER_CMP(cmplt, FPU_CMPLT)
-SSE_HELPER_CMP(cmple, FPU_CMPLE)
-SSE_HELPER_CMP(cmpunord, FPU_CMPUNORD)
-SSE_HELPER_CMP(cmpneq, FPU_CMPNEQ)
-SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT)
-SSE_HELPER_CMP(cmpnle, FPU_CMPNLE)
-SSE_HELPER_CMP(cmpord, FPU_CMPORD)
+        d->ZMM_Q(0) = F(64, C, v->ZMM_D(0), s->ZMM_D(0));               \
+        d->ZMM_Q(1) = F(64, C, v->ZMM_D(1), s->ZMM_D(1));               \
+        YMM_ONLY(                                                       \
+        d->ZMM_Q(2) = F(64, C, v->ZMM_D(2), s->ZMM_D(2));               \
+        d->ZMM_Q(3) = F(64, C, v->ZMM_D(3), s->ZMM_D(3));               \
+        )                                                               \
+    }
+
+#if SHIFT == 1
+#define SSE_HELPER_CMP(name, F, C)                                          \
+    SSE_HELPER_CMP_P(name, F, C)                                            \
+    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *v, Reg *s)    \
+    {                                                                       \
+        d->ZMM_L(0) = F(32, C, v->ZMM_S(0), s->ZMM_S(0));                   \
+    }                                                                       \
+                                                                            \
+    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *v, Reg *s)    \
+    {                                                                       \
+        d->ZMM_Q(0) = F(64, C, v->ZMM_D(0), s->ZMM_D(0));                   \
+    }
+
+static inline bool FPU_EQU(FloatRelation x)
+{
+    return (x == float_relation_equal || x == float_relation_unordered);
+}
+static inline bool FPU_GE(FloatRelation x)
+{
+    return (x == float_relation_equal || x == float_relation_greater);
+}
+#define FPU_EQ(x) (x == float_relation_equal)
+#define FPU_LT(x) (x == float_relation_less)
+#define FPU_LE(x) (x <= float_relation_equal)
+#define FPU_GT(x) (x == float_relation_greater)
+#define FPU_UNORD(x) (x == float_relation_unordered)
+#define FPU_FALSE(x) ((void)(x), 0) /* still evaluate x for its flags */
+
+#define FPU_CMPQ(size, COND, a, b) \
+    (COND(float ## size ## _compare_quiet(a, b, &env->sse_status)) ? -1 : 0)
+#define FPU_CMPS(size, COND, a, b) \
+    (COND(float ## size ## _compare(a, b, &env->sse_status)) ? -1 : 0)
+
+#else
+#define SSE_HELPER_CMP(name, F, C) SSE_HELPER_CMP_P(name, F, C)
+#endif
 
+SSE_HELPER_CMP(cmpeq, FPU_CMPQ, FPU_EQ)
+SSE_HELPER_CMP(cmplt, FPU_CMPS, FPU_LT)
+SSE_HELPER_CMP(cmple, FPU_CMPS, FPU_LE)
+SSE_HELPER_CMP(cmpunord, FPU_CMPQ,  FPU_UNORD)
+SSE_HELPER_CMP(cmpneq, FPU_CMPQ, !FPU_EQ)
+SSE_HELPER_CMP(cmpnlt, FPU_CMPS, !FPU_LT)
+SSE_HELPER_CMP(cmpnle, FPU_CMPS, !FPU_LE)
+SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD)
+
+SSE_HELPER_CMP(cmpequ, FPU_CMPQ, FPU_EQU)
+SSE_HELPER_CMP(cmpnge, FPU_CMPS, !FPU_GE)
+SSE_HELPER_CMP(cmpngt, FPU_CMPS, !FPU_GT)
+SSE_HELPER_CMP(cmpfalse, FPU_CMPQ,  FPU_FALSE)
+SSE_HELPER_CMP(cmpnequ, FPU_CMPQ, !FPU_EQU)
+SSE_HELPER_CMP(cmpge, FPU_CMPS, FPU_GE)
+SSE_HELPER_CMP(cmpgt, FPU_CMPS, FPU_GT)
+SSE_HELPER_CMP(cmptrue, FPU_CMPQ,  !FPU_FALSE)
+
+SSE_HELPER_CMP(cmpeqs, FPU_CMPS, FPU_EQ)
+SSE_HELPER_CMP(cmpltq, FPU_CMPQ, FPU_LT)
+SSE_HELPER_CMP(cmpleq, FPU_CMPQ, FPU_LE)
+SSE_HELPER_CMP(cmpunords, FPU_CMPS,  FPU_UNORD)
+SSE_HELPER_CMP(cmpneqq, FPU_CMPS, !FPU_EQ)
+SSE_HELPER_CMP(cmpnltq, FPU_CMPQ, !FPU_LT)
+SSE_HELPER_CMP(cmpnleq, FPU_CMPQ, !FPU_LE)
+SSE_HELPER_CMP(cmpords, FPU_CMPS, !FPU_UNORD)
+
+SSE_HELPER_CMP(cmpequs, FPU_CMPS, FPU_EQU)
+SSE_HELPER_CMP(cmpngeq, FPU_CMPQ, !FPU_GE)
+SSE_HELPER_CMP(cmpngtq, FPU_CMPQ, !FPU_GT)
+SSE_HELPER_CMP(cmpfalses, FPU_CMPS,  FPU_FALSE)
+SSE_HELPER_CMP(cmpnequs, FPU_CMPS, !FPU_EQU)
+SSE_HELPER_CMP(cmpgeq, FPU_CMPQ, FPU_GE)
+SSE_HELPER_CMP(cmpgtq, FPU_CMPQ, FPU_GT)
+SSE_HELPER_CMP(cmptrues, FPU_CMPS,  !FPU_FALSE)
+
+#if SHIFT == 1
 static const int comis_eflags[4] = {CC_C, CC_Z, 0, CC_Z | CC_P | CC_C};
 
 void helper_ucomiss(CPUX86State *env, Reg *d, Reg *s)
@@ -1080,25 +1492,38 @@ void helper_comisd(CPUX86State *env, Reg *d, Reg *s)
     ret = float64_compare(d0, d1, &env->sse_status);
     CC_SRC = comis_eflags[ret + 1];
 }
+#endif
 
-uint32_t helper_movmskps(CPUX86State *env, Reg *s)
+uint32_t glue(helper_movmskps, SUFFIX)(CPUX86State *env, Reg *s)
 {
-    int b0, b1, b2, b3;
+    uint32_t mask;
 
-    b0 = s->ZMM_L(0) >> 31;
-    b1 = s->ZMM_L(1) >> 31;
-    b2 = s->ZMM_L(2) >> 31;
-    b3 = s->ZMM_L(3) >> 31;
-    return b0 | (b1 << 1) | (b2 << 2) | (b3 << 3);
+    mask = 0;
+    mask |= (s->ZMM_L(0) >> (31 - 0)) & (1 << 0);
+    mask |= (s->ZMM_L(1) >> (31 - 1)) & (1 << 1);
+    mask |= (s->ZMM_L(2) >> (31 - 2)) & (1 << 2);
+    mask |= (s->ZMM_L(3) >> (31 - 3)) & (1 << 3);
+#if SHIFT == 2
+    mask |= (s->ZMM_L(4) >> (31 - 4)) & (1 << 4);
+    mask |= (s->ZMM_L(5) >> (31 - 5)) & (1 << 5);
+    mask |= (s->ZMM_L(6) >> (31 - 6)) & (1 << 6);
+    mask |= (s->ZMM_L(7) >> (31 - 7)) & (1 << 7);
+#endif
+    return mask;
 }
 
-uint32_t helper_movmskpd(CPUX86State *env, Reg *s)
+uint32_t glue(helper_movmskpd, SUFFIX)(CPUX86State *env, Reg *s)
 {
-    int b0, b1;
+    uint32_t mask;
 
-    b0 = s->ZMM_L(1) >> 31;
-    b1 = s->ZMM_L(3) >> 31;
-    return b0 | (b1 << 1);
+    mask = 0;
+    mask |= (s->ZMM_L(1) >> (31 - 0)) & (1 << 0);
+    mask |= (s->ZMM_L(3) >> (31 - 1)) & (1 << 1);
+#if SHIFT == 2
+    mask |= (s->ZMM_L(5) >> (31 - 2)) & (1 << 2);
+    mask |= (s->ZMM_L(7) >> (31 - 3)) & (1 << 3);
+#endif
+    return mask;
 }
 
 #endif
@@ -1116,7 +1541,7 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86State *env, Reg *s)
     val |= (s->B(5) >> 2) & 0x20;
     val |= (s->B(6) >> 1) & 0x40;
     val |= (s->B(7)) & 0x80;
-#if SHIFT == 1
+#if SHIFT >= 1
     val |= (s->B(8) << 1) & 0x0100;
     val |= (s->B(9) << 2) & 0x0200;
     val |= (s->B(10) << 3) & 0x0400;
@@ -1125,160 +1550,243 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86State *env, Reg *s)
     val |= (s->B(13) << 6) & 0x2000;
     val |= (s->B(14) << 7) & 0x4000;
     val |= (s->B(15) << 8) & 0x8000;
+#if SHIFT == 2
+    val |= ((uint32_t)s->B(16) << 9) & 0x00010000;
+    val |= ((uint32_t)s->B(17) << 10) & 0x00020000;
+    val |= ((uint32_t)s->B(18) << 11) & 0x00040000;
+    val |= ((uint32_t)s->B(19) << 12) & 0x00080000;
+    val |= ((uint32_t)s->B(20) << 13) & 0x00100000;
+    val |= ((uint32_t)s->B(21) << 14) & 0x00200000;
+    val |= ((uint32_t)s->B(22) << 15) & 0x00400000;
+    val |= ((uint32_t)s->B(23) << 16) & 0x00800000;
+    val |= ((uint32_t)s->B(24) << 17) & 0x01000000;
+    val |= ((uint32_t)s->B(25) << 18) & 0x02000000;
+    val |= ((uint32_t)s->B(26) << 19) & 0x04000000;
+    val |= ((uint32_t)s->B(27) << 20) & 0x08000000;
+    val |= ((uint32_t)s->B(28) << 21) & 0x10000000;
+    val |= ((uint32_t)s->B(29) << 22) & 0x20000000;
+    val |= ((uint32_t)s->B(30) << 23) & 0x40000000;
+    val |= ((uint32_t)s->B(31) << 24) & 0x80000000;
+#endif
 #endif
     return val;
 }
 
-void glue(helper_packsswb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;
-
-    r.B(0) = satsb((int16_t)d->W(0));
-    r.B(1) = satsb((int16_t)d->W(1));
-    r.B(2) = satsb((int16_t)d->W(2));
-    r.B(3) = satsb((int16_t)d->W(3));
-#if SHIFT == 1
-    r.B(4) = satsb((int16_t)d->W(4));
-    r.B(5) = satsb((int16_t)d->W(5));
-    r.B(6) = satsb((int16_t)d->W(6));
-    r.B(7) = satsb((int16_t)d->W(7));
-#endif
-    r.B((4 << SHIFT) + 0) = satsb((int16_t)s->W(0));
-    r.B((4 << SHIFT) + 1) = satsb((int16_t)s->W(1));
-    r.B((4 << SHIFT) + 2) = satsb((int16_t)s->W(2));
-    r.B((4 << SHIFT) + 3) = satsb((int16_t)s->W(3));
-#if SHIFT == 1
-    r.B(12) = satsb((int16_t)s->W(4));
-    r.B(13) = satsb((int16_t)s->W(5));
-    r.B(14) = satsb((int16_t)s->W(6));
-    r.B(15) = satsb((int16_t)s->W(7));
+#if SHIFT == 0
+#define PACK_WIDTH 4
+#else
+#define PACK_WIDTH 8
 #endif
-    *d = r;
-}
-
-void glue(helper_packuswb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;
 
-    r.B(0) = satub((int16_t)d->W(0));
-    r.B(1) = satub((int16_t)d->W(1));
-    r.B(2) = satub((int16_t)d->W(2));
-    r.B(3) = satub((int16_t)d->W(3));
-#if SHIFT == 1
-    r.B(4) = satub((int16_t)d->W(4));
-    r.B(5) = satub((int16_t)d->W(5));
-    r.B(6) = satub((int16_t)d->W(6));
-    r.B(7) = satub((int16_t)d->W(7));
-#endif
-    r.B((4 << SHIFT) + 0) = satub((int16_t)s->W(0));
-    r.B((4 << SHIFT) + 1) = satub((int16_t)s->W(1));
-    r.B((4 << SHIFT) + 2) = satub((int16_t)s->W(2));
-    r.B((4 << SHIFT) + 3) = satub((int16_t)s->W(3));
-#if SHIFT == 1
-    r.B(12) = satub((int16_t)s->W(4));
-    r.B(13) = satub((int16_t)s->W(5));
-    r.B(14) = satub((int16_t)s->W(6));
-    r.B(15) = satub((int16_t)s->W(7));
-#endif
-    *d = r;
+#define PACK4(F, to, reg, from) do {        \
+    r[to + 0] = F((int16_t)reg->W(from + 0));   \
+    r[to + 1] = F((int16_t)reg->W(from + 1));   \
+    r[to + 2] = F((int16_t)reg->W(from + 2));   \
+    r[to + 3] = F((int16_t)reg->W(from + 3));   \
+    } while (0)
+
+#define PACK_HELPER_B(name, F) \
+void glue(helper_pack ## name, SUFFIX)(CPUX86State *env, \
+        Reg *d, Reg *v, Reg *s)                 \
+{                                               \
+    uint8_t r[PACK_WIDTH * 2];                  \
+    int i;                                      \
+    PACK4(F, 0, v, 0);                          \
+    PACK4(F, PACK_WIDTH, s, 0);                 \
+    XMM_ONLY(                                   \
+        PACK4(F, 4, v, 4);                      \
+        PACK4(F, 12, s, 4);                     \
+        )                                       \
+    for (i = 0; i < PACK_WIDTH * 2; i++) {      \
+        d->B(i) = r[i];                         \
+    }                                           \
+    YMM_ONLY(                                   \
+        PACK4(F, 0, v, 8);                      \
+        PACK4(F, 4, v, 12);                     \
+        PACK4(F, 8, s, 8);                      \
+        PACK4(F, 12, s, 12);                    \
+        for (i = 0; i < 16; i++) {              \
+            d->B(i + 16) = r[i];                \
+        }                                       \
+        )                                       \
 }
 
-void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+PACK_HELPER_B(sswb, satsb)
+PACK_HELPER_B(uswb, satub)
+
+void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg r;
+    uint16_t r[PACK_WIDTH];
+    int i;
 
-    r.W(0) = satsw(d->L(0));
-    r.W(1) = satsw(d->L(1));
-#if SHIFT == 1
-    r.W(2) = satsw(d->L(2));
-    r.W(3) = satsw(d->L(3));
+    r[0] = satsw(v->L(0));
+    r[1] = satsw(v->L(1));
+    r[PACK_WIDTH / 2 + 0] = satsw(s->L(0));
+    r[PACK_WIDTH / 2 + 1] = satsw(s->L(1));
+#if SHIFT >= 1
+    r[2] = satsw(v->L(2));
+    r[3] = satsw(v->L(3));
+    r[6] = satsw(s->L(2));
+    r[7] = satsw(s->L(3));
 #endif
-    r.W((2 << SHIFT) + 0) = satsw(s->L(0));
-    r.W((2 << SHIFT) + 1) = satsw(s->L(1));
-#if SHIFT == 1
-    r.W(6) = satsw(s->L(2));
-    r.W(7) = satsw(s->L(3));
+    for (i = 0; i < PACK_WIDTH; i++) {
+        d->W(i) = r[i];
+    }
+#if SHIFT == 2
+    r[0] = satsw(v->L(4));
+    r[1] = satsw(v->L(5));
+    r[2] = satsw(v->L(6));
+    r[3] = satsw(v->L(7));
+    r[4] = satsw(s->L(4));
+    r[5] = satsw(s->L(5));
+    r[6] = satsw(s->L(6));
+    r[7] = satsw(s->L(7));
+    for (i = 0; i < 8; i++) {
+        d->W(i + 8) = r[i];
+    }
 #endif
-    *d = r;
 }
 
 #define UNPCK_OP(base_name, base)                                       \
                                                                         \
     void glue(helper_punpck ## base_name ## bw, SUFFIX)(CPUX86State *env,\
-                                                        Reg *d, Reg *s) \
+                                                Reg *d, Reg *v, Reg *s) \
     {                                                                   \
-        Reg r;                                                          \
+        uint8_t r[PACK_WIDTH * 2];                                      \
+        int i;                                                          \
                                                                         \
-        r.B(0) = d->B((base << (SHIFT + 2)) + 0);                       \
-        r.B(1) = s->B((base << (SHIFT + 2)) + 0);                       \
-        r.B(2) = d->B((base << (SHIFT + 2)) + 1);                       \
-        r.B(3) = s->B((base << (SHIFT + 2)) + 1);                       \
-        r.B(4) = d->B((base << (SHIFT + 2)) + 2);                       \
-        r.B(5) = s->B((base << (SHIFT + 2)) + 2);                       \
-        r.B(6) = d->B((base << (SHIFT + 2)) + 3);                       \
-        r.B(7) = s->B((base << (SHIFT + 2)) + 3);                       \
+        r[0] = v->B((base * PACK_WIDTH) + 0);                           \
+        r[1] = s->B((base * PACK_WIDTH) + 0);                           \
+        r[2] = v->B((base * PACK_WIDTH) + 1);                           \
+        r[3] = s->B((base * PACK_WIDTH) + 1);                           \
+        r[4] = v->B((base * PACK_WIDTH) + 2);                           \
+        r[5] = s->B((base * PACK_WIDTH) + 2);                           \
+        r[6] = v->B((base * PACK_WIDTH) + 3);                           \
+        r[7] = s->B((base * PACK_WIDTH) + 3);                           \
         XMM_ONLY(                                                       \
-                 r.B(8) = d->B((base << (SHIFT + 2)) + 4);              \
-                 r.B(9) = s->B((base << (SHIFT + 2)) + 4);              \
-                 r.B(10) = d->B((base << (SHIFT + 2)) + 5);             \
-                 r.B(11) = s->B((base << (SHIFT + 2)) + 5);             \
-                 r.B(12) = d->B((base << (SHIFT + 2)) + 6);             \
-                 r.B(13) = s->B((base << (SHIFT + 2)) + 6);             \
-                 r.B(14) = d->B((base << (SHIFT + 2)) + 7);             \
-                 r.B(15) = s->B((base << (SHIFT + 2)) + 7);             \
+                 r[8] = v->B((base * PACK_WIDTH) + 4);                  \
+                 r[9] = s->B((base * PACK_WIDTH) + 4);                  \
+                 r[10] = v->B((base * PACK_WIDTH) + 5);                 \
+                 r[11] = s->B((base * PACK_WIDTH) + 5);                 \
+                 r[12] = v->B((base * PACK_WIDTH) + 6);                 \
+                 r[13] = s->B((base * PACK_WIDTH) + 6);                 \
+                 r[14] = v->B((base * PACK_WIDTH) + 7);                 \
+                 r[15] = s->B((base * PACK_WIDTH) + 7);                 \
+                                                                      ) \
+        for (i = 0; i < PACK_WIDTH * 2; i++) {                          \
+            d->B(i) = r[i];                                             \
+        }                                                               \
+        YMM_ONLY(                                                       \
+                r[0] = v->B((base * 8) + 16);                           \
+                r[1] = s->B((base * 8) + 16);                           \
+                r[2] = v->B((base * 8) + 17);                           \
+                r[3] = s->B((base * 8) + 17);                           \
+                r[4] = v->B((base * 8) + 18);                           \
+                r[5] = s->B((base * 8) + 18);                           \
+                r[6] = v->B((base * 8) + 19);                           \
+                r[7] = s->B((base * 8) + 19);                           \
+                r[8] = v->B((base * 8) + 20);                           \
+                r[9] = s->B((base * 8) + 20);                           \
+                r[10] = v->B((base * 8) + 21);                          \
+                r[11] = s->B((base * 8) + 21);                          \
+                r[12] = v->B((base * 8) + 22);                          \
+                r[13] = s->B((base * 8) + 22);                          \
+                r[14] = v->B((base * 8) + 23);                          \
+                r[15] = s->B((base * 8) + 23);                          \
+                for (i = 0; i < PACK_WIDTH * 2; i++) {                  \
+                    d->B(16 + i) = r[i];                                \
+                }                                                       \
                                                                       ) \
-            *d = r;                                                     \
     }                                                                   \
                                                                         \
     void glue(helper_punpck ## base_name ## wd, SUFFIX)(CPUX86State *env,\
-                                                        Reg *d, Reg *s) \
+                                                Reg *d, Reg *v, Reg *s) \
     {                                                                   \
-        Reg r;                                                          \
+        uint16_t r[PACK_WIDTH];                                         \
+        int i;                                                          \
                                                                         \
-        r.W(0) = d->W((base << (SHIFT + 1)) + 0);                       \
-        r.W(1) = s->W((base << (SHIFT + 1)) + 0);                       \
-        r.W(2) = d->W((base << (SHIFT + 1)) + 1);                       \
-        r.W(3) = s->W((base << (SHIFT + 1)) + 1);                       \
+        r[0] = v->W((base * (PACK_WIDTH / 2)) + 0);                     \
+        r[1] = s->W((base * (PACK_WIDTH / 2)) + 0);                     \
+        r[2] = v->W((base * (PACK_WIDTH / 2)) + 1);                     \
+        r[3] = s->W((base * (PACK_WIDTH / 2)) + 1);                     \
         XMM_ONLY(                                                       \
-                 r.W(4) = d->W((base << (SHIFT + 1)) + 2);              \
-                 r.W(5) = s->W((base << (SHIFT + 1)) + 2);              \
-                 r.W(6) = d->W((base << (SHIFT + 1)) + 3);              \
-                 r.W(7) = s->W((base << (SHIFT + 1)) + 3);              \
+                 r[4] = v->W((base * 4) + 2);                           \
+                 r[5] = s->W((base * 4) + 2);                           \
+                 r[6] = v->W((base * 4) + 3);                           \
+                 r[7] = s->W((base * 4) + 3);                           \
+                                                                      ) \
+        for (i = 0; i < PACK_WIDTH; i++) {                              \
+            d->W(i) = r[i];                                             \
+        }                                                               \
+        YMM_ONLY(                                                       \
+                r[0] = v->W((base * 4) + 8);                            \
+                r[1] = s->W((base * 4) + 8);                            \
+                r[2] = v->W((base * 4) + 9);                            \
+                r[3] = s->W((base * 4) + 9);                            \
+                r[4] = v->W((base * 4) + 10);                           \
+                r[5] = s->W((base * 4) + 10);                           \
+                r[6] = v->W((base * 4) + 11);                           \
+                r[7] = s->W((base * 4) + 11);                           \
+                for (i = 0; i < PACK_WIDTH; i++) {                      \
+                    d->W(i + 8) = r[i];                                 \
+                }                                                       \
                                                                       ) \
-            *d = r;                                                     \
     }                                                                   \
                                                                         \
     void glue(helper_punpck ## base_name ## dq, SUFFIX)(CPUX86State *env,\
-                                                        Reg *d, Reg *s) \
+                                                Reg *d, Reg *v, Reg *s) \
     {                                                                   \
-        Reg r;                                                          \
+        uint32_t r[4];                                                  \
                                                                         \
-        r.L(0) = d->L((base << SHIFT) + 0);                             \
-        r.L(1) = s->L((base << SHIFT) + 0);                             \
+        r[0] = v->L((base * (PACK_WIDTH / 4)) + 0);                     \
+        r[1] = s->L((base * (PACK_WIDTH / 4)) + 0);                     \
         XMM_ONLY(                                                       \
-                 r.L(2) = d->L((base << SHIFT) + 1);                    \
-                 r.L(3) = s->L((base << SHIFT) + 1);                    \
+                 r[2] = v->L((base * 2) + 1);                           \
+                 r[3] = s->L((base * 2) + 1);                           \
+                 d->L(2) = r[2];                                        \
+                 d->L(3) = r[3];                                        \
+                                                                      ) \
+        d->L(0) = r[0];                                                 \
+        d->L(1) = r[1];                                                 \
+        YMM_ONLY(                                                       \
+                 r[0] = v->L((base * 2) + 4);                           \
+                 r[1] = s->L((base * 2) + 4);                           \
+                 r[2] = v->L((base * 2) + 5);                           \
+                 r[3] = s->L((base * 2) + 5);                           \
+                 d->L(4) = r[0];                                        \
+                 d->L(5) = r[1];                                        \
+                 d->L(6) = r[2];                                        \
+                 d->L(7) = r[3];                                        \
                                                                       ) \
-            *d = r;                                                     \
     }                                                                   \
                                                                         \
     XMM_ONLY(                                                           \
-             void glue(helper_punpck ## base_name ## qdq, SUFFIX)(CPUX86State \
-                                                                  *env, \
-                                                                  Reg *d, \
-                                                                  Reg *s) \
+             void glue(helper_punpck ## base_name ## qdq, SUFFIX)(      \
+                        CPUX86State *env, Reg *d, Reg *v, Reg *s)       \
              {                                                          \
-                 Reg r;                                                 \
+                 uint64_t r[2];                                         \
                                                                         \
-                 r.Q(0) = d->Q(base);                                   \
-                 r.Q(1) = s->Q(base);                                   \
-                 *d = r;                                                \
+                 r[0] = v->Q(base);                                     \
+                 r[1] = s->Q(base);                                     \
+                 d->Q(0) = r[0];                                        \
+                 d->Q(1) = r[1];                                        \
+                 YMM_ONLY(                                              \
+                     r[0] = v->Q(base + 2);                             \
+                     r[1] = s->Q(base + 2);                             \
+                     d->Q(2) = r[0];                                    \
+                     d->Q(3) = r[1];                                    \
+                                                                      ) \
              }                                                          \
                                                                         )
 
 UNPCK_OP(l, 0)
 UNPCK_OP(h, 1)
 
+#undef PACK_WIDTH
+#undef PACK_HELPER_B
+#undef PACK4
+
+
 /* 3DNow! float ops */
 #if SHIFT == 0
 void helper_pi2fd(CPUX86State *env, MMXReg *d, MMXReg *s)
@@ -1429,123 +1937,176 @@ void helper_pswapd(CPUX86State *env, MMXReg *d, MMXReg *s)
 #endif
 
 /* SSSE3 op helpers */
-void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg r;
+#if SHIFT == 0
+    uint8_t r[8];
 
-    for (i = 0; i < (8 << SHIFT); i++) {
-        r.B(i) = (s->B(i) & 0x80) ? 0 : (d->B(s->B(i) & ((8 << SHIFT) - 1)));
+    for (i = 0; i < 8; i++) {
+        r[i] = (s->B(i) & 0x80) ? 0 : (v->B(s->B(i) & 7));
     }
+    for (i = 0; i < 8; i++) {
+        d->B(i) = r[i];
+    }
+#else
+    uint8_t r[16];
 
-    *d = r;
+    for (i = 0; i < 16; i++) {
+        r[i] = (s->B(i) & 0x80) ? 0 : (v->B(s->B(i) & 0xf));
+    }
+    for (i = 0; i < 16; i++) {
+        d->B(i) = r[i];
+    }
+#if SHIFT == 2
+    for (i = 0; i < 16; i++) {
+        r[i] = (s->B(i + 16) & 0x80) ? 0 : (v->B((s->B(i + 16) & 0xf) + 16));
+    }
+    for (i = 0; i < 16; i++) {
+        d->B(i + 16) = r[i];
+    }
+#endif
+#endif
 }
 
-void glue(helper_phaddw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
+#if SHIFT == 0
 
-    Reg r;
+#define SSE_HELPER_HW(name, F)  \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
+{                               \
+    uint16_t r[4];               \
+    r[0] = F(v->W(0), v->W(1)); \
+    r[1] = F(v->W(2), v->W(3)); \
+    r[2] = F(s->W(0), s->W(1)); \
+    r[3] = F(s->W(2), s->W(3)); \
+    d->W(0) = r[0];             \
+    d->W(1) = r[1];             \
+    d->W(2) = r[2];             \
+    d->W(3) = r[3];             \
+}
+
+#define SSE_HELPER_HL(name, F)  \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
+{                               \
+    uint32_t r0, r1;             \
+    r0 = F(v->L(0), v->L(1));   \
+    r1 = F(s->L(0), s->L(1));   \
+    d->L(0) = r0;               \
+    d->L(1) = r1;               \
+}
 
-    r.W(0) = (int16_t)d->W(0) + (int16_t)d->W(1);
-    r.W(1) = (int16_t)d->W(2) + (int16_t)d->W(3);
-    XMM_ONLY(r.W(2) = (int16_t)d->W(4) + (int16_t)d->W(5));
-    XMM_ONLY(r.W(3) = (int16_t)d->W(6) + (int16_t)d->W(7));
-    r.W((2 << SHIFT) + 0) = (int16_t)s->W(0) + (int16_t)s->W(1);
-    r.W((2 << SHIFT) + 1) = (int16_t)s->W(2) + (int16_t)s->W(3);
-    XMM_ONLY(r.W(6) = (int16_t)s->W(4) + (int16_t)s->W(5));
-    XMM_ONLY(r.W(7) = (int16_t)s->W(6) + (int16_t)s->W(7));
+#else
 
-    *d = r;
+#define SSE_HELPER_HW(name, F)  \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
+{                                   \
+    int32_t r[8];                   \
+    r[0] = F(v->W(0), v->W(1));     \
+    r[1] = F(v->W(2), v->W(3));     \
+    r[2] = F(v->W(4), v->W(5));     \
+    r[3] = F(v->W(6), v->W(7));     \
+    r[4] = F(s->W(0), s->W(1));     \
+    r[5] = F(s->W(2), s->W(3));     \
+    r[6] = F(s->W(4), s->W(5));     \
+    r[7] = F(s->W(6), s->W(7));     \
+    d->W(0) = r[0];                 \
+    d->W(1) = r[1];                 \
+    d->W(2) = r[2];                 \
+    d->W(3) = r[3];                 \
+    d->W(4) = r[4];                 \
+    d->W(5) = r[5];                 \
+    d->W(6) = r[6];                 \
+    d->W(7) = r[7];                 \
+    YMM_ONLY(                       \
+    r[0] = F(v->W(8), v->W(9));     \
+    r[1] = F(v->W(10), v->W(11));   \
+    r[2] = F(v->W(12), v->W(13));   \
+    r[3] = F(v->W(14), v->W(15));   \
+    r[4] = F(s->W(8), s->W(9));     \
+    r[5] = F(s->W(10), s->W(11));   \
+    r[6] = F(s->W(12), s->W(13));   \
+    r[7] = F(s->W(14), s->W(15));   \
+    d->W(8) = r[0];                 \
+    d->W(9) = r[1];                 \
+    d->W(10) = r[2];                \
+    d->W(11) = r[3];                \
+    d->W(12) = r[4];                \
+    d->W(13) = r[5];                \
+    d->W(14) = r[6];                \
+    d->W(15) = r[7];                \
+    )                               \
+}
+
+#define SSE_HELPER_HL(name, F)  \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
+{                               \
+    int32_t r0, r1, r2, r3;     \
+    r0 = F(v->L(0), v->L(1));   \
+    r1 = F(v->L(2), v->L(3));   \
+    r2 = F(s->L(0), s->L(1));   \
+    r3 = F(s->L(2), s->L(3));   \
+    d->L(0) = r0;               \
+    d->L(1) = r1;               \
+    d->L(2) = r2;               \
+    d->L(3) = r3;               \
+    YMM_ONLY(                   \
+    r0 = F(v->L(4), v->L(5));   \
+    r1 = F(v->L(6), v->L(7));   \
+    r2 = F(s->L(4), s->L(5));   \
+    r3 = F(s->L(6), s->L(7));   \
+    d->L(4) = r0;               \
+    d->L(5) = r1;               \
+    d->L(6) = r2;               \
+    d->L(7) = r3;               \
+    )                           \
 }
-
-void glue(helper_phaddd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;
-
-    r.L(0) = (int32_t)d->L(0) + (int32_t)d->L(1);
-    XMM_ONLY(r.L(1) = (int32_t)d->L(2) + (int32_t)d->L(3));
-    r.L((1 << SHIFT) + 0) = (int32_t)s->L(0) + (int32_t)s->L(1);
-    XMM_ONLY(r.L(3) = (int32_t)s->L(2) + (int32_t)s->L(3));
-
-    *d = r;
-}
-
-void glue(helper_phaddsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;
-
-    r.W(0) = satsw((int16_t)d->W(0) + (int16_t)d->W(1));
-    r.W(1) = satsw((int16_t)d->W(2) + (int16_t)d->W(3));
-    XMM_ONLY(r.W(2) = satsw((int16_t)d->W(4) + (int16_t)d->W(5)));
-    XMM_ONLY(r.W(3) = satsw((int16_t)d->W(6) + (int16_t)d->W(7)));
-    r.W((2 << SHIFT) + 0) = satsw((int16_t)s->W(0) + (int16_t)s->W(1));
-    r.W((2 << SHIFT) + 1) = satsw((int16_t)s->W(2) + (int16_t)s->W(3));
-    XMM_ONLY(r.W(6) = satsw((int16_t)s->W(4) + (int16_t)s->W(5)));
-    XMM_ONLY(r.W(7) = satsw((int16_t)s->W(6) + (int16_t)s->W(7)));
-
-    *d = r;
-}
-
-void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    d->W(0) = satsw((int8_t)s->B(0) * (uint8_t)d->B(0) +
-                    (int8_t)s->B(1) * (uint8_t)d->B(1));
-    d->W(1) = satsw((int8_t)s->B(2) * (uint8_t)d->B(2) +
-                    (int8_t)s->B(3) * (uint8_t)d->B(3));
-    d->W(2) = satsw((int8_t)s->B(4) * (uint8_t)d->B(4) +
-                    (int8_t)s->B(5) * (uint8_t)d->B(5));
-    d->W(3) = satsw((int8_t)s->B(6) * (uint8_t)d->B(6) +
-                    (int8_t)s->B(7) * (uint8_t)d->B(7));
-#if SHIFT == 1
-    d->W(4) = satsw((int8_t)s->B(8) * (uint8_t)d->B(8) +
-                    (int8_t)s->B(9) * (uint8_t)d->B(9));
-    d->W(5) = satsw((int8_t)s->B(10) * (uint8_t)d->B(10) +
-                    (int8_t)s->B(11) * (uint8_t)d->B(11));
-    d->W(6) = satsw((int8_t)s->B(12) * (uint8_t)d->B(12) +
-                    (int8_t)s->B(13) * (uint8_t)d->B(13));
-    d->W(7) = satsw((int8_t)s->B(14) * (uint8_t)d->B(14) +
-                    (int8_t)s->B(15) * (uint8_t)d->B(15));
 #endif
-}
 
-void glue(helper_phsubw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    d->W(0) = (int16_t)d->W(0) - (int16_t)d->W(1);
-    d->W(1) = (int16_t)d->W(2) - (int16_t)d->W(3);
-    XMM_ONLY(d->W(2) = (int16_t)d->W(4) - (int16_t)d->W(5));
-    XMM_ONLY(d->W(3) = (int16_t)d->W(6) - (int16_t)d->W(7));
-    d->W((2 << SHIFT) + 0) = (int16_t)s->W(0) - (int16_t)s->W(1);
-    d->W((2 << SHIFT) + 1) = (int16_t)s->W(2) - (int16_t)s->W(3);
-    XMM_ONLY(d->W(6) = (int16_t)s->W(4) - (int16_t)s->W(5));
-    XMM_ONLY(d->W(7) = (int16_t)s->W(6) - (int16_t)s->W(7));
-}
-
-void glue(helper_phsubd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    d->L(0) = (int32_t)d->L(0) - (int32_t)d->L(1);
-    XMM_ONLY(d->L(1) = (int32_t)d->L(2) - (int32_t)d->L(3));
-    d->L((1 << SHIFT) + 0) = (int32_t)s->L(0) - (int32_t)s->L(1);
-    XMM_ONLY(d->L(3) = (int32_t)s->L(2) - (int32_t)s->L(3));
-}
-
-void glue(helper_phsubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    d->W(0) = satsw((int16_t)d->W(0) - (int16_t)d->W(1));
-    d->W(1) = satsw((int16_t)d->W(2) - (int16_t)d->W(3));
-    XMM_ONLY(d->W(2) = satsw((int16_t)d->W(4) - (int16_t)d->W(5)));
-    XMM_ONLY(d->W(3) = satsw((int16_t)d->W(6) - (int16_t)d->W(7)));
-    d->W((2 << SHIFT) + 0) = satsw((int16_t)s->W(0) - (int16_t)s->W(1));
-    d->W((2 << SHIFT) + 1) = satsw((int16_t)s->W(2) - (int16_t)s->W(3));
-    XMM_ONLY(d->W(6) = satsw((int16_t)s->W(4) - (int16_t)s->W(5)));
-    XMM_ONLY(d->W(7) = satsw((int16_t)s->W(6) - (int16_t)s->W(7)));
+SSE_HELPER_HW(phaddw, FADD)
+SSE_HELPER_HW(phsubw, FSUB)
+SSE_HELPER_HW(phaddsw, FADDSW)
+SSE_HELPER_HW(phsubsw, FSUBSW)
+SSE_HELPER_HL(phaddd, FADD)
+SSE_HELPER_HL(phsubd, FSUB)
+
+#undef SSE_HELPER_HW
+#undef SSE_HELPER_HL
+
+void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    d->W(0) = satsw((int8_t)s->B(0) * (uint8_t)v->B(0) +
+                    (int8_t)s->B(1) * (uint8_t)v->B(1));
+    d->W(1) = satsw((int8_t)s->B(2) * (uint8_t)v->B(2) +
+                    (int8_t)s->B(3) * (uint8_t)v->B(3));
+    d->W(2) = satsw((int8_t)s->B(4) * (uint8_t)v->B(4) +
+                    (int8_t)s->B(5) * (uint8_t)v->B(5));
+    d->W(3) = satsw((int8_t)s->B(6) * (uint8_t)v->B(6) +
+                    (int8_t)s->B(7) * (uint8_t)v->B(7));
+#if SHIFT >= 1
+    d->W(4) = satsw((int8_t)s->B(8) * (uint8_t)v->B(8) +
+                    (int8_t)s->B(9) * (uint8_t)v->B(9));
+    d->W(5) = satsw((int8_t)s->B(10) * (uint8_t)v->B(10) +
+                    (int8_t)s->B(11) * (uint8_t)v->B(11));
+    d->W(6) = satsw((int8_t)s->B(12) * (uint8_t)v->B(12) +
+                    (int8_t)s->B(13) * (uint8_t)v->B(13));
+    d->W(7) = satsw((int8_t)s->B(14) * (uint8_t)v->B(14) +
+                    (int8_t)s->B(15) * (uint8_t)v->B(15));
+#if SHIFT == 2
+    int i;
+    for (i = 8; i < 16; i++) {
+        d->W(i) = satsw((int8_t)s->B(i * 2) * (uint8_t)v->B(i * 2) +
+                        (int8_t)s->B(i * 2 + 1) * (uint8_t)v->B(i * 2 + 1));
+    }
+#endif
+#endif
 }
 
-#define FABSB(_, x) (x > INT8_MAX  ? -(int8_t)x : x)
-#define FABSW(_, x) (x > INT16_MAX ? -(int16_t)x : x)
-#define FABSL(_, x) (x > INT32_MAX ? -(int32_t)x : x)
-SSE_HELPER_B(helper_pabsb, FABSB)
-SSE_HELPER_W(helper_pabsw, FABSW)
-SSE_HELPER_L(helper_pabsd, FABSL)
+#define FABSB(x) (x > INT8_MAX  ? -(int8_t)x : x)
+#define FABSW(x) (x > INT16_MAX ? -(int16_t)x : x)
+#define FABSL(x) (x > INT32_MAX ? -(int32_t)x : x)
+SSE_HELPER_1(helper_pabsb, B, 8, FABSB)
+SSE_HELPER_1(helper_pabsw, W, 4, FABSW)
+SSE_HELPER_1(helper_pabsd, L, 2, FABSL)
 
 #define FMULHRSW(d, s) (((int16_t) d * (int16_t)s + 0x4000) >> 15)
 SSE_HELPER_W(helper_pmulhrsw, FMULHRSW)
@@ -1557,104 +2118,119 @@ SSE_HELPER_B(helper_psignb, FSIGNB)
 SSE_HELPER_W(helper_psignw, FSIGNW)
 SSE_HELPER_L(helper_psignd, FSIGNL)
 
-void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,
                                   int32_t shift)
 {
-    Reg r;
-
     /* XXX could be checked during translation */
-    if (shift >= (16 << SHIFT)) {
-        r.Q(0) = 0;
-        XMM_ONLY(r.Q(1) = 0);
+    if (shift >= (SHIFT ? 32 : 16)) {
+        d->Q(0) = 0;
+        XMM_ONLY(d->Q(1) = 0);
+#if SHIFT == 2
+        d->Q(2) = 0;
+        d->Q(3) = 0;
+#endif
     } else {
         shift <<= 3;
 #define SHR(v, i) (i < 64 && i > -64 ? i > 0 ? v >> (i) : (v << -(i)) : 0)
 #if SHIFT == 0
-        r.Q(0) = SHR(s->Q(0), shift - 0) |
-            SHR(d->Q(0), shift -  64);
+        d->Q(0) = SHR(s->Q(0), shift - 0) |
+            SHR(v->Q(0), shift -  64);
 #else
-        r.Q(0) = SHR(s->Q(0), shift - 0) |
-            SHR(s->Q(1), shift -  64) |
-            SHR(d->Q(0), shift - 128) |
-            SHR(d->Q(1), shift - 192);
-        r.Q(1) = SHR(s->Q(0), shift + 64) |
-            SHR(s->Q(1), shift -   0) |
-            SHR(d->Q(0), shift -  64) |
-            SHR(d->Q(1), shift - 128);
+        uint64_t r0, r1;
+
+        r0 = SHR(s->Q(0), shift - 0) |
+             SHR(s->Q(1), shift -  64) |
+             SHR(v->Q(0), shift - 128) |
+             SHR(v->Q(1), shift - 192);
+        r1 = SHR(s->Q(0), shift + 64) |
+             SHR(s->Q(1), shift -   0) |
+             SHR(v->Q(0), shift -  64) |
+             SHR(v->Q(1), shift - 128);
+        d->Q(0) = r0;
+        d->Q(1) = r1;
+#if SHIFT == 2
+        r0 = SHR(s->Q(2), shift - 0) |
+             SHR(s->Q(3), shift -  64) |
+             SHR(v->Q(2), shift - 128) |
+             SHR(v->Q(3), shift - 192);
+        r1 = SHR(s->Q(2), shift + 64) |
+             SHR(s->Q(3), shift -   0) |
+             SHR(v->Q(2), shift -  64) |
+             SHR(v->Q(3), shift - 128);
+        d->Q(2) = r0;
+        d->Q(3) = r1;
+#endif
 #endif
 #undef SHR
     }
-
-    *d = r;
 }
 
-#define XMM0 (env->xmm_regs[0])
+#if SHIFT >= 1
+
+#define BLEND_V128(elem, num, F, b) do {                                    \
+    d->elem(b + 0) = F(v->elem(b + 0), s->elem(b + 0), m->elem(b + 0));     \
+    d->elem(b + 1) = F(v->elem(b + 1), s->elem(b + 1), m->elem(b + 1));     \
+    if (num > 2) {                                                          \
+        d->elem(b + 2) = F(v->elem(b + 2), s->elem(b + 2), m->elem(b + 2)); \
+        d->elem(b + 3) = F(v->elem(b + 3), s->elem(b + 3), m->elem(b + 3)); \
+    }                                                                       \
+    if (num > 4) {                                                          \
+        d->elem(b + 4) = F(v->elem(b + 4), s->elem(b + 4), m->elem(b + 4)); \
+        d->elem(b + 5) = F(v->elem(b + 5), s->elem(b + 5), m->elem(b + 5)); \
+        d->elem(b + 6) = F(v->elem(b + 6), s->elem(b + 6), m->elem(b + 6)); \
+        d->elem(b + 7) = F(v->elem(b + 7), s->elem(b + 7), m->elem(b + 7)); \
+    }                                                                       \
+    if (num > 8) {                                                          \
+        d->elem(b + 8) = F(v->elem(b + 8), s->elem(b + 8), m->elem(b + 8)); \
+        d->elem(b + 9) = F(v->elem(b + 9), s->elem(b + 9), m->elem(b + 9)); \
+        d->elem(b + 10) = F(v->elem(b + 10), s->elem(b + 10), m->elem(b + 10));\
+        d->elem(b + 11) = F(v->elem(b + 11), s->elem(b + 11), m->elem(b + 11));\
+        d->elem(b + 12) = F(v->elem(b + 12), s->elem(b + 12), m->elem(b + 12));\
+        d->elem(b + 13) = F(v->elem(b + 13), s->elem(b + 13), m->elem(b + 13));\
+        d->elem(b + 14) = F(v->elem(b + 14), s->elem(b + 14), m->elem(b + 14));\
+        d->elem(b + 15) = F(v->elem(b + 15), s->elem(b + 15), m->elem(b + 15));\
+    }                                                                       \
+    } while (0)
 
-#if SHIFT == 1
 #define SSE_HELPER_V(name, elem, num, F)                                \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)           \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,   \
+                            Reg *m)                                     \
     {                                                                   \
-        d->elem(0) = F(d->elem(0), s->elem(0), XMM0.elem(0));           \
-        d->elem(1) = F(d->elem(1), s->elem(1), XMM0.elem(1));           \
-        if (num > 2) {                                                  \
-            d->elem(2) = F(d->elem(2), s->elem(2), XMM0.elem(2));       \
-            d->elem(3) = F(d->elem(3), s->elem(3), XMM0.elem(3));       \
-            if (num > 4) {                                              \
-                d->elem(4) = F(d->elem(4), s->elem(4), XMM0.elem(4));   \
-                d->elem(5) = F(d->elem(5), s->elem(5), XMM0.elem(5));   \
-                d->elem(6) = F(d->elem(6), s->elem(6), XMM0.elem(6));   \
-                d->elem(7) = F(d->elem(7), s->elem(7), XMM0.elem(7));   \
-                if (num > 8) {                                          \
-                    d->elem(8) = F(d->elem(8), s->elem(8), XMM0.elem(8)); \
-                    d->elem(9) = F(d->elem(9), s->elem(9), XMM0.elem(9)); \
-                    d->elem(10) = F(d->elem(10), s->elem(10), XMM0.elem(10)); \
-                    d->elem(11) = F(d->elem(11), s->elem(11), XMM0.elem(11)); \
-                    d->elem(12) = F(d->elem(12), s->elem(12), XMM0.elem(12)); \
-                    d->elem(13) = F(d->elem(13), s->elem(13), XMM0.elem(13)); \
-                    d->elem(14) = F(d->elem(14), s->elem(14), XMM0.elem(14)); \
-                    d->elem(15) = F(d->elem(15), s->elem(15), XMM0.elem(15)); \
-                }                                                       \
-            }                                                           \
-        }                                                               \
-    }
+        BLEND_V128(elem, num, F, 0);                                    \
+        YMM_ONLY(BLEND_V128(elem, num, F, num);)                        \
+    }
+
+#define BLEND_I128(elem, num, F, b) do {                                    \
+    d->elem(b + 0) = F(v->elem(b + 0), s->elem(b + 0), ((imm >> 0) & 1));   \
+    d->elem(b + 1) = F(v->elem(b + 1), s->elem(b + 1), ((imm >> 1) & 1));   \
+    if (num > 2) {                                                          \
+        d->elem(b + 2) = F(v->elem(b + 2), s->elem(b + 2), ((imm >> 2) & 1)); \
+        d->elem(b + 3) = F(v->elem(b + 3), s->elem(b + 3), ((imm >> 3) & 1)); \
+    }                                                                       \
+    if (num > 4) {                                                          \
+        d->elem(b + 4) = F(v->elem(b + 4), s->elem(b + 4), ((imm >> 4) & 1)); \
+        d->elem(b + 5) = F(v->elem(b + 5), s->elem(b + 5), ((imm >> 5) & 1)); \
+        d->elem(b + 6) = F(v->elem(b + 6), s->elem(b + 6), ((imm >> 6) & 1)); \
+        d->elem(b + 7) = F(v->elem(b + 7), s->elem(b + 7), ((imm >> 7) & 1)); \
+    }                                                                       \
+    } while (0)
 
 #define SSE_HELPER_I(name, elem, num, F)                                \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t imm) \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,   \
+                            uint32_t imm)                               \
     {                                                                   \
-        d->elem(0) = F(d->elem(0), s->elem(0), ((imm >> 0) & 1));       \
-        d->elem(1) = F(d->elem(1), s->elem(1), ((imm >> 1) & 1));       \
-        if (num > 2) {                                                  \
-            d->elem(2) = F(d->elem(2), s->elem(2), ((imm >> 2) & 1));   \
-            d->elem(3) = F(d->elem(3), s->elem(3), ((imm >> 3) & 1));   \
-            if (num > 4) {                                              \
-                d->elem(4) = F(d->elem(4), s->elem(4), ((imm >> 4) & 1)); \
-                d->elem(5) = F(d->elem(5), s->elem(5), ((imm >> 5) & 1)); \
-                d->elem(6) = F(d->elem(6), s->elem(6), ((imm >> 6) & 1)); \
-                d->elem(7) = F(d->elem(7), s->elem(7), ((imm >> 7) & 1)); \
-                if (num > 8) {                                          \
-                    d->elem(8) = F(d->elem(8), s->elem(8), ((imm >> 8) & 1)); \
-                    d->elem(9) = F(d->elem(9), s->elem(9), ((imm >> 9) & 1)); \
-                    d->elem(10) = F(d->elem(10), s->elem(10),           \
-                                    ((imm >> 10) & 1));                 \
-                    d->elem(11) = F(d->elem(11), s->elem(11),           \
-                                    ((imm >> 11) & 1));                 \
-                    d->elem(12) = F(d->elem(12), s->elem(12),           \
-                                    ((imm >> 12) & 1));                 \
-                    d->elem(13) = F(d->elem(13), s->elem(13),           \
-                                    ((imm >> 13) & 1));                 \
-                    d->elem(14) = F(d->elem(14), s->elem(14),           \
-                                    ((imm >> 14) & 1));                 \
-                    d->elem(15) = F(d->elem(15), s->elem(15),           \
-                                    ((imm >> 15) & 1));                 \
-                }                                                       \
-            }                                                           \
-        }                                                               \
+        BLEND_I128(elem, num, F, 0);                                    \
+        YMM_ONLY(                                                       \
+        if (num < 8)                                                    \
+            imm >>= num;                                                \
+        BLEND_I128(elem, num, F, num);                                  \
+        )                                                               \
     }
 
 /* SSE4.1 op helpers */
-#define FBLENDVB(d, s, m) ((m & 0x80) ? s : d)
-#define FBLENDVPS(d, s, m) ((m & 0x80000000) ? s : d)
-#define FBLENDVPD(d, s, m) ((m & 0x8000000000000000LL) ? s : d)
+#define FBLENDVB(v, s, m) ((m & 0x80) ? s : v)
+#define FBLENDVPS(v, s, m) ((m & 0x80000000) ? s : v)
+#define FBLENDVPD(v, s, m) ((m & 0x8000000000000000LL) ? s : v)
 SSE_HELPER_V(helper_pblendvb, B, 16, FBLENDVB)
 SSE_HELPER_V(helper_blendvps, L, 4, FBLENDVPS)
 SSE_HELPER_V(helper_blendvpd, Q, 2, FBLENDVPD)
@@ -1664,14 +2240,28 @@ void glue(helper_ptest, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     uint64_t zf = (s->Q(0) &  d->Q(0)) | (s->Q(1) &  d->Q(1));
     uint64_t cf = (s->Q(0) & ~d->Q(0)) | (s->Q(1) & ~d->Q(1));
 
+#if SHIFT == 2
+    zf |= (s->Q(2) &  d->Q(2)) | (s->Q(3) &  d->Q(3));
+    cf |= (s->Q(2) & ~d->Q(2)) | (s->Q(3) & ~d->Q(3));
+#endif
     CC_SRC = (zf ? 0 : CC_Z) | (cf ? 0 : CC_C);
 }
 
 #define SSE_HELPER_F(name, elem, num, F)        \
     void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)     \
     {                                           \
-        if (num > 2) {                          \
-            if (num > 4) {                      \
+        if (num * SHIFT > 2) {                  \
+            if (num * SHIFT > 8) {              \
+                d->elem(15) = F(15);            \
+                d->elem(14) = F(14);            \
+                d->elem(13) = F(13);            \
+                d->elem(12) = F(12);            \
+                d->elem(11) = F(11);            \
+                d->elem(10) = F(10);            \
+                d->elem(9) = F(9);              \
+                d->elem(8) = F(8);              \
+            }                                   \
+            if (num * SHIFT > 4) {              \
                 d->elem(7) = F(7);              \
                 d->elem(6) = F(6);              \
                 d->elem(5) = F(5);              \
@@ -1697,28 +2287,57 @@ SSE_HELPER_F(helper_pmovzxwd, L, 4, s->W)
 SSE_HELPER_F(helper_pmovzxwq, Q, 2, s->W)
 SSE_HELPER_F(helper_pmovzxdq, Q, 2, s->L)
 
-void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    d->Q(0) = (int64_t)(int32_t) d->L(0) * (int32_t) s->L(0);
-    d->Q(1) = (int64_t)(int32_t) d->L(2) * (int32_t) s->L(2);
+    d->Q(0) = (int64_t)(int32_t) v->L(0) * (int32_t) s->L(0);
+    d->Q(1) = (int64_t)(int32_t) v->L(2) * (int32_t) s->L(2);
+#if SHIFT == 2
+    d->Q(2) = (int64_t)(int32_t) v->L(4) * (int32_t) s->L(4);
+    d->Q(3) = (int64_t)(int32_t) v->L(6) * (int32_t) s->L(6);
+#endif
 }
 
 #define FCMPEQQ(d, s) (d == s ? -1 : 0)
 SSE_HELPER_Q(helper_pcmpeqq, FCMPEQQ)
 
-void glue(helper_packusdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;
-
-    r.W(0) = satuw((int32_t) d->L(0));
-    r.W(1) = satuw((int32_t) d->L(1));
-    r.W(2) = satuw((int32_t) d->L(2));
-    r.W(3) = satuw((int32_t) d->L(3));
-    r.W(4) = satuw((int32_t) s->L(0));
-    r.W(5) = satuw((int32_t) s->L(1));
-    r.W(6) = satuw((int32_t) s->L(2));
-    r.W(7) = satuw((int32_t) s->L(3));
-    *d = r;
+void glue(helper_packusdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    uint16_t r[8];
+
+    r[0] = satuw((int32_t) v->L(0));
+    r[1] = satuw((int32_t) v->L(1));
+    r[2] = satuw((int32_t) v->L(2));
+    r[3] = satuw((int32_t) v->L(3));
+    r[4] = satuw((int32_t) s->L(0));
+    r[5] = satuw((int32_t) s->L(1));
+    r[6] = satuw((int32_t) s->L(2));
+    r[7] = satuw((int32_t) s->L(3));
+    d->W(0) = r[0];
+    d->W(1) = r[1];
+    d->W(2) = r[2];
+    d->W(3) = r[3];
+    d->W(4) = r[4];
+    d->W(5) = r[5];
+    d->W(6) = r[6];
+    d->W(7) = r[7];
+#if SHIFT == 2
+    r[0] = satuw((int32_t) v->L(4));
+    r[1] = satuw((int32_t) v->L(5));
+    r[2] = satuw((int32_t) v->L(6));
+    r[3] = satuw((int32_t) v->L(7));
+    r[4] = satuw((int32_t) s->L(4));
+    r[5] = satuw((int32_t) s->L(5));
+    r[6] = satuw((int32_t) s->L(6));
+    r[7] = satuw((int32_t) s->L(7));
+    d->W(8) = r[0];
+    d->W(9) = r[1];
+    d->W(10) = r[2];
+    d->W(11) = r[3];
+    d->W(12) = r[4];
+    d->W(13) = r[5];
+    d->W(14) = r[6];
+    d->W(15) = r[7];
+#endif
 }
 
 #define FMINSB(d, s) MIN((int8_t)d, (int8_t)s)
@@ -1737,6 +2356,7 @@ SSE_HELPER_L(helper_pmaxud, MAX)
 #define FMULLD(d, s) ((int32_t)d * (int32_t)s)
 SSE_HELPER_L(helper_pmulld, FMULLD)
 
+#if SHIFT == 1
 void glue(helper_phminposuw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     int idx = 0;
@@ -1768,6 +2388,7 @@ void glue(helper_phminposuw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     d->L(1) = 0;
     d->Q(1) = 0;
 }
+#endif
 
 void glue(helper_roundps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                   uint32_t mode)
@@ -1797,6 +2418,12 @@ void glue(helper_roundps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     d->ZMM_S(1) = float32_round_to_int(s->ZMM_S(1), &env->sse_status);
     d->ZMM_S(2) = float32_round_to_int(s->ZMM_S(2), &env->sse_status);
     d->ZMM_S(3) = float32_round_to_int(s->ZMM_S(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = float32_round_to_int(s->ZMM_S(4), &env->sse_status);
+    d->ZMM_S(5) = float32_round_to_int(s->ZMM_S(5), &env->sse_status);
+    d->ZMM_S(6) = float32_round_to_int(s->ZMM_S(6), &env->sse_status);
+    d->ZMM_S(7) = float32_round_to_int(s->ZMM_S(7), &env->sse_status);
+#endif
 
     if (mode & (1 << 3) && !(old_flags & float_flag_inexact)) {
         set_float_exception_flags(get_float_exception_flags(&env->sse_status) &
@@ -1832,6 +2459,10 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 
     d->ZMM_D(0) = float64_round_to_int(s->ZMM_D(0), &env->sse_status);
     d->ZMM_D(1) = float64_round_to_int(s->ZMM_D(1), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_D(2) = float64_round_to_int(s->ZMM_D(2), &env->sse_status);
+    d->ZMM_D(3) = float64_round_to_int(s->ZMM_D(3), &env->sse_status);
+#endif
 
     if (mode & (1 << 3) && !(old_flags & float_flag_inexact)) {
         set_float_exception_flags(get_float_exception_flags(&env->sse_status) &
@@ -1841,7 +2472,8 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     env->sse_status.float_rounding_mode = prev_rounding_mode;
 }
 
-void glue(helper_roundss, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+#if SHIFT == 1
+void helper_roundss_xmm(CPUX86State *env, Reg *d, Reg *s,
                                   uint32_t mode)
 {
     uint8_t old_flags = get_float_exception_flags(&env->sse_status);
@@ -1875,7 +2507,7 @@ void glue(helper_roundss, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     env->sse_status.float_rounding_mode = prev_rounding_mode;
 }
 
-void glue(helper_roundsd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+void helper_roundsd_xmm(CPUX86State *env, Reg *d, Reg *s,
                                   uint32_t mode)
 {
     uint8_t old_flags = get_float_exception_flags(&env->sse_status);
@@ -1908,99 +2540,158 @@ void glue(helper_roundsd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     }
     env->sse_status.float_rounding_mode = prev_rounding_mode;
 }
+#endif
 
-#define FBLENDP(d, s, m) (m ? s : d)
+#define FBLENDP(v, s, m) (m ? s : v)
 SSE_HELPER_I(helper_blendps, L, 4, FBLENDP)
 SSE_HELPER_I(helper_blendpd, Q, 2, FBLENDP)
 SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP)
 
-void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
+void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,
+                               uint32_t mask)
 {
-    float32 iresult = float32_zero;
+    float32 prod, iresult, iresult2;
 
+    /*
+     * We must evaluate (A+B)+(C+D), not ((A+B)+C)+D
+     * to correctly round the intermediate results
+     */
     if (mask & (1 << 4)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(0), s->ZMM_S(0),
-                                          &env->sse_status),
-                              &env->sse_status);
+        iresult = float32_mul(v->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
+    } else {
+        iresult = float32_zero;
     }
     if (mask & (1 << 5)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(1), s->ZMM_S(1),
-                                          &env->sse_status),
-                              &env->sse_status);
+        prod = float32_mul(v->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
+    } else {
+        prod = float32_zero;
     }
+    iresult = float32_add(iresult, prod, &env->sse_status);
     if (mask & (1 << 6)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(2), s->ZMM_S(2),
-                                          &env->sse_status),
-                              &env->sse_status);
+        iresult2 = float32_mul(v->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
+    } else {
+        iresult2 = float32_zero;
     }
     if (mask & (1 << 7)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(3), s->ZMM_S(3),
-                                          &env->sse_status),
-                              &env->sse_status);
+        prod = float32_mul(v->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
+    } else {
+        prod = float32_zero;
     }
+    iresult2 = float32_add(iresult2, prod, &env->sse_status);
+    iresult = float32_add(iresult, iresult2, &env->sse_status);
+
     d->ZMM_S(0) = (mask & (1 << 0)) ? iresult : float32_zero;
     d->ZMM_S(1) = (mask & (1 << 1)) ? iresult : float32_zero;
     d->ZMM_S(2) = (mask & (1 << 2)) ? iresult : float32_zero;
     d->ZMM_S(3) = (mask & (1 << 3)) ? iresult : float32_zero;
+#if SHIFT == 2
+    if (mask & (1 << 4)) {
+        iresult = float32_mul(v->ZMM_S(4), s->ZMM_S(4), &env->sse_status);
+    } else {
+        iresult = float32_zero;
+    }
+    if (mask & (1 << 5)) {
+        prod = float32_mul(v->ZMM_S(5), s->ZMM_S(5), &env->sse_status);
+    } else {
+        prod = float32_zero;
+    }
+    iresult = float32_add(iresult, prod, &env->sse_status);
+    if (mask & (1 << 6)) {
+        iresult2 = float32_mul(v->ZMM_S(6), s->ZMM_S(6), &env->sse_status);
+    } else {
+        iresult2 = float32_zero;
+    }
+    if (mask & (1 << 7)) {
+        prod = float32_mul(v->ZMM_S(7), s->ZMM_S(7), &env->sse_status);
+    } else {
+        prod = float32_zero;
+    }
+    iresult2 = float32_add(iresult2, prod, &env->sse_status);
+    iresult = float32_add(iresult, iresult2, &env->sse_status);
+
+    d->ZMM_S(4) = (mask & (1 << 0)) ? iresult : float32_zero;
+    d->ZMM_S(5) = (mask & (1 << 1)) ? iresult : float32_zero;
+    d->ZMM_S(6) = (mask & (1 << 2)) ? iresult : float32_zero;
+    d->ZMM_S(7) = (mask & (1 << 3)) ? iresult : float32_zero;
+#endif
 }
 
-void glue(helper_dppd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
+#if SHIFT == 1
+/* Oddly, there is no ymm version of dppd */
+void glue(helper_dppd, SUFFIX)(CPUX86State *env,
+                               Reg *d, Reg *v, Reg *s, uint32_t mask)
 {
-    float64 iresult = float64_zero;
+    float64 iresult;
 
     if (mask & (1 << 4)) {
-        iresult = float64_add(iresult,
-                              float64_mul(d->ZMM_D(0), s->ZMM_D(0),
-                                          &env->sse_status),
-                              &env->sse_status);
+        iresult = float64_mul(v->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
+    } else {
+        iresult = float64_zero;
     }
+
     if (mask & (1 << 5)) {
         iresult = float64_add(iresult,
-                              float64_mul(d->ZMM_D(1), s->ZMM_D(1),
+                              float64_mul(v->ZMM_D(1), s->ZMM_D(1),
                                           &env->sse_status),
                               &env->sse_status);
     }
     d->ZMM_D(0) = (mask & (1 << 0)) ? iresult : float64_zero;
     d->ZMM_D(1) = (mask & (1 << 1)) ? iresult : float64_zero;
 }
+#endif
 
-void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,
                                   uint32_t offset)
 {
     int s0 = (offset & 3) << 2;
     int d0 = (offset & 4) << 0;
     int i;
-    Reg r;
+    uint16_t r[8];
 
     for (i = 0; i < 8; i++, d0++) {
-        r.W(i) = 0;
-        r.W(i) += abs1(d->B(d0 + 0) - s->B(s0 + 0));
-        r.W(i) += abs1(d->B(d0 + 1) - s->B(s0 + 1));
-        r.W(i) += abs1(d->B(d0 + 2) - s->B(s0 + 2));
-        r.W(i) += abs1(d->B(d0 + 3) - s->B(s0 + 3));
+        r[i] = 0;
+        r[i] += abs1(v->B(d0 + 0) - s->B(s0 + 0));
+        r[i] += abs1(v->B(d0 + 1) - s->B(s0 + 1));
+        r[i] += abs1(v->B(d0 + 2) - s->B(s0 + 2));
+        r[i] += abs1(v->B(d0 + 3) - s->B(s0 + 3));
     }
+    for (i = 0; i < 8; i++) {
+        d->W(i) = r[i];
+    }
+#if SHIFT == 2
+    s0 = ((offset & 0x18) >> 1) + 16;
+    d0 = ((offset & 0x20) >> 3) + 16;
 
-    *d = r;
+    for (i = 0; i < 8; i++, d0++) {
+        r[i] = 0;
+        r[i] += abs1(v->B(d0 + 0) - s->B(s0 + 0));
+        r[i] += abs1(v->B(d0 + 1) - s->B(s0 + 1));
+        r[i] += abs1(v->B(d0 + 2) - s->B(s0 + 2));
+        r[i] += abs1(v->B(d0 + 3) - s->B(s0 + 3));
+    }
+    for (i = 0; i < 8; i++) {
+        d->W(i + 8) = r[i];
+    }
+#endif
 }
 
 /* SSE4.2 op helpers */
 #define FCMPGTQ(d, s) ((int64_t)d > (int64_t)s ? -1 : 0)
 SSE_HELPER_Q(helper_pcmpgtq, FCMPGTQ)
 
+#if SHIFT == 1
 static inline int pcmp_elen(CPUX86State *env, int reg, uint32_t ctrl)
 {
-    int val;
+    int64_t val;
 
     /* Presence of REX.W is indicated by a bit higher than 7 set */
     if (ctrl >> 8) {
-        val = abs1((int64_t)env->regs[reg]);
+        val = env->regs[reg];
     } else {
-        val = abs1((int32_t)env->regs[reg]);
+        val = (int32_t)env->regs[reg];
     }
+    if (val < 0)
+        val = 16;
 
     if (ctrl & 1) {
         if (val > 8) {
@@ -2213,14 +2904,16 @@ target_ulong helper_crc32(uint32_t crc1, target_ulong msg, uint32_t len)
     return crc;
 }
 
-void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
-                                    uint32_t ctrl)
+#endif
+
+#if SHIFT == 1
+static void clmulq(uint64_t *dest_l, uint64_t *dest_h,
+                   uint64_t a, uint64_t b)
 {
-    uint64_t ah, al, b, resh, resl;
+    uint64_t al, ah, resh, resl;
 
     ah = 0;
-    al = d->Q((ctrl & 1) != 0);
-    b = s->Q((ctrl & 16) != 0);
+    al = a;
     resh = resl = 0;
 
     while (b) {
@@ -2233,71 +2926,115 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
         b >>= 1;
     }
 
-    d->Q(0) = resl;
-    d->Q(1) = resh;
+    *dest_l = resl;
+    *dest_h = resh;
 }
+#endif
 
-void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,
+                                    uint32_t ctrl)
+{
+    uint64_t a, b;
+
+    a = v->Q((ctrl & 1) != 0);
+    b = s->Q((ctrl & 16) != 0);
+    clmulq(&d->Q(0), &d->Q(1), a, b);
+#if SHIFT == 2
+    a = v->Q(((ctrl & 1) != 0) + 2);
+    b = s->Q(((ctrl & 16) != 0) + 2);
+    clmulq(&d->Q(2), &d->Q(3), a, b);
+#endif
+}
+
+void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg st = *d;
+    Reg st = *v;
     Reg rk = *s;
 
     for (i = 0 ; i < 4 ; i++) {
-        d->L(i) = rk.L(i) ^ bswap32(AES_Td0[st.B(AES_ishifts[4*i+0])] ^
-                                    AES_Td1[st.B(AES_ishifts[4*i+1])] ^
-                                    AES_Td2[st.B(AES_ishifts[4*i+2])] ^
-                                    AES_Td3[st.B(AES_ishifts[4*i+3])]);
+        d->L(i) = rk.L(i) ^ bswap32(AES_Td0[st.B(AES_ishifts[4 * i + 0])] ^
+                                    AES_Td1[st.B(AES_ishifts[4 * i + 1])] ^
+                                    AES_Td2[st.B(AES_ishifts[4 * i + 2])] ^
+                                    AES_Td3[st.B(AES_ishifts[4 * i + 3])]);
     }
+#if SHIFT == 2
+    for (i = 0 ; i < 4 ; i++) {
+        d->L(i + 4) = rk.L(i + 4) ^ bswap32(
+                AES_Td0[st.B(AES_ishifts[4 * i + 0] + 16)] ^
+                AES_Td1[st.B(AES_ishifts[4 * i + 1] + 16)] ^
+                AES_Td2[st.B(AES_ishifts[4 * i + 2] + 16)] ^
+                AES_Td3[st.B(AES_ishifts[4 * i + 3] + 16)]);
+    }
+#endif
 }
 
-void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg st = *d;
+    Reg st = *v;
     Reg rk = *s;
 
     for (i = 0; i < 16; i++) {
         d->B(i) = rk.B(i) ^ (AES_isbox[st.B(AES_ishifts[i])]);
     }
+#if SHIFT == 2
+    for (i = 0; i < 16; i++) {
+        d->B(i + 16) = rk.B(i + 16) ^ (AES_isbox[st.B(AES_ishifts[i] + 16)]);
+    }
+#endif
 }
 
-void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg st = *d;
+    Reg st = *v;
     Reg rk = *s;
 
     for (i = 0 ; i < 4 ; i++) {
-        d->L(i) = rk.L(i) ^ bswap32(AES_Te0[st.B(AES_shifts[4*i+0])] ^
-                                    AES_Te1[st.B(AES_shifts[4*i+1])] ^
-                                    AES_Te2[st.B(AES_shifts[4*i+2])] ^
-                                    AES_Te3[st.B(AES_shifts[4*i+3])]);
+        d->L(i) = rk.L(i) ^ bswap32(AES_Te0[st.B(AES_shifts[4 * i + 0])] ^
+                                    AES_Te1[st.B(AES_shifts[4 * i + 1])] ^
+                                    AES_Te2[st.B(AES_shifts[4 * i + 2])] ^
+                                    AES_Te3[st.B(AES_shifts[4 * i + 3])]);
+    }
+#if SHIFT == 2
+    for (i = 0 ; i < 4 ; i++) {
+        d->L(i + 4) = rk.L(i + 4) ^ bswap32(
+                AES_Te0[st.B(AES_shifts[4 * i + 0] + 16)] ^
+                AES_Te1[st.B(AES_shifts[4 * i + 1] + 16)] ^
+                AES_Te2[st.B(AES_shifts[4 * i + 2] + 16)] ^
+                AES_Te3[st.B(AES_shifts[4 * i + 3] + 16)]);
     }
+#endif
 }
 
-void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg st = *d;
+    Reg st = *v;
     Reg rk = *s;
 
     for (i = 0; i < 16; i++) {
         d->B(i) = rk.B(i) ^ (AES_sbox[st.B(AES_shifts[i])]);
     }
-
+#if SHIFT == 2
+    for (i = 0; i < 16; i++) {
+        d->B(i + 16) = rk.B(i + 16) ^ (AES_sbox[st.B(AES_shifts[i] + 16)]);
+    }
+#endif
 }
 
+#if SHIFT == 1
 void glue(helper_aesimc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     int i;
     Reg tmp = *s;
 
     for (i = 0 ; i < 4 ; i++) {
-        d->L(i) = bswap32(AES_imc[tmp.B(4*i+0)][0] ^
-                          AES_imc[tmp.B(4*i+1)][1] ^
-                          AES_imc[tmp.B(4*i+2)][2] ^
-                          AES_imc[tmp.B(4*i+3)][3]);
+        d->L(i) = bswap32(AES_imc[tmp.B(4 * i + 0)][0] ^
+                          AES_imc[tmp.B(4 * i + 1)][1] ^
+                          AES_imc[tmp.B(4 * i + 2)][2] ^
+                          AES_imc[tmp.B(4 * i + 3)][3]);
     }
 }
 
@@ -2315,9 +3052,430 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     d->L(3) = (d->L(2) << 24 | d->L(2) >> 8) ^ ctrl;
 }
 #endif
+#endif
+
+#if SHIFT >= 1
+void glue(helper_vbroadcastb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint8_t val = s->B(0);
+    int i;
+
+    for (i = 0; i < 16 * SHIFT; i++) {
+        d->B(i) = val;
+    }
+}
+
+void glue(helper_vbroadcastw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint16_t val = s->W(0);
+    int i;
+
+    for (i = 0; i < 8 * SHIFT; i++) {
+        d->W(i) = val;
+    }
+}
+
+void glue(helper_vbroadcastl, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint32_t val = s->L(0);
+    int i;
+
+    for (i = 0; i < 4 * SHIFT; i++) {
+        d->L(i) = val;
+    }
+}
+
+void glue(helper_vbroadcastq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint64_t val = s->Q(0);
+    d->Q(0) = val;
+    d->Q(1) = val;
+#if SHIFT == 2
+    d->Q(2) = val;
+    d->Q(3) = val;
+#endif
+}
+
+void glue(helper_vpermilpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    uint64_t r0, r1;
+
+    r0 = v->Q((s->Q(0) >> 1) & 1);
+    r1 = v->Q((s->Q(1) >> 1) & 1);
+    d->Q(0) = r0;
+    d->Q(1) = r1;
+#if SHIFT == 2
+    r0 = v->Q(((s->Q(2) >> 1) & 1) + 2);
+    r1 = v->Q(((s->Q(3) >> 1) & 1) + 2);
+    d->Q(2) = r0;
+    d->Q(3) = r1;
+#endif
+}
+
+void glue(helper_vpermilps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    uint32_t r0, r1, r2, r3;
+
+    r0 = v->L(s->L(0) & 3);
+    r1 = v->L(s->L(1) & 3);
+    r2 = v->L(s->L(2) & 3);
+    r3 = v->L(s->L(3) & 3);
+    d->L(0) = r0;
+    d->L(1) = r1;
+    d->L(2) = r2;
+    d->L(3) = r3;
+#if SHIFT == 2
+    r0 = v->L((s->L(4) & 3) + 4);
+    r1 = v->L((s->L(5) & 3) + 4);
+    r2 = v->L((s->L(6) & 3) + 4);
+    r3 = v->L((s->L(7) & 3) + 4);
+    d->L(4) = r0;
+    d->L(5) = r1;
+    d->L(6) = r2;
+    d->L(7) = r3;
+#endif
+}
+
+void glue(helper_vpermilpd_imm, SUFFIX)(CPUX86State *env,
+                                        Reg *d, Reg *s, uint32_t order)
+{
+    uint64_t r0, r1;
+
+    r0 = s->Q((order >> 0) & 1);
+    r1 = s->Q((order >> 1) & 1);
+    d->Q(0) = r0;
+    d->Q(1) = r1;
+#if SHIFT == 2
+    r0 = s->Q(((order >> 2) & 1) + 2);
+    r1 = s->Q(((order >> 3) & 1) + 2);
+    d->Q(2) = r0;
+    d->Q(3) = r1;
+#endif
+}
+
+void glue(helper_vpermilps_imm, SUFFIX)(CPUX86State *env,
+                                        Reg *d, Reg *s, uint32_t order)
+{
+    uint32_t r0, r1, r2, r3;
+
+    r0 = s->L((order >> 0) & 3);
+    r1 = s->L((order >> 2) & 3);
+    r2 = s->L((order >> 4) & 3);
+    r3 = s->L((order >> 6) & 3);
+    d->L(0) = r0;
+    d->L(1) = r1;
+    d->L(2) = r2;
+    d->L(3) = r3;
+#if SHIFT == 2
+    r0 = s->L(((order >> 0) & 3) + 4);
+    r1 = s->L(((order >> 2) & 3) + 4);
+    r2 = s->L(((order >> 4) & 3) + 4);
+    r3 = s->L(((order >> 6) & 3) + 4);
+    d->L(4) = r0;
+    d->L(5) = r1;
+    d->L(6) = r2;
+    d->L(7) = r3;
+#endif
+}
+
+#if SHIFT == 1
+#define FPSRLVD(x, c) (c < 32 ? ((x) >> c) : 0)
+#define FPSRLVQ(x, c) (c < 64 ? ((x) >> c) : 0)
+#define FPSRAVD(x, c) ((int32_t)(x) >> (c < 32 ? c : 31))
+#define FPSRAVQ(x, c) ((int64_t)(x) >> (c < 64 ? c : 63))
+#define FPSLLVD(x, c) (c < 32 ? ((x) << c) : 0)
+#define FPSLLVQ(x, c) (c < 64 ? ((x) << c) : 0)
+#endif
+
+SSE_HELPER_L(helper_vpsrlvd, FPSRLVD)
+SSE_HELPER_L(helper_vpsravd, FPSRAVD)
+SSE_HELPER_L(helper_vpsllvd, FPSLLVD)
+
+SSE_HELPER_Q(helper_vpsrlvq, FPSRLVQ)
+SSE_HELPER_Q(helper_vpsravq, FPSRAVQ)
+SSE_HELPER_Q(helper_vpsllvq, FPSLLVQ)
+
+void glue(helper_vtestps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint32_t zf = (s->L(0) &  d->L(0)) | (s->L(1) &  d->L(1));
+    uint32_t cf = (s->L(0) & ~d->L(0)) | (s->L(1) & ~d->L(1));
+
+    zf |= (s->L(2) &  d->L(2)) | (s->L(3) &  d->L(3));
+    cf |= (s->L(2) & ~d->L(2)) | (s->L(3) & ~d->L(3));
+#if SHIFT == 2
+    zf |= (s->L(4) &  d->L(4)) | (s->L(5) &  d->L(5));
+    cf |= (s->L(4) & ~d->L(4)) | (s->L(5) & ~d->L(5));
+    zf |= (s->L(6) &  d->L(6)) | (s->L(7) &  d->L(7));
+    cf |= (s->L(6) & ~d->L(6)) | (s->L(7) & ~d->L(7));
+#endif
+    CC_SRC = ((zf >> 31) ? 0 : CC_Z) | ((cf >> 31) ? 0 : CC_C);
+}
+
+void glue(helper_vtestpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint64_t zf = (s->Q(0) &  d->Q(0)) | (s->Q(1) &  d->Q(1));
+    uint64_t cf = (s->Q(0) & ~d->Q(0)) | (s->Q(1) & ~d->Q(1));
+
+#if SHIFT == 2
+    zf |= (s->Q(2) &  d->Q(2)) | (s->Q(3) &  d->Q(3));
+    cf |= (s->Q(2) & ~d->Q(2)) | (s->Q(3) & ~d->Q(3));
+#endif
+    CC_SRC = ((zf >> 63) ? 0 : CC_Z) | ((cf >> 63) ? 0 : CC_C);
+}
+
+void glue(helper_vpmaskmovd_st, SUFFIX)(CPUX86State *env,
+                                        Reg *s, Reg *v, target_ulong a0)
+{
+    int i;
+
+    for (i = 0; i < (2 << SHIFT); i++) {
+        if (v->L(i) >> 31) {
+            cpu_stl_data_ra(env, a0 + i * 4, s->L(i), GETPC());
+        }
+    }
+}
+
+void glue(helper_vpmaskmovq_st, SUFFIX)(CPUX86State *env,
+                                        Reg *s, Reg *v, target_ulong a0)
+{
+    int i;
+
+    for (i = 0; i < (1 << SHIFT); i++) {
+        if (v->Q(i) >> 63) {
+            cpu_stq_data_ra(env, a0 + i * 8, s->Q(i), GETPC());
+        }
+    }
+}
+
+void glue(helper_vpmaskmovd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    d->L(0) = (v->L(0) >> 31) ? s->L(0) : 0;
+    d->L(1) = (v->L(1) >> 31) ? s->L(1) : 0;
+    d->L(2) = (v->L(2) >> 31) ? s->L(2) : 0;
+    d->L(3) = (v->L(3) >> 31) ? s->L(3) : 0;
+#if SHIFT == 2
+    d->L(4) = (v->L(4) >> 31) ? s->L(4) : 0;
+    d->L(5) = (v->L(5) >> 31) ? s->L(5) : 0;
+    d->L(6) = (v->L(6) >> 31) ? s->L(6) : 0;
+    d->L(7) = (v->L(7) >> 31) ? s->L(7) : 0;
+#endif
+}
+
+void glue(helper_vpmaskmovq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    d->Q(0) = (v->Q(0) >> 63) ? s->Q(0) : 0;
+    d->Q(1) = (v->Q(1) >> 63) ? s->Q(1) : 0;
+#if SHIFT == 2
+    d->Q(2) = (v->Q(2) >> 63) ? s->Q(2) : 0;
+    d->Q(3) = (v->Q(3) >> 63) ? s->Q(3) : 0;
+#endif
+}
+
+#define VGATHER_HELPER(scale)                                       \
+void glue(helper_vpgatherdd ## scale, SUFFIX)(CPUX86State *env,     \
+        Reg *d, Reg *v, Reg *s, target_ulong a0)                    \
+{                                                                   \
+    int i;                                                          \
+    for (i = 0; i < (2 << SHIFT); i++) {                            \
+        if (v->L(i) >> 31) {                                        \
+            target_ulong addr = a0                                  \
+                + ((target_ulong)(int32_t)s->L(i) << scale);        \
+            d->L(i) = cpu_ldl_data_ra(env, addr, GETPC());          \
+        }                                                           \
+        v->L(i) = 0;                                                \
+    }                                                               \
+}                                                                   \
+void glue(helper_vpgatherdq ## scale, SUFFIX)(CPUX86State *env,     \
+        Reg *d, Reg *v, Reg *s, target_ulong a0)                    \
+{                                                                   \
+    int i;                                                          \
+    for (i = 0; i < (1 << SHIFT); i++) {                            \
+        if (v->Q(i) >> 63) {                                        \
+            target_ulong addr = a0                                  \
+                + ((target_ulong)(int32_t)s->L(i) << scale);        \
+            d->Q(i) = cpu_ldq_data_ra(env, addr, GETPC());          \
+        }                                                           \
+        v->Q(i) = 0;                                                \
+    }                                                               \
+}                                                                   \
+void glue(helper_vpgatherqd ## scale, SUFFIX)(CPUX86State *env,     \
+        Reg *d, Reg *v, Reg *s, target_ulong a0)                    \
+{                                                                   \
+    int i;                                                          \
+    for (i = 0; i < (1 << SHIFT); i++) {                            \
+        if (v->L(i) >> 31) {                                        \
+            target_ulong addr = a0                                  \
+                + ((target_ulong)(int64_t)s->Q(i) << scale);        \
+            d->L(i) = cpu_ldl_data_ra(env, addr, GETPC());          \
+        }                                                           \
+        v->L(i) = 0;                                                \
+    }                                                               \
+    d->Q(SHIFT) = 0;                                                \
+    v->Q(SHIFT) = 0;                                                \
+    YMM_ONLY(                                                       \
+    d->Q(3) = 0;                                                    \
+    v->Q(3) = 0;                                                    \
+    )                                                               \
+}                                                                   \
+void glue(helper_vpgatherqq ## scale, SUFFIX)(CPUX86State *env,     \
+        Reg *d, Reg *v, Reg *s, target_ulong a0)                    \
+{                                                                   \
+    int i;                                                          \
+    for (i = 0; i < (1 << SHIFT); i++) {                            \
+        if (v->Q(i) >> 63) {                                        \
+            target_ulong addr = a0                                  \
+                + ((target_ulong)(int64_t)s->Q(i) << scale);        \
+            d->Q(i) = cpu_ldq_data_ra(env, addr, GETPC());          \
+        }                                                           \
+        v->Q(i) = 0;                                                \
+    }                                                               \
+}
+
+VGATHER_HELPER(0)
+VGATHER_HELPER(1)
+VGATHER_HELPER(2)
+VGATHER_HELPER(3)
+
+#if SHIFT == 2
+void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    d->Q(0) = s->Q(0);
+    d->Q(1) = s->Q(1);
+    d->Q(2) = s->Q(0);
+    d->Q(3) = s->Q(1);
+}
+
+void helper_vzeroall(CPUX86State *env)
+{
+    int i;
+
+    for (i = 0; i < 8; i++) {
+        env->xmm_regs[i].ZMM_Q(0) = 0;
+        env->xmm_regs[i].ZMM_Q(1) = 0;
+        env->xmm_regs[i].ZMM_Q(2) = 0;
+        env->xmm_regs[i].ZMM_Q(3) = 0;
+    }
+}
+
+void helper_vzeroupper(CPUX86State *env)
+{
+    int i;
+
+    for (i = 0; i < 8; i++) {
+        env->xmm_regs[i].ZMM_Q(2) = 0;
+        env->xmm_regs[i].ZMM_Q(3) = 0;
+    }
+}
+
+#ifdef TARGET_X86_64
+void helper_vzeroall_hi8(CPUX86State *env)
+{
+    int i;
+
+    for (i = 8; i < 16; i++) {
+        env->xmm_regs[i].ZMM_Q(0) = 0;
+        env->xmm_regs[i].ZMM_Q(1) = 0;
+        env->xmm_regs[i].ZMM_Q(2) = 0;
+        env->xmm_regs[i].ZMM_Q(3) = 0;
+    }
+}
+
+void helper_vzeroupper_hi8(CPUX86State *env)
+{
+    int i;
+
+    for (i = 8; i < 16; i++) {
+        env->xmm_regs[i].ZMM_Q(2) = 0;
+        env->xmm_regs[i].ZMM_Q(3) = 0;
+    }
+}
+#endif
+
+void helper_vpermdq_ymm(CPUX86State *env,
+                        Reg *d, Reg *v, Reg *s, uint32_t order)
+{
+    uint64_t r0, r1, r2, r3;
+
+    switch (order & 3) {
+    case 0:
+        r0 = v->Q(0);
+        r1 = v->Q(1);
+        break;
+    case 1:
+        r0 = v->Q(2);
+        r1 = v->Q(3);
+        break;
+    case 2:
+        r0 = s->Q(0);
+        r1 = s->Q(1);
+        break;
+    case 3:
+        r0 = s->Q(2);
+        r1 = s->Q(3);
+        break;
+    }
+    switch ((order >> 4) & 3) {
+    case 0:
+        r2 = v->Q(0);
+        r3 = v->Q(1);
+        break;
+    case 1:
+        r2 = v->Q(2);
+        r3 = v->Q(3);
+        break;
+    case 2:
+        r2 = s->Q(0);
+        r3 = s->Q(1);
+        break;
+    case 3:
+        r2 = s->Q(2);
+        r3 = s->Q(3);
+        break;
+    }
+    d->Q(0) = r0;
+    d->Q(1) = r1;
+    d->Q(2) = r2;
+    d->Q(3) = r3;
+}
+
+void helper_vpermq_ymm(CPUX86State *env, Reg *d, Reg *s, uint32_t order)
+{
+    uint64_t r0, r1, r2, r3;
+    r0 = s->Q(order & 3);
+    r1 = s->Q((order >> 2) & 3);
+    r2 = s->Q((order >> 4) & 3);
+    r3 = s->Q((order >> 6) & 3);
+    d->Q(0) = r0;
+    d->Q(1) = r1;
+    d->Q(2) = r2;
+    d->Q(3) = r3;
+}
+
+void helper_vpermd_ymm(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    uint32_t r[8];
+    int i;
+
+    for (i = 0; i < 8; i++) {
+        r[i] = s->L(v->L(i) & 7);
+    }
+    for (i = 0; i < 8; i++) {
+        d->L(i) = r[i];
+    }
+}
+
+#endif
+#endif
+
+#undef SHIFT_HELPER_W
+#undef SHIFT_HELPER_L
+#undef SHIFT_HELPER_Q
+#undef SSE_HELPER_S
+#undef SSE_HELPER_CMP
 
 #undef SHIFT
 #undef XMM_ONLY
+#undef YMM_ONLY
 #undef Reg
 #undef B
 #undef W
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index cef28f2aae..83efb8ab41 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -21,7 +21,11 @@
 #define SUFFIX _mmx
 #else
 #define Reg ZMMReg
+#if SHIFT == 1
 #define SUFFIX _xmm
+#else
+#define SUFFIX _ymm
+#endif
 #endif
 
 #define dh_alias_Reg ptr
@@ -34,31 +38,31 @@
 #define dh_typecode_ZMMReg dh_typecode_ptr
 #define dh_typecode_MMXReg dh_typecode_ptr
 
-DEF_HELPER_3(glue(psrlw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psraw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psllw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psrld, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psrad, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pslld, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psrlq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psllq, SUFFIX), void, env, Reg, Reg)
-
-#if SHIFT == 1
-DEF_HELPER_3(glue(psrldq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pslldq, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(psrlw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psraw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psllw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psrld, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psrad, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pslld, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psrlq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psllq, SUFFIX), void, env, Reg, Reg, Reg)
+
+#if SHIFT >= 1
+DEF_HELPER_4(glue(psrldq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pslldq, SUFFIX), void, env, Reg, Reg, Reg)
 #endif
 
 #define SSE_HELPER_B(name, F)\
-    DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg)
+    DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg)
 
 #define SSE_HELPER_W(name, F)\
-    DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg)
+    DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg)
 
 #define SSE_HELPER_L(name, F)\
-    DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg)
+    DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg)
 
 #define SSE_HELPER_Q(name, F)\
-    DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg)
+    DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg)
 
 SSE_HELPER_B(paddb, FADD)
 SSE_HELPER_W(paddw, FADD)
@@ -101,7 +105,7 @@ SSE_HELPER_L(pcmpeql, FCMPEQ)
 
 SSE_HELPER_W(pmullw, FMULLW)
 #if SHIFT == 0
-SSE_HELPER_W(pmulhrw, FMULHRW)
+DEF_HELPER_3(glue(pmulhrw, SUFFIX), void, env, Reg, Reg)
 #endif
 SSE_HELPER_W(pmulhuw, FMULHUW)
 SSE_HELPER_W(pmulhw, FMULHW)
@@ -109,11 +113,13 @@ SSE_HELPER_W(pmulhw, FMULHW)
 SSE_HELPER_B(pavgb, FAVG)
 SSE_HELPER_W(pavgw, FAVG)
 
-DEF_HELPER_3(glue(pmuludq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmaddwd, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(pmuludq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmaddwd, SUFFIX), void, env, Reg, Reg, Reg)
 
-DEF_HELPER_3(glue(psadbw, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(psadbw, SUFFIX), void, env, Reg, Reg, Reg)
+#if SHIFT < 2
 DEF_HELPER_4(glue(maskmov, SUFFIX), void, env, Reg, Reg, tl)
+#endif
 DEF_HELPER_2(glue(movl_mm_T0, SUFFIX), void, Reg, i32)
 #ifdef TARGET_X86_64
 DEF_HELPER_2(glue(movq_mm_T0, SUFFIX), void, Reg, i64)
@@ -122,38 +128,63 @@ DEF_HELPER_2(glue(movq_mm_T0, SUFFIX), void, Reg, i64)
 #if SHIFT == 0
 DEF_HELPER_3(glue(pshufw, SUFFIX), void, Reg, Reg, int)
 #else
-DEF_HELPER_3(shufps, void, Reg, Reg, int)
-DEF_HELPER_3(shufpd, void, Reg, Reg, int)
 DEF_HELPER_3(glue(pshufd, SUFFIX), void, Reg, Reg, int)
 DEF_HELPER_3(glue(pshuflw, SUFFIX), void, Reg, Reg, int)
 DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int)
 #endif
 
-#if SHIFT == 1
+#if SHIFT >= 1
 /* FPU ops */
 /* XXX: not accurate */
 
-#define SSE_HELPER_S(name, F)                            \
-    DEF_HELPER_3(name ## ps, void, env, Reg, Reg)        \
-    DEF_HELPER_3(name ## ss, void, env, Reg, Reg)        \
-    DEF_HELPER_3(name ## pd, void, env, Reg, Reg)        \
-    DEF_HELPER_3(name ## sd, void, env, Reg, Reg)
+#define SSE_HELPER_P4(name, ...)                         \
+    DEF_HELPER_4(glue(name ## ps, SUFFIX), __VA_ARGS__) \
+    DEF_HELPER_4(glue(name ## pd, SUFFIX), __VA_ARGS__)
+
+#define SSE_HELPER_P3(name, ...)                         \
+    DEF_HELPER_3(glue(name ## ps, SUFFIX), __VA_ARGS__) \
+    DEF_HELPER_3(glue(name ## pd, SUFFIX), __VA_ARGS__)
+
+#if SHIFT == 1
+#define SSE_HELPER_S4(name, ...)             \
+    SSE_HELPER_P4(name, __VA_ARGS__)         \
+    DEF_HELPER_4(name ## ss, __VA_ARGS__)   \
+    DEF_HELPER_4(name ## sd, __VA_ARGS__)
+#define SSE_HELPER_S3(name, ...)             \
+    SSE_HELPER_P3(name, __VA_ARGS__)         \
+    DEF_HELPER_3(name ## ss, __VA_ARGS__)   \
+    DEF_HELPER_3(name ## sd, __VA_ARGS__)
+#else
+#define SSE_HELPER_S4(name, ...) SSE_HELPER_P4(name, __VA_ARGS__)
+#define SSE_HELPER_S3(name, ...) SSE_HELPER_P3(name, __VA_ARGS__)
+#endif
+
+DEF_HELPER_4(glue(shufps, SUFFIX), void, Reg, Reg, Reg, int)
+DEF_HELPER_4(glue(shufpd, SUFFIX), void, Reg, Reg, Reg, int)
+
+SSE_HELPER_S4(add, void, env, Reg, Reg, Reg)
+SSE_HELPER_S4(sub, void, env, Reg, Reg, Reg)
+SSE_HELPER_S4(mul, void, env, Reg, Reg, Reg)
+SSE_HELPER_S4(div, void, env, Reg, Reg, Reg)
+SSE_HELPER_S4(min, void, env, Reg, Reg, Reg)
+SSE_HELPER_S4(max, void, env, Reg, Reg, Reg)
+
+SSE_HELPER_S3(sqrt, void, env, Reg, Reg)
 
-SSE_HELPER_S(add, FPU_ADD)
-SSE_HELPER_S(sub, FPU_SUB)
-SSE_HELPER_S(mul, FPU_MUL)
-SSE_HELPER_S(div, FPU_DIV)
-SSE_HELPER_S(min, FPU_MIN)
-SSE_HELPER_S(max, FPU_MAX)
-SSE_HELPER_S(sqrt, FPU_SQRT)
+DEF_HELPER_3(glue(cvtps2pd, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(cvtpd2ps, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(cvtdq2ps, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(cvtdq2pd, SUFFIX), void, env, Reg, Reg)
 
+DEF_HELPER_3(glue(cvtps2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(cvtpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
 
-DEF_HELPER_3(cvtps2pd, void, env, Reg, Reg)
-DEF_HELPER_3(cvtpd2ps, void, env, Reg, Reg)
+DEF_HELPER_3(glue(cvttps2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(cvttpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
+
+#if SHIFT == 1
 DEF_HELPER_3(cvtss2sd, void, env, Reg, Reg)
 DEF_HELPER_3(cvtsd2ss, void, env, Reg, Reg)
-DEF_HELPER_3(cvtdq2ps, void, env, Reg, Reg)
-DEF_HELPER_3(cvtdq2pd, void, env, Reg, Reg)
 DEF_HELPER_3(cvtpi2ps, void, env, ZMMReg, MMXReg)
 DEF_HELPER_3(cvtpi2pd, void, env, ZMMReg, MMXReg)
 DEF_HELPER_3(cvtsi2ss, void, env, ZMMReg, i32)
@@ -164,8 +195,6 @@ DEF_HELPER_3(cvtsq2ss, void, env, ZMMReg, i64)
 DEF_HELPER_3(cvtsq2sd, void, env, ZMMReg, i64)
 #endif
 
-DEF_HELPER_3(cvtps2dq, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(cvtpd2dq, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(cvtps2pi, void, env, MMXReg, ZMMReg)
 DEF_HELPER_3(cvtpd2pi, void, env, MMXReg, ZMMReg)
 DEF_HELPER_2(cvtss2si, s32, env, ZMMReg)
@@ -175,8 +204,6 @@ DEF_HELPER_2(cvtss2sq, s64, env, ZMMReg)
 DEF_HELPER_2(cvtsd2sq, s64, env, ZMMReg)
 #endif
 
-DEF_HELPER_3(cvttps2dq, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(cvttpd2dq, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(cvttps2pi, void, env, MMXReg, ZMMReg)
 DEF_HELPER_3(cvttpd2pi, void, env, MMXReg, ZMMReg)
 DEF_HELPER_2(cvttss2si, s32, env, ZMMReg)
@@ -185,60 +212,88 @@ DEF_HELPER_2(cvttsd2si, s32, env, ZMMReg)
 DEF_HELPER_2(cvttss2sq, s64, env, ZMMReg)
 DEF_HELPER_2(cvttsd2sq, s64, env, ZMMReg)
 #endif
+#endif
 
-DEF_HELPER_3(rsqrtps, void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(rsqrtps, SUFFIX), void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(rcpps, SUFFIX), void, env, ZMMReg, ZMMReg)
+
+#if SHIFT == 1
 DEF_HELPER_3(rsqrtss, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(rcpps, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(rcpss, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(extrq_r, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_4(extrq_i, void, env, ZMMReg, int, int)
 DEF_HELPER_3(insertq_r, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_4(insertq_i, void, env, ZMMReg, int, int)
-DEF_HELPER_3(haddps, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(haddpd, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(hsubps, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(hsubpd, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(addsubps, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(addsubpd, void, env, ZMMReg, ZMMReg)
-
-#define SSE_HELPER_CMP(name, F)                           \
-    DEF_HELPER_3(name ## ps, void, env, Reg, Reg)         \
-    DEF_HELPER_3(name ## ss, void, env, Reg, Reg)         \
-    DEF_HELPER_3(name ## pd, void, env, Reg, Reg)         \
-    DEF_HELPER_3(name ## sd, void, env, Reg, Reg)
-
-SSE_HELPER_CMP(cmpeq, FPU_CMPEQ)
-SSE_HELPER_CMP(cmplt, FPU_CMPLT)
-SSE_HELPER_CMP(cmple, FPU_CMPLE)
-SSE_HELPER_CMP(cmpunord, FPU_CMPUNORD)
-SSE_HELPER_CMP(cmpneq, FPU_CMPNEQ)
-SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT)
-SSE_HELPER_CMP(cmpnle, FPU_CMPNLE)
-SSE_HELPER_CMP(cmpord, FPU_CMPORD)
+#endif
+
+SSE_HELPER_P4(hadd, void, env, Reg, Reg, Reg)
+SSE_HELPER_P4(hsub, void, env, Reg, Reg, Reg)
+SSE_HELPER_P4(addsub, void, env, Reg, Reg, Reg)
+
+#define SSE_HELPER_CMP(name, F, C) SSE_HELPER_S4(name, void, env, Reg, Reg, Reg)
+
+SSE_HELPER_CMP(cmpeq, FPU_CMPQ, FPU_EQ)
+SSE_HELPER_CMP(cmplt, FPU_CMPS, FPU_LT)
+SSE_HELPER_CMP(cmple, FPU_CMPS, FPU_LE)
+SSE_HELPER_CMP(cmpunord, FPU_CMPQ,  FPU_UNORD)
+SSE_HELPER_CMP(cmpneq, FPU_CMPQ, !FPU_EQ)
+SSE_HELPER_CMP(cmpnlt, FPU_CMPS, !FPU_LT)
+SSE_HELPER_CMP(cmpnle, FPU_CMPS, !FPU_LE)
+SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD)
+
+SSE_HELPER_CMP(cmpequ, FPU_CMPQ, FPU_EQU)
+SSE_HELPER_CMP(cmpnge, FPU_CMPS, !FPU_GE)
+SSE_HELPER_CMP(cmpngt, FPU_CMPS, !FPU_GT)
+SSE_HELPER_CMP(cmpfalse, FPU_CMPQ,  FPU_FALSE)
+SSE_HELPER_CMP(cmpnequ, FPU_CMPQ, !FPU_EQU)
+SSE_HELPER_CMP(cmpge, FPU_CMPS, FPU_GE)
+SSE_HELPER_CMP(cmpgt, FPU_CMPS, FPU_GT)
+SSE_HELPER_CMP(cmptrue, FPU_CMPQ,  !FPU_FALSE)
+
+SSE_HELPER_CMP(cmpeqs, FPU_CMPS, FPU_EQ)
+SSE_HELPER_CMP(cmpltq, FPU_CMPQ, FPU_LT)
+SSE_HELPER_CMP(cmpleq, FPU_CMPQ, FPU_LE)
+SSE_HELPER_CMP(cmpunords, FPU_CMPS,  FPU_UNORD)
+SSE_HELPER_CMP(cmpneqq, FPU_CMPS, !FPU_EQ)
+SSE_HELPER_CMP(cmpnltq, FPU_CMPQ, !FPU_LT)
+SSE_HELPER_CMP(cmpnleq, FPU_CMPQ, !FPU_LE)
+SSE_HELPER_CMP(cmpords, FPU_CMPS, !FPU_UNORD)
+
+SSE_HELPER_CMP(cmpequs, FPU_CMPS, FPU_EQU)
+SSE_HELPER_CMP(cmpngeq, FPU_CMPQ, !FPU_GE)
+SSE_HELPER_CMP(cmpngtq, FPU_CMPQ, !FPU_GT)
+SSE_HELPER_CMP(cmpfalses, FPU_CMPS,  FPU_FALSE)
+SSE_HELPER_CMP(cmpnequs, FPU_CMPS, !FPU_EQU)
+SSE_HELPER_CMP(cmpgeq, FPU_CMPQ, FPU_GE)
+SSE_HELPER_CMP(cmpgtq, FPU_CMPQ, FPU_GT)
+SSE_HELPER_CMP(cmptrues, FPU_CMPS,  !FPU_FALSE)
 
+#if SHIFT == 1
 DEF_HELPER_3(ucomiss, void, env, Reg, Reg)
 DEF_HELPER_3(comiss, void, env, Reg, Reg)
 DEF_HELPER_3(ucomisd, void, env, Reg, Reg)
 DEF_HELPER_3(comisd, void, env, Reg, Reg)
-DEF_HELPER_2(movmskps, i32, env, Reg)
-DEF_HELPER_2(movmskpd, i32, env, Reg)
+#endif
+
+DEF_HELPER_2(glue(movmskps, SUFFIX), i32, env, Reg)
+DEF_HELPER_2(glue(movmskpd, SUFFIX), i32, env, Reg)
 #endif
 
 DEF_HELPER_2(glue(pmovmskb, SUFFIX), i32, env, Reg)
-DEF_HELPER_3(glue(packsswb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(packuswb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(packssdw, SUFFIX), void, env, Reg, Reg)
-#define UNPCK_OP(base_name, base)                                       \
-    DEF_HELPER_3(glue(punpck ## base_name ## bw, SUFFIX), void, env, Reg, Reg) \
-    DEF_HELPER_3(glue(punpck ## base_name ## wd, SUFFIX), void, env, Reg, Reg) \
-    DEF_HELPER_3(glue(punpck ## base_name ## dq, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(packsswb, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(packuswb, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(packssdw, SUFFIX), void, env, Reg, Reg, Reg)
+#define UNPCK_OP(name, base)                                       \
+    DEF_HELPER_4(glue(punpck ## name ## bw, SUFFIX), void, env, Reg, Reg, Reg) \
+    DEF_HELPER_4(glue(punpck ## name ## wd, SUFFIX), void, env, Reg, Reg, Reg) \
+    DEF_HELPER_4(glue(punpck ## name ## dq, SUFFIX), void, env, Reg, Reg, Reg)
 
 UNPCK_OP(l, 0)
 UNPCK_OP(h, 1)
 
-#if SHIFT == 1
-DEF_HELPER_3(glue(punpcklqdq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(punpckhqdq, SUFFIX), void, env, Reg, Reg)
+#if SHIFT >= 1
+DEF_HELPER_4(glue(punpcklqdq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(punpckhqdq, SUFFIX), void, env, Reg, Reg, Reg)
 #endif
 
 /* 3DNow! float ops */
@@ -265,28 +320,28 @@ DEF_HELPER_3(pswapd, void, env, MMXReg, MMXReg)
 #endif
 
 /* SSSE3 op helpers */
-DEF_HELPER_3(glue(phaddw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(phaddd, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(phaddsw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(phsubw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(phsubd, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(phsubsw, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(phaddw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(phaddd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(phaddsw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(phsubw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(phsubd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(phsubsw, SUFFIX), void, env, Reg, Reg, Reg)
 DEF_HELPER_3(glue(pabsb, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pabsw, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pabsd, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmaddubsw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmulhrsw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pshufb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psignb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psignw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psignd, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_4(glue(palignr, SUFFIX), void, env, Reg, Reg, s32)
+DEF_HELPER_4(glue(pmaddubsw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmulhrsw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pshufb, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psignb, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psignw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psignd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_5(glue(palignr, SUFFIX), void, env, Reg, Reg, Reg, s32)
 
 /* SSE4.1 op helpers */
-#if SHIFT == 1
-DEF_HELPER_3(glue(pblendvb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(blendvps, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(blendvpd, SUFFIX), void, env, Reg, Reg)
+#if SHIFT >= 1
+DEF_HELPER_5(glue(pblendvb, SUFFIX), void, env, Reg, Reg, Reg, Reg)
+DEF_HELPER_5(glue(blendvps, SUFFIX), void, env, Reg, Reg, Reg, Reg)
+DEF_HELPER_5(glue(blendvpd, SUFFIX), void, env, Reg, Reg, Reg, Reg)
 DEF_HELPER_3(glue(ptest, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmovsxbw, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmovsxbd, SUFFIX), void, env, Reg, Reg)
@@ -300,34 +355,42 @@ DEF_HELPER_3(glue(pmovzxbq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmovzxwd, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmovzxwq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmovzxdq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmuldq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pcmpeqq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(packusdw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pminsb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pminsd, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pminuw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pminud, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmaxsb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmaxsd, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmaxuw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmaxud, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmulld, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(pmuldq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pcmpeqq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(packusdw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pminsb, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pminsd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pminuw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pminud, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmaxsb, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmaxsd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmaxuw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmaxud, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmulld, SUFFIX), void, env, Reg, Reg, Reg)
+#if SHIFT == 1
 DEF_HELPER_3(glue(phminposuw, SUFFIX), void, env, Reg, Reg)
+#endif
 DEF_HELPER_4(glue(roundps, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_4(glue(roundpd, SUFFIX), void, env, Reg, Reg, i32)
-DEF_HELPER_4(glue(roundss, SUFFIX), void, env, Reg, Reg, i32)
-DEF_HELPER_4(glue(roundsd, SUFFIX), void, env, Reg, Reg, i32)
-DEF_HELPER_4(glue(blendps, SUFFIX), void, env, Reg, Reg, i32)
-DEF_HELPER_4(glue(blendpd, SUFFIX), void, env, Reg, Reg, i32)
-DEF_HELPER_4(glue(pblendw, SUFFIX), void, env, Reg, Reg, i32)
-DEF_HELPER_4(glue(dpps, SUFFIX), void, env, Reg, Reg, i32)
-DEF_HELPER_4(glue(dppd, SUFFIX), void, env, Reg, Reg, i32)
-DEF_HELPER_4(glue(mpsadbw, SUFFIX), void, env, Reg, Reg, i32)
+#if SHIFT == 1
+DEF_HELPER_4(roundss_xmm, void, env, Reg, Reg, i32)
+DEF_HELPER_4(roundsd_xmm, void, env, Reg, Reg, i32)
+#endif
+DEF_HELPER_5(glue(blendps, SUFFIX), void, env, Reg, Reg, Reg, i32)
+DEF_HELPER_5(glue(blendpd, SUFFIX), void, env, Reg, Reg, Reg, i32)
+DEF_HELPER_5(glue(pblendw, SUFFIX), void, env, Reg, Reg, Reg, i32)
+DEF_HELPER_5(glue(dpps, SUFFIX), void, env, Reg, Reg, Reg, i32)
+#if SHIFT == 1
+DEF_HELPER_5(glue(dppd, SUFFIX), void, env, Reg, Reg, Reg, i32)
+#endif
+DEF_HELPER_5(glue(mpsadbw, SUFFIX), void, env, Reg, Reg, Reg, i32)
 #endif
 
 /* SSE4.2 op helpers */
+#if SHIFT >= 1
+DEF_HELPER_4(glue(pcmpgtq, SUFFIX), void, env, Reg, Reg, Reg)
+#endif
 #if SHIFT == 1
-DEF_HELPER_3(glue(pcmpgtq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_4(glue(pcmpestri, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_4(glue(pcmpestrm, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_4(glue(pcmpistri, SUFFIX), void, env, Reg, Reg, i32)
@@ -336,14 +399,68 @@ DEF_HELPER_3(crc32, tl, i32, tl, i32)
 #endif
 
 /* AES-NI op helpers */
+#if SHIFT >= 1
+DEF_HELPER_4(glue(aesdec, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(aesdeclast, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(aesenc, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(aesenclast, SUFFIX), void, env, Reg, Reg, Reg)
 #if SHIFT == 1
-DEF_HELPER_3(glue(aesdec, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(aesdeclast, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(aesenc, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(aesenclast, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(aesimc, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_4(glue(aeskeygenassist, SUFFIX), void, env, Reg, Reg, i32)
-DEF_HELPER_4(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, i32)
+#endif
+DEF_HELPER_5(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, Reg, i32)
+#endif
+
+/* AVX helpers */
+#if SHIFT >= 1
+DEF_HELPER_3(glue(vbroadcastb, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(vbroadcastw, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(vbroadcastl, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(vbroadcastq, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(vpermilpd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpermilps, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpermilpd_imm, SUFFIX), void, env, Reg, Reg, i32)
+DEF_HELPER_4(glue(vpermilps_imm, SUFFIX), void, env, Reg, Reg, i32)
+DEF_HELPER_4(glue(vpsrlvd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpsravd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpsllvd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpsrlvq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpsravq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpsllvq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_3(glue(vtestps, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(vtestpd, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(vpmaskmovd_st, SUFFIX), void, env, Reg, Reg, tl)
+DEF_HELPER_4(glue(vpmaskmovq_st, SUFFIX), void, env, Reg, Reg, tl)
+DEF_HELPER_4(glue(vpmaskmovd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_5(glue(vpgatherdd0, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdq0, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqd0, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqq0, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdd1, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdq1, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqd1, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqq1, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdd2, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdq2, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqd2, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqq2, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdd3, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdq3, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqd3, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqq3, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_4(glue(vpmaskmovq, SUFFIX), void, env, Reg, Reg, Reg)
+#if SHIFT == 2
+DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_1(vzeroall, void, env)
+DEF_HELPER_1(vzeroupper, void, env)
+#ifdef TARGET_X86_64
+DEF_HELPER_1(vzeroall_hi8, void, env)
+DEF_HELPER_1(vzeroupper_hi8, void, env)
+#endif
+DEF_HELPER_5(vpermdq_ymm, void, env, Reg, Reg, Reg, i32)
+DEF_HELPER_4(vpermq_ymm, void, env, Reg, Reg, i32)
+DEF_HELPER_4(vpermd_ymm, void, env, Reg, Reg, Reg)
+#endif
 #endif
 
 #undef SHIFT
@@ -354,6 +471,9 @@ DEF_HELPER_4(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, i32)
 #undef SSE_HELPER_W
 #undef SSE_HELPER_L
 #undef SSE_HELPER_Q
-#undef SSE_HELPER_S
+#undef SSE_HELPER_S3
+#undef SSE_HELPER_S4
+#undef SSE_HELPER_P3
+#undef SSE_HELPER_P4
 #undef SSE_HELPER_CMP
 #undef UNPCK_OP
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index b391b69635..74cf86c986 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -3053,3 +3053,6 @@ void helper_movq(CPUX86State *env, void *d, void *s)
 
 #define SHIFT 1
 #include "ops_sse.h"
+
+#define SHIFT 2
+#include "ops_sse.h"
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index c393913fe0..f1c7ab4455 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -125,6 +125,7 @@ typedef struct DisasContext {
     TCGv tmp4;
     TCGv_ptr ptr0;
     TCGv_ptr ptr1;
+    TCGv_ptr ptr2;
     TCGv_i32 tmp2_i32;
     TCGv_i32 tmp3_i32;
     TCGv_i64 tmp1_i64;
@@ -2739,6 +2740,29 @@ static inline void gen_ldo_env_A0(DisasContext *s, int offset)
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(1)));
 }
 
+static inline void gen_ldo_env_A0_ymmh(DisasContext *s, int offset)
+{
+    int mem_index = s->mem_index;
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, mem_index, MO_LEUQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_addi_tl(s->tmp0, s->A0, 8);
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3)));
+}
+
+/* Load 256-bit ymm register value */
+static inline void gen_ldy_env_A0(DisasContext *s, int offset)
+{
+    int mem_index = s->mem_index;
+    gen_ldo_env_A0(s, offset);
+    tcg_gen_addi_tl(s->tmp0, s->A0, 16);
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_addi_tl(s->tmp0, s->A0, 24);
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3)));
+}
+
 static inline void gen_sto_env_A0(DisasContext *s, int offset)
 {
     int mem_index = s->mem_index;
@@ -2749,6 +2773,29 @@ static inline void gen_sto_env_A0(DisasContext *s, int offset)
     tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
 }
 
+static inline void gen_sto_env_A0_ymmh(DisasContext *s, int offset)
+{
+    int mem_index = s->mem_index;
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0, mem_index, MO_LEUQ);
+    tcg_gen_addi_tl(s->tmp0, s->A0, 8);
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3)));
+    tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+}
+
+/* Store 256-bit ymm register value */
+static inline void gen_sty_env_A0(DisasContext *s, int offset)
+{
+    int mem_index = s->mem_index;
+    gen_sto_env_A0(s, offset);
+    tcg_gen_addi_tl(s->tmp0, s->A0, 16);
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+    tcg_gen_addi_tl(s->tmp0, s->A0, 24);
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3)));
+    tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+}
+
 static inline void gen_op_movo(DisasContext *s, int d_offset, int s_offset)
 {
     tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(0)));
@@ -2757,6 +2804,32 @@ static inline void gen_op_movo(DisasContext *s, int d_offset, int s_offset)
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(1)));
 }
 
+static inline void gen_op_movo_ymmh(DisasContext *s, int d_offset, int s_offset)
+{
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(3)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(3)));
+}
+
+static inline void gen_op_movo_ymm_l2h(DisasContext *s,
+                                       int d_offset, int s_offset)
+{
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(0)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(1)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(3)));
+}
+
+static inline void gen_op_movo_ymm_h2l(DisasContext *s,
+                                       int d_offset, int s_offset)
+{
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(0)));
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(3)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(1)));
+}
+
 static inline void gen_op_movq(DisasContext *s, int d_offset, int s_offset)
 {
     tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset);
@@ -2775,170 +2848,270 @@ static inline void gen_op_movq_env_0(DisasContext *s, int d_offset)
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset);
 }
 
+#define XMM_OFFSET(reg) offsetof(CPUX86State, xmm_regs[reg])
+
+/*
+ * Clear the top half of the ymm register after a VEX.128 instruction.
+ * This could be optimized by tracking this in env->hflags.
+ */
+static void gen_clear_ymmh(DisasContext *s, int reg)
+{
+    if (s->prefix & PREFIX_VEX) {
+        gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(2)));
+        gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(3)));
+    }
+}
+
+typedef void (*SSEFunc_0_pp)(TCGv_ptr reg_a, TCGv_ptr reg_b);
 typedef void (*SSEFunc_i_ep)(TCGv_i32 val, TCGv_ptr env, TCGv_ptr reg);
 typedef void (*SSEFunc_l_ep)(TCGv_i64 val, TCGv_ptr env, TCGv_ptr reg);
 typedef void (*SSEFunc_0_epi)(TCGv_ptr env, TCGv_ptr reg, TCGv_i32 val);
 typedef void (*SSEFunc_0_epl)(TCGv_ptr env, TCGv_ptr reg, TCGv_i64 val);
 typedef void (*SSEFunc_0_epp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b);
+typedef void (*SSEFunc_0_eppp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
+                               TCGv_ptr reg_c);
+typedef void (*SSEFunc_0_epppp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
+                                TCGv_ptr reg_c, TCGv_ptr reg_d);
 typedef void (*SSEFunc_0_eppi)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
                                TCGv_i32 val);
+typedef void (*SSEFunc_0_epppi)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
+                                TCGv_ptr reg_c, TCGv_i32 val);
 typedef void (*SSEFunc_0_ppi)(TCGv_ptr reg_a, TCGv_ptr reg_b, TCGv_i32 val);
+typedef void (*SSEFunc_0_pppi)(TCGv_ptr reg_a, TCGv_ptr reg_b, TCGv_ptr reg_c,
+                               TCGv_i32 val);
 typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
                                TCGv val);
+typedef void (*SSEFunc_0_epppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
+                                TCGv_ptr reg_c, TCGv val);
+
+#define SSE_OPF_V0        (1 << 0) /* vex.v must be 1111b (only 2 operands) */
+#define SSE_OPF_CMP       (1 << 1) /* does not write to first operand */
+#define SSE_OPF_BLENDV    (1 << 2) /* blendv* instruction */
+#define SSE_OPF_SPECIAL   (1 << 3) /* handled by special-case code */
+#define SSE_OPF_3DNOW     (1 << 4) /* 3DNow! instruction */
+#define SSE_OPF_MMX       (1 << 5) /* MMX/integer/AVX2 instruction */
+#define SSE_OPF_SCALAR    (1 << 6) /* Has SSE scalar variants */
+#define SSE_OPF_AVX2      (1 << 7) /* AVX2 instruction */
+#define SSE_OPF_SHUF      (1 << 9) /* pshufx/shufpx */
+
+#define OP(op, flags, a, b, c, d, e, f, g, h)       \
+    {flags, {{.op = a}, {.op = b}, {.op = c}, {.op = d},    \
+             {.op = e}, {.op = f}, {.op = g}, {.op = h} } }
+
+#define MMX_OP(x) OP(op2, SSE_OPF_MMX, \
+        gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL, \
+        NULL, gen_helper_ ## x ## _ymm, NULL, NULL)
+
+#define SSE_FOP(name) OP(op2, SSE_OPF_SCALAR, \
+        gen_helper_##name##ps_xmm, gen_helper_##name##pd_xmm, \
+        gen_helper_##name##ss, gen_helper_##name##sd, \
+        gen_helper_##name##ps_ymm, gen_helper_##name##pd_ymm, NULL, NULL)
+#define SSE_OP(sname, dname, op, flags) OP(op, flags, \
+        gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL, \
+        gen_helper_##sname##_ymm, gen_helper_##dname##_ymm, NULL, NULL)
+
+struct SSEOpHelper_table1 {
+    int flags;
+    union {
+        SSEFunc_0_epp op1;
+        SSEFunc_0_ppi op1i;
+        SSEFunc_0_eppt op1t;
+        SSEFunc_0_eppp op2;
+        SSEFunc_0_pppi op2i;
+    } fn[8];
+};
 
-#define SSE_SPECIAL ((void *)1)
-#define SSE_DUMMY ((void *)2)
-
-#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm }
-#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \
-                     gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, }
+#define SSE_3DNOW { SSE_OPF_3DNOW }
+#define SSE_SPECIAL { SSE_OPF_SPECIAL }
 
-static const SSEFunc_0_epp sse_op_table1[256][4] = {
+static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     /* 3DNow! extensions */
-    [0x0e] = { SSE_DUMMY }, /* femms */
-    [0x0f] = { SSE_DUMMY }, /* pf... */
+    [0x0e] = SSE_SPECIAL, /* femms */
+    [0x0f] = SSE_3DNOW, /* pf... (sse_op_table5) */
     /* pure SSE operations */
-    [0x10] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movups, movupd, movss, movsd */
-    [0x11] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movups, movupd, movss, movsd */
-    [0x12] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movlps, movlpd, movsldup, movddup */
-    [0x13] = { SSE_SPECIAL, SSE_SPECIAL },  /* movlps, movlpd */
-    [0x14] = { gen_helper_punpckldq_xmm, gen_helper_punpcklqdq_xmm },
-    [0x15] = { gen_helper_punpckhdq_xmm, gen_helper_punpckhqdq_xmm },
-    [0x16] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL },  /* movhps, movhpd, movshdup */
-    [0x17] = { SSE_SPECIAL, SSE_SPECIAL },  /* movhps, movhpd */
-
-    [0x28] = { SSE_SPECIAL, SSE_SPECIAL },  /* movaps, movapd */
-    [0x29] = { SSE_SPECIAL, SSE_SPECIAL },  /* movaps, movapd */
-    [0x2a] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* cvtpi2ps, cvtpi2pd, cvtsi2ss, cvtsi2sd */
-    [0x2b] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movntps, movntpd, movntss, movntsd */
-    [0x2c] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* cvttps2pi, cvttpd2pi, cvttsd2si, cvttss2si */
-    [0x2d] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* cvtps2pi, cvtpd2pi, cvtsd2si, cvtss2si */
-    [0x2e] = { gen_helper_ucomiss, gen_helper_ucomisd },
-    [0x2f] = { gen_helper_comiss, gen_helper_comisd },
-    [0x50] = { SSE_SPECIAL, SSE_SPECIAL }, /* movmskps, movmskpd */
-    [0x51] = SSE_FOP(sqrt),
-    [0x52] = { gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL },
-    [0x53] = { gen_helper_rcpps, NULL, gen_helper_rcpss, NULL },
-    [0x54] = { gen_helper_pand_xmm, gen_helper_pand_xmm }, /* andps, andpd */
-    [0x55] = { gen_helper_pandn_xmm, gen_helper_pandn_xmm }, /* andnps, andnpd */
-    [0x56] = { gen_helper_por_xmm, gen_helper_por_xmm }, /* orps, orpd */
-    [0x57] = { gen_helper_pxor_xmm, gen_helper_pxor_xmm }, /* xorps, xorpd */
+    [0x10] = SSE_SPECIAL, /* movups, movupd, movss, movsd */
+    [0x11] = SSE_SPECIAL, /* movups, movupd, movss, movsd */
+    [0x12] = SSE_SPECIAL, /* movlps, movlpd, movsldup, movddup */
+    [0x13] = SSE_SPECIAL, /* movlps, movlpd */
+    [0x14] = SSE_OP(punpckldq, punpcklqdq, op2, 0), /* unpcklps, unpcklpd */
+    [0x15] = SSE_OP(punpckhdq, punpckhqdq, op2, 0), /* unpckhps, unpckhpd */
+    [0x16] = SSE_SPECIAL, /* movhps, movhpd, movshdup */
+    [0x17] = SSE_SPECIAL, /* movhps, movhpd */
+
+    [0x28] = SSE_SPECIAL, /* movaps, movapd */
+    [0x29] = SSE_SPECIAL, /* movaps, movapd */
+    [0x2a] = SSE_SPECIAL, /* cvtpi2ps, cvtpi2pd, cvtsi2ss, cvtsi2sd */
+    [0x2b] = SSE_SPECIAL, /* movntps, movntpd, movntss, movntsd */
+    [0x2c] = SSE_SPECIAL, /* cvttps2pi, cvttpd2pi, cvttss2si, cvttsd2si */
+    [0x2d] = SSE_SPECIAL, /* cvtps2pi, cvtpd2pi, cvtss2si, cvtsd2si */
+    [0x2e] = OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR | SSE_OPF_V0,
+            gen_helper_ucomiss, gen_helper_ucomisd, NULL, NULL,
+            NULL, NULL, NULL, NULL),
+    [0x2f] = OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR | SSE_OPF_V0,
+            gen_helper_comiss, gen_helper_comisd, NULL, NULL,
+            NULL, NULL, NULL, NULL),
+    [0x50] = SSE_SPECIAL, /* movmskps, movmskpd */
+    [0x51] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
+                gen_helper_sqrtps_xmm, gen_helper_sqrtpd_xmm,
+                gen_helper_sqrtss, gen_helper_sqrtsd,
+                gen_helper_sqrtps_ymm, gen_helper_sqrtpd_ymm, NULL, NULL),
+    [0x52] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
+                gen_helper_rsqrtps_xmm, NULL, gen_helper_rsqrtss, NULL,
+                gen_helper_rsqrtps_ymm, NULL, NULL, NULL),
+    [0x53] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
+                gen_helper_rcpps_xmm, NULL, gen_helper_rcpss, NULL,
+                gen_helper_rcpps_ymm, NULL, NULL, NULL),
+    [0x54] = SSE_OP(pand, pand, op2, 0), /* andps, andpd */
+    [0x55] = SSE_OP(pandn, pandn, op2, 0), /* andnps, andnpd */
+    [0x56] = SSE_OP(por, por, op2, 0), /* orps, orpd */
+    [0x57] = SSE_OP(pxor, pxor, op2, 0), /* xorps, xorpd */
     [0x58] = SSE_FOP(add),
     [0x59] = SSE_FOP(mul),
-    [0x5a] = { gen_helper_cvtps2pd, gen_helper_cvtpd2ps,
-               gen_helper_cvtss2sd, gen_helper_cvtsd2ss },
-    [0x5b] = { gen_helper_cvtdq2ps, gen_helper_cvtps2dq, gen_helper_cvttps2dq },
+    [0x5a] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
+                gen_helper_cvtps2pd_xmm, gen_helper_cvtpd2ps_xmm,
+                gen_helper_cvtss2sd, gen_helper_cvtsd2ss,
+                gen_helper_cvtps2pd_ymm, gen_helper_cvtpd2ps_ymm, NULL, NULL),
+    [0x5b] = OP(op1, SSE_OPF_V0,
+                gen_helper_cvtdq2ps_xmm, gen_helper_cvtps2dq_xmm,
+                gen_helper_cvttps2dq_xmm, NULL,
+                gen_helper_cvtdq2ps_ymm, gen_helper_cvtps2dq_ymm,
+                gen_helper_cvttps2dq_ymm, NULL),
     [0x5c] = SSE_FOP(sub),
     [0x5d] = SSE_FOP(min),
     [0x5e] = SSE_FOP(div),
     [0x5f] = SSE_FOP(max),
 
-    [0xc2] = SSE_FOP(cmpeq),
-    [0xc6] = { (SSEFunc_0_epp)gen_helper_shufps,
-               (SSEFunc_0_epp)gen_helper_shufpd }, /* XXX: casts */
+    [0xc2] = SSE_FOP(cmpeq), /* sse_op_table4 */
+    [0xc6] = SSE_OP(shufps, shufpd, op2i, SSE_OPF_SHUF),
 
     /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX.  */
-    [0x38] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL },
-    [0x3a] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL },
+    [0x38] = SSE_SPECIAL,
+    [0x3a] = SSE_SPECIAL,
 
     /* MMX ops and their SSE extensions */
-    [0x60] = MMX_OP2(punpcklbw),
-    [0x61] = MMX_OP2(punpcklwd),
-    [0x62] = MMX_OP2(punpckldq),
-    [0x63] = MMX_OP2(packsswb),
-    [0x64] = MMX_OP2(pcmpgtb),
-    [0x65] = MMX_OP2(pcmpgtw),
-    [0x66] = MMX_OP2(pcmpgtl),
-    [0x67] = MMX_OP2(packuswb),
-    [0x68] = MMX_OP2(punpckhbw),
-    [0x69] = MMX_OP2(punpckhwd),
-    [0x6a] = MMX_OP2(punpckhdq),
-    [0x6b] = MMX_OP2(packssdw),
-    [0x6c] = { NULL, gen_helper_punpcklqdq_xmm },
-    [0x6d] = { NULL, gen_helper_punpckhqdq_xmm },
-    [0x6e] = { SSE_SPECIAL, SSE_SPECIAL }, /* movd mm, ea */
-    [0x6f] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movq, movdqa, , movqdu */
-    [0x70] = { (SSEFunc_0_epp)gen_helper_pshufw_mmx,
-               (SSEFunc_0_epp)gen_helper_pshufd_xmm,
-               (SSEFunc_0_epp)gen_helper_pshufhw_xmm,
-               (SSEFunc_0_epp)gen_helper_pshuflw_xmm }, /* XXX: casts */
-    [0x71] = { SSE_SPECIAL, SSE_SPECIAL }, /* shiftw */
-    [0x72] = { SSE_SPECIAL, SSE_SPECIAL }, /* shiftd */
-    [0x73] = { SSE_SPECIAL, SSE_SPECIAL }, /* shiftq */
-    [0x74] = MMX_OP2(pcmpeqb),
-    [0x75] = MMX_OP2(pcmpeqw),
-    [0x76] = MMX_OP2(pcmpeql),
-    [0x77] = { SSE_DUMMY }, /* emms */
-    [0x78] = { NULL, SSE_SPECIAL, NULL, SSE_SPECIAL }, /* extrq_i, insertq_i */
-    [0x79] = { NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r },
-    [0x7c] = { NULL, gen_helper_haddpd, NULL, gen_helper_haddps },
-    [0x7d] = { NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps },
-    [0x7e] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movd, movd, , movq */
-    [0x7f] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movq, movdqa, movdqu */
-    [0xc4] = { SSE_SPECIAL, SSE_SPECIAL }, /* pinsrw */
-    [0xc5] = { SSE_SPECIAL, SSE_SPECIAL }, /* pextrw */
-    [0xd0] = { NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps },
-    [0xd1] = MMX_OP2(psrlw),
-    [0xd2] = MMX_OP2(psrld),
-    [0xd3] = MMX_OP2(psrlq),
-    [0xd4] = MMX_OP2(paddq),
-    [0xd5] = MMX_OP2(pmullw),
-    [0xd6] = { NULL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL },
-    [0xd7] = { SSE_SPECIAL, SSE_SPECIAL }, /* pmovmskb */
-    [0xd8] = MMX_OP2(psubusb),
-    [0xd9] = MMX_OP2(psubusw),
-    [0xda] = MMX_OP2(pminub),
-    [0xdb] = MMX_OP2(pand),
-    [0xdc] = MMX_OP2(paddusb),
-    [0xdd] = MMX_OP2(paddusw),
-    [0xde] = MMX_OP2(pmaxub),
-    [0xdf] = MMX_OP2(pandn),
-    [0xe0] = MMX_OP2(pavgb),
-    [0xe1] = MMX_OP2(psraw),
-    [0xe2] = MMX_OP2(psrad),
-    [0xe3] = MMX_OP2(pavgw),
-    [0xe4] = MMX_OP2(pmulhuw),
-    [0xe5] = MMX_OP2(pmulhw),
-    [0xe6] = { NULL, gen_helper_cvttpd2dq, gen_helper_cvtdq2pd, gen_helper_cvtpd2dq },
-    [0xe7] = { SSE_SPECIAL , SSE_SPECIAL },  /* movntq, movntq */
-    [0xe8] = MMX_OP2(psubsb),
-    [0xe9] = MMX_OP2(psubsw),
-    [0xea] = MMX_OP2(pminsw),
-    [0xeb] = MMX_OP2(por),
-    [0xec] = MMX_OP2(paddsb),
-    [0xed] = MMX_OP2(paddsw),
-    [0xee] = MMX_OP2(pmaxsw),
-    [0xef] = MMX_OP2(pxor),
-    [0xf0] = { NULL, NULL, NULL, SSE_SPECIAL }, /* lddqu */
-    [0xf1] = MMX_OP2(psllw),
-    [0xf2] = MMX_OP2(pslld),
-    [0xf3] = MMX_OP2(psllq),
-    [0xf4] = MMX_OP2(pmuludq),
-    [0xf5] = MMX_OP2(pmaddwd),
-    [0xf6] = MMX_OP2(psadbw),
-    [0xf7] = { (SSEFunc_0_epp)gen_helper_maskmov_mmx,
-               (SSEFunc_0_epp)gen_helper_maskmov_xmm }, /* XXX: casts */
-    [0xf8] = MMX_OP2(psubb),
-    [0xf9] = MMX_OP2(psubw),
-    [0xfa] = MMX_OP2(psubl),
-    [0xfb] = MMX_OP2(psubq),
-    [0xfc] = MMX_OP2(paddb),
-    [0xfd] = MMX_OP2(paddw),
-    [0xfe] = MMX_OP2(paddl),
+    [0x60] = MMX_OP(punpcklbw),
+    [0x61] = MMX_OP(punpcklwd),
+    [0x62] = MMX_OP(punpckldq),
+    [0x63] = MMX_OP(packsswb),
+    [0x64] = MMX_OP(pcmpgtb),
+    [0x65] = MMX_OP(pcmpgtw),
+    [0x66] = MMX_OP(pcmpgtl),
+    [0x67] = MMX_OP(packuswb),
+    [0x68] = MMX_OP(punpckhbw),
+    [0x69] = MMX_OP(punpckhwd),
+    [0x6a] = MMX_OP(punpckhdq),
+    [0x6b] = MMX_OP(packssdw),
+    [0x6c] = OP(op2, SSE_OPF_MMX,
+                NULL, gen_helper_punpcklqdq_xmm, NULL, NULL,
+                NULL, gen_helper_punpcklqdq_ymm, NULL, NULL),
+    [0x6d] = OP(op2, SSE_OPF_MMX,
+                NULL, gen_helper_punpckhqdq_xmm, NULL, NULL,
+                NULL, gen_helper_punpckhqdq_ymm, NULL, NULL),
+    [0x6e] = SSE_SPECIAL, /* movd mm, ea */
+    [0x6f] = SSE_SPECIAL, /* movq, movdqa, movdqu */
+    [0x70] = OP(op1i, SSE_OPF_SHUF | SSE_OPF_MMX | SSE_OPF_V0,
+            gen_helper_pshufw_mmx, gen_helper_pshufd_xmm,
+            gen_helper_pshufhw_xmm, gen_helper_pshuflw_xmm,
+            NULL, gen_helper_pshufd_ymm,
+            gen_helper_pshufhw_ymm, gen_helper_pshuflw_ymm),
+    [0x71] = SSE_SPECIAL, /* shiftw */
+    [0x72] = SSE_SPECIAL, /* shiftd */
+    [0x73] = SSE_SPECIAL, /* shiftq */
+    [0x74] = MMX_OP(pcmpeqb),
+    [0x75] = MMX_OP(pcmpeqw),
+    [0x76] = MMX_OP(pcmpeql),
+    [0x77] = SSE_SPECIAL, /* emms */
+    [0x78] = SSE_SPECIAL, /* extrq_i, insertq_i (sse4a) */
+    [0x79] = OP(op1, SSE_OPF_V0,
+            NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r,
+            NULL, NULL, NULL, NULL),
+    [0x7c] = OP(op2, 0,
+                NULL, gen_helper_haddpd_xmm, NULL, gen_helper_haddps_xmm,
+                NULL, gen_helper_haddpd_ymm, NULL, gen_helper_haddps_ymm),
+    [0x7d] = OP(op2, 0,
+                NULL, gen_helper_hsubpd_xmm, NULL, gen_helper_hsubps_xmm,
+                NULL, gen_helper_hsubpd_ymm, NULL, gen_helper_hsubps_ymm),
+    [0x7e] = SSE_SPECIAL, /* movd, movd, movq */
+    [0x7f] = SSE_SPECIAL, /* movq, movdqa, movdqu */
+    [0xc4] = SSE_SPECIAL, /* pinsrw */
+    [0xc5] = SSE_SPECIAL, /* pextrw */
+    [0xd0] = OP(op2, 0,
+                NULL, gen_helper_addsubpd_xmm, NULL, gen_helper_addsubps_xmm,
+                NULL, gen_helper_addsubpd_ymm, NULL, gen_helper_addsubps_ymm),
+    [0xd1] = MMX_OP(psrlw),
+    [0xd2] = MMX_OP(psrld),
+    [0xd3] = MMX_OP(psrlq),
+    [0xd4] = MMX_OP(paddq),
+    [0xd5] = MMX_OP(pmullw),
+    [0xd6] = SSE_SPECIAL,
+    [0xd7] = SSE_SPECIAL, /* pmovmskb */
+    [0xd8] = MMX_OP(psubusb),
+    [0xd9] = MMX_OP(psubusw),
+    [0xda] = MMX_OP(pminub),
+    [0xdb] = MMX_OP(pand),
+    [0xdc] = MMX_OP(paddusb),
+    [0xdd] = MMX_OP(paddusw),
+    [0xde] = MMX_OP(pmaxub),
+    [0xdf] = MMX_OP(pandn),
+    [0xe0] = MMX_OP(pavgb),
+    [0xe1] = MMX_OP(psraw),
+    [0xe2] = MMX_OP(psrad),
+    [0xe3] = MMX_OP(pavgw),
+    [0xe4] = MMX_OP(pmulhuw),
+    [0xe5] = MMX_OP(pmulhw),
+    [0xe6] = OP(op1, SSE_OPF_V0,
+            NULL, gen_helper_cvttpd2dq_xmm,
+            gen_helper_cvtdq2pd_xmm, gen_helper_cvtpd2dq_xmm,
+            NULL, gen_helper_cvttpd2dq_ymm,
+            gen_helper_cvtdq2pd_ymm, gen_helper_cvtpd2dq_ymm),
+    [0xe7] = SSE_SPECIAL,  /* movntq, movntdq */
+    [0xe8] = MMX_OP(psubsb),
+    [0xe9] = MMX_OP(psubsw),
+    [0xea] = MMX_OP(pminsw),
+    [0xeb] = MMX_OP(por),
+    [0xec] = MMX_OP(paddsb),
+    [0xed] = MMX_OP(paddsw),
+    [0xee] = MMX_OP(pmaxsw),
+    [0xef] = MMX_OP(pxor),
+    [0xf0] = SSE_SPECIAL, /* lddqu */
+    [0xf1] = MMX_OP(psllw),
+    [0xf2] = MMX_OP(pslld),
+    [0xf3] = MMX_OP(psllq),
+    [0xf4] = MMX_OP(pmuludq),
+    [0xf5] = MMX_OP(pmaddwd),
+    [0xf6] = MMX_OP(psadbw),
+    [0xf7] = OP(op1t, SSE_OPF_MMX | SSE_OPF_V0,
+                gen_helper_maskmov_mmx, gen_helper_maskmov_xmm, NULL, NULL,
+                NULL, NULL, NULL, NULL),
+    [0xf8] = MMX_OP(psubb),
+    [0xf9] = MMX_OP(psubw),
+    [0xfa] = MMX_OP(psubl),
+    [0xfb] = MMX_OP(psubq),
+    [0xfc] = MMX_OP(paddb),
+    [0xfd] = MMX_OP(paddw),
+    [0xfe] = MMX_OP(paddl),
 };
-
-static const SSEFunc_0_epp sse_op_table2[3 * 8][2] = {
-    [0 + 2] = MMX_OP2(psrlw),
-    [0 + 4] = MMX_OP2(psraw),
-    [0 + 6] = MMX_OP2(psllw),
-    [8 + 2] = MMX_OP2(psrld),
-    [8 + 4] = MMX_OP2(psrad),
-    [8 + 6] = MMX_OP2(pslld),
-    [16 + 2] = MMX_OP2(psrlq),
-    [16 + 3] = { NULL, gen_helper_psrldq_xmm },
-    [16 + 6] = MMX_OP2(psllq),
-    [16 + 7] = { NULL, gen_helper_pslldq_xmm },
+#undef MMX_OP
+#undef OP
+#undef SSE_FOP
+#undef SSE_OP
+#undef SSE_SPECIAL
+
+#define MMX_OP(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, \
+                    gen_helper_ ## x ## _ymm}
+static const SSEFunc_0_eppp sse_op_table2[3 * 8][3] = {
+    [0 + 2] = MMX_OP(psrlw),
+    [0 + 4] = MMX_OP(psraw),
+    [0 + 6] = MMX_OP(psllw),
+    [8 + 2] = MMX_OP(psrld),
+    [8 + 4] = MMX_OP(psrad),
+    [8 + 6] = MMX_OP(pslld),
+    [16 + 2] = MMX_OP(psrlq),
+    [16 + 3] = { NULL, gen_helper_psrldq_xmm, gen_helper_psrldq_ymm},
+    [16 + 6] = MMX_OP(psllq),
+    [16 + 7] = { NULL, gen_helper_pslldq_xmm, gen_helper_pslldq_ymm},
 };
+#undef MMX_OP
 
 static const SSEFunc_0_epi sse_op_table3ai[] = {
     gen_helper_cvtsi2ss,
@@ -2968,16 +3141,53 @@ static const SSEFunc_l_ep sse_op_table3bq[] = {
 };
 #endif
 
-static const SSEFunc_0_epp sse_op_table4[8][4] = {
-    SSE_FOP(cmpeq),
-    SSE_FOP(cmplt),
-    SSE_FOP(cmple),
-    SSE_FOP(cmpunord),
-    SSE_FOP(cmpneq),
-    SSE_FOP(cmpnlt),
-    SSE_FOP(cmpnle),
-    SSE_FOP(cmpord),
+#define SSE_CMP(x) { \
+    gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \
+    gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, \
+    gen_helper_ ## x ## ps ## _ymm, gen_helper_ ## x ## pd ## _ymm}
+static const SSEFunc_0_eppp sse_op_table4[32][6] = {
+    SSE_CMP(cmpeq),
+    SSE_CMP(cmplt),
+    SSE_CMP(cmple),
+    SSE_CMP(cmpunord),
+    SSE_CMP(cmpneq),
+    SSE_CMP(cmpnlt),
+    SSE_CMP(cmpnle),
+    SSE_CMP(cmpord),
+
+    SSE_CMP(cmpequ),
+    SSE_CMP(cmpnge),
+    SSE_CMP(cmpngt),
+    SSE_CMP(cmpfalse),
+    SSE_CMP(cmpnequ),
+    SSE_CMP(cmpge),
+    SSE_CMP(cmpgt),
+    SSE_CMP(cmptrue),
+
+    SSE_CMP(cmpeqs),
+    SSE_CMP(cmpltq),
+    SSE_CMP(cmpleq),
+    SSE_CMP(cmpunords),
+    SSE_CMP(cmpneqq),
+    SSE_CMP(cmpnltq),
+    SSE_CMP(cmpnleq),
+    SSE_CMP(cmpords),
+
+    SSE_CMP(cmpequs),
+    SSE_CMP(cmpngeq),
+    SSE_CMP(cmpngtq),
+    SSE_CMP(cmpfalses),
+    SSE_CMP(cmpnequs),
+    SSE_CMP(cmpgeq),
+    SSE_CMP(cmpgtq),
+    SSE_CMP(cmptrues),
 };
+#undef SSE_CMP
+
+static void gen_helper_pavgusb(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b)
+{
+    gen_helper_pavgb_mmx(env, reg_a, reg_a, reg_b);
+}
 
 static const SSEFunc_0_epp sse_op_table5[256] = {
     [0x0c] = gen_helper_pi2fw,
@@ -3003,117 +3213,291 @@ static const SSEFunc_0_epp sse_op_table5[256] = {
     [0xb6] = gen_helper_movq, /* pfrcpit2 */
     [0xb7] = gen_helper_pmulhrw_mmx,
     [0xbb] = gen_helper_pswapd,
-    [0xbf] = gen_helper_pavgb_mmx /* pavgusb */
+    [0xbf] = gen_helper_pavgusb,
 };
 
-struct SSEOpHelper_epp {
-    SSEFunc_0_epp op[2];
+struct SSEOpHelper_table6 {
+    union {
+        SSEFunc_0_epp op1;
+        SSEFunc_0_eppp op2;
+        SSEFunc_0_epppp op3;
+    } fn[3]; /* [0] = mmx, [1] = xmm, [2] = ymm */
     uint32_t ext_mask;
+    int flags;
 };
 
-struct SSEOpHelper_eppi {
-    SSEFunc_0_eppi op[2];
+struct SSEOpHelper_table7 {
+    union {
+        SSEFunc_0_eppi op1;
+        SSEFunc_0_epppi op2;
+        SSEFunc_0_epppp op3;
+    } fn[3];
     uint32_t ext_mask;
+    int flags;
+};
+
+#define gen_helper_special_xmm NULL
+#define gen_helper_special_ymm NULL
+
+#define OP(name, op, flags, ext, mmx_name) \
+    {{{.op = mmx_name}, {.op = gen_helper_ ## name ## _xmm}, \
+      {.op = gen_helper_ ## name ## _ymm} }, CPUID_EXT_ ## ext, flags}
+#define BINARY_OP_MMX(name, ext) \
+    OP(name, op2, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx)
+#define BINARY_OP(name, ext, flags) \
+    OP(name, op2, flags, ext, NULL)
+#define UNARY_OP_MMX(name, ext) \
+    OP(name, op1, SSE_OPF_V0 | SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx)
+#define UNARY_OP(name, ext, flags) \
+    OP(name, op1, SSE_OPF_V0 | flags, ext, NULL)
+#define BLENDV_OP(name, ext, flags) OP(name, op3, SSE_OPF_BLENDV, ext, NULL)
+#define CMP_OP(name, ext) OP(name, op1, SSE_OPF_CMP | SSE_OPF_V0, ext, NULL)
+#define SPECIAL_OP(ext) OP(special, op1, SSE_OPF_SPECIAL, ext, NULL)
+
+/* prefix [66] 0f 38 */
+static const struct SSEOpHelper_table6 sse_op_table6[256] = {
+    [0x00] = BINARY_OP_MMX(pshufb, SSSE3),
+    [0x01] = BINARY_OP_MMX(phaddw, SSSE3),
+    [0x02] = BINARY_OP_MMX(phaddd, SSSE3),
+    [0x03] = BINARY_OP_MMX(phaddsw, SSSE3),
+    [0x04] = BINARY_OP_MMX(pmaddubsw, SSSE3),
+    [0x05] = BINARY_OP_MMX(phsubw, SSSE3),
+    [0x06] = BINARY_OP_MMX(phsubd, SSSE3),
+    [0x07] = BINARY_OP_MMX(phsubsw, SSSE3),
+    [0x08] = BINARY_OP_MMX(psignb, SSSE3),
+    [0x09] = BINARY_OP_MMX(psignw, SSSE3),
+    [0x0a] = BINARY_OP_MMX(psignd, SSSE3),
+    [0x0b] = BINARY_OP_MMX(pmulhrsw, SSSE3),
+    [0x0c] = BINARY_OP(vpermilps, AVX, 0),
+    [0x0d] = BINARY_OP(vpermilpd, AVX, 0),
+    [0x0e] = CMP_OP(vtestps, AVX),
+    [0x0f] = CMP_OP(vtestpd, AVX),
+    [0x10] = BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX),
+    [0x14] = BLENDV_OP(blendvps, SSE41, 0),
+    [0x15] = BLENDV_OP(blendvpd, SSE41, 0),
+#define gen_helper_vpermd_xmm NULL
+    [0x16] = BINARY_OP(vpermd, AVX, SSE_OPF_AVX2), /* vpermps */
+    [0x17] = CMP_OP(ptest, SSE41),
+    /* TODO: Some vbroadcast variants require AVX2 */
+    [0x18] = UNARY_OP(vbroadcastl, AVX, SSE_OPF_SCALAR), /* vbroadcastss */
+    [0x19] = UNARY_OP(vbroadcastq, AVX, SSE_OPF_SCALAR), /* vbroadcastsd */
+#define gen_helper_vbroadcastdq_xmm NULL
+    [0x1a] = UNARY_OP(vbroadcastdq, AVX, SSE_OPF_SCALAR), /* vbroadcastf128 */
+    [0x1c] = UNARY_OP_MMX(pabsb, SSSE3),
+    [0x1d] = UNARY_OP_MMX(pabsw, SSSE3),
+    [0x1e] = UNARY_OP_MMX(pabsd, SSSE3),
+    [0x20] = UNARY_OP(pmovsxbw, SSE41, SSE_OPF_MMX),
+    [0x21] = UNARY_OP(pmovsxbd, SSE41, SSE_OPF_MMX),
+    [0x22] = UNARY_OP(pmovsxbq, SSE41, SSE_OPF_MMX),
+    [0x23] = UNARY_OP(pmovsxwd, SSE41, SSE_OPF_MMX),
+    [0x24] = UNARY_OP(pmovsxwq, SSE41, SSE_OPF_MMX),
+    [0x25] = UNARY_OP(pmovsxdq, SSE41, SSE_OPF_MMX),
+    [0x28] = BINARY_OP(pmuldq, SSE41, SSE_OPF_MMX),
+    [0x29] = BINARY_OP(pcmpeqq, SSE41, SSE_OPF_MMX),
+    [0x2a] = SPECIAL_OP(SSE41), /* movntdqa */
+    [0x2b] = BINARY_OP(packusdw, SSE41, SSE_OPF_MMX),
+    [0x2c] = BINARY_OP(vpmaskmovd, AVX, 0), /* vmaskmovps (load) */
+    [0x2d] = BINARY_OP(vpmaskmovq, AVX, 0), /* vmaskmovpd (load) */
+    [0x2e] = SPECIAL_OP(AVX), /* vmaskmovps (store) */
+    [0x2f] = SPECIAL_OP(AVX), /* vmaskmovpd (store) */
+    [0x30] = UNARY_OP(pmovzxbw, SSE41, SSE_OPF_MMX),
+    [0x31] = UNARY_OP(pmovzxbd, SSE41, SSE_OPF_MMX),
+    [0x32] = UNARY_OP(pmovzxbq, SSE41, SSE_OPF_MMX),
+    [0x33] = UNARY_OP(pmovzxwd, SSE41, SSE_OPF_MMX),
+    [0x34] = UNARY_OP(pmovzxwq, SSE41, SSE_OPF_MMX),
+    [0x35] = UNARY_OP(pmovzxdq, SSE41, SSE_OPF_MMX),
+    [0x36] = BINARY_OP(vpermd, AVX, SSE_OPF_AVX2), /* vpermd */
+    [0x37] = BINARY_OP(pcmpgtq, SSE41, SSE_OPF_MMX),
+    [0x38] = BINARY_OP(pminsb, SSE41, SSE_OPF_MMX),
+    [0x39] = BINARY_OP(pminsd, SSE41, SSE_OPF_MMX),
+    [0x3a] = BINARY_OP(pminuw, SSE41, SSE_OPF_MMX),
+    [0x3b] = BINARY_OP(pminud, SSE41, SSE_OPF_MMX),
+    [0x3c] = BINARY_OP(pmaxsb, SSE41, SSE_OPF_MMX),
+    [0x3d] = BINARY_OP(pmaxsd, SSE41, SSE_OPF_MMX),
+    [0x3e] = BINARY_OP(pmaxuw, SSE41, SSE_OPF_MMX),
+    [0x3f] = BINARY_OP(pmaxud, SSE41, SSE_OPF_MMX),
+    [0x40] = BINARY_OP(pmulld, SSE41, SSE_OPF_MMX),
+#define gen_helper_phminposuw_ymm NULL
+    [0x41] = UNARY_OP(phminposuw, SSE41, 0),
+    [0x45] = BINARY_OP(vpsrlvd, AVX, SSE_OPF_AVX2),
+    [0x46] = BINARY_OP(vpsravd, AVX, SSE_OPF_AVX2),
+    [0x47] = BINARY_OP(vpsllvd, AVX, SSE_OPF_AVX2),
+    /* vpbroadcastd */
+    [0x58] = UNARY_OP(vbroadcastl, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX),
+    /* vpbroadcastq */
+    [0x59] = UNARY_OP(vbroadcastq, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX),
+    /* vbroadcasti128 */
+    [0x5a] = UNARY_OP(vbroadcastdq, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX),
+    /* vpbroadcastb */
+    [0x78] = UNARY_OP(vbroadcastb, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX),
+    /* vpbroadcastw */
+    [0x79] = UNARY_OP(vbroadcastw, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX),
+    /* vpmaskmovd, vpmaskmovq (load) */
+    [0x8c] = BINARY_OP(vpmaskmovd, AVX, SSE_OPF_AVX2),
+    [0x8e] = SPECIAL_OP(AVX), /* vpmaskmovd, vpmaskmovq (store) */
+    [0x90] = SPECIAL_OP(AVX), /* vpgatherdd, vpgatherdq */
+    [0x91] = SPECIAL_OP(AVX), /* vpgatherqd, vpgatherqq */
+    [0x92] = SPECIAL_OP(AVX), /* vgatherdpd, vgatherdps */
+    [0x93] = SPECIAL_OP(AVX), /* vgatherqpd, vgatherqps */
+#define gen_helper_aesimc_ymm NULL
+    [0xdb] = UNARY_OP(aesimc, AES, 0),
+    [0xdc] = BINARY_OP(aesenc, AES, 0),
+    [0xdd] = BINARY_OP(aesenclast, AES, 0),
+    [0xde] = BINARY_OP(aesdec, AES, 0),
+    [0xdf] = BINARY_OP(aesdeclast, AES, 0),
+};
+
+/* prefix [66] 0f 3a */
+static const struct SSEOpHelper_table7 sse_op_table7[256] = {
+#define gen_helper_vpermq_xmm NULL
+    [0x00] = UNARY_OP(vpermq, AVX, SSE_OPF_AVX2),
+    [0x01] = UNARY_OP(vpermq, AVX, SSE_OPF_AVX2), /* vpermpd */
+    [0x02] = BINARY_OP(blendps, AVX, SSE_OPF_AVX2), /* vpblendd */
+    [0x04] = UNARY_OP(vpermilps_imm, AVX, 0),
+    [0x05] = UNARY_OP(vpermilpd_imm, AVX, 0),
+#define gen_helper_vpermdq_xmm NULL
+    [0x06] = BINARY_OP(vpermdq, AVX, 0), /* vperm2f128 */
+    [0x08] = UNARY_OP(roundps, SSE41, 0),
+    [0x09] = UNARY_OP(roundpd, SSE41, 0),
+#define gen_helper_roundss_ymm NULL
+    [0x0a] = UNARY_OP(roundss, SSE41, SSE_OPF_SCALAR),
+#define gen_helper_roundsd_ymm NULL
+    [0x0b] = UNARY_OP(roundsd, SSE41, SSE_OPF_SCALAR),
+    [0x0c] = BINARY_OP(blendps, SSE41, 0),
+    [0x0d] = BINARY_OP(blendpd, SSE41, 0),
+    [0x0e] = BINARY_OP(pblendw, SSE41, SSE_OPF_MMX),
+    [0x0f] = BINARY_OP_MMX(palignr, SSSE3),
+    [0x14] = SPECIAL_OP(SSE41), /* pextrb */
+    [0x15] = SPECIAL_OP(SSE41), /* pextrw */
+    [0x16] = SPECIAL_OP(SSE41), /* pextrd/pextrq */
+    [0x17] = SPECIAL_OP(SSE41), /* extractps */
+    [0x18] = SPECIAL_OP(AVX), /* vinsertf128 */
+    [0x19] = SPECIAL_OP(AVX), /* vextractf128 */
+    [0x20] = SPECIAL_OP(SSE41), /* pinsrb */
+    [0x21] = SPECIAL_OP(SSE41), /* insertps */
+    [0x22] = SPECIAL_OP(SSE41), /* pinsrd/pinsrq */
+    [0x38] = SPECIAL_OP(AVX), /* vinserti128 */
+    [0x39] = SPECIAL_OP(AVX), /* vextracti128 */
+    [0x40] = BINARY_OP(dpps, SSE41, 0),
+#define gen_helper_dppd_ymm NULL
+    [0x41] = BINARY_OP(dppd, SSE41, 0),
+    [0x42] = BINARY_OP(mpsadbw, SSE41, SSE_OPF_MMX),
+    [0x44] = BINARY_OP(pclmulqdq, PCLMULQDQ, 0),
+    [0x46] = BINARY_OP(vpermdq, AVX, SSE_OPF_AVX2), /* vperm2i128 */
+    [0x4a] = BLENDV_OP(blendvps, SSE41, 0),
+    [0x4b] = BLENDV_OP(blendvpd, SSE41, 0),
+    [0x4c] = BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX),
+#define gen_helper_pcmpestrm_ymm NULL
+    [0x60] = CMP_OP(pcmpestrm, SSE42),
+#define gen_helper_pcmpestri_ymm NULL
+    [0x61] = CMP_OP(pcmpestri, SSE42),
+#define gen_helper_pcmpistrm_ymm NULL
+    [0x62] = CMP_OP(pcmpistrm, SSE42),
+#define gen_helper_pcmpistri_ymm NULL
+    [0x63] = CMP_OP(pcmpistri, SSE42),
+#define gen_helper_aeskeygenassist_ymm NULL
+    [0xdf] = UNARY_OP(aeskeygenassist, AES, 0),
 };
 
-#define SSSE3_OP(x) { MMX_OP2(x), CPUID_EXT_SSSE3 }
-#define SSE41_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE41 }
-#define SSE42_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE42 }
-#define SSE41_SPECIAL { { NULL, SSE_SPECIAL }, CPUID_EXT_SSE41 }
-#define PCLMULQDQ_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, \
-        CPUID_EXT_PCLMULQDQ }
-#define AESNI_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_AES }
-
-static const struct SSEOpHelper_epp sse_op_table6[256] = {
-    [0x00] = SSSE3_OP(pshufb),
-    [0x01] = SSSE3_OP(phaddw),
-    [0x02] = SSSE3_OP(phaddd),
-    [0x03] = SSSE3_OP(phaddsw),
-    [0x04] = SSSE3_OP(pmaddubsw),
-    [0x05] = SSSE3_OP(phsubw),
-    [0x06] = SSSE3_OP(phsubd),
-    [0x07] = SSSE3_OP(phsubsw),
-    [0x08] = SSSE3_OP(psignb),
-    [0x09] = SSSE3_OP(psignw),
-    [0x0a] = SSSE3_OP(psignd),
-    [0x0b] = SSSE3_OP(pmulhrsw),
-    [0x10] = SSE41_OP(pblendvb),
-    [0x14] = SSE41_OP(blendvps),
-    [0x15] = SSE41_OP(blendvpd),
-    [0x17] = SSE41_OP(ptest),
-    [0x1c] = SSSE3_OP(pabsb),
-    [0x1d] = SSSE3_OP(pabsw),
-    [0x1e] = SSSE3_OP(pabsd),
-    [0x20] = SSE41_OP(pmovsxbw),
-    [0x21] = SSE41_OP(pmovsxbd),
-    [0x22] = SSE41_OP(pmovsxbq),
-    [0x23] = SSE41_OP(pmovsxwd),
-    [0x24] = SSE41_OP(pmovsxwq),
-    [0x25] = SSE41_OP(pmovsxdq),
-    [0x28] = SSE41_OP(pmuldq),
-    [0x29] = SSE41_OP(pcmpeqq),
-    [0x2a] = SSE41_SPECIAL, /* movntqda */
-    [0x2b] = SSE41_OP(packusdw),
-    [0x30] = SSE41_OP(pmovzxbw),
-    [0x31] = SSE41_OP(pmovzxbd),
-    [0x32] = SSE41_OP(pmovzxbq),
-    [0x33] = SSE41_OP(pmovzxwd),
-    [0x34] = SSE41_OP(pmovzxwq),
-    [0x35] = SSE41_OP(pmovzxdq),
-    [0x37] = SSE42_OP(pcmpgtq),
-    [0x38] = SSE41_OP(pminsb),
-    [0x39] = SSE41_OP(pminsd),
-    [0x3a] = SSE41_OP(pminuw),
-    [0x3b] = SSE41_OP(pminud),
-    [0x3c] = SSE41_OP(pmaxsb),
-    [0x3d] = SSE41_OP(pmaxsd),
-    [0x3e] = SSE41_OP(pmaxuw),
-    [0x3f] = SSE41_OP(pmaxud),
-    [0x40] = SSE41_OP(pmulld),
-    [0x41] = SSE41_OP(phminposuw),
-    [0xdb] = AESNI_OP(aesimc),
-    [0xdc] = AESNI_OP(aesenc),
-    [0xdd] = AESNI_OP(aesenclast),
-    [0xde] = AESNI_OP(aesdec),
-    [0xdf] = AESNI_OP(aesdeclast),
+#define SSE_OP(name) \
+    {gen_helper_ ## name ##_xmm, gen_helper_ ## name ##_ymm}
+static const SSEFunc_0_eppp sse_op_table8[3][2] = {
+    SSE_OP(vpsrlvq),
+    SSE_OP(vpsravq),
+    SSE_OP(vpsllvq),
 };
 
-static const struct SSEOpHelper_eppi sse_op_table7[256] = {
-    [0x08] = SSE41_OP(roundps),
-    [0x09] = SSE41_OP(roundpd),
-    [0x0a] = SSE41_OP(roundss),
-    [0x0b] = SSE41_OP(roundsd),
-    [0x0c] = SSE41_OP(blendps),
-    [0x0d] = SSE41_OP(blendpd),
-    [0x0e] = SSE41_OP(pblendw),
-    [0x0f] = SSSE3_OP(palignr),
-    [0x14] = SSE41_SPECIAL, /* pextrb */
-    [0x15] = SSE41_SPECIAL, /* pextrw */
-    [0x16] = SSE41_SPECIAL, /* pextrd/pextrq */
-    [0x17] = SSE41_SPECIAL, /* extractps */
-    [0x20] = SSE41_SPECIAL, /* pinsrb */
-    [0x21] = SSE41_SPECIAL, /* insertps */
-    [0x22] = SSE41_SPECIAL, /* pinsrd/pinsrq */
-    [0x40] = SSE41_OP(dpps),
-    [0x41] = SSE41_OP(dppd),
-    [0x42] = SSE41_OP(mpsadbw),
-    [0x44] = PCLMULQDQ_OP(pclmulqdq),
-    [0x60] = SSE42_OP(pcmpestrm),
-    [0x61] = SSE42_OP(pcmpestri),
-    [0x62] = SSE42_OP(pcmpistrm),
-    [0x63] = SSE42_OP(pcmpistri),
-    [0xdf] = AESNI_OP(aeskeygenassist),
+static const SSEFunc_0_eppt sse_op_table9[2][2] = {
+    SSE_OP(vpmaskmovd_st),
+    SSE_OP(vpmaskmovq_st),
 };
 
+static const SSEFunc_0_epppt sse_op_table10[16][2] = {
+    SSE_OP(vpgatherdd0),
+    SSE_OP(vpgatherdq0),
+    SSE_OP(vpgatherqd0),
+    SSE_OP(vpgatherqq0),
+    SSE_OP(vpgatherdd1),
+    SSE_OP(vpgatherdq1),
+    SSE_OP(vpgatherqd1),
+    SSE_OP(vpgatherqq1),
+    SSE_OP(vpgatherdd2),
+    SSE_OP(vpgatherdq2),
+    SSE_OP(vpgatherqd2),
+    SSE_OP(vpgatherqq2),
+    SSE_OP(vpgatherdd3),
+    SSE_OP(vpgatherdq3),
+    SSE_OP(vpgatherqd3),
+    SSE_OP(vpgatherqq3),
+};
+#undef SSE_OP
+
+#undef OP
+#undef BINARY_OP_MMX
+#undef BINARY_OP
+#undef UNARY_OP_MMX
+#undef UNARY_OP
+#undef BLENDV_OP
+#undef SPECIAL_OP
+
+/* VEX prefix not allowed */
+#define CHECK_NO_VEX(s) do { \
+    if (s->prefix & PREFIX_VEX) \
+        goto illegal_op; \
+    } while (0)
+
+/*
+ * VEX encodings require AVX
+ * Allow legacy SSE encodings even if AVX not enabled
+ */
+#define CHECK_AVX(s) do { \
+    if ((s->prefix & PREFIX_VEX) \
+        && !(s->flags & HF_AVX_EN_MASK)) \
+        goto illegal_op; \
+    } while (0)
+
+/* If a VEX prefix is used then it must have V=1111b */
+#define CHECK_AVX_V0(s) do { \
+    CHECK_AVX(s); \
+    if ((s->prefix & PREFIX_VEX) && (s->vex_v != 0)) \
+        goto illegal_op; \
+    } while (0)
+
+/* If a VEX prefix is used then it must have L=0 */
+#define CHECK_AVX_128(s) do { \
+    CHECK_AVX(s); \
+    if ((s->prefix & PREFIX_VEX) && (s->vex_l != 0)) \
+        goto illegal_op; \
+    } while (0)
+
+/* If a VEX prefix is used then it must have V=1111b and L=0 */
+#define CHECK_AVX_V0_128(s) do { \
+    CHECK_AVX(s); \
+    if ((s->prefix & PREFIX_VEX) && (s->vex_v != 0 || s->vex_l != 0)) \
+        goto illegal_op; \
+    } while (0)
+
+/* 256-bit (ymm) variants require AVX2 */
+#define CHECK_AVX2_256(s) do { \
+    if (s->vex_l && !(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2)) \
+        goto illegal_op; \
+    } while (0)
+
+/* Requires AVX2 and VEX encoding */
+#define CHECK_AVX2(s) do { \
+    if ((s->prefix & PREFIX_VEX) == 0 \
+            || !(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2)) \
+        goto illegal_op; \
+    } while (0)
+
 static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     target_ulong pc_start)
 {
-    int b1, op1_offset, op2_offset, is_xmm, val;
-    int modrm, mod, rm, reg;
-    SSEFunc_0_epp sse_fn_epp;
-    SSEFunc_0_eppi sse_fn_eppi;
-    SSEFunc_0_ppi sse_fn_ppi;
-    SSEFunc_0_eppt sse_fn_eppt;
+    int b1, op1_offset, op2_offset, v_offset, is_xmm, val, scalar_op;
+    int modrm, mod, rm, reg, reg_v;
+    struct SSEOpHelper_table1 sse_op;
+    struct SSEOpHelper_table6 op6;
+    struct SSEOpHelper_table7 op7;
     MemOp ot;
 
     b &= 0xff;
@@ -3125,10 +3509,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         b1 = 3;
     else
         b1 = 0;
-    sse_fn_epp = sse_op_table1[b][b1];
-    if (!sse_fn_epp) {
-        goto unknown_op;
-    }
+    sse_op = sse_op_table1[b];
     if ((b <= 0x5f && b >= 0x10) || b == 0xc6 || b == 0xc2) {
         is_xmm = 1;
     } else {
@@ -3139,20 +3520,28 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             is_xmm = 1;
         }
     }
+    if (sse_op.flags & SSE_OPF_3DNOW) {
+        if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) {
+            goto illegal_op;
+        }
+    }
     /* simple MMX/SSE operation */
     if (s->flags & HF_TS_MASK) {
         gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
         return;
     }
-    if (s->flags & HF_EM_MASK) {
-    illegal_op:
-        gen_illegal_opcode(s);
-        return;
-    }
-    if (is_xmm
-        && !(s->flags & HF_OSFXSR_MASK)
-        && (b != 0x38 && b != 0x3a)) {
-        goto unknown_op;
+    /* VEX-encoded instructions ignore the EM bit. See also CHECK_AVX */
+    if (!(s->prefix & PREFIX_VEX)) {
+        if (s->flags & HF_EM_MASK) {
+        illegal_op:
+            gen_illegal_opcode(s);
+            return;
+        }
+        if (is_xmm
+            && !(s->flags & HF_OSFXSR_MASK)
+            && (b != 0x38 && b != 0x3a)) {
+            goto unknown_op;
+        }
     }
     if (b == 0x0e) {
         if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) {
@@ -3164,9 +3553,29 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         return;
     }
     if (b == 0x77) {
-        /* emms */
-        gen_helper_emms(cpu_env);
-        return;
+        if (s->prefix & PREFIX_VEX) {
+            CHECK_AVX(s);
+            if (s->vex_l) {
+                gen_helper_vzeroall(cpu_env);
+#ifdef TARGET_X86_64
+                if (CODE64(s)) {
+                    gen_helper_vzeroall_hi8(cpu_env);
+                }
+#endif
+            } else {
+                gen_helper_vzeroupper(cpu_env);
+#ifdef TARGET_X86_64
+                if (CODE64(s)) {
+                    gen_helper_vzeroupper_hi8(cpu_env);
+                }
+#endif
+            }
+            return;
+        } else {
+            /* emms */
+            gen_helper_emms(cpu_env);
+            return;
+        }
     }
     /* prepare MMX state (XXX: optimize by storing fptt and fptags in
        the static cpu state) */
@@ -3179,11 +3588,17 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
     if (is_xmm) {
         reg |= REX_R(s);
     }
+    if (s->prefix & PREFIX_VEX) {
+        reg_v = s->vex_v;
+    } else {
+        reg_v = reg;
+    }
     mod = (modrm >> 6) & 3;
-    if (sse_fn_epp == SSE_SPECIAL) {
+    if (sse_op.flags & SSE_OPF_SPECIAL) {
         b |= (b1 << 8);
         switch(b) {
         case 0x0e7: /* movntq */
+            CHECK_NO_VEX(s);
             if (mod == 3) {
                 goto illegal_op;
             }
@@ -3193,19 +3608,31 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x1e7: /* movntdq */
         case 0x02b: /* movntps */
         case 0x12b: /* movntps */
+            CHECK_AVX_V0(s);
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
-            gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+            if (s->vex_l) {
+                gen_sty_env_A0(s, XMM_OFFSET(reg));
+            } else {
+                gen_sto_env_A0(s, XMM_OFFSET(reg));
+            }
             break;
         case 0x3f0: /* lddqu */
+            CHECK_AVX_V0(s);
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
-            gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+            if (s->vex_l) {
+                gen_ldy_env_A0(s, XMM_OFFSET(reg));
+            } else {
+                gen_ldo_env_A0(s, XMM_OFFSET(reg));
+                gen_clear_ymmh(s, reg);
+            }
             break;
         case 0x22b: /* movntss */
         case 0x32b: /* movntsd */
+            CHECK_AVX_V0_128(s);
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
@@ -3219,6 +3646,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             break;
         case 0x6e: /* movd mm, ea */
+            CHECK_NO_VEX(s);
 #ifdef TARGET_X86_64
             if (s->dflag == MO_64) {
                 gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0);
@@ -3235,23 +3663,24 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             break;
         case 0x16e: /* movd xmm, ea */
+            CHECK_AVX_V0_128(s);
 #ifdef TARGET_X86_64
             if (s->dflag == MO_64) {
                 gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0);
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                                 offsetof(CPUX86State,xmm_regs[reg]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, XMM_OFFSET(reg));
                 gen_helper_movq_mm_T0_xmm(s->ptr0, s->T0);
             } else
 #endif
             {
                 gen_ldst_modrm(env, s, modrm, MO_32, OR_TMP0, 0);
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                                 offsetof(CPUX86State,xmm_regs[reg]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, XMM_OFFSET(reg));
                 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
                 gen_helper_movl_mm_T0_xmm(s->ptr0, s->tmp2_i32);
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0x6f: /* movq mm, ea */
+            CHECK_NO_VEX(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, offsetof(CPUX86State, fpregs[reg].mmx));
@@ -3269,17 +3698,28 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x128: /* movapd */
         case 0x16f: /* movdqa xmm, ea */
         case 0x26f: /* movdqu xmm, ea */
+            CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                if (s->vex_l) {
+                    gen_ldy_env_A0(s, XMM_OFFSET(reg));
+                } else {
+                    gen_ldo_env_A0(s, XMM_OFFSET(reg));
+                }
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                gen_op_movo(s, offsetof(CPUX86State, xmm_regs[reg]),
-                            offsetof(CPUX86State,xmm_regs[rm]));
+                gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(rm));
+                if (s->vex_l) {
+                    gen_op_movo_ymmh(s, XMM_OFFSET(reg), XMM_OFFSET(rm));
+                }
+            }
+            if (!s->vex_l) {
+                gen_clear_ymmh(s, reg);
             }
             break;
         case 0x210: /* movss xmm, ea */
             if (mod != 3) {
+                CHECK_AVX_V0_128(s);
                 gen_lea_modrm(env, s, modrm);
                 gen_op_ld_v(s, MO_32, s->T0, s->A0);
                 tcg_gen_st32_tl(s->T0, cpu_env,
@@ -3292,13 +3732,21 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 tcg_gen_st32_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)));
             } else {
+                CHECK_AVX_128(s);
                 rm = (modrm & 7) | REX_B(s);
-                gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_L(0)));
+                tcg_gen_ld_i32(s->tmp2_i32, cpu_env,
+                               offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)));
+                if (reg != reg_v) {
+                    gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(reg_v));
+                }
+                tcg_gen_st_i32(s->tmp2_i32, cpu_env,
+                               offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)));
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0x310: /* movsd xmm, ea */
             if (mod != 3) {
+                CHECK_AVX_V0_128(s);
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, offsetof(CPUX86State,
                                            xmm_regs[reg].ZMM_Q(0)));
@@ -3308,13 +3756,21 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 tcg_gen_st32_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)));
             } else {
+                CHECK_AVX_128(s);
                 rm = (modrm & 7) | REX_B(s);
+                if (reg != reg_v) {
+                    gen_op_movq(s,
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)));
+                }
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0)));
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)));
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0x012: /* movlps */
         case 0x112: /* movlpd */
+            CHECK_AVX_128(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, offsetof(CPUX86State,
@@ -3323,40 +3779,84 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 /* movhlps */
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(1)));
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(1)));
+            }
+            if (reg != reg_v) {
+                gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)));
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0x212: /* movsldup */
+            CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                if (s->vex_l) {
+                    gen_ldy_env_A0(s, XMM_OFFSET(reg));
+                } else {
+                    gen_ldo_env_A0(s, XMM_OFFSET(reg));
+                }
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_L(0)));
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)));
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(2)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_L(2)));
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_L(2)));
+                if (s->vex_l) {
+                    gen_op_movl(s,
+                                offsetof(CPUX86State, xmm_regs[reg].ZMM_L(4)),
+                                offsetof(CPUX86State, xmm_regs[rm].ZMM_L(4)));
+                    gen_op_movl(s,
+                                offsetof(CPUX86State, xmm_regs[reg].ZMM_L(6)),
+                                offsetof(CPUX86State, xmm_regs[rm].ZMM_L(6)));
+                }
             }
             gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)),
-                        offsetof(CPUX86State,xmm_regs[reg].ZMM_L(0)));
+                        offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)));
             gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)),
-                        offsetof(CPUX86State,xmm_regs[reg].ZMM_L(2)));
+                        offsetof(CPUX86State, xmm_regs[reg].ZMM_L(2)));
+            if (s->vex_l) {
+                gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(5)),
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_L(4)));
+                gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(7)),
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_L(6)));
+            } else {
+                gen_clear_ymmh(s, reg);
+            }
             break;
         case 0x312: /* movddup */
+            CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, offsetof(CPUX86State,
                                            xmm_regs[reg].ZMM_Q(0)));
+                if (s->vex_l) {
+                    tcg_gen_addi_tl(s->A0, s->A0, 16);
+                    gen_ldq_env_A0(s, offsetof(CPUX86State,
+                                               xmm_regs[reg].ZMM_Q(2)));
+                }
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0)));
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)));
+                if (s->vex_l) {
+                    gen_op_movq(s,
+                                offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(2)),
+                                offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(2)));
+                }
             }
             gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
-                        offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0)));
+                        offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)));
+            if (s->vex_l) {
+                gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(3)),
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(2)));
+            } else {
+                gen_clear_ymmh(s, reg);
+            }
             break;
         case 0x016: /* movhps */
         case 0x116: /* movhpd */
+            CHECK_AVX_128(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, offsetof(CPUX86State,
@@ -3365,27 +3865,54 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 /* movlhps */
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0)));
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)));
+            }
+            if (reg != reg_v) {
+                gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(0)));
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0x216: /* movshdup */
+            CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                if (s->vex_l) {
+                    gen_ldy_env_A0(s, XMM_OFFSET(reg));
+                } else {
+                    gen_ldo_env_A0(s, XMM_OFFSET(reg));
+                }
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_L(1)));
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_L(1)));
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_L(3)));
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_L(3)));
+                if (s->vex_l) {
+                    gen_op_movl(s,
+                                offsetof(CPUX86State, xmm_regs[reg].ZMM_L(5)),
+                                offsetof(CPUX86State, xmm_regs[rm].ZMM_L(5)));
+                    gen_op_movl(s,
+                                offsetof(CPUX86State, xmm_regs[reg].ZMM_L(7)),
+                                offsetof(CPUX86State, xmm_regs[rm].ZMM_L(7)));
+                }
             }
             gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)),
-                        offsetof(CPUX86State,xmm_regs[reg].ZMM_L(1)));
+                        offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)));
             gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(2)),
-                        offsetof(CPUX86State,xmm_regs[reg].ZMM_L(3)));
+                        offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)));
+            if (s->vex_l) {
+                gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(4)),
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_L(5)));
+                gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(6)),
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_L(7)));
+            } else {
+                gen_clear_ymmh(s, reg);
+            }
             break;
         case 0x178:
         case 0x378:
+            CHECK_NO_VEX(s);
             {
                 int bit_index, field_length;
 
@@ -3393,8 +3920,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     goto illegal_op;
                 field_length = x86_ldub_code(env, s) & 0x3F;
                 bit_index = x86_ldub_code(env, s) & 0x3F;
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                    offsetof(CPUX86State,xmm_regs[reg]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, XMM_OFFSET(reg));
                 if (b1 == 1)
                     gen_helper_extrq_i(cpu_env, s->ptr0,
                                        tcg_const_i32(bit_index),
@@ -3406,6 +3932,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             break;
         case 0x7e: /* movd ea, mm */
+            CHECK_NO_VEX(s);
 #ifdef TARGET_X86_64
             if (s->dflag == MO_64) {
                 tcg_gen_ld_i64(s->T0, cpu_env,
@@ -3420,20 +3947,22 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             break;
         case 0x17e: /* movd ea, xmm */
+            CHECK_AVX_V0_128(s);
 #ifdef TARGET_X86_64
             if (s->dflag == MO_64) {
                 tcg_gen_ld_i64(s->T0, cpu_env,
-                               offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0)));
+                               offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)));
                 gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 1);
             } else
 #endif
             {
                 tcg_gen_ld32u_tl(s->T0, cpu_env,
-                                 offsetof(CPUX86State,xmm_regs[reg].ZMM_L(0)));
+                                 offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)));
                 gen_ldst_modrm(env, s, modrm, MO_32, OR_TMP0, 1);
             }
             break;
         case 0x27e: /* movq xmm, ea */
+            CHECK_AVX_V0_128(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, offsetof(CPUX86State,
@@ -3441,11 +3970,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0)));
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)));
             }
             gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)));
+            gen_clear_ymmh(s, reg);
             break;
         case 0x7f: /* movq ea, mm */
+            CHECK_NO_VEX(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_stq_env_A0(s, offsetof(CPUX86State, fpregs[reg].mmx));
@@ -3461,40 +3992,64 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x129: /* movapd */
         case 0x17f: /* movdqa ea, xmm */
         case 0x27f: /* movdqu ea, xmm */
+            CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                if (s->vex_l) {
+                    gen_sty_env_A0(s, XMM_OFFSET(reg));
+                } else {
+                    gen_sto_env_A0(s, XMM_OFFSET(reg));
+                }
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                gen_op_movo(s, offsetof(CPUX86State, xmm_regs[rm]),
-                            offsetof(CPUX86State,xmm_regs[reg]));
+                gen_op_movo(s, XMM_OFFSET(rm), XMM_OFFSET(reg));
+                if (s->vex_l) {
+                    gen_op_movo_ymmh(s, XMM_OFFSET(rm), XMM_OFFSET(reg));
+                } else {
+                    gen_clear_ymmh(s, rm);
+                }
             }
             break;
         case 0x211: /* movss ea, xmm */
             if (mod != 3) {
+                CHECK_AVX_V0_128(s);
                 gen_lea_modrm(env, s, modrm);
                 tcg_gen_ld32u_tl(s->T0, cpu_env,
                                  offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)));
                 gen_op_st_v(s, MO_32, s->T0, s->A0);
             } else {
+                CHECK_AVX_128(s);
                 rm = (modrm & 7) | REX_B(s);
+                if (rm != reg_v) {
+                    gen_op_movo(s, XMM_OFFSET(rm), XMM_OFFSET(reg_v));
+                }
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)),
-                            offsetof(CPUX86State,xmm_regs[reg].ZMM_L(0)));
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)));
+                gen_clear_ymmh(s, rm);
             }
             break;
         case 0x311: /* movsd ea, xmm */
             if (mod != 3) {
+                CHECK_AVX_V0_128(s);
                 gen_lea_modrm(env, s, modrm);
                 gen_stq_env_A0(s, offsetof(CPUX86State,
                                            xmm_regs[reg].ZMM_Q(0)));
             } else {
+                CHECK_AVX_128(s);
                 rm = (modrm & 7) | REX_B(s);
+                if (rm != reg_v) {
+                    gen_op_movq(s,
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)));
+                }
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)),
-                            offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0)));
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)));
+                gen_clear_ymmh(s, rm);
             }
             break;
         case 0x013: /* movlps */
         case 0x113: /* movlpd */
+            CHECK_AVX_V0_128(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_stq_env_A0(s, offsetof(CPUX86State,
@@ -3505,6 +4060,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x017: /* movhps */
         case 0x117: /* movhpd */
+            CHECK_AVX_V0_128(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_stq_env_A0(s, offsetof(CPUX86State,
@@ -3521,65 +4077,91 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x173:
             val = x86_ldub_code(env, s);
             if (is_xmm) {
+                CHECK_AVX(s);
+                CHECK_AVX2_256(s);
                 tcg_gen_movi_tl(s->T0, val);
                 tcg_gen_st32_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State, xmm_t0.ZMM_L(0)));
                 tcg_gen_movi_tl(s->T0, 0);
                 tcg_gen_st32_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State, xmm_t0.ZMM_L(1)));
-                op1_offset = offsetof(CPUX86State,xmm_t0);
+                op1_offset = offsetof(CPUX86State, xmm_t0);
             } else {
+                CHECK_NO_VEX(s);
                 tcg_gen_movi_tl(s->T0, val);
                 tcg_gen_st32_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State, mmx_t0.MMX_L(0)));
                 tcg_gen_movi_tl(s->T0, 0);
                 tcg_gen_st32_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State, mmx_t0.MMX_L(1)));
-                op1_offset = offsetof(CPUX86State,mmx_t0);
+                op1_offset = offsetof(CPUX86State, mmx_t0);
             }
             assert(b1 < 2);
-            sse_fn_epp = sse_op_table2[((b - 1) & 3) * 8 +
+            if (s->vex_l) {
+                b1 = 2;
+            }
+            SSEFunc_0_eppp fn = sse_op_table2[((b - 1) & 3) * 8 +
                                        (((modrm >> 3)) & 7)][b1];
-            if (!sse_fn_epp) {
+            if (!fn) {
                 goto unknown_op;
             }
             if (is_xmm) {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = XMM_OFFSET(rm);
+                if (s->prefix & PREFIX_VEX) {
+                    v_offset = XMM_OFFSET(reg_v);
+                } else {
+                    v_offset = op2_offset;
+                }
             } else {
                 rm = (modrm & 7);
                 op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
+                v_offset = op2_offset;
+            }
+            tcg_gen_addi_ptr(s->ptr0, cpu_env, v_offset);
+            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+            tcg_gen_addi_ptr(s->ptr2, cpu_env, op1_offset);
+            fn(cpu_env, s->ptr0, s->ptr1, s->ptr2);
+            if (!s->vex_l) {
+                gen_clear_ymmh(s, reg_v);
             }
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op2_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op1_offset);
-            sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
             break;
         case 0x050: /* movmskps */
+            CHECK_AVX_V0(s);
             rm = (modrm & 7) | REX_B(s);
             tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                             offsetof(CPUX86State,xmm_regs[rm]));
-            gen_helper_movmskps(s->tmp2_i32, cpu_env, s->ptr0);
+                             offsetof(CPUX86State, xmm_regs[rm]));
+            if (s->vex_l) {
+                gen_helper_movmskps_ymm(s->tmp2_i32, cpu_env, s->ptr0);
+            } else {
+                gen_helper_movmskps_xmm(s->tmp2_i32, cpu_env, s->ptr0);
+            }
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
         case 0x150: /* movmskpd */
+            CHECK_AVX_V0(s);
             rm = (modrm & 7) | REX_B(s);
-            tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                             offsetof(CPUX86State,xmm_regs[rm]));
-            gen_helper_movmskpd(s->tmp2_i32, cpu_env, s->ptr0);
+            tcg_gen_addi_ptr(s->ptr0, cpu_env, XMM_OFFSET(rm));
+            if (s->vex_l) {
+                gen_helper_movmskpd_ymm(s->tmp2_i32, cpu_env, s->ptr0);
+            } else {
+                gen_helper_movmskpd_xmm(s->tmp2_i32, cpu_env, s->ptr0);
+            }
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
         case 0x02a: /* cvtpi2ps */
         case 0x12a: /* cvtpi2pd */
-            gen_helper_enter_mmx(cpu_env);
+            CHECK_NO_VEX(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 op2_offset = offsetof(CPUX86State,mmx_t0);
                 gen_ldq_env_A0(s, op2_offset);
             } else {
+                gen_helper_enter_mmx(cpu_env);
                 rm = (modrm & 7);
                 op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
             }
-            op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+            op1_offset = XMM_OFFSET(reg);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
             switch(b >> 8) {
@@ -3594,9 +4176,14 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x22a: /* cvtsi2ss */
         case 0x32a: /* cvtsi2sd */
+            CHECK_AVX(s);
             ot = mo_64_32(s->dflag);
             gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
-            op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+            op1_offset = XMM_OFFSET(reg);
+            v_offset = XMM_OFFSET(reg_v);
+            if (op1_offset != v_offset) {
+                gen_op_movo(s, op1_offset, v_offset);
+            }
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             if (ot == MO_32) {
                 SSEFunc_0_epi sse_fn_epi = sse_op_table3ai[(b >> 8) & 1];
@@ -3610,19 +4197,21 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 goto illegal_op;
 #endif
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0x02c: /* cvttps2pi */
         case 0x12c: /* cvttpd2pi */
         case 0x02d: /* cvtps2pi */
         case 0x12d: /* cvtpd2pi */
+            CHECK_NO_VEX(s);
             gen_helper_enter_mmx(cpu_env);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                op2_offset = offsetof(CPUX86State,xmm_t0);
+                op2_offset = offsetof(CPUX86State, xmm_t0);
                 gen_ldo_env_A0(s, op2_offset);
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = XMM_OFFSET(rm);
             }
             op1_offset = offsetof(CPUX86State,fpregs[reg & 7].mmx);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
@@ -3646,6 +4235,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x32c: /* cvttsd2si */
         case 0x22d: /* cvtss2si */
         case 0x32d: /* cvtsd2si */
+            CHECK_AVX_V0(s);
             ot = mo_64_32(s->dflag);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
@@ -3656,10 +4246,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     tcg_gen_st32_tl(s->T0, cpu_env,
                                     offsetof(CPUX86State, xmm_t0.ZMM_L(0)));
                 }
-                op2_offset = offsetof(CPUX86State,xmm_t0);
+                op2_offset = offsetof(CPUX86State, xmm_t0);
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = XMM_OFFSET(rm);
             }
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op2_offset);
             if (ot == MO_32) {
@@ -3680,21 +4270,28 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0xc4: /* pinsrw */
         case 0x1c4:
+            CHECK_AVX_128(s);
             s->rip_offset = 1;
             gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
             val = x86_ldub_code(env, s);
+            if (reg != reg_v) {
+                gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(reg_v));
+            }
             if (b1) {
                 val &= 7;
                 tcg_gen_st16_tl(s->T0, cpu_env,
-                                offsetof(CPUX86State,xmm_regs[reg].ZMM_W(val)));
+                        offsetof(CPUX86State, xmm_regs[reg].ZMM_W(val)));
             } else {
+                CHECK_NO_VEX(s);
                 val &= 3;
                 tcg_gen_st16_tl(s->T0, cpu_env,
-                                offsetof(CPUX86State,fpregs[reg].mmx.MMX_W(val)));
+                        offsetof(CPUX86State, fpregs[reg].mmx.MMX_W(val)));
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0xc5: /* pextrw */
         case 0x1c5:
+            CHECK_AVX_V0_128(s);
             if (mod != 3)
                 goto illegal_op;
             ot = mo_64_32(s->dflag);
@@ -3703,17 +4300,18 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 val &= 7;
                 rm = (modrm & 7) | REX_B(s);
                 tcg_gen_ld16u_tl(s->T0, cpu_env,
-                                 offsetof(CPUX86State,xmm_regs[rm].ZMM_W(val)));
+                        offsetof(CPUX86State, xmm_regs[rm].ZMM_W(val)));
             } else {
                 val &= 3;
                 rm = (modrm & 7);
                 tcg_gen_ld16u_tl(s->T0, cpu_env,
-                                offsetof(CPUX86State,fpregs[rm].mmx.MMX_W(val)));
+                        offsetof(CPUX86State, fpregs[rm].mmx.MMX_W(val)));
             }
             reg = ((modrm >> 3) & 7) | REX_R(s);
             gen_op_mov_reg_v(s, ot, reg, s->T0);
             break;
         case 0x1d6: /* movq ea, xmm */
+            CHECK_AVX_V0_128(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_stq_env_A0(s, offsetof(CPUX86State,
@@ -3721,12 +4319,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)),
-                            offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0)));
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)));
                 gen_op_movq_env_0(s,
                                   offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(1)));
             }
             break;
         case 0x2d6: /* movq2dq */
+            CHECK_NO_VEX(s);
             gen_helper_enter_mmx(cpu_env);
             rm = (modrm & 7);
             gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
@@ -3734,21 +4333,27 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)));
             break;
         case 0x3d6: /* movdq2q */
+            CHECK_NO_VEX(s);
             gen_helper_enter_mmx(cpu_env);
             rm = (modrm & 7) | REX_B(s);
             gen_op_movq(s, offsetof(CPUX86State, fpregs[reg & 7].mmx),
-                        offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0)));
+                        offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)));
             break;
         case 0xd7: /* pmovmskb */
         case 0x1d7:
             if (mod != 3)
                 goto illegal_op;
             if (b1) {
+                CHECK_AVX_V0(s);
                 rm = (modrm & 7) | REX_B(s);
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                                 offsetof(CPUX86State, xmm_regs[rm]));
-                gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0);
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, XMM_OFFSET(rm));
+                if (s->vex_l) {
+                    gen_helper_pmovmskb_ymm(s->tmp2_i32, cpu_env, s->ptr0);
+                } else {
+                    gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0);
+                }
             } else {
+                CHECK_NO_VEX(s);
                 rm = (modrm & 7);
                 tcg_gen_addi_ptr(s->ptr0, cpu_env,
                                  offsetof(CPUX86State, fpregs[rm].mmx));
@@ -3768,50 +4373,241 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             rm = modrm & 7;
             reg = ((modrm >> 3) & 7) | REX_R(s);
             mod = (modrm >> 6) & 3;
+            if (s->prefix & PREFIX_VEX) {
+                reg_v = s->vex_v;
+            } else {
+                reg_v = reg;
+            }
 
             assert(b1 < 2);
-            sse_fn_epp = sse_op_table6[b].op[b1];
-            if (!sse_fn_epp) {
+            op6 = sse_op_table6[b];
+            if (op6.ext_mask == 0) {
                 goto unknown_op;
             }
-            if (!(s->cpuid_ext_features & sse_op_table6[b].ext_mask))
+            if (!(s->cpuid_ext_features & op6.ext_mask)) {
+                goto illegal_op;
+            }
+
+            if (op6.ext_mask == CPUID_EXT_AVX
+                    && (s->prefix & PREFIX_VEX) == 0) {
                 goto illegal_op;
+            }
+            if (op6.flags & SSE_OPF_AVX2) {
+                CHECK_AVX2(s);
+            }
 
             if (b1) {
-                op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+                if (op6.flags & SSE_OPF_V0) {
+                    CHECK_AVX_V0(s);
+                } else {
+                    CHECK_AVX(s);
+                }
+
+                op1_offset = XMM_OFFSET(reg);
+
+                if ((b & 0xfc) == 0x90) { /* vgather */
+                    int scale, index, base;
+                    target_long disp = 0;
+                    CHECK_AVX2(s);
+                    if (mod == 3 || rm != 4) {
+                        goto illegal_op;
+                    }
+
+                    /* Vector SIB */
+                    val = x86_ldub_code(env, s);
+                    scale = (val >> 6) & 3;
+                    index = ((val >> 3) & 7) | REX_X(s);
+                    base = (val & 7) | REX_B(s);
+                    switch (mod) {
+                    case 0:
+                        if (base == 5) {
+                            base = -1;
+                            disp = (int32_t)x86_ldl_code(env, s);
+                        }
+                        break;
+                    case 1:
+                        disp = (int8_t)x86_ldub_code(env, s);
+                        break;
+                    default:
+                    case 2:
+                        disp = (int32_t)x86_ldl_code(env, s);
+                        break;
+                    }
+
+                    /* destination, index and mask registers must not overlap */
+                    if (reg == index || reg == reg_v || index == reg_v) {
+                        goto illegal_op;
+                    }
+
+                    tcg_gen_addi_tl(s->A0, cpu_regs[base], disp);
+                    gen_add_A0_ds_seg(s);
+                    op2_offset = XMM_OFFSET(index);
+                    v_offset = XMM_OFFSET(reg_v);
+                    tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                    tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+                    tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset);
+                    b1 = REX_W(s) | ((b & 1) << 1) | (scale << 2);
+                    sse_op_table10[b1][s->vex_l](cpu_env,
+                            s->ptr0, s->ptr2, s->ptr1, s->A0);
+                    if (!s->vex_l) {
+                        gen_clear_ymmh(s, reg);
+                        gen_clear_ymmh(s, reg_v);
+                    }
+                    return;
+                }
+
+                if (op6.flags & SSE_OPF_MMX) {
+                    CHECK_AVX2_256(s);
+                }
+                if (op6.flags & SSE_OPF_BLENDV) {
+                    /*
+                     * VEX encodings of the blendv opcodes are not valid;
+                     * VEX uses a different opcode with an 0f 3a prefix.
+                     */
+                    CHECK_NO_VEX(s);
+                }
+
                 if (mod == 3) {
-                    op2_offset = offsetof(CPUX86State,xmm_regs[rm | REX_B(s)]);
+                    op2_offset = XMM_OFFSET(rm | REX_B(s));
                 } else {
-                    op2_offset = offsetof(CPUX86State,xmm_t0);
+                    int size;
+                    op2_offset = offsetof(CPUX86State, xmm_t0);
                     gen_lea_modrm(env, s, modrm);
                     switch (b) {
+                    case 0x78: /* vpbroadcastb */
+                        size = 8;
+                        break;
+                    case 0x79: /* vpbroadcastw */
+                        size = 16;
+                        break;
+                    case 0x18: /* vbroadcastss */
+                    case 0x58: /* vpbroadcastd */
+                        size = 32;
+                        break;
+                    case 0x19: /* vbroadcastsd */
+                    case 0x59: /* vpbroadcastq */
+                        size = 64;
+                        break;
+                    case 0x1a: /* vbroadcastf128 */
+                    case 0x5a: /* vbroadcasti128 */
+                        size = 128;
+                        break;
                     case 0x20: case 0x30: /* pmovsxbw, pmovzxbw */
                     case 0x23: case 0x33: /* pmovsxwd, pmovzxwd */
                     case 0x25: case 0x35: /* pmovsxdq, pmovzxdq */
-                        gen_ldq_env_A0(s, op2_offset +
-                                        offsetof(ZMMReg, ZMM_Q(0)));
+                        size = 64;
                         break;
                     case 0x21: case 0x31: /* pmovsxbd, pmovzxbd */
                     case 0x24: case 0x34: /* pmovsxwq, pmovzxwq */
-                        tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
-                                            s->mem_index, MO_LEUL);
-                        tcg_gen_st_i32(s->tmp2_i32, cpu_env, op2_offset +
-                                        offsetof(ZMMReg, ZMM_L(0)));
+                        size = 32;
                         break;
                     case 0x22: case 0x32: /* pmovsxbq, pmovzxbq */
+                        size = 16;
+                        break;
+                    case 0x2a:            /* movntdqa */
+                        if (s->vex_l) {
+                            gen_ldy_env_A0(s, op1_offset);
+                        } else {
+                            gen_ldo_env_A0(s, op1_offset);
+                            gen_clear_ymmh(s, reg);
+                        }
+                        return;
+                    case 0x2e: /* vmaskmovps (store) */
+                        b1 = 0;
+                        goto vpmaskmov;
+                    case 0x2f: /* vmaskmovpd (store) */
+                        b1 = 1;
+                        goto vpmaskmov;
+                    case 0x8e: /* vpmaskmovd, vpmaskmovq */
+                        CHECK_AVX2(s);
+                        b1 = REX_W(s);
+                    vpmaskmov:
+                        tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                        v_offset = XMM_OFFSET(reg_v);
+                        tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset);
+                        sse_op_table9[b1][s->vex_l](cpu_env,
+                                s->ptr0, s->ptr2, s->A0);
+                        return;
+                    default:
+                        size = 128;
+                    }
+                    if ((op6.flags & SSE_OPF_SCALAR) == 0 && s->vex_l) {
+                        size *= 2;
+                    }
+                    switch (size) {
+                    case 8:
+                        tcg_gen_qemu_ld_tl(s->tmp0, s->A0,
+                                           s->mem_index, MO_UB);
+                        tcg_gen_st8_tl(s->tmp0, cpu_env, op2_offset +
+                                        offsetof(ZMMReg, ZMM_B(0)));
+                        break;
+                    case 16:
                         tcg_gen_qemu_ld_tl(s->tmp0, s->A0,
                                            s->mem_index, MO_LEUW);
                         tcg_gen_st16_tl(s->tmp0, cpu_env, op2_offset +
                                         offsetof(ZMMReg, ZMM_W(0)));
                         break;
-                    case 0x2a:            /* movntqda */
-                        gen_ldo_env_A0(s, op1_offset);
-                        return;
-                    default:
+                    case 32:
+                        tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+                                            s->mem_index, MO_LEUL);
+                        tcg_gen_st_i32(s->tmp2_i32, cpu_env, op2_offset +
+                                        offsetof(ZMMReg, ZMM_L(0)));
+                        break;
+                    case 64:
+                        gen_ldq_env_A0(s, op2_offset +
+                                        offsetof(ZMMReg, ZMM_Q(0)));
+                        break;
+                    case 128:
                         gen_ldo_env_A0(s, op2_offset);
+                        break;
+                    case 256:
+                        gen_ldy_env_A0(s, op2_offset);
+                        break;
+                    }
+                }
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+                if (s->vex_l) {
+                    b1 = 2;
+                }
+                if (!op6.fn[b1].op1) {
+                    goto illegal_op;
+                }
+                if (op6.flags & SSE_OPF_V0) {
+                    op6.fn[b1].op1(cpu_env, s->ptr0, s->ptr1);
+                } else {
+                    v_offset = XMM_OFFSET(reg_v);
+                    tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset);
+                    if (op6.flags & SSE_OPF_BLENDV) {
+                        TCGv_ptr mask = tcg_temp_new_ptr();
+                        tcg_gen_addi_ptr(mask, cpu_env, XMM_OFFSET(0));
+                        op6.fn[b1].op3(cpu_env, s->ptr0, s->ptr2, s->ptr1,
+                                       mask);
+                        tcg_temp_free_ptr(mask);
+                    } else {
+                        SSEFunc_0_eppp fn = op6.fn[b1].op2;
+                        if (REX_W(s)) {
+                            if (b >= 0x45 && b <= 0x47) {
+                                fn = sse_op_table8[b - 0x45][b1 - 1];
+                            } else if (b == 0x8c) {
+                                if (s->vex_l) {
+                                    fn = gen_helper_vpmaskmovq_ymm;
+                                } else {
+                                    fn = gen_helper_vpmaskmovq_xmm;
+                                }
+                            }
+                        }
+                        fn(cpu_env, s->ptr0, s->ptr2, s->ptr1);
                     }
                 }
+                if ((op6.flags & SSE_OPF_CMP) == 0 && s->vex_l == 0) {
+                    gen_clear_ymmh(s, reg);
+                }
             } else {
+                CHECK_NO_VEX(s);
+                if ((op6.flags & SSE_OPF_MMX) == 0) {
+                    goto unknown_op;
+                }
                 op1_offset = offsetof(CPUX86State,fpregs[reg].mmx);
                 if (mod == 3) {
                     op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
@@ -3820,16 +4616,16 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     gen_lea_modrm(env, s, modrm);
                     gen_ldq_env_A0(s, op2_offset);
                 }
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+                if (op6.flags & SSE_OPF_V0) {
+                    op6.fn[0].op1(cpu_env, s->ptr0, s->ptr1);
+                } else {
+                    op6.fn[0].op2(cpu_env, s->ptr0, s->ptr0, s->ptr1);
+                }
             }
-            if (sse_fn_epp == SSE_SPECIAL) {
-                goto unknown_op;
-            }
-
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
 
-            if (b == 0x17) {
+            if (op6.flags & SSE_OPF_CMP) {
                 set_cc_op(s, CC_OP_EFLAGS);
             }
             break;
@@ -3846,6 +4642,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             case 0x3f0: /* crc32 Gd,Eb */
             case 0x3f1: /* crc32 Gd,Ey */
             do_crc32:
+                CHECK_NO_VEX(s);
                 if (!(s->cpuid_ext_features & CPUID_EXT_SSE42)) {
                     goto illegal_op;
                 }
@@ -3877,6 +4674,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 /* FALLTHRU */
             case 0x0f0: /* movbe Gy,My */
             case 0x0f1: /* movbe My,Gy */
+                CHECK_NO_VEX(s);
                 if (!(s->cpuid_ext_features & CPUID_EXT_MOVBE)) {
                     goto illegal_op;
                 }
@@ -4043,6 +4841,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 
             case 0x1f6: /* adcx Gy, Ey */
             case 0x2f6: /* adox Gy, Ey */
+                CHECK_NO_VEX(s);
                 if (!(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_ADX)) {
                     goto illegal_op;
                 } else {
@@ -4196,18 +4995,28 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             rm = modrm & 7;
             reg = ((modrm >> 3) & 7) | REX_R(s);
             mod = (modrm >> 6) & 3;
+            if (s->prefix & PREFIX_VEX) {
+                reg_v = s->vex_v;
+            } else {
+                reg_v = reg;
+            }
 
             assert(b1 < 2);
-            sse_fn_eppi = sse_op_table7[b].op[b1];
-            if (!sse_fn_eppi) {
+            op7 = sse_op_table7[b];
+            if (op7.ext_mask == 0) {
                 goto unknown_op;
             }
-            if (!(s->cpuid_ext_features & sse_op_table7[b].ext_mask))
+            if (!(s->cpuid_ext_features & op7.ext_mask)) {
                 goto illegal_op;
+            }
 
             s->rip_offset = 1;
 
-            if (sse_fn_eppi == SSE_SPECIAL) {
+            if (op7.flags & SSE_OPF_SPECIAL) {
+                /* None of the "special" ops are valid on mmx registers */
+                if (b1 == 0) {
+                    goto illegal_op;
+                }
                 ot = mo_64_32(s->dflag);
                 rm = (modrm & 7) | REX_B(s);
                 if (mod != 3)
@@ -4216,6 +5025,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 val = x86_ldub_code(env, s);
                 switch (b) {
                 case 0x14: /* pextrb */
+                    CHECK_AVX_V0_128(s);
                     tcg_gen_ld8u_tl(s->T0, cpu_env, offsetof(CPUX86State,
                                             xmm_regs[reg].ZMM_B(val & 15)));
                     if (mod == 3) {
@@ -4226,6 +5036,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                     break;
                 case 0x15: /* pextrw */
+                    CHECK_AVX_V0_128(s);
                     tcg_gen_ld16u_tl(s->T0, cpu_env, offsetof(CPUX86State,
                                             xmm_regs[reg].ZMM_W(val & 7)));
                     if (mod == 3) {
@@ -4236,6 +5047,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                     break;
                 case 0x16:
+                    CHECK_AVX_V0_128(s);
                     if (ot == MO_32) { /* pextrd */
                         tcg_gen_ld_i32(s->tmp2_i32, cpu_env,
                                         offsetof(CPUX86State,
@@ -4263,6 +5075,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                     break;
                 case 0x17: /* extractps */
+                    CHECK_AVX_V0_128(s);
                     tcg_gen_ld32u_tl(s->T0, cpu_env, offsetof(CPUX86State,
                                             xmm_regs[reg].ZMM_L(val & 3)));
                     if (mod == 3) {
@@ -4273,6 +5086,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                     break;
                 case 0x20: /* pinsrb */
+                    CHECK_AVX_128(s);
+                    if (reg != reg_v) {
+                        gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(reg_v));
+                    }
                     if (mod == 3) {
                         gen_op_mov_v_reg(s, MO_32, s->T0, rm);
                     } else {
@@ -4281,18 +5098,23 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                     tcg_gen_st8_tl(s->T0, cpu_env, offsetof(CPUX86State,
                                             xmm_regs[reg].ZMM_B(val & 15)));
+                    gen_clear_ymmh(s, reg);
                     break;
                 case 0x21: /* insertps */
+                    CHECK_AVX_128(s);
                     if (mod == 3) {
                         tcg_gen_ld_i32(s->tmp2_i32, cpu_env,
-                                        offsetof(CPUX86State,xmm_regs[rm]
+                                        offsetof(CPUX86State, xmm_regs[rm]
                                                 .ZMM_L((val >> 6) & 3)));
                     } else {
                         tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
                                             s->mem_index, MO_LEUL);
                     }
+                    if (reg != reg_v) {
+                        gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(reg_v));
+                    }
                     tcg_gen_st_i32(s->tmp2_i32, cpu_env,
-                                    offsetof(CPUX86State,xmm_regs[reg]
+                                    offsetof(CPUX86State, xmm_regs[reg]
                                             .ZMM_L((val >> 4) & 3)));
                     if ((val >> 0) & 1)
                         tcg_gen_st_i32(tcg_const_i32(0 /*float32_zero*/),
@@ -4310,8 +5132,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                         tcg_gen_st_i32(tcg_const_i32(0 /*float32_zero*/),
                                         cpu_env, offsetof(CPUX86State,
                                                 xmm_regs[reg].ZMM_L(3)));
+                    gen_clear_ymmh(s, reg);
                     break;
                 case 0x22:
+                    CHECK_AVX_128(s);
+                    if (reg != reg_v) {
+                        gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(reg_v));
+                    }
                     if (ot == MO_32) { /* pinsrd */
                         if (mod == 3) {
                             tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[rm]);
@@ -4337,21 +5164,91 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                         goto illegal_op;
 #endif
                     }
+                    gen_clear_ymmh(s, reg);
+                    break;
+                case 0x38: /* vinserti128 */
+                    CHECK_AVX2_256(s);
+                    /* fall through */
+                case 0x18: /* vinsertf128 */
+                    CHECK_AVX(s);
+                    if ((s->prefix & PREFIX_VEX) == 0 || s->vex_l == 0) {
+                        goto illegal_op;
+                    }
+                    if (mod == 3) {
+                        if (val & 1) {
+                            gen_op_movo_ymm_l2h(s, XMM_OFFSET(reg),
+                                                XMM_OFFSET(rm));
+                        } else {
+                            gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(rm));
+                        }
+                    } else {
+                        if (val & 1) {
+                            gen_ldo_env_A0_ymmh(s, XMM_OFFSET(reg));
+                        } else {
+                            gen_ldo_env_A0(s, XMM_OFFSET(reg));
+                        }
+                    }
+                    if (reg != reg_v) {
+                        if (val & 1) {
+                            gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(reg_v));
+                        } else {
+                            gen_op_movo_ymmh(s, XMM_OFFSET(reg),
+                                             XMM_OFFSET(reg_v));
+                        }
+                    }
+                    break;
+                case 0x39: /* vextracti128 */
+                    CHECK_AVX2_256(s);
+                    /* fall through */
+                case 0x19: /* vextractf128 */
+                    CHECK_AVX_V0(s);
+                    if ((s->prefix & PREFIX_VEX) == 0 || s->vex_l == 0) {
+                        goto illegal_op;
+                    }
+                    if (mod == 3) {
+                        op1_offset = XMM_OFFSET(rm);
+                        if (val & 1) {
+                            gen_op_movo_ymm_h2l(s, XMM_OFFSET(rm),
+                                                XMM_OFFSET(reg));
+                        } else {
+                            gen_op_movo(s, XMM_OFFSET(rm), XMM_OFFSET(reg));
+                        }
+                        gen_clear_ymmh(s, rm);
+                    } else {
+                        if (val & 1) {
+                            gen_sto_env_A0_ymmh(s, XMM_OFFSET(reg));
+                        } else {
+                            gen_sto_env_A0(s, XMM_OFFSET(reg));
+                        }
+                    }
                     break;
+                default:
+                    goto unknown_op;
                 }
                 return;
             }
 
-            if (b1) {
-                op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
-                if (mod == 3) {
-                    op2_offset = offsetof(CPUX86State,xmm_regs[rm | REX_B(s)]);
-                } else {
-                    op2_offset = offsetof(CPUX86State,xmm_t0);
-                    gen_lea_modrm(env, s, modrm);
-                    gen_ldo_env_A0(s, op2_offset);
+            CHECK_AVX(s);
+            scalar_op = (s->prefix & PREFIX_VEX)
+                && (op7.flags & SSE_OPF_SCALAR)
+                && !(op7.flags & SSE_OPF_CMP);
+            if (is_xmm && (op7.flags & SSE_OPF_MMX)) {
+                CHECK_AVX2_256(s);
+            }
+            if (op7.flags & SSE_OPF_AVX2) {
+                CHECK_AVX2(s);
+            }
+            if ((op7.flags & SSE_OPF_V0) && !scalar_op) {
+                CHECK_AVX_V0(s);
+            }
+
+            if (b1 == 0) {
+                CHECK_NO_VEX(s);
+                /* MMX */
+                if ((op7.flags & SSE_OPF_MMX) == 0) {
+                    goto illegal_op;
                 }
-            } else {
+
                 op1_offset = offsetof(CPUX86State,fpregs[reg].mmx);
                 if (mod == 3) {
                     op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
@@ -4360,9 +5257,37 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     gen_lea_modrm(env, s, modrm);
                     gen_ldq_env_A0(s, op2_offset);
                 }
+                val = x86_ldub_code(env, s);
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+
+                /* We only actually have one MMX instruction (palignr) */
+                assert(b == 0x0f);
+
+                op7.fn[0].op2(cpu_env, s->ptr0, s->ptr0, s->ptr1,
+                              tcg_const_i32(val));
+                break;
+            }
+
+            /* SSE */
+            if (op7.flags & SSE_OPF_BLENDV && !(s->prefix & PREFIX_VEX)) {
+                /* Only VEX encodings are valid for these blendv opcodes */
+                goto illegal_op;
+            }
+            op1_offset = XMM_OFFSET(reg);
+            if (mod == 3) {
+                op2_offset = XMM_OFFSET(rm | REX_B(s));
+            } else {
+                op2_offset = offsetof(CPUX86State, xmm_t0);
+                gen_lea_modrm(env, s, modrm);
+                if (s->vex_l) {
+                    gen_ldy_env_A0(s, op2_offset);
+                } else {
+                    gen_ldo_env_A0(s, op2_offset);
+                }
             }
-            val = x86_ldub_code(env, s);
 
+            val = x86_ldub_code(env, s);
             if ((b & 0xfc) == 0x60) { /* pcmpXstrX */
                 set_cc_op(s, CC_OP_EFLAGS);
 
@@ -4370,11 +5295,49 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     /* The helper must use entire 64-bit gp registers */
                     val |= 1 << 8;
                 }
+                if ((b & 1) == 0) /* pcmpXstrm */
+                    gen_clear_ymmh(s, 0);
             }
 
+            if (s->vex_l) {
+                b1 = 2;
+            }
+            v_offset = XMM_OFFSET(reg_v);
+            /*
+             * Populate the top part of the destination register for VEX
+             * encoded scalar operations
+             */
+            if (scalar_op && op1_offset != v_offset) {
+                if (b == 0x0a) { /* roundss */
+                    gen_op_movl(s,
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_L(1)));
+                }
+                gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)));
+            }
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            sse_fn_eppi(cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val));
+            if (op7.flags & SSE_OPF_V0) {
+                op7.fn[b1].op1(cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val));
+            } else {
+                tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset);
+                if (op7.flags & SSE_OPF_BLENDV) {
+                    TCGv_ptr mask = tcg_temp_new_ptr();
+                    tcg_gen_addi_ptr(mask, cpu_env, XMM_OFFSET(val >> 4));
+                    op7.fn[b1].op3(cpu_env, s->ptr0, s->ptr2, s->ptr1, mask);
+                    tcg_temp_free_ptr(mask);
+                } else {
+                    op7.fn[b1].op2(cpu_env, s->ptr0, s->ptr2, s->ptr1,
+                                   tcg_const_i32(val));
+                }
+            }
+            if ((op7.flags & SSE_OPF_CMP) == 0 && s->vex_l == 0) {
+                gen_clear_ymmh(s, reg);
+            }
+            if (op7.flags & SSE_OPF_CMP) {
+                set_cc_op(s, CC_OP_EFLAGS);
+            }
             break;
 
         case 0x33a:
@@ -4424,34 +5387,49 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         default:
             break;
         }
+        if (s->vex_l) {
+            b1 += 4;
+        }
+        if ((sse_op.flags & SSE_OPF_3DNOW) == 0 && !sse_op.fn[b1].op1) {
+            goto unknown_op;
+        }
         if (is_xmm) {
-            op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+            scalar_op = (s->prefix & PREFIX_VEX)
+                && (sse_op.flags & SSE_OPF_SCALAR)
+                && !(sse_op.flags & SSE_OPF_CMP)
+                && (b1 == 2 || b1 == 3);
+            /* VEX encoded scalar ops always have 3 operands! */
+            if ((sse_op.flags & SSE_OPF_V0) && !scalar_op) {
+                CHECK_AVX_V0(s);
+            } else {
+                CHECK_AVX(s);
+            }
+            if (sse_op.flags & SSE_OPF_MMX) {
+                CHECK_AVX2_256(s);
+            }
+            op1_offset = XMM_OFFSET(reg);
             if (mod != 3) {
-                int sz = 4;
+                int sz = s->vex_l ? 5 : 4;
 
                 gen_lea_modrm(env, s, modrm);
-                op2_offset = offsetof(CPUX86State,xmm_t0);
-
-                switch (b) {
-                case 0x50 ... 0x5a:
-                case 0x5c ... 0x5f:
-                case 0xc2:
-                    /* Most sse scalar operations.  */
-                    if (b1 == 2) {
-                        sz = 2;
-                    } else if (b1 == 3) {
-                        sz = 3;
-                    }
-                    break;
+                op2_offset = offsetof(CPUX86State, xmm_t0);
 
-                case 0x2e:  /* ucomis[sd] */
-                case 0x2f:  /* comis[sd] */
-                    if (b1 == 0) {
-                        sz = 2;
+                if (sse_op.flags & SSE_OPF_SCALAR) {
+                    if (sse_op.flags & SSE_OPF_CMP) {
+                        /* ucomis[sd], comis[sd] */
+                        if (b1 == 0) {
+                            sz = 2;
+                        } else {
+                            sz = 3;
+                        }
                     } else {
-                        sz = 3;
+                        /* Most sse scalar operations.  */
+                        if (b1 == 2) {
+                            sz = 2;
+                        } else if (b1 == 3) {
+                            sz = 3;
+                        }
                     }
-                    break;
                 }
 
                 switch (sz) {
@@ -4459,22 +5437,29 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     /* 32 bit access */
                     gen_op_ld_v(s, MO_32, s->T0, s->A0);
                     tcg_gen_st32_tl(s->T0, cpu_env,
-                                    offsetof(CPUX86State,xmm_t0.ZMM_L(0)));
+                                    offsetof(CPUX86State, xmm_t0.ZMM_L(0)));
                     break;
                 case 3:
                     /* 64 bit access */
                     gen_ldq_env_A0(s, offsetof(CPUX86State, xmm_t0.ZMM_D(0)));
                     break;
-                default:
+                case 4:
                     /* 128 bit access */
                     gen_ldo_env_A0(s, op2_offset);
                     break;
+                case 5:
+                    /* 256 bit access */
+                    gen_ldy_env_A0(s, op2_offset);
+                    break;
                 }
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = XMM_OFFSET(rm);
             }
+            v_offset = XMM_OFFSET(reg_v);
         } else {
+            CHECK_NO_VEX(s);
+            scalar_op = 0;
             op1_offset = offsetof(CPUX86State,fpregs[reg].mmx);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
@@ -4484,60 +5469,100 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 rm = (modrm & 7);
                 op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
             }
-        }
-        switch(b) {
-        case 0x0f: /* 3DNow! data insns */
-            val = x86_ldub_code(env, s);
-            sse_fn_epp = sse_op_table5[val];
-            if (!sse_fn_epp) {
-                goto unknown_op;
+            if (sse_op.flags & SSE_OPF_3DNOW) {
+                /* 3DNow! data insns */
+                val = x86_ldub_code(env, s);
+                SSEFunc_0_epp sse_fn_epp = sse_op_table5[val];
+                if (!sse_fn_epp) {
+                    goto unknown_op;
+                }
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+                sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
+                return;
             }
-            if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) {
-                goto illegal_op;
+            v_offset = op1_offset;
+        }
+
+        /*
+         * Populate the top part of the destination register for VEX
+         * encoded scalar operations
+         */
+        if (scalar_op && op1_offset != v_offset) {
+            if (b == 0x5a) {
+                /*
+                 * Scalar conversions are tricky because the src and dest
+                 * may be different sizes
+                 */
+                if (op1_offset == op2_offset) {
+                    /*
+                     * The second source operand overlaps the
+                     * destination, so we need to copy the value
+                     */
+                    op2_offset = offsetof(CPUX86State, xmm_t0);
+                    gen_op_movq(s, op2_offset, op1_offset);
+                }
+                gen_op_movo(s, op1_offset, v_offset);
+            } else {
+                if (b1 == 2) { /* ss */
+                    gen_op_movl(s,
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_L(1)));
+                }
+                gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)));
             }
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
-            break;
-        case 0x70: /* pshufx insn */
-        case 0xc6: /* pshufx insn */
-            val = x86_ldub_code(env, s);
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            /* XXX: introduce a new table? */
-            sse_fn_ppi = (SSEFunc_0_ppi)sse_fn_epp;
-            sse_fn_ppi(s->ptr0, s->ptr1, tcg_const_i32(val));
-            break;
-        case 0xc2:
-            /* compare insns, bits 7:3 (7:5 for AVX) are ignored */
-            val = x86_ldub_code(env, s) & 7;
-            sse_fn_epp = sse_op_table4[val][b1];
+        }
 
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
-            break;
-        case 0xf7:
-            /* maskmov : we must prepare A0 */
-            if (mod != 3)
-                goto illegal_op;
-            tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]);
-            gen_extu(s->aflag, s->A0);
-            gen_add_A0_ds_seg(s);
+        tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+        tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+        if (sse_op.flags & SSE_OPF_V0) {
+            if (sse_op.flags & SSE_OPF_SHUF) {
+                val = x86_ldub_code(env, s);
+                sse_op.fn[b1].op1i(s->ptr0, s->ptr1, tcg_const_i32(val));
+            } else if (b == 0xf7) {
+                /* maskmov : we must prepare A0 */
+                if (mod != 3) {
+                    goto illegal_op;
+                }
+                tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]);
+                gen_extu(s->aflag, s->A0);
+                gen_add_A0_ds_seg(s);
+
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+                sse_op.fn[b1].op1t(cpu_env, s->ptr0, s->ptr1, s->A0);
+                /* Does not write to the first operand */
+                return;
+            } else {
+                sse_op.fn[b1].op1(cpu_env, s->ptr0, s->ptr1);
+            }
+        } else {
+            tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset);
+            if (sse_op.flags & SSE_OPF_SHUF) {
+                val = x86_ldub_code(env, s);
+                sse_op.fn[b1].op2i(s->ptr0, s->ptr2, s->ptr1,
+                                   tcg_const_i32(val));
+            } else {
+                SSEFunc_0_eppp fn = sse_op.fn[b1].op2;
+                if (b == 0xc2) {
+                    /* compare insns */
+                    val = x86_ldub_code(env, s);
+                    if (s->prefix & PREFIX_VEX) {
+                        val &= 0x1f;
+                    } else {
+                        val &= 7;
+                    }
+                    fn = sse_op_table4[val][b1];
+                }
+                fn(cpu_env, s->ptr0, s->ptr2, s->ptr1);
+            }
+        }
 
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            /* XXX: introduce a new table? */
-            sse_fn_eppt = (SSEFunc_0_eppt)sse_fn_epp;
-            sse_fn_eppt(cpu_env, s->ptr0, s->ptr1, s->A0);
-            break;
-        default:
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
-            break;
+        if (s->vex_l == 0 && (sse_op.flags & SSE_OPF_CMP) == 0) {
+            gen_clear_ymmh(s, reg);
         }
-        if (b == 0x2e || b == 0x2f) {
+        if (sse_op.flags & SSE_OPF_CMP) {
             set_cc_op(s, CC_OP_EFLAGS);
         }
     }
@@ -8619,6 +9644,7 @@ static void i386_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cpu)
     dc->tmp4 = tcg_temp_new();
     dc->ptr0 = tcg_temp_new_ptr();
     dc->ptr1 = tcg_temp_new_ptr();
+    dc->ptr2 = tcg_temp_new_ptr();
     dc->cc_srcT = tcg_temp_local_new();
 }
 
-- 
2.35.2
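
For readers following the translate.c changes in this patch, the two
register-merging rules they rely on can be sketched in Python (the language
of test-avx.py). This is an illustrative model, not QEMU code: the function
names and the list-of-lanes register representation are assumptions made
for the example, and lanes are treated as opaque 32-bit integers rather
than floats.

```python
# Illustrative model only. A 256-bit register is a list of eight 32-bit
# lanes, lane 0 being the least significant. None of these names exist
# in QEMU.

def vex128_scalar_ss(op, src1, src2):
    """Model a VEX.128 scalar single-precision op (e.g. vaddss)."""
    dst = [0] * 8                                # bits 255:128 are zeroed
    dst[0] = op(src1[0], src2[0]) & 0xFFFFFFFF   # only lane 0 is computed
    dst[1:4] = src1[1:4]                         # bits 127:32 come from src1
    return dst


def vinsert128(src1, src2_low128, imm8):
    """Model vinsertf128/vinserti128: replace one 128-bit half of src1."""
    assert len(src1) == 8 and len(src2_low128) == 4
    dst = list(src1)
    base = 4 if imm8 & 1 else 0                  # imm8 bit 0 selects the half
    dst[base:base + 4] = src2_low128
    return dst


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10, 20, 30, 40, 50, 60, 70, 80]
print(vex128_scalar_ss(lambda x, y: x + y, a, b))  # [11, 2, 3, 4, 0, 0, 0, 0]
print(vinsert128(a, b[:4], 1))                     # [1, 2, 3, 4, 10, 20, 30, 40]
```

The first function corresponds to the "populate the top part of the
destination register" step followed by gen_clear_ymmh(); the second to the
0x18/0x38 cases, where the untouched half is copied from the VEX.vvvv
register.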




* [PATCH 3/4] Enable all x86-64 cpu features in user mode
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
  2022-04-18 17:39 ` [PATCH 1/4] Add AVX_EN hflag Paul Brook
  2022-04-18 17:39 ` [PATCH 2/4] TCG support for AVX Paul Brook
@ 2022-04-18 17:39 ` Paul Brook
  2022-04-18 17:39 ` [PATCH 4/4] AVX tests Paul Brook
                   ` (42 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-18 17:39 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Paolo Bonzini, Richard Henderson,
	Laurent Vivier, Paul Brook

We don't have any migration concerns for usermode emulation, so we may
as well enable all available CPU features by default.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 linux-user/x86_64/target_elf.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/x86_64/target_elf.h b/linux-user/x86_64/target_elf.h
index 7b76a90de8..3f628f8d66 100644
--- a/linux-user/x86_64/target_elf.h
+++ b/linux-user/x86_64/target_elf.h
@@ -9,6 +9,6 @@
 #define X86_64_TARGET_ELF_H
 static inline const char *cpu_get_model(uint32_t eflags)
 {
-    return "qemu64";
+    return "max";
 }
 #endif
-- 
2.35.2




* [PATCH 4/4] AVX tests
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (2 preceding siblings ...)
  2022-04-18 17:39 ` [PATCH 3/4] Enable all x86-64 cpu features in user mode Paul Brook
@ 2022-04-18 17:39 ` Paul Brook
  2022-04-19 10:34   ` Alex Bennée
  2022-04-24 22:01 ` [PATCH v2 01/42] i386: pcmpestr 64-bit sign extension bug Paul Brook
                   ` (41 subsequent siblings)
  45 siblings, 1 reply; 67+ messages in thread
From: Paul Brook @ 2022-04-18 17:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eduardo Habkost, Paolo Bonzini, Richard Henderson, Paul Brook

Tests for correct operation of most x86-64 SSE and AVX instructions.
They should cover all combinations of overlapping register and memory
operands on a set of random-ish data.

Results are bit-identical to an Intel i5-8500, with the exception of
the RCPSS and RSQRT approximations, where the real CPU gives less accurate
results (the Intel spec allows relative errors up to 1.5 * 2^-12).
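
The tolerance mentioned above can be expressed as a simple check. The
following Python sketch is not part of the test suite; the helper names
are hypothetical, and it only shows how a result would be judged against
the 1.5 * 2^-12 relative-error bound.

```python
import math

# Relative-error bound quoted above for the reciprocal approximations.
MAX_REL_ERR = 1.5 * 2.0 ** -12


def within_rcp_bound(x, approx):
    """Return True if approx is an acceptable approximation of 1/x."""
    exact = 1.0 / x
    return abs(approx - exact) <= MAX_REL_ERR * abs(exact)


def within_rsqrt_bound(x, approx):
    """Return True if approx is an acceptable approximation of 1/sqrt(x)."""
    exact = 1.0 / math.sqrt(x)
    return abs(approx - exact) <= MAX_REL_ERR * abs(exact)
```

Under this check an emulator that returns the exactly rounded value and a
CPU that returns a coarser approximation can both be correct while
producing different bits, which is why these instructions are the
exception to the bit-identical comparison.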

Signed-off-by: Paul Brook <paul@nowt.org>
---
 tests/tcg/i386/Makefile.target |   10 +-
 tests/tcg/i386/README          |    9 +
 tests/tcg/i386/test-avx.c      |  347 +++
 tests/tcg/i386/test-avx.py     |  352 +++
 tests/tcg/i386/x86.csv         | 4658 ++++++++++++++++++++++++++++++++
 5 files changed, 5374 insertions(+), 2 deletions(-)
 create mode 100644 tests/tcg/i386/test-avx.c
 create mode 100755 tests/tcg/i386/test-avx.py
 create mode 100644 tests/tcg/i386/x86.csv

diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
index e1c0310be6..f1c3275e2e 100644
--- a/tests/tcg/i386/Makefile.target
+++ b/tests/tcg/i386/Makefile.target
@@ -7,8 +7,8 @@ VPATH 		+= $(I386_SRC)
 
 I386_SRCS=$(notdir $(wildcard $(I386_SRC)/*.c))
 ALL_X86_TESTS=$(I386_SRCS:.c=)
-SKIP_I386_TESTS=test-i386-ssse3
-X86_64_TESTS:=$(filter test-i386-ssse3, $(ALL_X86_TESTS))
+SKIP_I386_TESTS=test-i386-ssse3 test-avx
+X86_64_TESTS:=$(filter test-i386-ssse3 test-avx, $(ALL_X86_TESTS))
 
 test-i386-sse-exceptions: CFLAGS += -msse4.1 -mfpmath=sse
 run-test-i386-sse-exceptions: QEMU_OPTS += -cpu max
@@ -80,3 +80,9 @@ run-sha512-sse: QEMU_OPTS+=-cpu max
 run-plugin-sha512-sse-with-%: QEMU_OPTS+=-cpu max
 
 TESTS+=sha512-sse
+
+test-avx.h: test-avx.py x86.csv
+	$(PYTHON) $(I386_SRC)/test-avx.py $(I386_SRC)/x86.csv $@
+
+test-avx: CFLAGS += -mavx -masm=intel -O -I.
+test-avx: test-avx.h
diff --git a/tests/tcg/i386/README b/tests/tcg/i386/README
index 09e88f30dc..403d10dad8 100644
--- a/tests/tcg/i386/README
+++ b/tests/tcg/i386/README
@@ -15,6 +15,15 @@ The Linux system call vm86() is used to test vm86 emulation.
 Various exceptions are raised to test most of the x86 user space
 exception reporting.
 
+test-avx
+--------
+
+This program executes most SSE/AVX instructions and generates a text output,
+for comparison with the output obtained with a real CPU or another emulator.
+
+test-avx.h is generated from x86.csv by test-avx.py
+x86.csv comes from https://github.com/quasilyte/avx512test
+
 linux-test
 ----------
 
diff --git a/tests/tcg/i386/test-avx.c b/tests/tcg/i386/test-avx.c
new file mode 100644
index 0000000000..953e2906fe
--- /dev/null
+++ b/tests/tcg/i386/test-avx.c
@@ -0,0 +1,347 @@
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+typedef void (*testfn)(void);
+
+typedef struct {
+    uint64_t q0, q1, q2, q3;
+} __attribute__((aligned(32))) v4di;
+
+typedef struct {
+    uint64_t mm[8];
+    v4di ymm[16];
+    uint64_t r[16];
+    uint64_t flags;
+    uint32_t ff;
+    uint64_t pad;
+    v4di mem[4];
+    v4di mem0[4];
+} reg_state;
+
+typedef struct {
+    int n;
+    testfn fn;
+    const char *s;
+    reg_state *init;
+} TestDef;
+
+reg_state initI;
+reg_state initF32;
+reg_state initF64;
+
+static void dump_ymm(const char *name, int n, const v4di *r, int ff)
+{
+    printf("%s%d = %016lx %016lx %016lx %016lx\n",
+           name, n, r->q3, r->q2, r->q1, r->q0);
+    if (ff == 64) {
+        double v[4];
+        memcpy(v, r, sizeof(v));
+        printf("        %16g %16g %16g %16g\n",
+                v[3], v[2], v[1], v[0]);
+    } else if (ff == 32) {
+        float v[8];
+        memcpy(v, r, sizeof(v));
+        printf(" %8g %8g %8g %8g %8g %8g %8g %8g\n",
+                v[7], v[6], v[5], v[4], v[3], v[2], v[1], v[0]);
+    }
+}
+
+static void dump_regs(reg_state *s)
+{
+    int i;
+
+    for (i = 0; i < 16; i++) {
+        dump_ymm("ymm", i, &s->ymm[i], 0);
+    }
+    for (i = 0; i < 4; i++) {
+        dump_ymm("mem", i, &s->mem0[i], 0);
+    }
+}
+
+static void compare_state(const reg_state *a, const reg_state *b)
+{
+    int i;
+    for (i = 0; i < 8; i++) {
+        if (a->mm[i] != b->mm[i]) {
+            printf("MM%d = %016lx\n", i, b->mm[i]);
+        }
+    }
+    for (i = 0; i < 16; i++) {
+        if (a->r[i] != b->r[i]) {
+            printf("r%d = %016lx\n", i, b->r[i]);
+        }
+    }
+    for (i = 0; i < 16; i++) {
+        if (memcmp(&a->ymm[i], &b->ymm[i], 32)) {
+            dump_ymm("ymm", i, &b->ymm[i], a->ff);
+        }
+    }
+    for (i = 0; i < 4; i++) {
+        if (memcmp(&a->mem0[i], &a->mem[i], 32)) {
+            dump_ymm("mem", i, &a->mem[i], a->ff);
+        }
+    }
+    if (a->flags != b->flags) {
+        printf("FLAGS = %016lx\n", b->flags);
+    }
+}
+
+#define LOADMM(r, o) "movq " #r ", " #o "[%0]\n\t"
+#define LOADYMM(r, o) "vmovdqa " #r ", " #o "[%0]\n\t"
+#define STOREMM(r, o) "movq " #o "[%1], " #r "\n\t"
+#define STOREYMM(r, o) "vmovdqa " #o "[%1], " #r "\n\t"
+#define MMREG(F) \
+    F(mm0, 0x00) \
+    F(mm1, 0x08) \
+    F(mm2, 0x10) \
+    F(mm3, 0x18) \
+    F(mm4, 0x20) \
+    F(mm5, 0x28) \
+    F(mm6, 0x30) \
+    F(mm7, 0x38)
+#define YMMREG(F) \
+    F(ymm0, 0x040) \
+    F(ymm1, 0x060) \
+    F(ymm2, 0x080) \
+    F(ymm3, 0x0a0) \
+    F(ymm4, 0x0c0) \
+    F(ymm5, 0x0e0) \
+    F(ymm6, 0x100) \
+    F(ymm7, 0x120) \
+    F(ymm8, 0x140) \
+    F(ymm9, 0x160) \
+    F(ymm10, 0x180) \
+    F(ymm11, 0x1a0) \
+    F(ymm12, 0x1c0) \
+    F(ymm13, 0x1e0) \
+    F(ymm14, 0x200) \
+    F(ymm15, 0x220)
+#define LOADREG(r, o) "mov " #r ", " #o "[rax]\n\t"
+#define STOREREG(r, o) "mov " #o "[rax], " #r "\n\t"
+#define REG(F) \
+    F(rbx, 0x248) \
+    F(rcx, 0x250) \
+    F(rdx, 0x258) \
+    F(rsi, 0x260) \
+    F(rdi, 0x268) \
+    F(r8, 0x280) \
+    F(r9, 0x288) \
+    F(r10, 0x290) \
+    F(r11, 0x298) \
+    F(r12, 0x2a0) \
+    F(r13, 0x2a8) \
+    F(r14, 0x2b0) \
+    F(r15, 0x2b8) \
+
+static void run_test(const TestDef *t)
+{
+    reg_state result;
+    reg_state *init = t->init;
+    memcpy(init->mem, init->mem0, sizeof(init->mem));
+    printf("%5d %s\n", t->n, t->s);
+    asm volatile(
+            MMREG(LOADMM)
+            YMMREG(LOADYMM)
+            "sub rsp, 128\n\t"
+            "push rax\n\t"
+            "push rbx\n\t"
+            "push rcx\n\t"
+            "push rdx\n\t"
+            "push %1\n\t"
+            "push %2\n\t"
+            "mov rax, %0\n\t"
+            "pushf\n\t"
+            "pop rbx\n\t"
+            "shr rbx, 8\n\t"
+            "shl rbx, 8\n\t"
+            "mov rcx, 0x2c0[rax]\n\t"
+            "and rcx, 0xff\n\t"
+            "or rbx, rcx\n\t"
+            "push rbx\n\t"
+            "popf\n\t"
+            REG(LOADREG)
+            "mov rax, 0x240[rax]\n\t"
+            "call [rsp]\n\t"
+            "mov [rsp], rax\n\t"
+            "mov rax, 8[rsp]\n\t"
+            REG(STOREREG)
+            "mov rbx, [rsp]\n\t"
+            "mov 0x240[rax], rbx\n\t"
+            "mov rbx, 0\n\t"
+            "mov 0x270[rax], rbx\n\t"
+            "mov 0x278[rax], rbx\n\t"
+            "pushf\n\t"
+            "pop rbx\n\t"
+            "and rbx, 0xff\n\t"
+            "mov 0x2c0[rax], rbx\n\t"
+            "add rsp, 16\n\t"
+            "pop rdx\n\t"
+            "pop rcx\n\t"
+            "pop rbx\n\t"
+            "pop rax\n\t"
+            "add rsp, 128\n\t"
+            MMREG(STOREMM)
+            YMMREG(STOREYMM)
+            : : "r"(init), "r"(&result), "r"(t->fn)
+            : "memory", "cc",
+            "rsi", "rdi",
+            "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
+            "mm0", "mm1", "mm2", "mm3", "mm4", "mm5", "mm6", "mm7",
+            "ymm0", "ymm1", "ymm2", "ymm3", "ymm4", "ymm5",
+            "ymm6", "ymm7", "ymm8", "ymm9", "ymm10", "ymm11",
+            "ymm12", "ymm13", "ymm14", "ymm15"
+            );
+    compare_state(init, &result);
+}
+
+#define TEST(n, cmd, type) \
+static void __attribute__((naked)) test_##n(void) \
+{ \
+    asm volatile(cmd); \
+    asm volatile("ret"); \
+}
+#include "test-avx.h"
+
+
+static const TestDef test_table[] = {
+#define TEST(n, cmd, type) {n, test_##n, cmd, &init##type},
+#include "test-avx.h"
+    {-1, NULL, "", NULL}
+};
+
+static void run_all(void)
+{
+    const TestDef *t;
+    for (t = test_table; t->fn; t++) {
+        run_test(t);
+    }
+}
+
+#define ARRAY_LEN(x) (sizeof(x) / sizeof(x[0]))
+
+float val_f32[] = {2.0, -1.0, 4.8, 0.8, 3, -42.0, 5e6, 7.5, 8.3};
+double val_f64[] = {2.0, -1.0, 4.8, 0.8, 3, -42.0, 5e6, 7.5};
+v4di val_i64[] = {
+    {0x3d6b3b6a9e4118f2lu, 0x355ae76d2774d78clu,
+     0xac3ff76c4daa4b28lu, 0xe7fabd204cb54083lu},
+    {0xd851c54a56bf1f29lu, 0x4a84d1d50bf4c4fflu,
+     0x56621e553d52b56clu, 0xd0069553da8f584alu},
+    {0x5826475e2c5fd799lu, 0xfd32edc01243f5e9lu,
+     0x738ba2c66d3fe126lu, 0x5707219c6e6c26b4lu},
+};
+
+v4di deadbeef = {0xa5a5a5a5deadbeefull, 0xa5a5a5a5deadbeefull,
+                 0xa5a5a5a5deadbeefull, 0xa5a5a5a5deadbeefull};
+v4di indexq = {0x000000000000001full, 0x000000000000008full,
+               0xffffffffffffffffull, 0xffffffffffffff5full};
+v4di indexd = {0x00000002000000efull, 0xfffffff500000010ull,
+               0x0000000afffffff0ull, 0x000000000000000eull};
+
+v4di gather_mem[0x20];
+
+void init_f32reg(v4di *r)
+{
+    static int n;
+    float v[8];
+    int i;
+    for (i = 0; i < 8; i++) {
+        v[i] = val_f32[n++];
+        if (n == ARRAY_LEN(val_f32)) {
+            n = 0;
+        }
+    }
+    memcpy(r, v, sizeof(*r));
+}
+
+void init_f64reg(v4di *r)
+{
+    static int n;
+    double v[4];
+    int i;
+    for (i = 0; i < 4; i++) {
+        v[i] = val_f64[n++];
+        if (n == ARRAY_LEN(val_f64)) {
+            n = 0;
+        }
+    }
+    memcpy(r, v, sizeof(*r));
+}
+
+void init_intreg(v4di *r)
+{
+    static uint64_t mask;
+    static int n;
+
+    r->q0 = val_i64[n].q0 ^ mask;
+    r->q1 = val_i64[n].q1 ^ mask;
+    r->q2 = val_i64[n].q2 ^ mask;
+    r->q3 = val_i64[n].q3 ^ mask;
+    n++;
+    if (n == ARRAY_LEN(val_i64)) {
+        n = 0;
+        mask *= 0x104C11DB7;
+    }
+}
+
+static void init_all(reg_state *s)
+{
+    int i;
+
+    s->r[3] = (uint64_t)&s->mem[0]; /* rdx */
+    s->r[4] = (uint64_t)&gather_mem[ARRAY_LEN(gather_mem) / 2]; /* rsi */
+    s->r[5] = (uint64_t)&s->mem[2]; /* rdi */
+    s->flags = 2;
+    for (i = 0; i < 16; i++) {
+        s->ymm[i] = deadbeef;
+    }
+    s->ymm[13] = indexd;
+    s->ymm[14] = indexq;
+    for (i = 0; i < 4; i++) {
+        s->mem0[i] = deadbeef;
+    }
+}
+
+int main(int argc, char *argv[])
+{
+    int i;
+
+    init_all(&initI);
+    init_intreg(&initI.ymm[10]);
+    init_intreg(&initI.ymm[11]);
+    init_intreg(&initI.ymm[12]);
+    init_intreg(&initI.mem0[1]);
+    printf("Int:\n");
+    dump_regs(&initI);
+
+    init_all(&initF32);
+    init_f32reg(&initF32.ymm[10]);
+    init_f32reg(&initF32.ymm[11]);
+    init_f32reg(&initF32.ymm[12]);
+    init_f32reg(&initF32.mem0[1]);
+    initF32.ff = 32;
+    printf("F32:\n");
+    dump_regs(&initF32);
+
+    init_all(&initF64);
+    init_f64reg(&initF64.ymm[10]);
+    init_f64reg(&initF64.ymm[11]);
+    init_f64reg(&initF64.ymm[12]);
+    init_f64reg(&initF64.mem0[1]);
+    initF64.ff = 64;
+    printf("F64:\n");
+    dump_regs(&initF64);
+
+    for (i = 0; i < ARRAY_LEN(gather_mem); i++) {
+        init_intreg(&gather_mem[i]);
+    }
+
+    if (argc > 1) {
+        int n = atoi(argv[1]);
+        run_test(&test_table[n]);
+    } else {
+        run_all();
+    }
+    return 0;
+}
diff --git a/tests/tcg/i386/test-avx.py b/tests/tcg/i386/test-avx.py
new file mode 100755
index 0000000000..0b2d799c5c
--- /dev/null
+++ b/tests/tcg/i386/test-avx.py
@@ -0,0 +1,352 @@
+#! /usr/bin/env python3
+
+# Generate test-avx.h from x86.csv
+
+import csv
+import sys
+from fnmatch import fnmatch
+
+archs = [
+    # TODO: MMX?
+    "SSE", "SSE2", "SSE3", "SSSE3", "SSE4_1", "SSE4_2",
+    "AVX", "AVX2", "AES+AVX", # "VAES+AVX",
+]
+
+ignore = set(["FISTTP",
+    "LDMXCSR", "VLDMXCSR", "STMXCSR", "VSTMXCSR"])
+
+imask = {
+    'vBLENDPD': 0xff,
+    'vBLENDPS': 0x0f,
+    'CMP[PS][SD]': 0x07,
+    'VCMP[PS][SD]': 0x1f,
+    'vDPPD': 0x33,
+    'vDPPS': 0xff,
+    'vEXTRACTPS': 0x03,
+    'vINSERTPS': 0xff,
+    'MPSADBW': 0x7,
+    'VMPSADBW': 0x3f,
+    'vPALIGNR': 0x3f,
+    'vPBLENDW': 0xff,
+    'vPCMP[EI]STR*': 0x0f,
+    'vPEXTRB': 0x0f,
+    'vPEXTRW': 0x07,
+    'vPEXTRD': 0x03,
+    'vPEXTRQ': 0x01,
+    'vPINSRB': 0x0f,
+    'vPINSRW': 0x07,
+    'vPINSRD': 0x03,
+    'vPINSRQ': 0x01,
+    'vPSHUF[DW]': 0xff,
+    'vPSHUF[LH]W': 0xff,
+    'vPS[LR][AL][WDQ]': 0x3f,
+    'vPS[RL]LDQ': 0x1f,
+    'vROUND[PS][SD]': 0x7,
+    'vSHUFPD': 0x0f,
+    'vSHUFPS': 0xff,
+    'vAESKEYGENASSIST': 0,
+    'VEXTRACT[FI]128': 0x01,
+    'VINSERT[FI]128': 0x01,
+    'VPBLENDD': 0xff,
+    'VPERM2[FI]128': 0x33,
+    'VPERMPD': 0xff,
+    'VPERMQ': 0xff,
+    'VPERMILPS': 0xff,
+    'VPERMILPD': 0x0f,
+    }
+
+def strip_comments(x):
+    for l in x:
+        if l != '' and l[0] != '#':
+            yield l
+
+def reg_w(w):
+    if w == 8:
+        return 'al'
+    elif w == 16:
+        return 'ax'
+    elif w == 32:
+        return 'eax'
+    elif w == 64:
+        return 'rax'
+    raise Exception("bad reg_w %d" % w)
+
+def mem_w(w):
+    if w == 8:
+        t = "BYTE"
+    elif w == 16:
+        t = "WORD"
+    elif w == 32:
+        t = "DWORD"
+    elif w == 64:
+        t = "QWORD"
+    elif w == 128:
+        t = "XMMWORD"
+    elif w == 256:
+        t = "YMMWORD"
+    else:
+        raise Exception("bad mem_w %d" % w)
+
+    return t + " PTR 32[rdx]"
+
+class XMMArg():
+    isxmm = True
+    def __init__(self, reg, mw):
+        if mw not in [0, 8, 16, 32, 64, 128, 256]:
+            raise Exception("Bad /m width: %s" % mw)
+        self.reg = reg
+        self.mw = mw
+        self.ismem = mw != 0
+    def regstr(self, n):
+        if n < 0:
+            return mem_w(self.mw)
+        else:
+            return "%smm%d" % (self.reg, n)
+
+class MMArg():
+    isxmm = True
+    ismem = False # TODO
+    def regstr(self, n):
+        return "mm%d" % (n & 7)
+
+def match(op, pattern):
+    if pattern[0] == 'v':
+        return fnmatch(op, pattern[1:]) or fnmatch(op, 'V'+pattern[1:])
+    return fnmatch(op, pattern)
+
+class ArgVSIB():
+    isxmm = True
+    ismem = False
+    def __init__(self, reg, w):
+        if w not in [32, 64]:
+            raise Exception("Bad vsib width: %s" % w)
+        self.w = w
+        self.reg = reg
+    def regstr(self, n):
+        reg = "%smm%d" % (self.reg, n >> 2)
+        return "[rsi + %s * %d]" % (reg, 1 << (n & 3))
+
+class ArgImm8u():
+    isxmm = False
+    ismem = False
+    def __init__(self, op):
+        for k, v in imask.items():
+            if match(op, k):
+                self.mask = v
+                return
+        raise Exception("Unknown immediate")
+    def vals(self):
+        mask = self.mask
+        yield 0
+        n = 0
+        while n != mask:
+            n += 1
+            while (n & ~mask) != 0:
+                n += (n & ~mask)
+            yield n
+
+class ArgRM():
+    isxmm = False
+    def __init__(self, rw, mw):
+        if rw not in [8, 16, 32, 64]:
+            raise Exception("Bad r/w width: %s" % rw)
+        if mw not in [0, 8, 16, 32, 64]:
+            raise Exception("Bad m width: %s" % mw)
+        self.rw = rw
+        self.mw = mw
+        self.ismem = mw != 0
+    def regstr(self, n):
+        if n < 0:
+            return mem_w(self.mw)
+        else:
+            return reg_w(self.rw)
+
+class ArgMem():
+    isxmm = False
+    ismem = True
+    def __init__(self, w):
+        if w not in [8, 16, 32, 64, 128, 256]:
+            raise Exception("Bad mem width: %s" % w)
+        self.w = w
+    def regstr(self, n):
+        return mem_w(self.w)
+
+def ArgGenerator(arg, op):
+    if arg[:3] == 'xmm' or arg[:3] == "ymm":
+        if "/" in arg:
+            r, m = arg.split('/')
+            if (m[0] != 'm'):
+                raise Exception("Expected /m: %s" % arg)
+            return XMMArg(arg[0], int(m[1:]))
+        else:
+            return XMMArg(arg[0], 0)
+    elif arg[:2] == 'mm':
+        return MMArg()
+    elif arg[:4] == 'imm8':
+        return ArgImm8u(op)
+    elif arg == '<XMM0>':
+        return None
+    elif arg[0] == 'r':
+        if '/m' in arg:
+            r, m = arg.split('/')
+            if (m[0] != 'm'):
+                raise Exception("Expected /m: %s" % arg)
+            mw = int(m[1:])
+            if r == 'r':
+                rw = mw
+            else:
+                rw = int(r[1:])
+            return ArgRM(rw, mw)
+
+        return ArgRM(int(arg[1:]), 0)
+    elif arg[0] == 'm':
+        return ArgMem(int(arg[1:]))
+    elif arg[:2] == 'vm':
+        return ArgVSIB(arg[-1], int(arg[2:-1]))
+    else:
+        raise Exception("Unrecognised arg: %s" % arg)
+
+class InsnGenerator:
+    def __init__(self, op, args):
+        self.op = op
+        if op[-2:] in ["PS", "PD", "SS", "SD"]:
+            if op[-1] == 'S':
+                self.optype = 'F32'
+            else:
+                self.optype = 'F64'
+        else:
+            self.optype = 'I'
+
+        try:
+            self.args = list(ArgGenerator(a, op) for a in args)
+            if len(self.args) > 0 and self.args[-1] is None:
+                self.args = self.args[:-1]
+        except Exception as e:
+            raise Exception("Bad arg %s: %s" % (op, e))
+
+    def gen(self):
+        regs = (10, 11, 12)
+        dest = 9
+
+        nreg = len(self.args)
+        if nreg == 0:
+            yield self.op
+            return
+        if isinstance(self.args[-1], ArgImm8u):
+            nreg -= 1
+            immarg = self.args[-1]
+        else:
+            immarg = None
+        memarg = -1
+        for n, arg in enumerate(self.args):
+            if arg.ismem:
+                memarg = n
+
+        if (self.op.startswith("VGATHER") or self.op.startswith("VPGATHER")):
+            if "GATHERD" in self.op:
+                ireg = 13 << 2
+            else:
+                ireg = 14 << 2
+            regset = [
+                (dest, ireg | 0, regs[0]),
+                (dest, ireg | 1, regs[0]),
+                (dest, ireg | 2, regs[0]),
+                (dest, ireg | 3, regs[0]),
+                ]
+            if memarg >= 0:
+                raise Exception("vsib with memory: %s" % self.op)
+        elif nreg == 1:
+            regset = [(regs[0],)]
+            if memarg == 0:
+                regset += [(-1,)]
+        elif nreg == 2:
+            regset = [
+                (regs[0], regs[1]),
+                (regs[0], regs[0]),
+                ]
+            if memarg == 0:
+                regset += [(-1, regs[0])]
+            elif memarg == 1:
+                regset += [(dest, -1)]
+        elif nreg == 3:
+            regset = [
+                (dest, regs[0], regs[1]),
+                (dest, regs[0], regs[0]),
+                (regs[0], regs[0], regs[1]),
+                (regs[0], regs[1], regs[0]),
+                (regs[0], regs[0], regs[0]),
+                ]
+            if memarg == 2:
+                regset += [
+                    (dest, regs[0], -1),
+                    (regs[0], regs[0], -1),
+                    ]
+            elif memarg > 0:
+                raise Exception("Memarg %d" % memarg)
+        elif nreg == 4:
+            regset = [
+                (dest, regs[0], regs[1], regs[2]),
+                (dest, regs[0], regs[0], regs[1]),
+                (dest, regs[0], regs[1], regs[0]),
+                (dest, regs[1], regs[0], regs[0]),
+                (dest, regs[0], regs[0], regs[0]),
+                (regs[0], regs[0], regs[1], regs[2]),
+                (regs[0], regs[1], regs[0], regs[2]),
+                (regs[0], regs[1], regs[2], regs[0]),
+                (regs[0], regs[0], regs[0], regs[1]),
+                (regs[0], regs[0], regs[1], regs[0]),
+                (regs[0], regs[1], regs[0], regs[0]),
+                (regs[0], regs[0], regs[0], regs[0]),
+                ]
+            if memarg == 2:
+                regset += [
+                    (dest, regs[0], -1, regs[1]),
+                    (dest, regs[0], -1, regs[0]),
+                    (regs[0], regs[0], -1, regs[1]),
+                    (regs[0], regs[1], -1, regs[0]),
+                    (regs[0], regs[0], -1, regs[0]),
+                    ]
+            elif memarg > 0:
+                raise Exception("Memarg4 %d" % memarg)
+        else:
+            raise Exception("Too many regs: %s(%d)" % (self.op, nreg))
+
+        for regv in regset:
+            argstr = []
+            for i in range(nreg):
+                arg = self.args[i]
+                argstr.append(arg.regstr(regv[i]))
+            if immarg is None:
+                yield self.op + ' ' + ','.join(argstr)
+            else:
+                for immval in immarg.vals():
+                    yield self.op + ' ' + ','.join(argstr) + ',' + str(immval)
+
+def split0(s):
+    if s == '':
+        return []
+    return s.split(',')
+
+def main():
+    n = 0
+    if len(sys.argv) != 3:
+        print("Usage: test-avx.py x86.csv test-avx.h")
+        sys.exit(1)
+    csvfile = open(sys.argv[1], 'r', newline='')
+    with open(sys.argv[2], "w") as outf:
+        outf.write("// Generated by test-avx.py. Do not edit.\n")
+        for row in csv.reader(strip_comments(csvfile)):
+            insn = row[0].replace(',', '').split()
+            if insn[0] in ignore:
+                continue
+            cpuid = row[6]
+            if cpuid in archs:
+                g = InsnGenerator(insn[0], insn[1:])
+                for insn in g.gen():
+                    outf.write('TEST(%d, "%s", %s)\n' % (n, insn, g.optype))
+                    n += 1
+        outf.write("#undef TEST\n")
+        csvfile.close()
+
+if __name__ == "__main__":
+    main()
diff --git a/tests/tcg/i386/x86.csv b/tests/tcg/i386/x86.csv
new file mode 100644
index 0000000000..d5d0c17f1b
--- /dev/null
+++ b/tests/tcg/i386/x86.csv
@@ -0,0 +1,4658 @@
+# x86 instruction set description version 0.2x, 2018-05-08
+#
+# https://golang.org/x/arch/x86
+#
+# The latest version of the CSV file is
+# available online at https://golang.org/s/x86.csv.
+#
+# This file contains a block of comment lines, each beginning with #,
+# followed by entries in CSV format. All the # comments are at the top
+# of the file, so a reader can skip past the comments and hand the
+# rest of the file to a standard CSV reader.
+# Each CSV line contains these fields:
+#
+# 1. The Intel manual instruction mnemonic. For example, "SHR r/m32, imm8".
+#
+# 2. The Go assembler instruction mnemonic. For example, "SHRL imm8, r/m32".
+#
+# 3. The GNU binutils instruction mnemonic. For example, "shrl imm8, r/m32".
+#
+# 4. The instruction encoding. For example, "C1 /4 ib".
+#
+# 5. The validity of the instruction in 32-bit (aka compatibility, legacy) mode.
+#
+# 6. The validity of the instruction in 64-bit mode.
+#
+# 7. The CPUID feature flags that signal support for the instruction.
+#
+# 8. Additional comma-separated tags containing hints about the instruction.
+#
+# 9. The read/write actions of the instruction on the arguments used in
+# the Intel mnemonic. For example, "rw,r" to denote that "SHR r/m32, imm8"
+# reads and writes its first argument but only reads its second argument.
+#
+# 10. Whether the opcode used in the Intel mnemonic has encoding forms
+# distinguished only by operand size, like most arithmetic instructions.
+# The string "Y" indicates yes, the string "" indicates no.
+#
+# 11. The data size of the operation in bits. In general this is the size corresponding
+# to the Go and GNU assembler opcode suffix.
+# Mnemonics (the opcode string)
+#
+# The instruction mnemonics are as used in the Intel manual, with a few exceptions.
+#
+# Mnemonics claiming general memory forms but that really require fixed addressing modes
+# are omitted in favor of their equivalents with implicit arguments.
+# For example, "CMPS m16, m16" (really CMPS [SI], [DI]) is omitted in favor of "CMPSW".
+#
+# Instruction forms with an explicit REP, REPE, or REPNE prefix are also omitted.
+# Encoders and decoders are expected to handle those prefixes separately.
+#
+# Perhaps most significantly, the argument syntaxes used in the mnemonic indicate
+# exactly how to derive the argument from the instruction encoding, or vice versa.
+#
+# Immediate values: imm8, imm8u, imm16, imm16u, imm32, imm64.
+# Immediates are signed by default; the u suffix indicates an unsigned value.
+# Immediates may have a bitfield-like modifier that specifies how many bits
+# are used. For example, imm8u:4 is encoded like an 8-bit immediate,
+# but only 4 bits are meaningful while the others are ignored or must be 0.
+#
+# Memory operands. The forms m, m128, m14/28byte, m16, m16&16, m16&32, m16&64, m16:16, m16:32,
+# m16:64, m16int, m256, m2byte, m32, m32&32, m32fp, m32int, m512byte, m64, m64fp, m64int,
+# m8, m80bcd, m80dec, m80fp, m94/108byte. These operands always correspond to the
+# memory address specified by the r/m half of the modrm encoding.
+#
+# Integer registers.
+# The forms r8, r16, r32, r64 indicate a register selected by the modrm reg encoding.
+# The forms rmr16, rmr32, rmr64 indicate a register (never memory) selected by the modrm r/m encoding.
+# The forms r/m8, r/m16, r/m32, and r/m64 indicate a register or memory selected by the modrm r/m encoding.
+# Forms with two sizes, like r32/m16, also indicate a register or memory selected by the modrm r/m encoding,
+# but the size for a register argument differs from the size of a memory argument.
+# The forms r8V, r16V, r32V, r64V indicate a register selected by the VEX.vvvv bits.
+#
+# Multimedia registers.
+# The forms mm1, xmm1, and ymm1 indicate a multimedia register selected by the
+# modrm reg encoding.
+# The forms mm2, xmm2, and ymm2 indicate a register (never memory) selected by
+# the modrm r/m encoding.
+# The forms mm2/m64, xmm2/m128, and so on indicate a register or memory
+# selected by the modrm r/m encoding.
+# The forms xmmV and ymmV indicate a register selected by the VEX.vvvv bits.
+# The forms xmmI and ymmI indicate a register selected by the top four bits of an /is4 immediate byte.
+#
+# Bound registers.
+# The form bnd1 indicates a bound register selected by the modrm reg encoding.
+# The form bnd2 indicates a bound register (never memory) selected by the modrm r/m encoding.
+# The forms bnd2/m64 and bnd2/m128 indicate a register or memory selected by the modrm r/m encoding.
+# TODO: Describe mib.
+#
+# One-of-a-kind operands: rel8, rel16, rel32, ptr16:16, ptr16:32,
+# moffs8, moffs16, moffs32, moffs64, vm32x, vm32y, vm64x, and vm64y
+# are all as in the Intel manual.
+#
+# Encodings
+#
+# The encodings are also as used in the Intel manual, with automated corrections.
+# For example, the Intel manual sometimes omits the modrm /r indicator or other trailing bytes,
+# and it also contains typographical errors.
+# These problems are corrected so that the CSV data may be used to generate
+# tools for processing x86 machine code.
+# See https://golang.org/x/arch/x86/x86map for one such generator.
+#
+# Valid32 and Valid64
+#
+# These columns hold validity abbreviations as defined in the Intel manual:
+# V, I, N.E., N.P., N.S., or N.I.
+# Tools processing the data are typically only concerned with whether the
+# column is "V" (valid) or not.
+# This data is also corrected compared to the manual.
+# For example, the manual lists many instruction forms using REX bytes
+# with an incorrect "V" in the Valid32 column.
+#
+# CPUID Feature Flags
+#
+# This column specifies CPUID feature flags that must be present in order
+# to use the instruction. If multiple flags are required,
+# they are listed separated by plus signs, as in PCLMULQDQ+AVX.
+# The column can also list one of the values 486, Pentium, PentiumII, and P6,
+# indicating that the instruction was introduced on that architecture version.
+#
+# Tags
+#
+# The tag column does not correspond to a traditional column in the Intel manual tables.
+# Instead, it is itself a comma-separated list of tags or hints derived by analysis
+# of the instruction set or the instruction encodings.
+#
+# The tags address16, address32, and address64 indicate that the instruction form
+# applies when using the specified addressing size. It may therefore be necessary to use an
+# address size prefix byte to access the instruction.
+# If two address tags are listed, the instruction can be used with either of those
+# address sizes. An instruction will never list all three address sizes.
+# (In fact, today, no instruction lists two address sizes, but that may change.)
+#
+# The tags operand16, operand32, and operand64 indicate that the instruction form
+# applies when using the specified operand size. It may therefore be necessary to use an
+# operand size prefix byte to access the instruction.
+# If two operand tags are listed, the instruction can be used with either of those
+# operand sizes. An instruction will never list all three operand sizes.
+# For some instructions, default64 is used instead of operand64,
+# which specifies data promotion to 64-bit.
+# For instructions with different possible data sizes,
+# it also indicates that the default data size is 64-bit instead of 32-bit.
+# Using a refining prefix like 0x66 will lead to 32-bit operation (if supported).
+#
+# The tags modrm_regonly or modrm_memonly indicate that the modrm byte's
+# r/m encoding must specify a register or memory, respectively.
+# Especially in newer instructions, the modrm constraint may be the only way
+# to distinguish two instruction forms. For example the MOVHLPS and MOVLPS
+# instructions share the same encoding, except that the former requires the
+# modrm byte's r/m to indicate a register, while the latter requires it to indicate memory.
+#
+# The tags pseudo and pseudo64 indicate that this instruction form is redundant
+# with others listed in the table and should be ignored when generating disassembly
+# or instruction scanning programs. The pseudo64 tag is reserved for the case where
+# the manual lists an instruction twice, once with the optional 64-bit mode REX byte.
+# Since most decoders will handle the REX byte separately, the form with the
+# unnecessary REX is tagged pseudo64.
+#
+# The amd tag marks AMD-specific instructions.
+# As an example, all SSE4a instructions carry this tag.
+#
+# The AVX512-specific tags: scaleX and bscaleX.
+# scale1, scale2, scale4, scale8, scale16, scale32, scale64 specify
+# the compressed displacement multiplier (scaling).
+# For example, if the displacement is 128 and scale32 is set,
+# the disp8 value should be calculated as 128/32.
+# bscale4 and bscale8 have the same meaning, but are used
+# when the instruction uses the embedded broadcast feature.
+# If an instruction does not have a bscaleX tag, it does not support EVEX broadcasting.
+#
+# Related packages (can be a good source of additional documentation):
+#	x86csv - read and manipulate x86.csv
+#	x86spec - x86.csv generator
+#	x86map - x86asm table generator based on x86.csv
+#	x86avxgen - cmd/internal/obj/x86 optab generator based on x86.csv
+# All listed packages are located at golang.org/x/arch/x86/.
+"PUSH imm32","-/PUSHL/PUSHQ imm32","-/pushl/pushq imm32","68 id","V","N.S.","","operand32","r","Y",""
+"PUSH imm32","-/PUSHL/PUSHQ imm32","-/pushl/pushq imm32","68 id","N.S.","V","","default64","r","Y",""
+"AAA","AAA","aaa","37","V","N.S.","","","","",""
+"AAD","AAD","aad","D5 0A","V","I","","pseudo","","",""
+"AAD imm8u","AAD imm8u","aad imm8u","D5 ib","V","N.S.","","","r","",""
+"AAM","AAM","aam","D4 0A","V","I","","pseudo","","",""
+"AAM imm8u","AAM imm8u","aam imm8u","D4 ib","V","N.S.","","","r","",""
+"AAS","AAS","aas","3F","V","N.S.","","","","",""
+"ADC AL, imm8","ADCB imm8, AL","adcb imm8, AL","14 ib","V","V","","","rw,r","Y","8"
+"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","80 /2 ib","V","V","","","rw,r","Y","8"
+"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","82 /2 ib","V","N.S.","","","rw,r","Y","8"
+"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","REX 80 /2 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADC r8, r/m8","ADCB r/m8, r8","adcb r/m8, r8","12 /r","V","V","","","rw,r","Y","8"
+"ADC r8, r/m8","ADCB r/m8, r8","adcb r/m8, r8","REX 12 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADC r/m8, r8","ADCB r8, r/m8","adcb r8, r/m8","10 /r","V","V","","","rw,r","Y","8"
+"ADC r/m8, r8","ADCB r8, r/m8","adcb r8, r/m8","REX 10 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADC EAX, imm32","ADCL imm32, EAX","adcl imm32, EAX","15 id","V","V","","operand32","rw,r","Y","32"
+"ADC r/m32, imm32","ADCL imm32, r/m32","adcl imm32, r/m32","81 /2 id","V","V","","operand32","rw,r","Y","32"
+"ADC r/m32, imm8","ADCL imm8, r/m32","adcl imm8, r/m32","83 /2 ib","V","V","","operand32","rw,r","Y","32"
+"ADC r32, r/m32","ADCL r/m32, r32","adcl r/m32, r32","13 /r","V","V","","operand32","rw,r","Y","32"
+"ADC r/m32, r32","ADCL r32, r/m32","adcl r32, r/m32","11 /r","V","V","","operand32","rw,r","Y","32"
+"ADC RAX, imm32","ADCQ imm32, RAX","adcq imm32, RAX","REX.W 15 id","N.S.","V","","","rw,r","Y","64"
+"ADC r/m64, imm32","ADCQ imm32, r/m64","adcq imm32, r/m64","REX.W 81 /2 id","N.S.","V","","","rw,r","Y","64"
+"ADC r/m64, imm8","ADCQ imm8, r/m64","adcq imm8, r/m64","REX.W 83 /2 ib","N.S.","V","","","rw,r","Y","64"
+"ADC r64, r/m64","ADCQ r/m64, r64","adcq r/m64, r64","REX.W 13 /r","N.S.","V","","","rw,r","Y","64"
+"ADC r/m64, r64","ADCQ r64, r/m64","adcq r64, r/m64","REX.W 11 /r","N.S.","V","","","rw,r","Y","64"
+"ADC AX, imm16","ADCW imm16, AX","adcw imm16, AX","15 iw","V","V","","operand16","rw,r","Y","16"
+"ADC r/m16, imm16","ADCW imm16, r/m16","adcw imm16, r/m16","81 /2 iw","V","V","","operand16","rw,r","Y","16"
+"ADC r/m16, imm8","ADCW imm8, r/m16","adcw imm8, r/m16","83 /2 ib","V","V","","operand16","rw,r","Y","16"
+"ADC r16, r/m16","ADCW r/m16, r16","adcw r/m16, r16","13 /r","V","V","","operand16","rw,r","Y","16"
+"ADC r/m16, r16","ADCW r16, r/m16","adcw r16, r/m16","11 /r","V","V","","operand16","rw,r","Y","16"
+"ADCX r32, r/m32","ADCXL r/m32, r32","adcxl r/m32, r32","66 0F 38 F6 /r","V","V","ADX","operand16,operand32","rw,r","Y","32"
+"ADCX r64, r/m64","ADCXQ r/m64, r64","adcxq r/m64, r64","66 REX.W 0F 38 F6 /r","N.S.","V","ADX","","rw,r","Y","64"
+"ADD AL, imm8","ADDB imm8, AL","addb imm8, AL","04 ib","V","V","","","rw,r","Y","8"
+"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","80 /0 ib","V","V","","","rw,r","Y","8"
+"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","82 /0 ib","V","N.S.","","","rw,r","Y","8"
+"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","REX 80 /0 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADD r8, r/m8","ADDB r/m8, r8","addb r/m8, r8","02 /r","V","V","","","rw,r","Y","8"
+"ADD r8, r/m8","ADDB r/m8, r8","addb r/m8, r8","REX 02 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADD r/m8, r8","ADDB r8, r/m8","addb r8, r/m8","00 /r","V","V","","","rw,r","Y","8"
+"ADD r/m8, r8","ADDB r8, r/m8","addb r8, r/m8","REX 00 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADD EAX, imm32","ADDL imm32, EAX","addl imm32, EAX","05 id","V","V","","operand32","rw,r","Y","32"
+"ADD r/m32, imm32","ADDL imm32, r/m32","addl imm32, r/m32","81 /0 id","V","V","","operand32","rw,r","Y","32"
+"ADD r/m32, imm8","ADDL imm8, r/m32","addl imm8, r/m32","83 /0 ib","V","V","","operand32","rw,r","Y","32"
+"ADD r32, r/m32","ADDL r/m32, r32","addl r/m32, r32","03 /r","V","V","","operand32","rw,r","Y","32"
+"ADD r/m32, r32","ADDL r32, r/m32","addl r32, r/m32","01 /r","V","V","","operand32","rw,r","Y","32"
+"ADDPD xmm1, xmm2/m128","ADDPD xmm2/m128, xmm1","addpd xmm2/m128, xmm1","66 0F 58 /r","V","V","SSE2","","rw,r","",""
+"ADDPS xmm1, xmm2/m128","ADDPS xmm2/m128, xmm1","addps xmm2/m128, xmm1","0F 58 /r","V","V","SSE","","rw,r","",""
+"ADD RAX, imm32","ADDQ imm32, RAX","addq imm32, RAX","REX.W 05 id","N.S.","V","","","rw,r","Y","64"
+"ADD r/m64, imm32","ADDQ imm32, r/m64","addq imm32, r/m64","REX.W 81 /0 id","N.S.","V","","","rw,r","Y","64"
+"ADD r/m64, imm8","ADDQ imm8, r/m64","addq imm8, r/m64","REX.W 83 /0 ib","N.S.","V","","","rw,r","Y","64"
+"ADD r64, r/m64","ADDQ r/m64, r64","addq r/m64, r64","REX.W 03 /r","N.S.","V","","","rw,r","Y","64"
+"ADD r/m64, r64","ADDQ r64, r/m64","addq r64, r/m64","REX.W 01 /r","N.S.","V","","","rw,r","Y","64"
+"ADDSD xmm1, xmm2/m64","ADDSD xmm2/m64, xmm1","addsd xmm2/m64, xmm1","F2 0F 58 /r","V","V","SSE2","","rw,r","",""
+"ADDSS xmm1, xmm2/m32","ADDSS xmm2/m32, xmm1","addss xmm2/m32, xmm1","F3 0F 58 /r","V","V","SSE","","rw,r","",""
+"ADDSUBPD xmm1, xmm2/m128","ADDSUBPD xmm2/m128, xmm1","addsubpd xmm2/m128, xmm1","66 0F D0 /r","V","V","SSE3","","rw,r","",""
+"ADDSUBPS xmm1, xmm2/m128","ADDSUBPS xmm2/m128, xmm1","addsubps xmm2/m128, xmm1","F2 0F D0 /r","V","V","SSE3","","rw,r","",""
+"ADD AX, imm16","ADDW imm16, AX","addw imm16, AX","05 iw","V","V","","operand16","rw,r","Y","16"
+"ADD r/m16, imm16","ADDW imm16, r/m16","addw imm16, r/m16","81 /0 iw","V","V","","operand16","rw,r","Y","16"
+"ADD r/m16, imm8","ADDW imm8, r/m16","addw imm8, r/m16","83 /0 ib","V","V","","operand16","rw,r","Y","16"
+"ADD r16, r/m16","ADDW r/m16, r16","addw r/m16, r16","03 /r","V","V","","operand16","rw,r","Y","16"
+"ADD r/m16, r16","ADDW r16, r/m16","addw r16, r/m16","01 /r","V","V","","operand16","rw,r","Y","16"
+"ADOX r32, r/m32","ADOXL r/m32, r32","adoxl r/m32, r32","F3 0F 38 F6 /r","V","V","ADX","operand16,operand32","rw,r","Y","32"
+"ADOX r64, r/m64","ADOXQ r/m64, r64","adoxq r/m64, r64","F3 REX.W 0F 38 F6 /r","N.S.","V","ADX","","rw,r","Y","64"
+"AESDEC xmm1, xmm2/m128","AESDEC xmm2/m128, xmm1","aesdec xmm2/m128, xmm1","66 0F 38 DE /r","V","V","AES","","rw,r","",""
+"AESDECLAST xmm1, xmm2/m128","AESDECLAST xmm2/m128, xmm1","aesdeclast xmm2/m128, xmm1","66 0F 38 DF /r","V","V","AES","","rw,r","",""
+"AESENC xmm1, xmm2/m128","AESENC xmm2/m128, xmm1","aesenc xmm2/m128, xmm1","66 0F 38 DC /r","V","V","AES","","rw,r","",""
+"AESENCLAST xmm1, xmm2/m128","AESENCLAST xmm2/m128, xmm1","aesenclast xmm2/m128, xmm1","66 0F 38 DD /r","V","V","AES","","rw,r","",""
+"AESIMC xmm1, xmm2/m128","AESIMC xmm2/m128, xmm1","aesimc xmm2/m128, xmm1","66 0F 38 DB /r","V","V","AES","","w,r","",""
+"AESKEYGENASSIST xmm1, xmm2/m128, imm8u","AESKEYGENASSIST imm8u, xmm2/m128, xmm1","aeskeygenassist imm8u, xmm2/m128, xmm1","66 0F 3A DF /r ib","V","V","AES","","w,r,r","",""
+"AND AL, imm8","ANDB imm8, AL","andb imm8, AL","24 ib","V","V","","","rw,r","Y","8"
+"AND r/m8, imm8","ANDB imm8, r/m8","andb imm8, r/m8","REX 80 /4 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"AND r/m8, imm8u","ANDB imm8u, r/m8","andb imm8u, r/m8","80 /4 ib","V","V","","","rw,r","Y","8"
+"AND r/m8, imm8u","ANDB imm8u, r/m8","andb imm8u, r/m8","82 /4 ib","V","N.S.","","","rw,r","Y","8"
+"AND r8, r/m8","ANDB r/m8, r8","andb r/m8, r8","22 /r","V","V","","","rw,r","Y","8"
+"AND r8, r/m8","ANDB r/m8, r8","andb r/m8, r8","REX 22 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"AND r/m8, r8","ANDB r8, r/m8","andb r8, r/m8","20 /r","V","V","","","rw,r","Y","8"
+"AND r/m8, r8","ANDB r8, r/m8","andb r8, r/m8","REX 20 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"AND EAX, imm32","ANDL imm32, EAX","andl imm32, EAX","25 id","V","V","","operand32","rw,r","Y","32"
+"AND r/m32, imm32","ANDL imm32, r/m32","andl imm32, r/m32","81 /4 id","V","V","","operand32","rw,r","Y","32"
+"AND r/m32, imm8","ANDL imm8, r/m32","andl imm8, r/m32","83 /4 ib","V","V","","operand32","rw,r","Y","32"
+"AND r32, r/m32","ANDL r/m32, r32","andl r/m32, r32","23 /r","V","V","","operand32","rw,r","Y","32"
+"AND r/m32, r32","ANDL r32, r/m32","andl r32, r/m32","21 /r","V","V","","operand32","rw,r","Y","32"
+"ANDN r32, r32V, r/m32","ANDNL r/m32, r32V, r32","andnl r/m32, r32V, r32","VEX.DDS.128.0F38.W0 F2 /r","V","V","BMI1","","rw,r,r","Y","32"
+"ANDNPD xmm1, xmm2/m128","ANDNPD xmm2/m128, xmm1","andnpd xmm2/m128, xmm1","66 0F 55 /r","V","V","SSE2","","rw,r","",""
+"ANDNPS xmm1, xmm2/m128","ANDNPS xmm2/m128, xmm1","andnps xmm2/m128, xmm1","0F 55 /r","V","V","SSE","","rw,r","",""
+"ANDN r64, r64V, r/m64","ANDNQ r/m64, r64V, r64","andnq r/m64, r64V, r64","VEX.DDS.128.0F38.W1 F2 /r","N.S.","V","BMI1","","rw,r,r","Y","64"
+"ANDPD xmm1, xmm2/m128","ANDPD xmm2/m128, xmm1","andpd xmm2/m128, xmm1","66 0F 54 /r","V","V","SSE2","","rw,r","",""
+"ANDPS xmm1, xmm2/m128","ANDPS xmm2/m128, xmm1","andps xmm2/m128, xmm1","0F 54 /r","V","V","SSE","","rw,r","",""
+"AND RAX, imm32","ANDQ imm32, RAX","andq imm32, RAX","REX.W 25 id","N.S.","V","","","rw,r","Y","64"
+"AND r/m64, imm32","ANDQ imm32, r/m64","andq imm32, r/m64","REX.W 81 /4 id","N.S.","V","","","rw,r","Y","64"
+"AND r/m64, imm8","ANDQ imm8, r/m64","andq imm8, r/m64","REX.W 83 /4 ib","N.S.","V","","","rw,r","Y","64"
+"AND r64, r/m64","ANDQ r/m64, r64","andq r/m64, r64","REX.W 23 /r","N.S.","V","","","rw,r","Y","64"
+"AND r/m64, r64","ANDQ r64, r/m64","andq r64, r/m64","REX.W 21 /r","N.S.","V","","","rw,r","Y","64"
+"AND AX, imm16","ANDW imm16, AX","andw imm16, AX","25 iw","V","V","","operand16","rw,r","Y","16"
+"AND r/m16, imm16","ANDW imm16, r/m16","andw imm16, r/m16","81 /4 iw","V","V","","operand16","rw,r","Y","16"
+"AND r/m16, imm8","ANDW imm8, r/m16","andw imm8, r/m16","83 /4 ib","V","V","","operand16","rw,r","Y","16"
+"AND r16, r/m16","ANDW r/m16, r16","andw r/m16, r16","23 /r","V","V","","operand16","rw,r","Y","16"
+"AND r/m16, r16","ANDW r16, r/m16","andw r16, r/m16","21 /r","V","V","","operand16","rw,r","Y","16"
+"ARPL r/m16, r16","ARPL r16, r/m16","arpl r16, r/m16","63 /r","V","N.S.","","","rw,r","",""
+"BEXTR r32, r/m32, r32V","BEXTRL r32V, r/m32, r32","bextrl r32V, r/m32, r32","VEX.NDS.128.0F38.W0 F7 /r","V","V","BMI1","","w,r,r","Y","32"
+"BEXTR r64, r/m64, r64V","BEXTRQ r64V, r/m64, r64","bextrq r64V, r/m64, r64","VEX.NDS.128.0F38.W1 F7 /r","N.S.","V","BMI1","","w,r,r","Y","64"
+"BEXTR_XOP r32, r/m32, imm32u","BEXTR_XOPL imm32u, r/m32, r32","bextr_xopl imm32u, r/m32, r32","XOP.128.0A.WIG 10 /r","V","V","TBM","amd,operand16,operand32","w,r,r","Y","32"
+"BEXTR_XOP r64, r/m64, imm32u","BEXTR_XOPQ imm32u, r/m64, r64","bextr_xopq imm32u, r/m64, r64","XOP.128.0A.WIG 10 /r","N.S.","V","TBM","amd,operand64","w,r,r","Y","64"
+"BLCFILL r32V, r/m32","BLCFILLL r/m32, r32V","blcfill r/m32, r32V","XOP.NDD.128.09.WIG 01 /1","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLCFILL r64V, r/m64","BLCFILLQ r/m64, r64V","blcfill r/m64, r64V","XOP.NDD.128.09.W1 01 /1","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLCIC r32V, r/m32","BLCICL r/m32, r32V","blcicl r/m32, r32V","XOP.NDD.128.09.WIG 01 /5","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLCIC r64V, r/m64","BLCICQ r/m64, r64V","blcicq r/m64, r64V","XOP.NDD.128.09.WIG 01 /5","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLCI r32V, r/m32","BLCIL r/m32, r32V","blcil r/m32, r32V","XOP.NDD.128.09.WIG 02 /6","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLCI r64V, r/m64","BLCIQ r/m64, r64V","blciq r/m64, r64V","XOP.NDD.128.09.WIG 02 /6","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLCMSK r32V, r/m32","BLCMSKL r/m32, r32V","blcmskl r/m32, r32V","XOP.NDD.128.09.WIG 02 /1","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLCMSK r64V, r/m64","BLCMSKQ r/m64, r64V","blcmskq r/m64, r64V","XOP.NDD.128.09.WIG 02 /1","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLCS r32V, r/m32","BLCSL r/m32, r32V","blcsl r/m32, r32V","XOP.NDD.128.09.WIG 01 /3","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLCS r64V, r/m64","BLCSQ r/m64, r64V","blcsq r/m64, r64V","XOP.NDD.128.09.WIG 01 /3","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLENDPD xmm1, xmm2/m128, imm8u","BLENDPD imm8u, xmm2/m128, xmm1","blendpd imm8u, xmm2/m128, xmm1","66 0F 3A 0D /r ib","V","V","SSE4_1","","rw,r,r","",""
+"BLENDPS xmm1, xmm2/m128, imm8u","BLENDPS imm8u, xmm2/m128, xmm1","blendps imm8u, xmm2/m128, xmm1","66 0F 3A 0C /r ib","V","V","SSE4_1","","rw,r,r","",""
+"BLENDVPD xmm1, xmm2/m128, <XMM0>","BLENDVPD <XMM0>, xmm2/m128, xmm1","blendvpd <XMM0>, xmm2/m128, xmm1","66 0F 38 15 /r","V","V","SSE4_1","","rw,r,r","",""
+"BLENDVPS xmm1, xmm2/m128, <XMM0>","BLENDVPS <XMM0>, xmm2/m128, xmm1","blendvps <XMM0>, xmm2/m128, xmm1","66 0F 38 14 /r","V","V","SSE4_1","","rw,r,r","",""
+"BLSFILL r32V, r/m32","BLSFILLL r/m32, r32V","blsfill r/m32, r32V","XOP.NDD.128.09.WIG 01 /2","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLSFILL r64V, r/m64","BLSFILLQ r/m64, r64V","blsfill r/m64, r64V","XOP.NDD.128.09.W1 01 /2","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLSIC r32V, r/m32","BLSICL r/m32, r32V","blsicl r/m32, r32V","XOP.NDD.128.09.WIG 01 /6","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLSIC r64V, r/m64","BLSICQ r/m64, r64V","blsicq r/m64, r64V","XOP.NDD.128.09.WIG 01 /6","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLSI r32V, r/m32","BLSIL r/m32, r32V","blsil r/m32, r32V","VEX.NDD.128.0F38.W0 F3 /3","V","V","BMI1","","w,r","Y","32"
+"BLSI r64V, r/m64","BLSIQ r/m64, r64V","blsiq r/m64, r64V","VEX.NDD.128.0F38.W1 F3 /3","N.S.","V","BMI1","","w,r","Y","64"
+"BLSMSK r32V, r/m32","BLSMSKL r/m32, r32V","blsmskl r/m32, r32V","VEX.NDD.128.0F38.W0 F3 /2","V","V","BMI1","","w,r","Y","32"
+"BLSMSK r64V, r/m64","BLSMSKQ r/m64, r64V","blsmskq r/m64, r64V","VEX.NDD.128.0F38.W1 F3 /2","N.S.","V","BMI1","","w,r","Y","64"
+"BLSR r32V, r/m32","BLSRL r/m32, r32V","blsrl r/m32, r32V","VEX.NDD.128.0F38.W0 F3 /1","V","V","BMI1","","w,r","Y","32"
+"BLSR r64V, r/m64","BLSRQ r/m64, r64V","blsrq r/m64, r64V","VEX.NDD.128.0F38.W1 F3 /1","N.S.","V","BMI1","","w,r","Y","64"
+"BNDCL bnd1, r/m32","BNDCL r/m32, bnd1","bndcl r/m32, bnd1","F3 0F 1A /r","V","N.S.","MPX","","r,r","",""
+"BNDCL bnd1, r/m64","BNDCL r/m64, bnd1","bndcl r/m64, bnd1","F3 0F 1A /r","N.S.","V","MPX","","r,r","",""
+"BNDCN bnd1, r/m32","BNDCN r/m32, bnd1","bndcn r/m32, bnd1","F2 0F 1B /r","V","N.S.","MPX","","r,r","",""
+"BNDCN bnd1, r/m64","BNDCN r/m64, bnd1","bndcn r/m64, bnd1","F2 0F 1B /r","N.S.","V","MPX","","r,r","",""
+"BNDCU bnd1, r/m32","BNDCU r/m32, bnd1","bndcu r/m32, bnd1","F2 0F 1A /r","V","N.S.","MPX","","r,r","",""
+"BNDCU bnd1, r/m64","BNDCU r/m64, bnd1","bndcu r/m64, bnd1","F2 0F 1A /r","N.S.","V","MPX","","r,r","",""
+"BNDLDX bnd1, mib","BNDLDX mib, bnd1","bndldx mib, bnd1","0F 1A /r","V","V","MPX","modrm_memonly","w,r","",""
+"BNDMK bnd1, m32","BNDMK m32, bnd1","bndmk m32, bnd1","F3 0F 1B /r","V","N.S.","MPX","modrm_memonly","w,r","",""
+"BNDMK bnd1, m64","BNDMK m64, bnd1","bndmk m64, bnd1","F3 0F 1B /r","N.S.","V","MPX","modrm_memonly","w,r","",""
+"BNDMOV bnd2/m128, bnd1","BNDMOV bnd1, bnd2/m128","bndmov bnd1, bnd2/m128","66 0F 1B /r","N.S.","V","MPX","","w,r","",""
+"BNDMOV bnd2/m64, bnd1","BNDMOV bnd1, bnd2/m64","bndmov bnd1, bnd2/m64","66 0F 1B /r","V","N.S.","MPX","","w,r","",""
+"BNDMOV bnd1, bnd2/m128","BNDMOV bnd2/m128, bnd1","bndmov bnd2/m128, bnd1","66 0F 1A /r","N.S.","V","MPX","","w,r","",""
+"BNDMOV bnd1, bnd2/m64","BNDMOV bnd2/m64, bnd1","bndmov bnd2/m64, bnd1","66 0F 1A /r","V","N.S.","MPX","","w,r","",""
+"BNDSTX mib, bnd1","BNDSTX bnd1, mib","bndstx bnd1, mib","0F 1B /r","V","V","MPX","modrm_memonly","w,r","",""
+"BOUND r32, m32&32","BOUNDL m32&32, r32","boundl r32, m32&32","62 /r","V","N.S.","","modrm_memonly,operand32","r,r","Y","32"
+"BOUND r16, m16&16","BOUNDW m16&16, r16","boundw r16, m16&16","62 /r","V","N.S.","","modrm_memonly,operand16","r,r","Y","16"
+"BSF r32, r/m32","BSFL r/m32, r32","bsfl r/m32, r32","0F BC /r","V","V","","operand32","rw,r","Y","32"
+"BSF r32, r/m32","BSFL r/m32, r32","bsfl r/m32, r32","F3 0F BC /r","V","V","","operand32","rw,r","Y","32"
+"BSF r64, r/m64","BSFQ r/m64, r64","bsfq r/m64, r64","F3 REX.W 0F BC /r","N.S.","V","","","rw,r","Y","64"
+"BSF r64, r/m64","BSFQ r/m64, r64","bsfq r/m64, r64","REX.W 0F BC /r","N.S.","V","","","rw,r","Y","64"
+"BSF r16, r/m16","BSFW r/m16, r16","bsfw r/m16, r16","0F BC /r","V","V","","operand16","rw,r","Y","16"
+"BSF r16, r/m16","BSFW r/m16, r16","bsfw r/m16, r16","F3 0F BC /r","V","V","","operand16","rw,r","Y","16"
+"BSR r32, r/m32","BSRL r/m32, r32","bsrl r/m32, r32","0F BD /r","V","V","","operand32","rw,r","Y","32"
+"BSR r32, r/m32","BSRL r/m32, r32","bsrl r/m32, r32","F3 0F BD /r","V","V","","operand32","rw,r","Y","32"
+"BSR r64, r/m64","BSRQ r/m64, r64","bsrq r/m64, r64","F3 REX.W 0F BD /r","N.S.","V","","","rw,r","Y","64"
+"BSR r64, r/m64","BSRQ r/m64, r64","bsrq r/m64, r64","REX.W 0F BD /r","N.S.","V","","","rw,r","Y","64"
+"BSR r16, r/m16","BSRW r/m16, r16","bsrw r/m16, r16","0F BD /r","V","V","","operand16","rw,r","Y","16"
+"BSR r16, r/m16","BSRW r/m16, r16","bsrw r/m16, r16","F3 0F BD /r","V","V","","operand16","rw,r","Y","16"
+"BSWAP r32op","BSWAPL r32op","bswap r32op","0F C8+rd","V","V","486","operand32","rw","Y","32"
+"BSWAP r64op","BSWAPQ r64op","bswap r64op","REX.W 0F C8+ro","N.S.","V","486","","rw","Y","64"
+"BSWAP r16op","BSWAPW r16op","bswap r16op","0F C8+rw","V","V","486","operand16","rw","Y","16"
+"BTC r/m32, imm8u","BTCL imm8u, r/m32","btcl imm8u, r/m32","0F BA /7 ib","V","V","","operand32","rw,r","Y","32"
+"BTC r/m32, r32","BTCL r32, r/m32","btcl r32, r/m32","0F BB /r","V","V","","operand32","rw,r","Y","32"
+"BTC r/m64, imm8u","BTCQ imm8u, r/m64","btcq imm8u, r/m64","REX.W 0F BA /7 ib","N.S.","V","","","rw,r","Y","64"
+"BTC r/m64, r64","BTCQ r64, r/m64","btcq r64, r/m64","REX.W 0F BB /r","N.S.","V","","","rw,r","Y","64"
+"BTC r/m16, imm8u","BTCW imm8u, r/m16","btcw imm8u, r/m16","0F BA /7 ib","V","V","","operand16","rw,r","Y","16"
+"BTC r/m16, r16","BTCW r16, r/m16","btcw r16, r/m16","0F BB /r","V","V","","operand16","rw,r","Y","16"
+"BT r/m32, imm8u","BTL imm8u, r/m32","btl imm8u, r/m32","0F BA /4 ib","V","V","","operand32","r,r","Y","32"
+"BT r/m32, r32","BTL r32, r/m32","btl r32, r/m32","0F A3 /r","V","V","","operand32","r,r","Y","32"
+"BT r/m64, imm8u","BTQ imm8u, r/m64","btq imm8u, r/m64","REX.W 0F BA /4 ib","N.S.","V","","","r,r","Y","64"
+"BT r/m64, r64","BTQ r64, r/m64","btq r64, r/m64","REX.W 0F A3 /r","N.S.","V","","","r,r","Y","64"
+"BTR r/m32, imm8u","BTRL imm8u, r/m32","btrl imm8u, r/m32","0F BA /6 ib","V","V","","operand32","rw,r","Y","32"
+"BTR r/m32, r32","BTRL r32, r/m32","btrl r32, r/m32","0F B3 /r","V","V","","operand32","rw,r","Y","32"
+"BTR r/m64, imm8u","BTRQ imm8u, r/m64","btrq imm8u, r/m64","REX.W 0F BA /6 ib","N.S.","V","","","rw,r","Y","64"
+"BTR r/m64, r64","BTRQ r64, r/m64","btrq r64, r/m64","REX.W 0F B3 /r","N.S.","V","","","rw,r","Y","64"
+"BTR r/m16, imm8u","BTRW imm8u, r/m16","btrw imm8u, r/m16","0F BA /6 ib","V","V","","operand16","rw,r","Y","16"
+"BTR r/m16, r16","BTRW r16, r/m16","btrw r16, r/m16","0F B3 /r","V","V","","operand16","rw,r","Y","16"
+"BTS r/m32, imm8u","BTSL imm8u, r/m32","btsl imm8u, r/m32","0F BA /5 ib","V","V","","operand32","rw,r","Y","32"
+"BTS r/m32, r32","BTSL r32, r/m32","btsl r32, r/m32","0F AB /r","V","V","","operand32","rw,r","Y","32"
+"BTS r/m64, imm8u","BTSQ imm8u, r/m64","btsq imm8u, r/m64","REX.W 0F BA /5 ib","N.S.","V","","","rw,r","Y","64"
+"BTS r/m64, r64","BTSQ r64, r/m64","btsq r64, r/m64","REX.W 0F AB /r","N.S.","V","","","rw,r","Y","64"
+"BTS r/m16, imm8u","BTSW imm8u, r/m16","btsw imm8u, r/m16","0F BA /5 ib","V","V","","operand16","rw,r","Y","16"
+"BTS r/m16, r16","BTSW r16, r/m16","btsw r16, r/m16","0F AB /r","V","V","","operand16","rw,r","Y","16"
+"BT r/m16, imm8u","BTW imm8u, r/m16","btw imm8u, r/m16","0F BA /4 ib","V","V","","operand16","r,r","Y","16"
+"BT r/m16, r16","BTW r16, r/m16","btw r16, r/m16","0F A3 /r","V","V","","operand16","r,r","Y","16"
+"BZHI r32, r/m32, r32V","BZHIL r32V, r/m32, r32","bzhil r32V, r/m32, r32","VEX.NDS.128.0F38.W0 F5 /r","V","V","BMI2","","w,r,r","Y","32"
+"BZHI r64, r/m64, r64V","BZHIQ r64V, r/m64, r64","bzhiq r64V, r/m64, r64","VEX.NDS.128.0F38.W1 F5 /r","N.S.","V","BMI2","","w,r,r","Y","64"
+"CALL rel16","CALL rel16","call rel16","E8 cw","V","N.S.","","operand16","r","Y",""
+"CALL rel32","CALL rel32","call rel32","E8 cd","V","N.S.","","operand32","r","Y",""
+"CALL rel32","CALL rel32","call rel32","E8 cd","N.S.","V","","default64","r","Y",""
+"CALL r/m32","CALLL* r/m32","calll* r/m32","FF /2","V","N.S.","","operand32","r","Y","32"
+"CALL r/m64","CALLQ* r/m64","callq* r/m64","FF /2","N.S.","V","","default64","r","Y","64"
+"CALL r/m16","CALLW* r/m16","callw* r/m16","FF /2","V","N.S.","","operand16","r","Y","16"
+"CBW","CBW","cbtw","98","V","V","","operand16","","",""
+"CDQ","CDQ","cltd","99","V","V","","operand32","","",""
+"CDQE","CDQE","cltq","REX.W 98","N.S.","V","","","","",""
+"CLAC","CLAC","clac","0F 01 CA","V","V","","","","",""
+"CLC","CLC","clc","F8","V","V","","","","",""
+"CLD","CLD","cld","FC","V","V","","","","",""
+"CLFLUSH m8","CLFLUSH m8","clflush m8","0F AE /7","V","V","","modrm_memonly","r","",""
+"CLFLUSHOPT m8","CLFLUSHOPT m8","clflushopt m8","66 0F AE /7","V","V","","modrm_memonly","r","",""
+"CLGI","CLGI","clgi","0F 01 DD","V","V","SVM","amd","","",""
+"CLI","CLI","cli","FA","V","V","","","","",""
+"CLRSSBSY m64","CLRSSBSY m64","clrssbsy m64","F3 0F AE /6","V","V","CET","modrm_memonly","w","",""
+"CLTS","CLTS","clts","0F 06","V","V","","","","",""
+"CLWB m8","CLWB m8","clwb m8","66 0F AE /6","V","V","CLWB","modrm_memonly","r","",""
+"CLZERO EAX","CLZEROL EAX","clzerol EAX","0F 01 FC","V","V","CLZERO","amd,modrm_regonly,operand32","r","Y","32"
+"CLZERO RAX","CLZEROQ RAX","clzeroq RAX","REX.W 0F 01 FC","N.S.","V","CLZERO","amd,modrm_regonly","r","Y","64"
+"CLZERO AX","CLZEROW AX","clzerow AX","0F 01 FC","V","V","CLZERO","amd,modrm_regonly,operand16","r","Y","16"
+"CMC","CMC","cmc","F5","V","V","","","","",""
+"CMOVC r16, r/m16","CMOVC r/m16, r16","cmovc r/m16, r16","0F 42 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVC r32, r/m32","CMOVC r/m32, r32","cmovc r/m32, r32","0F 42 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVC r64, r/m64","CMOVC r/m64, r64","cmovc r/m64, r64","REX.W 0F 42 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVAE r32, r/m32","CMOVLCC r/m32, r32","cmovael r/m32, r32","0F 43 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVB r32, r/m32","CMOVLCS r/m32, r32","cmovbl r/m32, r32","0F 42 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVE r32, r/m32","CMOVLEQ r/m32, r32","cmovel r/m32, r32","0F 44 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVGE r32, r/m32","CMOVLGE r/m32, r32","cmovgel r/m32, r32","0F 4D /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVG r32, r/m32","CMOVLGT r/m32, r32","cmovgl r/m32, r32","0F 4F /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVA r32, r/m32","CMOVLHI r/m32, r32","cmoval r/m32, r32","0F 47 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVLE r32, r/m32","CMOVLLE r/m32, r32","cmovlel r/m32, r32","0F 4E /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVBE r32, r/m32","CMOVLLS r/m32, r32","cmovbel r/m32, r32","0F 46 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVL r32, r/m32","CMOVLLT r/m32, r32","cmovll r/m32, r32","0F 4C /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVS r32, r/m32","CMOVLMI r/m32, r32","cmovsl r/m32, r32","0F 48 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVNE r32, r/m32","CMOVLNE r/m32, r32","cmovnel r/m32, r32","0F 45 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVNO r32, r/m32","CMOVLOC r/m32, r32","cmovnol r/m32, r32","0F 41 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVO r32, r/m32","CMOVLOS r/m32, r32","cmovol r/m32, r32","0F 40 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVNP r32, r/m32","CMOVLPC r/m32, r32","cmovnpl r/m32, r32","0F 4B /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVNS r32, r/m32","CMOVLPL r/m32, r32","cmovnsl r/m32, r32","0F 49 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVP r32, r/m32","CMOVLPS r/m32, r32","cmovpl r/m32, r32","0F 4A /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVNA r16, r/m16","CMOVNA r/m16, r16","cmovna r/m16, r16","0F 46 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNA r32, r/m32","CMOVNA r/m32, r32","cmovna r/m32, r32","0F 46 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNA r64, r/m64","CMOVNA r/m64, r64","cmovna r/m64, r64","REX.W 0F 46 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNAE r16, r/m16","CMOVNAE r/m16, r16","cmovnae r/m16, r16","0F 42 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNAE r32, r/m32","CMOVNAE r/m32, r32","cmovnae r/m32, r32","0F 42 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNAE r64, r/m64","CMOVNAE r/m64, r64","cmovnae r/m64, r64","REX.W 0F 42 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNB r16, r/m16","CMOVNB r/m16, r16","cmovnb r/m16, r16","0F 43 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNB r32, r/m32","CMOVNB r/m32, r32","cmovnb r/m32, r32","0F 43 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNB r64, r/m64","CMOVNB r/m64, r64","cmovnb r/m64, r64","REX.W 0F 43 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNBE r16, r/m16","CMOVNBE r/m16, r16","cmovnbe r/m16, r16","0F 47 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNBE r32, r/m32","CMOVNBE r/m32, r32","cmovnbe r/m32, r32","0F 47 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNBE r64, r/m64","CMOVNBE r/m64, r64","cmovnbe r/m64, r64","REX.W 0F 47 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNC r16, r/m16","CMOVNC r/m16, r16","cmovnc r/m16, r16","0F 43 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNC r32, r/m32","CMOVNC r/m32, r32","cmovnc r/m32, r32","0F 43 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNC r64, r/m64","CMOVNC r/m64, r64","cmovnc r/m64, r64","REX.W 0F 43 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNG r16, r/m16","CMOVNG r/m16, r16","cmovng r/m16, r16","0F 4E /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNG r32, r/m32","CMOVNG r/m32, r32","cmovng r/m32, r32","0F 4E /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNG r64, r/m64","CMOVNG r/m64, r64","cmovng r/m64, r64","REX.W 0F 4E /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNGE r16, r/m16","CMOVNGE r/m16, r16","cmovnge r/m16, r16","0F 4C /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNGE r32, r/m32","CMOVNGE r/m32, r32","cmovnge r/m32, r32","0F 4C /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNGE r64, r/m64","CMOVNGE r/m64, r64","cmovnge r/m64, r64","REX.W 0F 4C /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNL r16, r/m16","CMOVNL r/m16, r16","cmovnl r/m16, r16","0F 4D /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNL r32, r/m32","CMOVNL r/m32, r32","cmovnl r/m32, r32","0F 4D /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNL r64, r/m64","CMOVNL r/m64, r64","cmovnl r/m64, r64","REX.W 0F 4D /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNLE r16, r/m16","CMOVNLE r/m16, r16","cmovnle r/m16, r16","0F 4F /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNLE r32, r/m32","CMOVNLE r/m32, r32","cmovnle r/m32, r32","0F 4F /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNLE r64, r/m64","CMOVNLE r/m64, r64","cmovnle r/m64, r64","REX.W 0F 4F /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNZ r16, r/m16","CMOVNZ r/m16, r16","cmovnz r/m16, r16","0F 45 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNZ r32, r/m32","CMOVNZ r/m32, r32","cmovnz r/m32, r32","0F 45 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNZ r64, r/m64","CMOVNZ r/m64, r64","cmovnz r/m64, r64","REX.W 0F 45 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVPE r16, r/m16","CMOVPE r/m16, r16","cmovpe r/m16, r16","0F 4A /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVPE r32, r/m32","CMOVPE r/m32, r32","cmovpe r/m32, r32","0F 4A /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVPE r64, r/m64","CMOVPE r/m64, r64","cmovpe r/m64, r64","REX.W 0F 4A /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVPO r16, r/m16","CMOVPO r/m16, r16","cmovpo r/m16, r16","0F 4B /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVPO r32, r/m32","CMOVPO r/m32, r32","cmovpo r/m32, r32","0F 4B /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVPO r64, r/m64","CMOVPO r/m64, r64","cmovpo r/m64, r64","REX.W 0F 4B /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVAE r64, r/m64","CMOVQCC r/m64, r64","cmovaeq r/m64, r64","REX.W 0F 43 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVB r64, r/m64","CMOVQCS r/m64, r64","cmovbq r/m64, r64","REX.W 0F 42 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVE r64, r/m64","CMOVQEQ r/m64, r64","cmoveq r/m64, r64","REX.W 0F 44 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVGE r64, r/m64","CMOVQGE r/m64, r64","cmovgeq r/m64, r64","REX.W 0F 4D /r","N.S.","V","","","rw,r","Y","64"
+"CMOVG r64, r/m64","CMOVQGT r/m64, r64","cmovgq r/m64, r64","REX.W 0F 4F /r","N.S.","V","","","rw,r","Y","64"
+"CMOVA r64, r/m64","CMOVQHI r/m64, r64","cmovaq r/m64, r64","REX.W 0F 47 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVLE r64, r/m64","CMOVQLE r/m64, r64","cmovleq r/m64, r64","REX.W 0F 4E /r","N.S.","V","","","rw,r","Y","64"
+"CMOVBE r64, r/m64","CMOVQLS r/m64, r64","cmovbeq r/m64, r64","REX.W 0F 46 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVL r64, r/m64","CMOVQLT r/m64, r64","cmovlq r/m64, r64","REX.W 0F 4C /r","N.S.","V","","","rw,r","Y","64"
+"CMOVS r64, r/m64","CMOVQMI r/m64, r64","cmovsq r/m64, r64","REX.W 0F 48 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVNE r64, r/m64","CMOVQNE r/m64, r64","cmovneq r/m64, r64","REX.W 0F 45 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVNO r64, r/m64","CMOVQOC r/m64, r64","cmovnoq r/m64, r64","REX.W 0F 41 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVO r64, r/m64","CMOVQOS r/m64, r64","cmovoq r/m64, r64","REX.W 0F 40 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVNP r64, r/m64","CMOVQPC r/m64, r64","cmovnpq r/m64, r64","REX.W 0F 4B /r","N.S.","V","","","rw,r","Y","64"
+"CMOVNS r64, r/m64","CMOVQPL r/m64, r64","cmovnsq r/m64, r64","REX.W 0F 49 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVP r64, r/m64","CMOVQPS r/m64, r64","cmovpq r/m64, r64","REX.W 0F 4A /r","N.S.","V","","","rw,r","Y","64"
+"CMOVAE r16, r/m16","CMOVWCC r/m16, r16","cmovaew r/m16, r16","0F 43 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVB r16, r/m16","CMOVWCS r/m16, r16","cmovbw r/m16, r16","0F 42 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVE r16, r/m16","CMOVWEQ r/m16, r16","cmovew r/m16, r16","0F 44 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVGE r16, r/m16","CMOVWGE r/m16, r16","cmovgew r/m16, r16","0F 4D /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVG r16, r/m16","CMOVWGT r/m16, r16","cmovgw r/m16, r16","0F 4F /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVA r16, r/m16","CMOVWHI r/m16, r16","cmovaw r/m16, r16","0F 47 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVLE r16, r/m16","CMOVWLE r/m16, r16","cmovlew r/m16, r16","0F 4E /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVBE r16, r/m16","CMOVWLS r/m16, r16","cmovbew r/m16, r16","0F 46 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVL r16, r/m16","CMOVWLT r/m16, r16","cmovlw r/m16, r16","0F 4C /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVS r16, r/m16","CMOVWMI r/m16, r16","cmovsw r/m16, r16","0F 48 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVNE r16, r/m16","CMOVWNE r/m16, r16","cmovnew r/m16, r16","0F 45 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVNO r16, r/m16","CMOVWOC r/m16, r16","cmovnow r/m16, r16","0F 41 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVO r16, r/m16","CMOVWOS r/m16, r16","cmovow r/m16, r16","0F 40 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVNP r16, r/m16","CMOVWPC r/m16, r16","cmovnpw r/m16, r16","0F 4B /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVNS r16, r/m16","CMOVWPL r/m16, r16","cmovnsw r/m16, r16","0F 49 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVP r16, r/m16","CMOVWPS r/m16, r16","cmovpw r/m16, r16","0F 4A /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVZ r16, r/m16","CMOVZ r/m16, r16","cmovz r/m16, r16","0F 44 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVZ r32, r/m32","CMOVZ r/m32, r32","cmovz r/m32, r32","0F 44 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVZ r64, r/m64","CMOVZ r/m64, r64","cmovz r/m64, r64","REX.W 0F 44 /r","N.E.","V","","pseudo","rw,r","",""
+"CMP AL, imm8","CMPB AL, imm8","cmpb imm8, AL","3C ib","V","V","","","r,r","Y","8"
+"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","80 /7 ib","V","V","","","r,r","Y","8"
+"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","82 /7 ib","V","N.S.","","","r,r","Y","8"
+"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","REX 80 /7 ib","N.E.","V","","pseudo64","r,r","Y","8"
+"CMP r/m8, r8","CMPB r/m8, r8","cmpb r8, r/m8","38 /r","V","V","","","r,r","Y","8"
+"CMP r/m8, r8","CMPB r/m8, r8","cmpb r8, r/m8","REX 38 /r","N.E.","V","","pseudo64","r,r","Y","8"
+"CMP r8, r/m8","CMPB r8, r/m8","cmpb r/m8, r8","3A /r","V","V","","","r,r","Y","8"
+"CMP r8, r/m8","CMPB r8, r/m8","cmpb r/m8, r8","REX 3A /r","N.E.","V","","pseudo64","r,r","Y","8"
+"CMP EAX, imm32","CMPL EAX, imm32","cmpl imm32, EAX","3D id","V","V","","operand32","r,r","Y","32"
+"CMP r/m32, imm32","CMPL r/m32, imm32","cmpl imm32, r/m32","81 /7 id","V","V","","operand32","r,r","Y","32"
+"CMP r/m32, imm8","CMPL r/m32, imm8","cmpl imm8, r/m32","83 /7 ib","V","V","","operand32","r,r","Y","32"
+"CMP r/m32, r32","CMPL r/m32, r32","cmpl r32, r/m32","39 /r","V","V","","operand32","r,r","Y","32"
+"CMP r32, r/m32","CMPL r32, r/m32","cmpl r/m32, r32","3B /r","V","V","","operand32","r,r","Y","32"
+"CMPPD xmm1, xmm2/m128, imm8u","CMPPD imm8u, xmm1, xmm2/m128","cmppd imm8u, xmm2/m128, xmm1","66 0F C2 /r ib","V","V","SSE2","","rw,r,r","",""
+"CMPPS xmm1, xmm2/m128, imm8u","CMPPS imm8u, xmm1, xmm2/m128","cmpps imm8u, xmm2/m128, xmm1","0F C2 /r ib","V","V","SSE","","rw,r,r","",""
+"CMP RAX, imm32","CMPQ RAX, imm32","cmpq imm32, RAX","REX.W 3D id","N.S.","V","","","r,r","Y","64"
+"CMP r/m64, imm32","CMPQ r/m64, imm32","cmpq imm32, r/m64","REX.W 81 /7 id","N.S.","V","","","r,r","Y","64"
+"CMP r/m64, imm8","CMPQ r/m64, imm8","cmpq imm8, r/m64","REX.W 83 /7 ib","N.S.","V","","","r,r","Y","64"
+"CMP r/m64, r64","CMPQ r/m64, r64","cmpq r64, r/m64","REX.W 39 /r","N.S.","V","","","r,r","Y","64"
+"CMP r64, r/m64","CMPQ r64, r/m64","cmpq r/m64, r64","REX.W 3B /r","N.S.","V","","","r,r","Y","64"
+"CMPSB","CMPSB","cmpsb","A6","V","V","","","","",""
+"CMPSD xmm1, xmm2/m64, imm8u","CMPSD imm8u, xmm1, xmm2/m64","cmpsd imm8u, xmm2/m64, xmm1","F2 0F C2 /r ib","V","V","SSE2","","rw,r,r","",""
+"CMPSD","CMPSL","cmpsl","A7","V","V","","operand32","","",""
+"CMPSQ","CMPSQ","cmpsq","REX.W A7","N.S.","V","","","","",""
+"CMPSS xmm1, xmm2/m32, imm8u","CMPSS imm8u, xmm1, xmm2/m32","cmpss imm8u, xmm2/m32, xmm1","F3 0F C2 /r ib","V","V","SSE","","rw,r,r","",""
+"CMPSW","CMPSW","cmpsw","A7","V","V","","operand16","","",""
+"CMP AX, imm16","CMPW AX, imm16","cmpw imm16, AX","3D iw","V","V","","operand16","r,r","Y","16"
+"CMP r/m16, imm16","CMPW r/m16, imm16","cmpw imm16, r/m16","81 /7 iw","V","V","","operand16","r,r","Y","16"
+"CMP r/m16, imm8","CMPW r/m16, imm8","cmpw imm8, r/m16","83 /7 ib","V","V","","operand16","r,r","Y","16"
+"CMP r/m16, r16","CMPW r/m16, r16","cmpw r16, r/m16","39 /r","V","V","","operand16","r,r","Y","16"
+"CMP r16, r/m16","CMPW r16, r/m16","cmpw r/m16, r16","3B /r","V","V","","operand16","r,r","Y","16"
+"CMPXCHG16B m128","CMPXCHG16B m128","cmpxchg16b m128","REX.W 0F C7 /1","N.S.","V","","modrm_memonly","rw","",""
+"CMPXCHG8B m64","CMPXCHG8B m64","cmpxchg8b m64","0F C7 /1","V","V","Pentium","modrm_memonly,operand16,operand32","rw","",""
+"CMPXCHG r/m8, r8","CMPXCHGB r8, r/m8","cmpxchgb r8, r/m8","0F B0 /r","V","V","486","","rw,r","Y","8"
+"CMPXCHG r/m8, r8","CMPXCHGB r8, r/m8","cmpxchgb r8, r/m8","REX 0F B0 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"CMPXCHG r/m32, r32","CMPXCHGL r32, r/m32","cmpxchgl r32, r/m32","0F B1 /r","V","V","486","operand32","rw,r","Y","32"
+"CMPXCHG r/m64, r64","CMPXCHGQ r64, r/m64","cmpxchgq r64, r/m64","REX.W 0F B1 /r","N.S.","V","486","","rw,r","Y","64"
+"CMPXCHG r/m16, r16","CMPXCHGW r16, r/m16","cmpxchgw r16, r/m16","0F B1 /r","V","V","486","operand16","rw,r","Y","16"
+"COMISD xmm1, xmm2/m64","COMISD xmm2/m64, xmm1","comisd xmm2/m64, xmm1","66 0F 2F /r","V","V","SSE2","","r,r","",""
+"COMISS xmm1, xmm2/m32","COMISS xmm2/m32, xmm1","comiss xmm2/m32, xmm1","0F 2F /r","V","V","SSE","","r,r","",""
+"CPUID","CPUID","cpuid","0F A2","V","V","486","","","",""
+"CQO","CQO","cqto","REX.W 99","N.S.","V","","","","",""
+"CRC32 r32, r/m8","CRC32B r/m8, r32","crc32b r/m8, r32","F2 0F 38 F0 /r","V","V","SSE4_2","operand16,operand32","rw,r","Y","8"
+"CRC32 r32, r/m8","CRC32B r/m8, r32","crc32b r/m8, r32","F2 REX 0F 38 F0 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"CRC32 r64, r/m8","CRC32B r/m8, r64","crc32b r/m8, r64","F2 REX.W 0F 38 F0 /r","N.S.","V","SSE4_2","","rw,r","Y","8"
+"CRC32 r32, r/m32","CRC32L r/m32, r32","crc32l r/m32, r32","F2 0F 38 F1 /r","V","V","SSE4_2","operand32","rw,r","Y","32"
+"CRC32 r64, r/m64","CRC32Q r/m64, r64","crc32q r/m64, r64","F2 REX.W 0F 38 F1 /r","N.S.","V","SSE4_2","","rw,r","Y","64"
+"CRC32 r32, r/m16","CRC32W r/m16, r32","crc32w r/m16, r32","F2 0F 38 F1 /r","V","V","SSE4_2","operand16","rw,r","Y","16"
+"CVTPD2PI mm1, xmm2/m128","CVTPD2PI xmm2/m128, mm1","cvtpd2pi xmm2/m128, mm1","66 0F 2D /r","V","V","SSE2","","w,r","",""
+"CVTPD2DQ xmm1, xmm2/m128","CVTPD2PL xmm2/m128, xmm1","cvtpd2dq xmm2/m128, xmm1","F2 0F E6 /r","V","V","SSE2","","w,r","",""
+"CVTPD2PS xmm1, xmm2/m128","CVTPD2PS xmm2/m128, xmm1","cvtpd2ps xmm2/m128, xmm1","66 0F 5A /r","V","V","SSE2","","w,r","",""
+"CVTPI2PD xmm1, mm2/m64","CVTPI2PD mm2/m64, xmm1","cvtpi2pd mm2/m64, xmm1","66 0F 2A /r","V","V","SSE2","","w,r","",""
+"CVTPI2PS xmm1, mm2/m64","CVTPI2PS mm2/m64, xmm1","cvtpi2ps mm2/m64, xmm1","0F 2A /r","V","V","SSE","","w,r","",""
+"CVTDQ2PD xmm1, xmm2/m64","CVTPL2PD xmm2/m64, xmm1","cvtdq2pd xmm2/m64, xmm1","F3 0F E6 /r","V","V","SSE2","","w,r","",""
+"CVTDQ2PS xmm1, xmm2/m128","CVTPL2PS xmm2/m128, xmm1","cvtdq2ps xmm2/m128, xmm1","0F 5B /r","V","V","SSE2","","w,r","",""
+"CVTPS2PD xmm1, xmm2/m64","CVTPS2PD xmm2/m64, xmm1","cvtps2pd xmm2/m64, xmm1","0F 5A /r","V","V","SSE2","","w,r","",""
+"CVTPS2PI mm1, xmm2/m64","CVTPS2PI xmm2/m64, mm1","cvtps2pi xmm2/m64, mm1","0F 2D /r","V","V","SSE","","w,r","",""
+"CVTPS2DQ xmm1, xmm2/m128","CVTPS2PL xmm2/m128, xmm1","cvtps2dq xmm2/m128, xmm1","66 0F 5B /r","V","V","SSE2","","w,r","",""
+"CVTSD2SI r32, xmm2/m64","CVTSD2SL xmm2/m64, r32","cvtsd2si xmm2/m64, r32","F2 0F 2D /r","V","V","SSE2","operand16,operand32","w,r","Y","32"
+"CVTSD2SI r64, xmm2/m64","CVTSD2SL xmm2/m64, r64","cvtsd2siq xmm2/m64, r64","F2 REX.W 0F 2D /r","N.S.","V","SSE2","","w,r","Y","64"
+"CVTSD2SS xmm1, xmm2/m64","CVTSD2SS xmm2/m64, xmm1","cvtsd2ss xmm2/m64, xmm1","F2 0F 5A /r","V","V","SSE2","","w,r","",""
+"CVTSI2SD xmm1, r/m32","CVTSL2SD r/m32, xmm1","cvtsi2sdl r/m32, xmm1","F2 0F 2A /r","V","V","SSE2","operand16,operand32","w,r","Y","32"
+"CVTSI2SS xmm1, r/m32","CVTSL2SS r/m32, xmm1","cvtsi2ssl r/m32, xmm1","F3 0F 2A /r","V","V","SSE","operand16,operand32","w,r","Y","32"
+"CVTSI2SD xmm1, r/m64","CVTSQ2SD r/m64, xmm1","cvtsi2sdq r/m64, xmm1","F2 REX.W 0F 2A /r","N.S.","V","SSE2","","w,r","Y","64"
+"CVTSI2SS xmm1, r/m64","CVTSQ2SS r/m64, xmm1","cvtsi2ssq r/m64, xmm1","F3 REX.W 0F 2A /r","N.S.","V","SSE","","w,r","Y","64"
+"CVTSS2SD xmm1, xmm2/m32","CVTSS2SD xmm2/m32, xmm1","cvtss2sd xmm2/m32, xmm1","F3 0F 5A /r","V","V","SSE2","","w,r","",""
+"CVTSS2SI r32, xmm2/m32","CVTSS2SL xmm2/m32, r32","cvtss2si xmm2/m32, r32","F3 0F 2D /r","V","V","SSE","operand16,operand32","w,r","Y","32"
+"CVTSS2SI r64, xmm2/m32","CVTSS2SL xmm2/m32, r64","cvtss2siq xmm2/m32, r64","F3 REX.W 0F 2D /r","N.S.","V","SSE","","w,r","Y","64"
+"CVTTPD2PI mm1, xmm2/m128","CVTTPD2PI xmm2/m128, mm1","cvttpd2pi xmm2/m128, mm1","66 0F 2C /r","V","V","SSE2","","w,r","",""
+"CVTTPD2DQ xmm1, xmm2/m128","CVTTPD2PL xmm2/m128, xmm1","cvttpd2dq xmm2/m128, xmm1","66 0F E6 /r","V","V","SSE2","","w,r","",""
+"CVTTPS2PI mm1, xmm2/m64","CVTTPS2PI xmm2/m64, mm1","cvttps2pi xmm2/m64, mm1","0F 2C /r","V","V","SSE","","w,r","",""
+"CVTTPS2DQ xmm1, xmm2/m128","CVTTPS2PL xmm2/m128, xmm1","cvttps2dq xmm2/m128, xmm1","F3 0F 5B /r","V","V","SSE2","","w,r","",""
+"CVTTSD2SI r32, xmm2/m64","CVTTSD2SL xmm2/m64, r32","cvttsd2si xmm2/m64, r32","F2 0F 2C /r","V","V","SSE2","operand16,operand32","w,r","Y","32"
+"CVTTSD2SI r64, xmm2/m64","CVTTSD2SL xmm2/m64, r64","cvttsd2siq xmm2/m64, r64","F2 REX.W 0F 2C /r","N.S.","V","SSE2","","w,r","Y","64"
+"CVTTSS2SI r32, xmm2/m32","CVTTSS2SL xmm2/m32, r32","cvttss2si xmm2/m32, r32","F3 0F 2C /r","V","V","SSE","operand16,operand32","w,r","Y","32"
+"CVTTSS2SI r64, xmm2/m32","CVTTSS2SL xmm2/m32, r64","cvttss2siq xmm2/m32, r64","F3 REX.W 0F 2C /r","N.S.","V","SSE","","w,r","Y","64"
+"CWD","CWD","cwtd","99","V","V","","operand16","","",""
+"CWDE","CWDE","cwtl","98","V","V","","operand32","","",""
+"DAA","DAA","daa","27","V","N.S.","","","","",""
+"DAS","DAS","das","2F","V","N.S.","","","","",""
+"DEC r/m8","DECB r/m8","decb r/m8","FE /1","V","V","","","rw","Y","8"
+"DEC r/m8","DECB r/m8","decb r/m8","REX FE /1","N.E.","V","","pseudo64","rw","Y","8"
+"DEC r/m32","DECL r/m32","decl r/m32","FF /1","V","V","","operand32","rw","Y","32"
+"DEC r32op","DECL r32op","decl r32op","48+rd","V","N.S.","","operand32","rw","Y","32"
+"DEC r/m64","DECQ r/m64","decq r/m64","REX.W FF /1","N.S.","V","","","rw","Y","64"
+"DEC r/m16","DECW r/m16","decw r/m16","FF /1","V","V","","operand16","rw","Y","16"
+"DEC r16op","DECW r16op","decw r16op","48+rw","V","N.S.","","operand16","rw","Y","16"
+"DIV r/m8","DIVB r/m8","divb r/m8","F6 /6","V","V","","","r","Y","8"
+"DIV r/m8","DIVB r/m8","divb r/m8","REX F6 /6","N.E.","V","","pseudo64","r","Y","8"
+"DIV r/m32","DIVL r/m32","divl r/m32","F7 /6","V","V","","operand32","r","Y","32"
+"DIVPD xmm1, xmm2/m128","DIVPD xmm2/m128, xmm1","divpd xmm2/m128, xmm1","66 0F 5E /r","V","V","SSE2","","rw,r","",""
+"DIVPS xmm1, xmm2/m128","DIVPS xmm2/m128, xmm1","divps xmm2/m128, xmm1","0F 5E /r","V","V","SSE","","rw,r","",""
+"DIV r/m64","DIVQ r/m64","divq r/m64","REX.W F7 /6","N.S.","V","","","r","Y","64"
+"DIVSD xmm1, xmm2/m64","DIVSD xmm2/m64, xmm1","divsd xmm2/m64, xmm1","F2 0F 5E /r","V","V","SSE2","","rw,r","",""
+"DIVSS xmm1, xmm2/m32","DIVSS xmm2/m32, xmm1","divss xmm2/m32, xmm1","F3 0F 5E /r","V","V","SSE","","rw,r","",""
+"DIV r/m16","DIVW r/m16","divw r/m16","F7 /6","V","V","","operand16","r","Y","16"
+"DPPD xmm1, xmm2/m128, imm8u","DPPD imm8u, xmm2/m128, xmm1","dppd imm8u, xmm2/m128, xmm1","66 0F 3A 41 /r ib","V","V","SSE4_1","","rw,r,r","",""
+"DPPS xmm1, xmm2/m128, imm8u","DPPS imm8u, xmm2/m128, xmm1","dpps imm8u, xmm2/m128, xmm1","66 0F 3A 40 /r ib","V","V","SSE4_1","","rw,r,r","",""
+"EMMS","EMMS","emms","0F 77","V","V","MMX","","","",""
+"ENCLS","ENCLS","encls","0F 01 CF","V","V","","","","",""
+"ENCLU","ENCLU","enclu","0F 01 D7","V","V","","","","",""
+"ENDBR32","ENDBR32","endbr32","F3 0F 1E FB","V","V","CET","","","",""
+"ENDBR64","ENDBR64","endbr64","F3 0F 1E FA","V","V","CET","","","Y",""
+"ENTER imm16, 0","ENTER 0, imm16","enter imm16, 0","C8 iw 00","V","V","","pseudo","r,r","",""
+"ENTER imm16, 1","ENTER 1, imm16","enter imm16, 1","C8 iw 01","V","V","","pseudo","r,r","",""
+"ENTER imm16, imm8b","ENTERW/ENTERL/ENTERQ imm8b, imm16","enterw/enterl/enterq imm16, imm8b","C8 iw ib","V","V","","","r,r","",""
+"EXTRACTPS r/m32, xmm1, imm8u:2","EXTRACTPS imm8u:2, xmm1, r/m32","extractps imm8u:2, xmm1, r/m32","66 0F 3A 17 /r ib","V","V","SSE4_1","","w,r,r","",""
+"EXTRQ xmm1, imm8u, imm8u","EXTRQ imm8u, imm8u, xmm1","extrq imm8u, imm8u, xmm1","66 0F 78 /0 ib ib","V","V","SSE4a","amd,modrm_regonly","w,r,r","",""
+"EXTRQ xmm1, xmm2","EXTRQ xmm2, xmm1","extrq xmm2, xmm1","66 0F 79 /r","V","V","SSE4a","amd,modrm_regonly","w,r","",""
+"F2XM1","F2XM1","f2xm1","D9 F0","V","V","","","","",""
+"FABS","FABS","fabs","D9 E1","V","V","","","","",""
+"FADD ST(i), ST(0)","FADDD ST(0), ST(i)","fadd ST(0), ST(i)","DC C0+i","V","V","","","rw,r","Y",""
+"FADD ST(0), ST(i)","FADDD ST(i), ST(0)","fadd ST(i), ST(0)","D8 C0+i","V","V","","","rw,r","Y",""
+"FADD ST(0), m32fp","FADDD m32fp, ST(0)","fadds m32fp, ST(0)","D8 /0","V","V","","","rw,r","Y","32"
+"FADD ST(0), m64fp","FADDD m64fp, ST(0)","faddl m64fp, ST(0)","DC /0","V","V","","","rw,r","Y","64"
+"FADDP","FADDDP","faddp","DE C1","V","V","","pseudo","","",""
+"FADDP ST(i), ST(0)","FADDDP ST(0), ST(i)","faddp ST(0), ST(i)","DE C0+i","V","V","","","rw,r","",""
+"FBLD ST(0), m80dec","FBLD m80dec, ST(0)","fbld m80dec, ST(0)","DF /4","V","V","","","w,r","",""
+"FBSTP m80dec, ST(0)","FBSTP ST(0), m80dec","fbstp ST(0), m80dec","DF /6","V","V","","","w,r","",""
+"FCHS","FCHS","fchs","D9 E0","V","V","","","","",""
+"FCLEX","FCLEX","fclex","9B DB E2","V","V","","pseudo","","",""
+"FCMOVB ST(0), ST(i)","FCMOVB ST(i), ST(0)","fcmovb ST(i), ST(0)","DA C0+i","V","V","","P6","rw,r","",""
+"FCMOVBE ST(0), ST(i)","FCMOVBE ST(i), ST(0)","fcmovbe ST(i), ST(0)","DA D0+i","V","V","","P6","rw,r","",""
+"FCMOVE ST(0), ST(i)","FCMOVE ST(i), ST(0)","fcmove ST(i), ST(0)","DA C8+i","V","V","","P6","rw,r","",""
+"FCMOVNB ST(0), ST(i)","FCMOVNB ST(i), ST(0)","fcmovnb ST(i), ST(0)","DB C0+i","V","V","","P6","rw,r","",""
+"FCMOVNBE ST(0), ST(i)","FCMOVNBE ST(i), ST(0)","fcmovnbe ST(i), ST(0)","DB D0+i","V","V","","P6","rw,r","",""
+"FCMOVNE ST(0), ST(i)","FCMOVNE ST(i), ST(0)","fcmovne ST(i), ST(0)","DB C8+i","V","V","","P6","rw,r","",""
+"FCMOVNU ST(0), ST(i)","FCMOVNU ST(i), ST(0)","fcmovnu ST(i), ST(0)","DB D8+i","V","V","","P6","rw,r","",""
+"FCMOVU ST(0), ST(i)","FCMOVU ST(i), ST(0)","fcmovu ST(i), ST(0)","DA D8+i","V","V","","P6","rw,r","",""
+"FCOM","FCOMD","fcom","D8 D1","V","V","","pseudo","","Y",""
+"FCOM ST(0), ST(i)","FCOMD ST(i), ST(0)","fcom ST(i), ST(0)","D8 D0+i","V","V","","","r,r","Y",""
+"FCOM ST(0), ST(i)","FCOMD ST(i), ST(0)","fcom ST(i), ST(0)","DC D0+i","V","V","","","r,r","Y",""
+"FCOM ST(0), m32fp","FCOMD m32fp, ST(0)","fcoms m32fp, ST(0)","D8 /2","V","V","","","r,r","Y","32"
+"FCOM ST(0), m64fp","FCOMD m64fp, ST(0)","fcoml m64fp, ST(0)","DC /2","V","V","","","r,r","Y","64"
+"FCOMP ST(0), m32fp","FCOMFP m32fp, ST(0)","fcomps m32fp, ST(0)","D8 /3","V","V","","","r,r","Y","32"
+"FCOMI ST(0), ST(i)","FCOMI ST(i), ST(0)","fcomi ST(i), ST(0)","DB F0+i","V","V","PPRO","P6","r,r","",""
+"FCOMIP ST(0), ST(i)","FCOMIP ST(i), ST(0)","fcomip ST(i), ST(0)","DF F0+i","V","V","PPRO","P6","r,r","",""
+"FCOMP","FCOMP","fcomp","D8 D9","V","V","","pseudo","","Y",""
+"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","D8 D8+i","V","V","","","r,r","Y",""
+"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","DC D8+i","V","V","","","r,r","Y",""
+"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","DE D0+i","V","V","","","r,r","Y",""
+"FCOMP ST(0), m64fp","FCOMPL m64fp, ST(0)","fcompl m64fp, ST(0)","DC /3","V","V","","","r,r","Y","64"
+"FCOMPP","FCOMPP","fcompp","DE D9","V","V","","","","",""
+"FCOS","FCOS","fcos","D9 FF","V","V","","","","",""
+"FDECSTP","FDECSTP","fdecstp","D9 F6","V","V","","","","",""
+"FDISI8087_NOP","FDISI8087_NOP","fdisi8087_nop","DB E1","V","V","","","","",""
+"FDIVR ST(i), ST(0)","FDIVD ST(0), ST(i)","fdiv ST(0), ST(i)","DC F0+i","V","V","","","rw,r","Y",""
+"FDIV ST(i), ST(0)","FDIVD ST(0), ST(i)","fdivr ST(0), ST(i)","DC F8+i","V","V","","","rw,r","Y",""
+"FDIV ST(0), ST(i)","FDIVD ST(i), ST(0)","fdiv ST(i), ST(0)","D8 F0+i","V","V","","","rw,r","Y",""
+"FDIV ST(0), m32fp","FDIVD m32fp, ST(0)","fdivs m32fp, ST(0)","D8 /6","V","V","","","rw,r","Y","32"
+"FDIV ST(0), m64fp","FDIVD m64fp, ST(0)","fdivl m64fp, ST(0)","DC /6","V","V","","","rw,r","Y","64"
+"FDIVR ST(0), m32fp","FDIVFR m32fp, ST(0)","fdivrs m32fp, ST(0)","D8 /7","V","V","","","rw,r","Y","32"
+"FDIVP","FDIVP","fdivp","DE F9","V","V","","pseudo","","",""
+"FDIVRP ST(i), ST(0)","FDIVP ST(0), ST(i)","fdivp ST(0), ST(i)","DE F0+i","V","V","","","rw,r","",""
+"FDIVR ST(0), ST(i)","FDIVR ST(i), ST(0)","fdivr ST(i), ST(0)","D8 F8+i","V","V","","","rw,r","Y",""
+"FDIVR ST(0), m64fp","FDIVRL m64fp, ST(0)","fdivrl m64fp, ST(0)","DC /7","V","V","","","rw,r","Y","64"
+"FDIVRP","FDIVRP","fdivrp","DE F1","V","V","","pseudo","","",""
+"FDIVP ST(i), ST(0)","FDIVRP ST(0), ST(i)","fdivrp ST(0), ST(i)","DE F8+i","V","V","","","rw,r","",""
+"FEMMS","FEMMS","femms","0F 0E","V","V","3DNOW","amd","","",""
+"FENI8087_NOP","FENI8087_NOP","feni8087_nop","DB E0","V","V","","","","",""
+"FFREE ST(i)","FFREE ST(i)","ffree ST(i)","DD C0+i","V","V","","","r","",""
+"FFREEP ST(i)","FFREEP ST(i)","ffreep ST(i)","DF C0+i","V","V","","","r","",""
+"FIADD ST(0), m16int","FIADD m16int, ST(0)","fiadd m16int, ST(0)","DE /0","V","V","","","rw,r","Y",""
+"FIADD ST(0), m32int","FIADDL m32int, ST(0)","fiaddl m32int, ST(0)","DA /0","V","V","","","rw,r","Y","32"
+"FICOM ST(0), m16int","FICOM m16int, ST(0)","ficom m16int, ST(0)","DE /2","V","V","","","r,r","Y",""
+"FICOM ST(0), m32int","FICOML m32int, ST(0)","ficoml m32int, ST(0)","DA /2","V","V","","","r,r","Y","32"
+"FICOMP ST(0), m16int","FICOMP m16int, ST(0)","ficomp m16int, ST(0)","DE /3","V","V","","","r,r","Y",""
+"FICOMP ST(0), m32int","FICOMPL m32int, ST(0)","ficompl m32int, ST(0)","DA /3","V","V","","","r,r","Y","32"
+"FIDIV ST(0), m16int","FIDIV m16int, ST(0)","fidiv m16int, ST(0)","DE /6","V","V","","","rw,r","Y",""
+"FIDIV ST(0), m32int","FIDIVL m32int, ST(0)","fidivl m32int, ST(0)","DA /6","V","V","","","rw,r","Y","32"
+"FIDIVR ST(0), m16int","FIDIVR m16int, ST(0)","fidivr m16int, ST(0)","DE /7","V","V","","","rw,r","Y",""
+"FIDIVR ST(0), m32int","FIDIVRL m32int, ST(0)","fidivrl m32int, ST(0)","DA /7","V","V","","","rw,r","Y","32"
+"FILD ST(0), m16int","FILD m16int, ST(0)","fild m16int, ST(0)","DF /0","V","V","","","w,r","Y",""
+"FILD ST(0), m32int","FILDL m32int, ST(0)","fildl m32int, ST(0)","DB /0","V","V","","","w,r","Y","32"
+"FILD ST(0), m64int","FILDLL m64int, ST(0)","fildll m64int, ST(0)","DF /5","V","V","","","w,r","Y","64"
+"FIMUL ST(0), m16int","FIMUL m16int, ST(0)","fimul m16int, ST(0)","DE /1","V","V","","","rw,r","Y",""
+"FIMUL ST(0), m32int","FIMULL m32int, ST(0)","fimull m32int, ST(0)","DA /1","V","V","","","rw,r","Y","32"
+"FINCSTP","FINCSTP","fincstp","D9 F7","V","V","","","","",""
+"FINIT","FINIT","finit","9B DB E3","V","V","","pseudo","","",""
+"FIST m16int, ST(0)","FIST ST(0), m16int","fist ST(0), m16int","DF /2","V","V","","","w,r","Y",""
+"FIST m32int, ST(0)","FISTL ST(0), m32int","fistl ST(0), m32int","DB /2","V","V","","","w,r","Y","32"
+"FISTP m16int, ST(0)","FISTP ST(0), m16int","fistp ST(0), m16int","DF /3","V","V","","","w,r","Y",""
+"FISTP m32int, ST(0)","FISTPL ST(0), m32int","fistpl ST(0), m32int","DB /3","V","V","","","w,r","Y","32"
+"FISTP m64int, ST(0)","FISTPLL ST(0), m64int","fistpll ST(0), m64int","DF /7","V","V","","","w,r","Y","64"
+"FISTTP m16int, ST(0)","FISTTP ST(0), m16int","fisttp ST(0), m16int","DF /1","V","V","SSE3","modrm_memonly","w,r","Y",""
+"FISTTP m32int, ST(0)","FISTTPL ST(0), m32int","fisttpl ST(0), m32int","DB /1","V","V","SSE3","modrm_memonly","w,r","Y","32"
+"FISTTP m64int, ST(0)","FISTTPLL ST(0), m64int","fisttpll ST(0), m64int","DD /1","V","V","SSE3","modrm_memonly","w,r","Y","64"
+"FISUB ST(0), m16int","FISUB m16int, ST(0)","fisub m16int, ST(0)","DE /4","V","V","","","rw,r","Y",""
+"FISUB ST(0), m32int","FISUBL m32int, ST(0)","fisubl m32int, ST(0)","DA /4","V","V","","","rw,r","Y","32"
+"FISUBR ST(0), m16int","FISUBR m16int, ST(0)","fisubr m16int, ST(0)","DE /5","V","V","","","rw,r","Y",""
+"FISUBR ST(0), m32int","FISUBRL m32int, ST(0)","fisubrl m32int, ST(0)","DA /5","V","V","","","rw,r","Y","32"
+"FLD ST(0), ST(i)","FLD ST(i), ST(0)","fld ST(i), ST(0)","D9 C0+i","V","V","","","w,r","Y",""
+"FLD1","FLD1","fld1","D9 E8","V","V","","","","",""
+"FLDCW m2byte","FLDCW m2byte","fldcw m2byte","D9 /5","V","V","","","r","",""
+"FLDENV m28byte","FLDENV m28byte","fldenv m28byte","D9 /4","V","V","","operand32,operand64","r","",""
+"FLDENV m14byte","FLDENVS m14byte","fldenv m14byte","D9 /4","V","V","","operand16","r","",""
+"FLD ST(0), m64fp","FLDL m64fp, ST(0)","fldl m64fp, ST(0)","DD /0","V","V","","","w,r","Y","64"
+"FLDL2E","FLDL2E","fldl2e","D9 EA","V","V","","","","",""
+"FLDL2T","FLDL2T","fldl2t","D9 E9","V","V","","","","",""
+"FLDLG2","FLDLG2","fldlg2","D9 EC","V","V","","","","",""
+"FLDLN2","FLDLN2","fldln2","D9 ED","V","V","","","","",""
+"FLDPI","FLDPI","fldpi","D9 EB","V","V","","","","",""
+"FLD ST(0), m32fp","FLDS m32fp, ST(0)","flds m32fp, ST(0)","D9 /0","V","V","","","w,r","Y","32"
+"FLD ST(0), m80fp","FLDT m80fp, ST(0)","fldt m80fp, ST(0)","DB /5","V","V","","","w,r","Y","80"
+"FLDZ","FLDZ","fldz","D9 EE","V","V","","","","",""
+"FMUL ST(i), ST(0)","FMUL ST(0), ST(i)","fmul ST(0), ST(i)","DC C8+i","V","V","","","rw,r","Y",""
+"FMUL ST(0), ST(i)","FMUL ST(i), ST(0)","fmul ST(i), ST(0)","D8 C8+i","V","V","","","rw,r","Y",""
+"FMUL ST(0), m64fp","FMULL m64fp, ST(0)","fmull m64fp, ST(0)","DC /1","V","V","","","rw,r","Y","64"
+"FMULP","FMULP","fmulp","DE C9","V","V","","pseudo","","",""
+"FMULP ST(i), ST(0)","FMULP ST(0), ST(i)","fmulp ST(0), ST(i)","DE C8+i","V","V","","","rw,r","",""
+"FMUL ST(0), m32fp","FMULS m32fp, ST(0)","fmuls m32fp, ST(0)","D8 /1","V","V","","","rw,r","Y","32"
+"FNCLEX","FNCLEX","fnclex","DB E2","V","V","","","","",""
+"FNINIT","FNINIT","fninit","DB E3","V","V","","","","",""
+"FNOP","FNOP","fnop","D9 D0","V","V","","","","",""
+"FNSAVE m108byte","FNSAVE m108byte","fnsave m108byte","DD /6","V","V","","operand32,operand64","w","",""
+"FNSAVE m94byte","FNSAVES m94byte","fnsave m94byte","DD /6","V","V","","operand16","w","",""
+"FNSTCW m2byte","FNSTCW m2byte","fnstcw m2byte","D9 /7","V","V","","","w","",""
+"FNSTENV m28byte","FNSTENV m28byte","fnstenv m28byte","D9 /6","V","V","","operand32,operand64","w","",""
+"FNSTENV m14byte","FNSTENVS m14byte","fnstenv m14byte","D9 /6","V","V","","operand16","w","",""
+"FNSTSW AX","FNSTSW AX","fnstsw AX","DF E0","V","V","","","w","",""
+"FNSTSW m2byte","FNSTSW m2byte","fnstsw m2byte","DD /7","V","V","","","w","",""
+"FPATAN","FPATAN","fpatan","D9 F3","V","V","","","","",""
+"FPREM","FPREM","fprem","D9 F8","V","V","","","","",""
+"FPREM1","FPREM1","fprem1","D9 F5","V","V","","","","",""
+"FPTAN","FPTAN","fptan","D9 F2","V","V","","","","",""
+"FRNDINT","FRNDINT","frndint","D9 FC","V","V","","","","",""
+"FRSTOR m108byte","FRSTOR m108byte","frstor m108byte","DD /4","V","V","","operand32,operand64","r","",""
+"FRSTOR m94byte","FRSTORS m94byte","frstor m94byte","DD /4","V","V","","operand16","r","",""
+"FSAVE m94/108byte","FSAVE m94/108byte","fsave m94/108byte","9B DD /6","V","V","","pseudo","w","",""
+"FSCALE","FSCALE","fscale","D9 FD","V","V","","","","",""
+"FSETPM287_NOP","FSETPM287_NOP","fsetpm287_nop","DB E4","V","V","","","","",""
+"FSIN","FSIN","fsin","D9 FE","V","V","","","","",""
+"FSINCOS","FSINCOS","fsincos","D9 FB","V","V","","","","",""
+"FSQRT","FSQRT","fsqrt","D9 FA","V","V","","","","",""
+"FST ST(i), ST(0)","FST ST(0), ST(i)","fst ST(0), ST(i)","DD D0+i","V","V","","","w,r","Y",""
+"FSTCW m2byte","FSTCW m2byte","fstcw m2byte","9B D9 /7","V","V","","pseudo","w","",""
+"FSTENV m14/28byte","FSTENV m14/28byte","fstenv m14/28byte","9B D9 /6","V","V","","pseudo","w","",""
+"FST m64fp, ST(0)","FSTL ST(0), m64fp","fstl ST(0), m64fp","DD /2","V","V","","","w,r","Y","64"
+"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DD D8+i","V","V","","","w,r","Y",""
+"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DF D0+i","V","V","","","w,r","Y",""
+"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DF D8+i","V","V","","","w,r","Y",""
+"FSTP m64fp, ST(0)","FSTPL ST(0), m64fp","fstpl ST(0), m64fp","DD /3","V","V","","","w,r","Y","64"
+"FSTPNCE ST(i), ST(0)","FSTPNCE ST(0), ST(i)","fstpnce ST(0), ST(i)","D9 D8+i","V","V","","","w,r","",""
+"FSTP m32fp, ST(0)","FSTPS ST(0), m32fp","fstps ST(0), m32fp","D9 /3","V","V","","","w,r","Y","32"
+"FSTP m80fp, ST(0)","FSTPT ST(0), m80fp","fstpt ST(0), m80fp","DB /7","V","V","","","w,r","Y","80"
+"FST m32fp, ST(0)","FSTS ST(0), m32fp","fsts ST(0), m32fp","D9 /2","V","V","","","w,r","Y","32"
+"FSTSW AX","FSTSW AX","fstsw AX","9B DF E0","V","V","","pseudo","w","",""
+"FSTSW m2byte","FSTSW m2byte","fstsw m2byte","9B DD /7","V","V","","pseudo","w","",""
+"FSUBR ST(i), ST(0)","FSUB ST(0), ST(i)","fsub ST(0), ST(i)","DC E0+i","V","V","","","rw,r","Y",""
+"FSUB ST(0), ST(i)","FSUB ST(i), ST(0)","fsub ST(i), ST(0)","D8 E0+i","V","V","","","rw,r","Y",""
+"FSUB ST(0), m64fp","FSUBL m64fp, ST(0)","fsubl m64fp, ST(0)","DC /4","V","V","","","rw,r","Y","64"
+"FSUBP","FSUBP","fsubp","DE E9","V","V","","pseudo","","",""
+"FSUBRP ST(i), ST(0)","FSUBP ST(0), ST(i)","fsubp ST(0), ST(i)","DE E0+i","V","V","","","rw,r","",""
+"FSUB ST(i), ST(0)","FSUBR ST(0), ST(i)","fsubr ST(0), ST(i)","DC E8+i","V","V","","","rw,r","Y",""
+"FSUBR ST(0), ST(i)","FSUBR ST(i), ST(0)","fsubr ST(i), ST(0)","D8 E8+i","V","V","","","rw,r","Y",""
+"FSUBR ST(0), m64fp","FSUBRL m64fp, ST(0)","fsubrl m64fp, ST(0)","DC /5","V","V","","","rw,r","Y","64"
+"FSUBRP","FSUBRP","fsubrp","DE E1","V","V","","pseudo","","",""
+"FSUBP ST(i), ST(0)","FSUBRP ST(0), ST(i)","fsubrp ST(0), ST(i)","DE E8+i","V","V","","","rw,r","",""
+"FSUBR ST(0), m32fp","FSUBRS m32fp, ST(0)","fsubrs m32fp, ST(0)","D8 /5","V","V","","","rw,r","Y","32"
+"FSUB ST(0), m32fp","FSUBS m32fp, ST(0)","fsubs m32fp, ST(0)","D8 /4","V","V","","","rw,r","Y","32"
+"FTST","FTST","ftst","D9 E4","V","V","","","","",""
+"FUCOM","FUCOM","fucom","DD E1","V","V","","pseudo","","",""
+"FUCOM ST(0), ST(i)","FUCOM ST(i), ST(0)","fucom ST(i), ST(0)","DD E0+i","V","V","","","r,r","",""
+"FUCOMI ST(0), ST(i)","FUCOMI ST(i), ST(0)","fucomi ST(i), ST(0)","DB E8+i","V","V","PPRO","P6","r,r","",""
+"FUCOMIP ST(0), ST(i)","FUCOMIP ST(i), ST(0)","fucomip ST(i), ST(0)","DF E8+i","V","V","PPRO","P6","r,r","",""
+"FUCOMP","FUCOMP","fucomp","DD E9","V","V","","pseudo","","",""
+"FUCOMP ST(0), ST(i)","FUCOMP ST(i), ST(0)","fucomp ST(i), ST(0)","DD E8+i","V","V","","","r,r","",""
+"FUCOMPP","FUCOMPP","fucompp","DA E9","V","V","","","","",""
+"FWAIT","FWAIT","fwait","9B","V","V","","","","",""
+"FXAM","FXAM","fxam","D9 E5","V","V","","","","",""
+"FXCH","FXCH","fxch","D9 C9","V","V","","pseudo","","",""
+"FXCH ST(0), ST(i)","FXCH ST(i), ST(0)","fxch ST(i), ST(0)","D9 C8+i","V","V","","","rw,rw","",""
+"FXCH_ALIAS1 ST(0), ST(i)","FXCH_ALIAS1 ST(i), ST(0)","fxch_alias1 ST(i), ST(0)","DD C8+i","V","V","","","rw,rw","",""
+"FXCH_ALIAS2 ST(0), ST(i)","FXCH_ALIAS2 ST(i), ST(0)","fxch_alias2 ST(i), ST(0)","DF C8+i","V","V","","","rw,rw","",""
+"FXRSTOR m512byte","FXRSTOR m512byte","fxrstor m512byte","0F AE /1","V","V","","modrm_memonly,operand16,operand32","r","",""
+"FXRSTOR64 m512byte","FXRSTOR64 m512byte","fxrstor64 m512byte","REX.W 0F AE /1","N.S.","V","","modrm_memonly","r","",""
+"FXSAVE m512byte","FXSAVE m512byte","fxsave m512byte","0F AE /0","V","V","","modrm_memonly,operand16,operand32","w","",""
+"FXSAVE64 m512byte","FXSAVE64 m512byte","fxsave64 m512byte","REX.W 0F AE /0","N.S.","V","","modrm_memonly","w","",""
+"FXTRACT","FXTRACT","fxtract","D9 F4","V","V","","","","",""
+"FYL2X","FYL2X","fyl2x","D9 F1","V","V","","","","",""
+"FYL2XP1","FYL2XP1","fyl2xp1","D9 F9","V","V","","","","",""
+"GETSEC","GETSEC","getsec","0F 37","V","V","SMX","","","",""
+"GF2P8AFFINEINVQB xmm1, xmm2/m128, imm8u","GF2P8AFFINEINVQB imm8u, xmm2/m128, xmm1","gf2p8affineinvqb imm8u, xmm2/m128, xmm1","66 0F 3A CF /r ib","V","V","GFNI","","rw,r,r","",""
+"GF2P8AFFINEQB xmm1, xmm2/m128, imm8u","GF2P8AFFINEQB imm8u, xmm2/m128, xmm1","gf2p8affineqb imm8u, xmm2/m128, xmm1","66 0F 3A CE /r ib","V","V","GFNI","","rw,r,r","",""
+"GF2P8MULB xmm1, xmm2/m128","GF2P8MULB xmm2/m128, xmm1","gf2p8mulb xmm2/m128, xmm1","66 0F 38 CF /r","V","V","GFNI","","rw,r","",""
+"HADDPD xmm1, xmm2/m128","HADDPD xmm2/m128, xmm1","haddpd xmm2/m128, xmm1","66 0F 7C /r","V","V","SSE3","","rw,r","",""
+"HADDPS xmm1, xmm2/m128","HADDPS xmm2/m128, xmm1","haddps xmm2/m128, xmm1","F2 0F 7C /r","V","V","SSE3","","rw,r","",""
+"HLT","HLT","hlt","F4","V","V","","","","",""
+"HSUBPD xmm1, xmm2/m128","HSUBPD xmm2/m128, xmm1","hsubpd xmm2/m128, xmm1","66 0F 7D /r","V","V","SSE3","","rw,r","",""
+"HSUBPS xmm1, xmm2/m128","HSUBPS xmm2/m128, xmm1","hsubps xmm2/m128, xmm1","F2 0F 7D /r","V","V","SSE3","","rw,r","",""
+"ICEBP","ICEBP","icebp","F1","V","V","","","","",""
+"IDIV r/m8","IDIVB r/m8","idivb r/m8","F6 /7","V","V","","","r","Y","8"
+"IDIV r/m8","IDIVB r/m8","idivb r/m8","REX F6 /7","N.E.","V","","pseudo64","r","Y","8"
+"IDIV r/m32","IDIVL r/m32","idivl r/m32","F7 /7","V","V","","operand32","r","Y","32"
+"IDIV r/m64","IDIVQ r/m64","idivq r/m64","REX.W F7 /7","N.S.","V","","","r","Y","64"
+"IDIV r/m16","IDIVW r/m16","idivw r/m16","F7 /7","V","V","","operand16","r","Y","16"
+"IMUL r32, r/m32, imm32","IMUL3 imm32, r/m32, r32","imull imm32, r/m32, r32","69 /r id","V","V","","operand32","w,r,r","Y","32"
+"IMUL r64, r/m64, imm32","IMUL3 imm32, r/m64, r64","imulq imm32, r/m64, r64","REX.W 69 /r id","N.S.","V","","","w,r,r","Y","64"
+"IMUL r16, r/m16, imm8","IMUL3 imm8, r/m16, r16","imulw imm8, r/m16, r16","6B /r ib","V","V","","operand16","w,r,r","Y","16"
+"IMUL r32, r/m32, imm8","IMUL3 imm8, r/m32, r32","imull imm8, r/m32, r32","6B /r ib","V","V","","operand32","w,r,r","Y","32"
+"IMUL r64, r/m64, imm8","IMUL3 imm8, r/m64, r64","imulq imm8, r/m64, r64","REX.W 6B /r ib","N.S.","V","","","w,r,r","Y","64"
+"IMUL r/m8","IMULB r/m8","imulb r/m8","F6 /5","V","V","","","r","Y","8"
+"IMUL r/m32","IMULL r/m32","imull r/m32","F7 /5","V","V","","operand32","r","Y","32"
+"IMUL r32, r/m32","IMULL r/m32, r32","imull r/m32, r32","0F AF /r","V","V","","operand32","rw,r","Y","32"
+"IMUL r/m64","IMULQ r/m64","imulq r/m64","REX.W F7 /5","N.S.","V","","","r","Y","64"
+"IMUL r64, r/m64","IMULQ r/m64, r64","imulq r/m64, r64","REX.W 0F AF /r","N.S.","V","","","rw,r","Y","64"
+"IMUL r16, r/m16, imm16","IMULW imm16, r/m16, r16","imulw imm16, r/m16, r16","69 /r iw","V","V","","operand16","w,r,r","Y","16"
+"IMUL r/m16","IMULW r/m16","imulw r/m16","F7 /5","V","V","","operand16","r","Y","16"
+"IMUL r16, r/m16","IMULW r/m16, r16","imulw r/m16, r16","0F AF /r","V","V","","operand16","rw,r","Y","16"
+"IN AL, DX","INB DX, AL","inb DX, AL","EC","V","V","","","w,r","Y","8"
+"IN AL, imm8u","INB imm8u, AL","inb imm8u, AL","E4 ib","V","V","","","w,r","Y","8"
+"INC r/m8","INCB r/m8","incb r/m8","FE /0","V","V","","","rw","Y","8"
+"INC r/m8","INCB r/m8","incb r/m8","REX FE /0","N.E.","V","","pseudo64","rw","Y","8"
+"INC r/m32","INCL r/m32","incl r/m32","FF /0","V","V","","operand32","rw","Y","32"
+"INC r32op","INCL r32op","incl r32op","40+rd","V","N.S.","","operand32","rw","Y","32"
+"INC r/m64","INCQ r/m64","incq r/m64","REX.W FF /0","N.S.","V","","","rw","Y","64"
+"INCSSPD rmr32","INCSSPD rmr32","incsspd rmr32","F3 0F AE /5","V","V","CET","modrm_regonly,operand16,operand32","r","",""
+"INCSSPQ rmr64","INCSSPQ rmr64","incsspq rmr64","F3 REX.W 0F AE /5","N.S.","V","CET","modrm_regonly","r","",""
+"INC r/m16","INCW r/m16","incw r/m16","FF /0","V","V","","operand16","rw","Y","16"
+"INC r16op","INCW r16op","incw r16op","40+rw","V","N.S.","","operand16","rw","Y","16"
+"IN EAX, DX","INL DX, EAX","inl DX, EAX","ED","V","V","","operand32,operand64","w,r","Y","32"
+"IN EAX, imm8u","INL imm8u, EAX","inl imm8u, EAX","E5 ib","V","V","","operand32,operand64","w,r","Y","32"
+"INSB","INSB","insb","6C","V","V","","","","",""
+"INSERTPS xmm1, xmm2/m32, imm8u","INSERTPS imm8u, xmm2/m32, xmm1","insertps imm8u, xmm2/m32, xmm1","66 0F 3A 21 /r ib","V","V","SSE4_1","","rw,r,r","",""
+"INSERTQ xmm1, xmm2, imm8u, imm8u","INSERTQ imm8u, imm8u, xmm2, xmm1","insertq imm8u, imm8u, xmm2, xmm1","F2 0F 78 /r ib ib","V","V","SSE4a","amd,modrm_regonly","w,r,r,r","",""
+"INSERTQ xmm1, xmm2","INSERTQ xmm2, xmm1","insertq xmm2, xmm1","F2 0F 79 /r","V","V","SSE4a","amd,modrm_regonly","w,r","",""
+"INSD","INSL","insl","6D","V","V","","operand32,operand64","","",""
+"INSW","INSW","insw","6D","V","V","","operand16","","",""
+"INT 3","INT 3","int 3","CC","V","V","","","r","",""
+"INT imm8u","INT imm8u","int imm8u","CD ib","V","V","","","r","",""
+"INTO","INTO","into","CE","V","N.S.","","","","",""
+"INVD","INVD","invd","0F 08","V","V","486","","","",""
+"INVEPT r32, m128","INVEPT m128, r32","invept m128, r32","66 0F 38 80 /r","V","N.S.","VTX","modrm_memonly","r,r","",""
+"INVEPT r64, m128","INVEPT m128, r64","invept m128, r64","66 0F 38 80 /r","N.S.","V","VTX","default64,modrm_memonly","r,r","",""
+"INVLPG m","INVLPG m","invlpg m","0F 01 /7","V","V","486","modrm_memonly","r","",""
+"INVLPGA EAX, ECX","INVLPGAL ECX, EAX","invlpgal ECX, EAX","0F 01 DF","V","V","SVM","amd,modrm_regonly,operand32","r,r","Y","32"
+"INVLPGA RAX, ECX","INVLPGAQ ECX, RAX","invlpgaq ECX, RAX","REX.W 0F 01 DF","N.S.","V","SVM","amd,modrm_regonly","r,r","Y","64"
+"INVLPGA AX, ECX","INVLPGAW ECX, AX","invlpgaw ECX, AX","0F 01 DF","V","V","SVM","amd,modrm_regonly,operand16","r,r","Y","16"
+"INVPCID r32, m128","INVPCID m128, r32","invpcid m128, r32","66 0F 38 82 /r","V","N.S.","INVPCID","modrm_memonly","r,r","",""
+"INVPCID r64, m128","INVPCID m128, r64","invpcid m128, r64","66 0F 38 82 /r","N.S.","V","INVPCID","default64,modrm_memonly","r,r","",""
+"INVVPID r32, m128","INVVPID m128, r32","invvpid m128, r32","66 0F 38 81 /r","V","N.S.","VTX","modrm_memonly","r,r","",""
+"INVVPID r64, m128","INVVPID m128, r64","invvpid m128, r64","66 0F 38 81 /r","N.S.","V","VTX","default64,modrm_memonly","r,r","",""
+"IN AX, DX","INW DX, AX","inw DX, AX","ED","V","V","","operand16","w,r","Y","16"
+"IN AX, imm8u","INW imm8u, AX","inw imm8u, AX","E5 ib","V","V","","operand16","w,r","Y","16"
+"IRETD","IRETL","iretl","CF","V","V","","operand32","","",""
+"IRETQ","IRETQ","iretq","REX.W CF","N.S.","V","","","","",""
+"IRET","IRETW","iretw","CF","V","V","","operand16","","",""
+"JA rel16","JA rel16","ja rel16","0F 87 cw","V","N.S.","","operand16","r","",""
+"JA rel32","JA rel32","ja rel32","0F 87 cd","V","N.S.","","operand32","r","",""
+"JA rel32","JA rel32","ja rel32","0F 87 cd","N.S.","V","","default64","r","",""
+"JA rel8","JA rel8","ja rel8","77 cb","N.S.","V","","default64","r","",""
+"JA rel8","JA rel8","ja rel8","77 cb","V","N.S.","","","r","",""
+"JAE rel16","JAE rel16","jae rel16","0F 83 cw","V","N.S.","","operand16","r","",""
+"JAE rel32","JAE rel32","jae rel32","0F 83 cd","N.S.","V","","default64","r","",""
+"JAE rel32","JAE rel32","jae rel32","0F 83 cd","V","N.S.","","operand32","r","",""
+"JAE rel8","JAE rel8","jae rel8","73 cb","V","N.S.","","","r","",""
+"JAE rel8","JAE rel8","jae rel8","73 cb","N.S.","V","","default64","r","",""
+"JB rel16","JB rel16","jb rel16","0F 82 cw","V","N.S.","","operand16","r","",""
+"JB rel32","JB rel32","jb rel32","0F 82 cd","V","N.S.","","operand32","r","",""
+"JB rel32","JB rel32","jb rel32","0F 82 cd","N.S.","V","","default64","r","",""
+"JB rel8","JB rel8","jb rel8","72 cb","N.S.","V","","default64","r","",""
+"JB rel8","JB rel8","jb rel8","72 cb","V","N.S.","","","r","",""
+"JBE rel16","JBE rel16","jbe rel16","0F 86 cw","V","N.S.","","operand16","r","",""
+"JBE rel32","JBE rel32","jbe rel32","0F 86 cd","V","N.S.","","operand32","r","",""
+"JBE rel32","JBE rel32","jbe rel32","0F 86 cd","N.S.","V","","default64","r","",""
+"JBE rel8","JBE rel8","jbe rel8","76 cb","V","N.S.","","","r","",""
+"JBE rel8","JBE rel8","jbe rel8","76 cb","N.S.","V","","default64","r","",""
+"JC rel16","JC rel16","jc rel16","0F 82 cw","V","N.S.","","pseudo","r","",""
+"JC rel32","JC rel32","jc rel32","0F 82 cd","V","V","","pseudo","r","",""
+"JC rel8","JC rel8","jc rel8","72 cb","V","V","","pseudo","r","",""
+"JCXZ rel8","JCXZ rel8","jcxz rel8","E3 cb","V","N.S.","","address16","r","",""
+"JE rel16","JE rel16","je rel16","0F 84 cw","V","N.S.","","operand16","r","",""
+"JE rel32","JE rel32","je rel32","0F 84 cd","V","N.S.","","operand32","r","",""
+"JE rel32","JE rel32","je rel32","0F 84 cd","N.S.","V","","default64","r","",""
+"JE rel8","JE rel8","je rel8","74 cb","N.S.","V","","default64","r","",""
+"JE rel8","JE rel8","je rel8","74 cb","V","N.S.","","","r","",""
+"JECXZ rel8","JECXZ rel8","jecxz rel8","E3 cb","V","V","","address32","r","",""
+"JG rel16","JG rel16","jg rel16","0F 8F cw","V","N.S.","","operand16","r","",""
+"JG rel32","JG rel32","jg rel32","0F 8F cd","N.S.","V","","default64","r","",""
+"JG rel32","JG rel32","jg rel32","0F 8F cd","V","N.S.","","operand32","r","",""
+"JG rel8","JG rel8","jg rel8","7F cb","V","N.S.","","","r","",""
+"JG rel8","JG rel8","jg rel8","7F cb","N.S.","V","","default64","r","",""
+"JGE rel16","JGE rel16","jge rel16","0F 8D cw","V","N.S.","","operand16","r","",""
+"JGE rel32","JGE rel32","jge rel32","0F 8D cd","V","N.S.","","operand32","r","",""
+"JGE rel32","JGE rel32","jge rel32","0F 8D cd","N.S.","V","","default64","r","",""
+"JGE rel8","JGE rel8","jge rel8","7D cb","N.S.","V","","default64","r","",""
+"JGE rel8","JGE rel8","jge rel8","7D cb","V","N.S.","","","r","",""
+"JL rel16","JL rel16","jl rel16","0F 8C cw","V","N.S.","","operand16","r","",""
+"JL rel32","JL rel32","jl rel32","0F 8C cd","V","N.S.","","operand32","r","",""
+"JL rel32","JL rel32","jl rel32","0F 8C cd","N.S.","V","","default64","r","",""
+"JL rel8","JL rel8","jl rel8","7C cb","V","N.S.","","","r","",""
+"JL rel8","JL rel8","jl rel8","7C cb","N.S.","V","","default64","r","",""
+"JLE rel16","JLE rel16","jle rel16","0F 8E cw","V","N.S.","","operand16","r","",""
+"JLE rel32","JLE rel32","jle rel32","0F 8E cd","V","N.S.","","operand32","r","",""
+"JLE rel32","JLE rel32","jle rel32","0F 8E cd","N.S.","V","","default64","r","",""
+"JLE rel8","JLE rel8","jle rel8","7E cb","N.S.","V","","default64","r","",""
+"JLE rel8","JLE rel8","jle rel8","7E cb","V","N.S.","","","r","",""
+"JMP rel16","JMP rel16","jmp rel16","E9 cw","V","N.S.","","operand16","r","Y",""
+"JMP rel32","JMP rel32","jmp rel32","E9 cd","N.S.","V","","default64","r","Y",""
+"JMP rel32","JMP rel32","jmp rel32","E9 cd","V","N.S.","","operand32","r","Y",""
+"JMP rel8","JMP rel8","jmp rel8","EB cb","N.S.","V","","default64","r","Y",""
+"JMP rel8","JMP rel8","jmp rel8","EB cb","V","N.S.","","","r","Y",""
+"JMP r/m32","JMPL* r/m32","jmpl* r/m32","FF /4","V","N.S.","","operand32","r","Y","32"
+"JMP r/m64","JMPQ* r/m64","jmpq* r/m64","FF /4","N.S.","V","","","r","Y","64"
+"JMP r/m16","JMPW* r/m16","jmpw* r/m16","FF /4","V","N.S.","","operand16","r","Y","16"
+"JNA rel16","JNA rel16","jna rel16","0F 86 cw","V","N.S.","","pseudo","r","",""
+"JNA rel32","JNA rel32","jna rel32","0F 86 cd","V","V","","pseudo","r","",""
+"JNA rel8","JNA rel8","jna rel8","76 cb","V","V","","pseudo","r","",""
+"JNAE rel16","JNAE rel16","jnae rel16","0F 82 cw","V","N.S.","","pseudo","r","",""
+"JNAE rel32","JNAE rel32","jnae rel32","0F 82 cd","V","V","","pseudo","r","",""
+"JNAE rel8","JNAE rel8","jnae rel8","72 cb","V","V","","pseudo","r","",""
+"JNB rel16","JNB rel16","jnb rel16","0F 83 cw","V","N.S.","","pseudo","r","",""
+"JNB rel32","JNB rel32","jnb rel32","0F 83 cd","V","V","","pseudo","r","",""
+"JNB rel8","JNB rel8","jnb rel8","73 cb","V","V","","pseudo","r","",""
+"JNBE rel16","JNBE rel16","jnbe rel16","0F 87 cw","V","N.S.","","pseudo","r","",""
+"JNBE rel32","JNBE rel32","jnbe rel32","0F 87 cd","V","V","","pseudo","r","",""
+"JNBE rel8","JNBE rel8","jnbe rel8","77 cb","V","V","","pseudo","r","",""
+"JNC rel16","JNC rel16","jnc rel16","0F 83 cw","V","N.S.","","pseudo","r","",""
+"JNC rel32","JNC rel32","jnc rel32","0F 83 cd","V","V","","pseudo","r","",""
+"JNC rel8","JNC rel8","jnc rel8","73 cb","V","V","","pseudo","r","",""
+"JNE rel16","JNE rel16","jne rel16","0F 85 cw","V","N.S.","","operand16","r","",""
+"JNE rel32","JNE rel32","jne rel32","0F 85 cd","N.S.","V","","default64","r","",""
+"JNE rel32","JNE rel32","jne rel32","0F 85 cd","V","N.S.","","operand32","r","",""
+"JNE rel8","JNE rel8","jne rel8","75 cb","V","N.S.","","","r","",""
+"JNE rel8","JNE rel8","jne rel8","75 cb","N.S.","V","","default64","r","",""
+"JNG rel16","JNG rel16","jng rel16","0F 8E cw","V","N.S.","","pseudo","r","",""
+"JNG rel32","JNG rel32","jng rel32","0F 8E cd","V","V","","pseudo","r","",""
+"JNG rel8","JNG rel8","jng rel8","7E cb","V","V","","pseudo","r","",""
+"JNGE rel16","JNGE rel16","jnge rel16","0F 8C cw","V","N.S.","","pseudo","r","",""
+"JNGE rel32","JNGE rel32","jnge rel32","0F 8C cd","V","V","","pseudo","r","",""
+"JNGE rel8","JNGE rel8","jnge rel8","7C cb","V","V","","pseudo","r","",""
+"JNL rel16","JNL rel16","jnl rel16","0F 8D cw","V","N.S.","","pseudo","r","",""
+"JNL rel32","JNL rel32","jnl rel32","0F 8D cd","V","V","","pseudo","r","",""
+"JNL rel8","JNL rel8","jnl rel8","7D cb","V","V","","pseudo","r","",""
+"JNLE rel16","JNLE rel16","jnle rel16","0F 8F cw","V","N.S.","","pseudo","r","",""
+"JNLE rel32","JNLE rel32","jnle rel32","0F 8F cd","V","V","","pseudo","r","",""
+"JNLE rel8","JNLE rel8","jnle rel8","7F cb","V","V","","pseudo","r","",""
+"JNO rel16","JNO rel16","jno rel16","0F 81 cw","V","N.S.","","operand16","r","",""
+"JNO rel32","JNO rel32","jno rel32","0F 81 cd","V","N.S.","","operand32","r","",""
+"JNO rel32","JNO rel32","jno rel32","0F 81 cd","N.S.","V","","default64","r","",""
+"JNO rel8","JNO rel8","jno rel8","71 cb","V","N.S.","","","r","",""
+"JNO rel8","JNO rel8","jno rel8","71 cb","N.S.","V","","default64","r","",""
+"JNP rel16","JNP rel16","jnp rel16","0F 8B cw","V","N.S.","","operand16","r","",""
+"JNP rel32","JNP rel32","jnp rel32","0F 8B cd","V","N.S.","","operand32","r","",""
+"JNP rel32","JNP rel32","jnp rel32","0F 8B cd","N.S.","V","","default64","r","",""
+"JNP rel8","JNP rel8","jnp rel8","7B cb","N.S.","V","","default64","r","",""
+"JNP rel8","JNP rel8","jnp rel8","7B cb","V","N.S.","","","r","",""
+"JNS rel16","JNS rel16","jns rel16","0F 89 cw","V","N.S.","","operand16","r","",""
+"JNS rel32","JNS rel32","jns rel32","0F 89 cd","N.S.","V","","default64","r","",""
+"JNS rel32","JNS rel32","jns rel32","0F 89 cd","V","N.S.","","operand32","r","",""
+"JNS rel8","JNS rel8","jns rel8","79 cb","V","N.S.","","","r","",""
+"JNS rel8","JNS rel8","jns rel8","79 cb","N.S.","V","","default64","r","",""
+"JNZ rel16","JNZ rel16","jnz rel16","0F 85 cw","V","N.S.","","pseudo","r","",""
+"JNZ rel32","JNZ rel32","jnz rel32","0F 85 cd","V","V","","pseudo","r","",""
+"JNZ rel8","JNZ rel8","jnz rel8","75 cb","V","V","","pseudo","r","",""
+"JO rel16","JO rel16","jo rel16","0F 80 cw","V","N.S.","","operand16","r","",""
+"JO rel32","JO rel32","jo rel32","0F 80 cd","V","N.S.","","operand32","r","",""
+"JO rel32","JO rel32","jo rel32","0F 80 cd","N.S.","V","","default64","r","",""
+"JO rel8","JO rel8","jo rel8","70 cb","V","N.S.","","","r","",""
+"JO rel8","JO rel8","jo rel8","70 cb","N.S.","V","","default64","r","",""
+"JP rel16","JP rel16","jp rel16","0F 8A cw","V","N.S.","","operand16","r","",""
+"JP rel32","JP rel32","jp rel32","0F 8A cd","N.S.","V","","default64","r","",""
+"JP rel32","JP rel32","jp rel32","0F 8A cd","V","N.S.","","operand32","r","",""
+"JP rel8","JP rel8","jp rel8","7A cb","N.S.","V","","default64","r","",""
+"JP rel8","JP rel8","jp rel8","7A cb","V","N.S.","","","r","",""
+"JPE rel16","JPE rel16","jpe rel16","0F 8A cw","V","N.S.","","pseudo","r","",""
+"JPE rel32","JPE rel32","jpe rel32","0F 8A cd","V","V","","pseudo","r","",""
+"JPE rel8","JPE rel8","jpe rel8","7A cb","V","V","","pseudo","r","",""
+"JPO rel16","JPO rel16","jpo rel16","0F 8B cw","V","N.S.","","pseudo","r","",""
+"JPO rel32","JPO rel32","jpo rel32","0F 8B cd","V","V","","pseudo","r","",""
+"JPO rel8","JPO rel8","jpo rel8","7B cb","V","V","","pseudo","r","",""
+"JRCXZ rel8","JRCXZ rel8","jrcxz rel8","E3 cb","N.S.","V","","address64","r","",""
+"JS rel16","JS rel16","js rel16","0F 88 cw","V","N.S.","","operand16","r","",""
+"JS rel32","JS rel32","js rel32","0F 88 cd","V","N.S.","","operand32","r","",""
+"JS rel32","JS rel32","js rel32","0F 88 cd","N.S.","V","","default64","r","",""
+"JS rel8","JS rel8","js rel8","78 cb","V","N.S.","","","r","",""
+"JS rel8","JS rel8","js rel8","78 cb","N.S.","V","","default64","r","",""
+"JZ rel16","JZ rel16","jz rel16","0F 84 cw","V","N.S.","","operand16,pseudo","r","",""
+"JZ rel32","JZ rel32","jz rel32","0F 84 cd","V","V","","operand32,pseudo","r","",""
+"JZ rel8","JZ rel8","jz rel8","74 cb","V","V","","pseudo","r","",""
+"KADDB k1, kV, k2","KADDB k2, kV, k1","kaddb k2, kV, k1","VEX.NDS.256.66.0F.W0 4A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KADDD k1, kV, k2","KADDD k2, kV, k1","kaddd k2, kV, k1","VEX.NDS.256.66.0F.W1 4A /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KADDQ k1, kV, k2","KADDQ k2, kV, k1","kaddq k2, kV, k1","VEX.NDS.256.0F.W1 4A /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KADDW k1, kV, k2","KADDW k2, kV, k1","kaddw k2, kV, k1","VEX.NDS.256.0F.W0 4A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KANDB k1, kV, k2","KANDB k2, kV, k1","kandb k2, kV, k1","VEX.NDS.256.66.0F.W0 41 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KANDD k1, kV, k2","KANDD k2, kV, k1","kandd k2, kV, k1","VEX.NDS.256.66.0F.W1 41 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KANDNB k1, kV, k2","KANDNB k2, kV, k1","kandnb k2, kV, k1","VEX.NDS.256.66.0F.W0 42 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KANDND k1, kV, k2","KANDND k2, kV, k1","kandnd k2, kV, k1","VEX.NDS.256.66.0F.W1 42 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KANDNQ k1, kV, k2","KANDNQ k2, kV, k1","kandnq k2, kV, k1","VEX.NDS.256.0F.W1 42 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KANDNW k1, kV, k2","KANDNW k2, kV, k1","kandnw k2, kV, k1","VEX.NDS.256.0F.W0 42 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KANDQ k1, kV, k2","KANDQ k2, kV, k1","kandq k2, kV, k1","VEX.NDS.256.0F.W1 41 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KANDW k1, kV, k2","KANDW k2, kV, k1","kandw k2, kV, k1","VEX.NDS.256.0F.W0 41 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KMOVB m8, k1","KMOVB k1, m8","kmovb k1, m8","VEX.128.66.0F.W0 91 /r","V","V","AVX512DQ","modrm_memonly","w,r","",""
+"KMOVB r32, k2","KMOVB k2, r32","kmovb k2, r32","VEX.128.66.0F.W0 93 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"KMOVB k1, k2/m8","KMOVB k2/m8, k1","kmovb k2/m8, k1","VEX.128.66.0F.W0 90 /r","V","V","AVX512DQ","","w,r","",""
+"KMOVB k1, rmr32","KMOVB rmr32, k1","kmovb rmr32, k1","VEX.128.66.0F.W0 92 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"KMOVD m32, k1","KMOVD k1, m32","kmovd k1, m32","VEX.128.66.0F.W1 91 /r","V","V","AVX512BW","modrm_memonly","w,r","",""
+"KMOVD r32, k2","KMOVD k2, r32","kmovd k2, r32","VEX.128.F2.0F.W0 93 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"KMOVD k1, k2/m32","KMOVD k2/m32, k1","kmovd k2/m32, k1","VEX.128.66.0F.W1 90 /r","V","V","AVX512BW","","w,r","",""
+"KMOVD k1, rmr32","KMOVD rmr32, k1","kmovd rmr32, k1","VEX.128.F2.0F.W0 92 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"KMOVQ m64, k1","KMOVQ k1, m64","kmovq k1, m64","VEX.128.0F.W1 91 /r","V","V","AVX512BW","modrm_memonly","w,r","",""
+"KMOVQ r64, k2","KMOVQ k2, r64","kmovq k2, r64","VEX.128.F2.0F.W1 93 /r","N.S.","V","AVX512BW","modrm_regonly","w,r","",""
+"KMOVQ k1, k2/m64","KMOVQ k2/m64, k1","kmovq k2/m64, k1","VEX.128.0F.W1 90 /r","V","V","AVX512BW","","w,r","",""
+"KMOVQ k1, rmr64","KMOVQ rmr64, k1","kmovq rmr64, k1","VEX.128.F2.0F.W1 92 /r","N.S.","V","AVX512BW","modrm_regonly","w,r","",""
+"KMOVW m16, k1","KMOVW k1, m16","kmovw k1, m16","VEX.128.0F.W0 91 /r","V","V","AVX512F","modrm_memonly","w,r","",""
+"KMOVW r32, k2","KMOVW k2, r32","kmovw k2, r32","VEX.128.0F.W0 93 /r","V","V","AVX512F","modrm_regonly","w,r","",""
+"KMOVW k1, k2/m16","KMOVW k2/m16, k1","kmovw k2/m16, k1","VEX.128.0F.W0 90 /r","V","V","AVX512F","","w,r","",""
+"KMOVW k1, rmr32","KMOVW rmr32, k1","kmovw rmr32, k1","VEX.128.0F.W0 92 /r","V","V","AVX512F","modrm_regonly","w,r","",""
+"KNOTB k1, k2","KNOTB k2, k1","knotb k2, k1","VEX.128.66.0F.W0 44 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"KNOTD k1, k2","KNOTD k2, k1","knotd k2, k1","VEX.128.66.0F.W1 44 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"KNOTQ k1, k2","KNOTQ k2, k1","knotq k2, k1","VEX.128.0F.W1 44 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"KNOTW k1, k2","KNOTW k2, k1","knotw k2, k1","VEX.128.0F.W0 44 /r","V","V","AVX512F","modrm_regonly","w,r","",""
+"KORB k1, kV, k2","KORB k2, kV, k1","korb k2, kV, k1","VEX.NDS.256.66.0F.W0 45 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KORD k1, kV, k2","KORD k2, kV, k1","kord k2, kV, k1","VEX.NDS.256.66.0F.W1 45 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KORQ k1, kV, k2","KORQ k2, kV, k1","korq k2, kV, k1","VEX.NDS.256.0F.W1 45 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KORTESTB k1, k2","KORTESTB k2, k1","kortestb k2, k1","VEX.128.66.0F.W0 98 /r","V","V","AVX512DQ","modrm_regonly","r,r","",""
+"KORTESTD k1, k2","KORTESTD k2, k1","kortestd k2, k1","VEX.128.66.0F.W1 98 /r","V","V","AVX512BW","modrm_regonly","r,r","",""
+"KORTESTQ k1, k2","KORTESTQ k2, k1","kortestq k2, k1","VEX.128.0F.W1 98 /r","V","V","AVX512BW","modrm_regonly","r,r","",""
+"KORTESTW k1, k2","KORTESTW k2, k1","kortestw k2, k1","VEX.128.0F.W0 98 /r","V","V","AVX512F","modrm_regonly","r,r","",""
+"KORW k1, kV, k2","KORW k2, kV, k1","korw k2, kV, k1","VEX.NDS.256.0F.W0 45 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KSHIFTLB k1, k2, imm8u","KSHIFTLB imm8u, k2, k1","kshiftlb imm8u, k2, k1","VEX.128.66.0F3A.W0 32 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KSHIFTLD k1, k2, imm8u","KSHIFTLD imm8u, k2, k1","kshiftld imm8u, k2, k1","VEX.128.66.0F3A.W0 33 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KSHIFTLQ k1, k2, imm8u","KSHIFTLQ imm8u, k2, k1","kshiftlq imm8u, k2, k1","VEX.128.66.0F3A.W1 33 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KSHIFTLW k1, k2, imm8u","KSHIFTLW imm8u, k2, k1","kshiftlw imm8u, k2, k1","VEX.128.66.0F3A.W1 32 /r ib","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KSHIFTRB k1, k2, imm8u","KSHIFTRB imm8u, k2, k1","kshiftrb imm8u, k2, k1","VEX.128.66.0F3A.W0 30 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KSHIFTRD k1, k2, imm8u","KSHIFTRD imm8u, k2, k1","kshiftrd imm8u, k2, k1","VEX.128.66.0F3A.W0 31 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KSHIFTRQ k1, k2, imm8u","KSHIFTRQ imm8u, k2, k1","kshiftrq imm8u, k2, k1","VEX.128.66.0F3A.W1 31 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KSHIFTRW k1, k2, imm8u","KSHIFTRW imm8u, k2, k1","kshiftrw imm8u, k2, k1","VEX.128.66.0F3A.W1 30 /r ib","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KTESTB k1, k2","KTESTB k2, k1","ktestb k2, k1","VEX.128.66.0F.W0 99 /r","V","V","AVX512DQ","modrm_regonly","r,r","",""
+"KTESTD k1, k2","KTESTD k2, k1","ktestd k2, k1","VEX.128.66.0F.W1 99 /r","V","V","AVX512BW","modrm_regonly","r,r","",""
+"KTESTQ k1, k2","KTESTQ k2, k1","ktestq k2, k1","VEX.128.0F.W1 99 /r","V","V","AVX512BW","modrm_regonly","r,r","",""
+"KTESTW k1, k2","KTESTW k2, k1","ktestw k2, k1","VEX.128.0F.W0 99 /r","V","V","AVX512DQ","modrm_regonly","r,r","",""
+"KUNPCKBW k1, kV, k2","KUNPCKBW k2, kV, k1","kunpckbw k2, kV, k1","VEX.NDS.256.66.0F.W0 4B /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KUNPCKDQ k1, kV, k2","KUNPCKDQ k2, kV, k1","kunpckdq k2, kV, k1","VEX.NDS.256.0F.W1 4B /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KUNPCKWD k1, kV, k2","KUNPCKWD k2, kV, k1","kunpckwd k2, kV, k1","VEX.NDS.256.0F.W0 4B /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KXNORB k1, kV, k2","KXNORB k2, kV, k1","kxnorb k2, kV, k1","VEX.NDS.256.66.0F.W0 46 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KXNORD k1, kV, k2","KXNORD k2, kV, k1","kxnord k2, kV, k1","VEX.NDS.256.66.0F.W1 46 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KXNORQ k1, kV, k2","KXNORQ k2, kV, k1","kxnorq k2, kV, k1","VEX.NDS.256.0F.W1 46 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KXNORW k1, kV, k2","KXNORW k2, kV, k1","kxnorw k2, kV, k1","VEX.NDS.256.0F.W0 46 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KXORB k1, kV, k2","KXORB k2, kV, k1","kxorb k2, kV, k1","VEX.NDS.256.66.0F.W0 47 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KXORD k1, kV, k2","KXORD k2, kV, k1","kxord k2, kV, k1","VEX.NDS.256.66.0F.W1 47 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KXORQ k1, kV, k2","KXORQ k2, kV, k1","kxorq k2, kV, k1","VEX.NDS.256.0F.W1 47 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KXORW k1, kV, k2","KXORW k2, kV, k1","kxorw k2, kV, k1","VEX.NDS.256.0F.W0 47 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"LAHF","LAHF","lahf","9F","V","V","LAHFSAHF","","","",""
+"LAR r32, r32/m16","LARL r32/m16, r32","larl r32/m16, r32","0F 02 /r","V","V","","operand32","rw,r","Y","32"
+"LAR r64, r64/m16","LARQ r64/m16, r64","larq r64/m16, r64","REX.W 0F 02 /r","N.S.","V","","","rw,r","Y","64"
+"LAR r16, r/m16","LARW r/m16, r16","larw r/m16, r16","0F 02 /r","V","V","","operand16","rw,r","Y","16"
+"CALL_FAR ptr16:32","LCALLL ptr16:32","lcalll ptr16:32","9A cd iw","V","N.S.","","operand32","r","Y",""
+"CALL_FAR m16:32","LCALLL* m16:32","lcalll* m16:32","FF /3","V","V","","modrm_memonly,operand32","r","Y",""
+"CALL_FAR m16:64","LCALLQ* m16:64","lcallq* m16:64","REX.W FF /3","N.S.","V","","modrm_memonly","r","Y",""
+"CALL_FAR ptr16:16","LCALLW ptr16:16","lcallw ptr16:16","9A cw iw","V","N.S.","","operand16","r","Y",""
+"CALL_FAR m16:16","LCALLW* m16:16","lcallw* m16:16","FF /3","V","V","","modrm_memonly,operand16","r","Y",""
+"LDDQU xmm1, m128","LDDQU m128, xmm1","lddqu m128, xmm1","F2 0F F0 /r","V","V","SSE3","modrm_memonly","w,r","",""
+"LDMXCSR m32","LDMXCSR m32","ldmxcsr m32","0F AE /2","V","V","SSE","modrm_memonly","r","",""
+"LDS r32, m16:32","LDSL m16:32, r32","ldsl m16:32, r32","C5 /r","V","N.S.","","modrm_memonly,operand32","w,r","Y","32"
+"LDS r16, m16:16","LDSW m16:16, r16","ldsw m16:16, r16","C5 /r","V","N.S.","","modrm_memonly,operand16","w,r","Y","16"
+"LEA r32, m","LEAL m, r32","leal m, r32","8D /r","V","V","","modrm_memonly,operand32","w,r","Y","32"
+"LEA r64, m","LEAQ m, r64","leaq m, r64","REX.W 8D /r","N.S.","V","","modrm_memonly","w,r","Y","64"
+"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","N.S.","V","","default64","","Y",""
+"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","V","N.S.","","operand32","","Y",""
+"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","V","V","","operand16","","Y",""
+"LEA r16, m","LEAW m, r16","leaw m, r16","8D /r","V","V","","modrm_memonly,operand16","w,r","Y","16"
+"LES r32, m16:32","LESL m16:32, r32","lesl m16:32, r32","C4 /r","V","N.S.","","modrm_memonly,operand32","w,r","Y","32"
+"LES r16, m16:16","LESW m16:16, r16","lesw m16:16, r16","C4 /r","V","N.S.","","modrm_memonly,operand16","w,r","Y","16"
+"LFENCE","LFENCE","lfence","0F AE /5","V","V","SSE2","","","",""
+"LFS r32, m16:32","LFSL m16:32, r32","lfsl m16:32, r32","0F B4 /r","V","V","","modrm_memonly,operand32","w,r","Y","32"
+"LFS r64, m16:64","LFSQ m16:64, r64","lfsq m16:64, r64","REX.W 0F B4 /r","N.S.","V","","modrm_memonly","w,r","Y","64"
+"LFS r16, m16:16","LFSW m16:16, r16","lfsw m16:16, r16","0F B4 /r","V","V","","modrm_memonly,operand16","w,r","Y","16"
+"LGDT m16&64","LGDT m16&64","lgdt m16&64","0F 01 /2","N.S.","V","","default64,modrm_memonly","r","",""
+"LGDT m16&32","LGDTW/LGDTL m16&32","lgdtw/lgdtl m16&32","0F 01 /2","V","N.S.","","modrm_memonly","r","",""
+"LGS r32, m16:32","LGSL m16:32, r32","lgsl m16:32, r32","0F B5 /r","V","V","","modrm_memonly,operand32","w,r","Y","32"
+"LGS r64, m16:64","LGSQ m16:64, r64","lgsq m16:64, r64","REX.W 0F B5 /r","N.S.","V","","modrm_memonly","w,r","Y","64"
+"LGS r16, m16:16","LGSW m16:16, r16","lgsw m16:16, r16","0F B5 /r","V","V","","modrm_memonly,operand16","w,r","Y","16"
+"LIDT m16&64","LIDT m16&64","lidt m16&64","0F 01 /3","N.S.","V","","default64,modrm_memonly","r","",""
+"LIDT m16&32","LIDTW/LIDTL m16&32","lidtw/lidtl m16&32","0F 01 /3","V","N.S.","","modrm_memonly","r","",""
+"JMP_FAR ptr16:32","LJMPL ptr16:32","ljmpl ptr16:32","EA cd iw","V","N.S.","","operand32","r","Y",""
+"JMP_FAR m16:32","LJMPL* m16:32","ljmpl* m16:32","FF /5","V","V","","modrm_memonly,operand32","r","Y",""
+"JMP_FAR m16:64","LJMPQ* m16:64","ljmpq* m16:64","REX.W FF /5","N.S.","V","","modrm_memonly","r","Y",""
+"JMP_FAR ptr16:16","LJMPW ptr16:16","ljmpw ptr16:16","EA cw iw","V","N.S.","","operand16","r","Y",""
+"JMP_FAR m16:16","LJMPW* m16:16","ljmpw* m16:16","FF /5","V","V","","modrm_memonly,operand16","r","Y",""
+"LLDT r/m16","LLDT r/m16","lldt r/m16","0F 00 /2","V","V","","","r","",""
+"LLWPCB rmr32","LLWPCBL rmr32","llwpcbl rmr32","XOP.128.09.W0 12 /0","V","V","XOP","amd,modrm_regonly,operand16,operand32","w","Y","32"
+"LLWPCB rmr64","LLWPCBQ rmr64","llwpcbq rmr64","XOP.128.09.W0 12 /0","N.S.","V","XOP","amd,modrm_regonly,operand64","w","Y","64"
+"LMSW r/m16","LMSW r/m16","lmsw r/m16","0F 01 /6","V","V","","","r","",""
+"LOCK","LOCK","lock","F0","V","V","","pseudo","","",""
+"LODSB","LODSB","lodsb","AC","V","V","","","","",""
+"LODSD","LODSL","lodsl","AD","V","V","","operand32","","",""
+"LODSQ","LODSQ","lodsq","REX.W AD","N.S.","V","","","","",""
+"LODSW","LODSW","lodsw","AD","V","V","","operand16","","",""
+"LOOP rel8","LOOP rel8","loop rel8","E2 cb","V","V","","","r","",""
+"LOOPE rel8","LOOPEQ rel8","loope rel8","E1 cb","V","V","","","r","",""
+"LOOPNE rel8","LOOPNE rel8","loopne rel8","E0 cb","V","V","","","r","",""
+"LSL r32, r32/m16","LSLL r32/m16, r32","lsll r32/m16, r32","0F 03 /r","V","V","","operand32","rw,r","Y","32"
+"LSL r64, r32/m16","LSLQ r32/m16, r64","lslq r32/m16, r64","REX.W 0F 03 /r","N.S.","V","","","rw,r","Y","64"
+"LSL r16, r/m16","LSLW r/m16, r16","lslw r/m16, r16","0F 03 /r","V","V","","operand16","rw,r","Y","16"
+"LSS r32, m16:32","LSSL m16:32, r32","lssl m16:32, r32","0F B2 /r","V","V","","modrm_memonly,operand32","w,r","Y","32"
+"LSS r64, m16:64","LSSQ m16:64, r64","lssq m16:64, r64","REX.W 0F B2 /r","N.S.","V","","modrm_memonly","w,r","Y","64"
+"LSS r16, m16:16","LSSW m16:16, r16","lssw m16:16, r16","0F B2 /r","V","V","","modrm_memonly,operand16","w,r","Y","16"
+"LTR r/m16","LTR r/m16","ltr r/m16","0F 00 /3","V","V","","","r","",""
+"LWPINS r32V, r/m32, imm32u","LWPINS imm32u, r/m32, r32V","lwpins imm32u, r/m32, r32V","XOP.NDD.128.0A.W0 12 /0","V","V","XOP","amd,operand16,operand32","w,r,r","",""
+"LWPINS r64V, r64/m32, imm32u","LWPINS imm32u, r64/m32, r64V","lwpins imm32u, r64/m32, r64V","XOP.NDD.128.0A.W0 12 /0","N.S.","V","XOP","amd,operand64","w,r,r","",""
+"LWPVAL r32V, r/m32, imm32u","LWPVAL imm32u, r/m32, r32V","lwpval imm32u, r/m32, r32V","XOP.NDD.128.0A.W0 12 /1","V","V","XOP","amd,operand16,operand32","w,r,r","",""
+"LWPVAL r64V, r64/m32, imm32u","LWPVAL imm32u, r64/m32, r64V","lwpval imm32u, r64/m32, r64V","XOP.NDD.128.0A.W0 12 /1","N.S.","V","XOP","amd,operand64","w,r,r","",""
+"LZCNT r32, r/m32","LZCNTL r/m32, r32","lzcntl r/m32, r32","F3 0F BD /r","V","V","LZCNT","operand32","w,r","Y","32"
+"LZCNT r32, r/m32","LZCNTL r/m32, r32","lzcntl r/m32, r32","F3 0F BD /r","V","V","AMD","amd,operand32","w,r","Y","32"
+"LZCNT r64, r/m64","LZCNTQ r/m64, r64","lzcntq r/m64, r64","F3 REX.W 0F BD /r","N.S.","V","AMD","amd","w,r","Y","64"
+"LZCNT r64, r/m64","LZCNTQ r/m64, r64","lzcntq r/m64, r64","F3 REX.W 0F BD /r","N.S.","V","LZCNT","","w,r","Y","64"
+"LZCNT r16, r/m16","LZCNTW r/m16, r16","lzcntw r/m16, r16","F3 0F BD /r","V","V","AMD","amd,operand16","w,r","Y","16"
+"LZCNT r16, r/m16","LZCNTW r/m16, r16","lzcntw r/m16, r16","F3 0F BD /r","V","V","LZCNT","operand16","w,r","Y","16"
+"MASKMOVDQU xmm1, xmm2","MASKMOVOU xmm2, xmm1","maskmovdqu xmm2, xmm1","66 0F F7 /r","V","V","SSE2","modrm_regonly","r,r","",""
+"MASKMOVQ mm1, mm2","MASKMOVQ mm2, mm1","maskmovq mm2, mm1","0F F7 /r","V","V","MMX","modrm_regonly","r,r","",""
+"MAXPD xmm1, xmm2/m128","MAXPD xmm2/m128, xmm1","maxpd xmm2/m128, xmm1","66 0F 5F /r","V","V","SSE2","","rw,r","",""
+"MAXPS xmm1, xmm2/m128","MAXPS xmm2/m128, xmm1","maxps xmm2/m128, xmm1","0F 5F /r","V","V","SSE","","rw,r","",""
+"MAXSD xmm1, xmm2/m64","MAXSD xmm2/m64, xmm1","maxsd xmm2/m64, xmm1","F2 0F 5F /r","V","V","SSE2","","rw,r","",""
+"MAXSS xmm1, xmm2/m32","MAXSS xmm2/m32, xmm1","maxss xmm2/m32, xmm1","F3 0F 5F /r","V","V","SSE","","rw,r","",""
+"MFENCE","MFENCE","mfence","0F AE /6","V","V","SSE2","","","",""
+"MINPD xmm1, xmm2/m128","MINPD xmm2/m128, xmm1","minpd xmm2/m128, xmm1","66 0F 5D /r","V","V","SSE2","","rw,r","",""
+"MINPS xmm1, xmm2/m128","MINPS xmm2/m128, xmm1","minps xmm2/m128, xmm1","0F 5D /r","V","V","SSE","","rw,r","",""
+"MINSD xmm1, xmm2/m64","MINSD xmm2/m64, xmm1","minsd xmm2/m64, xmm1","F2 0F 5D /r","V","V","SSE2","","rw,r","",""
+"MINSS xmm1, xmm2/m32","MINSS xmm2/m32, xmm1","minss xmm2/m32, xmm1","F3 0F 5D /r","V","V","SSE","","rw,r","",""
+"MONITOR","MONITOR","monitor","0F 01 C8","V","V","MONITOR","","","",""
+"MOVAPD xmm2/m128, xmm1","MOVAPD xmm1, xmm2/m128","movapd xmm1, xmm2/m128","66 0F 29 /r","V","V","SSE2","","w,r","",""
+"MOVAPD xmm1, xmm2/m128","MOVAPD xmm2/m128, xmm1","movapd xmm2/m128, xmm1","66 0F 28 /r","V","V","SSE2","","w,r","",""
+"MOVAPS xmm2/m128, xmm1","MOVAPS xmm1, xmm2/m128","movaps xmm1, xmm2/m128","0F 29 /r","V","V","SSE","","w,r","",""
+"MOVAPS xmm1, xmm2/m128","MOVAPS xmm2/m128, xmm1","movaps xmm2/m128, xmm1","0F 28 /r","V","V","SSE","","w,r","",""
+"MOV r/m8, imm8u","MOVB imm8u, r/m8","movb imm8u, r/m8","C6 /0 ib","V","V","","","w,r","Y","8"
+"MOV r/m8, imm8u","MOVB imm8u, r/m8","movb imm8u, r/m8","REX C6 /0 ib","N.E.","V","","pseudo64","w,r","Y","8"
+"MOV r8op, imm8u","MOVB imm8u, r8op","movb imm8u, r8op","B0+rb ib","V","V","","","w,r","Y","8"
+"MOV r8op, imm8u","MOVB imm8u, r8op","movb imm8u, r8op","REX B0+rb ib","N.E.","V","","pseudo64","w,r","Y","8"
+"MOV r8, r/m8","MOVB r/m8, r8","movb r/m8, r8","8A /r","V","V","","","w,r","Y","8"
+"MOV r8, r/m8","MOVB r/m8, r8","movb r/m8, r8","REX 8A /r","N.E.","V","","pseudo64","w,r","Y","8"
+"MOV r/m8, r8","MOVB r8, r/m8","movb r8, r/m8","88 /r","V","V","","","w,r","Y","8"
+"MOV r/m8, r8","MOVB r8, r/m8","movb r8, r/m8","REX 88 /r","N.E.","V","","pseudo64","w,r","Y","8"
+"MOV moffs8, AL","MOVB/MOVB/MOVABSB AL, moffs8","movb/movb/movabsb AL, moffs8","A2 cm","V","V","","","w,r","Y","8"
+"MOV moffs8, AL","MOVB/MOVB/MOVABSB AL, moffs8","movb/movb/movabsb AL, moffs8","REX.W A2 cm","N.E.","V","","pseudo","w,r","Y","8"
+"MOV AL, moffs8","MOVB/MOVB/MOVABSB moffs8, AL","movb/movb/movabsb moffs8, AL","A0 cm","V","V","","","w,r","Y","8"
+"MOV AL, moffs8","MOVB/MOVB/MOVABSB moffs8, AL","movb/movb/movabsb moffs8, AL","REX.W A0 cm","N.E.","V","","pseudo","w,r","Y","8"
+"MOVBE r32, m32","MOVBELL m32, r32","movbell m32, r32","0F 38 F0 /r","V","V","MOVBE","modrm_memonly,operand32","w,r","Y","32"
+"MOVBE m32, r32","MOVBELL r32, m32","movbell r32, m32","0F 38 F1 /r","V","V","MOVBE","modrm_memonly,operand32","w,r","Y","32"
+"MOVBE r64, m64","MOVBEQQ m64, r64","movbeqq m64, r64","REX.W 0F 38 F0 /r","N.S.","V","MOVBE","modrm_memonly","w,r","Y","64"
+"MOVBE m64, r64","MOVBEQQ r64, m64","movbeqq r64, m64","REX.W 0F 38 F1 /r","N.S.","V","MOVBE","modrm_memonly","w,r","Y","64"
+"MOVBE r16, m16","MOVBEWW m16, r16","movbeww m16, r16","0F 38 F0 /r","V","V","MOVBE","modrm_memonly,operand16","w,r","Y","16"
+"MOVBE m16, r16","MOVBEWW r16, m16","movbeww r16, m16","0F 38 F1 /r","V","V","MOVBE","modrm_memonly,operand16","w,r","Y","16"
+"MOVSX r32, r/m8","MOVBLSX r/m8, r32","movsbl r/m8, r32","0F BE /r","V","V","","operand32","w,r","Y","32"
+"MOVZX r32, r/m8","MOVBLZX r/m8, r32","movzbl r/m8, r32","0F B6 /r","V","V","","operand32","w,r","Y","32"
+"MOVSX r64, r/m8","MOVBQSX r/m8, r64","movsbq r/m8, r64","REX.W 0F BE /r","N.S.","V","","","w,r","Y","64"
+"MOVZX r64, r/m8","MOVBQZX r/m8, r64","movzbq r/m8, r64","REX.W 0F B6 /r","N.S.","V","","","w,r","Y","64"
+"MOVSX r16, r/m8","MOVBWSX r/m8, r16","movsbw r/m8, r16","0F BE /r","V","V","","operand16","w,r","Y","16"
+"MOVZX r16, r/m8","MOVBWZX r/m8, r16","movzbw r/m8, r16","0F B6 /r","V","V","","operand16","w,r","Y","16"
+"MOVD r/m32, mm1","MOVD mm1, r/m32","movd mm1, r/m32","0F 7E /r","V","V","MMX","operand16,operand32","w,r","",""
+"MOVD mm1, r/m32","MOVD r/m32, mm1","movd r/m32, mm1","0F 6E /r","V","V","MMX","operand16,operand32","w,r","",""
+"MOVD xmm1, r/m32","MOVD r/m32, xmm1","movd r/m32, xmm1","66 0F 6E /r","V","V","SSE2","operand16,operand32","w,r","",""
+"MOVD r/m32, xmm1","MOVD xmm1, r/m32","movd xmm1, r/m32","66 0F 7E /r","V","V","SSE2","operand16,operand32","w,r","",""
+"MOVDDUP xmm1, xmm2/m64","MOVDDUP xmm2/m64, xmm1","movddup xmm2/m64, xmm1","F2 0F 12 /r","V","V","SSE3","","w,r","",""
+"MOVHLPS xmm1, xmm2","MOVHLPS xmm2, xmm1","movhlps xmm2, xmm1","0F 12 /r","V","V","SSE","modrm_regonly","w,r","",""
+"MOVHPD xmm1, m64","MOVHPD m64, xmm1","movhpd m64, xmm1","66 0F 16 /r","V","V","SSE2","modrm_memonly","w,r","",""
+"MOVHPD m64, xmm1","MOVHPD xmm1, m64","movhpd xmm1, m64","66 0F 17 /r","V","V","SSE2","modrm_memonly","w,r","",""
+"MOVHPS xmm1, m64","MOVHPS m64, xmm1","movhps m64, xmm1","0F 16 /r","V","V","SSE","modrm_memonly","w,r","",""
+"MOVHPS m64, xmm1","MOVHPS xmm1, m64","movhps xmm1, m64","0F 17 /r","V","V","SSE","modrm_memonly","w,r","",""
+"MOV rmr32, CR0-CR7","MOVL CR0-CR7, rmr32","movl CR0-CR7, rmr32","0F 20 /r","V","N.S.","","","w,r","Y","32"
+"MOV rmr32, DR0-DR7","MOVL DR0-DR7, rmr32","movl DR0-DR7, rmr32","0F 21 /r","V","N.S.","","","w,r","Y","32"
+"MOV moffs32, EAX","MOVL EAX, moffs32","movl EAX, moffs32","A3 cm","V","V","","operand32","w,r","Y","32"
+"MOV r/m32, imm32","MOVL imm32, r/m32","movl imm32, r/m32","C7 /0 id","V","V","","operand32","w,r","Y","32"
+"MOV r32op, imm32u","MOVL imm32u, r32op","movl imm32u, r32op","B8+rd id","V","V","","operand32","w,r","Y","32"
+"MOV EAX, moffs32","MOVL moffs32, EAX","movl moffs32, EAX","A1 cm","V","V","","operand32","w,r","Y","32"
+"MOV r32, r/m32","MOVL r/m32, r32","movl r/m32, r32","8B /r","V","V","","operand32","w,r","Y","32"
+"MOV r/m32, r32","MOVL r32, r/m32","movl r32, r/m32","89 /r","V","V","","operand32","w,r","Y","32"
+"MOV CR0-CR7, rmr32","MOVL rmr32, CR0-CR7","movl rmr32, CR0-CR7","0F 22 /r","V","N.S.","","","w,r","Y","32"
+"MOV DR0-DR7, rmr32","MOVL rmr32, DR0-DR7","movl rmr32, DR0-DR7","0F 23 /r","V","N.S.","","","w,r","Y","32"
+"MOVLHPS xmm1, xmm2","MOVLHPS xmm2, xmm1","movlhps xmm2, xmm1","0F 16 /r","V","V","SSE","modrm_regonly","w,r","",""
+"MOVLPD xmm1, m64","MOVLPD m64, xmm1","movlpd m64, xmm1","66 0F 12 /r","V","V","SSE2","modrm_memonly","w,r","",""
+"MOVLPD m64, xmm1","MOVLPD xmm1, m64","movlpd xmm1, m64","66 0F 13 /r","V","V","SSE2","modrm_memonly","w,r","",""
+"MOVLPS xmm1, m64","MOVLPS m64, xmm1","movlps m64, xmm1","0F 12 /r","V","V","SSE","modrm_memonly","w,r","",""
+"MOVLPS m64, xmm1","MOVLPS xmm1, m64","movlps xmm1, m64","0F 13 /r","V","V","SSE","modrm_memonly","w,r","",""
+"MOVSXD r32, r/m32","MOVLQSX r/m32, r32","movsxdl r/m32, r32","63 /r","N.S.","V","","operand32","w,r","Y","32"
+"MOVSXD r64, r/m32","MOVLQSX r/m32, r64","movslq r/m32, r64","REX.W 63 /r","N.S.","V","","","w,r","Y","64"
+"MOVMSKPD r32, xmm2","MOVMSKPD xmm2, r32","movmskpd xmm2, r32","66 0F 50 /r","V","V","SSE2","modrm_regonly","w,r","",""
+"MOVMSKPS r32, xmm2","MOVMSKPS xmm2, r32","movmskps xmm2, r32","0F 50 /r","V","V","SSE","modrm_regonly","w,r","",""
+"MOVNTDQA xmm1, m128","MOVNTDQA m128, xmm1","movntdqa m128, xmm1","66 0F 38 2A /r","V","V","SSE4_1","modrm_memonly","w,r","",""
+"MOVNTI m32, r32","MOVNTIL r32, m32","movntil r32, m32","0F C3 /r","V","V","SSE2","modrm_memonly,operand16,operand32","w,r","Y","32"
+"MOVNTI m64, r64","MOVNTIQ r64, m64","movntiq r64, m64","REX.W 0F C3 /r","N.S.","V","SSE2","modrm_memonly","w,r","Y","64"
+"MOVNTDQ m128, xmm1","MOVNTO xmm1, m128","movntdq xmm1, m128","66 0F E7 /r","V","V","SSE2","modrm_memonly","w,r","",""
+"MOVNTPD m128, xmm1","MOVNTPD xmm1, m128","movntpd xmm1, m128","66 0F 2B /r","V","V","SSE2","modrm_memonly","w,r","",""
+"MOVNTPS m128, xmm1","MOVNTPS xmm1, m128","movntps xmm1, m128","0F 2B /r","V","V","SSE","modrm_memonly","w,r","",""
+"MOVNTQ m64, mm1","MOVNTQ mm1, m64","movntq mm1, m64","0F E7 /r","V","V","MMX","modrm_memonly","w,r","",""
+"MOVNTSD m64, xmm1","MOVNTSD xmm1, m64","movntsd xmm1, m64","F2 0F 2B /r","V","V","SSE4a","amd,modrm_memonly","w,r","",""
+"MOVNTSS m32, xmm1","MOVNTSS xmm1, m32","movntss xmm1, m32","F3 0F 2B /r","V","V","SSE4a","amd,modrm_memonly","w,r","",""
+"MOVDQA xmm2/m128, xmm1","MOVO xmm1, xmm2/m128","movdqa xmm1, xmm2/m128","66 0F 7F /r","V","V","SSE2","","w,r","",""
+"MOVDQA xmm1, xmm2/m128","MOVO xmm2/m128, xmm1","movdqa xmm2/m128, xmm1","66 0F 6F /r","V","V","SSE2","","w,r","",""
+"MOVDQU xmm2/m128, xmm1","MOVOU xmm1, xmm2/m128","movdqu xmm1, xmm2/m128","F3 0F 7F /r","V","V","SSE2","","w,r","",""
+"MOVDQU xmm1, xmm2/m128","MOVOU xmm2/m128, xmm1","movdqu xmm2/m128, xmm1","F3 0F 6F /r","V","V","SSE2","","w,r","",""
+"MOV rmr64, CR0-CR7","MOVQ CR0-CR7, rmr64","movq CR0-CR7, rmr64","0F 20 /r","N.S.","V","","default64","w,r","Y","64"
+"MOV rmr64, CR8","MOVQ CR8, rmr64","movq CR8, rmr64","REX.R + 0F 20 /0","N.E.","V","","modrm_regonly,pseudo","w,r","Y","64"
+"MOV rmr64, DR0-DR7","MOVQ DR0-DR7, rmr64","movq DR0-DR7, rmr64","0F 21 /r","N.S.","V","","default64","w,r","Y","64"
+"MOV moffs64, RAX","MOVQ RAX, moffs64","movabsq RAX, moffs64","REX.W A3 cm","N.S.","V","","","w,r","Y","64"
+"MOV r/m64, imm32","MOVQ imm32, r/m64","movq imm32, r/m64","REX.W C7 /0 id","N.S.","V","","","w,r","Y","64"
+"MOV r64op, imm64u","MOVQ imm64u, r64op","movq imm64u, r64op","REX.W B8+ro io","N.S.","V","","","w,r","Y","64"
+"MOVQ mm2/m64, mm1","MOVQ mm1, mm2/m64","movq mm1, mm2/m64","0F 7F /r","V","V","MMX","","w,r","",""
+"MOVQ r/m64, mm1","MOVQ mm1, r/m64","movq mm1, r/m64","REX.W 0F 7E /r","N.S.","V","MMX","","w,r","",""
+"MOVQ mm1, mm2/m64","MOVQ mm2/m64, mm1","movq mm2/m64, mm1","0F 6F /r","V","V","MMX","","w,r","",""
+"MOV RAX, moffs64","MOVQ moffs64, RAX","movabsq moffs64, RAX","REX.W A1 cm","N.S.","V","","","w,r","Y","64"
+"MOVQ mm1, r/m64","MOVQ r/m64, mm1","movq r/m64, mm1","REX.W 0F 6E /r","N.S.","V","MMX","","w,r","",""
+"MOV r64, r/m64","MOVQ r/m64, r64","movq r/m64, r64","REX.W 8B /r","N.S.","V","","","w,r","Y","64"
+"MOVQ xmm1, r/m64","MOVQ r/m64, xmm1","movq r/m64, xmm1","66 REX.W 0F 6E /r","N.S.","V","SSE2","","w,r","",""
+"MOV r/m64, r64","MOVQ r64, r/m64","movq r64, r/m64","REX.W 89 /r","N.S.","V","","","w,r","Y","64"
+"MOV CR0-CR7, rmr64","MOVQ rmr64, CR0-CR7","movq rmr64, CR0-CR7","0F 22 /r","N.S.","V","","default64","w,r","Y","64"
+"MOV CR8, rmr64","MOVQ rmr64, CR8","movq rmr64, CR8","REX.R + 0F 22 /0","N.E.","V","","modrm_regonly,pseudo","w,r","Y","64"
+"MOV DR0-DR7, rmr64","MOVQ rmr64, DR0-DR7","movq rmr64, DR0-DR7","0F 23 /r","N.S.","V","","default64","w,r","Y","64"
+"MOVQ r/m64, xmm1","MOVQ xmm1, r/m64","movq xmm1, r/m64","66 REX.W 0F 7E /r","N.S.","V","SSE2","","w,r","",""
+"MOVQ xmm2/m64, xmm1","MOVQ xmm1, xmm2/m64","movq xmm1, xmm2/m64","66 0F D6 /r","V","V","SSE2","","w,r","",""
+"MOVDQ2Q mm1, xmm2","MOVQ xmm2, mm1","movdq2q xmm2, mm1","F2 0F D6 /r","V","V","SSE2","modrm_regonly","w,r","",""
+"MOVQ xmm1, xmm2/m64","MOVQ xmm2/m64, xmm1","movq xmm2/m64, xmm1","F3 0F 7E /r","V","V","SSE2","","w,r","",""
+"MOVQ2DQ xmm1, mm2","MOVQOZX mm2, xmm1","movq2dq mm2, xmm1","F3 0F D6 /r","V","V","SSE2","modrm_regonly","w,r","",""
+"MOVSB","MOVSB","movsb","A4","V","V","","","","",""
+"MOVSD xmm2/m64, xmm1","MOVSD xmm1, xmm2/m64","movsd xmm1, xmm2/m64","F2 0F 11 /r","V","V","SSE2","","w,r","",""
+"MOVSD xmm1, xmm2/m64","MOVSD xmm2/m64, xmm1","movsd xmm2/m64, xmm1","F2 0F 10 /r","V","V","SSE2","","w,r","",""
+"MOVSHDUP xmm1, xmm2/m128","MOVSHDUP xmm2/m128, xmm1","movshdup xmm2/m128, xmm1","F3 0F 16 /r","V","V","SSE3","","w,r","",""
+"MOVSD","MOVSL","movsl","A5","V","V","","operand32","","",""
+"MOVSLDUP xmm1, xmm2/m128","MOVSLDUP xmm2/m128, xmm1","movsldup xmm2/m128, xmm1","F3 0F 12 /r","V","V","SSE3","","w,r","",""
+"MOVSQ","MOVSQ","movsq","REX.W A5","N.S.","V","","","","",""
+"MOVSS xmm2/m32, xmm1","MOVSS xmm1, xmm2/m32","movss xmm1, xmm2/m32","F3 0F 11 /r","V","V","SSE","","w,r","",""
+"MOVSS xmm1, xmm2/m32","MOVSS xmm2/m32, xmm1","movss xmm2/m32, xmm1","F3 0F 10 /r","V","V","SSE","","w,r","",""
+"MOVSW","MOVSW","movsw","A5","V","V","","operand16","","",""
+"MOVSX r16, r/m16","MOVSWW r/m16, r16","movsww r/m16, r16","0F BF /r","V","V","","operand16","w,r","Y","16"
+"MOVUPD xmm2/m128, xmm1","MOVUPD xmm1, xmm2/m128","movupd xmm1, xmm2/m128","66 0F 11 /r","V","V","SSE2","","w,r","",""
+"MOVUPD xmm1, xmm2/m128","MOVUPD xmm2/m128, xmm1","movupd xmm2/m128, xmm1","66 0F 10 /r","V","V","SSE2","","w,r","",""
+"MOVUPS xmm2/m128, xmm1","MOVUPS xmm1, xmm2/m128","movups xmm1, xmm2/m128","0F 11 /r","V","V","SSE","","w,r","",""
+"MOVUPS xmm1, xmm2/m128","MOVUPS xmm2/m128, xmm1","movups xmm2/m128, xmm1","0F 10 /r","V","V","SSE","","w,r","",""
+"MOV moffs16, AX","MOVW AX, moffs16","movw AX, moffs16","A3 cm","V","V","","operand16","w,r","Y","16"
+"MOV r/m16, Sreg","MOVW Sreg, r/m16","movw Sreg, r/m16","8C /r","V","V","","operand16","w,r","Y","16"
+"MOV r/m16, imm16","MOVW imm16, r/m16","movw imm16, r/m16","C7 /0 iw","V","V","","operand16","w,r","Y","16"
+"MOV r16op, imm16u","MOVW imm16u, r16op","movw imm16u, r16op","B8+rw iw","V","V","","operand16","w,r","Y","16"
+"MOV AX, moffs16","MOVW moffs16, AX","movw moffs16, AX","A1 cm","V","V","","operand16","w,r","Y","16"
+"MOV Sreg, r/m16","MOVW r/m16, Sreg","movw r/m16, Sreg","8E /r","V","V","","","w,r","Y","16"
+"MOV r16, r/m16","MOVW r/m16, r16","movw r/m16, r16","8B /r","V","V","","operand16","w,r","Y","16"
+"MOV r/m16, r16","MOVW r16, r/m16","movw r16, r/m16","89 /r","V","V","","operand16","w,r","Y","16"
+"MOVSX r32, r/m16","MOVWLSX r/m16, r32","movswl r/m16, r32","0F BF /r","V","V","","operand32","w,r","Y","32"
+"MOVZX r32, r/m16","MOVWLZX r/m16, r32","movzwl r/m16, r32","0F B7 /r","V","V","","operand32","w,r","Y","32"
+"MOVSX r64, r/m16","MOVWQSX r/m16, r64","movswq r/m16, r64","REX.W 0F BF /r","N.S.","V","","","w,r","Y","64"
+"MOVSXD r16, r/m32","MOVWQSX r/m32, r16","movsxdw r/m32, r16","63 /r","N.S.","V","","operand16","w,r","Y","16"
+"MOVZX r64, r/m16","MOVWQZX r/m16, r64","movzwq r/m16, r64","REX.W 0F B7 /r","N.S.","V","","","w,r","Y","64"
+"MOVZX r16, r/m16","MOVZWW r/m16, r16","movzww r/m16, r16","0F B7 /r","V","V","","operand16","w,r","Y","16"
+"MOV r32/m16, Sreg","MOV{L/W} Sreg, r32/m16","mov{l/w} Sreg, r32/m16","8C /r","V","V","","operand32","w,r","Y",""
+"MOV r64/m16, Sreg","MOV{Q/W} Sreg, r64/m16","mov{q/w} Sreg, r64/m16","REX.W 8C /r","N.S.","V","","","w,r","Y",""
+"MPSADBW xmm1, xmm2/m128, imm8u","MPSADBW imm8u, xmm2/m128, xmm1","mpsadbw imm8u, xmm2/m128, xmm1","66 0F 3A 42 /r ib","V","V","SSE4_1","","rw,r,r","",""
+"MUL r/m8","MULB r/m8","mulb r/m8","F6 /4","V","V","","","r","Y","8"
+"MUL r/m8","MULB r/m8","mulb r/m8","REX F6 /4","N.E.","V","","pseudo64","r","Y","8"
+"MUL r/m32","MULL r/m32","mull r/m32","F7 /4","V","V","","operand32","r","Y","32"
+"MULPD xmm1, xmm2/m128","MULPD xmm2/m128, xmm1","mulpd xmm2/m128, xmm1","66 0F 59 /r","V","V","SSE2","","rw,r","",""
+"MULPS xmm1, xmm2/m128","MULPS xmm2/m128, xmm1","mulps xmm2/m128, xmm1","0F 59 /r","V","V","SSE","","rw,r","",""
+"MUL r/m64","MULQ r/m64","mulq r/m64","REX.W F7 /4","N.S.","V","","","r","Y","64"
+"MULSD xmm1, xmm2/m64","MULSD xmm2/m64, xmm1","mulsd xmm2/m64, xmm1","F2 0F 59 /r","V","V","SSE2","","rw,r","",""
+"MULSS xmm1, xmm2/m32","MULSS xmm2/m32, xmm1","mulss xmm2/m32, xmm1","F3 0F 59 /r","V","V","SSE","","rw,r","",""
+"MUL r/m16","MULW r/m16","mulw r/m16","F7 /4","V","V","","operand16","r","Y","16"
+"MULX r32, r32V, r/m32","MULXL r/m32, r32V, r32","mulxl r/m32, r32V, r32","VEX.NDD.128.F2.0F38.W0 F6 /r","V","V","BMI2","","w,w,r","Y","32"
+"MULX r64, r64V, r/m64","MULXQ r/m64, r64V, r64","mulxq r/m64, r64V, r64","VEX.NDD.128.F2.0F38.W1 F6 /r","N.S.","V","BMI2","","w,w,r","Y","64"
+"MWAIT","MWAIT","mwait","0F 01 C9","V","V","MONITOR","","","",""
+"NEG r/m8","NEGB r/m8","negb r/m8","F6 /3","V","V","","","rw","Y","8"
+"NEG r/m8","NEGB r/m8","negb r/m8","REX F6 /3","N.E.","V","","pseudo64","rw","Y","8"
+"NEG r/m32","NEGL r/m32","negl r/m32","F7 /3","V","V","","operand32","rw","Y","32"
+"NEG r/m64","NEGQ r/m64","negq r/m64","REX.W F7 /3","N.S.","V","","","rw","Y","64"
+"NEG r/m16","NEGW r/m16","negw r/m16","F7 /3","V","V","","operand16","rw","Y","16"
+"NOP","NOP","nop","90","V","V","","pseudo","","Y",""
+"NOP","NOP","nop","90+rd","V","V","","operand32,operand64","","Y",""
+"NOP","NOP","nop","90+rw","V","V","","operand16,operand64","","Y",""
+"NOP","NOP","nop","F3 90+rd","V","V","","operand32","","Y",""
+"NOP","NOP","nop","F3 90+rw","V","V","","operand16","","Y",""
+"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /4","V","V","","operand32","r","Y","32"
+"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /5","V","V","","operand32","r","Y","32"
+"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /6","V","V","","operand32","r","Y","32"
+"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /7","V","V","","operand32","r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 19 /r","V","V","","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1A /r","V","V","","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1B /r","V","V","","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1C /r","V","V","","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1D /r","V","V","","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1E /r","V","V","PPRO","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1E /r","V","V","","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1F /r","V","V","","operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 0D /r","V","V","PRFCHW","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 1A /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 1B /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","66 0F 1E /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F2 0F 1E /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1B /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /0","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /1","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /2","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /3","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /4","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /5","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /6","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E F8","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E F9","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FA","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FB","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FC","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FD","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FE","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FF","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /0","V","V","","modrm_regonly,operand32","r","Y","32"
+"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /1","V","V","","modrm_regonly,operand32","r","Y","32"
+"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /2","V","V","","modrm_regonly,operand32","r","Y","32"
+"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /3","V","V","","modrm_regonly,operand32","r","Y","32"
+"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /4","N.S.","V","","","r","Y","64"
+"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /5","N.S.","V","","","r","Y","64"
+"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /6","N.S.","V","","","r","Y","64"
+"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /7","N.S.","V","","","r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 19 /r","N.S.","V","","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1A /r","N.S.","V","","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1B /r","N.S.","V","","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1C /r","N.S.","V","","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1D /r","N.S.","V","","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1E /r","N.S.","V","","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1E /r","N.S.","V","PPRO","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1F /r","N.S.","V","","","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","66 REX.W 0F 1E /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F2 REX.W 0F 1E /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1B /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /0","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /1","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /2","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /3","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /4","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /5","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /6","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E F8","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E F9","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FA","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FB","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FC","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FD","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FE","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FF","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 0D /r","N.S.","V","PRFCHW","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 1A /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 1B /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /0","N.S.","V","","modrm_regonly","r","Y","64"
+"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /1","N.S.","V","","modrm_regonly","r","Y","64"
+"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /2","N.S.","V","","modrm_regonly","r","Y","64"
+"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /3","N.S.","V","","modrm_regonly","r","Y","64"
+"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /4","V","V","","operand16","r","Y","16"
+"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /5","V","V","","operand16","r","Y","16"
+"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /6","V","V","","operand16","r","Y","16"
+"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /7","V","V","","operand16","r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 19 /r","V","V","","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1A /r","V","V","","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1B /r","V","V","","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1C /r","V","V","","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1D /r","V","V","","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1E /r","V","V","","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1E /r","V","V","PPRO","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1F /r","V","V","","operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 0D /r","V","V","PRFCHW","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 1A /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 1B /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","66 0F 1E /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F2 0F 1E /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1B /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /0","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /1","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /2","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /3","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /4","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /5","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /6","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E F8","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E F9","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FA","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FB","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FC","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FD","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FE","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FF","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /0","V","V","","modrm_regonly,operand16","r","Y","16"
+"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /1","V","V","","modrm_regonly,operand16","r","Y","16"
+"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /2","V","V","","modrm_regonly,operand16","r","Y","16"
+"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /3","V","V","","modrm_regonly,operand16","r","Y","16"
+"NOT r/m8","NOTB r/m8","notb r/m8","F6 /2","V","V","","","rw","Y","8"
+"NOT r/m8","NOTB r/m8","notb r/m8","REX F6 /2","N.E.","V","","pseudo64","rw","Y","8"
+"NOT r/m32","NOTL r/m32","notl r/m32","F7 /2","V","V","","operand32","rw","Y","32"
+"NOT r/m64","NOTQ r/m64","notq r/m64","REX.W F7 /2","N.S.","V","","","rw","Y","64"
+"NOT r/m16","NOTW r/m16","notw r/m16","F7 /2","V","V","","operand16","rw","Y","16"
+"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","80 /1 ib","V","V","","","rw,r","Y","8"
+"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","82 /1 ib","V","N.S.","","","rw,r","Y","8"
+"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","REX 80 /1 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"OR AL, imm8u","ORB imm8u, AL","orb imm8u, AL","0C ib","V","V","","","rw,r","Y","8"
+"OR r8, r/m8","ORB r/m8, r8","orb r/m8, r8","0A /r","V","V","","","rw,r","Y","8"
+"OR r8, r/m8","ORB r/m8, r8","orb r/m8, r8","REX 0A /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"OR r/m8, r8","ORB r8, r/m8","orb r8, r/m8","08 /r","V","V","","","rw,r","Y","8"
+"OR r/m8, r8","ORB r8, r/m8","orb r8, r/m8","REX 08 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"OR EAX, imm32","ORL imm32, EAX","orl imm32, EAX","0D id","V","V","","operand32","rw,r","Y","32"
+"OR r/m32, imm32","ORL imm32, r/m32","orl imm32, r/m32","81 /1 id","V","V","","operand32","rw,r","Y","32"
+"OR r/m32, imm8","ORL imm8, r/m32","orl imm8, r/m32","83 /1 ib","V","V","","operand32","rw,r","Y","32"
+"OR r32, r/m32","ORL r/m32, r32","orl r/m32, r32","0B /r","V","V","","operand32","rw,r","Y","32"
+"OR r/m32, r32","ORL r32, r/m32","orl r32, r/m32","09 /r","V","V","","operand32","rw,r","Y","32"
+"ORPD xmm1, xmm2/m128","ORPD xmm2/m128, xmm1","orpd xmm2/m128, xmm1","66 0F 56 /r","V","V","SSE2","","rw,r","",""
+"ORPS xmm1, xmm2/m128","ORPS xmm2/m128, xmm1","orps xmm2/m128, xmm1","0F 56 /r","V","V","SSE","","rw,r","",""
+"OR RAX, imm32","ORQ imm32, RAX","orq imm32, RAX","REX.W 0D id","N.S.","V","","","rw,r","Y","64"
+"OR r/m64, imm32","ORQ imm32, r/m64","orq imm32, r/m64","REX.W 81 /1 id","N.S.","V","","","rw,r","Y","64"
+"OR r/m64, imm8","ORQ imm8, r/m64","orq imm8, r/m64","REX.W 83 /1 ib","N.S.","V","","","rw,r","Y","64"
+"OR r64, r/m64","ORQ r/m64, r64","orq r/m64, r64","REX.W 0B /r","N.S.","V","","","rw,r","Y","64"
+"OR r/m64, r64","ORQ r64, r/m64","orq r64, r/m64","REX.W 09 /r","N.S.","V","","","rw,r","Y","64"
+"OR AX, imm16","ORW imm16, AX","orw imm16, AX","0D iw","V","V","","operand16","rw,r","Y","16"
+"OR r/m16, imm16","ORW imm16, r/m16","orw imm16, r/m16","81 /1 iw","V","V","","operand16","rw,r","Y","16"
+"OR r/m16, imm8","ORW imm8, r/m16","orw imm8, r/m16","83 /1 ib","V","V","","operand16","rw,r","Y","16"
+"OR r16, r/m16","ORW r/m16, r16","orw r/m16, r16","0B /r","V","V","","operand16","rw,r","Y","16"
+"OR r/m16, r16","ORW r16, r/m16","orw r16, r/m16","09 /r","V","V","","operand16","rw,r","Y","16"
+"OUT DX, AL","OUTB AL, DX","outb AL, DX","EE","V","V","","","r,r","Y","8"
+"OUT imm8u, AL","OUTB AL, imm8u","outb AL, imm8u","E6 ib","V","V","","","r,r","Y","8"
+"OUT DX, EAX","OUTL EAX, DX","outl EAX, DX","EF","V","V","","operand32,operand64","r,r","Y","32"
+"OUT imm8u, EAX","OUTL EAX, imm8u","outl EAX, imm8u","E7 ib","V","V","","operand32,operand64","r,r","Y","32"
+"OUTSB","OUTSB","outsb","6E","V","V","","","","",""
+"OUTSD","OUTSL","outsl","6F","V","V","","operand32,operand64","","",""
+"OUTSW","OUTSW","outsw","6F","V","V","","operand16","","",""
+"OUT DX, AX","OUTW AX, DX","outw AX, DX","EF","V","V","","operand16","r,r","Y","16"
+"OUT imm8u, AX","OUTW AX, imm8u","outw AX, imm8u","E7 ib","V","V","","operand16","r,r","Y","16"
+"PABSB mm1, mm2/m64","PABSB mm2/m64, mm1","pabsb mm2/m64, mm1","0F 38 1C /r","V","V","SSSE3","","w,r","",""
+"PABSB xmm1, xmm2/m128","PABSB xmm2/m128, xmm1","pabsb xmm2/m128, xmm1","66 0F 38 1C /r","V","V","SSSE3","","w,r","",""
+"PABSD mm1, mm2/m64","PABSD mm2/m64, mm1","pabsd mm2/m64, mm1","0F 38 1E /r","V","V","SSSE3","","w,r","",""
+"PABSD xmm1, xmm2/m128","PABSD xmm2/m128, xmm1","pabsd xmm2/m128, xmm1","66 0F 38 1E /r","V","V","SSSE3","","w,r","",""
+"PABSW mm1, mm2/m64","PABSW mm2/m64, mm1","pabsw mm2/m64, mm1","0F 38 1D /r","V","V","SSSE3","","w,r","",""
+"PABSW xmm1, xmm2/m128","PABSW xmm2/m128, xmm1","pabsw xmm2/m128, xmm1","66 0F 38 1D /r","V","V","SSSE3","","w,r","",""
+"PACKSSDW mm1, mm2/m64","PACKSSLW mm2/m64, mm1","packssdw mm2/m64, mm1","0F 6B /r","V","V","MMX","","rw,r","",""
+"PACKSSDW xmm1, xmm2/m128","PACKSSLW xmm2/m128, xmm1","packssdw xmm2/m128, xmm1","66 0F 6B /r","V","V","SSE2","","rw,r","",""
+"PACKSSWB mm1, mm2/m64","PACKSSWB mm2/m64, mm1","packsswb mm2/m64, mm1","0F 63 /r","V","V","MMX","","rw,r","",""
+"PACKSSWB xmm1, xmm2/m128","PACKSSWB xmm2/m128, xmm1","packsswb xmm2/m128, xmm1","66 0F 63 /r","V","V","SSE2","","rw,r","",""
+"PACKUSDW xmm1, xmm2/m128","PACKUSDW xmm2/m128, xmm1","packusdw xmm2/m128, xmm1","66 0F 38 2B /r","V","V","SSE4_1","","rw,r","",""
+"PACKUSWB mm1, mm2/m64","PACKUSWB mm2/m64, mm1","packuswb mm2/m64, mm1","0F 67 /r","V","V","MMX","","rw,r","",""
+"PACKUSWB xmm1, xmm2/m128","PACKUSWB xmm2/m128, xmm1","packuswb xmm2/m128, xmm1","66 0F 67 /r","V","V","SSE2","","rw,r","",""
+"PADDB mm1, mm2/m64","PADDB mm2/m64, mm1","paddb mm2/m64, mm1","0F FC /r","V","V","MMX","","rw,r","",""
+"PADDB xmm1, xmm2/m128","PADDB xmm2/m128, xmm1","paddb xmm2/m128, xmm1","66 0F FC /r","V","V","SSE2","","rw,r","",""
+"PADDD mm1, mm2/m64","PADDL mm2/m64, mm1","paddd mm2/m64, mm1","0F FE /r","V","V","MMX","","rw,r","",""
+"PADDD xmm1, xmm2/m128","PADDL xmm2/m128, xmm1","paddd xmm2/m128, xmm1","66 0F FE /r","V","V","SSE2","","rw,r","",""
+"PADDQ mm1, mm2/m64","PADDQ mm2/m64, mm1","paddq mm2/m64, mm1","0F D4 /r","V","V","SSE2","","rw,r","",""
+"PADDQ xmm1, xmm2/m128","PADDQ xmm2/m128, xmm1","paddq xmm2/m128, xmm1","66 0F D4 /r","V","V","SSE2","","rw,r","",""
+"PADDSB mm1, mm2/m64","PADDSB mm2/m64, mm1","paddsb mm2/m64, mm1","0F EC /r","V","V","MMX","","rw,r","",""
+"PADDSB xmm1, xmm2/m128","PADDSB xmm2/m128, xmm1","paddsb xmm2/m128, xmm1","66 0F EC /r","V","V","SSE2","","rw,r","",""
+"PADDSW mm1, mm2/m64","PADDSW mm2/m64, mm1","paddsw mm2/m64, mm1","0F ED /r","V","V","MMX","","rw,r","",""
+"PADDSW xmm1, xmm2/m128","PADDSW xmm2/m128, xmm1","paddsw xmm2/m128, xmm1","66 0F ED /r","V","V","SSE2","","rw,r","",""
+"PADDUSB mm1, mm2/m64","PADDUSB mm2/m64, mm1","paddusb mm2/m64, mm1","0F DC /r","V","V","MMX","","rw,r","",""
+"PADDUSB xmm1, xmm2/m128","PADDUSB xmm2/m128, xmm1","paddusb xmm2/m128, xmm1","66 0F DC /r","V","V","SSE2","","rw,r","",""
+"PADDUSW mm1, mm2/m64","PADDUSW mm2/m64, mm1","paddusw mm2/m64, mm1","0F DD /r","V","V","MMX","","rw,r","",""
+"PADDUSW xmm1, xmm2/m128","PADDUSW xmm2/m128, xmm1","paddusw xmm2/m128, xmm1","66 0F DD /r","V","V","SSE2","","rw,r","",""
+"PADDW mm1, mm2/m64","PADDW mm2/m64, mm1","paddw mm2/m64, mm1","0F FD /r","V","V","MMX","","rw,r","",""
+"PADDW xmm1, xmm2/m128","PADDW xmm2/m128, xmm1","paddw xmm2/m128, xmm1","66 0F FD /r","V","V","SSE2","","rw,r","",""
+"PALIGNR mm1, mm2/m64, imm8u","PALIGNR imm8u, mm2/m64, mm1","palignr imm8u, mm2/m64, mm1","0F 3A 0F /r ib","V","V","SSSE3","","rw,r,r","",""
+"PALIGNR xmm1, xmm2/m128, imm8u","PALIGNR imm8u, xmm2/m128, xmm1","palignr imm8u, xmm2/m128, xmm1","66 0F 3A 0F /r ib","V","V","SSSE3","","rw,r,r","",""
+"PAND mm1, mm2/m64","PAND mm2/m64, mm1","pand mm2/m64, mm1","0F DB /r","V","V","MMX","","rw,r","",""
+"PAND xmm1, xmm2/m128","PAND xmm2/m128, xmm1","pand xmm2/m128, xmm1","66 0F DB /r","V","V","SSE2","","rw,r","",""
+"PANDN mm1, mm2/m64","PANDN mm2/m64, mm1","pandn mm2/m64, mm1","0F DF /r","V","V","MMX","","rw,r","",""
+"PANDN xmm1, xmm2/m128","PANDN xmm2/m128, xmm1","pandn xmm2/m128, xmm1","66 0F DF /r","V","V","SSE2","","rw,r","",""
+"PAUSE","PAUSE","pause","F3 90","V","V","","pseudo","","",""
+"PAUSE","PAUSE","pause","F3 90+rd","V","V","","operand32","","Y",""
+"PAUSE","PAUSE","pause","F3 90+rw","V","V","","operand16,operand64","","Y",""
+"PAVGB mm1, mm2/m64","PAVGB mm2/m64, mm1","pavgb mm2/m64, mm1","0F E0 /r","V","V","MMX","","rw,r","",""
+"PAVGB xmm1, xmm2/m128","PAVGB xmm2/m128, xmm1","pavgb xmm2/m128, xmm1","66 0F E0 /r","V","V","SSE2","","rw,r","",""
+"PAVGUSB mm1, mm2/m64","PAVGUSB mm2/m64, mm1","pavgusb mm2/m64, mm1","0F 0F BF /r","V","V","3DNOW","amd","rw,r","",""
+"PAVGW mm1, mm2/m64","PAVGW mm2/m64, mm1","pavgw mm2/m64, mm1","0F E3 /r","V","V","MMX","","rw,r","",""
+"PAVGW xmm1, xmm2/m128","PAVGW xmm2/m128, xmm1","pavgw xmm2/m128, xmm1","66 0F E3 /r","V","V","SSE2","","rw,r","",""
+"PBLENDVB xmm1, xmm2/m128, <XMM0>","PBLENDVB <XMM0>, xmm2/m128, xmm1","pblendvb <XMM0>, xmm2/m128, xmm1","66 0F 38 10 /r","V","V","SSE4_1","","rw,r,r","",""
+"PBLENDW xmm1, xmm2/m128, imm8u","PBLENDW imm8u, xmm2/m128, xmm1","pblendw imm8u, xmm2/m128, xmm1","66 0F 3A 0E /r ib","V","V","SSE4_1","","rw,r,r","",""
+"PCLMULQDQ xmm1, xmm2/m128, imm8u","PCLMULQDQ imm8u, xmm2/m128, xmm1","pclmulqdq imm8u, xmm2/m128, xmm1","66 0F 3A 44 /r ib","V","V","PCLMULQDQ","","rw,r,r","",""
+"PCMPEQB mm1, mm2/m64","PCMPEQB mm2/m64, mm1","pcmpeqb mm2/m64, mm1","0F 74 /r","V","V","MMX","","rw,r","",""
+"PCMPEQB xmm1, xmm2/m128","PCMPEQB xmm2/m128, xmm1","pcmpeqb xmm2/m128, xmm1","66 0F 74 /r","V","V","SSE2","","rw,r","",""
+"PCMPEQD mm1, mm2/m64","PCMPEQL mm2/m64, mm1","pcmpeqd mm2/m64, mm1","0F 76 /r","V","V","MMX","","rw,r","",""
+"PCMPEQD xmm1, xmm2/m128","PCMPEQL xmm2/m128, xmm1","pcmpeqd xmm2/m128, xmm1","66 0F 76 /r","V","V","SSE2","","rw,r","",""
+"PCMPEQQ xmm1, xmm2/m128","PCMPEQQ xmm2/m128, xmm1","pcmpeqq xmm2/m128, xmm1","66 0F 38 29 /r","V","V","SSE4_1","","rw,r","",""
+"PCMPEQW mm1, mm2/m64","PCMPEQW mm2/m64, mm1","pcmpeqw mm2/m64, mm1","0F 75 /r","V","V","MMX","","rw,r","",""
+"PCMPEQW xmm1, xmm2/m128","PCMPEQW xmm2/m128, xmm1","pcmpeqw xmm2/m128, xmm1","66 0F 75 /r","V","V","SSE2","","rw,r","",""
+"PCMPESTRI xmm1, xmm2/m128, imm8u","PCMPESTRI imm8u, xmm2/m128, xmm1","pcmpestri imm8u, xmm2/m128, xmm1","66 0F 3A 61 /r ib","V","V","SSE4_2","","r,r,r","",""
+"PCMPESTRM xmm1, xmm2/m128, imm8u","PCMPESTRM imm8u, xmm2/m128, xmm1","pcmpestrm imm8u, xmm2/m128, xmm1","66 0F 3A 60 /r ib","V","V","SSE4_2","","r,r,r","",""
+"PCMPGTB mm1, mm2/m64","PCMPGTB mm2/m64, mm1","pcmpgtb mm2/m64, mm1","0F 64 /r","V","V","MMX","","rw,r","",""
+"PCMPGTB xmm1, xmm2/m128","PCMPGTB xmm2/m128, xmm1","pcmpgtb xmm2/m128, xmm1","66 0F 64 /r","V","V","SSE2","","rw,r","",""
+"PCMPGTD mm1, mm2/m64","PCMPGTL mm2/m64, mm1","pcmpgtd mm2/m64, mm1","0F 66 /r","V","V","MMX","","rw,r","",""
+"PCMPGTD xmm1, xmm2/m128","PCMPGTL xmm2/m128, xmm1","pcmpgtd xmm2/m128, xmm1","66 0F 66 /r","V","V","SSE2","","rw,r","",""
+"PCMPGTQ xmm1, xmm2/m128","PCMPGTQ xmm2/m128, xmm1","pcmpgtq xmm2/m128, xmm1","66 0F 38 37 /r","V","V","SSE4_2","","rw,r","",""
+"PCMPGTW mm1, mm2/m64","PCMPGTW mm2/m64, mm1","pcmpgtw mm2/m64, mm1","0F 65 /r","V","V","MMX","","rw,r","",""
+"PCMPGTW xmm1, xmm2/m128","PCMPGTW xmm2/m128, xmm1","pcmpgtw xmm2/m128, xmm1","66 0F 65 /r","V","V","SSE2","","rw,r","",""
+"PCMPISTRI xmm1, xmm2/m128, imm8u","PCMPISTRI imm8u, xmm2/m128, xmm1","pcmpistri imm8u, xmm2/m128, xmm1","66 0F 3A 63 /r ib","V","V","SSE4_2","","r,r,r","",""
+"PCMPISTRM xmm1, xmm2/m128, imm8u","PCMPISTRM imm8u, xmm2/m128, xmm1","pcmpistrm imm8u, xmm2/m128, xmm1","66 0F 3A 62 /r ib","V","V","SSE4_2","","r,r,r","",""
+"PDEP r32, r32V, r/m32","PDEPL r/m32, r32V, r32","pdepl r/m32, r32V, r32","VEX.DDS.128.F2.0F38.W0 F5 /r","V","V","BMI2","","rw,r,r","Y","32"
+"PDEP r64, r64V, r/m64","PDEPQ r/m64, r64V, r64","pdepq r/m64, r64V, r64","VEX.DDS.128.F2.0F38.W1 F5 /r","N.S.","V","BMI2","","rw,r,r","Y","64"
+"PEXT r32, r32V, r/m32","PEXTL r/m32, r32V, r32","pextl r/m32, r32V, r32","VEX.DDS.128.F3.0F38.W0 F5 /r","V","V","BMI2","","rw,r,r","Y","32"
+"PEXT r64, r64V, r/m64","PEXTQ r/m64, r64V, r64","pextq r/m64, r64V, r64","VEX.DDS.128.F3.0F38.W1 F5 /r","N.S.","V","BMI2","","rw,r,r","Y","64"
+"PEXTRB r32/m8, xmm1, imm8u","PEXTRB imm8u, xmm1, r32/m8","pextrb imm8u, xmm1, r32/m8","66 0F 3A 14 /r ib","V","V","SSE4_1","","w,r,r","",""
+"PEXTRD r/m32, xmm1, imm8u","PEXTRD imm8u, xmm1, r/m32","pextrd imm8u, xmm1, r/m32","66 0F 3A 16 /r ib","V","V","SSE4_1","operand16,operand32","w,r,r","",""
+"PEXTRQ r/m64, xmm1, imm8u","PEXTRQ imm8u, xmm1, r/m64","pextrq imm8u, xmm1, r/m64","66 REX.W 0F 3A 16 /r ib","N.S.","V","SSE4_1","","w,r,r","",""
+"PEXTRW r32, mm2, imm8u","PEXTRW imm8u, mm2, r32","pextrw imm8u, mm2, r32","0F C5 /r ib","V","V","MMX","modrm_regonly","w,r,r","",""
+"PEXTRW r32/m16, xmm1, imm8u","PEXTRW imm8u, xmm1, r32/m16","pextrw imm8u, xmm1, r32/m16","66 0F 3A 15 /r ib","V","V","SSE4_1","","w,r,r","",""
+"PEXTRW r32, xmm2, imm8u","PEXTRW imm8u, xmm2, r32","pextrw imm8u, xmm2, r32","66 0F C5 /r ib","V","V","SSE2","modrm_regonly","w,r,r","",""
+"PF2ID mm1, mm2/m64","PF2ID mm2/m64, mm1","pf2id mm2/m64, mm1","0F 0F 1D /r","V","V","3DNOW","amd","rw,r","",""
+"PF2IW mm1, mm2/m64","PF2IW mm2/m64, mm1","pf2iw mm2/m64, mm1","0F 0F 1C /r","V","V","3DNOW","amd","rw,r","",""
+"PFACC mm1, mm2/m64","PFACC mm2/m64, mm1","pfacc mm2/m64, mm1","0F 0F AE /r","V","V","3DNOW","amd","rw,r","",""
+"PFADD mm1, mm2/m64","PFADD mm2/m64, mm1","pfadd mm2/m64, mm1","0F 0F 9E /r","V","V","3DNOW","amd","rw,r","",""
+"PFCMPEQ mm1, mm2/m64","PFCMPEQ mm2/m64, mm1","pfcmpeq mm2/m64, mm1","0F 0F B0 /r","V","V","3DNOW","amd","rw,r","",""
+"PFCMPGE mm1, mm2/m64","PFCMPGE mm2/m64, mm1","pfcmpge mm2/m64, mm1","0F 0F 90 /r","V","V","3DNOW","amd","rw,r","",""
+"PFCMPGT mm1, mm2/m64","PFCMPGT mm2/m64, mm1","pfcmpgt mm2/m64, mm1","0F 0F A0 /r","V","V","3DNOW","amd","rw,r","",""
+"PFRCPIT1 mm1, mm2/m64","PFRCPIT1 mm2/m64, mm1","pfrcpit1 mm2/m64, mm1","0F 0F A6 /r","V","V","3DNOW","amd","rw,r","",""
+"PFMAX mm1, mm2/m64","PFMAX mm2/m64, mm1","pfmax mm2/m64, mm1","0F 0F A4 /r","V","V","3DNOW","amd","rw,r","",""
+"PFMIN mm1, mm2/m64","PFMIN mm2/m64, mm1","pfmin mm2/m64, mm1","0F 0F 94 /r","V","V","3DNOW","amd","rw,r","",""
+"PFMUL mm1, mm2/m64","PFMUL mm2/m64, mm1","pfmul mm2/m64, mm1","0F 0F B4 /r","V","V","3DNOW","amd","rw,r","",""
+"PFNACC mm1, mm2/m64","PFNACC mm2/m64, mm1","pfnacc mm2/m64, mm1","0F 0F 8A /r","V","V","3DNOW","amd","rw,r","",""
+"PFPNACC mm1, mm2/m64","PFPNACC mm2/m64, mm1","pfpnacc mm2/m64, mm1","0F 0F 8E /r","V","V","3DNOW","amd","rw,r","",""
+"PFRCP mm1, mm2/m64","PFRCP mm2/m64, mm1","pfrcp mm2/m64, mm1","0F 0F 96 /r","V","V","3DNOW","amd","rw,r","",""
+"PFRCPIT2 mm1, mm2/m64","PFRCPIT2 mm2/m64, mm1","pfrcpit2 mm2/m64, mm1","0F 0F B6 /r","V","V","3DNOW","amd","rw,r","",""
+"PFRSQIT1 mm1, mm2/m64","PFRSQIT1 mm2/m64, mm1","pfrsqit1 mm2/m64, mm1","0F 0F A7 /r","V","V","3DNOW","amd","rw,r","",""
+"PFSQRT mm1, mm2/m64","PFSQRT mm2/m64, mm1","pfsqrt mm2/m64, mm1","0F 0F 97 /r","V","V","3DNOW","amd","rw,r","",""
+"PFSUB mm1, mm2/m64","PFSUB mm2/m64, mm1","pfsub mm2/m64, mm1","0F 0F 9A /r","V","V","3DNOW","amd","rw,r","",""
+"PFSUBR mm1, mm2/m64","PFSUBR mm2/m64, mm1","pfsubr mm2/m64, mm1","0F 0F AA /r","V","V","3DNOW","amd","rw,r","",""
+"PHADDD mm1, mm2/m64","PHADDD mm2/m64, mm1","phaddd mm2/m64, mm1","0F 38 02 /r","V","V","SSSE3","","rw,r","",""
+"PHADDD xmm1, xmm2/m128","PHADDD xmm2/m128, xmm1","phaddd xmm2/m128, xmm1","66 0F 38 02 /r","V","V","SSSE3","","rw,r","",""
+"PHADDSW mm1, mm2/m64","PHADDSW mm2/m64, mm1","phaddsw mm2/m64, mm1","0F 38 03 /r","V","V","SSSE3","","rw,r","",""
+"PHADDSW xmm1, xmm2/m128","PHADDSW xmm2/m128, xmm1","phaddsw xmm2/m128, xmm1","66 0F 38 03 /r","V","V","SSSE3","","rw,r","",""
+"PHADDW mm1, mm2/m64","PHADDW mm2/m64, mm1","phaddw mm2/m64, mm1","0F 38 01 /r","V","V","SSSE3","","rw,r","",""
+"PHADDW xmm1, xmm2/m128","PHADDW xmm2/m128, xmm1","phaddw xmm2/m128, xmm1","66 0F 38 01 /r","V","V","SSSE3","","rw,r","",""
+"PHMINPOSUW xmm1, xmm2/m128","PHMINPOSUW xmm2/m128, xmm1","phminposuw xmm2/m128, xmm1","66 0F 38 41 /r","V","V","SSE4_1","","w,r","",""
+"PHSUBD mm1, mm2/m64","PHSUBD mm2/m64, mm1","phsubd mm2/m64, mm1","0F 38 06 /r","V","V","SSSE3","","rw,r","",""
+"PHSUBD xmm1, xmm2/m128","PHSUBD xmm2/m128, xmm1","phsubd xmm2/m128, xmm1","66 0F 38 06 /r","V","V","SSSE3","","rw,r","",""
+"PHSUBSW mm1, mm2/m64","PHSUBSW mm2/m64, mm1","phsubsw mm2/m64, mm1","0F 38 07 /r","V","V","SSSE3","","rw,r","",""
+"PHSUBSW xmm1, xmm2/m128","PHSUBSW xmm2/m128, xmm1","phsubsw xmm2/m128, xmm1","66 0F 38 07 /r","V","V","SSSE3","","rw,r","",""
+"PHSUBW mm1, mm2/m64","PHSUBW mm2/m64, mm1","phsubw mm2/m64, mm1","0F 38 05 /r","V","V","SSSE3","","rw,r","",""
+"PHSUBW xmm1, xmm2/m128","PHSUBW xmm2/m128, xmm1","phsubw xmm2/m128, xmm1","66 0F 38 05 /r","V","V","SSSE3","","rw,r","",""
+"PI2FD mm1, mm2/m64","PI2FD mm2/m64, mm1","pi2fd mm2/m64, mm1","0F 0F 0D /r","V","V","3DNOW","amd","rw,r","",""
+"PI2FW mm1, mm2/m64","PI2FW mm2/m64, mm1","pi2fw mm2/m64, mm1","0F 0F 0C /r","V","V","3DNOW","amd","rw,r","",""
+"PINSRB xmm1, r32/m8, imm8u","PINSRB imm8u, r32/m8, xmm1","pinsrb imm8u, r32/m8, xmm1","66 0F 3A 20 /r ib","V","V","SSE4_1","","rw,r,r","",""
+"PINSRD xmm1, r/m32, imm8u","PINSRD imm8u, r/m32, xmm1","pinsrd imm8u, r/m32, xmm1","66 0F 3A 22 /r ib","V","V","SSE4_1","operand16,operand32","rw,r,r","",""
+"PINSRQ xmm1, r/m64, imm8u","PINSRQ imm8u, r/m64, xmm1","pinsrq imm8u, r/m64, xmm1","66 REX.W 0F 3A 22 /r ib","N.S.","V","SSE4_1","","rw,r,r","",""
+"PINSRW mm1, r32/m16, imm8u","PINSRW imm8u, r32/m16, mm1","pinsrw imm8u, r32/m16, mm1","0F C4 /r ib","V","V","MMX","","rw,r,r","",""
+"PINSRW xmm1, r32/m16, imm8u","PINSRW imm8u, r32/m16, xmm1","pinsrw imm8u, r32/m16, xmm1","66 0F C4 /r ib","V","V","SSE2","","rw,r,r","",""
+"PMADDUBSW mm1, mm2/m64","PMADDUBSW mm2/m64, mm1","pmaddubsw mm2/m64, mm1","0F 38 04 /r","V","V","SSSE3","","rw,r","",""
+"PMADDUBSW xmm1, xmm2/m128","PMADDUBSW xmm2/m128, xmm1","pmaddubsw xmm2/m128, xmm1","66 0F 38 04 /r","V","V","SSSE3","","rw,r","",""
+"PMADDWD mm1, mm2/m64","PMADDWL mm2/m64, mm1","pmaddwd mm2/m64, mm1","0F F5 /r","V","V","MMX","","rw,r","",""
+"PMADDWD xmm1, xmm2/m128","PMADDWL xmm2/m128, xmm1","pmaddwd xmm2/m128, xmm1","66 0F F5 /r","V","V","SSE2","","rw,r","",""
+"PMAXSB xmm1, xmm2/m128","PMAXSB xmm2/m128, xmm1","pmaxsb xmm2/m128, xmm1","66 0F 38 3C /r","V","V","SSE4_1","","rw,r","",""
+"PMAXSD xmm1, xmm2/m128","PMAXSD xmm2/m128, xmm1","pmaxsd xmm2/m128, xmm1","66 0F 38 3D /r","V","V","SSE4_1","","rw,r","",""
+"PMAXSW mm1, mm2/m64","PMAXSW mm2/m64, mm1","pmaxsw mm2/m64, mm1","0F EE /r","V","V","MMX","","rw,r","",""
+"PMAXSW xmm1, xmm2/m128","PMAXSW xmm2/m128, xmm1","pmaxsw xmm2/m128, xmm1","66 0F EE /r","V","V","SSE2","","rw,r","",""
+"PMAXUB mm1, mm2/m64","PMAXUB mm2/m64, mm1","pmaxub mm2/m64, mm1","0F DE /r","V","V","MMX","","rw,r","",""
+"PMAXUB xmm1, xmm2/m128","PMAXUB xmm2/m128, xmm1","pmaxub xmm2/m128, xmm1","66 0F DE /r","V","V","SSE2","","rw,r","",""
+"PMAXUD xmm1, xmm2/m128","PMAXUD xmm2/m128, xmm1","pmaxud xmm2/m128, xmm1","66 0F 38 3F /r","V","V","SSE4_1","","rw,r","",""
+"PMAXUW xmm1, xmm2/m128","PMAXUW xmm2/m128, xmm1","pmaxuw xmm2/m128, xmm1","66 0F 38 3E /r","V","V","SSE4_1","","rw,r","",""
+"PMINSB xmm1, xmm2/m128","PMINSB xmm2/m128, xmm1","pminsb xmm2/m128, xmm1","66 0F 38 38 /r","V","V","SSE4_1","","rw,r","",""
+"PMINSD xmm1, xmm2/m128","PMINSD xmm2/m128, xmm1","pminsd xmm2/m128, xmm1","66 0F 38 39 /r","V","V","SSE4_1","","rw,r","",""
+"PMINSW mm1, mm2/m64","PMINSW mm2/m64, mm1","pminsw mm2/m64, mm1","0F EA /r","V","V","MMX","","rw,r","",""
+"PMINSW xmm1, xmm2/m128","PMINSW xmm2/m128, xmm1","pminsw xmm2/m128, xmm1","66 0F EA /r","V","V","SSE2","","rw,r","",""
+"PMINUB mm1, mm2/m64","PMINUB mm2/m64, mm1","pminub mm2/m64, mm1","0F DA /r","V","V","MMX","","rw,r","",""
+"PMINUB xmm1, xmm2/m128","PMINUB xmm2/m128, xmm1","pminub xmm2/m128, xmm1","66 0F DA /r","V","V","SSE2","","rw,r","",""
+"PMINUD xmm1, xmm2/m128","PMINUD xmm2/m128, xmm1","pminud xmm2/m128, xmm1","66 0F 38 3B /r","V","V","SSE4_1","","rw,r","",""
+"PMINUW xmm1, xmm2/m128","PMINUW xmm2/m128, xmm1","pminuw xmm2/m128, xmm1","66 0F 38 3A /r","V","V","SSE4_1","","rw,r","",""
+"PMOVMSKB r32, mm2","PMOVMSKB mm2, r32","pmovmskb mm2, r32","0F D7 /r","V","V","SSE","modrm_regonly","w,r","",""
+"PMOVMSKB r32, xmm2","PMOVMSKB xmm2, r32","pmovmskb xmm2, r32","66 0F D7 /r","V","V","SSE2","modrm_regonly","w,r","",""
+"PMOVSXBD xmm1, xmm2/m32","PMOVSXBD xmm2/m32, xmm1","pmovsxbd xmm2/m32, xmm1","66 0F 38 21 /r","V","V","SSE4_1","","w,r","",""
+"PMOVSXBQ xmm1, xmm2/m16","PMOVSXBQ xmm2/m16, xmm1","pmovsxbq xmm2/m16, xmm1","66 0F 38 22 /r","V","V","SSE4_1","","w,r","",""
+"PMOVSXBW xmm1, xmm2/m64","PMOVSXBW xmm2/m64, xmm1","pmovsxbw xmm2/m64, xmm1","66 0F 38 20 /r","V","V","SSE4_1","","w,r","",""
+"PMOVSXDQ xmm1, xmm2/m64","PMOVSXDQ xmm2/m64, xmm1","pmovsxdq xmm2/m64, xmm1","66 0F 38 25 /r","V","V","SSE4_1","","w,r","",""
+"PMOVSXWD xmm1, xmm2/m64","PMOVSXWD xmm2/m64, xmm1","pmovsxwd xmm2/m64, xmm1","66 0F 38 23 /r","V","V","SSE4_1","","w,r","",""
+"PMOVSXWQ xmm1, xmm2/m32","PMOVSXWQ xmm2/m32, xmm1","pmovsxwq xmm2/m32, xmm1","66 0F 38 24 /r","V","V","SSE4_1","","w,r","",""
+"PMOVZXBD xmm1, xmm2/m32","PMOVZXBD xmm2/m32, xmm1","pmovzxbd xmm2/m32, xmm1","66 0F 38 31 /r","V","V","SSE4_1","","w,r","",""
+"PMOVZXBQ xmm1, xmm2/m16","PMOVZXBQ xmm2/m16, xmm1","pmovzxbq xmm2/m16, xmm1","66 0F 38 32 /r","V","V","SSE4_1","","w,r","",""
+"PMOVZXBW xmm1, xmm2/m64","PMOVZXBW xmm2/m64, xmm1","pmovzxbw xmm2/m64, xmm1","66 0F 38 30 /r","V","V","SSE4_1","","w,r","",""
+"PMOVZXDQ xmm1, xmm2/m64","PMOVZXDQ xmm2/m64, xmm1","pmovzxdq xmm2/m64, xmm1","66 0F 38 35 /r","V","V","SSE4_1","","w,r","",""
+"PMOVZXWD xmm1, xmm2/m64","PMOVZXWD xmm2/m64, xmm1","pmovzxwd xmm2/m64, xmm1","66 0F 38 33 /r","V","V","SSE4_1","","w,r","",""
+"PMOVZXWQ xmm1, xmm2/m32","PMOVZXWQ xmm2/m32, xmm1","pmovzxwq xmm2/m32, xmm1","66 0F 38 34 /r","V","V","SSE4_1","","w,r","",""
+"PMULDQ xmm1, xmm2/m128","PMULDQ xmm2/m128, xmm1","pmuldq xmm2/m128, xmm1","66 0F 38 28 /r","V","V","SSE4_1","","rw,r","",""
+"PMULHRSW mm1, mm2/m64","PMULHRSW mm2/m64, mm1","pmulhrsw mm2/m64, mm1","0F 38 0B /r","V","V","SSSE3","","rw,r","",""
+"PMULHRSW xmm1, xmm2/m128","PMULHRSW xmm2/m128, xmm1","pmulhrsw xmm2/m128, xmm1","66 0F 38 0B /r","V","V","SSSE3","","rw,r","",""
+"PMULHRW mm1, mm2/m64","PMULHRW mm2/m64, mm1","pmulhrw mm2/m64, mm1","0F 0F B7 /r","V","V","3DNOW","amd","rw,r","",""
+"PMULHUW mm1, mm2/m64","PMULHUW mm2/m64, mm1","pmulhuw mm2/m64, mm1","0F E4 /r","V","V","MMX","","rw,r","",""
+"PMULHUW xmm1, xmm2/m128","PMULHUW xmm2/m128, xmm1","pmulhuw xmm2/m128, xmm1","66 0F E4 /r","V","V","SSE2","","rw,r","",""
+"PMULHW mm1, mm2/m64","PMULHW mm2/m64, mm1","pmulhw mm2/m64, mm1","0F E5 /r","V","V","MMX","","rw,r","",""
+"PMULHW xmm1, xmm2/m128","PMULHW xmm2/m128, xmm1","pmulhw xmm2/m128, xmm1","66 0F E5 /r","V","V","SSE2","","rw,r","",""
+"PMULLD xmm1, xmm2/m128","PMULLD xmm2/m128, xmm1","pmulld xmm2/m128, xmm1","66 0F 38 40 /r","V","V","SSE4_1","","rw,r","",""
+"PMULLW mm1, mm2/m64","PMULLW mm2/m64, mm1","pmullw mm2/m64, mm1","0F D5 /r","V","V","MMX","","rw,r","",""
+"PMULLW xmm1, xmm2/m128","PMULLW xmm2/m128, xmm1","pmullw xmm2/m128, xmm1","66 0F D5 /r","V","V","SSE2","","rw,r","",""
+"PMULUDQ mm1, mm2/m64","PMULULQ mm2/m64, mm1","pmuludq mm2/m64, mm1","0F F4 /r","V","V","SSE2","","rw,r","",""
+"PMULUDQ xmm1, xmm2/m128","PMULULQ xmm2/m128, xmm1","pmuludq xmm2/m128, xmm1","66 0F F4 /r","V","V","SSE2","","rw,r","",""
+"POPAD","POPAL","popal","61","V","N.S.","","operand32","","",""
+"POPA","POPAW","popaw","61","V","N.S.","","operand16","","",""
+"POPCNT r32, r/m32","POPCNTL r/m32, r32","popcntl r/m32, r32","F3 0F B8 /r","V","V","POPCNT","operand32","w,r","Y","32"
+"POPCNT r64, r/m64","POPCNTQ r/m64, r64","popcntq r/m64, r64","F3 REX.W 0F B8 /r","N.S.","V","POPCNT","","w,r","Y","64"
+"POPCNT r16, r/m16","POPCNTW r/m16, r16","popcntw r/m16, r16","F3 0F B8 /r","V","V","POPCNT","operand16","w,r","Y","16"
+"POPFD","POPFL","popfl","9D","V","N.S.","","operand32","","",""
+"POPFQ","POPFQ","popfq","9D","N.S.","V","","default64","","",""
+"POPF","POPFW","popfw","9D","V","V","","operand16","","",""
+"POP r/m32","POPL r/m32","popl r/m32","8F /0","V","N.S.","","operand32","w","Y","32"
+"POP r32op","POPL r32op","popl r32op","58+rd","V","N.S.","","operand32","w","Y","32"
+"POP r/m64","POPQ r/m64","popq r/m64","8F /0","N.S.","V","","default64","w","Y","64"
+"POP r64op","POPQ r64op","popq r64op","58+ro","N.S.","V","","default64","w","Y","64"
+"POP r/m16","POPW r/m16","popw r/m16","8F /0","V","V","","operand16","w","Y","16"
+"POP r16op","POPW r16op","popw r16op","58+rw","V","V","","operand16","w","Y","16"
+"POP DS","POPW/POPL/POPQ DS","popw/popl/popq DS","1F","V","N.S.","","","w","Y",""
+"POP ES","POPW/POPL/POPQ ES","popw/popl/popq ES","07","V","N.S.","","","w","Y",""
+"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","N.S.","V","","default64","w","Y",""
+"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","V","N.S.","","operand32","w","Y",""
+"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","V","V","","operand16","w","Y",""
+"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","N.S.","V","","default64","w","Y",""
+"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","V","V","","operand16","w","Y",""
+"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","V","N.S.","","operand32","w","Y",""
+"POP SS","POPW/POPL/POPQ SS","popw/popl/popq SS","17","V","N.S.","","","w","Y",""
+"POR mm1, mm2/m64","POR mm2/m64, mm1","por mm2/m64, mm1","0F EB /r","V","V","MMX","","rw,r","",""
+"POR xmm1, xmm2/m128","POR xmm2/m128, xmm1","por xmm2/m128, xmm1","66 0F EB /r","V","V","SSE2","","rw,r","",""
+"PREFETCHNTA m8","PREFETCHNTA m8","prefetchnta m8","0F 18 /0","V","V","","modrm_memonly","r","",""
+"PREFETCHT0 m8","PREFETCHT0 m8","prefetcht0 m8","0F 18 /1","V","V","","modrm_memonly","r","",""
+"PREFETCHT1 m8","PREFETCHT1 m8","prefetcht1 m8","0F 18 /2","V","V","","modrm_memonly","r","",""
+"PREFETCHT2 m8","PREFETCHT2 m8","prefetcht2 m8","0F 18 /3","V","V","","modrm_memonly","r","",""
+"PREFETCHW m8","PREFETCHW m8","prefetchw m8","0F 0D /1","V","V","PRFCHW","modrm_memonly","r","",""
+"PREFETCHWT1 m8","PREFETCHWT1 m8","prefetchwt1 m8","0F 0D /2","V","V","PREFETCHWT1","modrm_memonly","r","",""
+"PREFETCHW_ALIAS m8","PREFETCHW_ALIAS m8","prefetchw_alias m8","0F 0D /3","V","V","PRFCHW","modrm_memonly","r","",""
+"PREFETCH_EXCLUSIVE m8","PREFETCH_EXCLUSIVE m8","prefetch_exclusive m8","0F 0D /0","V","V","PRFCHW","modrm_memonly","r","",""
+"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /2","V","V","PRFCHW","modrm_memonly","r","Y",""
+"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /4","V","V","PRFCHW","modrm_memonly","r","Y",""
+"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /5","V","V","PRFCHW","modrm_memonly","r","Y",""
+"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /6","V","V","PRFCHW","modrm_memonly","r","Y",""
+"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /7","V","V","PRFCHW","modrm_memonly","r","Y",""
+"PSADBW mm1, mm2/m64","PSADBW mm2/m64, mm1","psadbw mm2/m64, mm1","0F F6 /r","V","V","MMX","","rw,r","",""
+"PSADBW xmm1, xmm2/m128","PSADBW xmm2/m128, xmm1","psadbw xmm2/m128, xmm1","66 0F F6 /r","V","V","SSE2","","rw,r","",""
+"PSHUFB mm1, mm2/m64","PSHUFB mm2/m64, mm1","pshufb mm2/m64, mm1","0F 38 00 /r","V","V","SSSE3","","rw,r","",""
+"PSHUFB xmm1, xmm2/m128","PSHUFB xmm2/m128, xmm1","pshufb xmm2/m128, xmm1","66 0F 38 00 /r","V","V","SSSE3","","rw,r","",""
+"PSHUFD xmm1, xmm2/m128, imm8u","PSHUFD imm8u, xmm2/m128, xmm1","pshufd imm8u, xmm2/m128, xmm1","66 0F 70 /r ib","V","V","SSE2","","w,r,r","",""
+"PSHUFHW xmm1, xmm2/m128, imm8u","PSHUFHW imm8u, xmm2/m128, xmm1","pshufhw imm8u, xmm2/m128, xmm1","F3 0F 70 /r ib","V","V","SSE2","","w,r,r","",""
+"PSHUFLW xmm1, xmm2/m128, imm8u","PSHUFLW imm8u, xmm2/m128, xmm1","pshuflw imm8u, xmm2/m128, xmm1","F2 0F 70 /r ib","V","V","SSE2","","w,r,r","",""
+"PSHUFW mm1, mm2/m64, imm8u","PSHUFW imm8u, mm2/m64, mm1","pshufw imm8u, mm2/m64, mm1","0F 70 /r ib","V","V","MMX","","w,r,r","",""
+"PSIGNB mm1, mm2/m64","PSIGNB mm2/m64, mm1","psignb mm2/m64, mm1","0F 38 08 /r","V","V","SSSE3","","rw,r","",""
+"PSIGNB xmm1, xmm2/m128","PSIGNB xmm2/m128, xmm1","psignb xmm2/m128, xmm1","66 0F 38 08 /r","V","V","SSSE3","","rw,r","",""
+"PSIGND mm1, mm2/m64","PSIGND mm2/m64, mm1","psignd mm2/m64, mm1","0F 38 0A /r","V","V","SSSE3","","rw,r","",""
+"PSIGND xmm1, xmm2/m128","PSIGND xmm2/m128, xmm1","psignd xmm2/m128, xmm1","66 0F 38 0A /r","V","V","SSSE3","","rw,r","",""
+"PSIGNW mm1, mm2/m64","PSIGNW mm2/m64, mm1","psignw mm2/m64, mm1","0F 38 09 /r","V","V","SSSE3","","rw,r","",""
+"PSIGNW xmm1, xmm2/m128","PSIGNW xmm2/m128, xmm1","psignw xmm2/m128, xmm1","66 0F 38 09 /r","V","V","SSSE3","","rw,r","",""
+"PSLLD mm2, imm8u","PSLLL imm8u, mm2","pslld imm8u, mm2","0F 72 /6 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSLLD xmm2, imm8u","PSLLL imm8u, xmm2","pslld imm8u, xmm2","66 0F 72 /6 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSLLD mm1, mm2/m64","PSLLL mm2/m64, mm1","pslld mm2/m64, mm1","0F F2 /r","V","V","MMX","","rw,r","",""
+"PSLLD xmm1, xmm2/m128","PSLLL xmm2/m128, xmm1","pslld xmm2/m128, xmm1","66 0F F2 /r","V","V","SSE2","","rw,r","",""
+"PSLLDQ xmm2, imm8u","PSLLO imm8u, xmm2","pslldq imm8u, xmm2","66 0F 73 /7 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSLLQ mm2, imm8u","PSLLQ imm8u, mm2","psllq imm8u, mm2","0F 73 /6 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSLLQ xmm2, imm8u","PSLLQ imm8u, xmm2","psllq imm8u, xmm2","66 0F 73 /6 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSLLQ mm1, mm2/m64","PSLLQ mm2/m64, mm1","psllq mm2/m64, mm1","0F F3 /r","V","V","MMX","","rw,r","",""
+"PSLLQ xmm1, xmm2/m128","PSLLQ xmm2/m128, xmm1","psllq xmm2/m128, xmm1","66 0F F3 /r","V","V","SSE2","","rw,r","",""
+"PSLLW mm2, imm8u","PSLLW imm8u, mm2","psllw imm8u, mm2","0F 71 /6 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSLLW xmm2, imm8u","PSLLW imm8u, xmm2","psllw imm8u, xmm2","66 0F 71 /6 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSLLW mm1, mm2/m64","PSLLW mm2/m64, mm1","psllw mm2/m64, mm1","0F F1 /r","V","V","MMX","","rw,r","",""
+"PSLLW xmm1, xmm2/m128","PSLLW xmm2/m128, xmm1","psllw xmm2/m128, xmm1","66 0F F1 /r","V","V","SSE2","","rw,r","",""
+"PSRAD mm2, imm8u","PSRAL imm8u, mm2","psrad imm8u, mm2","0F 72 /4 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSRAD xmm2, imm8u","PSRAL imm8u, xmm2","psrad imm8u, xmm2","66 0F 72 /4 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSRAD mm1, mm2/m64","PSRAL mm2/m64, mm1","psrad mm2/m64, mm1","0F E2 /r","V","V","MMX","","rw,r","",""
+"PSRAD xmm1, xmm2/m128","PSRAL xmm2/m128, xmm1","psrad xmm2/m128, xmm1","66 0F E2 /r","V","V","SSE2","","rw,r","",""
+"PSRAW mm2, imm8u","PSRAW imm8u, mm2","psraw imm8u, mm2","0F 71 /4 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSRAW xmm2, imm8u","PSRAW imm8u, xmm2","psraw imm8u, xmm2","66 0F 71 /4 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSRAW mm1, mm2/m64","PSRAW mm2/m64, mm1","psraw mm2/m64, mm1","0F E1 /r","V","V","MMX","","rw,r","",""
+"PSRAW xmm1, xmm2/m128","PSRAW xmm2/m128, xmm1","psraw xmm2/m128, xmm1","66 0F E1 /r","V","V","SSE2","","rw,r","",""
+"PSRLD mm2, imm8u","PSRLL imm8u, mm2","psrld imm8u, mm2","0F 72 /2 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSRLD xmm2, imm8u","PSRLL imm8u, xmm2","psrld imm8u, xmm2","66 0F 72 /2 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSRLD mm1, mm2/m64","PSRLL mm2/m64, mm1","psrld mm2/m64, mm1","0F D2 /r","V","V","MMX","","rw,r","",""
+"PSRLD xmm1, xmm2/m128","PSRLL xmm2/m128, xmm1","psrld xmm2/m128, xmm1","66 0F D2 /r","V","V","SSE2","","rw,r","",""
+"PSRLDQ xmm2, imm8u","PSRLO imm8u, xmm2","psrldq imm8u, xmm2","66 0F 73 /3 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSRLQ mm2, imm8u","PSRLQ imm8u, mm2","psrlq imm8u, mm2","0F 73 /2 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSRLQ xmm2, imm8u","PSRLQ imm8u, xmm2","psrlq imm8u, xmm2","66 0F 73 /2 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSRLQ mm1, mm2/m64","PSRLQ mm2/m64, mm1","psrlq mm2/m64, mm1","0F D3 /r","V","V","MMX","","rw,r","",""
+"PSRLQ xmm1, xmm2/m128","PSRLQ xmm2/m128, xmm1","psrlq xmm2/m128, xmm1","66 0F D3 /r","V","V","SSE2","","rw,r","",""
+"PSRLW mm2, imm8u","PSRLW imm8u, mm2","psrlw imm8u, mm2","0F 71 /2 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSRLW xmm2, imm8u","PSRLW imm8u, xmm2","psrlw imm8u, xmm2","66 0F 71 /2 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSRLW mm1, mm2/m64","PSRLW mm2/m64, mm1","psrlw mm2/m64, mm1","0F D1 /r","V","V","MMX","","rw,r","",""
+"PSRLW xmm1, xmm2/m128","PSRLW xmm2/m128, xmm1","psrlw xmm2/m128, xmm1","66 0F D1 /r","V","V","SSE2","","rw,r","",""
+"PSUBB mm1, mm2/m64","PSUBB mm2/m64, mm1","psubb mm2/m64, mm1","0F F8 /r","V","V","MMX","","rw,r","",""
+"PSUBB xmm1, xmm2/m128","PSUBB xmm2/m128, xmm1","psubb xmm2/m128, xmm1","66 0F F8 /r","V","V","SSE2","","rw,r","",""
+"PSUBD mm1, mm2/m64","PSUBL mm2/m64, mm1","psubd mm2/m64, mm1","0F FA /r","V","V","MMX","","rw,r","",""
+"PSUBD xmm1, xmm2/m128","PSUBL xmm2/m128, xmm1","psubd xmm2/m128, xmm1","66 0F FA /r","V","V","SSE2","","rw,r","",""
+"PSUBQ mm1, mm2/m64","PSUBQ mm2/m64, mm1","psubq mm2/m64, mm1","0F FB /r","V","V","SSE2","","rw,r","",""
+"PSUBQ xmm1, xmm2/m128","PSUBQ xmm2/m128, xmm1","psubq xmm2/m128, xmm1","66 0F FB /r","V","V","SSE2","","rw,r","",""
+"PSUBSB mm1, mm2/m64","PSUBSB mm2/m64, mm1","psubsb mm2/m64, mm1","0F E8 /r","V","V","MMX","","rw,r","",""
+"PSUBSB xmm1, xmm2/m128","PSUBSB xmm2/m128, xmm1","psubsb xmm2/m128, xmm1","66 0F E8 /r","V","V","SSE2","","rw,r","",""
+"PSUBSW mm1, mm2/m64","PSUBSW mm2/m64, mm1","psubsw mm2/m64, mm1","0F E9 /r","V","V","MMX","","rw,r","",""
+"PSUBSW xmm1, xmm2/m128","PSUBSW xmm2/m128, xmm1","psubsw xmm2/m128, xmm1","66 0F E9 /r","V","V","SSE2","","rw,r","",""
+"PSUBUSB mm1, mm2/m64","PSUBUSB mm2/m64, mm1","psubusb mm2/m64, mm1","0F D8 /r","V","V","MMX","","rw,r","",""
+"PSUBUSB xmm1, xmm2/m128","PSUBUSB xmm2/m128, xmm1","psubusb xmm2/m128, xmm1","66 0F D8 /r","V","V","SSE2","","rw,r","",""
+"PSUBUSW mm1, mm2/m64","PSUBUSW mm2/m64, mm1","psubusw mm2/m64, mm1","0F D9 /r","V","V","MMX","","rw,r","",""
+"PSUBUSW xmm1, xmm2/m128","PSUBUSW xmm2/m128, xmm1","psubusw xmm2/m128, xmm1","66 0F D9 /r","V","V","SSE2","","rw,r","",""
+"PSUBW mm1, mm2/m64","PSUBW mm2/m64, mm1","psubw mm2/m64, mm1","0F F9 /r","V","V","MMX","","rw,r","",""
+"PSUBW xmm1, xmm2/m128","PSUBW xmm2/m128, xmm1","psubw xmm2/m128, xmm1","66 0F F9 /r","V","V","SSE2","","rw,r","",""
+"PSWAPD mm1, mm2/m64","PSWAPD mm2/m64, mm1","pswapd mm2/m64, mm1","0F 0F BB /r","V","V","3DNOW","amd","rw,r","",""
+"PTEST xmm1, xmm2/m128","PTEST xmm2/m128, xmm1","ptest xmm2/m128, xmm1","66 0F 38 17 /r","V","V","SSE4_1","","r,r","",""
+"PTWRITE r/m32","PTWRITEL r/m32","ptwritel r/m32","F3 0F AE /4","V","V","","operand16,operand32","r","Y","32"
+"PTWRITE r/m64","PTWRITEQ r/m64","ptwriteq r/m64","F3 REX.W 0F AE /4","N.S.","V","","","r","Y","64"
+"PUNPCKHBW mm1, mm2/m64","PUNPCKHBW mm2/m64, mm1","punpckhbw mm2/m64, mm1","0F 68 /r","V","V","MMX","","rw,r","",""
+"PUNPCKHBW xmm1, xmm2/m128","PUNPCKHBW xmm2/m128, xmm1","punpckhbw xmm2/m128, xmm1","66 0F 68 /r","V","V","SSE2","","rw,r","",""
+"PUNPCKHDQ mm1, mm2/m64","PUNPCKHLQ mm2/m64, mm1","punpckhdq mm2/m64, mm1","0F 6A /r","V","V","MMX","","rw,r","",""
+"PUNPCKHDQ xmm1, xmm2/m128","PUNPCKHLQ xmm2/m128, xmm1","punpckhdq xmm2/m128, xmm1","66 0F 6A /r","V","V","SSE2","","rw,r","",""
+"PUNPCKHQDQ xmm1, xmm2/m128","PUNPCKHQDQ xmm2/m128, xmm1","punpckhqdq xmm2/m128, xmm1","66 0F 6D /r","V","V","SSE2","","rw,r","",""
+"PUNPCKHWD mm1, mm2/m64","PUNPCKHWL mm2/m64, mm1","punpckhwd mm2/m64, mm1","0F 69 /r","V","V","MMX","","rw,r","",""
+"PUNPCKHWD xmm1, xmm2/m128","PUNPCKHWL xmm2/m128, xmm1","punpckhwd xmm2/m128, xmm1","66 0F 69 /r","V","V","SSE2","","rw,r","",""
+"PUNPCKLBW mm1, mm2/m32","PUNPCKLBW mm2/m32, mm1","punpcklbw mm2/m32, mm1","0F 60 /r","V","V","MMX","","rw,r","",""
+"PUNPCKLBW xmm1, xmm2/m128","PUNPCKLBW xmm2/m128, xmm1","punpcklbw xmm2/m128, xmm1","66 0F 60 /r","V","V","SSE2","","rw,r","",""
+"PUNPCKLDQ mm1, mm2/m32","PUNPCKLLQ mm2/m32, mm1","punpckldq mm2/m32, mm1","0F 62 /r","V","V","MMX","","rw,r","",""
+"PUNPCKLDQ xmm1, xmm2/m128","PUNPCKLLQ xmm2/m128, xmm1","punpckldq xmm2/m128, xmm1","66 0F 62 /r","V","V","SSE2","","rw,r","",""
+"PUNPCKLQDQ xmm1, xmm2/m128","PUNPCKLQDQ xmm2/m128, xmm1","punpcklqdq xmm2/m128, xmm1","66 0F 6C /r","V","V","SSE2","","rw,r","",""
+"PUNPCKLWD mm1, mm2/m32","PUNPCKLWL mm2/m32, mm1","punpcklwd mm2/m32, mm1","0F 61 /r","V","V","MMX","","rw,r","",""
+"PUNPCKLWD xmm1, xmm2/m128","PUNPCKLWL xmm2/m128, xmm1","punpcklwd xmm2/m128, xmm1","66 0F 61 /r","V","V","SSE2","","rw,r","",""
+"PUSHAD","PUSHAL","pushal","60","V","N.S.","","operand32","","",""
+"PUSHA","PUSHAW","pushaw","60","V","N.S.","","operand16","","",""
+"PUSHFD","PUSHFL","pushfl","9C","V","N.S.","","operand32","","",""
+"PUSHFQ","PUSHFQ","pushfq","9C","N.S.","V","","default64","","",""
+"PUSHF","PUSHFW","pushfw","9C","V","V","","operand16","","",""
+"PUSH r/m32","PUSHL r/m32","pushl r/m32","FF /6","V","N.S.","","operand32","r","Y","32"
+"PUSH r32op","PUSHL r32op","pushl r32op","50+rd","V","N.S.","","operand32","r","Y","32"
+"PUSH r/m64","PUSHQ r/m64","pushq r/m64","FF /6","N.S.","V","","default64","r","Y","64"
+"PUSH r64op","PUSHQ r64op","pushq r64op","50+ro","N.S.","V","","default64","r","Y","64"
+"PUSH imm16","PUSHW imm16","pushw imm16","68 iw","V","V","","operand16","r","Y",""
+"PUSH r/m16","PUSHW r/m16","pushw r/m16","FF /6","V","V","","operand16","r","Y","16"
+"PUSH r16op","PUSHW r16op","pushw r16op","50+rw","V","V","","operand16","r","Y","16"
+"PUSH CS","PUSHW/PUSHL/PUSHQ CS","pushw/pushl/pushq CS","0E","V","N.S.","","","r","Y",""
+"PUSH DS","PUSHW/PUSHL/PUSHQ DS","pushw/pushl/pushq DS","1E","V","N.S.","","","r","Y",""
+"PUSH ES","PUSHW/PUSHL/PUSHQ ES","pushw/pushl/pushq ES","06","V","N.S.","","","r","Y",""
+"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","V","V","","operand16","r","Y",""
+"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","N.S.","V","","default64","r","Y",""
+"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","V","N.S.","","operand32","r","Y",""
+"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","N.S.","V","","default64","r","Y",""
+"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","V","N.S.","","operand32","r","Y",""
+"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","V","V","","operand16","r","Y",""
+"PUSH SS","PUSHW/PUSHL/PUSHQ SS","pushw/pushl/pushq SS","16","V","N.S.","","","r","Y",""
+"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","V","N.S.","","operand32","r","Y",""
+"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","N.S.","V","","default64","r","Y",""
+"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","V","V","","operand16","r","Y",""
+"PXOR mm1, mm2/m64","PXOR mm2/m64, mm1","pxor mm2/m64, mm1","0F EF /r","V","V","MMX","","rw,r","",""
+"PXOR xmm1, xmm2/m128","PXOR xmm2/m128, xmm1","pxor xmm2/m128, xmm1","66 0F EF /r","V","V","SSE2","","rw,r","",""
+"RCL r/m8, 1","RCLB 1, r/m8","rclb 1, r/m8","D0 /2","V","V","","","rw,r","Y","8"
+"RCL r/m8, 1","RCLB 1, r/m8","rclb 1, r/m8","REX D0 /2","N.E.","V","","pseudo64","w,r","Y","8"
+"RCL r/m8, CL","RCLB CL, r/m8","rclb CL, r/m8","D2 /2","V","V","","","rw,r","Y","8"
+"RCL r/m8, CL","RCLB CL, r/m8","rclb CL, r/m8","REX D2 /2","N.E.","V","","pseudo64","w,r","Y","8"
+"RCL r/m8, imm8","RCLB imm8, r/m8","rclb imm8, r/m8","REX C0 /2 ib","N.E.","V","","pseudo64","w,r","Y","8"
+"RCL r/m8, imm8u","RCLB imm8u, r/m8","rclb imm8u, r/m8","C0 /2 ib","V","V","","","rw,r","Y","8"
+"RCL r/m32, 1","RCLL 1, r/m32","rcll 1, r/m32","D1 /2","V","V","","operand32","rw,r","Y","32"
+"RCL r/m32, CL","RCLL CL, r/m32","rcll CL, r/m32","D3 /2","V","V","","operand32","rw,r","Y","32"
+"RCL r/m32, imm8u","RCLL imm8u, r/m32","rcll imm8u, r/m32","C1 /2 ib","V","V","","operand32","rw,r","Y","32"
+"RCL r/m64, 1","RCLQ 1, r/m64","rclq 1, r/m64","REX.W D1 /2","N.S.","V","","","rw,r","Y","64"
+"RCL r/m64, CL","RCLQ CL, r/m64","rclq CL, r/m64","REX.W D3 /2","N.S.","V","","","rw,r","Y","64"
+"RCL r/m64, imm8u","RCLQ imm8u, r/m64","rclq imm8u, r/m64","REX.W C1 /2 ib","N.S.","V","","","rw,r","Y","64"
+"RCL r/m16, 1","RCLW 1, r/m16","rclw 1, r/m16","D1 /2","V","V","","operand16","rw,r","Y","16"
+"RCL r/m16, CL","RCLW CL, r/m16","rclw CL, r/m16","D3 /2","V","V","","operand16","rw,r","Y","16"
+"RCL r/m16, imm8u","RCLW imm8u, r/m16","rclw imm8u, r/m16","C1 /2 ib","V","V","","operand16","rw,r","Y","16"
+"RCPPS xmm1, xmm2/m128","RCPPS xmm2/m128, xmm1","rcpps xmm2/m128, xmm1","0F 53 /r","V","V","SSE","","w,r","",""
+"RCPSS xmm1, xmm2/m32","RCPSS xmm2/m32, xmm1","rcpss xmm2/m32, xmm1","F3 0F 53 /r","V","V","SSE","","w,r","",""
+"RCR r/m8, 1","RCRB 1, r/m8","rcrb 1, r/m8","D0 /3","V","V","","","rw,r","Y","8"
+"RCR r/m8, 1","RCRB 1, r/m8","rcrb 1, r/m8","REX D0 /3","N.E.","V","","pseudo64","w,r","Y","8"
+"RCR r/m8, CL","RCRB CL, r/m8","rcrb CL, r/m8","D2 /3","V","V","","","rw,r","Y","8"
+"RCR r/m8, CL","RCRB CL, r/m8","rcrb CL, r/m8","REX D2 /3","N.E.","V","","pseudo64","w,r","Y","8"
+"RCR r/m8, imm8","RCRB imm8, r/m8","rcrb imm8, r/m8","REX C0 /3 ib","N.E.","V","","pseudo64","w,r","Y","8"
+"RCR r/m8, imm8u","RCRB imm8u, r/m8","rcrb imm8u, r/m8","C0 /3 ib","V","V","","","rw,r","Y","8"
+"RCR r/m32, 1","RCRL 1, r/m32","rcrl 1, r/m32","D1 /3","V","V","","operand32","rw,r","Y","32"
+"RCR r/m32, CL","RCRL CL, r/m32","rcrl CL, r/m32","D3 /3","V","V","","operand32","rw,r","Y","32"
+"RCR r/m32, imm8u","RCRL imm8u, r/m32","rcrl imm8u, r/m32","C1 /3 ib","V","V","","operand32","rw,r","Y","32"
+"RCR r/m64, 1","RCRQ 1, r/m64","rcrq 1, r/m64","REX.W D1 /3","N.S.","V","","","rw,r","Y","64"
+"RCR r/m64, CL","RCRQ CL, r/m64","rcrq CL, r/m64","REX.W D3 /3","N.S.","V","","","rw,r","Y","64"
+"RCR r/m64, imm8u","RCRQ imm8u, r/m64","rcrq imm8u, r/m64","REX.W C1 /3 ib","N.S.","V","","","rw,r","Y","64"
+"RCR r/m16, 1","RCRW 1, r/m16","rcrw 1, r/m16","D1 /3","V","V","","operand16","rw,r","Y","16"
+"RCR r/m16, CL","RCRW CL, r/m16","rcrw CL, r/m16","D3 /3","V","V","","operand16","rw,r","Y","16"
+"RCR r/m16, imm8u","RCRW imm8u, r/m16","rcrw imm8u, r/m16","C1 /3 ib","V","V","","operand16","rw,r","Y","16"
+"RDFSBASE rmr32","RDFSBASEL rmr32","rdfsbase rmr32","F3 0F AE /0","N.S.","V","FSGSBASE","modrm_regonly,operand16,operand32","w","Y","32"
+"RDFSBASE rmr64","RDFSBASEQ rmr64","rdfsbase rmr64","F3 REX.W 0F AE /0","N.S.","V","FSGSBASE","modrm_regonly","w","Y","64"
+"RDGSBASE rmr32","RDGSBASEL rmr32","rdgsbase rmr32","F3 0F AE /1","N.S.","V","FSGSBASE","modrm_regonly,operand16,operand32","w","Y","32"
+"RDGSBASE rmr64","RDGSBASEQ rmr64","rdgsbase rmr64","F3 REX.W 0F AE /1","N.S.","V","FSGSBASE","modrm_regonly","w","Y","64"
+"RDMSR","RDMSR","rdmsr","0F 32","V","V","Pentium","","","",""
+"RDPKRU","RDPKRU","rdpkru","0F 01 EE","V","V","PKU","","","",""
+"RDPMC","RDPMC","rdpmc","0F 33","V","V","","","","",""
+"RDRAND rmr32","RDRANDL rmr32","rdrand rmr32","0F C7 /6","V","V","RDRAND","modrm_regonly,operand32","w","Y","32"
+"RDRAND rmr64","RDRANDQ rmr64","rdrand rmr64","REX.W 0F C7 /6","N.S.","V","RDRAND","modrm_regonly","w","Y","64"
+"RDRAND rmr16","RDRANDW rmr16","rdrand rmr16","0F C7 /6","V","V","RDRAND","modrm_regonly,operand16","w","Y","16"
+"RDSEED rmr32","RDSEEDL rmr32","rdseed rmr32","0F C7 /7","V","V","RDSEED","modrm_regonly,operand32","w","Y","32"
+"RDSEED rmr64","RDSEEDQ rmr64","rdseed rmr64","REX.W 0F C7 /7","N.S.","V","RDSEED","modrm_regonly","w","Y","64"
+"RDSEED rmr16","RDSEEDW rmr16","rdseed rmr16","0F C7 /7","V","V","RDSEED","modrm_regonly,operand16","w","Y","16"
+"RDSSPD rmr32","RDSSPD rmr32","rdsspd rmr32","F3 0F 1E /1","V","V","CET","modrm_regonly,operand16,operand32","w","",""
+"RDSSPQ rmr64","RDSSPQ rmr64","rdsspq rmr64","F3 REX.W 0F 1E /1","N.S.","V","CET","modrm_regonly","w","",""
+"RDTSC","RDTSC","rdtsc","0F 31","V","V","Pentium","","","",""
+"RDTSCP","RDTSCP","rdtscp","0F 01 F9","V","V","RDTSCP","","","",""
+"RET_FAR","RETFW/RETFL/RETFQ","lretw/lretl/lretl","CB","V","V","","","","",""
+"RET_FAR imm16u","RETFW/RETFL/RETFQ imm16u","lretw/lretl/lretl imm16u","CA iw","V","V","","","r","",""
+"RET","RETW/RETL/RETQ","retw/retl/retq","C3","N.S.","V","","default64","","",""
+"RET","RETW/RETL/RETQ","retw/retl/retq","C3","V","N.S.","","","","",""
+"RET imm16u","RETW/RETL/RETQ imm16u","retw/retl/retq imm16u","C2 iw","N.S.","V","","default64","r","",""
+"RET imm16u","RETW/RETL/RETQ imm16u","retw/retl/retq imm16u","C2 iw","V","N.S.","","","r","",""
+"ROL r/m8, 1","ROLB 1, r/m8","rolb 1, r/m8","D0 /0","V","V","","","rw,r","Y","8"
+"ROL r/m8, 1","ROLB 1, r/m8","rolb 1, r/m8","REX D0 /0","N.E.","V","","pseudo64","w,r","Y","8"
+"ROL r/m8, CL","ROLB CL, r/m8","rolb CL, r/m8","D2 /0","V","V","","","rw,r","Y","8"
+"ROL r/m8, CL","ROLB CL, r/m8","rolb CL, r/m8","REX D2 /0","N.E.","V","","pseudo64","w,r","Y","8"
+"ROL r/m8, imm8","ROLB imm8, r/m8","rolb imm8, r/m8","REX C0 /0 ib","N.E.","V","","pseudo64","w,r","Y","8"
+"ROL r/m8, imm8u","ROLB imm8u, r/m8","rolb imm8u, r/m8","C0 /0 ib","V","V","","","rw,r","Y","8"
+"ROL r/m32, 1","ROLL 1, r/m32","roll 1, r/m32","D1 /0","V","V","","operand32","rw,r","Y","32"
+"ROL r/m32, CL","ROLL CL, r/m32","roll CL, r/m32","D3 /0","V","V","","operand32","rw,r","Y","32"
+"ROL r/m32, imm8u","ROLL imm8u, r/m32","roll imm8u, r/m32","C1 /0 ib","V","V","","operand32","rw,r","Y","32"
+"ROL r/m64, 1","ROLQ 1, r/m64","rolq 1, r/m64","REX.W D1 /0","N.S.","V","","","rw,r","Y","64"
+"ROL r/m64, CL","ROLQ CL, r/m64","rolq CL, r/m64","REX.W D3 /0","N.S.","V","","","rw,r","Y","64"
+"ROL r/m64, imm8u","ROLQ imm8u, r/m64","rolq imm8u, r/m64","REX.W C1 /0 ib","N.S.","V","","","rw,r","Y","64"
+"ROL r/m16, 1","ROLW 1, r/m16","rolw 1, r/m16","D1 /0","V","V","","operand16","rw,r","Y","16"
+"ROL r/m16, CL","ROLW CL, r/m16","rolw CL, r/m16","D3 /0","V","V","","operand16","rw,r","Y","16"
+"ROL r/m16, imm8u","ROLW imm8u, r/m16","rolw imm8u, r/m16","C1 /0 ib","V","V","","operand16","rw,r","Y","16"
+"ROR r/m8, 1","RORB 1, r/m8","rorb 1, r/m8","D0 /1","V","V","","","rw,r","Y","8"
+"ROR r/m8, 1","RORB 1, r/m8","rorb 1, r/m8","REX D0 /1","N.E.","V","","pseudo64","w,r","Y","8"
+"ROR r/m8, CL","RORB CL, r/m8","rorb CL, r/m8","D2 /1","V","V","","","rw,r","Y","8"
+"ROR r/m8, CL","RORB CL, r/m8","rorb CL, r/m8","REX D2 /1","N.E.","V","","pseudo64","w,r","Y","8"
+"ROR r/m8, imm8","RORB imm8, r/m8","rorb imm8, r/m8","REX C0 /1 ib","N.E.","V","","pseudo64","w,r","Y","8"
+"ROR r/m8, imm8u","RORB imm8u, r/m8","rorb imm8u, r/m8","C0 /1 ib","V","V","","","rw,r","Y","8"
+"ROR r/m32, 1","RORL 1, r/m32","rorl 1, r/m32","D1 /1","V","V","","operand32","rw,r","Y","32"
+"ROR r/m32, CL","RORL CL, r/m32","rorl CL, r/m32","D3 /1","V","V","","operand32","rw,r","Y","32"
+"ROR r/m32, imm8u","RORL imm8u, r/m32","rorl imm8u, r/m32","C1 /1 ib","V","V","","operand32","rw,r","Y","32"
+"ROR r/m64, 1","RORQ 1, r/m64","rorq 1, r/m64","REX.W D1 /1","N.S.","V","","","rw,r","Y","64"
+"ROR r/m64, CL","RORQ CL, r/m64","rorq CL, r/m64","REX.W D3 /1","N.S.","V","","","rw,r","Y","64"
+"ROR r/m64, imm8u","RORQ imm8u, r/m64","rorq imm8u, r/m64","REX.W C1 /1 ib","N.S.","V","","","rw,r","Y","64"
+"ROR r/m16, 1","RORW 1, r/m16","rorw 1, r/m16","D1 /1","V","V","","operand16","rw,r","Y","16"
+"ROR r/m16, CL","RORW CL, r/m16","rorw CL, r/m16","D3 /1","V","V","","operand16","rw,r","Y","16"
+"ROR r/m16, imm8u","RORW imm8u, r/m16","rorw imm8u, r/m16","C1 /1 ib","V","V","","operand16","rw,r","Y","16"
+"RORX r32, r/m32, imm8u","RORXL imm8u, r/m32, r32","rorxl imm8u, r/m32, r32","VEX.128.F2.0F3A.W0 F0 /r ib","V","V","BMI2","","w,r,r","Y","32"
+"RORX r64, r/m64, imm8u","RORXQ imm8u, r/m64, r64","rorxq imm8u, r/m64, r64","VEX.128.F2.0F3A.W1 F0 /r ib","N.S.","V","BMI2","","w,r,r","Y","64"
+"ROUNDPD xmm1, xmm2/m128, imm8u","ROUNDPD imm8u, xmm2/m128, xmm1","roundpd imm8u, xmm2/m128, xmm1","66 0F 3A 09 /r ib","V","V","SSE4_1","","w,r,r","",""
+"ROUNDPS xmm1, xmm2/m128, imm8u","ROUNDPS imm8u, xmm2/m128, xmm1","roundps imm8u, xmm2/m128, xmm1","66 0F 3A 08 /r ib","V","V","SSE4_1","","w,r,r","",""
+"ROUNDSD xmm1, xmm2/m64, imm8u","ROUNDSD imm8u, xmm2/m64, xmm1","roundsd imm8u, xmm2/m64, xmm1","66 0F 3A 0B /r ib","V","V","SSE4_1","","w,r,r","",""
+"ROUNDSS xmm1, xmm2/m32, imm8u","ROUNDSS imm8u, xmm2/m32, xmm1","roundss imm8u, xmm2/m32, xmm1","66 0F 3A 0A /r ib","V","V","SSE4_1","","w,r,r","",""
+"RSM","RSM","rsm","0F AA","V","V","","","","",""
+"RSQRTPS xmm1, xmm2/m128","RSQRTPS xmm2/m128, xmm1","rsqrtps xmm2/m128, xmm1","0F 52 /r","V","V","SSE","","w,r","",""
+"RSQRTSS xmm1, xmm2/m32","RSQRTSS xmm2/m32, xmm1","rsqrtss xmm2/m32, xmm1","F3 0F 52 /r","V","V","SSE","","w,r","",""
+"RSTORSSP m64","RSTORSSP m64","rstorssp m64","F3 0F 01 /5","V","V","CET","modrm_memonly","rw","",""
+"SAHF","SAHF","sahf","9E","V","V","LAHFSAHF","","","",""
+"SAL r/m8, 1","SALB 1, r/m8","salb 1, r/m8","D0 /4","V","V","","pseudo","rw,r","Y","8"
+"SAL r/m8, 1","SALB 1, r/m8","salb 1, r/m8","REX D0 /4","N.E.","V","","pseudo","rw,r","Y","8"
+"SAL r/m8, CL","SALB CL, r/m8","salb CL, r/m8","D2 /4","V","V","","pseudo","rw,r","Y","8"
+"SAL r/m8, CL","SALB CL, r/m8","salb CL, r/m8","REX D2 /4","N.E.","V","","pseudo","rw,r","Y","8"
+"SAL r/m8, imm8","SALB imm8, r/m8","salb imm8, r/m8","C0 /4 ib","V","V","","pseudo","rw,r","Y","8"
+"SAL r/m8, imm8","SALB imm8, r/m8","salb imm8, r/m8","REX C0 /4 ib","N.E.","V","","pseudo","rw,r","Y","8"
+"SALC","SALC","salc","D6","V","N.S.","","","","",""
+"SAL r/m32, 1","SALL 1, r/m32","sall 1, r/m32","D1 /4","V","V","","operand32,pseudo","rw,r","Y","32"
+"SAL r/m32, CL","SALL CL, r/m32","sall CL, r/m32","D3 /4","V","V","","operand32,pseudo","rw,r","Y","32"
+"SAL r/m32, imm8","SALL imm8, r/m32","sall imm8, r/m32","C1 /4 ib","V","V","","operand32,pseudo","rw,r","Y","32"
+"SAL r/m64, 1","SALQ 1, r/m64","salq 1, r/m64","REX.W D1 /4","N.E.","V","","pseudo","rw,r","Y","64"
+"SAL r/m64, CL","SALQ CL, r/m64","salq CL, r/m64","REX.W D3 /4","N.E.","V","","pseudo","rw,r","Y","64"
+"SAL r/m64, imm8","SALQ imm8, r/m64","salq imm8, r/m64","REX.W C1 /4 ib","N.E.","V","","pseudo","rw,r","Y","64"
+"SAL r/m16, 1","SALW 1, r/m16","salw 1, r/m16","D1 /4","V","V","","operand16,pseudo","rw,r","Y","16"
+"SAL r/m16, CL","SALW CL, r/m16","salw CL, r/m16","D3 /4","V","V","","operand16,pseudo","rw,r","Y","16"
+"SAL r/m16, imm8","SALW imm8, r/m16","salw imm8, r/m16","C1 /4 ib","V","V","","operand16,pseudo","rw,r","Y","16"
+"SAR r/m8, 1","SARB 1, r/m8","sarb 1, r/m8","D0 /7","V","V","","","rw,r","Y","8"
+"SAR r/m8, 1","SARB 1, r/m8","sarb 1, r/m8","REX D0 /7","N.E.","V","","pseudo64","rw,r","Y","8"
+"SAR r/m8, CL","SARB CL, r/m8","sarb CL, r/m8","D2 /7","V","V","","","rw,r","Y","8"
+"SAR r/m8, CL","SARB CL, r/m8","sarb CL, r/m8","REX D2 /7","N.E.","V","","pseudo64","rw,r","Y","8"
+"SAR r/m8, imm8","SARB imm8, r/m8","sarb imm8, r/m8","REX C0 /7 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"SAR r/m8, imm8u","SARB imm8u, r/m8","sarb imm8u, r/m8","C0 /7 ib","V","V","","","rw,r","Y","8"
+"SAR r/m32, 1","SARL 1, r/m32","sarl 1, r/m32","D1 /7","V","V","","operand32","rw,r","Y","32"
+"SAR r/m32, CL","SARL CL, r/m32","sarl CL, r/m32","D3 /7","V","V","","operand32","rw,r","Y","32"
+"SAR r/m32, imm8u","SARL imm8u, r/m32","sarl imm8u, r/m32","C1 /7 ib","V","V","","operand32","rw,r","Y","32"
+"SAR r/m64, 1","SARQ 1, r/m64","sarq 1, r/m64","REX.W D1 /7","N.S.","V","","","rw,r","Y","64"
+"SAR r/m64, CL","SARQ CL, r/m64","sarq CL, r/m64","REX.W D3 /7","N.S.","V","","","rw,r","Y","64"
+"SAR r/m64, imm8u","SARQ imm8u, r/m64","sarq imm8u, r/m64","REX.W C1 /7 ib","N.S.","V","","","rw,r","Y","64"
+"SAR r/m16, 1","SARW 1, r/m16","sarw 1, r/m16","D1 /7","V","V","","operand16","rw,r","Y","16"
+"SAR r/m16, CL","SARW CL, r/m16","sarw CL, r/m16","D3 /7","V","V","","operand16","rw,r","Y","16"
+"SAR r/m16, imm8u","SARW imm8u, r/m16","sarw imm8u, r/m16","C1 /7 ib","V","V","","operand16","rw,r","Y","16"
+"SARX r32, r/m32, r32V","SARXL r32V, r/m32, r32","sarxl r32V, r/m32, r32","VEX.NDS.128.F3.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32"
+"SARX r64, r/m64, r64V","SARXQ r64V, r/m64, r64","sarxq r64V, r/m64, r64","VEX.NDS.128.F3.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64"
+"SAVESSP","SAVESSP","savessp","F3 0F 01 EA","V","V","CET","","","",""
+"SBB AL, imm8","SBBB imm8, AL","sbbb imm8, AL","1C ib","V","V","","","rw,r","Y","8"
+"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","80 /3 ib","V","V","","","rw,r","Y","8"
+"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","82 /3 ib","V","N.S.","","","rw,r","Y","8"
+"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","REX 80 /3 ib","N.E.","V","","pseudo64","w,r","Y","8"
+"SBB r8, r/m8","SBBB r/m8, r8","sbbb r/m8, r8","1A /r","V","V","","","rw,r","Y","8"
+"SBB r8, r/m8","SBBB r/m8, r8","sbbb r/m8, r8","REX 1A /r","N.E.","V","","pseudo64","w,r","Y","8"
+"SBB r/m8, r8","SBBB r8, r/m8","sbbb r8, r/m8","18 /r","V","V","","","rw,r","Y","8"
+"SBB r/m8, r8","SBBB r8, r/m8","sbbb r8, r/m8","REX 18 /r","N.E.","V","","pseudo64","w,r","Y","8"
+"SBB EAX, imm32","SBBL imm32, EAX","sbbl imm32, EAX","1D id","V","V","","operand32","rw,r","Y","32"
+"SBB r/m32, imm32","SBBL imm32, r/m32","sbbl imm32, r/m32","81 /3 id","V","V","","operand32","rw,r","Y","32"
+"SBB r/m32, imm8","SBBL imm8, r/m32","sbbl imm8, r/m32","83 /3 ib","V","V","","operand32","rw,r","Y","32"
+"SBB r32, r/m32","SBBL r/m32, r32","sbbl r/m32, r32","1B /r","V","V","","operand32","rw,r","Y","32"
+"SBB r/m32, r32","SBBL r32, r/m32","sbbl r32, r/m32","19 /r","V","V","","operand32","rw,r","Y","32"
+"SBB RAX, imm32","SBBQ imm32, RAX","sbbq imm32, RAX","REX.W 1D id","N.S.","V","","","rw,r","Y","64"
+"SBB r/m64, imm32","SBBQ imm32, r/m64","sbbq imm32, r/m64","REX.W 81 /3 id","N.S.","V","","","rw,r","Y","64"
+"SBB r/m64, imm8","SBBQ imm8, r/m64","sbbq imm8, r/m64","REX.W 83 /3 ib","N.S.","V","","","rw,r","Y","64"
+"SBB r64, r/m64","SBBQ r/m64, r64","sbbq r/m64, r64","REX.W 1B /r","N.S.","V","","","rw,r","Y","64"
+"SBB r/m64, r64","SBBQ r64, r/m64","sbbq r64, r/m64","REX.W 19 /r","N.S.","V","","","rw,r","Y","64"
+"SBB AX, imm16","SBBW imm16, AX","sbbw imm16, AX","1D iw","V","V","","operand16","rw,r","Y","16"
+"SBB r/m16, imm16","SBBW imm16, r/m16","sbbw imm16, r/m16","81 /3 iw","V","V","","operand16","rw,r","Y","16"
+"SBB r/m16, imm8","SBBW imm8, r/m16","sbbw imm8, r/m16","83 /3 ib","V","V","","operand16","rw,r","Y","16"
+"SBB r16, r/m16","SBBW r/m16, r16","sbbw r/m16, r16","1B /r","V","V","","operand16","rw,r","Y","16"
+"SBB r/m16, r16","SBBW r16, r/m16","sbbw r16, r/m16","19 /r","V","V","","operand16","rw,r","Y","16"
+"SCASB","SCASB","scasb","AE","V","V","","","","",""
+"SCASD","SCASL","scasl","AF","V","V","","operand32","","",""
+"SCASQ","SCASQ","scasq","REX.W AF","N.S.","V","","","","",""
+"SCASW","SCASW","scasw","AF","V","V","","operand16","","",""
+"SETAE r/m8","SETCC r/m8","setae r/m8","0F 93 /r","V","V","","","w","",""
+"SETNB r/m8","SETCC r/m8","setnb r/m8","0F 93 /r","V","V","","pseudo","r","",""
+"SETNC r/m8","SETCC r/m8","setnc r/m8","0F 93 /r","V","V","","pseudo","r","",""
+"SETAE r/m8","SETCC r/m8","setae r/m8","REX 0F 93 /r","N.E.","V","","pseudo64","r","",""
+"SETNB r/m8","SETCC r/m8","setnb r/m8","REX 0F 93 /r","N.E.","V","","pseudo","r","",""
+"SETNC r/m8","SETCC r/m8","setnc r/m8","REX 0F 93 /r","N.E.","V","","pseudo","r","",""
+"SETB r/m8","SETCS r/m8","setb r/m8","0F 92 /r","V","V","","","w","",""
+"SETC r/m8","SETCS r/m8","setc r/m8","0F 92 /r","V","V","","pseudo","r","",""
+"SETNAE r/m8","SETCS r/m8","setnae r/m8","0F 92 /r","V","V","","pseudo","r","",""
+"SETB r/m8","SETCS r/m8","setb r/m8","REX 0F 92 /r","N.E.","V","","pseudo64","r","",""
+"SETC r/m8","SETCS r/m8","setc r/m8","REX 0F 92 /r","N.E.","V","","pseudo","r","",""
+"SETNAE r/m8","SETCS r/m8","setnae r/m8","REX 0F 92 /r","N.E.","V","","pseudo","r","",""
+"SETE r/m8","SETEQ r/m8","sete r/m8","0F 94 /r","V","V","","","w","",""
+"SETZ r/m8","SETEQ r/m8","setz r/m8","0F 94 /r","V","V","","pseudo","r","",""
+"SETE r/m8","SETEQ r/m8","sete r/m8","REX 0F 94 /r","N.E.","V","","pseudo64","r","",""
+"SETZ r/m8","SETEQ r/m8","setz r/m8","REX 0F 94 /r","N.E.","V","","pseudo","r","",""
+"SETGE r/m8","SETGE r/m8","setge r/m8","0F 9D /r","V","V","","","w","",""
+"SETNL r/m8","SETGE r/m8","setnl r/m8","0F 9D /r","V","V","","pseudo","r","",""
+"SETGE r/m8","SETGE r/m8","setge r/m8","REX 0F 9D /r","N.E.","V","","pseudo64","r","",""
+"SETNL r/m8","SETGE r/m8","setnl r/m8","REX 0F 9D /r","N.E.","V","","pseudo","r","",""
+"SETG r/m8","SETGT r/m8","setg r/m8","0F 9F /r","V","V","","","w","",""
+"SETNLE r/m8","SETGT r/m8","setnle r/m8","0F 9F /r","V","V","","pseudo","r","",""
+"SETG r/m8","SETGT r/m8","setg r/m8","REX 0F 9F /r","N.E.","V","","pseudo64","r","",""
+"SETNLE r/m8","SETGT r/m8","setnle r/m8","REX 0F 9F /r","N.E.","V","","pseudo","r","",""
+"SETA r/m8","SETHI r/m8","seta r/m8","0F 97 /r","V","V","","","w","",""
+"SETNBE r/m8","SETHI r/m8","setnbe r/m8","0F 97 /r","V","V","","pseudo","r","",""
+"SETA r/m8","SETHI r/m8","seta r/m8","REX 0F 97 /r","N.E.","V","","pseudo64","r","",""
+"SETNBE r/m8","SETHI r/m8","setnbe r/m8","REX 0F 97 /r","N.E.","V","","pseudo","r","",""
+"SETLE r/m8","SETLE r/m8","setle r/m8","0F 9E /r","V","V","","","w","",""
+"SETNG r/m8","SETLE r/m8","setng r/m8","0F 9E /r","V","V","","pseudo","r","",""
+"SETLE r/m8","SETLE r/m8","setle r/m8","REX 0F 9E /r","N.E.","V","","pseudo64","r","",""
+"SETNG r/m8","SETLE r/m8","setng r/m8","REX 0F 9E /r","N.E.","V","","pseudo","r","",""
+"SETBE r/m8","SETLS r/m8","setbe r/m8","0F 96 /r","V","V","","","w","",""
+"SETNA r/m8","SETLS r/m8","setna r/m8","0F 96 /r","V","V","","pseudo","r","",""
+"SETBE r/m8","SETLS r/m8","setbe r/m8","REX 0F 96 /r","N.E.","V","","pseudo64","r","",""
+"SETNA r/m8","SETLS r/m8","setna r/m8","REX 0F 96 /r","N.E.","V","","pseudo","r","",""
+"SETL r/m8","SETLT r/m8","setl r/m8","0F 9C /r","V","V","","","w","",""
+"SETNGE r/m8","SETLT r/m8","setnge r/m8","0F 9C /r","V","V","","pseudo","r","",""
+"SETL r/m8","SETLT r/m8","setl r/m8","REX 0F 9C /r","N.E.","V","","pseudo64","r","",""
+"SETNGE r/m8","SETLT r/m8","setnge r/m8","REX 0F 9C /r","N.E.","V","","pseudo","r","",""
+"SETS r/m8","SETMI r/m8","sets r/m8","0F 98 /r","V","V","","","w","",""
+"SETS r/m8","SETMI r/m8","sets r/m8","REX 0F 98 /r","N.E.","V","","pseudo64","r","",""
+"SETNE r/m8","SETNE r/m8","setne r/m8","0F 95 /r","V","V","","","w","",""
+"SETNZ r/m8","SETNE r/m8","setnz r/m8","0F 95 /r","V","V","","pseudo","r","",""
+"SETNE r/m8","SETNE r/m8","setne r/m8","REX 0F 95 /r","N.E.","V","","pseudo64","r","",""
+"SETNZ r/m8","SETNE r/m8","setnz r/m8","REX 0F 95 /r","N.E.","V","","pseudo","r","",""
+"SETNO r/m8","SETOC r/m8","setno r/m8","0F 91 /r","V","V","","","w","",""
+"SETNO r/m8","SETOC r/m8","setno r/m8","REX 0F 91 /r","N.E.","V","","pseudo64","r","",""
+"SETO r/m8","SETOS r/m8","seto r/m8","0F 90 /r","V","V","","","w","",""
+"SETO r/m8","SETOS r/m8","seto r/m8","REX 0F 90 /r","N.E.","V","","pseudo64","r","",""
+"SETNP r/m8","SETPC r/m8","setnp r/m8","0F 9B /r","V","V","","","w","",""
+"SETPO r/m8","SETPC r/m8","setpo r/m8","0F 9B /r","V","V","","pseudo","r","",""
+"SETNP r/m8","SETPC r/m8","setnp r/m8","REX 0F 9B /r","N.E.","V","","pseudo64","r","",""
+"SETPO r/m8","SETPC r/m8","setpo r/m8","REX 0F 9B /r","N.E.","V","","pseudo","r","",""
+"SETNS r/m8","SETPL r/m8","setns r/m8","0F 99 /r","V","V","","","w","",""
+"SETNS r/m8","SETPL r/m8","setns r/m8","REX 0F 99 /r","N.E.","V","","pseudo64","r","",""
+"SETP r/m8","SETPS r/m8","setp r/m8","0F 9A /r","V","V","","","w","",""
+"SETPE r/m8","SETPS r/m8","setpe r/m8","0F 9A /r","V","V","","pseudo","r","",""
+"SETP r/m8","SETPS r/m8","setp r/m8","REX 0F 9A /r","N.E.","V","","pseudo64","r","",""
+"SETPE r/m8","SETPS r/m8","setpe r/m8","REX 0F 9A /r","N.E.","V","","pseudo","r","",""
+"SETSSBSY","SETSSBSY","setssbsy","F3 0F 01 E8","V","V","CET","","","",""
+"SFENCE","SFENCE","sfence","0F AE /7","V","V","SSE","","","",""
+"SGDT m16&32","SGDT m16&32","sgdt m16&32","0F 01 /0","V","N.S.","","modrm_memonly","w","",""
+"SGDT m16&64","SGDT m16&64","sgdt m16&64","0F 01 /0","N.S.","V","","default64,modrm_memonly","w","",""
+"SHA1MSG1 xmm1, xmm2/m128","SHA1MSG1 xmm2/m128, xmm1","sha1msg1 xmm2/m128, xmm1","0F 38 C9 /r","V","V","SHA","","rw,r","",""
+"SHA1MSG2 xmm1, xmm2/m128","SHA1MSG2 xmm2/m128, xmm1","sha1msg2 xmm2/m128, xmm1","0F 38 CA /r","V","V","SHA","","rw,r","",""
+"SHA1NEXTE xmm1, xmm2/m128","SHA1NEXTE xmm2/m128, xmm1","sha1nexte xmm2/m128, xmm1","0F 38 C8 /r","V","V","SHA","","rw,r","",""
+"SHA1RNDS4 xmm1, xmm2/m128, imm8u:2","SHA1RNDS4 imm8u:2, xmm2/m128, xmm1","sha1rnds4 imm8u:2, xmm2/m128, xmm1","0F 3A CC /r ib","V","V","SHA","","rw,r,r","",""
+"SHA256MSG1 xmm1, xmm2/m128","SHA256MSG1 xmm2/m128, xmm1","sha256msg1 xmm2/m128, xmm1","0F 38 CC /r","V","V","SHA","","rw,r","",""
+"SHA256MSG2 xmm1, xmm2/m128","SHA256MSG2 xmm2/m128, xmm1","sha256msg2 xmm2/m128, xmm1","0F 38 CD /r","V","V","SHA","","rw,r","",""
+"SHA256RNDS2 xmm1, xmm2/m128, <XMM0>","SHA256RNDS2 <XMM0>, xmm2/m128, xmm1","sha256rnds2 <XMM0>, xmm2/m128, xmm1","0F 38 CB /r","V","V","SHA","","rw,r,r","",""
+"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","D0 /4","V","V","","","rw,r","Y","8"
+"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","D0 /6","V","V","","","rw,r","Y","8"
+"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","REX D0 /4","N.E.","V","","pseudo64","rw,r","Y","8"
+"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","D2 /4","V","V","","","rw,r","Y","8"
+"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","D2 /6","V","V","","","rw,r","Y","8"
+"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","REX D2 /4","N.E.","V","","pseudo64","rw,r","Y","8"
+"SHL r/m8, imm8","SHLB imm8, r/m8","shlb imm8, r/m8","REX C0 /4 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"SHL r/m8, imm8u","SHLB imm8u, r/m8","shlb imm8u, r/m8","C0 /4 ib","V","V","","","rw,r","Y","8"
+"SHL r/m8, imm8u","SHLB imm8u, r/m8","shlb imm8u, r/m8","C0 /6 ib","V","V","","","rw,r","Y","8"
+"SHL r/m32, 1","SHLL 1, r/m32","shll 1, r/m32","D1 /4","V","V","","operand32","rw,r","Y","32"
+"SHL r/m32, 1","SHLL 1, r/m32","shll 1, r/m32","D1 /6","V","V","","operand32","rw,r","Y","32"
+"SHL r/m32, CL","SHLL CL, r/m32","shll CL, r/m32","D3 /4","V","V","","operand32","rw,r","Y","32"
+"SHL r/m32, CL","SHLL CL, r/m32","shll CL, r/m32","D3 /6","V","V","","operand32","rw,r","Y","32"
+"SHLD r/m32, r32, CL","SHLL CL, r32, r/m32","shldl CL, r32, r/m32","0F A5 /r","V","V","","operand32","rw,r,r","Y","32"
+"SHL r/m32, imm8u","SHLL imm8u, r/m32","shll imm8u, r/m32","C1 /4 ib","V","V","","operand32","rw,r","Y","32"
+"SHL r/m32, imm8u","SHLL imm8u, r/m32","shll imm8u, r/m32","C1 /6 ib","V","V","","operand32","rw,r","Y","32"
+"SHLD r/m32, r32, imm8u","SHLL imm8u, r32, r/m32","shldl imm8u, r32, r/m32","0F A4 /r ib","V","V","","operand32","rw,r,r","Y","32"
+"SHL r/m64, 1","SHLQ 1, r/m64","shlq 1, r/m64","REX.W D1 /4","N.S.","V","","","rw,r","Y","64"
+"SHL r/m64, 1","SHLQ 1, r/m64","shlq 1, r/m64","REX.W D1 /6","N.S.","V","","","rw,r","Y","64"
+"SHL r/m64, CL","SHLQ CL, r/m64","shlq CL, r/m64","REX.W D3 /4","N.S.","V","","","rw,r","Y","64"
+"SHL r/m64, CL","SHLQ CL, r/m64","shlq CL, r/m64","REX.W D3 /6","N.S.","V","","","rw,r","Y","64"
+"SHLD r/m64, r64, CL","SHLQ CL, r64, r/m64","shldq CL, r64, r/m64","REX.W 0F A5 /r","N.S.","V","","","rw,r,r","Y","64"
+"SHL r/m64, imm8u","SHLQ imm8u, r/m64","shlq imm8u, r/m64","REX.W C1 /4 ib","N.S.","V","","","rw,r","Y","64"
+"SHL r/m64, imm8u","SHLQ imm8u, r/m64","shlq imm8u, r/m64","REX.W C1 /6 ib","N.S.","V","","","rw,r","Y","64"
+"SHLD r/m64, r64, imm8u","SHLQ imm8u, r64, r/m64","shldq imm8u, r64, r/m64","REX.W 0F A4 /r ib","N.S.","V","","","rw,r,r","Y","64"
+"SHL r/m16, 1","SHLW 1, r/m16","shlw 1, r/m16","D1 /4","V","V","","operand16","rw,r","Y","16"
+"SHL r/m16, 1","SHLW 1, r/m16","shlw 1, r/m16","D1 /6","V","V","","operand16","rw,r","Y","16"
+"SHL r/m16, CL","SHLW CL, r/m16","shlw CL, r/m16","D3 /4","V","V","","operand16","rw,r","Y","16"
+"SHL r/m16, CL","SHLW CL, r/m16","shlw CL, r/m16","D3 /6","V","V","","operand16","rw,r","Y","16"
+"SHLD r/m16, r16, CL","SHLW CL, r16, r/m16","shldw CL, r16, r/m16","0F A5 /r","V","V","","operand16","rw,r,r","Y","16"
+"SHL r/m16, imm8u","SHLW imm8u, r/m16","shlw imm8u, r/m16","C1 /4 ib","V","V","","operand16","rw,r","Y","16"
+"SHL r/m16, imm8u","SHLW imm8u, r/m16","shlw imm8u, r/m16","C1 /6 ib","V","V","","operand16","rw,r","Y","16"
+"SHLD r/m16, r16, imm8u","SHLW imm8u, r16, r/m16","shldw imm8u, r16, r/m16","0F A4 /r ib","V","V","","operand16","rw,r,r","Y","16"
+"SHLX r32, r/m32, r32V","SHLXL r32V, r/m32, r32","shlxl r32V, r/m32, r32","VEX.NDS.128.66.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32"
+"SHLX r64, r/m64, r64V","SHLXQ r64V, r/m64, r64","shlxq r64V, r/m64, r64","VEX.NDS.128.66.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64"
+"SHR r/m8, 1","SHRB 1, r/m8","shrb 1, r/m8","D0 /5","V","V","","","rw,r","Y","8"
+"SHR r/m8, 1","SHRB 1, r/m8","shrb 1, r/m8","REX D0 /5","N.E.","V","","pseudo64","rw,r","Y","8"
+"SHR r/m8, CL","SHRB CL, r/m8","shrb CL, r/m8","D2 /5","V","V","","","rw,r","Y","8"
+"SHR r/m8, CL","SHRB CL, r/m8","shrb CL, r/m8","REX D2 /5","N.E.","V","","pseudo64","rw,r","Y","8"
+"SHR r/m8, imm8","SHRB imm8, r/m8","shrb imm8, r/m8","REX C0 /5 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"SHR r/m8, imm8u","SHRB imm8u, r/m8","shrb imm8u, r/m8","C0 /5 ib","V","V","","","rw,r","Y","8"
+"SHR r/m32, 1","SHRL 1, r/m32","shrl 1, r/m32","D1 /5","V","V","","operand32","rw,r","Y","32"
+"SHR r/m32, CL","SHRL CL, r/m32","shrl CL, r/m32","D3 /5","V","V","","operand32","rw,r","Y","32"
+"SHRD r/m32, r32, CL","SHRL CL, r32, r/m32","shrdl CL, r32, r/m32","0F AD /r","V","V","","operand32","rw,r,r","Y","32"
+"SHR r/m32, imm8u","SHRL imm8u, r/m32","shrl imm8u, r/m32","C1 /5 ib","V","V","","operand32","rw,r","Y","32"
+"SHRD r/m32, r32, imm8u","SHRL imm8u, r32, r/m32","shrdl imm8u, r32, r/m32","0F AC /r ib","V","V","","operand32","rw,r,r","Y","32"
+"SHR r/m64, 1","SHRQ 1, r/m64","shrq 1, r/m64","REX.W D1 /5","N.S.","V","","","rw,r","Y","64"
+"SHR r/m64, CL","SHRQ CL, r/m64","shrq CL, r/m64","REX.W D3 /5","N.S.","V","","","rw,r","Y","64"
+"SHRD r/m64, r64, CL","SHRQ CL, r64, r/m64","shrdq CL, r64, r/m64","REX.W 0F AD /r","N.S.","V","","","rw,r,r","Y","64"
+"SHR r/m64, imm8u","SHRQ imm8u, r/m64","shrq imm8u, r/m64","REX.W C1 /5 ib","N.S.","V","","","rw,r","Y","64"
+"SHRD r/m64, r64, imm8u","SHRQ imm8u, r64, r/m64","shrdq imm8u, r64, r/m64","REX.W 0F AC /r ib","N.S.","V","","","rw,r,r","Y","64"
+"SHR r/m16, 1","SHRW 1, r/m16","shrw 1, r/m16","D1 /5","V","V","","operand16","rw,r","Y","16"
+"SHR r/m16, CL","SHRW CL, r/m16","shrw CL, r/m16","D3 /5","V","V","","operand16","rw,r","Y","16"
+"SHRD r/m16, r16, CL","SHRW CL, r16, r/m16","shrdw CL, r16, r/m16","0F AD /r","V","V","","operand16","rw,r,r","Y","16"
+"SHR r/m16, imm8u","SHRW imm8u, r/m16","shrw imm8u, r/m16","C1 /5 ib","V","V","","operand16","rw,r","Y","16"
+"SHRD r/m16, r16, imm8u","SHRW imm8u, r16, r/m16","shrdw imm8u, r16, r/m16","0F AC /r ib","V","V","","operand16","rw,r,r","Y","16"
+"SHRX r32, r/m32, r32V","SHRXL r32V, r/m32, r32","shrxl r32V, r/m32, r32","VEX.NDS.128.F2.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32"
+"SHRX r64, r/m64, r64V","SHRXQ r64V, r/m64, r64","shrxq r64V, r/m64, r64","VEX.NDS.128.F2.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64"
+"SHUFPD xmm1, xmm2/m128, imm8u","SHUFPD imm8u, xmm2/m128, xmm1","shufpd imm8u, xmm2/m128, xmm1","66 0F C6 /r ib","V","V","SSE2","","rw,r,r","",""
+"SHUFPS xmm1, xmm2/m128, imm8u","SHUFPS imm8u, xmm2/m128, xmm1","shufps imm8u, xmm2/m128, xmm1","0F C6 /r ib","V","V","SSE","","rw,r,r","",""
+"SIDT m16&32","SIDT m16&32","sidt m16&32","0F 01 /1","V","N.S.","","modrm_memonly","w","",""
+"SIDT m16&64","SIDT m16&64","sidt m16&64","0F 01 /1","N.S.","V","","default64,modrm_memonly","w","",""
+"SKINIT EAX","SKINIT EAX","skinit EAX","0F 01 DE","V","V","SVM","amd,modrm_regonly","r","",""
+"SLDT r/m16","SLDTW r/m16","sldtw r/m16","0F 00 /0","V","V","","operand16","w","Y","16"
+"SLDT r32/m16","SLDT{L/W} r32/m16","sldt{l/w} r32/m16","0F 00 /0","V","V","","operand32","w","Y",""
+"SLDT r64/m16","SLDT{Q/W} r64/m16","sldt{q/w} r64/m16","REX.W 0F 00 /0","N.S.","V","","","w","Y",""
+"SLWPCB rmr32","SLWPCBL rmr32","slwpcbl rmr32","XOP.128.09.W0 12 /1","V","V","XOP","amd,modrm_regonly,operand16,operand32","w","Y","32"
+"SLWPCB rmr64","SLWPCBQ rmr64","slwpcbq rmr64","XOP.128.09.W0 12 /1","N.S.","V","XOP","amd,modrm_regonly,operand64","w","Y","64"
+"SMSW r/m16","SMSWW r/m16","smsww r/m16","0F 01 /4","V","V","","operand16","w","Y","16"
+"SMSW r32/m16","SMSW{L/W} r32/m16","smsw{l/w} r32/m16","0F 01 /4","V","V","","operand32","w","Y",""
+"SMSW r64/m16","SMSW{Q/W} r64/m16","smsw{q/w} r64/m16","REX.W 0F 01 /4","N.S.","V","","","w","Y",""
+"SQRTPD xmm1, xmm2/m128","SQRTPD xmm2/m128, xmm1","sqrtpd xmm2/m128, xmm1","66 0F 51 /r","V","V","SSE2","","w,r","",""
+"SQRTPS xmm1, xmm2/m128","SQRTPS xmm2/m128, xmm1","sqrtps xmm2/m128, xmm1","0F 51 /r","V","V","SSE","","w,r","",""
+"SQRTSD xmm1, xmm2/m64","SQRTSD xmm2/m64, xmm1","sqrtsd xmm2/m64, xmm1","F2 0F 51 /r","V","V","SSE2","","w,r","",""
+"SQRTSS xmm1, xmm2/m32","SQRTSS xmm2/m32, xmm1","sqrtss xmm2/m32, xmm1","F3 0F 51 /r","V","V","SSE","","w,r","",""
+"STAC","STAC","stac","0F 01 CB","V","V","","","","",""
+"STC","STC","stc","F9","V","V","","","","",""
+"STD","STD","std","FD","V","V","","","","",""
+"STGI","STGI","stgi","0F 01 DC","V","V","SVM","amd","","",""
+"STI","STI","sti","FB","V","V","","","","",""
+"STMXCSR m32","STMXCSR m32","stmxcsr m32","0F AE /3","V","V","SSE","modrm_memonly","w","",""
+"STOSB","STOSB","stosb","AA","V","V","","","","",""
+"STOSD","STOSL","stosl","AB","V","V","","operand32","","",""
+"STOSQ","STOSQ","stosq","REX.W AB","N.S.","V","","","","",""
+"STOSW","STOSW","stosw","AB","V","V","","operand16","","",""
+"STR r/m16","STRW r/m16","strw r/m16","0F 00 /1","V","V","","operand16","w","Y","16"
+"STR r32/m16","STR{L/W} r32/m16","str{l/w} r32/m16","0F 00 /1","V","V","","operand32","w","Y",""
+"STR r64/m16","STR{Q/W} r64/m16","str{q/w} r64/m16","REX.W 0F 00 /1","N.S.","V","","","w","Y",""
+"SUB AL, imm8","SUBB imm8, AL","subb imm8, AL","2C ib","V","V","","","rw,r","Y","8"
+"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","80 /5 ib","V","V","","","rw,r","Y","8"
+"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","82 /5 ib","V","N.S.","","","rw,r","Y","8"
+"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","REX 80 /5 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"SUB r8, r/m8","SUBB r/m8, r8","subb r/m8, r8","2A /r","V","V","","","rw,r","Y","8"
+"SUB r8, r/m8","SUBB r/m8, r8","subb r/m8, r8","REX 2A /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"SUB r/m8, r8","SUBB r8, r/m8","subb r8, r/m8","28 /r","V","V","","","rw,r","Y","8"
+"SUB r/m8, r8","SUBB r8, r/m8","subb r8, r/m8","REX 28 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"SUB EAX, imm32","SUBL imm32, EAX","subl imm32, EAX","2D id","V","V","","operand32","rw,r","Y","32"
+"SUB r/m32, imm32","SUBL imm32, r/m32","subl imm32, r/m32","81 /5 id","V","V","","operand32","rw,r","Y","32"
+"SUB r/m32, imm8","SUBL imm8, r/m32","subl imm8, r/m32","83 /5 ib","V","V","","operand32","rw,r","Y","32"
+"SUB r32, r/m32","SUBL r/m32, r32","subl r/m32, r32","2B /r","V","V","","operand32","rw,r","Y","32"
+"SUB r/m32, r32","SUBL r32, r/m32","subl r32, r/m32","29 /r","V","V","","operand32","rw,r","Y","32"
+"SUBPD xmm1, xmm2/m128","SUBPD xmm2/m128, xmm1","subpd xmm2/m128, xmm1","66 0F 5C /r","V","V","SSE2","","rw,r","",""
+"SUBPS xmm1, xmm2/m128","SUBPS xmm2/m128, xmm1","subps xmm2/m128, xmm1","0F 5C /r","V","V","SSE","","rw,r","",""
+"SUB RAX, imm32","SUBQ imm32, RAX","subq imm32, RAX","REX.W 2D id","N.S.","V","","","rw,r","Y","64"
+"SUB r/m64, imm32","SUBQ imm32, r/m64","subq imm32, r/m64","REX.W 81 /5 id","N.S.","V","","","rw,r","Y","64"
+"SUB r/m64, imm8","SUBQ imm8, r/m64","subq imm8, r/m64","REX.W 83 /5 ib","N.S.","V","","","rw,r","Y","64"
+"SUB r64, r/m64","SUBQ r/m64, r64","subq r/m64, r64","REX.W 2B /r","N.S.","V","","","rw,r","Y","64"
+"SUB r/m64, r64","SUBQ r64, r/m64","subq r64, r/m64","REX.W 29 /r","N.S.","V","","","rw,r","Y","64"
+"SUBSD xmm1, xmm2/m64","SUBSD xmm2/m64, xmm1","subsd xmm2/m64, xmm1","F2 0F 5C /r","V","V","SSE2","","rw,r","",""
+"SUBSS xmm1, xmm2/m32","SUBSS xmm2/m32, xmm1","subss xmm2/m32, xmm1","F3 0F 5C /r","V","V","SSE","","rw,r","",""
+"SUB AX, imm16","SUBW imm16, AX","subw imm16, AX","2D iw","V","V","","operand16","rw,r","Y","16"
+"SUB r/m16, imm16","SUBW imm16, r/m16","subw imm16, r/m16","81 /5 iw","V","V","","operand16","rw,r","Y","16"
+"SUB r/m16, imm8","SUBW imm8, r/m16","subw imm8, r/m16","83 /5 ib","V","V","","operand16","rw,r","Y","16"
+"SUB r16, r/m16","SUBW r/m16, r16","subw r/m16, r16","2B /r","V","V","","operand16","rw,r","Y","16"
+"SUB r/m16, r16","SUBW r16, r/m16","subw r16, r/m16","29 /r","V","V","","operand16","rw,r","Y","16"
+"SWAPGS","SWAPGS","swapgs","0F 01 F8","N.S.","V","","","","",""
+"SYSCALL","SYSCALL","syscall","0F 05","N.S.","V","","default64","","",""
+"SYSCALL","SYSCALL","syscall","0F 05","V","N.S.","AMD","amd","","",""
+"SYSENTER","SYSENTER","sysenter","0F 34","V","V","PPRO","","","",""
+"SYSEXIT","SYSEXIT","sysexit","0F 35","V","V","PPRO","","","",""
+"SYSEXIT","SYSEXIT","sysexit","REX.W 0F 35","N.E.","V","","pseudo","","",""
+"SYSRET","SYSRET","sysretw/sysretl/sysretl","0F 07","V","N.S.","AMD","amd","","",""
+"SYSRET","SYSRET","sysretw/sysretl/sysretl","0F 07","N.S.","V","","operand32,operand64","","",""
+"SYSRET","SYSRET","sysretw/sysretl/sysretl","REX.W 0F 07","I","V","","pseudo","","",""
+"T1MSKC r32V, r/m32","T1MSKCL r/m32, r32V","t1mskcl r/m32, r32V","XOP.NDD.128.09.WIG 01 /7","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"T1MSKC r64V, r/m64","T1MSKCQ r/m64, r64V","t1mskcq r/m64, r64V","XOP.NDD.128.09.WIG 01 /7","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"TEST AL, imm8","TESTB imm8, AL","testb imm8, AL","A8 ib","V","V","","","r,r","Y","8"
+"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","F6 /0 ib","V","V","","","r,r","Y","8"
+"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","F6 /1 ib","V","V","","","r,r","Y","8"
+"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","REX F6 /0 ib","N.E.","V","","pseudo64","r,r","Y","8"
+"TEST r/m8, r8","TESTB r8, r/m8","testb r8, r/m8","84 /r","V","V","","","r,r","Y","8"
+"TEST r/m8, r8","TESTB r8, r/m8","testb r8, r/m8","REX 84 /r","N.E.","V","","pseudo64","r,r","Y","8"
+"TEST EAX, imm32","TESTL imm32, EAX","testl imm32, EAX","A9 id","V","V","","operand32","r,r","Y","32"
+"TEST r/m32, imm32","TESTL imm32, r/m32","testl imm32, r/m32","F7 /0 id","V","V","","operand32","r,r","Y","32"
+"TEST r/m32, imm32","TESTL imm32, r/m32","testl imm32, r/m32","F7 /1 id","V","V","","operand32","r,r","Y","32"
+"TEST r/m32, r32","TESTL r32, r/m32","testl r32, r/m32","85 /r","V","V","","operand32","r,r","Y","32"
+"TEST RAX, imm32","TESTQ imm32, RAX","testq imm32, RAX","REX.W A9 id","N.S.","V","","","r,r","Y","64"
+"TEST r/m64, imm32","TESTQ imm32, r/m64","testq imm32, r/m64","REX.W F7 /0 id","N.S.","V","","","r,r","Y","64"
+"TEST r/m64, imm32","TESTQ imm32, r/m64","testq imm32, r/m64","REX.W F7 /1 id","N.S.","V","","","r,r","Y","64"
+"TEST r/m64, r64","TESTQ r64, r/m64","testq r64, r/m64","REX.W 85 /r","N.S.","V","","","r,r","Y","64"
+"TEST AX, imm16","TESTW imm16, AX","testw imm16, AX","A9 iw","V","V","","operand16","r,r","Y","16"
+"TEST r/m16, imm16","TESTW imm16, r/m16","testw imm16, r/m16","F7 /0 iw","V","V","","operand16","r,r","Y","16"
+"TEST r/m16, imm16","TESTW imm16, r/m16","testw imm16, r/m16","F7 /1 iw","V","V","","operand16","r,r","Y","16"
+"TEST r/m16, r16","TESTW r16, r/m16","testw r16, r/m16","85 /r","V","V","","operand16","r,r","Y","16"
+"TZCNT r32, r/m32","TZCNTL r/m32, r32","tzcntl r/m32, r32","F3 0F BC /r","V","V","BMI1","operand32","w,r","Y","32"
+"TZCNT r64, r/m64","TZCNTQ r/m64, r64","tzcntq r/m64, r64","F3 REX.W 0F BC /r","N.S.","V","BMI1","","w,r","Y","64"
+"TZCNT r16, r/m16","TZCNTW r/m16, r16","tzcntw r/m16, r16","F3 0F BC /r","V","V","BMI1","operand16","w,r","Y","16"
+"TZMSK r32V, r/m32","TZMSKL r/m32, r32V","tzmskl r/m32, r32V","XOP.NDD.128.09.WIG 01 /4","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"TZMSK r64V, r/m64","TZMSKQ r/m64, r64V","tzmskq r/m64, r64V","XOP.NDD.128.09.WIG 01 /4","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"UCOMISD xmm1, xmm2/m64","UCOMISD xmm2/m64, xmm1","ucomisd xmm2/m64, xmm1","66 0F 2E /r","V","V","SSE2","","r,r","",""
+"UCOMISS xmm1, xmm2/m32","UCOMISS xmm2/m32, xmm1","ucomiss xmm2/m32, xmm1","0F 2E /r","V","V","SSE","","r,r","",""
+"UD0 r32, r/m32","UD0 r/m32, r32","ud0 r/m32, r32","0F FF /r","V","V","PPRO","","r,r","",""
+"UD1 r32, r/m32","UD1 r/m32, r32","ud1 r/m32, r32","0F B9 /r","V","V","PPRO","","r,r","",""
+"UD2","UD2","ud2","0F 0B","V","V","PPRO","","","",""
+"UNPCKHPD xmm1, xmm2/m128","UNPCKHPD xmm2/m128, xmm1","unpckhpd xmm2/m128, xmm1","66 0F 15 /r","V","V","SSE2","","rw,r","",""
+"UNPCKHPS xmm1, xmm2/m128","UNPCKHPS xmm2/m128, xmm1","unpckhps xmm2/m128, xmm1","0F 15 /r","V","V","SSE","","rw,r","",""
+"UNPCKLPD xmm1, xmm2/m128","UNPCKLPD xmm2/m128, xmm1","unpcklpd xmm2/m128, xmm1","66 0F 14 /r","V","V","SSE2","","rw,r","",""
+"UNPCKLPS xmm1, xmm2/m128","UNPCKLPS xmm2/m128, xmm1","unpcklps xmm2/m128, xmm1","0F 14 /r","V","V","SSE","","rw,r","",""
+"V4FMADDPS zmm1, {k}{z}, zmmV+3, m128","V4FMADDPS m128, zmmV+3, {k}{z}, zmm1","v4fmaddps m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 9A /r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","",""
+"V4FMADDSS xmm1, {k}{z}, xmmV+3, m128","V4FMADDSS m128, xmmV+3, {k}{z}, xmm1","v4fmaddss m128, xmmV+3, {k}{z}, xmm1","EVEX.DDS.LIG.F2.0F38.W0 9B /r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","",""
+"V4FNMADDPS zmm1, {k}{z}, zmmV+3, m128","V4FNMADDPS m128, zmmV+3, {k}{z}, zmm1","v4fnmaddps m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 AA /r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","",""
+"V4FNMADDSS xmm1, {k}{z}, xmmV+3, m128","V4FNMADDSS m128, xmmV+3, {k}{z}, xmm1","v4fnmaddss m128, xmmV+3, {k}{z}, xmm1","EVEX.DDS.LIG.F2.0F38.W0 AB /r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","",""
+"VADDPD xmm1, xmmV, xmm2/m128","VADDPD xmm2/m128, xmmV, xmm1","vaddpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
+"VADDPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VADDPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vaddpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 58 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VADDPD ymm1, ymmV, ymm2/m256","VADDPD ymm2/m256, ymmV, ymm1","vaddpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
+"VADDPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VADDPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vaddpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 58 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VADDPD zmm1{er}, {k}{z}, zmmV, zmm2","VADDPD zmm2, zmmV, {k}{z}, zmm1{er}","vaddpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 58 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VADDPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VADDPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vaddpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 58 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VADDPS xmm1, xmmV, xmm2/m128","VADDPS xmm2/m128, xmmV, xmm1","vaddps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
+"VADDPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VADDPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vaddps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 58 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VADDPS ymm1, ymmV, ymm2/m256","VADDPS ymm2/m256, ymmV, ymm1","vaddps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
+"VADDPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VADDPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vaddps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 58 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VADDPS zmm1{er}, {k}{z}, zmmV, zmm2","VADDPS zmm2, zmmV, {k}{z}, zmm1{er}","vaddps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 58 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VADDPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VADDPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vaddps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 58 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VADDSD xmm1{er}, {k}{z}, xmmV, xmm2","VADDSD xmm2, xmmV, {k}{z}, xmm1{er}","vaddsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 58 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VADDSD xmm1, xmmV, xmm2/m64","VADDSD xmm2/m64, xmmV, xmm1","vaddsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
+"VADDSD xmm1, {k}{z}, xmmV, xmm2/m64","VADDSD xmm2/m64, xmmV, {k}{z}, xmm1","vaddsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 58 /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VADDSS xmm1{er}, {k}{z}, xmmV, xmm2","VADDSS xmm2, xmmV, {k}{z}, xmm1{er}","vaddss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 58 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VADDSS xmm1, xmmV, xmm2/m32","VADDSS xmm2/m32, xmmV, xmm1","vaddss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
+"VADDSS xmm1, {k}{z}, xmmV, xmm2/m32","VADDSS xmm2/m32, xmmV, {k}{z}, xmm1","vaddss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 58 /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VADDSUBPD xmm1, xmmV, xmm2/m128","VADDSUBPD xmm2/m128, xmmV, xmm1","vaddsubpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D0 /r","V","V","AVX","","w,r,r","",""
+"VADDSUBPD ymm1, ymmV, ymm2/m256","VADDSUBPD ymm2/m256, ymmV, ymm1","vaddsubpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D0 /r","V","V","AVX","","w,r,r","",""
+"VADDSUBPS xmm1, xmmV, xmm2/m128","VADDSUBPS xmm2/m128, xmmV, xmm1","vaddsubps xmm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG D0 /r","V","V","AVX","","w,r,r","",""
+"VADDSUBPS ymm1, ymmV, ymm2/m256","VADDSUBPS ymm2/m256, ymmV, ymm1","vaddsubps ymm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG D0 /r","V","V","AVX","","w,r,r","",""
+"VAESDEC xmm1, xmmV, xmm2/m128","VAESDEC xmm2/m128, xmmV, xmm1","vaesdec xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DE /r","V","V","AES+AVX512VL","scale16","w,r,r","",""
+"VAESDEC xmm1, xmmV, xmm2/m128","VAESDEC xmm2/m128, xmmV, xmm1","vaesdec xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DE /r","V","V","AES+AVX","","w,r,r","",""
+"VAESDEC ymm1, ymmV, ymm2/m256","VAESDEC ymm2/m256, ymmV, ymm1","vaesdec ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DE /r","V","V","AES+AVX512VL","scale32","w,r,r","",""
+"VAESDEC ymm1, ymmV, ymm2/m256","VAESDEC ymm2/m256, ymmV, ymm1","vaesdec ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DE /r","V","V","VAES+AVX","","w,r,r","",""
+"VAESDEC zmm1, zmmV, zmm2/m512","VAESDEC zmm2/m512, zmmV, zmm1","vaesdec zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DE /r","V","V","AES+AVX512F","scale64","w,r,r","",""
+"VAESDECLAST xmm1, xmmV, xmm2/m128","VAESDECLAST xmm2/m128, xmmV, xmm1","vaesdeclast xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DF /r","V","V","AES+AVX512VL","scale16","w,r,r","",""
+"VAESDECLAST xmm1, xmmV, xmm2/m128","VAESDECLAST xmm2/m128, xmmV, xmm1","vaesdeclast xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DF /r","V","V","AES+AVX","","w,r,r","",""
+"VAESDECLAST ymm1, ymmV, ymm2/m256","VAESDECLAST ymm2/m256, ymmV, ymm1","vaesdeclast ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DF /r","V","V","AES+AVX512VL","scale32","w,r,r","",""
+"VAESDECLAST ymm1, ymmV, ymm2/m256","VAESDECLAST ymm2/m256, ymmV, ymm1","vaesdeclast ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DF /r","V","V","VAES+AVX","","w,r,r","",""
+"VAESDECLAST zmm1, zmmV, zmm2/m512","VAESDECLAST zmm2/m512, zmmV, zmm1","vaesdeclast zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DF /r","V","V","AES+AVX512F","scale64","w,r,r","",""
+"VAESENC xmm1, xmmV, xmm2/m128","VAESENC xmm2/m128, xmmV, xmm1","vaesenc xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DC /r","V","V","AES+AVX512VL","scale16","w,r,r","",""
+"VAESENC xmm1, xmmV, xmm2/m128","VAESENC xmm2/m128, xmmV, xmm1","vaesenc xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DC /r","V","V","AES+AVX","","w,r,r","",""
+"VAESENC ymm1, ymmV, ymm2/m256","VAESENC ymm2/m256, ymmV, ymm1","vaesenc ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DC /r","V","V","AES+AVX512VL","scale32","w,r,r","",""
+"VAESENC ymm1, ymmV, ymm2/m256","VAESENC ymm2/m256, ymmV, ymm1","vaesenc ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DC /r","V","V","VAES+AVX","","w,r,r","",""
+"VAESENC zmm1, zmmV, zmm2/m512","VAESENC zmm2/m512, zmmV, zmm1","vaesenc zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DC /r","V","V","AES+AVX512F","scale64","w,r,r","",""
+"VAESENCLAST xmm1, xmmV, xmm2/m128","VAESENCLAST xmm2/m128, xmmV, xmm1","vaesenclast xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DD /r","V","V","AES+AVX512VL","scale16","w,r,r","",""
+"VAESENCLAST xmm1, xmmV, xmm2/m128","VAESENCLAST xmm2/m128, xmmV, xmm1","vaesenclast xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DD /r","V","V","AES+AVX","","w,r,r","",""
+"VAESENCLAST ymm1, ymmV, ymm2/m256","VAESENCLAST ymm2/m256, ymmV, ymm1","vaesenclast ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DD /r","V","V","AES+AVX512VL","scale32","w,r,r","",""
+"VAESENCLAST ymm1, ymmV, ymm2/m256","VAESENCLAST ymm2/m256, ymmV, ymm1","vaesenclast ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DD /r","V","V","VAES+AVX","","w,r,r","",""
+"VAESENCLAST zmm1, zmmV, zmm2/m512","VAESENCLAST zmm2/m512, zmmV, zmm1","vaesenclast zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DD /r","V","V","AES+AVX512F","scale64","w,r,r","",""
+"VAESIMC xmm1, xmm2/m128","VAESIMC xmm2/m128, xmm1","vaesimc xmm2/m128, xmm1","VEX.128.66.0F38.WIG DB /r","V","V","AES+AVX","","w,r","",""
+"VAESKEYGENASSIST xmm1, xmm2/m128, imm8u","VAESKEYGENASSIST imm8u, xmm2/m128, xmm1","vaeskeygenassist imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG DF /r ib","V","V","AES+AVX","","w,r,r","",""
+"VALIGND xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VALIGND imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","valignd imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 03 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VALIGND ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VALIGND imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","valignd imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 03 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VALIGND zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VALIGND imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","valignd imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 03 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VALIGNQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VALIGNQ imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","valignq imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 03 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VALIGNQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VALIGNQ imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","valignq imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 03 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VALIGNQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VALIGNQ imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","valignq imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 03 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VANDNPD xmm1, xmmV, xmm2/m128","VANDNPD xmm2/m128, xmmV, xmm1","vandnpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 55 /r","V","V","AVX","","w,r,r","",""
+"VANDNPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VANDNPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vandnpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 55 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VANDNPD ymm1, ymmV, ymm2/m256","VANDNPD ymm2/m256, ymmV, ymm1","vandnpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 55 /r","V","V","AVX","","w,r,r","",""
+"VANDNPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VANDNPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vandnpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 55 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VANDNPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VANDNPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vandnpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 55 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
+"VANDNPS xmm1, xmmV, xmm2/m128","VANDNPS xmm2/m128, xmmV, xmm1","vandnps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 55 /r","V","V","AVX","","w,r,r","",""
+"VANDNPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VANDNPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vandnps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 55 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VANDNPS ymm1, ymmV, ymm2/m256","VANDNPS ymm2/m256, ymmV, ymm1","vandnps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 55 /r","V","V","AVX","","w,r,r","",""
+"VANDNPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VANDNPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vandnps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 55 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VANDNPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VANDNPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vandnps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 55 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
+"VANDPD xmm1, xmmV, xmm2/m128","VANDPD xmm2/m128, xmmV, xmm1","vandpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 54 /r","V","V","AVX","","w,r,r","",""
+"VANDPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VANDPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vandpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 54 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VANDPD ymm1, ymmV, ymm2/m256","VANDPD ymm2/m256, ymmV, ymm1","vandpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 54 /r","V","V","AVX","","w,r,r","",""
+"VANDPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VANDPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vandpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 54 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VANDPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VANDPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vandpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 54 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
+"VANDPS xmm1, xmmV, xmm2/m128","VANDPS xmm2/m128, xmmV, xmm1","vandps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 54 /r","V","V","AVX","","w,r,r","",""
+"VANDPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VANDPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vandps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 54 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VANDPS ymm1, ymmV, ymm2/m256","VANDPS ymm2/m256, ymmV, ymm1","vandps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 54 /r","V","V","AVX","","w,r,r","",""
+"VANDPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VANDPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vandps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 54 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VANDPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VANDPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vandps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 54 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
+"VBLENDMPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VBLENDMPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vblendmpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 65 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VBLENDMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VBLENDMPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vblendmpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 65 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VBLENDMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VBLENDMPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vblendmpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 65 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VBLENDMPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VBLENDMPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vblendmps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 65 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VBLENDMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VBLENDMPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vblendmps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 65 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VBLENDMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VBLENDMPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vblendmps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 65 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VBLENDPD xmm1, xmmV, xmm2/m128, imm8u","VBLENDPD imm8u, xmm2/m128, xmmV, xmm1","vblendpd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0D /r ib","V","V","AVX","","w,r,r,r","",""
+"VBLENDPD ymm1, ymmV, ymm2/m256, imm8u","VBLENDPD imm8u, ymm2/m256, ymmV, ymm1","vblendpd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0D /r ib","V","V","AVX","","w,r,r,r","",""
+"VBLENDPS xmm1, xmmV, xmm2/m128, imm8u","VBLENDPS imm8u, xmm2/m128, xmmV, xmm1","vblendps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0C /r ib","V","V","AVX","","w,r,r,r","",""
+"VBLENDPS ymm1, ymmV, ymm2/m256, imm8u","VBLENDPS imm8u, ymm2/m256, ymmV, ymm1","vblendps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0C /r ib","V","V","AVX","","w,r,r,r","",""
+"VBLENDVPD xmm1, xmmV, xmm2/m128, xmmIH","VBLENDVPD xmmIH, xmm2/m128, xmmV, xmm1","vblendvpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4B /r /is4","V","V","AVX","","w,r,r,r","",""
+"VBLENDVPD ymm1, ymmV, ymm2/m256, ymmIH","VBLENDVPD ymmIH, ymm2/m256, ymmV, ymm1","vblendvpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4B /r /is4","V","V","AVX","","w,r,r,r","",""
+"VBLENDVPS xmm1, xmmV, xmm2/m128, xmmIH","VBLENDVPS xmmIH, xmm2/m128, xmmV, xmm1","vblendvps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4A /r /is4","V","V","AVX","","w,r,r,r","",""
+"VBLENDVPS ymm1, ymmV, ymm2/m256, ymmIH","VBLENDVPS ymmIH, ymm2/m256, ymmV, ymm1","vblendvps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4A /r /is4","V","V","AVX","","w,r,r,r","",""
+"VBROADCASTF128 ymm1, m128","VBROADCASTF128 m128, ymm1","vbroadcastf128 m128, ymm1","VEX.256.66.0F38.W0 1A /r","V","V","AVX","modrm_memonly","w,r","",""
+"VBROADCASTF32X2 ymm1, {k}{z}, xmm2/m64","VBROADCASTF32X2 xmm2/m64, {k}{z}, ymm1","vbroadcastf32x2 xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W0 19 /r","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","",""
+"VBROADCASTF32X2 zmm1, {k}{z}, xmm2/m64","VBROADCASTF32X2 xmm2/m64, {k}{z}, zmm1","vbroadcastf32x2 xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W0 19 /r","V","V","AVX512DQ","scale8","w,r,r","",""
+"VBROADCASTF32X4 ymm1, {k}{z}, m128","VBROADCASTF32X4 m128, {k}{z}, ymm1","vbroadcastf32x4 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 1A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTF32X4 zmm1, {k}{z}, m128","VBROADCASTF32X4 m128, {k}{z}, zmm1","vbroadcastf32x4 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1A /r","V","V","AVX512F","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTF32X8 zmm1, {k}{z}, m256","VBROADCASTF32X8 m256, {k}{z}, zmm1","vbroadcastf32x8 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1B /r","V","V","AVX512DQ","modrm_memonly,scale32","w,r,r","",""
+"VBROADCASTF64X2 ymm1, {k}{z}, m128","VBROADCASTF64X2 m128, {k}{z}, ymm1","vbroadcastf64x2 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W1 1A /r","V","V","AVX512DQ+AVX512VL","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTF64X2 zmm1, {k}{z}, m128","VBROADCASTF64X2 m128, {k}{z}, zmm1","vbroadcastf64x2 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W1 1A /r","V","V","AVX512DQ","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTF64X4 zmm1, {k}{z}, m256","VBROADCASTF64X4 m256, {k}{z}, zmm1","vbroadcastf64x4 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W1 1B /r","V","V","AVX512F","modrm_memonly,scale32","w,r,r","",""
+"VBROADCASTI128 ymm1, m128","VBROADCASTI128 m128, ymm1","vbroadcasti128 m128, ymm1","VEX.256.66.0F38.W0 5A /r","V","V","AVX2","modrm_memonly","w,r","",""
+"VBROADCASTI32X2 xmm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}, xmm1","vbroadcasti32x2 xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 59 /r","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","",""
+"VBROADCASTI32X2 ymm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}, ymm1","vbroadcasti32x2 xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W0 59 /r","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","",""
+"VBROADCASTI32X2 zmm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}, zmm1","vbroadcasti32x2 xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W0 59 /r","V","V","AVX512DQ","scale8","w,r,r","",""
+"VBROADCASTI32X4 ymm1, {k}{z}, m128","VBROADCASTI32X4 m128, {k}{z}, ymm1","vbroadcasti32x4 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 5A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTI32X4 zmm1, {k}{z}, m128","VBROADCASTI32X4 m128, {k}{z}, zmm1","vbroadcasti32x4 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W0 5A /r","V","V","AVX512F","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTI32X8 zmm1, {k}{z}, m256","VBROADCASTI32X8 m256, {k}{z}, zmm1","vbroadcasti32x8 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 5B /r","V","V","AVX512DQ","modrm_memonly,scale32","w,r,r","",""
+"VBROADCASTI64X2 ymm1, {k}{z}, m128","VBROADCASTI64X2 m128, {k}{z}, ymm1","vbroadcasti64x2 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W1 5A /r","V","V","AVX512DQ+AVX512VL","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTI64X2 zmm1, {k}{z}, m128","VBROADCASTI64X2 m128, {k}{z}, zmm1","vbroadcasti64x2 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W1 5A /r","V","V","AVX512DQ","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTI64X4 zmm1, {k}{z}, m256","VBROADCASTI64X4 m256, {k}{z}, zmm1","vbroadcasti64x4 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W1 5B /r","V","V","AVX512F","modrm_memonly,scale32","w,r,r","",""
+"VBROADCASTSD ymm1, m64","VBROADCASTSD m64, ymm1","vbroadcastsd m64, ymm1","VEX.256.66.0F38.W0 19 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VBROADCASTSD ymm1, xmm2","VBROADCASTSD xmm2, ymm1","vbroadcastsd xmm2, ymm1","VEX.256.66.0F38.W0 19 /r","V","V","AVX2","modrm_regonly","w,r","",""
+"VBROADCASTSD ymm1, {k}{z}, xmm2/m64","VBROADCASTSD xmm2/m64, {k}{z}, ymm1","vbroadcastsd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 19 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VBROADCASTSD zmm1, {k}{z}, xmm2/m64","VBROADCASTSD xmm2/m64, {k}{z}, zmm1","vbroadcastsd xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 19 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VBROADCASTSS xmm1, m32","VBROADCASTSS m32, xmm1","vbroadcastss m32, xmm1","VEX.128.66.0F38.W0 18 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VBROADCASTSS ymm1, m32","VBROADCASTSS m32, ymm1","vbroadcastss m32, ymm1","VEX.256.66.0F38.W0 18 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VBROADCASTSS xmm1, xmm2","VBROADCASTSS xmm2, xmm1","vbroadcastss xmm2, xmm1","VEX.128.66.0F38.W0 18 /r","V","V","AVX2","modrm_regonly","w,r","",""
+"VBROADCASTSS ymm1, xmm2","VBROADCASTSS xmm2, ymm1","vbroadcastss xmm2, ymm1","VEX.256.66.0F38.W0 18 /r","V","V","AVX2","modrm_regonly","w,r","",""
+"VBROADCASTSS xmm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, xmm1","vbroadcastss xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 18 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VBROADCASTSS ymm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, ymm1","vbroadcastss xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 18 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VBROADCASTSS zmm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, zmm1","vbroadcastss xmm2/m32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 18 /r","V","V","AVX512F","scale4","w,r,r","",""
+"VCMPPD xmm1, xmmV, xmm2/m128, imm8u","VCMPPD imm8u, xmm2/m128, xmmV, xmm1","vcmppd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
+"VCMPPD k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VCMPPD imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","vcmppd imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F.W1 C2 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VCMPPD ymm1, ymmV, ymm2/m256, imm8u","VCMPPD imm8u, ymm2/m256, ymmV, ymm1","vcmppd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
+"VCMPPD k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VCMPPD imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","vcmppd imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F.W1 C2 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VCMPPD k1{sae}, {k}, zmmV, zmm2, imm8u","VCMPPD imm8u, zmm2, zmmV, {k}, k1{sae}","vcmppd imm8u, zmm2, zmmV, {k}, k1{sae}","EVEX.NDS.512.66.0F.W1 C2 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VCMPPD k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VCMPPD imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","vcmppd imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F.W1 C2 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VCMPPS xmm1, xmmV, xmm2/m128, imm8u","VCMPPS imm8u, xmm2/m128, xmmV, xmm1","vcmpps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
+"VCMPPS k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VCMPPS imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","vcmpps imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.0F.W0 C2 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VCMPPS ymm1, ymmV, ymm2/m256, imm8u","VCMPPS imm8u, ymm2/m256, ymmV, ymm1","vcmpps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
+"VCMPPS k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VCMPPS imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","vcmpps imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.0F.W0 C2 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VCMPPS k1{sae}, {k}, zmmV, zmm2, imm8u","VCMPPS imm8u, zmm2, zmmV, {k}, k1{sae}","vcmpps imm8u, zmm2, zmmV, {k}, k1{sae}","EVEX.NDS.512.0F.W0 C2 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VCMPPS k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VCMPPS imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","vcmpps imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.0F.W0 C2 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VCMPSD k1{sae}, {k}, xmmV, xmm2, imm8u","VCMPSD imm8u, xmm2, xmmV, {k}, k1{sae}","vcmpsd imm8u, xmm2, xmmV, {k}, k1{sae}","EVEX.NDS.128.F2.0F.W1 C2 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VCMPSD xmm1, xmmV, xmm2/m64, imm8u","VCMPSD imm8u, xmm2/m64, xmmV, xmm1","vcmpsd imm8u, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
+"VCMPSD k1, {k}, xmmV, xmm2/m64, imm8u","VCMPSD imm8u, xmm2/m64, xmmV, {k}, k1","vcmpsd imm8u, xmm2/m64, xmmV, {k}, k1","EVEX.NDS.LIG.F2.0F.W1 C2 /r ib","V","V","AVX512F","scale8","w,r,r,r,r","",""
+"VCMPSS k1{sae}, {k}, xmmV, xmm2, imm8u","VCMPSS imm8u, xmm2, xmmV, {k}, k1{sae}","vcmpss imm8u, xmm2, xmmV, {k}, k1{sae}","EVEX.NDS.128.F3.0F.W0 C2 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VCMPSS xmm1, xmmV, xmm2/m32, imm8u","VCMPSS imm8u, xmm2/m32, xmmV, xmm1","vcmpss imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
+"VCMPSS k1, {k}, xmmV, xmm2/m32, imm8u","VCMPSS imm8u, xmm2/m32, xmmV, {k}, k1","vcmpss imm8u, xmm2/m32, xmmV, {k}, k1","EVEX.NDS.LIG.F3.0F.W0 C2 /r ib","V","V","AVX512F","scale4","w,r,r,r,r","",""
+"VCOMISD xmm1{sae}, xmm2","VCOMISD xmm2, xmm1{sae}","vcomisd xmm2, xmm1{sae}","EVEX.128.66.0F.W1 2F /r","V","V","AVX512F","modrm_regonly","r,r","",""
+"VCOMISD xmm1, xmm2/m64","VCOMISD xmm2/m64, xmm1","vcomisd xmm2/m64, xmm1","EVEX.LIG.66.0F.W1 2F /r","V","V","AVX512F","scale8","r,r","",""
+"VCOMISD xmm1, xmm2/m64","VCOMISD xmm2/m64, xmm1","vcomisd xmm2/m64, xmm1","VEX.LIG.66.0F.WIG 2F /r","V","V","AVX","","r,r","",""
+"VCOMISS xmm1{sae}, xmm2","VCOMISS xmm2, xmm1{sae}","vcomiss xmm2, xmm1{sae}","EVEX.128.0F.W0 2F /r","V","V","AVX512F","modrm_regonly","r,r","",""
+"VCOMISS xmm1, xmm2/m32","VCOMISS xmm2/m32, xmm1","vcomiss xmm2/m32, xmm1","EVEX.LIG.0F.W0 2F /r","V","V","AVX512F","scale4","r,r","",""
+"VCOMISS xmm1, xmm2/m32","VCOMISS xmm2/m32, xmm1","vcomiss xmm2/m32, xmm1","VEX.LIG.0F.WIG 2F /r","V","V","AVX","","r,r","",""
+"VCOMPRESSPD xmm2/m128, {k}{z}, xmm1","VCOMPRESSPD xmm1, {k}{z}, xmm2/m128","vcompresspd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 8A /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VCOMPRESSPD ymm2/m256, {k}{z}, ymm1","VCOMPRESSPD ymm1, {k}{z}, ymm2/m256","vcompresspd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 8A /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VCOMPRESSPD zmm2/m512, {k}{z}, zmm1","VCOMPRESSPD zmm1, {k}{z}, zmm2/m512","vcompresspd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 8A /r","V","V","AVX512F","scale8","w,r,r","",""
+"VCOMPRESSPS xmm2/m128, {k}{z}, xmm1","VCOMPRESSPS xmm1, {k}{z}, xmm2/m128","vcompressps xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 8A /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VCOMPRESSPS ymm2/m256, {k}{z}, ymm1","VCOMPRESSPS ymm1, {k}{z}, ymm2/m256","vcompressps ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 8A /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VCOMPRESSPS zmm2/m512, {k}{z}, zmm1","VCOMPRESSPS zmm1, {k}{z}, zmm2/m512","vcompressps zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 8A /r","V","V","AVX512F","scale4","w,r,r","",""
+"VCVTDQ2PD ymm1, xmm2/m128","VCVTDQ2PD xmm2/m128, ymm1","vcvtdq2pd xmm2/m128, ymm1","VEX.256.F3.0F.WIG E6 /r","V","V","AVX","","w,r","",""
+"VCVTDQ2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTDQ2PD xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtdq2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W0 E6 /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTDQ2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTDQ2PD xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtdq2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W0 E6 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTDQ2PD xmm1, xmm2/m64","VCVTDQ2PD xmm2/m64, xmm1","vcvtdq2pd xmm2/m64, xmm1","VEX.128.F3.0F.WIG E6 /r","V","V","AVX","","w,r","",""
+"VCVTDQ2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTDQ2PD ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtdq2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W0 E6 /r","V","V","AVX512F","bscale4,scale32","w,r,r","",""
+"VCVTDQ2PS xmm1, xmm2/m128","VCVTDQ2PS xmm2/m128, xmm1","vcvtdq2ps xmm2/m128, xmm1","VEX.128.0F.WIG 5B /r","V","V","AVX","","w,r","",""
+"VCVTDQ2PS xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTDQ2PS xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtdq2ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTDQ2PS ymm1, ymm2/m256","VCVTDQ2PS ymm2/m256, ymm1","vcvtdq2ps ymm2/m256, ymm1","VEX.256.0F.WIG 5B /r","V","V","AVX","","w,r","",""
+"VCVTDQ2PS ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTDQ2PS ymm2/m256/m32bcst, {k}{z}, ymm1","vcvtdq2ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VCVTDQ2PS zmm1{er}, {k}{z}, zmm2","VCVTDQ2PS zmm2, {k}{z}, zmm1{er}","vcvtdq2ps zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 5B /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTDQ2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTDQ2PS zmm2/m512/m32bcst, {k}{z}, zmm1","vcvtdq2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 5B /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VCVTPD2DQ ymm1{er}, {k}{z}, zmm2","VCVTPD2DQ zmm2, {k}{z}, ymm1{er}","vcvtpd2dq zmm2, {k}{z}, ymm1{er}","EVEX.512.F2.0F.W1 E6 /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
+"VCVTPD2DQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2DQ zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtpd2dq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.F2.0F.W1 E6 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
+"VCVTPD2DQ xmm1, xmm2/m128","VCVTPD2DQX xmm2/m128, xmm1","vcvtpd2dqx xmm2/m128, xmm1","VEX.128.F2.0F.WIG E6 /r","V","V","AVX","","w,r","Y","128"
+"VCVTPD2DQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2DQX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2dqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F2.0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTPD2DQ xmm1, ymm2/m256","VCVTPD2DQY ymm2/m256, xmm1","vcvtpd2dqy ymm2/m256, xmm1","VEX.256.F2.0F.WIG E6 /r","V","V","AVX","","w,r","Y","256"
+"VCVTPD2DQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2DQY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtpd2dqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.F2.0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTPD2PS ymm1{er}, {k}{z}, zmm2","VCVTPD2PS zmm2, {k}{z}, ymm1{er}","vcvtpd2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.66.0F.W1 5A /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
+"VCVTPD2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2PS zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtpd2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.66.0F.W1 5A /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
+"VCVTPD2PS xmm1, xmm2/m128","VCVTPD2PSX xmm2/m128, xmm1","vcvtpd2psx xmm2/m128, xmm1","VEX.128.66.0F.WIG 5A /r","V","V","AVX","","w,r","Y","128"
+"VCVTPD2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2PSX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 5A /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTPD2PS xmm1, ymm2/m256","VCVTPD2PSY ymm2/m256, xmm1","vcvtpd2psy ymm2/m256, xmm1","VEX.256.66.0F.WIG 5A /r","V","V","AVX","","w,r","Y","256"
+"VCVTPD2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2PSY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtpd2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.66.0F.W1 5A /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTPD2QQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2QQ xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2qq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 7B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VCVTPD2QQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2QQ ymm2/m256/m64bcst, {k}{z}, ymm1","vcvtpd2qq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 7B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VCVTPD2QQ zmm1{er}, {k}{z}, zmm2","VCVTPD2QQ zmm2, {k}{z}, zmm1{er}","vcvtpd2qq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 7B /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTPD2QQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2QQ zmm2/m512/m64bcst, {k}{z}, zmm1","vcvtpd2qq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 7B /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
+"VCVTPD2UDQ ymm1{er}, {k}{z}, zmm2","VCVTPD2UDQ zmm2, {k}{z}, ymm1{er}","vcvtpd2udq zmm2, {k}{z}, ymm1{er}","EVEX.512.0F.W1 79 /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
+"VCVTPD2UDQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2UDQ zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtpd2udq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0F.W1 79 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
+"VCVTPD2UDQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2UDQX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2udqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.0F.W1 79 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTPD2UDQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2UDQY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtpd2udqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.0F.W1 79 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTPD2UQQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2UQQ xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2uqq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 79 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VCVTPD2UQQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2UQQ ymm2/m256/m64bcst, {k}{z}, ymm1","vcvtpd2uqq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 79 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VCVTPD2UQQ zmm1{er}, {k}{z}, zmm2","VCVTPD2UQQ zmm2, {k}{z}, zmm1{er}","vcvtpd2uqq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 79 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTPD2UQQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2UQQ zmm2/m512/m64bcst, {k}{z}, zmm1","vcvtpd2uqq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 79 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
+"VCVTPH2PS ymm1, xmm2/m128","VCVTPH2PS xmm2/m128, ymm1","vcvtph2ps xmm2/m128, ymm1","VEX.256.66.0F38.W0 13 /r","V","V","F16C","","w,r","",""
+"VCVTPH2PS ymm1, {k}{z}, xmm2/m128","VCVTPH2PS xmm2/m128, {k}{z}, ymm1","vcvtph2ps xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 13 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VCVTPH2PS xmm1, xmm2/m64","VCVTPH2PS xmm2/m64, xmm1","vcvtph2ps xmm2/m64, xmm1","VEX.128.66.0F38.W0 13 /r","V","V","F16C","","w,r","",""
+"VCVTPH2PS xmm1, {k}{z}, xmm2/m64","VCVTPH2PS xmm2/m64, {k}{z}, xmm1","vcvtph2ps xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 13 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VCVTPH2PS zmm1{sae}, {k}{z}, ymm2","VCVTPH2PS ymm2, {k}{z}, zmm1{sae}","vcvtph2ps ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 13 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTPH2PS zmm1, {k}{z}, ymm2/m256","VCVTPH2PS ymm2/m256, {k}{z}, zmm1","vcvtph2ps ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 13 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VCVTPS2DQ xmm1, xmm2/m128","VCVTPS2DQ xmm2/m128, xmm1","vcvtps2dq xmm2/m128, xmm1","VEX.128.66.0F.WIG 5B /r","V","V","AVX","","w,r","",""
+"VCVTPS2DQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2DQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2dq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTPS2DQ ymm1, ymm2/m256","VCVTPS2DQ ymm2/m256, ymm1","vcvtps2dq ymm2/m256, ymm1","VEX.256.66.0F.WIG 5B /r","V","V","AVX","","w,r","",""
+"VCVTPS2DQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTPS2DQ ymm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2dq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VCVTPS2DQ zmm1{er}, {k}{z}, zmm2","VCVTPS2DQ zmm2, {k}{z}, zmm1{er}","vcvtps2dq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 5B /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTPS2DQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTPS2DQ zmm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2dq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 5B /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VCVTPS2PD ymm1, xmm2/m128","VCVTPS2PD xmm2/m128, ymm1","vcvtps2pd xmm2/m128, ymm1","VEX.256.0F.WIG 5A /r","V","V","AVX","","w,r","",""
+"VCVTPS2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2PD xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 5A /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTPS2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2PD xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 5A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTPS2PD xmm1, xmm2/m64","VCVTPS2PD xmm2/m64, xmm1","vcvtps2pd xmm2/m64, xmm1","VEX.128.0F.WIG 5A /r","V","V","AVX","","w,r","",""
+"VCVTPS2PD zmm1{sae}, {k}{z}, ymm2","VCVTPS2PD ymm2, {k}{z}, zmm1{sae}","vcvtps2pd ymm2, {k}{z}, zmm1{sae}","EVEX.512.0F.W0 5A /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTPS2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2PD ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 5A /r","V","V","AVX512F","bscale4,scale32","w,r,r","",""
+"VCVTPS2PH xmm2/m64, xmm1, imm8u","VCVTPS2PH imm8u, xmm1, xmm2/m64","vcvtps2ph imm8u, xmm1, xmm2/m64","VEX.128.66.0F3A.W0 1D /r ib","V","V","F16C","","w,r,r","",""
+"VCVTPS2PH xmm2/m64, {k}{z}, xmm1, imm8u","VCVTPS2PH imm8u, xmm1, {k}{z}, xmm2/m64","vcvtps2ph imm8u, xmm1, {k}{z}, xmm2/m64","EVEX.128.66.0F3A.W0 1D /r ib","V","V","AVX512F+AVX512VL","scale8","w,r,r,r","",""
+"VCVTPS2PH xmm2/m128, ymm1, imm8u","VCVTPS2PH imm8u, ymm1, xmm2/m128","vcvtps2ph imm8u, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 1D /r ib","V","V","F16C","","w,r,r","",""
+"VCVTPS2PH xmm2/m128, {k}{z}, ymm1, imm8u","VCVTPS2PH imm8u, ymm1, {k}{z}, xmm2/m128","vcvtps2ph imm8u, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W0 1D /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VCVTPS2PH ymm2/m256, {k}{z}, zmm1, imm8u","VCVTPS2PH imm8u, zmm1, {k}{z}, ymm2/m256","vcvtps2ph imm8u, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W0 1D /r ib","V","V","AVX512F","scale32","w,r,r,r","",""
+"VCVTPS2PH ymm2{sae}, {k}{z}, zmm1, imm8u","VCVTPS2PH imm8u, zmm1, {k}{z}, ymm2{sae}","vcvtps2ph imm8u, zmm1, {k}{z}, ymm2{sae}","EVEX.512.66.0F3A.W0 1D /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VCVTPS2QQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2QQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2qq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 7B /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTPS2QQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2QQ xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2qq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 7B /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTPS2QQ zmm1{er}, {k}{z}, ymm2","VCVTPS2QQ ymm2, {k}{z}, zmm1{er}","vcvtps2qq ymm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 7B /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTPS2QQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2QQ ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2qq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 7B /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","",""
+"VCVTPS2UDQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2UDQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2udq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 79 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTPS2UDQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTPS2UDQ ymm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2udq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 79 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VCVTPS2UDQ zmm1{er}, {k}{z}, zmm2","VCVTPS2UDQ zmm2, {k}{z}, zmm1{er}","vcvtps2udq zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 79 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTPS2UDQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTPS2UDQ zmm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2udq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 79 /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VCVTPS2UQQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2UQQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2uqq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 79 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTPS2UQQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2UQQ xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2uqq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 79 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTPS2UQQ zmm1{er}, {k}{z}, ymm2","VCVTPS2UQQ ymm2, {k}{z}, zmm1{er}","vcvtps2uqq ymm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 79 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTPS2UQQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2UQQ ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2uqq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 79 /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","",""
+"VCVTQQ2PD xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTQQ2PD xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtqq2pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W1 E6 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VCVTQQ2PD ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTQQ2PD ymm2/m256/m64bcst, {k}{z}, ymm1","vcvtqq2pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W1 E6 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VCVTQQ2PD zmm1{er}, {k}{z}, zmm2","VCVTQQ2PD zmm2, {k}{z}, zmm1{er}","vcvtqq2pd zmm2, {k}{z}, zmm1{er}","EVEX.512.F3.0F.W1 E6 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTQQ2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTQQ2PD zmm2/m512/m64bcst, {k}{z}, zmm1","vcvtqq2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W1 E6 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
+"VCVTQQ2PS ymm1{er}, {k}{z}, zmm2","VCVTQQ2PS zmm2, {k}{z}, ymm1{er}","vcvtqq2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.0F.W1 5B /r","V","V","AVX512DQ","modrm_regonly","w,r,r","Y",""
+"VCVTQQ2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTQQ2PS zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtqq2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0F.W1 5B /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","Y","512"
+"VCVTQQ2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTQQ2PSX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtqq2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.0F.W1 5B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTQQ2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTQQ2PSY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtqq2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.0F.W1 5B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTSD2SI r32{er}, xmm2","VCVTSD2SI xmm2, r32{er}","vcvtsd2si xmm2, r32{er}","EVEX.128.F2.0F.W0 2D /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTSD2SI r32, xmm2/m64","VCVTSD2SI xmm2/m64, r32","vcvtsd2si xmm2/m64, r32","EVEX.LIG.F2.0F.W0 2D /r","V","V","AVX512F","scale8","w,r","Y","32"
+"VCVTSD2SI r32, xmm2/m64","VCVTSD2SI xmm2/m64, r32","vcvtsd2si xmm2/m64, r32","VEX.LIG.F2.0F.W0 2D /r","V","V","AVX","","w,r","Y","32"
+"VCVTSD2SI r64{er}, xmm2","VCVTSD2SIQ xmm2, r64{er}","vcvtsd2siq xmm2, r64{er}","EVEX.128.F2.0F.W1 2D /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTSD2SI r64, xmm2/m64","VCVTSD2SIQ xmm2/m64, r64","vcvtsd2siq xmm2/m64, r64","EVEX.LIG.F2.0F.W1 2D /r","N.S.","V","AVX512F","scale8","w,r","Y","64"
+"VCVTSD2SI r64, xmm2/m64","VCVTSD2SIQ xmm2/m64, r64","vcvtsd2siq xmm2/m64, r64","VEX.LIG.F2.0F.W1 2D /r","N.S.","V","AVX","","w,r","Y","64"
+"VCVTSD2SS xmm1{er}, {k}{z}, xmmV, xmm2","VCVTSD2SS xmm2, xmmV, {k}{z}, xmm1{er}","vcvtsd2ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5A /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VCVTSD2SS xmm1, xmmV, xmm2/m64","VCVTSD2SS xmm2/m64, xmmV, xmm1","vcvtsd2ss xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5A /r","V","V","AVX","","w,r,r","",""
+"VCVTSD2SS xmm1, {k}{z}, xmmV, xmm2/m64","VCVTSD2SS xmm2/m64, xmmV, {k}{z}, xmm1","vcvtsd2ss xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5A /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VCVTSD2USI r32{er}, xmm2","VCVTSD2USIL xmm2, r32{er}","vcvtsd2usi xmm2, r32{er}","EVEX.128.F2.0F.W0 79 /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTSD2USI r32, xmm2/m64","VCVTSD2USIL xmm2/m64, r32","vcvtsd2usi xmm2/m64, r32","EVEX.LIG.F2.0F.W0 79 /r","V","V","AVX512F","scale8","w,r","Y","32"
+"VCVTSD2USI r64{er}, xmm2","VCVTSD2USIQ xmm2, r64{er}","vcvtsd2usi xmm2, r64{er}","EVEX.128.F2.0F.W1 79 /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTSD2USI r64, xmm2/m64","VCVTSD2USIQ xmm2/m64, r64","vcvtsd2usi xmm2/m64, r64","EVEX.LIG.F2.0F.W1 79 /r","N.S.","V","AVX512F","scale8","w,r","Y","64"
+"VCVTSI2SD xmm1, xmmV, r/m32","VCVTSI2SDL r/m32, xmmV, xmm1","vcvtsi2sdl r/m32, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W0 2A /r","V","V","AVX512F","scale4","w,r,r","Y","32"
+"VCVTSI2SD xmm1, xmmV, r/m32","VCVTSI2SDL r/m32, xmmV, xmm1","vcvtsi2sdl r/m32, xmmV, xmm1","VEX.NDS.LIG.F2.0F.W0 2A /r","V","V","AVX","","w,r,r","Y","32"
+"VCVTSI2SD xmm1, xmmV, r/m64","VCVTSI2SDQ r/m64, xmmV, xmm1","vcvtsi2sdq r/m64, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W1 2A /r","N.S.","V","AVX512F","scale8","w,r,r","Y","64"
+"VCVTSI2SD xmm1, xmmV, r/m64","VCVTSI2SDQ r/m64, xmmV, xmm1","vcvtsi2sdq r/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.W1 2A /r","N.S.","V","AVX","","w,r,r","Y","64"
+"VCVTSI2SD xmm1{er}, xmmV, rmr64","VCVTSI2SDQ rmr64, xmmV, xmm1{er}","vcvtsi2sdq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F2.0F.W1 2A /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","Y","64"
+"VCVTSI2SS xmm1, xmmV, r/m32","VCVTSI2SSL r/m32, xmmV, xmm1","vcvtsi2ssl r/m32, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W0 2A /r","V","V","AVX512F","scale4","w,r,r","Y","32"
+"VCVTSI2SS xmm1, xmmV, r/m32","VCVTSI2SSL r/m32, xmmV, xmm1","vcvtsi2ssl r/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.W0 2A /r","V","V","AVX","","w,r,r","Y","32"
+"VCVTSI2SS xmm1{er}, xmmV, rmr32","VCVTSI2SSL rmr32, xmmV, xmm1{er}","vcvtsi2ssl rmr32, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W0 2A /r","V","V","AVX512F","modrm_regonly","w,r,r","Y","32"
+"VCVTSI2SS xmm1, xmmV, r/m64","VCVTSI2SSQ r/m64, xmmV, xmm1","vcvtsi2ssq r/m64, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W1 2A /r","N.S.","V","AVX512F","scale8","w,r,r","Y","64"
+"VCVTSI2SS xmm1, xmmV, r/m64","VCVTSI2SSQ r/m64, xmmV, xmm1","vcvtsi2ssq r/m64, xmmV, xmm1","VEX.NDS.LIG.F3.0F.W1 2A /r","N.S.","V","AVX","","w,r,r","Y","64"
+"VCVTSI2SS xmm1{er}, xmmV, rmr64","VCVTSI2SSQ rmr64, xmmV, xmm1{er}","vcvtsi2ssq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W1 2A /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","Y","64"
+"VCVTSS2SD xmm1{sae}, {k}{z}, xmmV, xmm2","VCVTSS2SD xmm2, xmmV, {k}{z}, xmm1{sae}","vcvtss2sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 5A /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VCVTSS2SD xmm1, xmmV, xmm2/m32","VCVTSS2SD xmm2/m32, xmmV, xmm1","vcvtss2sd xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5A /r","V","V","AVX","","w,r,r","",""
+"VCVTSS2SD xmm1, {k}{z}, xmmV, xmm2/m32","VCVTSS2SD xmm2/m32, xmmV, {k}{z}, xmm1","vcvtss2sd xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5A /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VCVTSS2SI r32{er}, xmm2","VCVTSS2SI xmm2, r32{er}","vcvtss2si xmm2, r32{er}","EVEX.128.F3.0F.W0 2D /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTSS2SI r32, xmm2/m32","VCVTSS2SI xmm2/m32, r32","vcvtss2si xmm2/m32, r32","EVEX.LIG.F3.0F.W0 2D /r","V","V","AVX512F","scale4","w,r","Y","32"
+"VCVTSS2SI r32, xmm2/m32","VCVTSS2SI xmm2/m32, r32","vcvtss2si xmm2/m32, r32","VEX.LIG.F3.0F.W0 2D /r","V","V","AVX","","w,r","Y","32"
+"VCVTSS2SI r64{er}, xmm2","VCVTSS2SIQ xmm2, r64{er}","vcvtss2siq xmm2, r64{er}","EVEX.128.F3.0F.W1 2D /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTSS2SI r64, xmm2/m32","VCVTSS2SIQ xmm2/m32, r64","vcvtss2siq xmm2/m32, r64","EVEX.LIG.F3.0F.W1 2D /r","N.S.","V","AVX512F","scale4","w,r","Y","64"
+"VCVTSS2SI r64, xmm2/m32","VCVTSS2SIQ xmm2/m32, r64","vcvtss2siq xmm2/m32, r64","VEX.LIG.F3.0F.W1 2D /r","N.S.","V","AVX","","w,r","Y","64"
+"VCVTSS2USI r32{er}, xmm2","VCVTSS2USIL xmm2, r32{er}","vcvtss2usil xmm2, r32{er}","EVEX.128.F3.0F.W0 79 /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTSS2USI r32, xmm2/m32","VCVTSS2USIL xmm2/m32, r32","vcvtss2usil xmm2/m32, r32","EVEX.LIG.F3.0F.W0 79 /r","V","V","AVX512F","scale4","w,r","Y","32"
+"VCVTSS2USI r64{er}, xmm2","VCVTSS2USIQ xmm2, r64{er}","vcvtss2usiq xmm2, r64{er}","EVEX.128.F3.0F.W1 79 /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTSS2USI r64, xmm2/m32","VCVTSS2USIQ xmm2/m32, r64","vcvtss2usiq xmm2/m32, r64","EVEX.LIG.F3.0F.W1 79 /r","N.S.","V","AVX512F","scale4","w,r","Y","64"
+"VCVTTPD2DQ ymm1{sae}, {k}{z}, zmm2","VCVTTPD2DQ zmm2, {k}{z}, ymm1{sae}","vcvttpd2dq zmm2, {k}{z}, ymm1{sae}","EVEX.512.66.0F.W1 E6 /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
+"VCVTTPD2DQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2DQ zmm2/m512/m64bcst, {k}{z}, ymm1","vcvttpd2dq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.66.0F.W1 E6 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
+"VCVTTPD2DQ xmm1, xmm2/m128","VCVTTPD2DQX xmm2/m128, xmm1","vcvttpd2dqx xmm2/m128, xmm1","VEX.128.66.0F.WIG E6 /r","V","V","AVX","","w,r","Y","128"
+"VCVTTPD2DQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2DQX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvttpd2dqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTTPD2DQ xmm1, ymm2/m256","VCVTTPD2DQY ymm2/m256, xmm1","vcvttpd2dqy ymm2/m256, xmm1","VEX.256.66.0F.WIG E6 /r","V","V","AVX","","w,r","Y","256"
+"VCVTTPD2DQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2DQY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvttpd2dqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.66.0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTTPD2QQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2QQ xmm2/m128/m64bcst, {k}{z}, xmm1","vcvttpd2qq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VCVTTPD2QQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2QQ ymm2/m256/m64bcst, {k}{z}, ymm1","vcvttpd2qq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VCVTTPD2QQ zmm1{sae}, {k}{z}, zmm2","VCVTTPD2QQ zmm2, {k}{z}, zmm1{sae}","vcvttpd2qq zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W1 7A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTTPD2QQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2QQ zmm2/m512/m64bcst, {k}{z}, zmm1","vcvttpd2qq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
+"VCVTTPD2UDQ ymm1{sae}, {k}{z}, zmm2","VCVTTPD2UDQ zmm2, {k}{z}, ymm1{sae}","vcvttpd2udq zmm2, {k}{z}, ymm1{sae}","EVEX.512.0F.W1 78 /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
+"VCVTTPD2UDQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2UDQ zmm2/m512/m64bcst, {k}{z}, ymm1","vcvttpd2udq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0F.W1 78 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
+"VCVTTPD2UDQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2UDQX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvttpd2udqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.0F.W1 78 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTTPD2UDQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2UDQY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvttpd2udqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.0F.W1 78 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTTPD2UQQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2UQQ xmm2/m128/m64bcst, {k}{z}, xmm1","vcvttpd2uqq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 78 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VCVTTPD2UQQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2UQQ ymm2/m256/m64bcst, {k}{z}, ymm1","vcvttpd2uqq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 78 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VCVTTPD2UQQ zmm1{sae}, {k}{z}, zmm2","VCVTTPD2UQQ zmm2, {k}{z}, zmm1{sae}","vcvttpd2uqq zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W1 78 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTTPD2UQQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2UQQ zmm2/m512/m64bcst, {k}{z}, zmm1","vcvttpd2uqq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 78 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
+"VCVTTPS2DQ xmm1, xmm2/m128","VCVTTPS2DQ xmm2/m128, xmm1","vcvttps2dq xmm2/m128, xmm1","VEX.128.F3.0F.WIG 5B /r","V","V","AVX","","w,r","",""
+"VCVTTPS2DQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2DQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvttps2dq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTTPS2DQ ymm1, ymm2/m256","VCVTTPS2DQ ymm2/m256, ymm1","vcvttps2dq ymm2/m256, ymm1","VEX.256.F3.0F.WIG 5B /r","V","V","AVX","","w,r","",""
+"VCVTTPS2DQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTTPS2DQ ymm2/m256/m32bcst, {k}{z}, ymm1","vcvttps2dq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VCVTTPS2DQ zmm1{sae}, {k}{z}, zmm2","VCVTTPS2DQ zmm2, {k}{z}, zmm1{sae}","vcvttps2dq zmm2, {k}{z}, zmm1{sae}","EVEX.512.F3.0F.W0 5B /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTTPS2DQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTTPS2DQ zmm2/m512/m32bcst, {k}{z}, zmm1","vcvttps2dq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W0 5B /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VCVTTPS2QQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2QQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvttps2qq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 7A /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTTPS2QQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTTPS2QQ xmm2/m256/m32bcst, {k}{z}, ymm1","vcvttps2qq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 7A /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTTPS2QQ zmm1{sae}, {k}{z}, ymm2","VCVTTPS2QQ ymm2, {k}{z}, zmm1{sae}","vcvttps2qq ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W0 7A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTTPS2QQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTTPS2QQ ymm2/m512/m32bcst, {k}{z}, zmm1","vcvttps2qq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 7A /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","",""
+"VCVTTPS2UDQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2UDQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvttps2udq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 78 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTTPS2UDQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTTPS2UDQ ymm2/m256/m32bcst, {k}{z}, ymm1","vcvttps2udq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 78 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VCVTTPS2UDQ zmm1{sae}, {k}{z}, zmm2","VCVTTPS2UDQ zmm2, {k}{z}, zmm1{sae}","vcvttps2udq zmm2, {k}{z}, zmm1{sae}","EVEX.512.0F.W0 78 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTTPS2UDQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTTPS2UDQ zmm2/m512/m32bcst, {k}{z}, zmm1","vcvttps2udq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 78 /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VCVTTPS2UQQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2UQQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvttps2uqq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 78 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTTPS2UQQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTTPS2UQQ xmm2/m256/m32bcst, {k}{z}, ymm1","vcvttps2uqq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 78 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTTPS2UQQ zmm1{sae}, {k}{z}, ymm2","VCVTTPS2UQQ ymm2, {k}{z}, zmm1{sae}","vcvttps2uqq ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W0 78 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTTPS2UQQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTTPS2UQQ ymm2/m512/m32bcst, {k}{z}, zmm1","vcvttps2uqq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 78 /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","",""
+"VCVTTSD2SI r32{sae}, xmm2","VCVTTSD2SI xmm2, r32{sae}","vcvttsd2si xmm2, r32{sae}","EVEX.128.F2.0F.W0 2C /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTTSD2SI r32, xmm2/m64","VCVTTSD2SI xmm2/m64, r32","vcvttsd2si xmm2/m64, r32","EVEX.LIG.F2.0F.W0 2C /r","V","V","AVX512F","scale8","w,r","Y","32"
+"VCVTTSD2SI r32, xmm2/m64","VCVTTSD2SI xmm2/m64, r32","vcvttsd2si xmm2/m64, r32","VEX.LIG.F2.0F.W0 2C /r","V","V","AVX","","w,r","Y","32"
+"VCVTTSD2SI r64{sae}, xmm2","VCVTTSD2SIQ xmm2, r64{sae}","vcvttsd2siq xmm2, r64{sae}","EVEX.128.F2.0F.W1 2C /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTTSD2SI r64, xmm2/m64","VCVTTSD2SIQ xmm2/m64, r64","vcvttsd2siq xmm2/m64, r64","EVEX.LIG.F2.0F.W1 2C /r","N.S.","V","AVX512F","scale8","w,r","Y","64"
+"VCVTTSD2SI r64, xmm2/m64","VCVTTSD2SIQ xmm2/m64, r64","vcvttsd2siq xmm2/m64, r64","VEX.LIG.F2.0F.W1 2C /r","N.S.","V","AVX","","w,r","Y","64"
+"VCVTTSD2USI r32{sae}, xmm2","VCVTTSD2USIL xmm2, r32{sae}","vcvttsd2usil xmm2, r32{sae}","EVEX.128.F2.0F.W0 78 /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTTSD2USI r32, xmm2/m64","VCVTTSD2USIL xmm2/m64, r32","vcvttsd2usil xmm2/m64, r32","EVEX.LIG.F2.0F.W0 78 /r","V","V","AVX512F","scale8","w,r","Y","32"
+"VCVTTSD2USI r64{sae}, xmm2","VCVTTSD2USIQ xmm2, r64{sae}","vcvttsd2usiq xmm2, r64{sae}","EVEX.128.F2.0F.W1 78 /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTTSD2USI r64, xmm2/m64","VCVTTSD2USIQ xmm2/m64, r64","vcvttsd2usiq xmm2/m64, r64","EVEX.LIG.F2.0F.W1 78 /r","N.S.","V","AVX512F","scale8","w,r","Y","64"
+"VCVTTSS2SI r32{sae}, xmm2","VCVTTSS2SI xmm2, r32{sae}","vcvttss2si xmm2, r32{sae}","EVEX.128.F3.0F.W0 2C /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTTSS2SI r32, xmm2/m32","VCVTTSS2SI xmm2/m32, r32","vcvttss2si xmm2/m32, r32","EVEX.LIG.F3.0F.W0 2C /r","V","V","AVX512F","scale4","w,r","Y","32"
+"VCVTTSS2SI r32, xmm2/m32","VCVTTSS2SI xmm2/m32, r32","vcvttss2si xmm2/m32, r32","VEX.LIG.F3.0F.W0 2C /r","V","V","AVX","","w,r","Y","32"
+"VCVTTSS2SI r64{sae}, xmm2","VCVTTSS2SIQ xmm2, r64{sae}","vcvttss2siq xmm2, r64{sae}","EVEX.128.F3.0F.W1 2C /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTTSS2SI r64, xmm2/m32","VCVTTSS2SIQ xmm2/m32, r64","vcvttss2siq xmm2/m32, r64","EVEX.LIG.F3.0F.W1 2C /r","N.S.","V","AVX512F","scale4","w,r","Y","64"
+"VCVTTSS2SI r64, xmm2/m32","VCVTTSS2SIQ xmm2/m32, r64","vcvttss2siq xmm2/m32, r64","VEX.LIG.F3.0F.W1 2C /r","N.S.","V","AVX","","w,r","Y","64"
+"VCVTTSS2USI r32{sae}, xmm2","VCVTTSS2USIL xmm2, r32{sae}","vcvttss2usil xmm2, r32{sae}","EVEX.128.F3.0F.W0 78 /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTTSS2USI r32, xmm2/m32","VCVTTSS2USIL xmm2/m32, r32","vcvttss2usil xmm2/m32, r32","EVEX.LIG.F3.0F.W0 78 /r","V","V","AVX512F","scale4","w,r","Y","32"
+"VCVTTSS2USI r64{sae}, xmm2","VCVTTSS2USIQ xmm2, r64{sae}","vcvttss2usiq xmm2, r64{sae}","EVEX.128.F3.0F.W1 78 /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTTSS2USI r64, xmm2/m32","VCVTTSS2USIQ xmm2/m32, r64","vcvttss2usiq xmm2/m32, r64","EVEX.LIG.F3.0F.W1 78 /r","N.S.","V","AVX512F","scale4","w,r","Y","64"
+"VCVTUDQ2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTUDQ2PD xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtudq2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTUDQ2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTUDQ2PD xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtudq2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTUDQ2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTUDQ2PD ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtudq2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W0 7A /r","V","V","AVX512F","bscale4,scale32","w,r,r","",""
+"VCVTUDQ2PS xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTUDQ2PS xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtudq2ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F2.0F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTUDQ2PS ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTUDQ2PS ymm2/m256/m32bcst, {k}{z}, ymm1","vcvtudq2ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F2.0F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VCVTUDQ2PS zmm1{er}, {k}{z}, zmm2","VCVTUDQ2PS zmm2, {k}{z}, zmm1{er}","vcvtudq2ps zmm2, {k}{z}, zmm1{er}","EVEX.512.F2.0F.W0 7A /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTUDQ2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTUDQ2PS zmm2/m512/m32bcst, {k}{z}, zmm1","vcvtudq2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F2.0F.W0 7A /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VCVTUQQ2PD xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTUQQ2PD xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtuqq2pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VCVTUQQ2PD ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTUQQ2PD ymm2/m256/m64bcst, {k}{z}, ymm1","vcvtuqq2pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VCVTUQQ2PD zmm1{er}, {k}{z}, zmm2","VCVTUQQ2PD zmm2, {k}{z}, zmm1{er}","vcvtuqq2pd zmm2, {k}{z}, zmm1{er}","EVEX.512.F3.0F.W1 7A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTUQQ2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTUQQ2PD zmm2/m512/m64bcst, {k}{z}, zmm1","vcvtuqq2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
+"VCVTUQQ2PS ymm1{er}, {k}{z}, zmm2","VCVTUQQ2PS zmm2, {k}{z}, ymm1{er}","vcvtuqq2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.F2.0F.W1 7A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","Y",""
+"VCVTUQQ2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTUQQ2PS zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtuqq2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.F2.0F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","Y","512"
+"VCVTUQQ2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTUQQ2PSX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtuqq2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F2.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTUQQ2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTUQQ2PSY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtuqq2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.F2.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTUSI2SD xmm1, xmmV, r/m32","VCVTUSI2SDL r/m32, xmmV, xmm1","vcvtusi2sd r/m32, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W0 7B /r","V","V","AVX512F","scale4","w,r,r","Y","32"
+"VCVTUSI2SD xmm1, xmmV, r/m64","VCVTUSI2SDQ r/m64, xmmV, xmm1","vcvtusi2sd r/m64, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W1 7B /r","N.S.","V","AVX512F","scale8","w,r,r","Y","64"
+"VCVTUSI2SD xmm1{er}, xmmV, rmr64","VCVTUSI2SDQ rmr64, xmmV, xmm1{er}","vcvtusi2sd rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F2.0F.W1 7B /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","Y","64"
+"VCVTUSI2SS xmm1, xmmV, r/m32","VCVTUSI2SSL r/m32, xmmV, xmm1","vcvtusi2ssl r/m32, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W0 7B /r","V","V","AVX512F","scale4","w,r,r","Y","32"
+"VCVTUSI2SS xmm1{er}, xmmV, rmr32","VCVTUSI2SSL rmr32, xmmV, xmm1{er}","vcvtusi2ssl rmr32, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W0 7B /r","V","V","AVX512F","modrm_regonly","w,r,r","Y","32"
+"VCVTUSI2SS xmm1, xmmV, r/m64","VCVTUSI2SSQ r/m64, xmmV, xmm1","vcvtusi2ssq r/m64, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W1 7B /r","N.S.","V","AVX512F","scale8","w,r,r","Y","64"
+"VCVTUSI2SS xmm1{er}, xmmV, rmr64","VCVTUSI2SSQ rmr64, xmmV, xmm1{er}","vcvtusi2ssq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W1 7B /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","Y","64"
+"VDBPSADBW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VDBPSADBW imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","vdbpsadbw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 42 /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
+"VDBPSADBW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VDBPSADBW imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","vdbpsadbw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 42 /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
+"VDBPSADBW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VDBPSADBW imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","vdbpsadbw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 42 /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
+"VDIVPD xmm1, xmmV, xmm2/m128","VDIVPD xmm2/m128, xmmV, xmm1","vdivpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
+"VDIVPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VDIVPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vdivpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 5E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VDIVPD ymm1, ymmV, ymm2/m256","VDIVPD ymm2/m256, ymmV, ymm1","vdivpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
+"VDIVPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VDIVPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vdivpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 5E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VDIVPD zmm1{er}, {k}{z}, zmmV, zmm2","VDIVPD zmm2, zmmV, {k}{z}, zmm1{er}","vdivpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 5E /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VDIVPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VDIVPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vdivpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 5E /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VDIVPS xmm1, xmmV, xmm2/m128","VDIVPS xmm2/m128, xmmV, xmm1","vdivps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
+"VDIVPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VDIVPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vdivps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 5E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VDIVPS ymm1, ymmV, ymm2/m256","VDIVPS ymm2/m256, ymmV, ymm1","vdivps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
+"VDIVPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VDIVPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vdivps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 5E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VDIVPS zmm1{er}, {k}{z}, zmmV, zmm2","VDIVPS zmm2, zmmV, {k}{z}, zmm1{er}","vdivps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 5E /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VDIVPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VDIVPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vdivps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 5E /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VDIVSD xmm1{er}, {k}{z}, xmmV, xmm2","VDIVSD xmm2, xmmV, {k}{z}, xmm1{er}","vdivsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5E /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VDIVSD xmm1, xmmV, xmm2/m64","VDIVSD xmm2/m64, xmmV, xmm1","vdivsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
+"VDIVSD xmm1, {k}{z}, xmmV, xmm2/m64","VDIVSD xmm2/m64, xmmV, {k}{z}, xmm1","vdivsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5E /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VDIVSS xmm1{er}, {k}{z}, xmmV, xmm2","VDIVSS xmm2, xmmV, {k}{z}, xmm1{er}","vdivss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 5E /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VDIVSS xmm1, xmmV, xmm2/m32","VDIVSS xmm2/m32, xmmV, xmm1","vdivss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
+"VDIVSS xmm1, {k}{z}, xmmV, xmm2/m32","VDIVSS xmm2/m32, xmmV, {k}{z}, xmm1","vdivss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5E /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VDPPD xmm1, xmmV, xmm2/m128, imm8u","VDPPD imm8u, xmm2/m128, xmmV, xmm1","vdppd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 41 /r ib","V","V","AVX","","w,r,r,r","",""
+"VDPPS xmm1, xmmV, xmm2/m128, imm8u","VDPPS imm8u, xmm2/m128, xmmV, xmm1","vdpps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 40 /r ib","V","V","AVX","","w,r,r,r","",""
+"VDPPS ymm1, ymmV, ymm2/m256, imm8u","VDPPS imm8u, ymm2/m256, ymmV, ymm1","vdpps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 40 /r ib","V","V","AVX","","w,r,r,r","",""
+"VERR r/m16","VERR r/m16","verr r/m16","0F 00 /4","V","V","","","r","",""
+"VERW r/m16","VERW r/m16","verw r/m16","0F 00 /5","V","V","","","r","",""
+"VEXP2PD zmm1{sae}, {k}{z}, zmm2","VEXP2PD zmm2, {k}{z}, zmm1{sae}","vexp2pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 C8 /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
+"VEXP2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VEXP2PD zmm2/m512/m64bcst, {k}{z}, zmm1","vexp2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 C8 /r","V","V","AVX512ER","bscale8,scale64","w,r,r","",""
+"VEXP2PS zmm1{sae}, {k}{z}, zmm2","VEXP2PS zmm2, {k}{z}, zmm1{sae}","vexp2ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 C8 /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
+"VEXP2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VEXP2PS zmm2/m512/m32bcst, {k}{z}, zmm1","vexp2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 C8 /r","V","V","AVX512ER","bscale4,scale64","w,r,r","",""
+"VEXPANDPD xmm1, {k}{z}, xmm2/m128","VEXPANDPD xmm2/m128, {k}{z}, xmm1","vexpandpd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 88 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VEXPANDPD ymm1, {k}{z}, ymm2/m256","VEXPANDPD ymm2/m256, {k}{z}, ymm1","vexpandpd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 88 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VEXPANDPD zmm1, {k}{z}, zmm2/m512","VEXPANDPD zmm2/m512, {k}{z}, zmm1","vexpandpd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 88 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VEXPANDPS xmm1, {k}{z}, xmm2/m128","VEXPANDPS xmm2/m128, {k}{z}, xmm1","vexpandps xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 88 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VEXPANDPS ymm1, {k}{z}, ymm2/m256","VEXPANDPS ymm2/m256, {k}{z}, ymm1","vexpandps ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 88 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VEXPANDPS zmm1, {k}{z}, zmm2/m512","VEXPANDPS zmm2/m512, {k}{z}, zmm1","vexpandps zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 88 /r","V","V","AVX512F","scale4","w,r,r","",""
+"VEXTRACTF128 xmm2/m128, ymm1, imm8u:1","VEXTRACTF128 imm8u:1, ymm1, xmm2/m128","vextractf128 imm8u:1, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 19 /r ib","V","V","AVX","","w,r,r","",""
+"VEXTRACTF32X4 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTF32X4 imm8u:1, ymm1, {k}{z}, xmm2/m128","vextractf32x4 imm8u:1, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W0 19 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VEXTRACTF32X4 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTF32X4 imm8u:2, zmm1, {k}{z}, xmm2/m128","vextractf32x4 imm8u:2, zmm1, {k}{z}, xmm2/m128","EVEX.512.66.0F3A.W0 19 /r ib","V","V","AVX512F","scale16","w,r,r,r","",""
+"VEXTRACTF32X8 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTF32X8 imm8u:1, zmm1, {k}{z}, ymm2/m256","vextractf32x8 imm8u:1, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W0 1B /r ib","V","V","AVX512DQ","scale32","w,r,r,r","",""
+"VEXTRACTF64X2 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTF64X2 imm8u:1, ymm1, {k}{z}, xmm2/m128","vextractf64x2 imm8u:1, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W1 19 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r","",""
+"VEXTRACTF64X2 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTF64X2 imm8u:2, zmm1, {k}{z}, xmm2/m128","vextractf64x2 imm8u:2, zmm1, {k}{z}, xmm2/m128","EVEX.512.66.0F3A.W1 19 /r ib","V","V","AVX512DQ","scale16","w,r,r,r","",""
+"VEXTRACTF64X4 ymm2/m256, {k}{z}, zmm1, imm8u","VEXTRACTF64X4 imm8u, zmm1, {k}{z}, ymm2/m256","vextractf64x4 imm8u, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W1 1B /r ib","V","V","AVX512F","scale32","w,r,r,r","",""
+"VEXTRACTI128 xmm2/m128, ymm1, imm8u:1","VEXTRACTI128 imm8u:1, ymm1, xmm2/m128","vextracti128 imm8u:1, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 39 /r ib","V","V","AVX2","","w,r,r","",""
+"VEXTRACTI32X4 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTI32X4 imm8u:1, ymm1, {k}{z}, xmm2/m128","vextracti32x4 imm8u:1, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W0 39 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VEXTRACTI32X4 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTI32X4 imm8u:2, zmm1, {k}{z}, xmm2/m128","vextracti32x4 imm8u:2, zmm1, {k}{z}, xmm2/m128","EVEX.512.66.0F3A.W0 39 /r ib","V","V","AVX512F","scale16","w,r,r,r","",""
+"VEXTRACTI32X8 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTI32X8 imm8u:1, zmm1, {k}{z}, ymm2/m256","vextracti32x8 imm8u:1, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W0 3B /r ib","V","V","AVX512DQ","scale32","w,r,r,r","",""
+"VEXTRACTI64X2 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTI64X2 imm8u:1, ymm1, {k}{z}, xmm2/m128","vextracti64x2 imm8u:1, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W1 39 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r","",""
+"VEXTRACTI64X2 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTI64X2 imm8u:2, zmm1, {k}{z}, xmm2/m128","vextracti64x2 imm8u:2, zmm1, {k}{z}, xmm2/m128","EVEX.512.66.0F3A.W1 39 /r ib","V","V","AVX512DQ","scale16","w,r,r,r","",""
+"VEXTRACTI64X4 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTI64X4 imm8u:1, zmm1, {k}{z}, ymm2/m256","vextracti64x4 imm8u:1, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W1 3B /r ib","V","V","AVX512F","scale32","w,r,r,r","",""
+"VEXTRACTPS r/m32, xmm1, imm8u:2","VEXTRACTPS imm8u:2, xmm1, r/m32","vextractps imm8u:2, xmm1, r/m32","EVEX.128.66.0F3A.WIG 17 /r ib","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VEXTRACTPS r/m32, xmm1, imm8u:2","VEXTRACTPS imm8u:2, xmm1, r/m32","vextractps imm8u:2, xmm1, r/m32","VEX.128.66.0F3A.WIG 17 /r ib","V","V","AVX","","w,r,r","",""
+"VFIXUPIMMPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VFIXUPIMMPD imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfixupimmpd imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W1 54 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r,r","",""
+"VFIXUPIMMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VFIXUPIMMPD imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfixupimmpd imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W1 54 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r,r","",""
+"VFIXUPIMMPD zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u","VFIXUPIMMPD imm8u, zmm2, zmmV, {k}{z}, zmm1{sae}","vfixupimmpd imm8u, zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.DDS.512.66.0F3A.W1 54 /r ib","V","V","AVX512F","modrm_regonly","rw,r,r,r,r","",""
+"VFIXUPIMMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VFIXUPIMMPD imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfixupimmpd imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W1 54 /r ib","V","V","AVX512F","bscale8,scale64","rw,r,r,r,r","",""
+"VFIXUPIMMPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VFIXUPIMMPS imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfixupimmps imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W0 54 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r,r","",""
+"VFIXUPIMMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VFIXUPIMMPS imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfixupimmps imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W0 54 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r,r","",""
+"VFIXUPIMMPS zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u","VFIXUPIMMPS imm8u, zmm2, zmmV, {k}{z}, zmm1{sae}","vfixupimmps imm8u, zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.DDS.512.66.0F3A.W0 54 /r ib","V","V","AVX512F","modrm_regonly","rw,r,r,r,r","",""
+"VFIXUPIMMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VFIXUPIMMPS imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfixupimmps imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W0 54 /r ib","V","V","AVX512F","bscale4,scale64","rw,r,r,r,r","",""
+"VFIXUPIMMSD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VFIXUPIMMSD imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vfixupimmsd imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.DDS.128.66.0F3A.W1 55 /r ib","V","V","AVX512F","modrm_regonly","rw,r,r,r,r","",""
+"VFIXUPIMMSD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VFIXUPIMMSD imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","vfixupimmsd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3A.W1 55 /r ib","V","V","AVX512F","scale8","rw,r,r,r,r","",""
+"VFIXUPIMMSS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VFIXUPIMMSS imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vfixupimmss imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.DDS.128.66.0F3A.W0 55 /r ib","V","V","AVX512F","modrm_regonly","rw,r,r,r,r","",""
+"VFIXUPIMMSS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VFIXUPIMMSS imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","vfixupimmss imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3A.W0 55 /r ib","V","V","AVX512F","scale4","rw,r,r,r,r","",""
+"VFMADD132PD xmm1, xmmV, xmm2/m128","VFMADD132PD xmm2/m128, xmmV, xmm1","vfmadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 98 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmadd132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 98 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMADD132PD ymm1, ymmV, ymm2/m256","VFMADD132PD ymm2/m256, ymmV, ymm1","vfmadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 98 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmadd132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 98 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 98 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmadd132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 98 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMADD132PS xmm1, xmmV, xmm2/m128","VFMADD132PS xmm2/m128, xmmV, xmm1","vfmadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 98 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmadd132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 98 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMADD132PS ymm1, ymmV, ymm2/m256","VFMADD132PS ymm2/m256, ymmV, ymm1","vfmadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 98 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmadd132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 98 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 98 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmadd132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 98 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMADD132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD132SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 99 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD132SD xmm1, xmmV, xmm2/m64","VFMADD132SD xmm2/m64, xmmV, xmm1","vfmadd132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 99 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD132SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmadd132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 99 /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFMADD132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD132SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 99 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD132SS xmm1, xmmV, xmm2/m32","VFMADD132SS xmm2/m32, xmmV, xmm1","vfmadd132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 99 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD132SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmadd132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 99 /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFMADD213PD xmm1, xmmV, xmm2/m128","VFMADD213PD xmm2/m128, xmmV, xmm1","vfmadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmadd213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 A8 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMADD213PD ymm1, ymmV, ymm2/m256","VFMADD213PD ymm2/m256, ymmV, ymm1","vfmadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmadd213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 A8 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 A8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmadd213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 A8 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMADD213PS xmm1, xmmV, xmm2/m128","VFMADD213PS xmm2/m128, xmmV, xmm1","vfmadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmadd213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 A8 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMADD213PS ymm1, ymmV, ymm2/m256","VFMADD213PS ymm2/m256, ymmV, ymm1","vfmadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmadd213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 A8 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 A8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmadd213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 A8 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMADD213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD213SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 A9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD213SD xmm1, xmmV, xmm2/m64","VFMADD213SD xmm2/m64, xmmV, xmm1","vfmadd213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 A9 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD213SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmadd213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 A9 /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFMADD213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD213SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 A9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD213SS xmm1, xmmV, xmm2/m32","VFMADD213SS xmm2/m32, xmmV, xmm1","vfmadd213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 A9 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD213SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmadd213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 A9 /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFMADD231PD xmm1, xmmV, xmm2/m128","VFMADD231PD xmm2/m128, xmmV, xmm1","vfmadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmadd231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B8 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMADD231PD ymm1, ymmV, ymm2/m256","VFMADD231PD ymm2/m256, ymmV, ymm1","vfmadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmadd231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B8 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 B8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmadd231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B8 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMADD231PS xmm1, xmmV, xmm2/m128","VFMADD231PS xmm2/m128, xmmV, xmm1","vfmadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmadd231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 B8 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMADD231PS ymm1, ymmV, ymm2/m256","VFMADD231PS ymm2/m256, ymmV, ymm1","vfmadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmadd231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 B8 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 B8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmadd231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 B8 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMADD231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD231SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 B9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD231SD xmm1, xmmV, xmm2/m64","VFMADD231SD xmm2/m64, xmmV, xmm1","vfmadd231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 B9 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD231SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmadd231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 B9 /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFMADD231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD231SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 B9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD231SS xmm1, xmmV, xmm2/m32","VFMADD231SS xmm2/m32, xmmV, xmm1","vfmadd231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 B9 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD231SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmadd231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 B9 /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFMADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFMADDPD xmm2/m128, xmmIH, xmmV, xmm1","vfmaddpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 69 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFMADDPD xmmIH, xmm2/m128, xmmV, xmm1","vfmaddpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 69 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFMADDPD ymm2/m256, ymmIH, ymmV, ymm1","vfmaddpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 69 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFMADDPD ymmIH, ymm2/m256, ymmV, ymm1","vfmaddpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 69 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFMADDPS xmm2/m128, xmmIH, xmmV, xmm1","vfmaddps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 68 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFMADDPS xmmIH, xmm2/m128, xmmV, xmm1","vfmaddps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 68 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFMADDPS ymm2/m256, ymmIH, ymmV, ymm1","vfmaddps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 68 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFMADDPS ymmIH, ymm2/m256, ymmV, ymm1","vfmaddps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 68 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSD xmm1, xmmV, xmmIH, xmm2/m64","VFMADDSD xmm2/m64, xmmIH, xmmV, xmm1","vfmaddsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6B /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSD xmm1, xmmV, xmm2/m64, xmmIH","VFMADDSD xmmIH, xmm2/m64, xmmV, xmm1","vfmaddsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6B /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSS xmm1, xmmV, xmmIH, xmm2/m32","VFMADDSS xmm2/m32, xmmIH, xmmV, xmm1","vfmaddss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6A /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSS xmm1, xmmV, xmm2/m32, xmmIH","VFMADDSS xmmIH, xmm2/m32, xmmV, xmm1","vfmaddss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6A /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUB132PD xmm1, xmmV, xmm2/m128","VFMADDSUB132PD xmm2/m128, xmmV, xmm1","vfmaddsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 96 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 96 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMADDSUB132PD ymm1, ymmV, ymm2/m256","VFMADDSUB132PD ymm2/m256, ymmV, ymm1","vfmaddsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 96 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 96 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMADDSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 96 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADDSUB132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 96 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMADDSUB132PS xmm1, xmmV, xmm2/m128","VFMADDSUB132PS xmm2/m128, xmmV, xmm1","vfmaddsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 96 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 96 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMADDSUB132PS ymm1, ymmV, ymm2/m256","VFMADDSUB132PS ymm2/m256, ymmV, ymm1","vfmaddsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 96 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 96 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMADDSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 96 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADDSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 96 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMADDSUB213PD xmm1, xmmV, xmm2/m128","VFMADDSUB213PD xmm2/m128, xmmV, xmm1","vfmaddsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 A6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMADDSUB213PD ymm1, ymmV, ymm2/m256","VFMADDSUB213PD ymm2/m256, ymmV, ymm1","vfmaddsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 A6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMADDSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 A6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADDSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 A6 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMADDSUB213PS xmm1, xmmV, xmm2/m128","VFMADDSUB213PS xmm2/m128, xmmV, xmm1","vfmaddsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 A6 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMADDSUB213PS ymm1, ymmV, ymm2/m256","VFMADDSUB213PS ymm2/m256, ymmV, ymm1","vfmaddsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 A6 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMADDSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 A6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADDSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 A6 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMADDSUB231PD xmm1, xmmV, xmm2/m128","VFMADDSUB231PD xmm2/m128, xmmV, xmm1","vfmaddsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMADDSUB231PD ymm1, ymmV, ymm2/m256","VFMADDSUB231PD ymm2/m256, ymmV, ymm1","vfmaddsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMADDSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 B6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADDSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B6 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMADDSUB231PS xmm1, xmmV, xmm2/m128","VFMADDSUB231PS xmm2/m128, xmmV, xmm1","vfmaddsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 B6 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMADDSUB231PS ymm1, ymmV, ymm2/m256","VFMADDSUB231PS ymm2/m256, ymmV, ymm1","vfmaddsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 B6 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMADDSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 B6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADDSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 B6 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMADDSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFMADDSUBPD xmm2/m128, xmmIH, xmmV, xmm1","vfmaddsubpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 5D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFMADDSUBPD xmmIH, xmm2/m128, xmmV, xmm1","vfmaddsubpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 5D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFMADDSUBPD ymm2/m256, ymmIH, ymmV, ymm1","vfmaddsubpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 5D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFMADDSUBPD ymmIH, ymm2/m256, ymmV, ymm1","vfmaddsubpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 5D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFMADDSUBPS xmm2/m128, xmmIH, xmmV, xmm1","vfmaddsubps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 5C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFMADDSUBPS xmmIH, xmm2/m128, xmmV, xmm1","vfmaddsubps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 5C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFMADDSUBPS ymm2/m256, ymmIH, ymmV, ymm1","vfmaddsubps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 5C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFMADDSUBPS ymmIH, ymm2/m256, ymmV, ymm1","vfmaddsubps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 5C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUB132PD xmm1, xmmV, xmm2/m128","VFMSUB132PD xmm2/m128, xmmV, xmm1","vfmsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9A /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsub132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 9A /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMSUB132PD ymm1, ymmV, ymm2/m256","VFMSUB132PD ymm2/m256, ymmV, ymm1","vfmsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9A /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsub132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 9A /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmsub132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 9A /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsub132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 9A /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMSUB132PS xmm1, xmmV, xmm2/m128","VFMSUB132PS xmm2/m128, xmmV, xmm1","vfmsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9A /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsub132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 9A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMSUB132PS ymm1, ymmV, ymm2/m256","VFMSUB132PS ymm2/m256, ymmV, ymm1","vfmsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9A /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsub132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 9A /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmsub132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 9A /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsub132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 9A /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMSUB132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB132SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmsub132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 9B /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB132SD xmm1, xmmV, xmm2/m64","VFMSUB132SD xmm2/m64, xmmV, xmm1","vfmsub132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9B /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB132SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmsub132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 9B /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFMSUB132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB132SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmsub132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 9B /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB132SS xmm1, xmmV, xmm2/m32","VFMSUB132SS xmm2/m32, xmmV, xmm1","vfmsub132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9B /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB132SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmsub132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 9B /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFMSUB213PD xmm1, xmmV, xmm2/m128","VFMSUB213PD xmm2/m128, xmmV, xmm1","vfmsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsub213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 AA /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMSUB213PD ymm1, ymmV, ymm2/m256","VFMSUB213PD ymm2/m256, ymmV, ymm1","vfmsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsub213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 AA /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmsub213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 AA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsub213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 AA /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMSUB213PS xmm1, xmmV, xmm2/m128","VFMSUB213PS xmm2/m128, xmmV, xmm1","vfmsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsub213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 AA /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMSUB213PS ymm1, ymmV, ymm2/m256","VFMSUB213PS ymm2/m256, ymmV, ymm1","vfmsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsub213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 AA /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmsub213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 AA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsub213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 AA /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMSUB213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB213SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmsub213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 AB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB213SD xmm1, xmmV, xmm2/m64","VFMSUB213SD xmm2/m64, xmmV, xmm1","vfmsub213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AB /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB213SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmsub213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 AB /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFMSUB213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB213SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmsub213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 AB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB213SS xmm1, xmmV, xmm2/m32","VFMSUB213SS xmm2/m32, xmmV, xmm1","vfmsub213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AB /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB213SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmsub213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 AB /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFMSUB231PD xmm1, xmmV, xmm2/m128","VFMSUB231PD xmm2/m128, xmmV, xmm1","vfmsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsub231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 BA /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMSUB231PD ymm1, ymmV, ymm2/m256","VFMSUB231PD ymm2/m256, ymmV, ymm1","vfmsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsub231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 BA /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmsub231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 BA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsub231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 BA /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMSUB231PS xmm1, xmmV, xmm2/m128","VFMSUB231PS xmm2/m128, xmmV, xmm1","vfmsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsub231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 BA /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMSUB231PS ymm1, ymmV, ymm2/m256","VFMSUB231PS ymm2/m256, ymmV, ymm1","vfmsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsub231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 BA /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmsub231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 BA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsub231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 BA /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMSUB231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB231SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmsub231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 BB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB231SD xmm1, xmmV, xmm2/m64","VFMSUB231SD xmm2/m64, xmmV, xmm1","vfmsub231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BB /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB231SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmsub231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 BB /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFMSUB231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB231SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmsub231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 BB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB231SS xmm1, xmmV, xmm2/m32","VFMSUB231SS xmm2/m32, xmmV, xmm1","vfmsub231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BB /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB231SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmsub231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 BB /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFMSUBADD132PD xmm1, xmmV, xmm2/m128","VFMSUBADD132PD xmm2/m128, xmmV, xmm1","vfmsubadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 97 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUBADD132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsubadd132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 97 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMSUBADD132PD ymm1, ymmV, ymm2/m256","VFMSUBADD132PD ymm2/m256, ymmV, ymm1","vfmsubadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 97 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUBADD132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsubadd132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 97 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMSUBADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmsubadd132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 97 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUBADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUBADD132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsubadd132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 97 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMSUBADD132PS xmm1, xmmV, xmm2/m128","VFMSUBADD132PS xmm2/m128, xmmV, xmm1","vfmsubadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 97 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUBADD132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsubadd132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 97 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMSUBADD132PS ymm1, ymmV, ymm2/m256","VFMSUBADD132PS ymm2/m256, ymmV, ymm1","vfmsubadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 97 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUBADD132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsubadd132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 97 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMSUBADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmsubadd132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 97 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUBADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUBADD132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsubadd132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 97 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMSUBADD213PD xmm1, xmmV, xmm2/m128","VFMSUBADD213PD xmm2/m128, xmmV, xmm1","vfmsubadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUBADD213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsubadd213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 A7 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMSUBADD213PD ymm1, ymmV, ymm2/m256","VFMSUBADD213PD ymm2/m256, ymmV, ymm1","vfmsubadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUBADD213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsubadd213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 A7 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMSUBADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmsubadd213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 A7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUBADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUBADD213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsubadd213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 A7 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMSUBADD213PS xmm1, xmmV, xmm2/m128","VFMSUBADD213PS xmm2/m128, xmmV, xmm1","vfmsubadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUBADD213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsubadd213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 A7 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMSUBADD213PS ymm1, ymmV, ymm2/m256","VFMSUBADD213PS ymm2/m256, ymmV, ymm1","vfmsubadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUBADD213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsubadd213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 A7 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMSUBADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmsubadd213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 A7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUBADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUBADD213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsubadd213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 A7 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMSUBADD231PD xmm1, xmmV, xmm2/m128","VFMSUBADD231PD xmm2/m128, xmmV, xmm1","vfmsubadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUBADD231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsubadd231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B7 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMSUBADD231PD ymm1, ymmV, ymm2/m256","VFMSUBADD231PD ymm2/m256, ymmV, ymm1","vfmsubadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUBADD231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsubadd231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B7 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMSUBADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmsubadd231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 B7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUBADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUBADD231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsubadd231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B7 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMSUBADD231PS xmm1, xmmV, xmm2/m128","VFMSUBADD231PS xmm2/m128, xmmV, xmm1","vfmsubadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUBADD231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsubadd231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 B7 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMSUBADD231PS ymm1, ymmV, ymm2/m256","VFMSUBADD231PS ymm2/m256, ymmV, ymm1","vfmsubadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUBADD231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsubadd231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 B7 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMSUBADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmsubadd231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 B7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUBADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUBADD231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsubadd231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 B7 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMSUBADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBADDPD xmm2/m128, xmmIH, xmmV, xmm1","vfmsubaddpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 5F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBADDPD xmmIH, xmm2/m128, xmmV, xmm1","vfmsubaddpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 5F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBADDPD ymm2/m256, ymmIH, ymmV, ymm1","vfmsubaddpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 5F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBADDPD ymmIH, ymm2/m256, ymmV, ymm1","vfmsubaddpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 5F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBADDPS xmm2/m128, xmmIH, xmmV, xmm1","vfmsubaddps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 5E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBADDPS xmmIH, xmm2/m128, xmmV, xmm1","vfmsubaddps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 5E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBADDPS ymm2/m256, ymmIH, ymmV, ymm1","vfmsubaddps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 5E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBADDPS ymmIH, ymm2/m256, ymmV, ymm1","vfmsubaddps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 5E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBPD xmm2/m128, xmmIH, xmmV, xmm1","vfmsubpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 6D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBPD xmmIH, xmm2/m128, xmmV, xmm1","vfmsubpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 6D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBPD ymm2/m256, ymmIH, ymmV, ymm1","vfmsubpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 6D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBPD ymmIH, ymm2/m256, ymmV, ymm1","vfmsubpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 6D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBPS xmm2/m128, xmmIH, xmmV, xmm1","vfmsubps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 6C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBPS xmmIH, xmm2/m128, xmmV, xmm1","vfmsubps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 6C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBPS ymm2/m256, ymmIH, ymmV, ymm1","vfmsubps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 6C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBPS ymmIH, ymm2/m256, ymmV, ymm1","vfmsubps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 6C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBSD xmm1, xmmV, xmmIH, xmm2/m64","VFMSUBSD xmm2/m64, xmmIH, xmmV, xmm1","vfmsubsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBSD xmm1, xmmV, xmm2/m64, xmmIH","VFMSUBSD xmmIH, xmm2/m64, xmmV, xmm1","vfmsubsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBSS xmm1, xmmV, xmmIH, xmm2/m32","VFMSUBSS xmm2/m32, xmmIH, xmmV, xmm1","vfmsubss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBSS xmm1, xmmV, xmm2/m32, xmmIH","VFMSUBSS xmmIH, xmm2/m32, xmmV, xmm1","vfmsubss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADD132PD xmm1, xmmV, xmm2/m128","VFNMADD132PD xmm2/m128, xmmV, xmm1","vfnmadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9C /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 9C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFNMADD132PD ymm1, ymmV, ymm2/m256","VFNMADD132PD ymm2/m256, ymmV, ymm1","vfnmadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9C /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 9C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFNMADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfnmadd132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 9C /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 9C /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFNMADD132PS xmm1, xmmV, xmm2/m128","VFNMADD132PS xmm2/m128, xmmV, xmm1","vfnmadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9C /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 9C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFNMADD132PS ymm1, ymmV, ymm2/m256","VFNMADD132PS ymm2/m256, ymmV, ymm1","vfnmadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9C /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 9C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFNMADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfnmadd132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 9C /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 9C /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFNMADD132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD132SD xmm2, xmmV, {k}{z}, xmm1{er}","vfnmadd132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 9D /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD132SD xmm1, xmmV, xmm2/m64","VFNMADD132SD xmm2/m64, xmmV, xmm1","vfnmadd132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9D /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD132SD xmm2/m64, xmmV, {k}{z}, xmm1","vfnmadd132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 9D /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFNMADD132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD132SS xmm2, xmmV, {k}{z}, xmm1{er}","vfnmadd132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 9D /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD132SS xmm1, xmmV, xmm2/m32","VFNMADD132SS xmm2/m32, xmmV, xmm1","vfnmadd132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9D /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD132SS xmm2/m32, xmmV, {k}{z}, xmm1","vfnmadd132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 9D /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFNMADD213PD xmm1, xmmV, xmm2/m128","VFNMADD213PD xmm2/m128, xmmV, xmm1","vfnmadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 AC /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFNMADD213PD ymm1, ymmV, ymm2/m256","VFNMADD213PD ymm2/m256, ymmV, ymm1","vfnmadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 AC /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFNMADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfnmadd213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 AC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 AC /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFNMADD213PS xmm1, xmmV, xmm2/m128","VFNMADD213PS xmm2/m128, xmmV, xmm1","vfnmadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 AC /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFNMADD213PS ymm1, ymmV, ymm2/m256","VFNMADD213PS ymm2/m256, ymmV, ymm1","vfnmadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 AC /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFNMADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfnmadd213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 AC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 AC /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFNMADD213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD213SD xmm2, xmmV, {k}{z}, xmm1{er}","vfnmadd213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 AD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD213SD xmm1, xmmV, xmm2/m64","VFNMADD213SD xmm2/m64, xmmV, xmm1","vfnmadd213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AD /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD213SD xmm2/m64, xmmV, {k}{z}, xmm1","vfnmadd213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 AD /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFNMADD213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD213SS xmm2, xmmV, {k}{z}, xmm1{er}","vfnmadd213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 AD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD213SS xmm1, xmmV, xmm2/m32","VFNMADD213SS xmm2/m32, xmmV, xmm1","vfnmadd213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AD /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD213SS xmm2/m32, xmmV, {k}{z}, xmm1","vfnmadd213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 AD /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFNMADD231PD xmm1, xmmV, xmm2/m128","VFNMADD231PD xmm2/m128, xmmV, xmm1","vfnmadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 BC /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFNMADD231PD ymm1, ymmV, ymm2/m256","VFNMADD231PD ymm2/m256, ymmV, ymm1","vfnmadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 BC /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFNMADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfnmadd231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 BC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 BC /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFNMADD231PS xmm1, xmmV, xmm2/m128","VFNMADD231PS xmm2/m128, xmmV, xmm1","vfnmadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 BC /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFNMADD231PS ymm1, ymmV, ymm2/m256","VFNMADD231PS ymm2/m256, ymmV, ymm1","vfnmadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 BC /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFNMADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfnmadd231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 BC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 BC /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFNMADD231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD231SD xmm2, xmmV, {k}{z}, xmm1{er}","vfnmadd231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 BD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD231SD xmm1, xmmV, xmm2/m64","VFNMADD231SD xmm2/m64, xmmV, xmm1","vfnmadd231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BD /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD231SD xmm2/m64, xmmV, {k}{z}, xmm1","vfnmadd231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 BD /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFNMADD231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD231SS xmm2, xmmV, {k}{z}, xmm1{er}","vfnmadd231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 BD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD231SS xmm1, xmmV, xmm2/m32","VFNMADD231SS xmm2/m32, xmmV, xmm1","vfnmadd231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BD /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD231SS xmm2/m32, xmmV, {k}{z}, xmm1","vfnmadd231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 BD /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFNMADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFNMADDPD xmm2/m128, xmmIH, xmmV, xmm1","vfnmaddpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 79 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFNMADDPD xmmIH, xmm2/m128, xmmV, xmm1","vfnmaddpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 79 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFNMADDPD ymm2/m256, ymmIH, ymmV, ymm1","vfnmaddpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 79 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFNMADDPD ymmIH, ymm2/m256, ymmV, ymm1","vfnmaddpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 79 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFNMADDPS xmm2/m128, xmmIH, xmmV, xmm1","vfnmaddps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 78 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFNMADDPS xmmIH, xmm2/m128, xmmV, xmm1","vfnmaddps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 78 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFNMADDPS ymm2/m256, ymmIH, ymmV, ymm1","vfnmaddps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 78 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFNMADDPS ymmIH, ymm2/m256, ymmV, ymm1","vfnmaddps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 78 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDSD xmm1, xmmV, xmmIH, xmm2/m64","VFNMADDSD xmm2/m64, xmmIH, xmmV, xmm1","vfnmaddsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7B /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDSD xmm1, xmmV, xmm2/m64, xmmIH","VFNMADDSD xmmIH, xmm2/m64, xmmV, xmm1","vfnmaddsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7B /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDSS xmm1, xmmV, xmmIH, xmm2/m32","VFNMADDSS xmm2/m32, xmmIH, xmmV, xmm1","vfnmaddss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7A /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDSS xmm1, xmmV, xmm2/m32, xmmIH","VFNMADDSS xmmIH, xmm2/m32, xmmV, xmm1","vfnmaddss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7A /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUB132PD xmm1, xmmV, xmm2/m128","VFNMSUB132PD xmm2/m128, xmmV, xmm1","vfnmsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9E /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 9E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFNMSUB132PD ymm1, ymmV, ymm2/m256","VFNMSUB132PD ymm2/m256, ymmV, ymm1","vfnmsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9E /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 9E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFNMSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfnmsub132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 9E /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 9E /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFNMSUB132PS xmm1, xmmV, xmm2/m128","VFNMSUB132PS xmm2/m128, xmmV, xmm1","vfnmsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9E /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 9E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFNMSUB132PS ymm1, ymmV, ymm2/m256","VFNMSUB132PS ymm2/m256, ymmV, ymm1","vfnmsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9E /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 9E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFNMSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfnmsub132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 9E /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 9E /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFNMSUB132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB132SD xmm2, xmmV, {k}{z}, xmm1{er}","vfnmsub132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 9F /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB132SD xmm1, xmmV, xmm2/m64","VFNMSUB132SD xmm2/m64, xmmV, xmm1","vfnmsub132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9F /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB132SD xmm2/m64, xmmV, {k}{z}, xmm1","vfnmsub132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 9F /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFNMSUB132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB132SS xmm2, xmmV, {k}{z}, xmm1{er}","vfnmsub132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 9F /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB132SS xmm1, xmmV, xmm2/m32","VFNMSUB132SS xmm2/m32, xmmV, xmm1","vfnmsub132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9F /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB132SS xmm2/m32, xmmV, {k}{z}, xmm1","vfnmsub132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 9F /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFNMSUB213PD xmm1, xmmV, xmm2/m128","VFNMSUB213PD xmm2/m128, xmmV, xmm1","vfnmsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 AE /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFNMSUB213PD ymm1, ymmV, ymm2/m256","VFNMSUB213PD ymm2/m256, ymmV, ymm1","vfnmsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 AE /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFNMSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfnmsub213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 AE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 AE /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFNMSUB213PS xmm1, xmmV, xmm2/m128","VFNMSUB213PS xmm2/m128, xmmV, xmm1","vfnmsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 AE /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFNMSUB213PS ymm1, ymmV, ymm2/m256","VFNMSUB213PS ymm2/m256, ymmV, ymm1","vfnmsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 AE /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFNMSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfnmsub213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 AE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 AE /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFNMSUB213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB213SD xmm2, xmmV, {k}{z}, xmm1{er}","vfnmsub213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 AF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB213SD xmm1, xmmV, xmm2/m64","VFNMSUB213SD xmm2/m64, xmmV, xmm1","vfnmsub213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AF /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB213SD xmm2/m64, xmmV, {k}{z}, xmm1","vfnmsub213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 AF /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFNMSUB213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB213SS xmm2, xmmV, {k}{z}, xmm1{er}","vfnmsub213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 AF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB213SS xmm1, xmmV, xmm2/m32","VFNMSUB213SS xmm2/m32, xmmV, xmm1","vfnmsub213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AF /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB213SS xmm2/m32, xmmV, {k}{z}, xmm1","vfnmsub213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 AF /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFNMSUB231PD xmm1, xmmV, xmm2/m128","VFNMSUB231PD xmm2/m128, xmmV, xmm1","vfnmsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 BE /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFNMSUB231PD ymm1, ymmV, ymm2/m256","VFNMSUB231PD ymm2/m256, ymmV, ymm1","vfnmsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 BE /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFNMSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfnmsub231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 BE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 BE /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFNMSUB231PS xmm1, xmmV, xmm2/m128","VFNMSUB231PS xmm2/m128, xmmV, xmm1","vfnmsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 BE /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFNMSUB231PS ymm1, ymmV, ymm2/m256","VFNMSUB231PS ymm2/m256, ymmV, ymm1","vfnmsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 BE /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFNMSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfnmsub231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 BE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 BE /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFNMSUB231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB231SD xmm2, xmmV, {k}{z}, xmm1{er}","vfnmsub231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 BF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB231SD xmm1, xmmV, xmm2/m64","VFNMSUB231SD xmm2/m64, xmmV, xmm1","vfnmsub231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BF /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB231SD xmm2/m64, xmmV, {k}{z}, xmm1","vfnmsub231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 BF /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFNMSUB231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB231SS xmm2, xmmV, {k}{z}, xmm1{er}","vfnmsub231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 BF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB231SS xmm1, xmmV, xmm2/m32","VFNMSUB231SS xmm2/m32, xmmV, xmm1","vfnmsub231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BF /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB231SS xmm2/m32, xmmV, {k}{z}, xmm1","vfnmsub231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 BF /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFNMSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFNMSUBPD xmm2/m128, xmmIH, xmmV, xmm1","vfnmsubpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 7D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFNMSUBPD xmmIH, xmm2/m128, xmmV, xmm1","vfnmsubpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 7D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFNMSUBPD ymm2/m256, ymmIH, ymmV, ymm1","vfnmsubpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 7D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFNMSUBPD ymmIH, ymm2/m256, ymmV, ymm1","vfnmsubpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 7D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFNMSUBPS xmm2/m128, xmmIH, xmmV, xmm1","vfnmsubps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 7C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFNMSUBPS xmmIH, xmm2/m128, xmmV, xmm1","vfnmsubps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 7C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFNMSUBPS ymm2/m256, ymmIH, ymmV, ymm1","vfnmsubps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 7C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFNMSUBPS ymmIH, ymm2/m256, ymmV, ymm1","vfnmsubps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 7C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBSD xmm1, xmmV, xmmIH, xmm2/m64","VFNMSUBSD xmm2/m64, xmmIH, xmmV, xmm1","vfnmsubsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBSD xmm1, xmmV, xmm2/m64, xmmIH","VFNMSUBSD xmmIH, xmm2/m64, xmmV, xmm1","vfnmsubsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBSS xmm1, xmmV, xmmIH, xmm2/m32","VFNMSUBSS xmm2/m32, xmmIH, xmmV, xmm1","vfnmsubss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBSS xmm1, xmmV, xmm2/m32, xmmIH","VFNMSUBSS xmmIH, xmm2/m32, xmmV, xmm1","vfnmsubss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFPCLASSPD k1, {k}, xmm2/m128/m64bcst, imm8u","VFPCLASSPDX imm8u, xmm2/m128/m64bcst, {k}, k1","vfpclasspdx imm8u, xmm2/m128/m64bcst, {k}, k1","EVEX.128.66.0F3A.W1 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","Y","128"
+"VFPCLASSPD k1, {k}, ymm2/m256/m64bcst, imm8u","VFPCLASSPDY imm8u, ymm2/m256/m64bcst, {k}, k1","vfpclasspdy imm8u, ymm2/m256/m64bcst, {k}, k1","EVEX.256.66.0F3A.W1 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","Y","256"
+"VFPCLASSPD k1, {k}, zmm2/m512/m64bcst, imm8u","VFPCLASSPDZ imm8u, zmm2/m512/m64bcst, {k}, k1","vfpclasspdz imm8u, zmm2/m512/m64bcst, {k}, k1","EVEX.512.66.0F3A.W1 66 /r ib","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","Y","512"
+"VFPCLASSPS k1, {k}, xmm2/m128/m32bcst, imm8u","VFPCLASSPSX imm8u, xmm2/m128/m32bcst, {k}, k1","vfpclasspsx imm8u, xmm2/m128/m32bcst, {k}, k1","EVEX.128.66.0F3A.W0 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","Y","128"
+"VFPCLASSPS k1, {k}, ymm2/m256/m32bcst, imm8u","VFPCLASSPSY imm8u, ymm2/m256/m32bcst, {k}, k1","vfpclasspsy imm8u, ymm2/m256/m32bcst, {k}, k1","EVEX.256.66.0F3A.W0 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","Y","256"
+"VFPCLASSPS k1, {k}, zmm2/m512/m32bcst, imm8u","VFPCLASSPSZ imm8u, zmm2/m512/m32bcst, {k}, k1","vfpclasspsz imm8u, zmm2/m512/m32bcst, {k}, k1","EVEX.512.66.0F3A.W0 66 /r ib","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","Y","512"
+"VFPCLASSSD k1, {k}, xmm2/m64, imm8u","VFPCLASSSD imm8u, xmm2/m64, {k}, k1","vfpclasssd imm8u, xmm2/m64, {k}, k1","EVEX.LIG.66.0F3A.W1 67 /r ib","V","V","AVX512DQ","scale8","w,r,r,r","",""
+"VFPCLASSSS k1, {k}, xmm2/m32, imm8u","VFPCLASSSS imm8u, xmm2/m32, {k}, k1","vfpclassss imm8u, xmm2/m32, {k}, k1","EVEX.LIG.66.0F3A.W0 67 /r ib","V","V","AVX512DQ","scale4","w,r,r,r","",""
+"VFRCZPD xmm1, xmm2/m128","VFRCZPD xmm2/m128, xmm1","vfrczpd xmm2/m128, xmm1","XOP.128.09.W0 81 /r","V","V","XOP","amd","w,r","",""
+"VFRCZPD ymm1, ymm2/m256","VFRCZPD ymm2/m256, ymm1","vfrczpd ymm2/m256, ymm1","XOP.256.09.W0 81 /r","V","V","XOP","amd","w,r","",""
+"VFRCZPS xmm1, xmm2/m128","VFRCZPS xmm2/m128, xmm1","vfrczps xmm2/m128, xmm1","XOP.128.09.W0 80 /r","V","V","XOP","amd","w,r","",""
+"VFRCZPS ymm1, ymm2/m256","VFRCZPS ymm2/m256, ymm1","vfrczps ymm2/m256, ymm1","XOP.256.09.W0 80 /r","V","V","XOP","amd","w,r","",""
+"VFRCZSD xmm1, xmm2/m64","VFRCZSD xmm2/m64, xmm1","vfrczsd xmm2/m64, xmm1","XOP.128.09.W0 83 /r","V","V","XOP","amd","w,r","",""
+"VFRCZSS xmm1, xmm2/m32","VFRCZSS xmm2/m32, xmm1","vfrczss xmm2/m32, xmm1","XOP.128.09.W0 82 /r","V","V","XOP","amd","w,r","",""
+"VGATHERDPD xmm1, {k1-k7}, vm32x","VGATHERDPD vm32x, {k1-k7}, xmm1","vgatherdpd vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 92 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VGATHERDPD ymm1, {k1-k7}, vm32x","VGATHERDPD vm32x, {k1-k7}, ymm1","vgatherdpd vm32x, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 92 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VGATHERDPD zmm1, {k1-k7}, vm32y","VGATHERDPD vm32y, {k1-k7}, zmm1","vgatherdpd vm32y, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 92 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VGATHERDPD xmm1, vm32x, xmmV","VGATHERDPD xmmV, vm32x, xmm1","vgatherdpd xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W1 92 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERDPD ymm1, vm32x, ymmV","VGATHERDPD ymmV, vm32x, ymm1","vgatherdpd ymmV, vm32x, ymm1","VEX.DDS.256.66.0F38.W1 92 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERDPS xmm1, {k1-k7}, vm32x","VGATHERDPS vm32x, {k1-k7}, xmm1","vgatherdps vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 92 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VGATHERDPS ymm1, {k1-k7}, vm32y","VGATHERDPS vm32y, {k1-k7}, ymm1","vgatherdps vm32y, {k1-k7}, ymm1","EVEX.256.66.0F38.W0 92 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VGATHERDPS zmm1, {k1-k7}, vm32z","VGATHERDPS vm32z, {k1-k7}, zmm1","vgatherdps vm32z, {k1-k7}, zmm1","EVEX.512.66.0F38.W0 92 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VGATHERDPS xmm1, vm32x, xmmV","VGATHERDPS xmmV, vm32x, xmm1","vgatherdps xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W0 92 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERDPS ymm1, vm32y, ymmV","VGATHERDPS ymmV, vm32y, ymm1","vgatherdps ymmV, vm32y, ymm1","VEX.DDS.256.66.0F38.W0 92 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERPF0DPD vm32y, {k1-k7}","VGATHERPF0DPD {k1-k7}, vm32y","vgatherpf0dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /1","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VGATHERPF0DPS vm32z, {k1-k7}","VGATHERPF0DPS {k1-k7}, vm32z","vgatherpf0dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /1","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VGATHERPF0QPD vm64z, {k1-k7}","VGATHERPF0QPD {k1-k7}, vm64z","vgatherpf0qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /1","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VGATHERPF0QPS vm64z, {k1-k7}","VGATHERPF0QPS {k1-k7}, vm64z","vgatherpf0qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /1","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VGATHERPF1DPD vm32y, {k1-k7}","VGATHERPF1DPD {k1-k7}, vm32y","vgatherpf1dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /2","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VGATHERPF1DPS vm32z, {k1-k7}","VGATHERPF1DPS {k1-k7}, vm32z","vgatherpf1dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /2","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VGATHERPF1QPD vm64z, {k1-k7}","VGATHERPF1QPD {k1-k7}, vm64z","vgatherpf1qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /2","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VGATHERPF1QPS vm64z, {k1-k7}","VGATHERPF1QPS {k1-k7}, vm64z","vgatherpf1qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /2","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VGATHERQPD xmm1, {k1-k7}, vm64x","VGATHERQPD vm64x, {k1-k7}, xmm1","vgatherqpd vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 93 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VGATHERQPD ymm1, {k1-k7}, vm64y","VGATHERQPD vm64y, {k1-k7}, ymm1","vgatherqpd vm64y, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 93 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VGATHERQPD zmm1, {k1-k7}, vm64z","VGATHERQPD vm64z, {k1-k7}, zmm1","vgatherqpd vm64z, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 93 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VGATHERQPD xmm1, vm64x, xmmV","VGATHERQPD xmmV, vm64x, xmm1","vgatherqpd xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W1 93 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERQPD ymm1, vm64y, ymmV","VGATHERQPD ymmV, vm64y, ymm1","vgatherqpd ymmV, vm64y, ymm1","VEX.DDS.256.66.0F38.W1 93 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERQPS xmm1, {k1-k7}, vm64x","VGATHERQPS vm64x, {k1-k7}, xmm1","vgatherqps vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 93 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VGATHERQPS xmm1, {k1-k7}, vm64y","VGATHERQPS vm64y, {k1-k7}, xmm1","vgatherqps vm64y, {k1-k7}, xmm1","EVEX.256.66.0F38.W0 93 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VGATHERQPS ymm1, {k1-k7}, vm64z","VGATHERQPS vm64z, {k1-k7}, ymm1","vgatherqps vm64z, {k1-k7}, ymm1","EVEX.512.66.0F38.W0 93 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VGATHERQPS xmm1, vm64x, xmmV","VGATHERQPS xmmV, vm64x, xmm1","vgatherqps xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W0 93 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERQPS xmm1, vm64y, xmmV","VGATHERQPS xmmV, vm64y, xmm1","vgatherqps xmmV, vm64y, xmm1","VEX.DDS.256.66.0F38.W0 93 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGETEXPPD xmm1, {k}{z}, xmm2/m128/m64bcst","VGETEXPPD xmm2/m128/m64bcst, {k}{z}, xmm1","vgetexppd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 42 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
+"VGETEXPPD ymm1, {k}{z}, ymm2/m256/m64bcst","VGETEXPPD ymm2/m256/m64bcst, {k}{z}, ymm1","vgetexppd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 42 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
+"VGETEXPPD zmm1{sae}, {k}{z}, zmm2","VGETEXPPD zmm2, {k}{z}, zmm1{sae}","vgetexppd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 42 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VGETEXPPD zmm1, {k}{z}, zmm2/m512/m64bcst","VGETEXPPD zmm2/m512/m64bcst, {k}{z}, zmm1","vgetexppd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 42 /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
+"VGETEXPPS xmm1, {k}{z}, xmm2/m128/m32bcst","VGETEXPPS xmm2/m128/m32bcst, {k}{z}, xmm1","vgetexpps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 42 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VGETEXPPS ymm1, {k}{z}, ymm2/m256/m32bcst","VGETEXPPS ymm2/m256/m32bcst, {k}{z}, ymm1","vgetexpps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 42 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VGETEXPPS zmm1{sae}, {k}{z}, zmm2","VGETEXPPS zmm2, {k}{z}, zmm1{sae}","vgetexpps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 42 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VGETEXPPS zmm1, {k}{z}, zmm2/m512/m32bcst","VGETEXPPS zmm2/m512/m32bcst, {k}{z}, zmm1","vgetexpps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 42 /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VGETEXPSD xmm1{sae}, {k}{z}, xmmV, xmm2","VGETEXPSD xmm2, xmmV, {k}{z}, xmm1{sae}","vgetexpsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W1 43 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VGETEXPSD xmm1, {k}{z}, xmmV, xmm2/m64","VGETEXPSD xmm2/m64, xmmV, {k}{z}, xmm1","vgetexpsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 43 /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VGETEXPSS xmm1{sae}, {k}{z}, xmmV, xmm2","VGETEXPSS xmm2, xmmV, {k}{z}, xmm1{sae}","vgetexpss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W0 43 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VGETEXPSS xmm1, {k}{z}, xmmV, xmm2/m32","VGETEXPSS xmm2/m32, xmmV, {k}{z}, xmm1","vgetexpss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 43 /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VGETMANTPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u:4","VGETMANTPD imm8u:4, xmm2/m128/m64bcst, {k}{z}, xmm1","vgetmantpd imm8u:4, xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W1 26 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VGETMANTPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u:4","VGETMANTPD imm8u:4, ymm2/m256/m64bcst, {k}{z}, ymm1","vgetmantpd imm8u:4, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 26 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VGETMANTPD zmm1{sae}, {k}{z}, zmm2, imm8u:4","VGETMANTPD imm8u:4, zmm2, {k}{z}, zmm1{sae}","vgetmantpd imm8u:4, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W1 26 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VGETMANTPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u:4","VGETMANTPD imm8u:4, zmm2/m512/m64bcst, {k}{z}, zmm1","vgetmantpd imm8u:4, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 26 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VGETMANTPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u:4","VGETMANTPS imm8u:4, xmm2/m128/m32bcst, {k}{z}, xmm1","vgetmantps imm8u:4, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W0 26 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VGETMANTPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u:4","VGETMANTPS imm8u:4, ymm2/m256/m32bcst, {k}{z}, ymm1","vgetmantps imm8u:4, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W0 26 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VGETMANTPS zmm1{sae}, {k}{z}, zmm2, imm8u:4","VGETMANTPS imm8u:4, zmm2, {k}{z}, zmm1{sae}","vgetmantps imm8u:4, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W0 26 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VGETMANTPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u:4","VGETMANTPS imm8u:4, zmm2/m512/m32bcst, {k}{z}, zmm1","vgetmantps imm8u:4, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W0 26 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VGETMANTSD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VGETMANTSD imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","vgetmantsd imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W1 27 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VGETMANTSD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u:4","VGETMANTSD imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","vgetmantsd imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W1 27 /r ib","V","V","AVX512F","scale8","w,r,r,r,r","",""
+"VGETMANTSS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VGETMANTSS imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","vgetmantss imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W0 27 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VGETMANTSS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u:4","VGETMANTSS imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","vgetmantss imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W0 27 /r ib","V","V","AVX512F","scale4","w,r,r,r,r","",""
+"VGF2P8AFFINEINVQB xmm1, xmmV, xmm2/m128, imm8u","VGF2P8AFFINEINVQB imm8u, xmm2/m128, xmmV, xmm1","vgf2p8affineinvqb imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX","","w,r,r,r","",""
+"VGF2P8AFFINEINVQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VGF2P8AFFINEINVQB imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vgf2p8affineinvqb imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VGF2P8AFFINEINVQB ymm1, ymmV, ymm2/m256, imm8u","VGF2P8AFFINEINVQB imm8u, ymm2/m256, ymmV, ymm1","vgf2p8affineinvqb imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX","","w,r,r,r","",""
+"VGF2P8AFFINEINVQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VGF2P8AFFINEINVQB imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vgf2p8affineinvqb imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VGF2P8AFFINEINVQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VGF2P8AFFINEINVQB imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vgf2p8affineinvqb imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VGF2P8AFFINEQB xmm1, xmmV, xmm2/m128, imm8u","VGF2P8AFFINEQB imm8u, xmm2/m128, xmmV, xmm1","vgf2p8affineqb imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX","","w,r,r,r","",""
+"VGF2P8AFFINEQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VGF2P8AFFINEQB imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vgf2p8affineqb imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VGF2P8AFFINEQB ymm1, ymmV, ymm2/m256, imm8u","VGF2P8AFFINEQB imm8u, ymm2/m256, ymmV, ymm1","vgf2p8affineqb imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX","","w,r,r,r","",""
+"VGF2P8AFFINEQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VGF2P8AFFINEQB imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vgf2p8affineqb imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VGF2P8AFFINEQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VGF2P8AFFINEQB imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vgf2p8affineqb imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VGF2P8MULB xmm1, xmmV, xmm2/m128","VGF2P8MULB xmm2/m128, xmmV, xmm1","vgf2p8mulb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 CF /r","V","V","GFNI+AVX","","w,r,r","",""
+"VGF2P8MULB xmm1, {k}{z}, xmmV, xmm2/m128","VGF2P8MULB xmm2/m128, xmmV, {k}{z}, xmm1","vgf2p8mulb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 CF /r","V","V","GFNI+AVX512VL","scale16","w,r,r,r","",""
+"VGF2P8MULB ymm1, ymmV, ymm2/m256","VGF2P8MULB ymm2/m256, ymmV, ymm1","vgf2p8mulb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 CF /r","V","V","GFNI+AVX","","w,r,r","",""
+"VGF2P8MULB ymm1, {k}{z}, ymmV, ymm2/m256","VGF2P8MULB ymm2/m256, ymmV, {k}{z}, ymm1","vgf2p8mulb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 CF /r","V","V","GFNI+AVX512VL","scale32","w,r,r,r","",""
+"VGF2P8MULB zmm1, {k}{z}, zmmV, zmm2/m512","VGF2P8MULB zmm2/m512, zmmV, {k}{z}, zmm1","vgf2p8mulb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 CF /r","V","V","GFNI+AVX512F","scale64","w,r,r,r","",""
+"VHADDPD xmm1, xmmV, xmm2/m128","VHADDPD xmm2/m128, xmmV, xmm1","vhaddpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 7C /r","V","V","AVX","","w,r,r","",""
+"VHADDPD ymm1, ymmV, ymm2/m256","VHADDPD ymm2/m256, ymmV, ymm1","vhaddpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 7C /r","V","V","AVX","","w,r,r","",""
+"VHADDPS xmm1, xmmV, xmm2/m128","VHADDPS xmm2/m128, xmmV, xmm1","vhaddps xmm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG 7C /r","V","V","AVX","","w,r,r","",""
+"VHADDPS ymm1, ymmV, ymm2/m256","VHADDPS ymm2/m256, ymmV, ymm1","vhaddps ymm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG 7C /r","V","V","AVX","","w,r,r","",""
+"VHSUBPD xmm1, xmmV, xmm2/m128","VHSUBPD xmm2/m128, xmmV, xmm1","vhsubpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 7D /r","V","V","AVX","","w,r,r","",""
+"VHSUBPD ymm1, ymmV, ymm2/m256","VHSUBPD ymm2/m256, ymmV, ymm1","vhsubpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 7D /r","V","V","AVX","","w,r,r","",""
+"VHSUBPS xmm1, xmmV, xmm2/m128","VHSUBPS xmm2/m128, xmmV, xmm1","vhsubps xmm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG 7D /r","V","V","AVX","","w,r,r","",""
+"VHSUBPS ymm1, ymmV, ymm2/m256","VHSUBPS ymm2/m256, ymmV, ymm1","vhsubps ymm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG 7D /r","V","V","AVX","","w,r,r","",""
+"VINSERTF128 ymm1, ymmV, xmm2/m128, imm8u:1","VINSERTF128 imm8u:1, xmm2/m128, ymmV, ymm1","vinsertf128 imm8u:1, xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 18 /r ib","V","V","AVX","","w,r,r,r","",""
+"VINSERTF32X4 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTF32X4 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","vinsertf32x4 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 18 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r,r","",""
+"VINSERTF32X4 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTF32X4 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","vinsertf32x4 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 18 /r ib","V","V","AVX512F","scale16","w,r,r,r,r","",""
+"VINSERTF32X8 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTF32X8 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","vinsertf32x8 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 1A /r ib","V","V","AVX512DQ","scale32","w,r,r,r,r","",""
+"VINSERTF64X2 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTF64X2 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","vinsertf64x2 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 18 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r,r","",""
+"VINSERTF64X2 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTF64X2 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","vinsertf64x2 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 18 /r ib","V","V","AVX512DQ","scale16","w,r,r,r,r","",""
+"VINSERTF64X4 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTF64X4 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","vinsertf64x4 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 1A /r ib","V","V","AVX512F","scale32","w,r,r,r,r","",""
+"VINSERTI128 ymm1, ymmV, xmm2/m128, imm8u:1","VINSERTI128 imm8u:1, xmm2/m128, ymmV, ymm1","vinserti128 imm8u:1, xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 38 /r ib","V","V","AVX2","","w,r,r,r","",""
+"VINSERTI32X4 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTI32X4 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","vinserti32x4 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 38 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r,r","",""
+"VINSERTI32X4 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTI32X4 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","vinserti32x4 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 38 /r ib","V","V","AVX512F","scale16","w,r,r,r,r","",""
+"VINSERTI32X8 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTI32X8 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","vinserti32x8 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 3A /r ib","V","V","AVX512DQ","scale32","w,r,r,r,r","",""
+"VINSERTI64X2 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTI64X2 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","vinserti64x2 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 38 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r,r","",""
+"VINSERTI64X2 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTI64X2 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","vinserti64x2 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 38 /r ib","V","V","AVX512DQ","scale16","w,r,r,r,r","",""
+"VINSERTI64X4 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTI64X4 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","vinserti64x4 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 3A /r ib","V","V","AVX512F","scale32","w,r,r,r,r","",""
+"VINSERTPS xmm1, xmmV, xmm2/m32, imm8u","VINSERTPS imm8u, xmm2/m32, xmmV, xmm1","vinsertps imm8u, xmm2/m32, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W0 21 /r ib","V","V","AVX512F+AVX512VL","scale4","w,r,r,r","",""
+"VINSERTPS xmm1, xmmV, xmm2/m32, imm8u","VINSERTPS imm8u, xmm2/m32, xmmV, xmm1","vinsertps imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 21 /r ib","V","V","AVX","","w,r,r,r","",""
+"VLDDQU xmm1, m128","VLDDQU m128, xmm1","vlddqu m128, xmm1","VEX.128.F2.0F.WIG F0 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VLDDQU ymm1, m256","VLDDQU m256, ymm1","vlddqu m256, ymm1","VEX.256.F2.0F.WIG F0 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VLDMXCSR m32","VLDMXCSR m32","vldmxcsr m32","VEX.128.0F.WIG AE /2","V","V","AVX","modrm_memonly","r","",""
+"VMASKMOVDQU xmm1, xmm2","VMASKMOVDQU xmm2, xmm1","vmaskmovdqu xmm2, xmm1","VEX.128.66.0F.WIG F7 /r","V","V","AVX","modrm_regonly","r,r","",""
+"VMASKMOVPD xmm1, xmmV, m128","VMASKMOVPD m128, xmmV, xmm1","vmaskmovpd m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 2D /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPD ymm1, ymmV, m256","VMASKMOVPD m256, ymmV, ymm1","vmaskmovpd m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 2D /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPD m128, xmmV, xmm1","VMASKMOVPD xmm1, xmmV, m128","vmaskmovpd xmm1, xmmV, m128","VEX.NDS.128.66.0F38.W0 2F /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPD m256, ymmV, ymm1","VMASKMOVPD ymm1, ymmV, m256","vmaskmovpd ymm1, ymmV, m256","VEX.NDS.256.66.0F38.W0 2F /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPS xmm1, xmmV, m128","VMASKMOVPS m128, xmmV, xmm1","vmaskmovps m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 2C /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPS ymm1, ymmV, m256","VMASKMOVPS m256, ymmV, ymm1","vmaskmovps m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 2C /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPS m128, xmmV, xmm1","VMASKMOVPS xmm1, xmmV, m128","vmaskmovps xmm1, xmmV, m128","VEX.NDS.128.66.0F38.W0 2E /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPS m256, ymmV, ymm1","VMASKMOVPS ymm1, ymmV, m256","vmaskmovps ymm1, ymmV, m256","VEX.NDS.256.66.0F38.W0 2E /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMAXPD xmm1, xmmV, xmm2/m128","VMAXPD xmm2/m128, xmmV, xmm1","vmaxpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
+"VMAXPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMAXPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vmaxpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 5F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VMAXPD ymm1, ymmV, ymm2/m256","VMAXPD ymm2/m256, ymmV, ymm1","vmaxpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
+"VMAXPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMAXPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vmaxpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 5F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VMAXPD zmm1{sae}, {k}{z}, zmmV, zmm2","VMAXPD zmm2, zmmV, {k}{z}, zmm1{sae}","vmaxpd zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F.W1 5F /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMAXPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMAXPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vmaxpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 5F /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VMAXPS xmm1, xmmV, xmm2/m128","VMAXPS xmm2/m128, xmmV, xmm1","vmaxps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
+"VMAXPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMAXPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vmaxps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 5F /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VMAXPS ymm1, ymmV, ymm2/m256","VMAXPS ymm2/m256, ymmV, ymm1","vmaxps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
+"VMAXPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMAXPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vmaxps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 5F /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VMAXPS zmm1{sae}, {k}{z}, zmmV, zmm2","VMAXPS zmm2, zmmV, {k}{z}, zmm1{sae}","vmaxps zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.0F.W0 5F /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMAXPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMAXPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vmaxps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 5F /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VMAXSD xmm1{sae}, {k}{z}, xmmV, xmm2","VMAXSD xmm2, xmmV, {k}{z}, xmm1{sae}","vmaxsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F2.0F.W1 5F /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMAXSD xmm1, xmmV, xmm2/m64","VMAXSD xmm2/m64, xmmV, xmm1","vmaxsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
+"VMAXSD xmm1, {k}{z}, xmmV, xmm2/m64","VMAXSD xmm2/m64, xmmV, {k}{z}, xmm1","vmaxsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5F /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VMAXSS xmm1{sae}, {k}{z}, xmmV, xmm2","VMAXSS xmm2, xmmV, {k}{z}, xmm1{sae}","vmaxss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 5F /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMAXSS xmm1, xmmV, xmm2/m32","VMAXSS xmm2/m32, xmmV, xmm1","vmaxss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
+"VMAXSS xmm1, {k}{z}, xmmV, xmm2/m32","VMAXSS xmm2/m32, xmmV, {k}{z}, xmm1","vmaxss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5F /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VMCALL","VMCALL","vmcall","0F 01 C1","V","V","VTX","","","",""
+"VMCLEAR m64","VMCLEAR m64","vmclear m64","66 0F C7 /6","V","V","VTX","modrm_memonly","r","",""
+"VMFUNC","VMFUNC","vmfunc","0F 01 D4","V","V","","","","",""
+"VMINPD xmm1, xmmV, xmm2/m128","VMINPD xmm2/m128, xmmV, xmm1","vminpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
+"VMINPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMINPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vminpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 5D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VMINPD ymm1, ymmV, ymm2/m256","VMINPD ymm2/m256, ymmV, ymm1","vminpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
+"VMINPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMINPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vminpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 5D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VMINPD zmm1{sae}, {k}{z}, zmmV, zmm2","VMINPD zmm2, zmmV, {k}{z}, zmm1{sae}","vminpd zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F.W1 5D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMINPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMINPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vminpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 5D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VMINPS xmm1, xmmV, xmm2/m128","VMINPS xmm2/m128, xmmV, xmm1","vminps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
+"VMINPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMINPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vminps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 5D /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VMINPS ymm1, ymmV, ymm2/m256","VMINPS ymm2/m256, ymmV, ymm1","vminps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
+"VMINPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMINPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vminps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 5D /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VMINPS zmm1{sae}, {k}{z}, zmmV, zmm2","VMINPS zmm2, zmmV, {k}{z}, zmm1{sae}","vminps zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.0F.W0 5D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMINPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMINPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vminps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 5D /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VMINSD xmm1{sae}, {k}{z}, xmmV, xmm2","VMINSD xmm2, xmmV, {k}{z}, xmm1{sae}","vminsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F2.0F.W1 5D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMINSD xmm1, xmmV, xmm2/m64","VMINSD xmm2/m64, xmmV, xmm1","vminsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
+"VMINSD xmm1, {k}{z}, xmmV, xmm2/m64","VMINSD xmm2/m64, xmmV, {k}{z}, xmm1","vminsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5D /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VMINSS xmm1{sae}, {k}{z}, xmmV, xmm2","VMINSS xmm2, xmmV, {k}{z}, xmm1{sae}","vminss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 5D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMINSS xmm1, xmmV, xmm2/m32","VMINSS xmm2/m32, xmmV, xmm1","vminss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
+"VMINSS xmm1, {k}{z}, xmmV, xmm2/m32","VMINSS xmm2/m32, xmmV, {k}{z}, xmm1","vminss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5D /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VMLAUNCH","VMLAUNCH","vmlaunch","0F 01 C2","V","V","VTX","","","",""
+"VMLOAD EAX","VMLOADL EAX","vmloadl EAX","0F 01 DA","V","V","SVM","amd,modrm_regonly,operand32","r","Y","32"
+"VMLOAD RAX","VMLOADQ RAX","vmloadq RAX","REX.W 0F 01 DA","N.S.","V","SVM","amd,modrm_regonly","r","Y","64"
+"VMLOAD AX","VMLOADW AX","vmloadw AX","0F 01 DA","V","V","SVM","amd,modrm_regonly,operand16","r","Y","16"
+"VMMCALL","VMMCALL","vmmcall","0F 01 D9","V","V","SVM","amd","","",""
+"VMOVAPD xmm2/m128, xmm1","VMOVAPD xmm1, xmm2/m128","vmovapd xmm1, xmm2/m128","VEX.128.66.0F.WIG 29 /r","V","V","AVX","","w,r","",""
+"VMOVAPD xmm2/m128, {k}{z}, xmm1","VMOVAPD xmm1, {k}{z}, xmm2/m128","vmovapd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 29 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVAPD xmm1, xmm2/m128","VMOVAPD xmm2/m128, xmm1","vmovapd xmm2/m128, xmm1","VEX.128.66.0F.WIG 28 /r","V","V","AVX","","w,r","",""
+"VMOVAPD xmm1, {k}{z}, xmm2/m128","VMOVAPD xmm2/m128, {k}{z}, xmm1","vmovapd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 28 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVAPD ymm2/m256, ymm1","VMOVAPD ymm1, ymm2/m256","vmovapd ymm1, ymm2/m256","VEX.256.66.0F.WIG 29 /r","V","V","AVX","","w,r","",""
+"VMOVAPD ymm2/m256, {k}{z}, ymm1","VMOVAPD ymm1, {k}{z}, ymm2/m256","vmovapd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 29 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVAPD ymm1, ymm2/m256","VMOVAPD ymm2/m256, ymm1","vmovapd ymm2/m256, ymm1","VEX.256.66.0F.WIG 28 /r","V","V","AVX","","w,r","",""
+"VMOVAPD ymm1, {k}{z}, ymm2/m256","VMOVAPD ymm2/m256, {k}{z}, ymm1","vmovapd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 28 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVAPD zmm2/m512, {k}{z}, zmm1","VMOVAPD zmm1, {k}{z}, zmm2/m512","vmovapd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 29 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVAPD zmm1, {k}{z}, zmm2/m512","VMOVAPD zmm2/m512, {k}{z}, zmm1","vmovapd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 28 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVAPS xmm2/m128, xmm1","VMOVAPS xmm1, xmm2/m128","vmovaps xmm1, xmm2/m128","VEX.128.0F.WIG 29 /r","V","V","AVX","","w,r","",""
+"VMOVAPS xmm2/m128, {k}{z}, xmm1","VMOVAPS xmm1, {k}{z}, xmm2/m128","vmovaps xmm1, {k}{z}, xmm2/m128","EVEX.128.0F.W0 29 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVAPS xmm1, xmm2/m128","VMOVAPS xmm2/m128, xmm1","vmovaps xmm2/m128, xmm1","VEX.128.0F.WIG 28 /r","V","V","AVX","","w,r","",""
+"VMOVAPS xmm1, {k}{z}, xmm2/m128","VMOVAPS xmm2/m128, {k}{z}, xmm1","vmovaps xmm2/m128, {k}{z}, xmm1","EVEX.128.0F.W0 28 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVAPS ymm2/m256, ymm1","VMOVAPS ymm1, ymm2/m256","vmovaps ymm1, ymm2/m256","VEX.256.0F.WIG 29 /r","V","V","AVX","","w,r","",""
+"VMOVAPS ymm2/m256, {k}{z}, ymm1","VMOVAPS ymm1, {k}{z}, ymm2/m256","vmovaps ymm1, {k}{z}, ymm2/m256","EVEX.256.0F.W0 29 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVAPS ymm1, ymm2/m256","VMOVAPS ymm2/m256, ymm1","vmovaps ymm2/m256, ymm1","VEX.256.0F.WIG 28 /r","V","V","AVX","","w,r","",""
+"VMOVAPS ymm1, {k}{z}, ymm2/m256","VMOVAPS ymm2/m256, {k}{z}, ymm1","vmovaps ymm2/m256, {k}{z}, ymm1","EVEX.256.0F.W0 28 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVAPS zmm2/m512, {k}{z}, zmm1","VMOVAPS zmm1, {k}{z}, zmm2/m512","vmovaps zmm1, {k}{z}, zmm2/m512","EVEX.512.0F.W0 29 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVAPS zmm1, {k}{z}, zmm2/m512","VMOVAPS zmm2/m512, {k}{z}, zmm1","vmovaps zmm2/m512, {k}{z}, zmm1","EVEX.512.0F.W0 28 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVD xmm1, r/m32","VMOVD r/m32, xmm1","vmovd r/m32, xmm1","EVEX.128.66.0F.W0 6E /r","V","V","AVX512F+AVX512VL","scale4","w,r","",""
+"VMOVD xmm1, r/m32","VMOVD r/m32, xmm1","vmovd r/m32, xmm1","VEX.128.66.0F.W0 6E /r","V","V","AVX","","w,r","",""
+"VMOVD r/m32, xmm1","VMOVD xmm1, r/m32","vmovd xmm1, r/m32","EVEX.128.66.0F.W0 7E /r","V","V","AVX512F+AVX512VL","scale4","w,r","",""
+"VMOVD r/m32, xmm1","VMOVD xmm1, r/m32","vmovd xmm1, r/m32","VEX.128.66.0F.W0 7E /r","V","V","AVX","","w,r","",""
+"VMOVDDUP xmm1, xmm2/m64","VMOVDDUP xmm2/m64, xmm1","vmovddup xmm2/m64, xmm1","VEX.128.F2.0F.WIG 12 /r","V","V","AVX","","w,r","",""
+"VMOVDDUP xmm1, {k}{z}, xmm2/m64","VMOVDDUP xmm2/m64, {k}{z}, xmm1","vmovddup xmm2/m64, {k}{z}, xmm1","EVEX.128.F2.0F.W1 12 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VMOVDDUP ymm1, ymm2/m256","VMOVDDUP ymm2/m256, ymm1","vmovddup ymm2/m256, ymm1","VEX.256.F2.0F.WIG 12 /r","V","V","AVX","","w,r","",""
+"VMOVDDUP ymm1, {k}{z}, ymm2/m256","VMOVDDUP ymm2/m256, {k}{z}, ymm1","vmovddup ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W1 12 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVDDUP zmm1, {k}{z}, zmm2/m512","VMOVDDUP zmm2/m512, {k}{z}, zmm1","vmovddup zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W1 12 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVDQA xmm2/m128, xmm1","VMOVDQA xmm1, xmm2/m128","vmovdqa xmm1, xmm2/m128","VEX.128.66.0F.WIG 7F /r","V","V","AVX","","w,r","",""
+"VMOVDQA xmm1, xmm2/m128","VMOVDQA xmm2/m128, xmm1","vmovdqa xmm2/m128, xmm1","VEX.128.66.0F.WIG 6F /r","V","V","AVX","","w,r","",""
+"VMOVDQA ymm2/m256, ymm1","VMOVDQA ymm1, ymm2/m256","vmovdqa ymm1, ymm2/m256","VEX.256.66.0F.WIG 7F /r","V","V","AVX","","w,r","",""
+"VMOVDQA ymm1, ymm2/m256","VMOVDQA ymm2/m256, ymm1","vmovdqa ymm2/m256, ymm1","VEX.256.66.0F.WIG 6F /r","V","V","AVX","","w,r","",""
+"VMOVDQA32 xmm2/m128, {k}{z}, xmm1","VMOVDQA32 xmm1, {k}{z}, xmm2/m128","vmovdqa32 xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W0 7F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVDQA32 xmm1, {k}{z}, xmm2/m128","VMOVDQA32 xmm2/m128, {k}{z}, xmm1","vmovdqa32 xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W0 6F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVDQA32 ymm2/m256, {k}{z}, ymm1","VMOVDQA32 ymm1, {k}{z}, ymm2/m256","vmovdqa32 ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W0 7F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVDQA32 ymm1, {k}{z}, ymm2/m256","VMOVDQA32 ymm2/m256, {k}{z}, ymm1","vmovdqa32 ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W0 6F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVDQA32 zmm2/m512, {k}{z}, zmm1","VMOVDQA32 zmm1, {k}{z}, zmm2/m512","vmovdqa32 zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W0 7F /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVDQA32 zmm1, {k}{z}, zmm2/m512","VMOVDQA32 zmm2/m512, {k}{z}, zmm1","vmovdqa32 zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W0 6F /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVDQA64 xmm2/m128, {k}{z}, xmm1","VMOVDQA64 xmm1, {k}{z}, xmm2/m128","vmovdqa64 xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 7F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQA64 xmm1, {k}{z}, xmm2/m128","VMOVDQA64 xmm2/m128, {k}{z}, xmm1","vmovdqa64 xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 6F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQA64 ymm2/m256, {k}{z}, ymm1","VMOVDQA64 ymm1, {k}{z}, ymm2/m256","vmovdqa64 ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 7F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQA64 ymm1, {k}{z}, ymm2/m256","VMOVDQA64 ymm2/m256, {k}{z}, ymm1","vmovdqa64 ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 6F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQA64 zmm2/m512, {k}{z}, zmm1","VMOVDQA64 zmm1, {k}{z}, zmm2/m512","vmovdqa64 zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 7F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
+"VMOVDQA64 zmm1, {k}{z}, zmm2/m512","VMOVDQA64 zmm2/m512, {k}{z}, zmm1","vmovdqa64 zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 6F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
+"VMOVDQU xmm2/m128, xmm1","VMOVDQU xmm1, xmm2/m128","vmovdqu xmm1, xmm2/m128","VEX.128.F3.0F.WIG 7F /r","V","V","AVX","","w,r","",""
+"VMOVDQU xmm1, xmm2/m128","VMOVDQU xmm2/m128, xmm1","vmovdqu xmm2/m128, xmm1","VEX.128.F3.0F.WIG 6F /r","V","V","AVX","","w,r","",""
+"VMOVDQU ymm2/m256, ymm1","VMOVDQU ymm1, ymm2/m256","vmovdqu ymm1, ymm2/m256","VEX.256.F3.0F.WIG 7F /r","V","V","AVX","","w,r","",""
+"VMOVDQU ymm1, ymm2/m256","VMOVDQU ymm2/m256, ymm1","vmovdqu ymm2/m256, ymm1","VEX.256.F3.0F.WIG 6F /r","V","V","AVX","","w,r","",""
+"VMOVDQU16 xmm2/m128, {k}{z}, xmm1","VMOVDQU16 xmm1, {k}{z}, xmm2/m128","vmovdqu16 xmm1, {k}{z}, xmm2/m128","EVEX.128.F2.0F.W1 7F /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VMOVDQU16 xmm1, {k}{z}, xmm2/m128","VMOVDQU16 xmm2/m128, {k}{z}, xmm1","vmovdqu16 xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.W1 6F /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VMOVDQU16 ymm2/m256, {k}{z}, ymm1","VMOVDQU16 ymm1, {k}{z}, ymm2/m256","vmovdqu16 ymm1, {k}{z}, ymm2/m256","EVEX.256.F2.0F.W1 7F /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VMOVDQU16 ymm1, {k}{z}, ymm2/m256","VMOVDQU16 ymm2/m256, {k}{z}, ymm1","vmovdqu16 ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W1 6F /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VMOVDQU16 zmm2/m512, {k}{z}, zmm1","VMOVDQU16 zmm1, {k}{z}, zmm2/m512","vmovdqu16 zmm1, {k}{z}, zmm2/m512","EVEX.512.F2.0F.W1 7F /r","V","V","AVX512BW","scale64","w,r,r","",""
+"VMOVDQU16 zmm1, {k}{z}, zmm2/m512","VMOVDQU16 zmm2/m512, {k}{z}, zmm1","vmovdqu16 zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W1 6F /r","V","V","AVX512BW","scale64","w,r,r","",""
+"VMOVDQU32 xmm2/m128, {k}{z}, xmm1","VMOVDQU32 xmm1, {k}{z}, xmm2/m128","vmovdqu32 xmm1, {k}{z}, xmm2/m128","EVEX.128.F3.0F.W0 7F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQU32 xmm1, {k}{z}, xmm2/m128","VMOVDQU32 xmm2/m128, {k}{z}, xmm1","vmovdqu32 xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 6F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQU32 ymm2/m256, {k}{z}, ymm1","VMOVDQU32 ymm1, {k}{z}, ymm2/m256","vmovdqu32 ymm1, {k}{z}, ymm2/m256","EVEX.256.F3.0F.W0 7F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQU32 ymm1, {k}{z}, ymm2/m256","VMOVDQU32 ymm2/m256, {k}{z}, ymm1","vmovdqu32 ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 6F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQU32 zmm2/m512, {k}{z}, zmm1","VMOVDQU32 zmm1, {k}{z}, zmm2/m512","vmovdqu32 zmm1, {k}{z}, zmm2/m512","EVEX.512.F3.0F.W0 7F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
+"VMOVDQU32 zmm1, {k}{z}, zmm2/m512","VMOVDQU32 zmm2/m512, {k}{z}, zmm1","vmovdqu32 zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 6F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
+"VMOVDQU64 xmm2/m128, {k}{z}, xmm1","VMOVDQU64 xmm1, {k}{z}, xmm2/m128","vmovdqu64 xmm1, {k}{z}, xmm2/m128","EVEX.128.F3.0F.W1 7F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQU64 xmm1, {k}{z}, xmm2/m128","VMOVDQU64 xmm2/m128, {k}{z}, xmm1","vmovdqu64 xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W1 6F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQU64 ymm2/m256, {k}{z}, ymm1","VMOVDQU64 ymm1, {k}{z}, ymm2/m256","vmovdqu64 ymm1, {k}{z}, ymm2/m256","EVEX.256.F3.0F.W1 7F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQU64 ymm1, {k}{z}, ymm2/m256","VMOVDQU64 ymm2/m256, {k}{z}, ymm1","vmovdqu64 ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W1 6F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQU64 zmm2/m512, {k}{z}, zmm1","VMOVDQU64 zmm1, {k}{z}, zmm2/m512","vmovdqu64 zmm1, {k}{z}, zmm2/m512","EVEX.512.F3.0F.W1 7F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
+"VMOVDQU64 zmm1, {k}{z}, zmm2/m512","VMOVDQU64 zmm2/m512, {k}{z}, zmm1","vmovdqu64 zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W1 6F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
+"VMOVDQU8 xmm2/m128, {k}{z}, xmm1","VMOVDQU8 xmm1, {k}{z}, xmm2/m128","vmovdqu8 xmm1, {k}{z}, xmm2/m128","EVEX.128.F2.0F.W0 7F /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQU8 xmm1, {k}{z}, xmm2/m128","VMOVDQU8 xmm2/m128, {k}{z}, xmm1","vmovdqu8 xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.W0 6F /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQU8 ymm2/m256, {k}{z}, ymm1","VMOVDQU8 ymm1, {k}{z}, ymm2/m256","vmovdqu8 ymm1, {k}{z}, ymm2/m256","EVEX.256.F2.0F.W0 7F /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQU8 ymm1, {k}{z}, ymm2/m256","VMOVDQU8 ymm2/m256, {k}{z}, ymm1","vmovdqu8 ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W0 6F /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQU8 zmm2/m512, {k}{z}, zmm1","VMOVDQU8 zmm1, {k}{z}, zmm2/m512","vmovdqu8 zmm1, {k}{z}, zmm2/m512","EVEX.512.F2.0F.W0 7F /r","V","V","AVX512BW","scale64","w,r,r","Y","512"
+"VMOVDQU8 zmm1, {k}{z}, zmm2/m512","VMOVDQU8 zmm2/m512, {k}{z}, zmm1","vmovdqu8 zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W0 6F /r","V","V","AVX512BW","scale64","w,r,r","Y","512"
+"VMOVHLPS xmm1, xmmV, xmm2","VMOVHLPS xmm2, xmmV, xmm1","vmovhlps xmm2, xmmV, xmm1","EVEX.NDS.128.0F.W0 12 /r","V","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
+"VMOVHLPS xmm1, xmmV, xmm2","VMOVHLPS xmm2, xmmV, xmm1","vmovhlps xmm2, xmmV, xmm1","VEX.NDS.128.0F.WIG 12 /r","V","V","AVX","modrm_regonly","w,r,r","",""
+"VMOVHPD xmm1, xmmV, m64","VMOVHPD m64, xmmV, xmm1","vmovhpd m64, xmmV, xmm1","EVEX.NDS.LIG.66.0F.W1 16 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r,r","",""
+"VMOVHPD xmm1, xmmV, m64","VMOVHPD m64, xmmV, xmm1","vmovhpd m64, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 16 /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMOVHPD m64, xmm1","VMOVHPD xmm1, m64","vmovhpd xmm1, m64","EVEX.LIG.66.0F.W1 17 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","",""
+"VMOVHPD m64, xmm1","VMOVHPD xmm1, m64","vmovhpd xmm1, m64","VEX.128.66.0F.WIG 17 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVHPS xmm1, xmmV, m64","VMOVHPS m64, xmmV, xmm1","vmovhps m64, xmmV, xmm1","EVEX.NDS.128.0F.W0 16 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r,r","",""
+"VMOVHPS xmm1, xmmV, m64","VMOVHPS m64, xmmV, xmm1","vmovhps m64, xmmV, xmm1","VEX.NDS.128.0F.WIG 16 /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMOVHPS m64, xmm1","VMOVHPS xmm1, m64","vmovhps xmm1, m64","EVEX.128.0F.W0 17 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","",""
+"VMOVHPS m64, xmm1","VMOVHPS xmm1, m64","vmovhps xmm1, m64","VEX.128.0F.WIG 17 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVLHPS xmm1, xmmV, xmm2","VMOVLHPS xmm2, xmmV, xmm1","vmovlhps xmm2, xmmV, xmm1","EVEX.NDS.128.0F.W0 16 /r","V","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
+"VMOVLHPS xmm1, xmmV, xmm2","VMOVLHPS xmm2, xmmV, xmm1","vmovlhps xmm2, xmmV, xmm1","VEX.NDS.128.0F.WIG 16 /r","V","V","AVX","modrm_regonly","w,r,r","",""
+"VMOVLPD xmm1, xmmV, m64","VMOVLPD m64, xmmV, xmm1","vmovlpd m64, xmmV, xmm1","EVEX.NDS.LIG.66.0F.W1 12 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r,r","",""
+"VMOVLPD xmm1, xmmV, m64","VMOVLPD m64, xmmV, xmm1","vmovlpd m64, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 12 /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMOVLPD m64, xmm1","VMOVLPD xmm1, m64","vmovlpd xmm1, m64","EVEX.LIG.66.0F.W1 13 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","",""
+"VMOVLPD m64, xmm1","VMOVLPD xmm1, m64","vmovlpd xmm1, m64","VEX.128.66.0F.WIG 13 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVLPS xmm1, xmmV, m64","VMOVLPS m64, xmmV, xmm1","vmovlps m64, xmmV, xmm1","EVEX.NDS.128.0F.W0 12 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r,r","",""
+"VMOVLPS xmm1, xmmV, m64","VMOVLPS m64, xmmV, xmm1","vmovlps m64, xmmV, xmm1","VEX.NDS.128.0F.WIG 12 /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMOVLPS m64, xmm1","VMOVLPS xmm1, m64","vmovlps xmm1, m64","EVEX.128.0F.W0 13 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","",""
+"VMOVLPS m64, xmm1","VMOVLPS xmm1, m64","vmovlps xmm1, m64","VEX.128.0F.WIG 13 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVMSKPD r32, xmm2","VMOVMSKPD xmm2, r32","vmovmskpd xmm2, r32","VEX.128.66.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","",""
+"VMOVMSKPD r32, ymm2","VMOVMSKPD ymm2, r32","vmovmskpd ymm2, r32","VEX.256.66.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","",""
+"VMOVMSKPS r32, xmm2","VMOVMSKPS xmm2, r32","vmovmskps xmm2, r32","VEX.128.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","",""
+"VMOVMSKPS r32, ymm2","VMOVMSKPS ymm2, r32","vmovmskps ymm2, r32","VEX.256.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","",""
+"VMOVNTDQ m128, xmm1","VMOVNTDQ xmm1, m128","vmovntdq xmm1, m128","EVEX.128.66.0F.W0 E7 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",""
+"VMOVNTDQ m128, xmm1","VMOVNTDQ xmm1, m128","vmovntdq xmm1, m128","VEX.128.66.0F.WIG E7 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTDQ m256, ymm1","VMOVNTDQ ymm1, m256","vmovntdq ymm1, m256","EVEX.256.66.0F.W0 E7 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",""
+"VMOVNTDQ m256, ymm1","VMOVNTDQ ymm1, m256","vmovntdq ymm1, m256","VEX.256.66.0F.WIG E7 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTDQ m512, zmm1","VMOVNTDQ zmm1, m512","vmovntdq zmm1, m512","EVEX.512.66.0F.W0 E7 /r","V","V","AVX512F","modrm_memonly,scale64","w,r","",""
+"VMOVNTDQA xmm1, m128","VMOVNTDQA m128, xmm1","vmovntdqa m128, xmm1","EVEX.128.66.0F38.W0 2A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",""
+"VMOVNTDQA xmm1, m128","VMOVNTDQA m128, xmm1","vmovntdqa m128, xmm1","VEX.128.66.0F38.WIG 2A /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTDQA ymm1, m256","VMOVNTDQA m256, ymm1","vmovntdqa m256, ymm1","EVEX.256.66.0F38.W0 2A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",""
+"VMOVNTDQA ymm1, m256","VMOVNTDQA m256, ymm1","vmovntdqa m256, ymm1","VEX.256.66.0F38.WIG 2A /r","V","V","AVX2","modrm_memonly","w,r","",""
+"VMOVNTDQA zmm1, m512","VMOVNTDQA m512, zmm1","vmovntdqa m512, zmm1","EVEX.512.66.0F38.W0 2A /r","V","V","AVX512F","modrm_memonly,scale64","w,r","",""
+"VMOVNTPD m128, xmm1","VMOVNTPD xmm1, m128","vmovntpd xmm1, m128","EVEX.128.66.0F.W1 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",""
+"VMOVNTPD m128, xmm1","VMOVNTPD xmm1, m128","vmovntpd xmm1, m128","VEX.128.66.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTPD m256, ymm1","VMOVNTPD ymm1, m256","vmovntpd ymm1, m256","EVEX.256.66.0F.W1 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",""
+"VMOVNTPD m256, ymm1","VMOVNTPD ymm1, m256","vmovntpd ymm1, m256","VEX.256.66.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTPD m512, zmm1","VMOVNTPD zmm1, m512","vmovntpd zmm1, m512","EVEX.512.66.0F.W1 2B /r","V","V","AVX512F","modrm_memonly,scale64","w,r","",""
+"VMOVNTPS m128, xmm1","VMOVNTPS xmm1, m128","vmovntps xmm1, m128","EVEX.128.0F.W0 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",""
+"VMOVNTPS m128, xmm1","VMOVNTPS xmm1, m128","vmovntps xmm1, m128","VEX.128.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTPS m256, ymm1","VMOVNTPS ymm1, m256","vmovntps ymm1, m256","EVEX.256.0F.W0 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",""
+"VMOVNTPS m256, ymm1","VMOVNTPS ymm1, m256","vmovntps ymm1, m256","VEX.256.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTPS m512, zmm1","VMOVNTPS zmm1, m512","vmovntps zmm1, m512","EVEX.512.0F.W0 2B /r","V","V","AVX512F","modrm_memonly,scale64","w,r","",""
+"VMOVQ xmm1, r/m64","VMOVQ r/m64, xmm1","vmovq r/m64, xmm1","EVEX.128.66.0F.W1 6E /r","N.S.","V","AVX512F+AVX512VL","scale8","w,r","",""
+"VMOVQ xmm1, r/m64","VMOVQ r/m64, xmm1","vmovq r/m64, xmm1","VEX.128.66.0F.W1 6E /r","N.S.","V","AVX","","w,r","",""
+"VMOVQ r/m64, xmm1","VMOVQ xmm1, r/m64","vmovq xmm1, r/m64","EVEX.128.66.0F.W1 7E /r","N.S.","V","AVX512F+AVX512VL","scale8","w,r","",""
+"VMOVQ r/m64, xmm1","VMOVQ xmm1, r/m64","vmovq xmm1, r/m64","VEX.128.66.0F.W1 7E /r","N.S.","V","AVX","","w,r","",""
+"VMOVQ xmm2/m64, xmm1","VMOVQ xmm1, xmm2/m64","vmovq xmm1, xmm2/m64","EVEX.LIG.66.0F.W1 D6 /r","V","V","AVX512F+AVX512VL","scale8","w,r","",""
+"VMOVQ xmm2/m64, xmm1","VMOVQ xmm1, xmm2/m64","vmovq xmm1, xmm2/m64","VEX.128.66.0F.WIG D6 /r","V","V","AVX","","w,r","",""
+"VMOVQ xmm1, xmm2/m64","VMOVQ xmm2/m64, xmm1","vmovq xmm2/m64, xmm1","EVEX.LIG.F3.0F.W1 7E /r","V","V","AVX512F+AVX512VL","scale8","w,r","",""
+"VMOVQ xmm1, xmm2/m64","VMOVQ xmm2/m64, xmm1","vmovq xmm2/m64, xmm1","VEX.128.F3.0F.WIG 7E /r","V","V","AVX","","w,r","",""
+"VMOVSD xmm1, m64","VMOVSD m64, xmm1","vmovsd m64, xmm1","VEX.LIG.F2.0F.WIG 10 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVSD xmm1, {k}{z}, m64","VMOVSD m64, {k}{z}, xmm1","vmovsd m64, {k}{z}, xmm1","EVEX.LIG.F2.0F.W1 10 /r","V","V","AVX512F","modrm_memonly,scale8","w,r,r","",""
+"VMOVSD m64, xmm1","VMOVSD xmm1, m64","vmovsd xmm1, m64","VEX.LIG.F2.0F.WIG 11 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVSD xmm2, xmmV, xmm1","VMOVSD xmm1, xmmV, xmm2","vmovsd xmm1, xmmV, xmm2","VEX.NDS.LIG.F2.0F.WIG 11 /r","V","V","AVX","modrm_regonly","w,r,r","",""
+"VMOVSD xmm2, {k}{z}, xmmV, xmm1","VMOVSD xmm1, xmmV, {k}{z}, xmm2","vmovsd xmm1, xmmV, {k}{z}, xmm2","EVEX.NDS.LIG.F2.0F.W1 11 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMOVSD m64, {k}, xmm1","VMOVSD xmm1, {k}, m64","vmovsd xmm1, {k}, m64","EVEX.LIG.F2.0F.W1 11 /r","V","V","AVX512F","modrm_memonly,scale8","w,r,r","",""
+"VMOVSD xmm1, xmmV, xmm2","VMOVSD xmm2, xmmV, xmm1","vmovsd xmm2, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 10 /r","V","V","AVX","modrm_regonly","w,r,r","",""
+"VMOVSD xmm1, {k}{z}, xmmV, xmm2","VMOVSD xmm2, xmmV, {k}{z}, xmm1","vmovsd xmm2, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 10 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMOVSHDUP xmm1, xmm2/m128","VMOVSHDUP xmm2/m128, xmm1","vmovshdup xmm2/m128, xmm1","VEX.128.F3.0F.WIG 16 /r","V","V","AVX","","w,r","",""
+"VMOVSHDUP xmm1, {k}{z}, xmm2/m128","VMOVSHDUP xmm2/m128, {k}{z}, xmm1","vmovshdup xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 16 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVSHDUP ymm1, ymm2/m256","VMOVSHDUP ymm2/m256, ymm1","vmovshdup ymm2/m256, ymm1","VEX.256.F3.0F.WIG 16 /r","V","V","AVX","","w,r","",""
+"VMOVSHDUP ymm1, {k}{z}, ymm2/m256","VMOVSHDUP ymm2/m256, {k}{z}, ymm1","vmovshdup ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 16 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVSHDUP zmm1, {k}{z}, zmm2/m512","VMOVSHDUP zmm2/m512, {k}{z}, zmm1","vmovshdup zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 16 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVSLDUP xmm1, xmm2/m128","VMOVSLDUP xmm2/m128, xmm1","vmovsldup xmm2/m128, xmm1","VEX.128.F3.0F.WIG 12 /r","V","V","AVX","","w,r","",""
+"VMOVSLDUP xmm1, {k}{z}, xmm2/m128","VMOVSLDUP xmm2/m128, {k}{z}, xmm1","vmovsldup xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 12 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVSLDUP ymm1, ymm2/m256","VMOVSLDUP ymm2/m256, ymm1","vmovsldup ymm2/m256, ymm1","VEX.256.F3.0F.WIG 12 /r","V","V","AVX","","w,r","",""
+"VMOVSLDUP ymm1, {k}{z}, ymm2/m256","VMOVSLDUP ymm2/m256, {k}{z}, ymm1","vmovsldup ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 12 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVSLDUP zmm1, {k}{z}, zmm2/m512","VMOVSLDUP zmm2/m512, {k}{z}, zmm1","vmovsldup zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 12 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVSS xmm1, m32","VMOVSS m32, xmm1","vmovss m32, xmm1","VEX.LIG.F3.0F.WIG 10 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVSS xmm1, {k}{z}, m32","VMOVSS m32, {k}{z}, xmm1","vmovss m32, {k}{z}, xmm1","EVEX.LIG.F3.0F.W0 10 /r","V","V","AVX512F","modrm_memonly,scale4","w,r,r","",""
+"VMOVSS m32, xmm1","VMOVSS xmm1, m32","vmovss xmm1, m32","VEX.LIG.F3.0F.WIG 11 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVSS xmm2, xmmV, xmm1","VMOVSS xmm1, xmmV, xmm2","vmovss xmm1, xmmV, xmm2","VEX.NDS.LIG.F3.0F.WIG 11 /r","V","V","AVX","modrm_regonly","w,r,r","",""
+"VMOVSS xmm2, {k}{z}, xmmV, xmm1","VMOVSS xmm1, xmmV, {k}{z}, xmm2","vmovss xmm1, xmmV, {k}{z}, xmm2","EVEX.NDS.LIG.F3.0F.W0 11 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMOVSS m32, {k}, xmm1","VMOVSS xmm1, {k}, m32","vmovss xmm1, {k}, m32","EVEX.LIG.F3.0F.W0 11 /r","V","V","AVX512F","modrm_memonly,scale4","w,r,r","",""
+"VMOVSS xmm1, xmmV, xmm2","VMOVSS xmm2, xmmV, xmm1","vmovss xmm2, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 10 /r","V","V","AVX","modrm_regonly","w,r,r","",""
+"VMOVSS xmm1, {k}{z}, xmmV, xmm2","VMOVSS xmm2, xmmV, {k}{z}, xmm1","vmovss xmm2, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 10 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMOVUPD xmm2/m128, xmm1","VMOVUPD xmm1, xmm2/m128","vmovupd xmm1, xmm2/m128","VEX.128.66.0F.WIG 11 /r","V","V","AVX","","w,r","",""
+"VMOVUPD xmm2/m128, {k}{z}, xmm1","VMOVUPD xmm1, {k}{z}, xmm2/m128","vmovupd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 11 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVUPD xmm1, xmm2/m128","VMOVUPD xmm2/m128, xmm1","vmovupd xmm2/m128, xmm1","VEX.128.66.0F.WIG 10 /r","V","V","AVX","","w,r","",""
+"VMOVUPD xmm1, {k}{z}, xmm2/m128","VMOVUPD xmm2/m128, {k}{z}, xmm1","vmovupd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 10 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVUPD ymm2/m256, ymm1","VMOVUPD ymm1, ymm2/m256","vmovupd ymm1, ymm2/m256","VEX.256.66.0F.WIG 11 /r","V","V","AVX","","w,r","",""
+"VMOVUPD ymm2/m256, {k}{z}, ymm1","VMOVUPD ymm1, {k}{z}, ymm2/m256","vmovupd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 11 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVUPD ymm1, ymm2/m256","VMOVUPD ymm2/m256, ymm1","vmovupd ymm2/m256, ymm1","VEX.256.66.0F.WIG 10 /r","V","V","AVX","","w,r","",""
+"VMOVUPD ymm1, {k}{z}, ymm2/m256","VMOVUPD ymm2/m256, {k}{z}, ymm1","vmovupd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 10 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVUPD zmm2/m512, {k}{z}, zmm1","VMOVUPD zmm1, {k}{z}, zmm2/m512","vmovupd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 11 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVUPD zmm1, {k}{z}, zmm2/m512","VMOVUPD zmm2/m512, {k}{z}, zmm1","vmovupd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 10 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVUPS xmm2/m128, xmm1","VMOVUPS xmm1, xmm2/m128","vmovups xmm1, xmm2/m128","VEX.128.0F.WIG 11 /r","V","V","AVX","","w,r","",""
+"VMOVUPS xmm2/m128, {k}{z}, xmm1","VMOVUPS xmm1, {k}{z}, xmm2/m128","vmovups xmm1, {k}{z}, xmm2/m128","EVEX.128.0F.W0 11 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVUPS xmm1, xmm2/m128","VMOVUPS xmm2/m128, xmm1","vmovups xmm2/m128, xmm1","VEX.128.0F.WIG 10 /r","V","V","AVX","","w,r","",""
+"VMOVUPS xmm1, {k}{z}, xmm2/m128","VMOVUPS xmm2/m128, {k}{z}, xmm1","vmovups xmm2/m128, {k}{z}, xmm1","EVEX.128.0F.W0 10 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVUPS ymm2/m256, ymm1","VMOVUPS ymm1, ymm2/m256","vmovups ymm1, ymm2/m256","VEX.256.0F.WIG 11 /r","V","V","AVX","","w,r","",""
+"VMOVUPS ymm2/m256, {k}{z}, ymm1","VMOVUPS ymm1, {k}{z}, ymm2/m256","vmovups ymm1, {k}{z}, ymm2/m256","EVEX.256.0F.W0 11 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVUPS ymm1, ymm2/m256","VMOVUPS ymm2/m256, ymm1","vmovups ymm2/m256, ymm1","VEX.256.0F.WIG 10 /r","V","V","AVX","","w,r","",""
+"VMOVUPS ymm1, {k}{z}, ymm2/m256","VMOVUPS ymm2/m256, {k}{z}, ymm1","vmovups ymm2/m256, {k}{z}, ymm1","EVEX.256.0F.W0 10 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVUPS zmm2/m512, {k}{z}, zmm1","VMOVUPS zmm1, {k}{z}, zmm2/m512","vmovups zmm1, {k}{z}, zmm2/m512","EVEX.512.0F.W0 11 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVUPS zmm1, {k}{z}, zmm2/m512","VMOVUPS zmm2/m512, {k}{z}, zmm1","vmovups zmm2/m512, {k}{z}, zmm1","EVEX.512.0F.W0 10 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMPSADBW xmm1, xmmV, xmm2/m128, imm8u","VMPSADBW imm8u, xmm2/m128, xmmV, xmm1","vmpsadbw imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 42 /r ib","V","V","AVX","","w,r,r,r","",""
+"VMPSADBW ymm1, ymmV, ymm2/m256, imm8u","VMPSADBW imm8u, ymm2/m256, ymmV, ymm1","vmpsadbw imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 42 /r ib","V","V","AVX2","","w,r,r,r","",""
+"VMPTRLD m64","VMPTRLD m64","vmptrld m64","0F C7 /6","V","V","VTX","modrm_memonly","r","",""
+"VMPTRST m64","VMPTRST m64","vmptrst m64","0F C7 /7","V","V","VTX","modrm_memonly","w","",""
+"VMREAD r/m32, r32","VMREAD r32, r/m32","vmread r32, r/m32","0F 78 /r","V","N.S.","VTX","","rw,r","",""
+"VMREAD r/m64, r64","VMREAD r64, r/m64","vmread r64, r/m64","0F 78 /r","N.S.","V","VTX","default64","rw,r","",""
+"VMRESUME","VMRESUME","vmresume","0F 01 C3","V","V","VTX","","","",""
+"VMRUN EAX","VMRUNL EAX","vmrunl EAX","0F 01 D8","V","V","SVM","amd,modrm_regonly,operand32","r","Y","32"
+"VMRUN RAX","VMRUNQ RAX","vmrunq RAX","REX.W 0F 01 D8","N.S.","V","SVM","amd,modrm_regonly","r","Y","64"
+"VMRUN AX","VMRUNW AX","vmrunw AX","0F 01 D8","V","V","SVM","amd,modrm_regonly,operand16","r","Y","16"
+"VMSAVE","VMSAVE","vmsave","0F 01 DB","V","V","SVM","amd","","",""
+"VMULPD xmm1, xmmV, xmm2/m128","VMULPD xmm2/m128, xmmV, xmm1","vmulpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
+"VMULPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMULPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vmulpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 59 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VMULPD ymm1, ymmV, ymm2/m256","VMULPD ymm2/m256, ymmV, ymm1","vmulpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
+"VMULPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMULPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vmulpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 59 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VMULPD zmm1{er}, {k}{z}, zmmV, zmm2","VMULPD zmm2, zmmV, {k}{z}, zmm1{er}","vmulpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 59 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMULPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMULPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vmulpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 59 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VMULPS xmm1, xmmV, xmm2/m128","VMULPS xmm2/m128, xmmV, xmm1","vmulps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
+"VMULPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMULPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vmulps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 59 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VMULPS ymm1, ymmV, ymm2/m256","VMULPS ymm2/m256, ymmV, ymm1","vmulps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
+"VMULPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMULPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vmulps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 59 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VMULPS zmm1{er}, {k}{z}, zmmV, zmm2","VMULPS zmm2, zmmV, {k}{z}, zmm1{er}","vmulps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 59 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMULPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMULPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vmulps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 59 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VMULSD xmm1{er}, {k}{z}, xmmV, xmm2","VMULSD xmm2, xmmV, {k}{z}, xmm1{er}","vmulsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 59 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMULSD xmm1, xmmV, xmm2/m64","VMULSD xmm2/m64, xmmV, xmm1","vmulsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
+"VMULSD xmm1, {k}{z}, xmmV, xmm2/m64","VMULSD xmm2/m64, xmmV, {k}{z}, xmm1","vmulsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 59 /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VMULSS xmm1{er}, {k}{z}, xmmV, xmm2","VMULSS xmm2, xmmV, {k}{z}, xmm1{er}","vmulss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 59 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMULSS xmm1, xmmV, xmm2/m32","VMULSS xmm2/m32, xmmV, xmm1","vmulss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
+"VMULSS xmm1, {k}{z}, xmmV, xmm2/m32","VMULSS xmm2/m32, xmmV, {k}{z}, xmm1","vmulss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 59 /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VMWRITE r32, r/m32","VMWRITE r/m32, r32","vmwrite r/m32, r32","0F 79 /r","V","N.S.","VTX","","r,r","",""
+"VMWRITE r64, r/m64","VMWRITE r/m64, r64","vmwrite r/m64, r64","0F 79 /r","N.S.","V","VTX","default64","r,r","",""
+"VMXOFF","VMXOFF","vmxoff","0F 01 C4","V","V","VTX","","","",""
+"VMXON m64","VMXON m64","vmxon m64","F3 0F C7 /6","V","V","VTX","modrm_memonly","r","",""
+"VORPD xmm1, xmmV, xmm2/m128","VORPD xmm2/m128, xmmV, xmm1","vorpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 56 /r","V","V","AVX","","w,r,r","",""
+"VORPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VORPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vorpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 56 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VORPD ymm1, ymmV, ymm2/m256","VORPD ymm2/m256, ymmV, ymm1","vorpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 56 /r","V","V","AVX","","w,r,r","",""
+"VORPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VORPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vorpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 56 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VORPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VORPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vorpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 56 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
+"VORPS xmm1, xmmV, xmm2/m128","VORPS xmm2/m128, xmmV, xmm1","vorps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 56 /r","V","V","AVX","","w,r,r","",""
+"VORPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VORPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vorps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 56 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VORPS ymm1, ymmV, ymm2/m256","VORPS ymm2/m256, ymmV, ymm1","vorps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 56 /r","V","V","AVX","","w,r,r","",""
+"VORPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VORPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vorps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 56 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VORPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VORPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vorps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 56 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
+"VP4DPWSSD zmm1, {k}{z}, zmmV+3, m128","VP4DPWSSD m128, zmmV+3, {k}{z}, zmm1","vp4dpwssd m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 52 /r","V","V","AVX512_4VNNIW","modrm_memonly,scale16","rw,r,r,r","",""
+"VP4DPWSSDS zmm1, {k}{z}, zmmV+3, m128","VP4DPWSSDS m128, zmmV+3, {k}{z}, zmm1","vp4dpwssds m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 53 /r","V","V","AVX512_4VNNIW","modrm_memonly,scale16","rw,r,r,r","",""
+"VPABSB xmm1, xmm2/m128","VPABSB xmm2/m128, xmm1","vpabsb xmm2/m128, xmm1","VEX.128.66.0F38.WIG 1C /r","V","V","AVX","","w,r","",""
+"VPABSB xmm1, {k}{z}, xmm2/m128","VPABSB xmm2/m128, {k}{z}, xmm1","vpabsb xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 1C /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPABSB ymm1, ymm2/m256","VPABSB ymm2/m256, ymm1","vpabsb ymm2/m256, ymm1","VEX.256.66.0F38.WIG 1C /r","V","V","AVX2","","w,r","",""
+"VPABSB ymm1, {k}{z}, ymm2/m256","VPABSB ymm2/m256, {k}{z}, ymm1","vpabsb ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 1C /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VPABSB zmm1, {k}{z}, zmm2/m512","VPABSB zmm2/m512, {k}{z}, zmm1","vpabsb zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 1C /r","V","V","AVX512BW","scale64","w,r,r","",""
+"VPABSD xmm1, xmm2/m128","VPABSD xmm2/m128, xmm1","vpabsd xmm2/m128, xmm1","VEX.128.66.0F38.WIG 1E /r","V","V","AVX","","w,r","",""
+"VPABSD xmm1, {k}{z}, xmm2/m128/m32bcst","VPABSD xmm2/m128/m32bcst, {k}{z}, xmm1","vpabsd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 1E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VPABSD ymm1, ymm2/m256","VPABSD ymm2/m256, ymm1","vpabsd ymm2/m256, ymm1","VEX.256.66.0F38.WIG 1E /r","V","V","AVX2","","w,r","",""
+"VPABSD ymm1, {k}{z}, ymm2/m256/m32bcst","VPABSD ymm2/m256/m32bcst, {k}{z}, ymm1","vpabsd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 1E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VPABSD zmm1, {k}{z}, zmm2/m512/m32bcst","VPABSD zmm2/m512/m32bcst, {k}{z}, zmm1","vpabsd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1E /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VPABSQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPABSQ xmm2/m128/m64bcst, {k}{z}, xmm1","vpabsq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 1F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
+"VPABSQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPABSQ ymm2/m256/m64bcst, {k}{z}, ymm1","vpabsq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 1F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
+"VPABSQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPABSQ zmm2/m512/m64bcst, {k}{z}, zmm1","vpabsq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 1F /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
+"VPABSW xmm1, xmm2/m128","VPABSW xmm2/m128, xmm1","vpabsw xmm2/m128, xmm1","VEX.128.66.0F38.WIG 1D /r","V","V","AVX","","w,r","",""
+"VPABSW xmm1, {k}{z}, xmm2/m128","VPABSW xmm2/m128, {k}{z}, xmm1","vpabsw xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 1D /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPABSW ymm1, ymm2/m256","VPABSW ymm2/m256, ymm1","vpabsw ymm2/m256, ymm1","VEX.256.66.0F38.WIG 1D /r","V","V","AVX2","","w,r","",""
+"VPABSW ymm1, {k}{z}, ymm2/m256","VPABSW ymm2/m256, {k}{z}, ymm1","vpabsw ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 1D /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VPABSW zmm1, {k}{z}, zmm2/m512","VPABSW zmm2/m512, {k}{z}, zmm1","vpabsw zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 1D /r","V","V","AVX512BW","scale64","w,r,r","",""
+"VPACKSSDW xmm1, xmmV, xmm2/m128","VPACKSSDW xmm2/m128, xmmV, xmm1","vpackssdw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6B /r","V","V","AVX","","w,r,r","",""
+"VPACKSSDW xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPACKSSDW xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpackssdw xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 6B /r","V","V","AVX512BW+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPACKSSDW ymm1, ymmV, ymm2/m256","VPACKSSDW ymm2/m256, ymmV, ymm1","vpackssdw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6B /r","V","V","AVX2","","w,r,r","",""
+"VPACKSSDW ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPACKSSDW ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpackssdw ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 6B /r","V","V","AVX512BW+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPACKSSDW zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPACKSSDW zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpackssdw zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 6B /r","V","V","AVX512BW","bscale4,scale64","w,r,r,r","",""
+"VPACKSSWB xmm1, xmmV, xmm2/m128","VPACKSSWB xmm2/m128, xmmV, xmm1","vpacksswb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 63 /r","V","V","AVX","","w,r,r","",""
+"VPACKSSWB xmm1, {k}{z}, xmmV, xmm2/m128","VPACKSSWB xmm2/m128, xmmV, {k}{z}, xmm1","vpacksswb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG 63 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPACKSSWB ymm1, ymmV, ymm2/m256","VPACKSSWB ymm2/m256, ymmV, ymm1","vpacksswb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 63 /r","V","V","AVX2","","w,r,r","",""
+"VPACKSSWB ymm1, {k}{z}, ymmV, ymm2/m256","VPACKSSWB ymm2/m256, ymmV, {k}{z}, ymm1","vpacksswb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG 63 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPACKSSWB zmm1, {k}{z}, zmmV, zmm2/m512","VPACKSSWB zmm2/m512, zmmV, {k}{z}, zmm1","vpacksswb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG 63 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPACKUSDW xmm1, xmmV, xmm2/m128","VPACKUSDW xmm2/m128, xmmV, xmm1","vpackusdw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 2B /r","V","V","AVX","","w,r,r","",""
+"VPACKUSDW xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPACKUSDW xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpackusdw xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 2B /r","V","V","AVX512BW+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPACKUSDW ymm1, ymmV, ymm2/m256","VPACKUSDW ymm2/m256, ymmV, ymm1","vpackusdw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 2B /r","V","V","AVX2","","w,r,r","",""
+"VPACKUSDW ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPACKUSDW ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpackusdw ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 2B /r","V","V","AVX512BW+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPACKUSDW zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPACKUSDW zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpackusdw zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 2B /r","V","V","AVX512BW","bscale4,scale64","w,r,r,r","",""
+"VPACKUSWB xmm1, xmmV, xmm2/m128","VPACKUSWB xmm2/m128, xmmV, xmm1","vpackuswb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 67 /r","V","V","AVX","","w,r,r","",""
+"VPACKUSWB xmm1, {k}{z}, xmmV, xmm2/m128","VPACKUSWB xmm2/m128, xmmV, {k}{z}, xmm1","vpackuswb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG 67 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPACKUSWB ymm1, ymmV, ymm2/m256","VPACKUSWB ymm2/m256, ymmV, ymm1","vpackuswb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 67 /r","V","V","AVX2","","w,r,r","",""
+"VPACKUSWB ymm1, {k}{z}, ymmV, ymm2/m256","VPACKUSWB ymm2/m256, ymmV, {k}{z}, ymm1","vpackuswb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG 67 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPACKUSWB zmm1, {k}{z}, zmmV, zmm2/m512","VPACKUSWB zmm2/m512, zmmV, {k}{z}, zmm1","vpackuswb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG 67 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPADDB xmm1, xmmV, xmm2/m128","VPADDB xmm2/m128, xmmV, xmm1","vpaddb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FC /r","V","V","AVX","","w,r,r","",""
+"VPADDB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDB xmm2/m128, xmmV, {k}{z}, xmm1","vpaddb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG FC /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPADDB ymm1, ymmV, ymm2/m256","VPADDB ymm2/m256, ymmV, ymm1","vpaddb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FC /r","V","V","AVX2","","w,r,r","",""
+"VPADDB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDB ymm2/m256, ymmV, {k}{z}, ymm1","vpaddb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG FC /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPADDB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDB zmm2/m512, zmmV, {k}{z}, zmm1","vpaddb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG FC /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPADDD xmm1, xmmV, xmm2/m128","VPADDD xmm2/m128, xmmV, xmm1","vpaddd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FE /r","V","V","AVX","","w,r,r","",""
+"VPADDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPADDD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpaddd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 FE /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPADDD ymm1, ymmV, ymm2/m256","VPADDD ymm2/m256, ymmV, ymm1","vpaddd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FE /r","V","V","AVX2","","w,r,r","",""
+"VPADDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPADDD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpaddd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 FE /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPADDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPADDD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpaddd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 FE /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPADDQ xmm1, xmmV, xmm2/m128","VPADDQ xmm2/m128, xmmV, xmm1","vpaddq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D4 /r","V","V","AVX","","w,r,r","",""
+"VPADDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPADDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpaddq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 D4 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPADDQ ymm1, ymmV, ymm2/m256","VPADDQ ymm2/m256, ymmV, ymm1","vpaddq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D4 /r","V","V","AVX2","","w,r,r","",""
+"VPADDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPADDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpaddq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 D4 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPADDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPADDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpaddq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 D4 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPADDSB xmm1, xmmV, xmm2/m128","VPADDSB xmm2/m128, xmmV, xmm1","vpaddsb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EC /r","V","V","AVX","","w,r,r","",""
+"VPADDSB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDSB xmm2/m128, xmmV, {k}{z}, xmm1","vpaddsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EC /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPADDSB ymm1, ymmV, ymm2/m256","VPADDSB ymm2/m256, ymmV, ymm1","vpaddsb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EC /r","V","V","AVX2","","w,r,r","",""
+"VPADDSB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDSB ymm2/m256, ymmV, {k}{z}, ymm1","vpaddsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EC /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPADDSB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDSB zmm2/m512, zmmV, {k}{z}, zmm1","vpaddsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EC /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPADDSW xmm1, xmmV, xmm2/m128","VPADDSW xmm2/m128, xmmV, xmm1","vpaddsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG ED /r","V","V","AVX","","w,r,r","",""
+"VPADDSW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDSW xmm2/m128, xmmV, {k}{z}, xmm1","vpaddsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG ED /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPADDSW ymm1, ymmV, ymm2/m256","VPADDSW ymm2/m256, ymmV, ymm1","vpaddsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG ED /r","V","V","AVX2","","w,r,r","",""
+"VPADDSW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDSW ymm2/m256, ymmV, {k}{z}, ymm1","vpaddsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG ED /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPADDSW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDSW zmm2/m512, zmmV, {k}{z}, zmm1","vpaddsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG ED /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPADDUSB xmm1, xmmV, xmm2/m128","VPADDUSB xmm2/m128, xmmV, xmm1","vpaddusb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DC /r","V","V","AVX","","w,r,r","",""
+"VPADDUSB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDUSB xmm2/m128, xmmV, {k}{z}, xmm1","vpaddusb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DC /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPADDUSB ymm1, ymmV, ymm2/m256","VPADDUSB ymm2/m256, ymmV, ymm1","vpaddusb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DC /r","V","V","AVX2","","w,r,r","",""
+"VPADDUSB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDUSB ymm2/m256, ymmV, {k}{z}, ymm1","vpaddusb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DC /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPADDUSB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDUSB zmm2/m512, zmmV, {k}{z}, zmm1","vpaddusb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DC /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPADDUSW xmm1, xmmV, xmm2/m128","VPADDUSW xmm2/m128, xmmV, xmm1","vpaddusw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DD /r","V","V","AVX","","w,r,r","",""
+"VPADDUSW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDUSW xmm2/m128, xmmV, {k}{z}, xmm1","vpaddusw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DD /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPADDUSW ymm1, ymmV, ymm2/m256","VPADDUSW ymm2/m256, ymmV, ymm1","vpaddusw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DD /r","V","V","AVX2","","w,r,r","",""
+"VPADDUSW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDUSW ymm2/m256, ymmV, {k}{z}, ymm1","vpaddusw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DD /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPADDUSW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDUSW zmm2/m512, zmmV, {k}{z}, zmm1","vpaddusw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DD /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPADDW xmm1, xmmV, xmm2/m128","VPADDW xmm2/m128, xmmV, xmm1","vpaddw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FD /r","V","V","AVX","","w,r,r","",""
+"VPADDW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDW xmm2/m128, xmmV, {k}{z}, xmm1","vpaddw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG FD /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPADDW ymm1, ymmV, ymm2/m256","VPADDW ymm2/m256, ymmV, ymm1","vpaddw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FD /r","V","V","AVX2","","w,r,r","",""
+"VPADDW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDW ymm2/m256, ymmV, {k}{z}, ymm1","vpaddw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG FD /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPADDW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDW zmm2/m512, zmmV, {k}{z}, zmm1","vpaddw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG FD /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPALIGNR xmm1, xmmV, xmm2/m128, imm8u","VPALIGNR imm8u, xmm2/m128, xmmV, xmm1","vpalignr imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0F /r ib","V","V","AVX","","w,r,r,r","",""
+"VPALIGNR xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPALIGNR imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","vpalignr imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.WIG 0F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
+"VPALIGNR ymm1, ymmV, ymm2/m256, imm8u","VPALIGNR imm8u, ymm2/m256, ymmV, ymm1","vpalignr imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0F /r ib","V","V","AVX2","","w,r,r,r","",""
+"VPALIGNR ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPALIGNR imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","vpalignr imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.WIG 0F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
+"VPALIGNR zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPALIGNR imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","vpalignr imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.WIG 0F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
+"VPAND xmm1, xmmV, xmm2/m128","VPAND xmm2/m128, xmmV, xmm1","vpand xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DB /r","V","V","AVX","","w,r,r","",""
+"VPAND ymm1, ymmV, ymm2/m256","VPAND ymm2/m256, ymmV, ymm1","vpand ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DB /r","V","V","AVX2","","w,r,r","",""
+"VPANDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPANDD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpandd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 DB /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPANDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPANDD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpandd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 DB /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPANDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPANDD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpandd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 DB /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPANDN xmm1, xmmV, xmm2/m128","VPANDN xmm2/m128, xmmV, xmm1","vpandn xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DF /r","V","V","AVX","","w,r,r","",""
+"VPANDN ymm1, ymmV, ymm2/m256","VPANDN ymm2/m256, ymmV, ymm1","vpandn ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DF /r","V","V","AVX2","","w,r,r","",""
+"VPANDND xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPANDND xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpandnd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 DF /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPANDND ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPANDND ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpandnd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 DF /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPANDND zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPANDND zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpandnd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 DF /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPANDNQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPANDNQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpandnq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 DF /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPANDNQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPANDNQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpandnq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 DF /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPANDNQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPANDNQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpandnq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 DF /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPANDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPANDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpandq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 DB /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPANDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPANDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpandq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 DB /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPANDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPANDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpandq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 DB /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPAVGB xmm1, xmmV, xmm2/m128","VPAVGB xmm2/m128, xmmV, xmm1","vpavgb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E0 /r","V","V","AVX","","w,r,r","",""
+"VPAVGB xmm1, {k}{z}, xmmV, xmm2/m128","VPAVGB xmm2/m128, xmmV, {k}{z}, xmm1","vpavgb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E0 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPAVGB ymm1, ymmV, ymm2/m256","VPAVGB ymm2/m256, ymmV, ymm1","vpavgb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E0 /r","V","V","AVX2","","w,r,r","",""
+"VPAVGB ymm1, {k}{z}, ymmV, ymm2/m256","VPAVGB ymm2/m256, ymmV, {k}{z}, ymm1","vpavgb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E0 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPAVGB zmm1, {k}{z}, zmmV, zmm2/m512","VPAVGB zmm2/m512, zmmV, {k}{z}, zmm1","vpavgb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E0 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPAVGW xmm1, xmmV, xmm2/m128","VPAVGW xmm2/m128, xmmV, xmm1","vpavgw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E3 /r","V","V","AVX","","w,r,r","",""
+"VPAVGW xmm1, {k}{z}, xmmV, xmm2/m128","VPAVGW xmm2/m128, xmmV, {k}{z}, xmm1","vpavgw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E3 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPAVGW ymm1, ymmV, ymm2/m256","VPAVGW ymm2/m256, ymmV, ymm1","vpavgw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E3 /r","V","V","AVX2","","w,r,r","",""
+"VPAVGW ymm1, {k}{z}, ymmV, ymm2/m256","VPAVGW ymm2/m256, ymmV, {k}{z}, ymm1","vpavgw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E3 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPAVGW zmm1, {k}{z}, zmmV, zmm2/m512","VPAVGW zmm2/m512, zmmV, {k}{z}, zmm1","vpavgw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E3 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPBLENDD xmm1, xmmV, xmm2/m128, imm8u","VPBLENDD imm8u, xmm2/m128, xmmV, xmm1","vpblendd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 02 /r ib","V","V","AVX2","","w,r,r,r","",""
+"VPBLENDD ymm1, ymmV, ymm2/m256, imm8u","VPBLENDD imm8u, ymm2/m256, ymmV, ymm1","vpblendd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 02 /r ib","V","V","AVX2","","w,r,r,r","",""
+"VPBLENDMB xmm1, {k}{z}, xmmV, xmm2/m128","VPBLENDMB xmm2/m128, xmmV, {k}{z}, xmm1","vpblendmb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 66 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPBLENDMB ymm1, {k}{z}, ymmV, ymm2/m256","VPBLENDMB ymm2/m256, ymmV, {k}{z}, ymm1","vpblendmb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 66 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPBLENDMB zmm1, {k}{z}, zmmV, zmm2/m512","VPBLENDMB zmm2/m512, zmmV, {k}{z}, zmm1","vpblendmb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 66 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPBLENDMD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPBLENDMD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpblendmd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 64 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPBLENDMD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPBLENDMD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpblendmd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 64 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPBLENDMD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPBLENDMD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpblendmd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 64 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPBLENDMQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPBLENDMQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpblendmq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 64 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPBLENDMQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPBLENDMQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpblendmq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 64 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPBLENDMQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPBLENDMQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpblendmq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 64 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPBLENDMW xmm1, {k}{z}, xmmV, xmm2/m128","VPBLENDMW xmm2/m128, xmmV, {k}{z}, xmm1","vpblendmw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 66 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPBLENDMW ymm1, {k}{z}, ymmV, ymm2/m256","VPBLENDMW ymm2/m256, ymmV, {k}{z}, ymm1","vpblendmw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 66 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPBLENDMW zmm1, {k}{z}, zmmV, zmm2/m512","VPBLENDMW zmm2/m512, zmmV, {k}{z}, zmm1","vpblendmw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 66 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPBLENDVB xmm1, xmmV, xmm2/m128, xmmIH","VPBLENDVB xmmIH, xmm2/m128, xmmV, xmm1","vpblendvb xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4C /r /is4","V","V","AVX","","w,r,r,r","",""
+"VPBLENDVB ymm1, ymmV, ymm2/m256, ymmIH","VPBLENDVB ymmIH, ymm2/m256, ymmV, ymm1","vpblendvb ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4C /r /is4","V","V","AVX2","","w,r,r,r","",""
+"VPBLENDW xmm1, xmmV, xmm2/m128, imm8u","VPBLENDW imm8u, xmm2/m128, xmmV, xmm1","vpblendw imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0E /r ib","V","V","AVX","","w,r,r,r","",""
+"VPBLENDW ymm1, ymmV, ymm2/m256, imm8u","VPBLENDW imm8u, ymm2/m256, ymmV, ymm1","vpblendw imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0E /r ib","V","V","AVX2","","w,r,r,r","",""
+"VPBROADCASTB xmm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, xmm1","vpbroadcastb rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7A /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTB ymm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, ymm1","vpbroadcastb rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7A /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTB zmm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, zmm1","vpbroadcastb rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7A /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"VPBROADCASTB xmm1, xmm2/m8","VPBROADCASTB xmm2/m8, xmm1","vpbroadcastb xmm2/m8, xmm1","VEX.128.66.0F38.W0 78 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTB ymm1, xmm2/m8","VPBROADCASTB xmm2/m8, ymm1","vpbroadcastb xmm2/m8, ymm1","VEX.256.66.0F38.W0 78 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTB xmm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, xmm1","vpbroadcastb xmm2/m8, {k}{z}, xmm1","EVEX.128.66.0F38.W0 78 /r","V","V","AVX512BW+AVX512VL","scale1","w,r,r","",""
+"VPBROADCASTB ymm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, ymm1","vpbroadcastb xmm2/m8, {k}{z}, ymm1","EVEX.256.66.0F38.W0 78 /r","V","V","AVX512BW+AVX512VL","scale1","w,r,r","",""
+"VPBROADCASTB zmm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, zmm1","vpbroadcastb xmm2/m8, {k}{z}, zmm1","EVEX.512.66.0F38.W0 78 /r","V","V","AVX512BW","scale1","w,r,r","",""
+"VPBROADCASTD xmm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, xmm1","vpbroadcastd rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7C /r","V","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTD ymm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, ymm1","vpbroadcastd rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7C /r","V","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTD zmm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, zmm1","vpbroadcastd rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7C /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VPBROADCASTD xmm1, xmm2/m32","VPBROADCASTD xmm2/m32, xmm1","vpbroadcastd xmm2/m32, xmm1","VEX.128.66.0F38.W0 58 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTD ymm1, xmm2/m32","VPBROADCASTD xmm2/m32, ymm1","vpbroadcastd xmm2/m32, ymm1","VEX.256.66.0F38.W0 58 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTD xmm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, xmm1","vpbroadcastd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 58 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPBROADCASTD ymm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, ymm1","vpbroadcastd xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 58 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPBROADCASTD zmm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, zmm1","vpbroadcastd xmm2/m32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 58 /r","V","V","AVX512F","scale4","w,r,r","",""
+"VPBROADCASTMB2Q xmm1, k2","VPBROADCASTMB2Q k2, xmm1","vpbroadcastmb2q k2, xmm1","EVEX.128.F3.0F38.W1 2A /r","V","V","AVX512CD+AVX512VL","modrm_regonly","w,r","",""
+"VPBROADCASTMB2Q ymm1, k2","VPBROADCASTMB2Q k2, ymm1","vpbroadcastmb2q k2, ymm1","EVEX.256.F3.0F38.W1 2A /r","V","V","AVX512CD+AVX512VL","modrm_regonly","w,r","",""
+"VPBROADCASTMB2Q zmm1, k2","VPBROADCASTMB2Q k2, zmm1","vpbroadcastmb2q k2, zmm1","EVEX.512.F3.0F38.W1 2A /r","V","V","AVX512CD","modrm_regonly","w,r","",""
+"VPBROADCASTMW2D xmm1, k2","VPBROADCASTMW2D k2, xmm1","vpbroadcastmw2d k2, xmm1","EVEX.128.F3.0F38.W0 3A /r","V","V","AVX512CD+AVX512VL","modrm_regonly","w,r","",""
+"VPBROADCASTMW2D ymm1, k2","VPBROADCASTMW2D k2, ymm1","vpbroadcastmw2d k2, ymm1","EVEX.256.F3.0F38.W0 3A /r","V","V","AVX512CD+AVX512VL","modrm_regonly","w,r","",""
+"VPBROADCASTMW2D zmm1, k2","VPBROADCASTMW2D k2, zmm1","vpbroadcastmw2d k2, zmm1","EVEX.512.F3.0F38.W0 3A /r","V","V","AVX512CD","modrm_regonly","w,r","",""
+"VPBROADCASTQ xmm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, xmm1","vpbroadcastq rmr64, {k}{z}, xmm1","EVEX.128.66.0F38.W1 7C /r","N.S.","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTQ ymm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, ymm1","vpbroadcastq rmr64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 7C /r","N.S.","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTQ zmm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, zmm1","vpbroadcastq rmr64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 7C /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","",""
+"VPBROADCASTQ xmm1, xmm2/m64","VPBROADCASTQ xmm2/m64, xmm1","vpbroadcastq xmm2/m64, xmm1","VEX.128.66.0F38.W0 59 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTQ ymm1, xmm2/m64","VPBROADCASTQ xmm2/m64, ymm1","vpbroadcastq xmm2/m64, ymm1","VEX.256.66.0F38.W0 59 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTQ xmm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, xmm1","vpbroadcastq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W1 59 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPBROADCASTQ ymm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, ymm1","vpbroadcastq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 59 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPBROADCASTQ zmm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, zmm1","vpbroadcastq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 59 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPBROADCASTW xmm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, xmm1","vpbroadcastw rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7B /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTW ymm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, ymm1","vpbroadcastw rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7B /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTW zmm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, zmm1","vpbroadcastw rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7B /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"VPBROADCASTW xmm1, xmm2/m16","VPBROADCASTW xmm2/m16, xmm1","vpbroadcastw xmm2/m16, xmm1","VEX.128.66.0F38.W0 79 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTW ymm1, xmm2/m16","VPBROADCASTW xmm2/m16, ymm1","vpbroadcastw xmm2/m16, ymm1","VEX.256.66.0F38.W0 79 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTW xmm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, xmm1","vpbroadcastw xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.W0 79 /r","V","V","AVX512BW+AVX512VL","scale2","w,r,r","",""
+"VPBROADCASTW ymm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, ymm1","vpbroadcastw xmm2/m16, {k}{z}, ymm1","EVEX.256.66.0F38.W0 79 /r","V","V","AVX512BW+AVX512VL","scale2","w,r,r","",""
+"VPBROADCASTW zmm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, zmm1","vpbroadcastw xmm2/m16, {k}{z}, zmm1","EVEX.512.66.0F38.W0 79 /r","V","V","AVX512BW","scale2","w,r,r","",""
+"VPCLMULQDQ xmm1, xmmV, xmm2/m128, imm8u","VPCLMULQDQ imm8u, xmm2/m128, xmmV, xmm1","vpclmulqdq imm8u, xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F3A.WIG 44 /r ib","V","V","VPCLMULQDQ+AVX512VL","scale16","w,r,r,r","",""
+"VPCLMULQDQ xmm1, xmmV, xmm2/m128, imm8u","VPCLMULQDQ imm8u, xmm2/m128, xmmV, xmm1","vpclmulqdq imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 44 /r ib","V","V","PCLMULQDQ+AVX","","w,r,r,r","",""
+"VPCLMULQDQ ymm1, ymmV, ymm2/m256, imm8u","VPCLMULQDQ imm8u, ymm2/m256, ymmV, ymm1","vpclmulqdq imm8u, ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F3A.WIG 44 /r ib","V","V","VPCLMULQDQ+AVX512VL","scale32","w,r,r,r","",""
+"VPCLMULQDQ ymm1, ymmV, ymm2/m256, imm8u","VPCLMULQDQ imm8u, ymm2/m256, ymmV, ymm1","vpclmulqdq imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 44 /r ib","V","V","VPCLMULQDQ","","w,r,r,r","",""
+"VPCLMULQDQ zmm1, zmmV, zmm2/m512, imm8u","VPCLMULQDQ imm8u, zmm2/m512, zmmV, zmm1","vpclmulqdq imm8u, zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F3A.WIG 44 /r ib","V","V","VPCLMULQDQ+AVX512F","scale64","w,r,r,r","",""
+"VPCMOV xmm1, xmmV, xmmIH, xmm2/m128","VPCMOV xmm2/m128, xmmIH, xmmV, xmm1","vpcmov xmm2/m128, xmmIH, xmmV, xmm1","XOP.NDS.128.08.W1 A2 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPCMOV xmm1, xmmV, xmm2/m128, xmmIH","VPCMOV xmmIH, xmm2/m128, xmmV, xmm1","vpcmov xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A2 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPCMOV ymm1, ymmV, ymmIH, ymm2/m256","VPCMOV ymm2/m256, ymmIH, ymmV, ymm1","vpcmov ymm2/m256, ymmIH, ymmV, ymm1","XOP.NDS.256.08.W1 A2 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPCMOV ymm1, ymmV, ymm2/m256, ymmIH","VPCMOV ymmIH, ymm2/m256, ymmV, ymm1","vpcmov ymmIH, ymm2/m256, ymmV, ymm1","XOP.NDS.256.08.W0 A2 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPCMPB k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPB imm8u, xmm2/m128, xmmV, {k}, k1","vpcmpb imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 3F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
+"VPCMPB k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPB imm8u, ymm2/m256, ymmV, {k}, k1","vpcmpb imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 3F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
+"VPCMPB k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPB imm8u, zmm2/m512, zmmV, {k}, k1","vpcmpb imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 3F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
+"VPCMPD k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VPCMPD imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","vpcmpd imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 1F /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VPCMPD k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VPCMPD imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","vpcmpd imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 1F /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VPCMPD k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VPCMPD imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","vpcmpd imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 1F /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VPCMPEQB xmm1, xmmV, xmm2/m128","VPCMPEQB xmm2/m128, xmmV, xmm1","vpcmpeqb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 74 /r","V","V","AVX","","w,r,r","",""
+"VPCMPEQB k1, {k}, xmmV, xmm2/m128","VPCMPEQB xmm2/m128, xmmV, {k}, k1","vpcmpeqb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 74 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPCMPEQB ymm1, ymmV, ymm2/m256","VPCMPEQB ymm2/m256, ymmV, ymm1","vpcmpeqb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 74 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPEQB k1, {k}, ymmV, ymm2/m256","VPCMPEQB ymm2/m256, ymmV, {k}, k1","vpcmpeqb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 74 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPCMPEQB k1, {k}, zmmV, zmm2/m512","VPCMPEQB zmm2/m512, zmmV, {k}, k1","vpcmpeqb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 74 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPCMPEQD xmm1, xmmV, xmm2/m128","VPCMPEQD xmm2/m128, xmmV, xmm1","vpcmpeqd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 76 /r","V","V","AVX","","w,r,r","",""
+"VPCMPEQD k1, {k}, xmmV, xmm2/m128/m32bcst","VPCMPEQD xmm2/m128/m32bcst, xmmV, {k}, k1","vpcmpeqd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPCMPEQD ymm1, ymmV, ymm2/m256","VPCMPEQD ymm2/m256, ymmV, ymm1","vpcmpeqd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 76 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPEQD k1, {k}, ymmV, ymm2/m256/m32bcst","VPCMPEQD ymm2/m256/m32bcst, ymmV, {k}, k1","vpcmpeqd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPCMPEQD k1, {k}, zmmV, zmm2/m512/m32bcst","VPCMPEQD zmm2/m512/m32bcst, zmmV, {k}, k1","vpcmpeqd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F.W0 76 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPCMPEQQ xmm1, xmmV, xmm2/m128","VPCMPEQQ xmm2/m128, xmmV, xmm1","vpcmpeqq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 29 /r","V","V","AVX","","w,r,r","",""
+"VPCMPEQQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPCMPEQQ xmm2/m128/m64bcst, xmmV, {k}, k1","vpcmpeqq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W1 29 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPCMPEQQ ymm1, ymmV, ymm2/m256","VPCMPEQQ ymm2/m256, ymmV, ymm1","vpcmpeqq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 29 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPEQQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPCMPEQQ ymm2/m256/m64bcst, ymmV, {k}, k1","vpcmpeqq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W1 29 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPCMPEQQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPCMPEQQ zmm2/m512/m64bcst, zmmV, {k}, k1","vpcmpeqq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W1 29 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPCMPEQW xmm1, xmmV, xmm2/m128","VPCMPEQW xmm2/m128, xmmV, xmm1","vpcmpeqw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 75 /r","V","V","AVX","","w,r,r","",""
+"VPCMPEQW k1, {k}, xmmV, xmm2/m128","VPCMPEQW xmm2/m128, xmmV, {k}, k1","vpcmpeqw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 75 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPCMPEQW ymm1, ymmV, ymm2/m256","VPCMPEQW ymm2/m256, ymmV, ymm1","vpcmpeqw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 75 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPEQW k1, {k}, ymmV, ymm2/m256","VPCMPEQW ymm2/m256, ymmV, {k}, k1","vpcmpeqw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 75 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPCMPEQW k1, {k}, zmmV, zmm2/m512","VPCMPEQW zmm2/m512, zmmV, {k}, k1","vpcmpeqw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 75 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPCMPESTRI xmm1, xmm2/m128, imm8u","VPCMPESTRI imm8u, xmm2/m128, xmm1","vpcmpestri imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 61 /r ib","V","V","AVX","","r,r,r","",""
+"VPCMPESTRM xmm1, xmm2/m128, imm8u","VPCMPESTRM imm8u, xmm2/m128, xmm1","vpcmpestrm imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 60 /r ib","V","V","AVX","","r,r,r","",""
+"VPCMPGTB xmm1, xmmV, xmm2/m128","VPCMPGTB xmm2/m128, xmmV, xmm1","vpcmpgtb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 64 /r","V","V","AVX","","w,r,r","",""
+"VPCMPGTB k1, {k}, xmmV, xmm2/m128","VPCMPGTB xmm2/m128, xmmV, {k}, k1","vpcmpgtb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 64 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPCMPGTB ymm1, ymmV, ymm2/m256","VPCMPGTB ymm2/m256, ymmV, ymm1","vpcmpgtb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 64 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPGTB k1, {k}, ymmV, ymm2/m256","VPCMPGTB ymm2/m256, ymmV, {k}, k1","vpcmpgtb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 64 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPCMPGTB k1, {k}, zmmV, zmm2/m512","VPCMPGTB zmm2/m512, zmmV, {k}, k1","vpcmpgtb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 64 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPCMPGTD xmm1, xmmV, xmm2/m128","VPCMPGTD xmm2/m128, xmmV, xmm1","vpcmpgtd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 66 /r","V","V","AVX","","w,r,r","",""
+"VPCMPGTD k1, {k}, xmmV, xmm2/m128/m32bcst","VPCMPGTD xmm2/m128/m32bcst, xmmV, {k}, k1","vpcmpgtd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F.W0 66 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPCMPGTD ymm1, ymmV, ymm2/m256","VPCMPGTD ymm2/m256, ymmV, ymm1","vpcmpgtd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 66 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPGTD k1, {k}, ymmV, ymm2/m256/m32bcst","VPCMPGTD ymm2/m256/m32bcst, ymmV, {k}, k1","vpcmpgtd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F.W0 66 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPCMPGTD k1, {k}, zmmV, zmm2/m512/m32bcst","VPCMPGTD zmm2/m512/m32bcst, zmmV, {k}, k1","vpcmpgtd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F.W0 66 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPCMPGTQ xmm1, xmmV, xmm2/m128","VPCMPGTQ xmm2/m128, xmmV, xmm1","vpcmpgtq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 37 /r","V","V","AVX","","w,r,r","",""
+"VPCMPGTQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPCMPGTQ xmm2/m128/m64bcst, xmmV, {k}, k1","vpcmpgtq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W1 37 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPCMPGTQ ymm1, ymmV, ymm2/m256","VPCMPGTQ ymm2/m256, ymmV, ymm1","vpcmpgtq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 37 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPGTQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPCMPGTQ ymm2/m256/m64bcst, ymmV, {k}, k1","vpcmpgtq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W1 37 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPCMPGTQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPCMPGTQ zmm2/m512/m64bcst, zmmV, {k}, k1","vpcmpgtq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W1 37 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPCMPGTW xmm1, xmmV, xmm2/m128","VPCMPGTW xmm2/m128, xmmV, xmm1","vpcmpgtw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 65 /r","V","V","AVX","","w,r,r","",""
+"VPCMPGTW k1, {k}, xmmV, xmm2/m128","VPCMPGTW xmm2/m128, xmmV, {k}, k1","vpcmpgtw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 65 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPCMPGTW ymm1, ymmV, ymm2/m256","VPCMPGTW ymm2/m256, ymmV, ymm1","vpcmpgtw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 65 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPGTW k1, {k}, ymmV, ymm2/m256","VPCMPGTW ymm2/m256, ymmV, {k}, k1","vpcmpgtw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 65 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPCMPGTW k1, {k}, zmmV, zmm2/m512","VPCMPGTW zmm2/m512, zmmV, {k}, k1","vpcmpgtw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 65 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPCMPISTRI xmm1, xmm2/m128, imm8u","VPCMPISTRI imm8u, xmm2/m128, xmm1","vpcmpistri imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 63 /r ib","V","V","AVX","","r,r,r","",""
+"VPCMPISTRM xmm1, xmm2/m128, imm8u","VPCMPISTRM imm8u, xmm2/m128, xmm1","vpcmpistrm imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 62 /r ib","V","V","AVX","","r,r,r","",""
+"VPCMPQ k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VPCMPQ imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","vpcmpq imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 1F /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VPCMPQ k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VPCMPQ imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","vpcmpq imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 1F /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VPCMPQ k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VPCMPQ imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","vpcmpq imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 1F /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VPCMPUB k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPUB imm8u, xmm2/m128, xmmV, {k}, k1","vpcmpub imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 3E /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
+"VPCMPUB k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPUB imm8u, ymm2/m256, ymmV, {k}, k1","vpcmpub imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 3E /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
+"VPCMPUB k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPUB imm8u, zmm2/m512, zmmV, {k}, k1","vpcmpub imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 3E /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
+"VPCMPUD k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VPCMPUD imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","vpcmpud imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 1E /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VPCMPUD k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VPCMPUD imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","vpcmpud imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 1E /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VPCMPUD k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VPCMPUD imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","vpcmpud imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 1E /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VPCMPUQ k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VPCMPUQ imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","vpcmpuq imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 1E /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VPCMPUQ k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VPCMPUQ imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","vpcmpuq imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 1E /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VPCMPUQ k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VPCMPUQ imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","vpcmpuq imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 1E /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VPCMPUW k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPUW imm8u, xmm2/m128, xmmV, {k}, k1","vpcmpuw imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 3E /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
+"VPCMPUW k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPUW imm8u, ymm2/m256, ymmV, {k}, k1","vpcmpuw imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 3E /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
+"VPCMPUW k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPUW imm8u, zmm2/m512, zmmV, {k}, k1","vpcmpuw imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 3E /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
+"VPCMPW k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPW imm8u, xmm2/m128, xmmV, {k}, k1","vpcmpw imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 3F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
+"VPCMPW k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPW imm8u, ymm2/m256, ymmV, {k}, k1","vpcmpw imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 3F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
+"VPCMPW k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPW imm8u, zmm2/m512, zmmV, {k}, k1","vpcmpw imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 3F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
+"VPCOMB xmm1, xmmV, xmm2/m128, imm8u","VPCOMB imm8u, xmm2/m128, xmmV, xmm1","vpcomb imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CC /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMD xmm1, xmmV, xmm2/m128, imm8u","VPCOMD imm8u, xmm2/m128, xmmV, xmm1","vpcomd imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CE /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMPRESSB xmm2/m128, {k}{z}, xmm1","VPCOMPRESSB xmm1, {k}{z}, xmm2/m128","vpcompressb xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 63 /r","V","V","AVX512_VBMI2+AVX512VL","scale1","w,r,r","",""
+"VPCOMPRESSB ymm2/m256, {k}{z}, ymm1","VPCOMPRESSB ymm1, {k}{z}, ymm2/m256","vpcompressb ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 63 /r","V","V","AVX512_VBMI2+AVX512VL","scale1","w,r,r","",""
+"VPCOMPRESSB zmm2/m512, {k}{z}, zmm1","VPCOMPRESSB zmm1, {k}{z}, zmm2/m512","vpcompressb zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 63 /r","V","V","AVX512_VBMI2","scale1","w,r,r","",""
+"VPCOMPRESSD xmm2/m128, {k}{z}, xmm1","VPCOMPRESSD xmm1, {k}{z}, xmm2/m128","vpcompressd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 8B /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPCOMPRESSD ymm2/m256, {k}{z}, ymm1","VPCOMPRESSD ymm1, {k}{z}, ymm2/m256","vpcompressd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 8B /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPCOMPRESSD zmm2/m512, {k}{z}, zmm1","VPCOMPRESSD zmm1, {k}{z}, zmm2/m512","vpcompressd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 8B /r","V","V","AVX512F","scale4","w,r,r","",""
+"VPCOMPRESSQ xmm2/m128, {k}{z}, xmm1","VPCOMPRESSQ xmm1, {k}{z}, xmm2/m128","vpcompressq xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 8B /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPCOMPRESSQ ymm2/m256, {k}{z}, ymm1","VPCOMPRESSQ ymm1, {k}{z}, ymm2/m256","vpcompressq ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 8B /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPCOMPRESSQ zmm2/m512, {k}{z}, zmm1","VPCOMPRESSQ zmm1, {k}{z}, zmm2/m512","vpcompressq zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 8B /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPCOMPRESSW xmm2/m128, {k}{z}, xmm1","VPCOMPRESSW xmm1, {k}{z}, xmm2/m128","vpcompressw xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 63 /r","V","V","AVX512_VBMI2+AVX512VL","scale2","w,r,r","",""
+"VPCOMPRESSW ymm2/m256, {k}{z}, ymm1","VPCOMPRESSW ymm1, {k}{z}, ymm2/m256","vpcompressw ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 63 /r","V","V","AVX512_VBMI2+AVX512VL","scale2","w,r,r","",""
+"VPCOMPRESSW zmm2/m512, {k}{z}, zmm1","VPCOMPRESSW zmm1, {k}{z}, zmm2/m512","vpcompressw zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 63 /r","V","V","AVX512_VBMI2","scale2","w,r,r","",""
+"VPCOMQ xmm1, xmmV, xmm2/m128, imm8u","VPCOMQ imm8u, xmm2/m128, xmmV, xmm1","vpcomq imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CF /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMUB xmm1, xmmV, xmm2/m128, imm8u","VPCOMUB imm8u, xmm2/m128, xmmV, xmm1","vpcomub imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EC /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMUD xmm1, xmmV, xmm2/m128, imm8u","VPCOMUD imm8u, xmm2/m128, xmmV, xmm1","vpcomud imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EE /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMUQ xmm1, xmmV, xmm2/m128, imm8u","VPCOMUQ imm8u, xmm2/m128, xmmV, xmm1","vpcomuq imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EF /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMUW xmm1, xmmV, xmm2/m128, imm8u","VPCOMUW imm8u, xmm2/m128, xmmV, xmm1","vpcomuw imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 ED /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMW xmm1, xmmV, xmm2/m128, imm8u","VPCOMW imm8u, xmm2/m128, xmmV, xmm1","vpcomw imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CD /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCONFLICTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPCONFLICTD xmm2/m128/m32bcst, {k}{z}, xmm1","vpconflictd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 C4 /r","V","V","AVX512CD+AVX512VL","bscale4,scale16","w,r,r","",""
+"VPCONFLICTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPCONFLICTD ymm2/m256/m32bcst, {k}{z}, ymm1","vpconflictd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 C4 /r","V","V","AVX512CD+AVX512VL","bscale4,scale32","w,r,r","",""
+"VPCONFLICTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPCONFLICTD zmm2/m512/m32bcst, {k}{z}, zmm1","vpconflictd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 C4 /r","V","V","AVX512CD","bscale4,scale64","w,r,r","",""
+"VPCONFLICTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPCONFLICTQ xmm2/m128/m64bcst, {k}{z}, xmm1","vpconflictq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 C4 /r","V","V","AVX512CD+AVX512VL","bscale8,scale16","w,r,r","",""
+"VPCONFLICTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPCONFLICTQ ymm2/m256/m64bcst, {k}{z}, ymm1","vpconflictq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 C4 /r","V","V","AVX512CD+AVX512VL","bscale8,scale32","w,r,r","",""
+"VPCONFLICTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPCONFLICTQ zmm2/m512/m64bcst, {k}{z}, zmm1","vpconflictq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 C4 /r","V","V","AVX512CD","bscale8,scale64","w,r,r","",""
+"VPDPBUSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPBUSD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpdpbusd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 50 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPDPBUSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPBUSD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpdpbusd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 50 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPDPBUSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPBUSD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpdpbusd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 50 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r,r,r","",""
+"VPDPBUSDS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPBUSDS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpdpbusds xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 51 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPDPBUSDS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPBUSDS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpdpbusds ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 51 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPDPBUSDS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPBUSDS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpdpbusds zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 51 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r,r,r","",""
+"VPDPWSSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPWSSD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpdpwssd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 52 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPDPWSSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPWSSD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpdpwssd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 52 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPDPWSSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPWSSD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpdpwssd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 52 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r,r,r","",""
+"VPDPWSSDS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPWSSDS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpdpwssds xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 53 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPDPWSSDS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPWSSDS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpdpwssds ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 53 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPDPWSSDS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPWSSDS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpdpwssds zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 53 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r,r,r","",""
+"VPERM2F128 ymm1, ymmV, ymm2/m256, imm8u","VPERM2F128 imm8u, ymm2/m256, ymmV, ymm1","vperm2f128 imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 06 /r ib","V","V","AVX","","w,r,r,r","",""
+"VPERM2I128 ymm1, ymmV, ymm2/m256, imm8u","VPERM2I128 imm8u, ymm2/m256, ymmV, ymm1","vperm2i128 imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 46 /r ib","V","V","AVX2","","w,r,r,r","",""
+"VPERMB xmm1, {k}{z}, xmmV, xmm2/m128","VPERMB xmm2/m128, xmmV, {k}{z}, xmm1","vpermb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 8D /r","V","V","AVX512_VBMI+AVX512VL","scale16","w,r,r,r","",""
+"VPERMB ymm1, {k}{z}, ymmV, ymm2/m256","VPERMB ymm2/m256, ymmV, {k}{z}, ymm1","vpermb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 8D /r","V","V","AVX512_VBMI+AVX512VL","scale32","w,r,r,r","",""
+"VPERMB zmm1, {k}{z}, zmmV, zmm2/m512","VPERMB zmm2/m512, zmmV, {k}{z}, zmm1","vpermb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 8D /r","V","V","AVX512_VBMI","scale64","w,r,r,r","",""
+"VPERMD ymm1, ymmV, ymm2/m256","VPERMD ymm2/m256, ymmV, ymm1","vpermd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 36 /r","V","V","AVX2","","w,r,r","",""
+"VPERMD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 36 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPERMD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 36 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPERMI2B xmm1, {k}{z}, xmmV, xmm2/m128","VPERMI2B xmm2/m128, xmmV, {k}{z}, xmm1","vpermi2b xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 75 /r","V","V","AVX512_VBMI+AVX512VL","scale16","rw,r,r,r","",""
+"VPERMI2B ymm1, {k}{z}, ymmV, ymm2/m256","VPERMI2B ymm2/m256, ymmV, {k}{z}, ymm1","vpermi2b ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 75 /r","V","V","AVX512_VBMI+AVX512VL","scale32","rw,r,r,r","",""
+"VPERMI2B zmm1, {k}{z}, zmmV, zmm2/m512","VPERMI2B zmm2/m512, zmmV, {k}{z}, zmm1","vpermi2b zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 75 /r","V","V","AVX512_VBMI","scale64","rw,r,r,r","",""
+"VPERMI2D xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMI2D xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermi2d xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPERMI2D ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMI2D ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermi2d ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPERMI2D zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMI2D zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermi2d zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 76 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VPERMI2PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMI2PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermi2pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 77 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPERMI2PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMI2PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermi2pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 77 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPERMI2PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMI2PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermi2pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 77 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VPERMI2PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMI2PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermi2ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 77 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPERMI2PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMI2PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermi2ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 77 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPERMI2PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMI2PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermi2ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 77 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VPERMI2Q xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMI2Q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermi2q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 76 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPERMI2Q ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMI2Q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermi2q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 76 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPERMI2Q zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMI2Q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermi2q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 76 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VPERMI2W xmm1, {k}{z}, xmmV, xmm2/m128","VPERMI2W xmm2/m128, xmmV, {k}{z}, xmm1","vpermi2w xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 75 /r","V","V","AVX512BW+AVX512VL","scale16","rw,r,r,r","",""
+"VPERMI2W ymm1, {k}{z}, ymmV, ymm2/m256","VPERMI2W ymm2/m256, ymmV, {k}{z}, ymm1","vpermi2w ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 75 /r","V","V","AVX512BW+AVX512VL","scale32","rw,r,r,r","",""
+"VPERMI2W zmm1, {k}{z}, zmmV, zmm2/m512","VPERMI2W zmm2/m512, zmmV, {k}{z}, zmm1","vpermi2w zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 75 /r","V","V","AVX512BW","scale64","rw,r,r,r","",""
+"VPERMIL2PD xmm1, xmmV, xmmIH, xmm2/m128, imm8u","VPERMIL2PD imm8u, xmm2/m128, xmmIH, xmmV, xmm1","vpermil2pd imm8u, xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PD xmm1, xmmV, xmm2/m128, xmmIH, imm8u","VPERMIL2PD imm8u, xmmIH, xmm2/m128, xmmV, xmm1","vpermil2pd imm8u, xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PD ymm1, ymmV, ymmIH, ymm2/m256, imm8u","VPERMIL2PD imm8u, ymm2/m256, ymmIH, ymmV, ymm1","vpermil2pd imm8u, ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PD ymm1, ymmV, ymm2/m256, ymmIH, imm8u","VPERMIL2PD imm8u, ymmIH, ymm2/m256, ymmV, ymm1","vpermil2pd imm8u, ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PS xmm1, xmmV, xmmIH, xmm2/m128, imm8u","VPERMIL2PS imm8u, xmm2/m128, xmmIH, xmmV, xmm1","vpermil2ps imm8u, xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PS xmm1, xmmV, xmm2/m128, xmmIH, imm8u","VPERMIL2PS imm8u, xmmIH, xmm2/m128, xmmV, xmm1","vpermil2ps imm8u, xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PS ymm1, ymmV, ymmIH, ymm2/m256, imm8u","VPERMIL2PS imm8u, ymm2/m256, ymmIH, ymmV, ymm1","vpermil2ps imm8u, ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PS ymm1, ymmV, ymm2/m256, ymmIH, imm8u","VPERMIL2PS imm8u, ymmIH, ymm2/m256, ymmV, ymm1","vpermil2ps imm8u, ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMILPD xmm1, xmm2/m128, imm8u","VPERMILPD imm8u, xmm2/m128, xmm1","vpermilpd imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.W0 05 /r ib","V","V","AVX","","w,r,r","",""
+"VPERMILPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VPERMILPD imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","vpermilpd imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W1 05 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPERMILPD ymm1, ymm2/m256, imm8u","VPERMILPD imm8u, ymm2/m256, ymm1","vpermilpd imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W0 05 /r ib","V","V","AVX","","w,r,r","",""
+"VPERMILPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMILPD imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vpermilpd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 05 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPERMILPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMILPD imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vpermilpd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 05 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPERMILPD xmm1, xmmV, xmm2/m128","VPERMILPD xmm2/m128, xmmV, xmm1","vpermilpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 0D /r","V","V","AVX","","w,r,r","",""
+"VPERMILPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMILPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermilpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 0D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPERMILPD ymm1, ymmV, ymm2/m256","VPERMILPD ymm2/m256, ymmV, ymm1","vpermilpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 0D /r","V","V","AVX","","w,r,r","",""
+"VPERMILPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMILPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermilpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 0D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPERMILPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMILPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermilpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 0D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPERMILPS xmm1, xmm2/m128, imm8u","VPERMILPS imm8u, xmm2/m128, xmm1","vpermilps imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.W0 04 /r ib","V","V","AVX","","w,r,r","",""
+"VPERMILPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VPERMILPS imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","vpermilps imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W0 04 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPERMILPS ymm1, ymm2/m256, imm8u","VPERMILPS imm8u, ymm2/m256, ymm1","vpermilps imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W0 04 /r ib","V","V","AVX","","w,r,r","",""
+"VPERMILPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VPERMILPS imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","vpermilps imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W0 04 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPERMILPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VPERMILPS imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","vpermilps imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W0 04 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPERMILPS xmm1, xmmV, xmm2/m128","VPERMILPS xmm2/m128, xmmV, xmm1","vpermilps xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 0C /r","V","V","AVX","","w,r,r","",""
+"VPERMILPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMILPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermilps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 0C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPERMILPS ymm1, ymmV, ymm2/m256","VPERMILPS ymm2/m256, ymmV, ymm1","vpermilps ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 0C /r","V","V","AVX","","w,r,r","",""
+"VPERMILPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMILPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermilps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 0C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPERMILPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMILPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermilps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 0C /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPERMPD ymm1, ymm2/m256, imm8u","VPERMPD imm8u, ymm2/m256, ymm1","vpermpd imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W1 01 /r ib","V","V","AVX2","","w,r,r","",""
+"VPERMPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMPD imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vpermpd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 01 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPERMPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMPD imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vpermpd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 01 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPERMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 16 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPERMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 16 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPERMPS ymm1, ymmV, ymm2/m256","VPERMPS ymm2/m256, ymmV, ymm1","vpermps ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 16 /r","V","V","AVX2","","w,r,r","",""
+"VPERMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 16 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPERMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 16 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPERMQ ymm1, ymm2/m256, imm8u","VPERMQ imm8u, ymm2/m256, ymm1","vpermq imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W1 00 /r ib","V","V","AVX2","","w,r,r","",""
+"VPERMQ ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vpermq imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 00 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPERMQ zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vpermq imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 00 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPERMQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 36 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPERMQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 36 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPERMT2B xmm1, {k}{z}, xmmV, xmm2/m128","VPERMT2B xmm2/m128, xmmV, {k}{z}, xmm1","vpermt2b xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 7D /r","V","V","AVX512_VBMI+AVX512VL","scale16","rw,r,r,r","",""
+"VPERMT2B ymm1, {k}{z}, ymmV, ymm2/m256","VPERMT2B ymm2/m256, ymmV, {k}{z}, ymm1","vpermt2b ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 7D /r","V","V","AVX512_VBMI+AVX512VL","scale32","rw,r,r,r","",""
+"VPERMT2B zmm1, {k}{z}, zmmV, zmm2/m512","VPERMT2B zmm2/m512, zmmV, {k}{z}, zmm1","vpermt2b zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 7D /r","V","V","AVX512_VBMI","scale64","rw,r,r,r","",""
+"VPERMT2D xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMT2D xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermt2d xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 7E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPERMT2D ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMT2D ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermt2d ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 7E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPERMT2D zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMT2D zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermt2d zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 7E /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VPERMT2PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMT2PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermt2pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPERMT2PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMT2PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermt2pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPERMT2PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMT2PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermt2pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7F /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VPERMT2PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMT2PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermt2ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 7F /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPERMT2PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMT2PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermt2ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 7F /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPERMT2PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMT2PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermt2ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 7F /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VPERMT2Q xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMT2Q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermt2q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPERMT2Q ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMT2Q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermt2q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPERMT2Q zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMT2Q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermt2q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7E /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VPERMT2W xmm1, {k}{z}, xmmV, xmm2/m128","VPERMT2W xmm2/m128, xmmV, {k}{z}, xmm1","vpermt2w xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7D /r","V","V","AVX512BW+AVX512VL","scale16","rw,r,r,r","",""
+"VPERMT2W ymm1, {k}{z}, ymmV, ymm2/m256","VPERMT2W ymm2/m256, ymmV, {k}{z}, ymm1","vpermt2w ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7D /r","V","V","AVX512BW+AVX512VL","scale32","rw,r,r,r","",""
+"VPERMT2W zmm1, {k}{z}, zmmV, zmm2/m512","VPERMT2W zmm2/m512, zmmV, {k}{z}, zmm1","vpermt2w zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7D /r","V","V","AVX512BW","scale64","rw,r,r,r","",""
+"VPERMW xmm1, {k}{z}, xmmV, xmm2/m128","VPERMW xmm2/m128, xmmV, {k}{z}, xmm1","vpermw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 8D /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPERMW ymm1, {k}{z}, ymmV, ymm2/m256","VPERMW ymm2/m256, ymmV, {k}{z}, ymm1","vpermw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 8D /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPERMW zmm1, {k}{z}, zmmV, zmm2/m512","VPERMW zmm2/m512, zmmV, {k}{z}, zmm1","vpermw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 8D /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPEXPANDB xmm1, {k}{z}, xmm2/m128","VPEXPANDB xmm2/m128, {k}{z}, xmm1","vpexpandb xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 62 /r","V","V","AVX512_VBMI2+AVX512VL","scale1","w,r,r","",""
+"VPEXPANDB ymm1, {k}{z}, ymm2/m256","VPEXPANDB ymm2/m256, {k}{z}, ymm1","vpexpandb ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 62 /r","V","V","AVX512_VBMI2+AVX512VL","scale1","w,r,r","",""
+"VPEXPANDB zmm1, {k}{z}, zmm2/m512","VPEXPANDB zmm2/m512, {k}{z}, zmm1","vpexpandb zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 62 /r","V","V","AVX512_VBMI2","scale1","w,r,r","",""
+"VPEXPANDD xmm1, {k}{z}, xmm2/m128","VPEXPANDD xmm2/m128, {k}{z}, xmm1","vpexpandd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 89 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPEXPANDD ymm1, {k}{z}, ymm2/m256","VPEXPANDD ymm2/m256, {k}{z}, ymm1","vpexpandd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 89 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPEXPANDD zmm1, {k}{z}, zmm2/m512","VPEXPANDD zmm2/m512, {k}{z}, zmm1","vpexpandd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 89 /r","V","V","AVX512F","scale4","w,r,r","",""
+"VPEXPANDQ xmm1, {k}{z}, xmm2/m128","VPEXPANDQ xmm2/m128, {k}{z}, xmm1","vpexpandq xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 89 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPEXPANDQ ymm1, {k}{z}, ymm2/m256","VPEXPANDQ ymm2/m256, {k}{z}, ymm1","vpexpandq ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 89 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPEXPANDQ zmm1, {k}{z}, zmm2/m512","VPEXPANDQ zmm2/m512, {k}{z}, zmm1","vpexpandq zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 89 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPEXPANDW xmm1, {k}{z}, xmm2/m128","VPEXPANDW xmm2/m128, {k}{z}, xmm1","vpexpandw xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 62 /r","V","V","AVX512_VBMI2+AVX512VL","scale2","w,r,r","",""
+"VPEXPANDW ymm1, {k}{z}, ymm2/m256","VPEXPANDW ymm2/m256, {k}{z}, ymm1","vpexpandw ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 62 /r","V","V","AVX512_VBMI2+AVX512VL","scale2","w,r,r","",""
+"VPEXPANDW zmm1, {k}{z}, zmm2/m512","VPEXPANDW zmm2/m512, {k}{z}, zmm1","vpexpandw zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 62 /r","V","V","AVX512_VBMI2","scale2","w,r,r","",""
+"VPEXTRB r32/m8, xmm1, imm8u","VPEXTRB imm8u, xmm1, r32/m8","vpextrb imm8u, xmm1, r32/m8","EVEX.128.66.0F3A.WIG 14 /r ib","V","V","AVX512BW+AVX512VL","scale1","w,r,r","",""
+"VPEXTRB r32/m8, xmm1, imm8u","VPEXTRB imm8u, xmm1, r32/m8","vpextrb imm8u, xmm1, r32/m8","VEX.128.66.0F3A.WIG 14 /r ib","V","V","AVX","","w,r,r","",""
+"VPEXTRD r/m32, xmm1, imm8u","VPEXTRD imm8u, xmm1, r/m32","vpextrd imm8u, xmm1, r/m32","EVEX.128.66.0F3A.W0 16 /r ib","V","V","AVX512DQ+AVX512VL","scale4","w,r,r","",""
+"VPEXTRD r/m32, xmm1, imm8u","VPEXTRD imm8u, xmm1, r/m32","vpextrd imm8u, xmm1, r/m32","VEX.128.66.0F3A.W0 16 /r ib","V","V","AVX","","w,r,r","",""
+"VPEXTRQ r/m64, xmm1, imm8u","VPEXTRQ imm8u, xmm1, r/m64","vpextrq imm8u, xmm1, r/m64","EVEX.128.66.0F3A.W1 16 /r ib","N.S.","V","AVX512DQ+AVX512VL","scale8","w,r,r","",""
+"VPEXTRQ r/m64, xmm1, imm8u","VPEXTRQ imm8u, xmm1, r/m64","vpextrq imm8u, xmm1, r/m64","VEX.128.66.0F3A.W1 16 /r ib","N.S.","V","AVX","","w,r,r","",""
+"VPEXTRW r32/m16, xmm1, imm8u","VPEXTRW imm8u, xmm1, r32/m16","vpextrw imm8u, xmm1, r32/m16","EVEX.128.66.0F3A.WIG 15 /r ib","V","V","AVX512BW+AVX512VL","scale2","w,r,r","",""
+"VPEXTRW r32/m16, xmm1, imm8u","VPEXTRW imm8u, xmm1, r32/m16","vpextrw imm8u, xmm1, r32/m16","VEX.128.66.0F3A.WIG 15 /r ib","V","V","AVX","","w,r,r","",""
+"VPEXTRW r32, xmm2, imm8u","VPEXTRW imm8u, xmm2, r32","vpextrw imm8u, xmm2, r32","EVEX.128.66.0F.WIG C5 /r ib","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
+"VPEXTRW r32, xmm2, imm8u","VPEXTRW imm8u, xmm2, r32","vpextrw imm8u, xmm2, r32","VEX.128.66.0F.WIG C5 /r ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPGATHERDD xmm1, {k1-k7}, vm32x","VPGATHERDD vm32x, {k1-k7}, xmm1","vpgatherdd vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 90 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPGATHERDD ymm1, {k1-k7}, vm32y","VPGATHERDD vm32y, {k1-k7}, ymm1","vpgatherdd vm32y, {k1-k7}, ymm1","EVEX.256.66.0F38.W0 90 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPGATHERDD zmm1, {k1-k7}, vm32z","VPGATHERDD vm32z, {k1-k7}, zmm1","vpgatherdd vm32z, {k1-k7}, zmm1","EVEX.512.66.0F38.W0 90 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VPGATHERDD xmm1, vm32x, xmmV","VPGATHERDD xmmV, vm32x, xmm1","vpgatherdd xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W0 90 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERDD ymm1, vm32y, ymmV","VPGATHERDD ymmV, vm32y, ymm1","vpgatherdd ymmV, vm32y, ymm1","VEX.DDS.256.66.0F38.W0 90 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERDQ xmm1, {k1-k7}, vm32x","VPGATHERDQ vm32x, {k1-k7}, xmm1","vpgatherdq vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 90 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPGATHERDQ ymm1, {k1-k7}, vm32x","VPGATHERDQ vm32x, {k1-k7}, ymm1","vpgatherdq vm32x, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 90 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPGATHERDQ zmm1, {k1-k7}, vm32y","VPGATHERDQ vm32y, {k1-k7}, zmm1","vpgatherdq vm32y, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 90 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VPGATHERDQ xmm1, vm32x, xmmV","VPGATHERDQ xmmV, vm32x, xmm1","vpgatherdq xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W1 90 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERDQ ymm1, vm32x, ymmV","VPGATHERDQ ymmV, vm32x, ymm1","vpgatherdq ymmV, vm32x, ymm1","VEX.DDS.256.66.0F38.W1 90 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERQD xmm1, {k1-k7}, vm64x","VPGATHERQD vm64x, {k1-k7}, xmm1","vpgatherqd vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 91 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPGATHERQD xmm1, {k1-k7}, vm64y","VPGATHERQD vm64y, {k1-k7}, xmm1","vpgatherqd vm64y, {k1-k7}, xmm1","EVEX.256.66.0F38.W0 91 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPGATHERQD ymm1, {k1-k7}, vm64z","VPGATHERQD vm64z, {k1-k7}, ymm1","vpgatherqd vm64z, {k1-k7}, ymm1","EVEX.512.66.0F38.W0 91 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VPGATHERQD xmm1, vm64x, xmmV","VPGATHERQD xmmV, vm64x, xmm1","vpgatherqd xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W0 91 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERQD xmm1, vm64y, xmmV","VPGATHERQD xmmV, vm64y, xmm1","vpgatherqd xmmV, vm64y, xmm1","VEX.DDS.256.66.0F38.W0 91 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERQQ xmm1, {k1-k7}, vm64x","VPGATHERQQ vm64x, {k1-k7}, xmm1","vpgatherqq vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 91 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPGATHERQQ ymm1, {k1-k7}, vm64y","VPGATHERQQ vm64y, {k1-k7}, ymm1","vpgatherqq vm64y, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 91 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPGATHERQQ zmm1, {k1-k7}, vm64z","VPGATHERQQ vm64z, {k1-k7}, zmm1","vpgatherqq vm64z, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 91 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VPGATHERQQ xmm1, vm64x, xmmV","VPGATHERQQ xmmV, vm64x, xmm1","vpgatherqq xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W1 91 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERQQ ymm1, vm64y, ymmV","VPGATHERQQ ymmV, vm64y, ymm1","vpgatherqq ymmV, vm64y, ymm1","VEX.DDS.256.66.0F38.W1 91 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPHADDBD xmm1, xmm2/m128","VPHADDBD xmm2/m128, xmm1","vphaddbd xmm2/m128, xmm1","XOP.128.09.W0 C2 /r","V","V","XOP","amd","w,r","",""
+"VPHADDBQ xmm1, xmm2/m128","VPHADDBQ xmm2/m128, xmm1","vphaddbq xmm2/m128, xmm1","XOP.128.09.W0 C3 /r","V","V","XOP","amd","w,r","",""
+"VPHADDBW xmm1, xmm2/m128","VPHADDBW xmm2/m128, xmm1","vphaddbw xmm2/m128, xmm1","XOP.128.09.W0 C1 /r","V","V","XOP","amd","w,r","",""
+"VPHADDD xmm1, xmmV, xmm2/m128","VPHADDD xmm2/m128, xmmV, xmm1","vphaddd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 02 /r","V","V","AVX","","w,r,r","",""
+"VPHADDD ymm1, ymmV, ymm2/m256","VPHADDD ymm2/m256, ymmV, ymm1","vphaddd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 02 /r","V","V","AVX2","","w,r,r","",""
+"VPHADDDQ xmm1, xmm2/m128","VPHADDDQ xmm2/m128, xmm1","vphadddq xmm2/m128, xmm1","XOP.128.09.W0 CB /r","V","V","XOP","amd","w,r","",""
+"VPHADDSW xmm1, xmmV, xmm2/m128","VPHADDSW xmm2/m128, xmmV, xmm1","vphaddsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 03 /r","V","V","AVX","","w,r,r","",""
+"VPHADDSW ymm1, ymmV, ymm2/m256","VPHADDSW ymm2/m256, ymmV, ymm1","vphaddsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 03 /r","V","V","AVX2","","w,r,r","",""
+"VPHADDUBD xmm1, xmm2/m128","VPHADDUBD xmm2/m128, xmm1","vphaddubd xmm2/m128, xmm1","XOP.128.09.W0 D2 /r","V","V","XOP","amd","w,r","",""
+"VPHADDUBQ xmm1, xmm2/m128","VPHADDUBQ xmm2/m128, xmm1","vphaddubq xmm2/m128, xmm1","XOP.128.09.W0 D3 /r","V","V","XOP","amd","w,r","",""
+"VPHADDUBW xmm1, xmm2/m128","VPHADDUBW xmm2/m128, xmm1","vphaddubw xmm2/m128, xmm1","XOP.128.09.W0 D1 /r","V","V","XOP","amd","w,r","",""
+"VPHADDUDQ xmm1, xmm2/m128","VPHADDUDQ xmm2/m128, xmm1","vphaddudq xmm2/m128, xmm1","XOP.128.09.W0 DB /r","V","V","XOP","amd","w,r","",""
+"VPHADDUWD xmm1, xmm2/m128","VPHADDUWD xmm2/m128, xmm1","vphadduwd xmm2/m128, xmm1","XOP.128.09.W0 D6 /r","V","V","XOP","amd","w,r","",""
+"VPHADDUWQ xmm1, xmm2/m128","VPHADDUWQ xmm2/m128, xmm1","vphadduwq xmm2/m128, xmm1","XOP.128.09.W0 D7 /r","V","V","XOP","amd","w,r","",""
+"VPHADDW xmm1, xmmV, xmm2/m128","VPHADDW xmm2/m128, xmmV, xmm1","vphaddw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 01 /r","V","V","AVX","","w,r,r","",""
+"VPHADDW ymm1, ymmV, ymm2/m256","VPHADDW ymm2/m256, ymmV, ymm1","vphaddw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 01 /r","V","V","AVX2","","w,r,r","",""
+"VPHADDWD xmm1, xmm2/m128","VPHADDWD xmm2/m128, xmm1","vphaddwd xmm2/m128, xmm1","XOP.128.09.W0 C6 /r","V","V","XOP","amd","w,r","",""
+"VPHADDWQ xmm1, xmm2/m128","VPHADDWQ xmm2/m128, xmm1","vphaddwq xmm2/m128, xmm1","XOP.128.09.W0 C7 /r","V","V","XOP","amd","w,r","",""
+"VPHMINPOSUW xmm1, xmm2/m128","VPHMINPOSUW xmm2/m128, xmm1","vphminposuw xmm2/m128, xmm1","VEX.128.66.0F38.WIG 41 /r","V","V","AVX","","w,r","",""
+"VPHSUBBW xmm1, xmm2/m128","VPHSUBBW xmm2/m128, xmm1","vphsubbw xmm2/m128, xmm1","XOP.128.09.W0 E1 /r","V","V","XOP","amd","w,r","",""
+"VPHSUBD xmm1, xmmV, xmm2/m128","VPHSUBD xmm2/m128, xmmV, xmm1","vphsubd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 06 /r","V","V","AVX","","w,r,r","",""
+"VPHSUBD ymm1, ymmV, ymm2/m256","VPHSUBD ymm2/m256, ymmV, ymm1","vphsubd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 06 /r","V","V","AVX2","","w,r,r","",""
+"VPHSUBDQ xmm1, xmm2/m128","VPHSUBDQ xmm2/m128, xmm1","vphsubdq xmm2/m128, xmm1","XOP.128.09.W0 E3 /r","V","V","XOP","amd","w,r","",""
+"VPHSUBSW xmm1, xmmV, xmm2/m128","VPHSUBSW xmm2/m128, xmmV, xmm1","vphsubsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 07 /r","V","V","AVX","","w,r,r","",""
+"VPHSUBSW ymm1, ymmV, ymm2/m256","VPHSUBSW ymm2/m256, ymmV, ymm1","vphsubsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 07 /r","V","V","AVX2","","w,r,r","",""
+"VPHSUBW xmm1, xmmV, xmm2/m128","VPHSUBW xmm2/m128, xmmV, xmm1","vphsubw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 05 /r","V","V","AVX","","w,r,r","",""
+"VPHSUBW ymm1, ymmV, ymm2/m256","VPHSUBW ymm2/m256, ymmV, ymm1","vphsubw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 05 /r","V","V","AVX2","","w,r,r","",""
+"VPHSUBWD xmm1, xmm2/m128","VPHSUBWD xmm2/m128, xmm1","vphsubwd xmm2/m128, xmm1","XOP.128.09.W0 E2 /r","V","V","XOP","amd","w,r","",""
+"VPINSRB xmm1, xmmV, r32/m8, imm8u","VPINSRB imm8u, r32/m8, xmmV, xmm1","vpinsrb imm8u, r32/m8, xmmV, xmm1","EVEX.NDS.128.66.0F3A.WIG 20 /r ib","V","V","AVX512BW+AVX512VL","scale1","w,r,r,r","",""
+"VPINSRB xmm1, xmmV, r32/m8, imm8u","VPINSRB imm8u, r32/m8, xmmV, xmm1","vpinsrb imm8u, r32/m8, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 20 /r ib","V","V","AVX","","w,r,r,r","",""
+"VPINSRD xmm1, xmmV, r/m32, imm8u","VPINSRD imm8u, r/m32, xmmV, xmm1","vpinsrd imm8u, r/m32, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W0 22 /r ib","V","V","AVX512DQ+AVX512VL","scale4","w,r,r,r","",""
+"VPINSRD xmm1, xmmV, r/m32, imm8u","VPINSRD imm8u, r/m32, xmmV, xmm1","vpinsrd imm8u, r/m32, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 22 /r ib","V","V","AVX","","w,r,r,r","",""
+"VPINSRQ xmm1, xmmV, r/m64, imm8u","VPINSRQ imm8u, r/m64, xmmV, xmm1","vpinsrq imm8u, r/m64, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W1 22 /r ib","N.S.","V","AVX512DQ+AVX512VL","scale8","w,r,r,r","",""
+"VPINSRQ xmm1, xmmV, r/m64, imm8u","VPINSRQ imm8u, r/m64, xmmV, xmm1","vpinsrq imm8u, r/m64, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 22 /r ib","N.S.","V","AVX","","w,r,r,r","",""
+"VPINSRW xmm1, xmmV, r32/m16, imm8u","VPINSRW imm8u, r32/m16, xmmV, xmm1","vpinsrw imm8u, r32/m16, xmmV, xmm1","EVEX.NDS.128.66.0F.WIG C4 /r ib","V","V","AVX512BW+AVX512VL","scale2","w,r,r,r","",""
+"VPINSRW xmm1, xmmV, r32/m16, imm8u","VPINSRW imm8u, r32/m16, xmmV, xmm1","vpinsrw imm8u, r32/m16, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C4 /r ib","V","V","AVX","","w,r,r,r","",""
+"VPLZCNTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPLZCNTD xmm2/m128/m32bcst, {k}{z}, xmm1","vplzcntd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 44 /r","V","V","AVX512CD+AVX512VL","bscale4,scale16","w,r,r","",""
+"VPLZCNTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPLZCNTD ymm2/m256/m32bcst, {k}{z}, ymm1","vplzcntd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 44 /r","V","V","AVX512CD+AVX512VL","bscale4,scale32","w,r,r","",""
+"VPLZCNTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPLZCNTD zmm2/m512/m32bcst, {k}{z}, zmm1","vplzcntd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 44 /r","V","V","AVX512CD","bscale4,scale64","w,r,r","",""
+"VPLZCNTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPLZCNTQ xmm2/m128/m64bcst, {k}{z}, xmm1","vplzcntq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 44 /r","V","V","AVX512CD+AVX512VL","bscale8,scale16","w,r,r","",""
+"VPLZCNTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPLZCNTQ ymm2/m256/m64bcst, {k}{z}, ymm1","vplzcntq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 44 /r","V","V","AVX512CD+AVX512VL","bscale8,scale32","w,r,r","",""
+"VPLZCNTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPLZCNTQ zmm2/m512/m64bcst, {k}{z}, zmm1","vplzcntq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 44 /r","V","V","AVX512CD","bscale8,scale64","w,r,r","",""
+"VPMACSDD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDD xmmIH, xmm2/m128, xmmV, xmm1","vpmacsdd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 9E /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSDQH xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDQH xmmIH, xmm2/m128, xmmV, xmm1","vpmacsdqh xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 9F /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSDQL xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDQL xmmIH, xmm2/m128, xmmV, xmm1","vpmacsdql xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 97 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSSDD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDD xmmIH, xmm2/m128, xmmV, xmm1","vpmacssdd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 8E /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSSDQH xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDQH xmmIH, xmm2/m128, xmmV, xmm1","vpmacssdqh xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 8F /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSSDQL xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDQL xmmIH, xmm2/m128, xmmV, xmm1","vpmacssdql xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 87 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSWD xmmIH, xmm2/m128, xmmV, xmm1","vpmacsswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 86 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSSWW xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSWW xmmIH, xmm2/m128, xmmV, xmm1","vpmacssww xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 85 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSWD xmmIH, xmm2/m128, xmmV, xmm1","vpmacswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 96 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSWW xmm1, xmmV, xmm2/m128, xmmIH","VPMACSWW xmmIH, xmm2/m128, xmmV, xmm1","vpmacsww xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 95 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMADCSSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMADCSSWD xmmIH, xmm2/m128, xmmV, xmm1","vpmadcsswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A6 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMADCSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMADCSWD xmmIH, xmm2/m128, xmmV, xmm1","vpmadcswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 B6 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMADD52HUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMADD52HUQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmadd52huq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B5 /r","V","V","AVX512_IFMA+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPMADD52HUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMADD52HUQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmadd52huq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B5 /r","V","V","AVX512_IFMA+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPMADD52HUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMADD52HUQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmadd52huq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B5 /r","V","V","AVX512_IFMA","bscale8,scale64","rw,r,r,r","",""
+"VPMADD52LUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMADD52LUQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmadd52luq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B4 /r","V","V","AVX512_IFMA+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPMADD52LUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMADD52LUQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmadd52luq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B4 /r","V","V","AVX512_IFMA+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPMADD52LUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMADD52LUQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmadd52luq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B4 /r","V","V","AVX512_IFMA","bscale8,scale64","rw,r,r,r","",""
+"VPMADDUBSW xmm1, xmmV, xmm2/m128","VPMADDUBSW xmm2/m128, xmmV, xmm1","vpmaddubsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 04 /r","V","V","AVX","","w,r,r","",""
+"VPMADDUBSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMADDUBSW xmm2/m128, xmmV, {k}{z}, xmm1","vpmaddubsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 04 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMADDUBSW ymm1, ymmV, ymm2/m256","VPMADDUBSW ymm2/m256, ymmV, ymm1","vpmaddubsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 04 /r","V","V","AVX2","","w,r,r","",""
+"VPMADDUBSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMADDUBSW ymm2/m256, ymmV, {k}{z}, ymm1","vpmaddubsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 04 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMADDUBSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMADDUBSW zmm2/m512, zmmV, {k}{z}, zmm1","vpmaddubsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 04 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMADDWD xmm1, xmmV, xmm2/m128","VPMADDWD xmm2/m128, xmmV, xmm1","vpmaddwd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F5 /r","V","V","AVX","","w,r,r","",""
+"VPMADDWD xmm1, {k}{z}, xmmV, xmm2/m128","VPMADDWD xmm2/m128, xmmV, {k}{z}, xmm1","vpmaddwd xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F5 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMADDWD ymm1, ymmV, ymm2/m256","VPMADDWD ymm2/m256, ymmV, ymm1","vpmaddwd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F5 /r","V","V","AVX2","","w,r,r","",""
+"VPMADDWD ymm1, {k}{z}, ymmV, ymm2/m256","VPMADDWD ymm2/m256, ymmV, {k}{z}, ymm1","vpmaddwd ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F5 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMADDWD zmm1, {k}{z}, zmmV, zmm2/m512","VPMADDWD zmm2/m512, zmmV, {k}{z}, zmm1","vpmaddwd zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F5 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMASKMOVD xmm1, xmmV, m128","VPMASKMOVD m128, xmmV, xmm1","vpmaskmovd m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 8C /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVD ymm1, ymmV, m256","VPMASKMOVD m256, ymmV, ymm1","vpmaskmovd m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 8C /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVD m128, xmmV, xmm1","VPMASKMOVD xmm1, xmmV, m128","vpmaskmovd xmm1, xmmV, m128","VEX.NDS.128.66.0F38.W0 8E /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVD m256, ymmV, ymm1","VPMASKMOVD ymm1, ymmV, m256","vpmaskmovd ymm1, ymmV, m256","VEX.NDS.256.66.0F38.W0 8E /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVQ xmm1, xmmV, m128","VPMASKMOVQ m128, xmmV, xmm1","vpmaskmovq m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 8C /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVQ ymm1, ymmV, m256","VPMASKMOVQ m256, ymmV, ymm1","vpmaskmovq m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 8C /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVQ m128, xmmV, xmm1","VPMASKMOVQ xmm1, xmmV, m128","vpmaskmovq xmm1, xmmV, m128","VEX.NDS.128.66.0F38.W1 8E /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVQ m256, ymmV, ymm1","VPMASKMOVQ ymm1, ymmV, m256","vpmaskmovq ymm1, ymmV, m256","VEX.NDS.256.66.0F38.W1 8E /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMAXSB xmm1, xmmV, xmm2/m128","VPMAXSB xmm2/m128, xmmV, xmm1","vpmaxsb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3C /r","V","V","AVX","","w,r,r","",""
+"VPMAXSB xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXSB xmm2/m128, xmmV, {k}{z}, xmm1","vpmaxsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3C /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMAXSB ymm1, ymmV, ymm2/m256","VPMAXSB ymm2/m256, ymmV, ymm1","vpmaxsb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3C /r","V","V","AVX2","","w,r,r","",""
+"VPMAXSB ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXSB ymm2/m256, ymmV, {k}{z}, ymm1","vpmaxsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3C /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMAXSB zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXSB zmm2/m512, zmmV, {k}{z}, zmm1","vpmaxsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3C /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMAXSD xmm1, xmmV, xmm2/m128","VPMAXSD xmm2/m128, xmmV, xmm1","vpmaxsd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3D /r","V","V","AVX","","w,r,r","",""
+"VPMAXSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMAXSD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpmaxsd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 3D /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPMAXSD ymm1, ymmV, ymm2/m256","VPMAXSD ymm2/m256, ymmV, ymm1","vpmaxsd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3D /r","V","V","AVX2","","w,r,r","",""
+"VPMAXSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMAXSD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpmaxsd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 3D /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPMAXSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMAXSD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpmaxsd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 3D /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPMAXSQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMAXSQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmaxsq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 3D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMAXSQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMAXSQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmaxsq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 3D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMAXSQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMAXSQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmaxsq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 3D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPMAXSW xmm1, xmmV, xmm2/m128","VPMAXSW xmm2/m128, xmmV, xmm1","vpmaxsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EE /r","V","V","AVX","","w,r,r","",""
+"VPMAXSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXSW xmm2/m128, xmmV, {k}{z}, xmm1","vpmaxsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EE /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMAXSW ymm1, ymmV, ymm2/m256","VPMAXSW ymm2/m256, ymmV, ymm1","vpmaxsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EE /r","V","V","AVX2","","w,r,r","",""
+"VPMAXSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXSW ymm2/m256, ymmV, {k}{z}, ymm1","vpmaxsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EE /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMAXSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXSW zmm2/m512, zmmV, {k}{z}, zmm1","vpmaxsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EE /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMAXUB xmm1, xmmV, xmm2/m128","VPMAXUB xmm2/m128, xmmV, xmm1","vpmaxub xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DE /r","V","V","AVX","","w,r,r","",""
+"VPMAXUB xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXUB xmm2/m128, xmmV, {k}{z}, xmm1","vpmaxub xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DE /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMAXUB ymm1, ymmV, ymm2/m256","VPMAXUB ymm2/m256, ymmV, ymm1","vpmaxub ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DE /r","V","V","AVX2","","w,r,r","",""
+"VPMAXUB ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXUB ymm2/m256, ymmV, {k}{z}, ymm1","vpmaxub ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DE /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMAXUB zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXUB zmm2/m512, zmmV, {k}{z}, zmm1","vpmaxub zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DE /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMAXUD xmm1, xmmV, xmm2/m128","VPMAXUD xmm2/m128, xmmV, xmm1","vpmaxud xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3F /r","V","V","AVX","","w,r,r","",""
+"VPMAXUD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMAXUD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpmaxud xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 3F /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPMAXUD ymm1, ymmV, ymm2/m256","VPMAXUD ymm2/m256, ymmV, ymm1","vpmaxud ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3F /r","V","V","AVX2","","w,r,r","",""
+"VPMAXUD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMAXUD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpmaxud ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 3F /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPMAXUD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMAXUD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpmaxud zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 3F /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPMAXUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMAXUQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmaxuq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 3F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMAXUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMAXUQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmaxuq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 3F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMAXUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMAXUQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmaxuq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 3F /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPMAXUW xmm1, xmmV, xmm2/m128","VPMAXUW xmm2/m128, xmmV, xmm1","vpmaxuw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3E /r","V","V","AVX","","w,r,r","",""
+"VPMAXUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXUW xmm2/m128, xmmV, {k}{z}, xmm1","vpmaxuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3E /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMAXUW ymm1, ymmV, ymm2/m256","VPMAXUW ymm2/m256, ymmV, ymm1","vpmaxuw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3E /r","V","V","AVX2","","w,r,r","",""
+"VPMAXUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXUW ymm2/m256, ymmV, {k}{z}, ymm1","vpmaxuw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3E /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMAXUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXUW zmm2/m512, zmmV, {k}{z}, zmm1","vpmaxuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3E /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMINSB xmm1, xmmV, xmm2/m128","VPMINSB xmm2/m128, xmmV, xmm1","vpminsb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 38 /r","V","V","AVX","","w,r,r","",""
+"VPMINSB xmm1, {k}{z}, xmmV, xmm2/m128","VPMINSB xmm2/m128, xmmV, {k}{z}, xmm1","vpminsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 38 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMINSB ymm1, ymmV, ymm2/m256","VPMINSB ymm2/m256, ymmV, ymm1","vpminsb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 38 /r","V","V","AVX2","","w,r,r","",""
+"VPMINSB ymm1, {k}{z}, ymmV, ymm2/m256","VPMINSB ymm2/m256, ymmV, {k}{z}, ymm1","vpminsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 38 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMINSB zmm1, {k}{z}, zmmV, zmm2/m512","VPMINSB zmm2/m512, zmmV, {k}{z}, zmm1","vpminsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 38 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMINSD xmm1, xmmV, xmm2/m128","VPMINSD xmm2/m128, xmmV, xmm1","vpminsd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 39 /r","V","V","AVX","","w,r,r","",""
+"VPMINSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMINSD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpminsd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 39 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPMINSD ymm1, ymmV, ymm2/m256","VPMINSD ymm2/m256, ymmV, ymm1","vpminsd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 39 /r","V","V","AVX2","","w,r,r","",""
+"VPMINSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMINSD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpminsd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 39 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPMINSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMINSD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpminsd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 39 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPMINSQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMINSQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpminsq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 39 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMINSQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMINSQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpminsq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 39 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMINSQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMINSQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpminsq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 39 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPMINSW xmm1, xmmV, xmm2/m128","VPMINSW xmm2/m128, xmmV, xmm1","vpminsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EA /r","V","V","AVX","","w,r,r","",""
+"VPMINSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMINSW xmm2/m128, xmmV, {k}{z}, xmm1","vpminsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EA /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMINSW ymm1, ymmV, ymm2/m256","VPMINSW ymm2/m256, ymmV, ymm1","vpminsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EA /r","V","V","AVX2","","w,r,r","",""
+"VPMINSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMINSW ymm2/m256, ymmV, {k}{z}, ymm1","vpminsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EA /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMINSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMINSW zmm2/m512, zmmV, {k}{z}, zmm1","vpminsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EA /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMINUB xmm1, xmmV, xmm2/m128","VPMINUB xmm2/m128, xmmV, xmm1","vpminub xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DA /r","V","V","AVX","","w,r,r","",""
+"VPMINUB xmm1, {k}{z}, xmmV, xmm2/m128","VPMINUB xmm2/m128, xmmV, {k}{z}, xmm1","vpminub xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DA /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMINUB ymm1, ymmV, ymm2/m256","VPMINUB ymm2/m256, ymmV, ymm1","vpminub ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DA /r","V","V","AVX2","","w,r,r","",""
+"VPMINUB ymm1, {k}{z}, ymmV, ymm2/m256","VPMINUB ymm2/m256, ymmV, {k}{z}, ymm1","vpminub ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DA /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMINUB zmm1, {k}{z}, zmmV, zmm2/m512","VPMINUB zmm2/m512, zmmV, {k}{z}, zmm1","vpminub zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DA /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMINUD xmm1, xmmV, xmm2/m128","VPMINUD xmm2/m128, xmmV, xmm1","vpminud xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3B /r","V","V","AVX","","w,r,r","",""
+"VPMINUD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMINUD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpminud xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 3B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPMINUD ymm1, ymmV, ymm2/m256","VPMINUD ymm2/m256, ymmV, ymm1","vpminud ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3B /r","V","V","AVX2","","w,r,r","",""
+"VPMINUD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMINUD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpminud ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 3B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPMINUD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMINUD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpminud zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 3B /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPMINUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMINUQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpminuq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 3B /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMINUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMINUQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpminuq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 3B /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMINUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMINUQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpminuq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 3B /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPMINUW xmm1, xmmV, xmm2/m128","VPMINUW xmm2/m128, xmmV, xmm1","vpminuw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3A /r","V","V","AVX","","w,r,r","",""
+"VPMINUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMINUW xmm2/m128, xmmV, {k}{z}, xmm1","vpminuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3A /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMINUW ymm1, ymmV, ymm2/m256","VPMINUW ymm2/m256, ymmV, ymm1","vpminuw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3A /r","V","V","AVX2","","w,r,r","",""
+"VPMINUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMINUW ymm2/m256, ymmV, {k}{z}, ymm1","vpminuw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3A /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMINUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMINUW zmm2/m512, zmmV, {k}{z}, zmm1","vpminuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3A /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMOVB2M k1, xmm2","VPMOVB2M xmm2, k1","vpmovb2m xmm2, k1","EVEX.128.F3.0F38.W0 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVB2M k1, ymm2","VPMOVB2M ymm2, k1","vpmovb2m ymm2, k1","EVEX.256.F3.0F38.W0 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVB2M k1, zmm2","VPMOVB2M zmm2, k1","vpmovb2m zmm2, k1","EVEX.512.F3.0F38.W0 29 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"VPMOVD2M k1, xmm2","VPMOVD2M xmm2, k1","vpmovd2m xmm2, k1","EVEX.128.F3.0F38.W0 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVD2M k1, ymm2","VPMOVD2M ymm2, k1","vpmovd2m ymm2, k1","EVEX.256.F3.0F38.W0 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVD2M k1, zmm2","VPMOVD2M zmm2, k1","vpmovd2m zmm2, k1","EVEX.512.F3.0F38.W0 39 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"VPMOVDB xmm2/m32, {k}{z}, xmm1","VPMOVDB xmm1, {k}{z}, xmm2/m32","vpmovdb xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 31 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVDB xmm2/m64, {k}{z}, ymm1","VPMOVDB ymm1, {k}{z}, xmm2/m64","vpmovdb ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 31 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVDB xmm2/m128, {k}{z}, zmm1","VPMOVDB zmm1, {k}{z}, xmm2/m128","vpmovdb zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 31 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVDW xmm2/m64, {k}{z}, xmm1","VPMOVDW xmm1, {k}{z}, xmm2/m64","vpmovdw xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 33 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVDW xmm2/m128, {k}{z}, ymm1","VPMOVDW ymm1, {k}{z}, xmm2/m128","vpmovdw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 33 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVDW ymm2/m256, {k}{z}, zmm1","VPMOVDW zmm1, {k}{z}, ymm2/m256","vpmovdw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 33 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVM2B xmm1, k2","VPMOVM2B k2, xmm1","vpmovm2b k2, xmm1","EVEX.128.F3.0F38.W0 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2B ymm1, k2","VPMOVM2B k2, ymm1","vpmovm2b k2, ymm1","EVEX.256.F3.0F38.W0 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2B zmm1, k2","VPMOVM2B k2, zmm1","vpmovm2b k2, zmm1","EVEX.512.F3.0F38.W0 28 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"VPMOVM2D xmm1, k2","VPMOVM2D k2, xmm1","vpmovm2d k2, xmm1","EVEX.128.F3.0F38.W0 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2D ymm1, k2","VPMOVM2D k2, ymm1","vpmovm2d k2, ymm1","EVEX.256.F3.0F38.W0 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2D zmm1, k2","VPMOVM2D k2, zmm1","vpmovm2d k2, zmm1","EVEX.512.F3.0F38.W0 38 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"VPMOVM2Q xmm1, k2","VPMOVM2Q k2, xmm1","vpmovm2q k2, xmm1","EVEX.128.F3.0F38.W1 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2Q ymm1, k2","VPMOVM2Q k2, ymm1","vpmovm2q k2, ymm1","EVEX.256.F3.0F38.W1 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2Q zmm1, k2","VPMOVM2Q k2, zmm1","vpmovm2q k2, zmm1","EVEX.512.F3.0F38.W1 38 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"VPMOVM2W xmm1, k2","VPMOVM2W k2, xmm1","vpmovm2w k2, xmm1","EVEX.128.F3.0F38.W1 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2W ymm1, k2","VPMOVM2W k2, ymm1","vpmovm2w k2, ymm1","EVEX.256.F3.0F38.W1 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2W zmm1, k2","VPMOVM2W k2, zmm1","vpmovm2w k2, zmm1","EVEX.512.F3.0F38.W1 28 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"VPMOVMSKB r32, xmm2","VPMOVMSKB xmm2, r32","vpmovmskb xmm2, r32","VEX.128.66.0F.WIG D7 /r","V","V","AVX","modrm_regonly","w,r","",""
+"VPMOVMSKB r32, ymm2","VPMOVMSKB ymm2, r32","vpmovmskb ymm2, r32","VEX.256.66.0F.WIG D7 /r","V","V","AVX2","modrm_regonly","w,r","",""
+"VPMOVQ2M k1, xmm2","VPMOVQ2M xmm2, k1","vpmovq2m xmm2, k1","EVEX.128.F3.0F38.W1 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVQ2M k1, ymm2","VPMOVQ2M ymm2, k1","vpmovq2m ymm2, k1","EVEX.256.F3.0F38.W1 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVQ2M k1, zmm2","VPMOVQ2M zmm2, k1","vpmovq2m zmm2, k1","EVEX.512.F3.0F38.W1 39 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"VPMOVQB xmm2/m16, {k}{z}, xmm1","VPMOVQB xmm1, {k}{z}, xmm2/m16","vpmovqb xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 32 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
+"VPMOVQB xmm2/m32, {k}{z}, ymm1","VPMOVQB ymm1, {k}{z}, xmm2/m32","vpmovqb ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 32 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVQB xmm2/m64, {k}{z}, zmm1","VPMOVQB zmm1, {k}{z}, xmm2/m64","vpmovqb zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 32 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPMOVQD xmm2/m64, {k}{z}, xmm1","VPMOVQD xmm1, {k}{z}, xmm2/m64","vpmovqd xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 35 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVQD xmm2/m128, {k}{z}, ymm1","VPMOVQD ymm1, {k}{z}, xmm2/m128","vpmovqd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 35 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVQD ymm2/m256, {k}{z}, zmm1","VPMOVQD zmm1, {k}{z}, ymm2/m256","vpmovqd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 35 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVQW xmm2/m32, {k}{z}, xmm1","VPMOVQW xmm1, {k}{z}, xmm2/m32","vpmovqw xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 34 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVQW xmm2/m64, {k}{z}, ymm1","VPMOVQW ymm1, {k}{z}, xmm2/m64","vpmovqw ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 34 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVQW xmm2/m128, {k}{z}, zmm1","VPMOVQW zmm1, {k}{z}, xmm2/m128","vpmovqw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 34 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVSDB xmm2/m32, {k}{z}, xmm1","VPMOVSDB xmm1, {k}{z}, xmm2/m32","vpmovsdb xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 21 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVSDB xmm2/m64, {k}{z}, ymm1","VPMOVSDB ymm1, {k}{z}, xmm2/m64","vpmovsdb ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 21 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSDB xmm2/m128, {k}{z}, zmm1","VPMOVSDB zmm1, {k}{z}, xmm2/m128","vpmovsdb zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 21 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVSDW xmm2/m64, {k}{z}, xmm1","VPMOVSDW xmm1, {k}{z}, xmm2/m64","vpmovsdw xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 23 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSDW xmm2/m128, {k}{z}, ymm1","VPMOVSDW ymm1, {k}{z}, xmm2/m128","vpmovsdw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 23 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVSDW ymm2/m256, {k}{z}, zmm1","VPMOVSDW zmm1, {k}{z}, ymm2/m256","vpmovsdw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 23 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVSQB xmm2/m16, {k}{z}, xmm1","VPMOVSQB xmm1, {k}{z}, xmm2/m16","vpmovsqb xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 22 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
+"VPMOVSQB xmm2/m32, {k}{z}, ymm1","VPMOVSQB ymm1, {k}{z}, xmm2/m32","vpmovsqb ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 22 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVSQB xmm2/m64, {k}{z}, zmm1","VPMOVSQB zmm1, {k}{z}, xmm2/m64","vpmovsqb zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 22 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPMOVSQD xmm2/m64, {k}{z}, xmm1","VPMOVSQD xmm1, {k}{z}, xmm2/m64","vpmovsqd xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 25 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSQD xmm2/m128, {k}{z}, ymm1","VPMOVSQD ymm1, {k}{z}, xmm2/m128","vpmovsqd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 25 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVSQD ymm2/m256, {k}{z}, zmm1","VPMOVSQD zmm1, {k}{z}, ymm2/m256","vpmovsqd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 25 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVSQW xmm2/m32, {k}{z}, xmm1","VPMOVSQW xmm1, {k}{z}, xmm2/m32","vpmovsqw xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 24 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVSQW xmm2/m64, {k}{z}, ymm1","VPMOVSQW ymm1, {k}{z}, xmm2/m64","vpmovsqw ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 24 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSQW xmm2/m128, {k}{z}, zmm1","VPMOVSQW zmm1, {k}{z}, xmm2/m128","vpmovsqw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 24 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVSWB xmm2/m64, {k}{z}, xmm1","VPMOVSWB xmm1, {k}{z}, xmm2/m64","vpmovswb xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 20 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
+"VPMOVSWB xmm2/m128, {k}{z}, ymm1","VPMOVSWB ymm1, {k}{z}, xmm2/m128","vpmovswb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 20 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPMOVSWB ymm2/m256, {k}{z}, zmm1","VPMOVSWB zmm1, {k}{z}, ymm2/m256","vpmovswb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 20 /r","V","V","AVX512BW","scale32","w,r,r","",""
+"VPMOVSXBD zmm1, {k}{z}, xmm2/m128","VPMOVSXBD xmm2/m128, {k}{z}, zmm1","vpmovsxbd xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 21 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVSXBD xmm1, xmm2/m32","VPMOVSXBD xmm2/m32, xmm1","vpmovsxbd xmm2/m32, xmm1","VEX.128.66.0F38.WIG 21 /r","V","V","AVX","","w,r","",""
+"VPMOVSXBD xmm1, {k}{z}, xmm2/m32","VPMOVSXBD xmm2/m32, {k}{z}, xmm1","vpmovsxbd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 21 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVSXBD ymm1, xmm2/m64","VPMOVSXBD xmm2/m64, ymm1","vpmovsxbd xmm2/m64, ymm1","VEX.256.66.0F38.WIG 21 /r","V","V","AVX2","","w,r","",""
+"VPMOVSXBD ymm1, {k}{z}, xmm2/m64","VPMOVSXBD xmm2/m64, {k}{z}, ymm1","vpmovsxbd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 21 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSXBQ xmm1, xmm2/m16","VPMOVSXBQ xmm2/m16, xmm1","vpmovsxbq xmm2/m16, xmm1","VEX.128.66.0F38.WIG 22 /r","V","V","AVX","","w,r","",""
+"VPMOVSXBQ xmm1, {k}{z}, xmm2/m16","VPMOVSXBQ xmm2/m16, {k}{z}, xmm1","vpmovsxbq xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 22 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
+"VPMOVSXBQ ymm1, xmm2/m32","VPMOVSXBQ xmm2/m32, ymm1","vpmovsxbq xmm2/m32, ymm1","VEX.256.66.0F38.WIG 22 /r","V","V","AVX2","","w,r","",""
+"VPMOVSXBQ ymm1, {k}{z}, xmm2/m32","VPMOVSXBQ xmm2/m32, {k}{z}, ymm1","vpmovsxbq xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 22 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVSXBQ zmm1, {k}{z}, xmm2/m64","VPMOVSXBQ xmm2/m64, {k}{z}, zmm1","vpmovsxbq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 22 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPMOVSXBW ymm1, xmm2/m128","VPMOVSXBW xmm2/m128, ymm1","vpmovsxbw xmm2/m128, ymm1","VEX.256.66.0F38.WIG 20 /r","V","V","AVX2","","w,r","",""
+"VPMOVSXBW ymm1, {k}{z}, xmm2/m128","VPMOVSXBW xmm2/m128, {k}{z}, ymm1","vpmovsxbw xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 20 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPMOVSXBW xmm1, xmm2/m64","VPMOVSXBW xmm2/m64, xmm1","vpmovsxbw xmm2/m64, xmm1","VEX.128.66.0F38.WIG 20 /r","V","V","AVX","","w,r","",""
+"VPMOVSXBW xmm1, {k}{z}, xmm2/m64","VPMOVSXBW xmm2/m64, {k}{z}, xmm1","vpmovsxbw xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 20 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
+"VPMOVSXBW zmm1, {k}{z}, ymm2/m256","VPMOVSXBW ymm2/m256, {k}{z}, zmm1","vpmovsxbw ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 20 /r","V","V","AVX512BW","scale32","w,r,r","",""
+"VPMOVSXDQ ymm1, xmm2/m128","VPMOVSXDQ xmm2/m128, ymm1","vpmovsxdq xmm2/m128, ymm1","VEX.256.66.0F38.WIG 25 /r","V","V","AVX2","","w,r","",""
+"VPMOVSXDQ ymm1, {k}{z}, xmm2/m128","VPMOVSXDQ xmm2/m128, {k}{z}, ymm1","vpmovsxdq xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 25 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVSXDQ xmm1, xmm2/m64","VPMOVSXDQ xmm2/m64, xmm1","vpmovsxdq xmm2/m64, xmm1","VEX.128.66.0F38.WIG 25 /r","V","V","AVX","","w,r","",""
+"VPMOVSXDQ xmm1, {k}{z}, xmm2/m64","VPMOVSXDQ xmm2/m64, {k}{z}, xmm1","vpmovsxdq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 25 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSXDQ zmm1, {k}{z}, ymm2/m256","VPMOVSXDQ ymm2/m256, {k}{z}, zmm1","vpmovsxdq ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 25 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVSXWD ymm1, xmm2/m128","VPMOVSXWD xmm2/m128, ymm1","vpmovsxwd xmm2/m128, ymm1","VEX.256.66.0F38.WIG 23 /r","V","V","AVX2","","w,r","",""
+"VPMOVSXWD ymm1, {k}{z}, xmm2/m128","VPMOVSXWD xmm2/m128, {k}{z}, ymm1","vpmovsxwd xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 23 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVSXWD xmm1, xmm2/m64","VPMOVSXWD xmm2/m64, xmm1","vpmovsxwd xmm2/m64, xmm1","VEX.128.66.0F38.WIG 23 /r","V","V","AVX","","w,r","",""
+"VPMOVSXWD xmm1, {k}{z}, xmm2/m64","VPMOVSXWD xmm2/m64, {k}{z}, xmm1","vpmovsxwd xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 23 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSXWD zmm1, {k}{z}, ymm2/m256","VPMOVSXWD ymm2/m256, {k}{z}, zmm1","vpmovsxwd ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 23 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVSXWQ zmm1, {k}{z}, xmm2/m128","VPMOVSXWQ xmm2/m128, {k}{z}, zmm1","vpmovsxwq xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 24 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVSXWQ xmm1, xmm2/m32","VPMOVSXWQ xmm2/m32, xmm1","vpmovsxwq xmm2/m32, xmm1","VEX.128.66.0F38.WIG 24 /r","V","V","AVX","","w,r","",""
+"VPMOVSXWQ xmm1, {k}{z}, xmm2/m32","VPMOVSXWQ xmm2/m32, {k}{z}, xmm1","vpmovsxwq xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 24 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVSXWQ ymm1, xmm2/m64","VPMOVSXWQ xmm2/m64, ymm1","vpmovsxwq xmm2/m64, ymm1","VEX.256.66.0F38.WIG 24 /r","V","V","AVX2","","w,r","",""
+"VPMOVSXWQ ymm1, {k}{z}, xmm2/m64","VPMOVSXWQ xmm2/m64, {k}{z}, ymm1","vpmovsxwq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 24 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVUSDB xmm2/m32, {k}{z}, xmm1","VPMOVUSDB xmm1, {k}{z}, xmm2/m32","vpmovusdb xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 11 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVUSDB xmm2/m64, {k}{z}, ymm1","VPMOVUSDB ymm1, {k}{z}, xmm2/m64","vpmovusdb ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 11 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVUSDB xmm2/m128, {k}{z}, zmm1","VPMOVUSDB zmm1, {k}{z}, xmm2/m128","vpmovusdb zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 11 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVUSDW xmm2/m64, {k}{z}, xmm1","VPMOVUSDW xmm1, {k}{z}, xmm2/m64","vpmovusdw xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 13 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVUSDW xmm2/m128, {k}{z}, ymm1","VPMOVUSDW ymm1, {k}{z}, xmm2/m128","vpmovusdw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 13 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVUSDW ymm2/m256, {k}{z}, zmm1","VPMOVUSDW zmm1, {k}{z}, ymm2/m256","vpmovusdw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 13 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVUSQB xmm2/m16, {k}{z}, xmm1","VPMOVUSQB xmm1, {k}{z}, xmm2/m16","vpmovusqb xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 12 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
+"VPMOVUSQB xmm2/m32, {k}{z}, ymm1","VPMOVUSQB ymm1, {k}{z}, xmm2/m32","vpmovusqb ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 12 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVUSQB xmm2/m64, {k}{z}, zmm1","VPMOVUSQB zmm1, {k}{z}, xmm2/m64","vpmovusqb zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 12 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPMOVUSQD xmm2/m64, {k}{z}, xmm1","VPMOVUSQD xmm1, {k}{z}, xmm2/m64","vpmovusqd xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVUSQD xmm2/m128, {k}{z}, ymm1","VPMOVUSQD ymm1, {k}{z}, xmm2/m128","vpmovusqd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVUSQD ymm2/m256, {k}{z}, zmm1","VPMOVUSQD zmm1, {k}{z}, ymm2/m256","vpmovusqd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 15 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVUSQW xmm2/m32, {k}{z}, xmm1","VPMOVUSQW xmm1, {k}{z}, xmm2/m32","vpmovusqw xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVUSQW xmm2/m64, {k}{z}, ymm1","VPMOVUSQW ymm1, {k}{z}, xmm2/m64","vpmovusqw ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVUSQW xmm2/m128, {k}{z}, zmm1","VPMOVUSQW zmm1, {k}{z}, xmm2/m128","vpmovusqw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 14 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVUSWB xmm2/m64, {k}{z}, xmm1","VPMOVUSWB xmm1, {k}{z}, xmm2/m64","vpmovuswb xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 10 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
+"VPMOVUSWB xmm2/m128, {k}{z}, ymm1","VPMOVUSWB ymm1, {k}{z}, xmm2/m128","vpmovuswb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 10 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPMOVUSWB ymm2/m256, {k}{z}, zmm1","VPMOVUSWB zmm1, {k}{z}, ymm2/m256","vpmovuswb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 10 /r","V","V","AVX512BW","scale32","w,r,r","",""
+"VPMOVW2M k1, xmm2","VPMOVW2M xmm2, k1","vpmovw2m xmm2, k1","EVEX.128.F3.0F38.W1 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVW2M k1, ymm2","VPMOVW2M ymm2, k1","vpmovw2m ymm2, k1","EVEX.256.F3.0F38.W1 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVW2M k1, zmm2","VPMOVW2M zmm2, k1","vpmovw2m zmm2, k1","EVEX.512.F3.0F38.W1 29 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"VPMOVWB xmm2/m64, {k}{z}, xmm1","VPMOVWB xmm1, {k}{z}, xmm2/m64","vpmovwb xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 30 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
+"VPMOVWB xmm2/m128, {k}{z}, ymm1","VPMOVWB ymm1, {k}{z}, xmm2/m128","vpmovwb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 30 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPMOVWB ymm2/m256, {k}{z}, zmm1","VPMOVWB zmm1, {k}{z}, ymm2/m256","vpmovwb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 30 /r","V","V","AVX512BW","scale32","w,r,r","",""
+"VPMOVZXBD zmm1, {k}{z}, xmm2/m128","VPMOVZXBD xmm2/m128, {k}{z}, zmm1","vpmovzxbd xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 31 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVZXBD xmm1, xmm2/m32","VPMOVZXBD xmm2/m32, xmm1","vpmovzxbd xmm2/m32, xmm1","VEX.128.66.0F38.WIG 31 /r","V","V","AVX","","w,r","",""
+"VPMOVZXBD xmm1, {k}{z}, xmm2/m32","VPMOVZXBD xmm2/m32, {k}{z}, xmm1","vpmovzxbd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 31 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVZXBD ymm1, xmm2/m64","VPMOVZXBD xmm2/m64, ymm1","vpmovzxbd xmm2/m64, ymm1","VEX.256.66.0F38.WIG 31 /r","V","V","AVX2","","w,r","",""
+"VPMOVZXBD ymm1, {k}{z}, xmm2/m64","VPMOVZXBD xmm2/m64, {k}{z}, ymm1","vpmovzxbd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 31 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVZXBQ xmm1, xmm2/m16","VPMOVZXBQ xmm2/m16, xmm1","vpmovzxbq xmm2/m16, xmm1","VEX.128.66.0F38.WIG 32 /r","V","V","AVX","","w,r","",""
+"VPMOVZXBQ xmm1, {k}{z}, xmm2/m16","VPMOVZXBQ xmm2/m16, {k}{z}, xmm1","vpmovzxbq xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 32 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
+"VPMOVZXBQ ymm1, xmm2/m32","VPMOVZXBQ xmm2/m32, ymm1","vpmovzxbq xmm2/m32, ymm1","VEX.256.66.0F38.WIG 32 /r","V","V","AVX2","","w,r","",""
+"VPMOVZXBQ ymm1, {k}{z}, xmm2/m32","VPMOVZXBQ xmm2/m32, {k}{z}, ymm1","vpmovzxbq xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 32 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVZXBQ zmm1, {k}{z}, xmm2/m64","VPMOVZXBQ xmm2/m64, {k}{z}, zmm1","vpmovzxbq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 32 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPMOVZXBW ymm1, xmm2/m128","VPMOVZXBW xmm2/m128, ymm1","vpmovzxbw xmm2/m128, ymm1","VEX.256.66.0F38.WIG 30 /r","V","V","AVX2","","w,r","",""
+"VPMOVZXBW ymm1, {k}{z}, xmm2/m128","VPMOVZXBW xmm2/m128, {k}{z}, ymm1","vpmovzxbw xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 30 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPMOVZXBW xmm1, xmm2/m64","VPMOVZXBW xmm2/m64, xmm1","vpmovzxbw xmm2/m64, xmm1","VEX.128.66.0F38.WIG 30 /r","V","V","AVX","","w,r","",""
+"VPMOVZXBW xmm1, {k}{z}, xmm2/m64","VPMOVZXBW xmm2/m64, {k}{z}, xmm1","vpmovzxbw xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 30 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
+"VPMOVZXBW zmm1, {k}{z}, ymm2/m256","VPMOVZXBW ymm2/m256, {k}{z}, zmm1","vpmovzxbw ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 30 /r","V","V","AVX512BW","scale32","w,r,r","",""
+"VPMOVZXDQ ymm1, xmm2/m128","VPMOVZXDQ xmm2/m128, ymm1","vpmovzxdq xmm2/m128, ymm1","VEX.256.66.0F38.WIG 35 /r","V","V","AVX2","","w,r","",""
+"VPMOVZXDQ ymm1, {k}{z}, xmm2/m128","VPMOVZXDQ xmm2/m128, {k}{z}, ymm1","vpmovzxdq xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 35 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVZXDQ xmm1, xmm2/m64","VPMOVZXDQ xmm2/m64, xmm1","vpmovzxdq xmm2/m64, xmm1","VEX.128.66.0F38.WIG 35 /r","V","V","AVX","","w,r","",""
+"VPMOVZXDQ xmm1, {k}{z}, xmm2/m64","VPMOVZXDQ xmm2/m64, {k}{z}, xmm1","vpmovzxdq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 35 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVZXDQ zmm1, {k}{z}, ymm2/m256","VPMOVZXDQ ymm2/m256, {k}{z}, zmm1","vpmovzxdq ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 35 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVZXWD ymm1, xmm2/m128","VPMOVZXWD xmm2/m128, ymm1","vpmovzxwd xmm2/m128, ymm1","VEX.256.66.0F38.WIG 33 /r","V","V","AVX2","","w,r","",""
+"VPMOVZXWD ymm1, {k}{z}, xmm2/m128","VPMOVZXWD xmm2/m128, {k}{z}, ymm1","vpmovzxwd xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 33 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVZXWD xmm1, xmm2/m64","VPMOVZXWD xmm2/m64, xmm1","vpmovzxwd xmm2/m64, xmm1","VEX.128.66.0F38.WIG 33 /r","V","V","AVX","","w,r","",""
+"VPMOVZXWD xmm1, {k}{z}, xmm2/m64","VPMOVZXWD xmm2/m64, {k}{z}, xmm1","vpmovzxwd xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 33 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVZXWD zmm1, {k}{z}, ymm2/m256","VPMOVZXWD ymm2/m256, {k}{z}, zmm1","vpmovzxwd ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 33 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVZXWQ zmm1, {k}{z}, xmm2/m128","VPMOVZXWQ xmm2/m128, {k}{z}, zmm1","vpmovzxwq xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 34 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVZXWQ xmm1, xmm2/m32","VPMOVZXWQ xmm2/m32, xmm1","vpmovzxwq xmm2/m32, xmm1","VEX.128.66.0F38.WIG 34 /r","V","V","AVX","","w,r","",""
+"VPMOVZXWQ xmm1, {k}{z}, xmm2/m32","VPMOVZXWQ xmm2/m32, {k}{z}, xmm1","vpmovzxwq xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 34 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVZXWQ ymm1, xmm2/m64","VPMOVZXWQ xmm2/m64, ymm1","vpmovzxwq xmm2/m64, ymm1","VEX.256.66.0F38.WIG 34 /r","V","V","AVX2","","w,r","",""
+"VPMOVZXWQ ymm1, {k}{z}, xmm2/m64","VPMOVZXWQ xmm2/m64, {k}{z}, ymm1","vpmovzxwq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 34 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMULDQ xmm1, xmmV, xmm2/m128","VPMULDQ xmm2/m128, xmmV, xmm1","vpmuldq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 28 /r","V","V","AVX","","w,r,r","",""
+"VPMULDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmuldq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 28 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMULDQ ymm1, ymmV, ymm2/m256","VPMULDQ ymm2/m256, ymmV, ymm1","vpmuldq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 28 /r","V","V","AVX2","","w,r,r","",""
+"VPMULDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmuldq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 28 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMULDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmuldq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 28 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPMULHRSW xmm1, xmmV, xmm2/m128","VPMULHRSW xmm2/m128, xmmV, xmm1","vpmulhrsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 0B /r","V","V","AVX","","w,r,r","",""
+"VPMULHRSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHRSW xmm2/m128, xmmV, {k}{z}, xmm1","vpmulhrsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 0B /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMULHRSW ymm1, ymmV, ymm2/m256","VPMULHRSW ymm2/m256, ymmV, ymm1","vpmulhrsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 0B /r","V","V","AVX2","","w,r,r","",""
+"VPMULHRSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHRSW ymm2/m256, ymmV, {k}{z}, ymm1","vpmulhrsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 0B /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMULHRSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHRSW zmm2/m512, zmmV, {k}{z}, zmm1","vpmulhrsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 0B /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMULHUW xmm1, xmmV, xmm2/m128","VPMULHUW xmm2/m128, xmmV, xmm1","vpmulhuw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E4 /r","V","V","AVX","","w,r,r","",""
+"VPMULHUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHUW xmm2/m128, xmmV, {k}{z}, xmm1","vpmulhuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E4 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMULHUW ymm1, ymmV, ymm2/m256","VPMULHUW ymm2/m256, ymmV, ymm1","vpmulhuw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E4 /r","V","V","AVX2","","w,r,r","",""
+"VPMULHUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHUW ymm2/m256, ymmV, {k}{z}, ymm1","vpmulhuw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E4 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMULHUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHUW zmm2/m512, zmmV, {k}{z}, zmm1","vpmulhuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E4 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMULHW xmm1, xmmV, xmm2/m128","VPMULHW xmm2/m128, xmmV, xmm1","vpmulhw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E5 /r","V","V","AVX","","w,r,r","",""
+"VPMULHW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHW xmm2/m128, xmmV, {k}{z}, xmm1","vpmulhw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E5 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMULHW ymm1, ymmV, ymm2/m256","VPMULHW ymm2/m256, ymmV, ymm1","vpmulhw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E5 /r","V","V","AVX2","","w,r,r","",""
+"VPMULHW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHW ymm2/m256, ymmV, {k}{z}, ymm1","vpmulhw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E5 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMULHW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHW zmm2/m512, zmmV, {k}{z}, zmm1","vpmulhw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E5 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMULLD xmm1, xmmV, xmm2/m128","VPMULLD xmm2/m128, xmmV, xmm1","vpmulld xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 40 /r","V","V","AVX","","w,r,r","",""
+"VPMULLD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMULLD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpmulld xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 40 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPMULLD ymm1, ymmV, ymm2/m256","VPMULLD ymm2/m256, ymmV, ymm1","vpmulld ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 40 /r","V","V","AVX2","","w,r,r","",""
+"VPMULLD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMULLD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpmulld ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 40 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPMULLD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMULLD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpmulld zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 40 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPMULLQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULLQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmullq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 40 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMULLQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULLQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmullq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 40 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMULLQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULLQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmullq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 40 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
+"VPMULLW xmm1, xmmV, xmm2/m128","VPMULLW xmm2/m128, xmmV, xmm1","vpmullw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D5 /r","V","V","AVX","","w,r,r","",""
+"VPMULLW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULLW xmm2/m128, xmmV, {k}{z}, xmm1","vpmullw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D5 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMULLW ymm1, ymmV, ymm2/m256","VPMULLW ymm2/m256, ymmV, ymm1","vpmullw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D5 /r","V","V","AVX2","","w,r,r","",""
+"VPMULLW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULLW ymm2/m256, ymmV, {k}{z}, ymm1","vpmullw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D5 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMULLW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULLW zmm2/m512, zmmV, {k}{z}, zmm1","vpmullw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D5 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMULTISHIFTQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULTISHIFTQB xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmultishiftqb xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 83 /r","V","V","AVX512_VBMI+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMULTISHIFTQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULTISHIFTQB ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmultishiftqb ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 83 /r","V","V","AVX512_VBMI+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMULTISHIFTQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULTISHIFTQB zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmultishiftqb zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 83 /r","V","V","AVX512_VBMI","bscale8,scale64","w,r,r,r","",""
+"VPMULUDQ xmm1, xmmV, xmm2/m128","VPMULUDQ xmm2/m128, xmmV, xmm1","vpmuludq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F4 /r","V","V","AVX","","w,r,r","",""
+"VPMULUDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULUDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmuludq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 F4 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMULUDQ ymm1, ymmV, ymm2/m256","VPMULUDQ ymm2/m256, ymmV, ymm1","vpmuludq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F4 /r","V","V","AVX2","","w,r,r","",""
+"VPMULUDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULUDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmuludq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 F4 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMULUDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULUDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmuludq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 F4 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPOPCNTB xmm1, {k}{z}, xmm2/m128","VPOPCNTB xmm2/m128, {k}{z}, xmm1","vpopcntb xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 54 /r","V","V","AVX512_BITALG+AVX512VL","scale16","w,r,r","",""
+"VPOPCNTB ymm1, {k}{z}, ymm2/m256","VPOPCNTB ymm2/m256, {k}{z}, ymm1","vpopcntb ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 54 /r","V","V","AVX512_BITALG+AVX512VL","scale32","w,r,r","",""
+"VPOPCNTB zmm1, {k}{z}, zmm2/m512","VPOPCNTB zmm2/m512, {k}{z}, zmm1","vpopcntb zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 54 /r","V","V","AVX512_BITALG","scale64","w,r,r","",""
+"VPOPCNTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPOPCNTD xmm2/m128/m32bcst, {k}{z}, xmm1","vpopcntd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale4,scale16","w,r,r","",""
+"VPOPCNTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPOPCNTD ymm2/m256/m32bcst, {k}{z}, ymm1","vpopcntd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale4,scale32","w,r,r","",""
+"VPOPCNTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPOPCNTD zmm2/m512/m32bcst, {k}{z}, zmm1","vpopcntd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 55 /r","V","V","AVX512_VPOPCNTDQ","bscale4,scale64","w,r,r","",""
+"VPOPCNTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPOPCNTQ xmm2/m128/m64bcst, {k}{z}, xmm1","vpopcntq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VPOPCNTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPOPCNTQ ymm2/m256/m64bcst, {k}{z}, ymm1","vpopcntq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VPOPCNTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPOPCNTQ zmm2/m512/m64bcst, {k}{z}, zmm1","vpopcntq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 55 /r","V","V","AVX512_VPOPCNTDQ","bscale8,scale64","w,r,r","",""
+"VPOPCNTW xmm1, {k}{z}, xmm2/m128","VPOPCNTW xmm2/m128, {k}{z}, xmm1","vpopcntw xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 54 /r","V","V","AVX512_BITALG+AVX512VL","scale16","w,r,r","",""
+"VPOPCNTW ymm1, {k}{z}, ymm2/m256","VPOPCNTW ymm2/m256, {k}{z}, ymm1","vpopcntw ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 54 /r","V","V","AVX512_BITALG+AVX512VL","scale32","w,r,r","",""
+"VPOPCNTW zmm1, {k}{z}, zmm2/m512","VPOPCNTW zmm2/m512, {k}{z}, zmm1","vpopcntw zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 54 /r","V","V","AVX512_BITALG","scale64","w,r,r","",""
+"VPOR xmm1, xmmV, xmm2/m128","VPOR xmm2/m128, xmmV, xmm1","vpor xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EB /r","V","V","AVX","","w,r,r","",""
+"VPOR ymm1, ymmV, ymm2/m256","VPOR ymm2/m256, ymmV, ymm1","vpor ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EB /r","V","V","AVX2","","w,r,r","",""
+"VPORD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPORD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpord xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 EB /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPORD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPORD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpord ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 EB /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPORD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPORD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpord zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 EB /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPORQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPORQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vporq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 EB /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPORQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPORQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vporq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 EB /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPORQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPORQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vporq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 EB /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPPERM xmm1, xmmV, xmmIH, xmm2/m128","VPPERM xmm2/m128, xmmIH, xmmV, xmm1","vpperm xmm2/m128, xmmIH, xmmV, xmm1","XOP.NDS.128.08.W1 A3 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPPERM xmm1, xmmV, xmm2/m128, xmmIH","VPPERM xmmIH, xmm2/m128, xmmV, xmm1","vpperm xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A3 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPROLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPROLD imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","vprold imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /1 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPROLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPROLD imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","vprold imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /1 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPROLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPROLD imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","vprold imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /1 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPROLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPROLQ imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","vprolq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 72 /1 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPROLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPROLQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","vprolq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 72 /1 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPROLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPROLQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","vprolq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 72 /1 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPROLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPROLVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vprolvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPROLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPROLVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vprolvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPROLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPROLVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vprolvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 15 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPROLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPROLVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vprolvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPROLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPROLVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vprolvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPROLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPROLVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vprolvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 15 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPRORD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPRORD imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","vprord imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /0 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPRORD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPRORD imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","vprord imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /0 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPRORD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPRORD imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","vprord imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /0 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPRORQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPRORQ imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","vprorq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 72 /0 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPRORQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPRORQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","vprorq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 72 /0 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPRORQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPRORQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","vprorq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 72 /0 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPRORVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPRORVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vprorvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPRORVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPRORVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vprorvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPRORVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPRORVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vprorvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 14 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPRORVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPRORVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vprorvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPRORVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPRORVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vprorvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPRORVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPRORVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vprorvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 14 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPROTB xmm1, xmm2/m128, imm8u","VPROTB imm8u, xmm2/m128, xmm1","vprotb imm8u, xmm2/m128, xmm1","XOP.128.08.W0 C0 /r ib","V","V","XOP","amd","w,r,r","",""
+"VPROTB xmm1, xmmV, xmm2/m128","VPROTB xmm2/m128, xmmV, xmm1","vprotb xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 90 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTB xmm1, xmm2/m128, xmmV","VPROTB xmmV, xmm2/m128, xmm1","vprotb xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 90 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTD xmm1, xmm2/m128, imm8u","VPROTD imm8u, xmm2/m128, xmm1","vprotd imm8u, xmm2/m128, xmm1","XOP.128.08.W0 C2 /r ib","V","V","XOP","amd","w,r,r","",""
+"VPROTD xmm1, xmmV, xmm2/m128","VPROTD xmm2/m128, xmmV, xmm1","vprotd xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 92 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTD xmm1, xmm2/m128, xmmV","VPROTD xmmV, xmm2/m128, xmm1","vprotd xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 92 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTQ xmm1, xmm2/m128, imm8u","VPROTQ imm8u, xmm2/m128, xmm1","vprotq imm8u, xmm2/m128, xmm1","XOP.128.08.W0 C3 /r ib","V","V","XOP","amd","w,r,r","",""
+"VPROTQ xmm1, xmmV, xmm2/m128","VPROTQ xmm2/m128, xmmV, xmm1","vprotq xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 93 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTQ xmm1, xmm2/m128, xmmV","VPROTQ xmmV, xmm2/m128, xmm1","vprotq xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 93 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTW xmm1, xmm2/m128, imm8u","VPROTW imm8u, xmm2/m128, xmm1","vprotw imm8u, xmm2/m128, xmm1","XOP.128.08.W0 C1 /r ib","V","V","XOP","amd","w,r,r","",""
+"VPROTW xmm1, xmmV, xmm2/m128","VPROTW xmm2/m128, xmmV, xmm1","vprotw xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 91 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTW xmm1, xmm2/m128, xmmV","VPROTW xmmV, xmm2/m128, xmm1","vprotw xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 91 /r","V","V","XOP","amd","w,r,r","",""
+"VPSADBW xmm1, xmmV, xmm2/m128","VPSADBW xmm2/m128, xmmV, xmm1","vpsadbw xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F.WIG F6 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPSADBW xmm1, xmmV, xmm2/m128","VPSADBW xmm2/m128, xmmV, xmm1","vpsadbw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F6 /r","V","V","AVX","","w,r,r","",""
+"VPSADBW ymm1, ymmV, ymm2/m256","VPSADBW ymm2/m256, ymmV, ymm1","vpsadbw ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F.WIG F6 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VPSADBW ymm1, ymmV, ymm2/m256","VPSADBW ymm2/m256, ymmV, ymm1","vpsadbw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F6 /r","V","V","AVX2","","w,r,r","",""
+"VPSADBW zmm1, zmmV, zmm2/m512","VPSADBW zmm2/m512, zmmV, zmm1","vpsadbw zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F.WIG F6 /r","V","V","AVX512BW","scale64","w,r,r","",""
+"VPSCATTERDD vm32x, {k1-k7}, xmm1","VPSCATTERDD xmm1, {k1-k7}, vm32x","vpscatterdd xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W0 A0 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPSCATTERDD vm32y, {k1-k7}, ymm1","VPSCATTERDD ymm1, {k1-k7}, vm32y","vpscatterdd ymm1, {k1-k7}, vm32y","EVEX.256.66.0F38.W0 A0 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPSCATTERDD vm32z, {k1-k7}, zmm1","VPSCATTERDD zmm1, {k1-k7}, vm32z","vpscatterdd zmm1, {k1-k7}, vm32z","EVEX.512.66.0F38.W0 A0 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VPSCATTERDQ vm32x, {k1-k7}, xmm1","VPSCATTERDQ xmm1, {k1-k7}, vm32x","vpscatterdq xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W1 A0 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPSCATTERDQ vm32x, {k1-k7}, ymm1","VPSCATTERDQ ymm1, {k1-k7}, vm32x","vpscatterdq ymm1, {k1-k7}, vm32x","EVEX.256.66.0F38.W1 A0 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPSCATTERDQ vm32y, {k1-k7}, zmm1","VPSCATTERDQ zmm1, {k1-k7}, vm32y","vpscatterdq zmm1, {k1-k7}, vm32y","EVEX.512.66.0F38.W1 A0 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VPSCATTERQD vm64x, {k1-k7}, xmm1","VPSCATTERQD xmm1, {k1-k7}, vm64x","vpscatterqd xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W0 A1 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPSCATTERQD vm64y, {k1-k7}, xmm1","VPSCATTERQD xmm1, {k1-k7}, vm64y","vpscatterqd xmm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W0 A1 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPSCATTERQD vm64z, {k1-k7}, ymm1","VPSCATTERQD ymm1, {k1-k7}, vm64z","vpscatterqd ymm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W0 A1 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VPSCATTERQQ vm64x, {k1-k7}, xmm1","VPSCATTERQQ xmm1, {k1-k7}, vm64x","vpscatterqq xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W1 A1 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPSCATTERQQ vm64y, {k1-k7}, ymm1","VPSCATTERQQ ymm1, {k1-k7}, vm64y","vpscatterqq ymm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W1 A1 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPSCATTERQQ vm64z, {k1-k7}, zmm1","VPSCATTERQQ zmm1, {k1-k7}, vm64z","vpscatterqq zmm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W1 A1 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VPSHAB xmm1, xmmV, xmm2/m128","VPSHAB xmm2/m128, xmmV, xmm1","vpshab xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 98 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAB xmm1, xmm2/m128, xmmV","VPSHAB xmmV, xmm2/m128, xmm1","vpshab xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 98 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAD xmm1, xmmV, xmm2/m128","VPSHAD xmm2/m128, xmmV, xmm1","vpshad xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 9A /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAD xmm1, xmm2/m128, xmmV","VPSHAD xmmV, xmm2/m128, xmm1","vpshad xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 9A /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAQ xmm1, xmmV, xmm2/m128","VPSHAQ xmm2/m128, xmmV, xmm1","vpshaq xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 9B /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAQ xmm1, xmm2/m128, xmmV","VPSHAQ xmmV, xmm2/m128, xmm1","vpshaq xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 9B /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAW xmm1, xmmV, xmm2/m128","VPSHAW xmm2/m128, xmmV, xmm1","vpshaw xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 99 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAW xmm1, xmm2/m128, xmmV","VPSHAW xmmV, xmm2/m128, xmm1","vpshaw xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 99 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLB xmm1, xmmV, xmm2/m128","VPSHLB xmm2/m128, xmmV, xmm1","vpshlb xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 94 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLB xmm1, xmm2/m128, xmmV","VPSHLB xmmV, xmm2/m128, xmm1","vpshlb xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 94 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLD xmm1, xmmV, xmm2/m128","VPSHLD xmm2/m128, xmmV, xmm1","vpshld xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 96 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLD xmm1, xmm2/m128, xmmV","VPSHLD xmmV, xmm2/m128, xmm1","vpshld xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 96 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPSHLDD imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshldd imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 71 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VPSHLDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPSHLDD imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshldd imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 71 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VPSHLDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPSHLDD imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshldd imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 71 /r ib","V","V","AVX512_VBMI2","bscale4,scale64","w,r,r,r,r","",""
+"VPSHLDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPSHLDQ imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshldq imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VPSHLDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPSHLDQ imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshldq imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VPSHLDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPSHLDQ imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshldq imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2","bscale8,scale64","w,r,r,r,r","",""
+"VPSHLDVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSHLDVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshldvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPSHLDVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSHLDVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshldvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPSHLDVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSHLDVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshldvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 71 /r","V","V","AVX512_VBMI2","bscale4,scale64","rw,r,r,r","",""
+"VPSHLDVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSHLDVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshldvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPSHLDVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSHLDVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshldvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPSHLDVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSHLDVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshldvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 71 /r","V","V","AVX512_VBMI2","bscale8,scale64","rw,r,r,r","",""
+"VPSHLDVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSHLDVW xmm2/m128, xmmV, {k}{z}, xmm1","vpshldvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 70 /r","V","V","AVX512_VBMI2+AVX512VL","scale16","rw,r,r,r","",""
+"VPSHLDVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSHLDVW ymm2/m256, ymmV, {k}{z}, ymm1","vpshldvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 70 /r","V","V","AVX512_VBMI2+AVX512VL","scale32","rw,r,r,r","",""
+"VPSHLDVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSHLDVW zmm2/m512, zmmV, {k}{z}, zmm1","vpshldvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 70 /r","V","V","AVX512_VBMI2","scale64","rw,r,r,r","",""
+"VPSHLDW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPSHLDW imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","vpshldw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale16","w,r,r,r,r","",""
+"VPSHLDW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPSHLDW imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","vpshldw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale32","w,r,r,r,r","",""
+"VPSHLDW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPSHLDW imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","vpshldw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2","scale64","w,r,r,r,r","",""
+"VPSHLQ xmm1, xmmV, xmm2/m128","VPSHLQ xmm2/m128, xmmV, xmm1","vpshlq xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 97 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLQ xmm1, xmm2/m128, xmmV","VPSHLQ xmmV, xmm2/m128, xmm1","vpshlq xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 97 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLW xmm1, xmmV, xmm2/m128","VPSHLW xmm2/m128, xmmV, xmm1","vpshlw xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 95 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLW xmm1, xmm2/m128, xmmV","VPSHLW xmmV, xmm2/m128, xmm1","vpshlw xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 95 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHRDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPSHRDD imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshrdd imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VPSHRDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPSHRDD imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshrdd imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VPSHRDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPSHRDD imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshrdd imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2","bscale4,scale64","w,r,r,r,r","",""
+"VPSHRDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPSHRDQ imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshrdq imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VPSHRDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPSHRDQ imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshrdq imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VPSHRDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPSHRDQ imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshrdq imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2","bscale8,scale64","w,r,r,r,r","",""
+"VPSHRDVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSHRDVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshrdvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPSHRDVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSHRDVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshrdvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPSHRDVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSHRDVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshrdvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 73 /r","V","V","AVX512_VBMI2","bscale4,scale64","rw,r,r,r","",""
+"VPSHRDVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSHRDVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshrdvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPSHRDVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSHRDVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshrdvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPSHRDVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSHRDVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshrdvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 73 /r","V","V","AVX512_VBMI2","bscale8,scale64","rw,r,r,r","",""
+"VPSHRDVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSHRDVW xmm2/m128, xmmV, {k}{z}, xmm1","vpshrdvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 72 /r","V","V","AVX512_VBMI2+AVX512VL","scale16","rw,r,r,r","",""
+"VPSHRDVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSHRDVW ymm2/m256, ymmV, {k}{z}, ymm1","vpshrdvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 72 /r","V","V","AVX512_VBMI2+AVX512VL","scale32","rw,r,r,r","",""
+"VPSHRDVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSHRDVW zmm2/m512, zmmV, {k}{z}, zmm1","vpshrdvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 72 /r","V","V","AVX512_VBMI2","scale64","rw,r,r,r","",""
+"VPSHRDW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPSHRDW imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","vpshrdw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale16","w,r,r,r,r","",""
+"VPSHRDW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPSHRDW imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","vpshrdw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale32","w,r,r,r,r","",""
+"VPSHRDW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPSHRDW imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","vpshrdw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2","scale64","w,r,r,r,r","",""
+"VPSHUFB xmm1, xmmV, xmm2/m128","VPSHUFB xmm2/m128, xmmV, xmm1","vpshufb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 00 /r","V","V","AVX","","w,r,r","",""
+"VPSHUFB xmm1, {k}{z}, xmmV, xmm2/m128","VPSHUFB xmm2/m128, xmmV, {k}{z}, xmm1","vpshufb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 00 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSHUFB ymm1, ymmV, ymm2/m256","VPSHUFB ymm2/m256, ymmV, ymm1","vpshufb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 00 /r","V","V","AVX2","","w,r,r","",""
+"VPSHUFB ymm1, {k}{z}, ymmV, ymm2/m256","VPSHUFB ymm2/m256, ymmV, {k}{z}, ymm1","vpshufb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 00 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSHUFB zmm1, {k}{z}, zmmV, zmm2/m512","VPSHUFB zmm2/m512, zmmV, {k}{z}, zmm1","vpshufb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 00 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSHUFBITQMB k1, {k}, xmmV, xmm2/m128","VPSHUFBITQMB xmm2/m128, xmmV, {k}, k1","vpshufbitqmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W0 8F /r","V","V","AVX512_BITALG+AVX512VL","scale16","w,r,r,r","",""
+"VPSHUFBITQMB k1, {k}, ymmV, ymm2/m256","VPSHUFBITQMB ymm2/m256, ymmV, {k}, k1","vpshufbitqmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W0 8F /r","V","V","AVX512_BITALG+AVX512VL","scale32","w,r,r,r","",""
+"VPSHUFBITQMB k1, {k}, zmmV, zmm2/m512","VPSHUFBITQMB zmm2/m512, zmmV, {k}, k1","vpshufbitqmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W0 8F /r","V","V","AVX512_BITALG","scale64","w,r,r,r","",""
+"VPSHUFD xmm1, xmm2/m128, imm8u","VPSHUFD imm8u, xmm2/m128, xmm1","vpshufd imm8u, xmm2/m128, xmm1","VEX.128.66.0F.WIG 70 /r ib","V","V","AVX","","w,r,r","",""
+"VPSHUFD xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSHUFD imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","vpshufd imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 70 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSHUFD ymm1, ymm2/m256, imm8u","VPSHUFD imm8u, ymm2/m256, ymm1","vpshufd imm8u, ymm2/m256, ymm1","VEX.256.66.0F.WIG 70 /r ib","V","V","AVX2","","w,r,r","",""
+"VPSHUFD ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSHUFD imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","vpshufd imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 70 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSHUFD zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSHUFD imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","vpshufd imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 70 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSHUFHW xmm1, xmm2/m128, imm8u","VPSHUFHW imm8u, xmm2/m128, xmm1","vpshufhw imm8u, xmm2/m128, xmm1","VEX.128.F3.0F.WIG 70 /r ib","V","V","AVX","","w,r,r","",""
+"VPSHUFHW xmm1, {k}{z}, xmm2/m128, imm8u","VPSHUFHW imm8u, xmm2/m128, {k}{z}, xmm1","vpshufhw imm8u, xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.WIG 70 /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSHUFHW ymm1, ymm2/m256, imm8u","VPSHUFHW imm8u, ymm2/m256, ymm1","vpshufhw imm8u, ymm2/m256, ymm1","VEX.256.F3.0F.WIG 70 /r ib","V","V","AVX2","","w,r,r","",""
+"VPSHUFHW ymm1, {k}{z}, ymm2/m256, imm8u","VPSHUFHW imm8u, ymm2/m256, {k}{z}, ymm1","vpshufhw imm8u, ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.WIG 70 /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSHUFHW zmm1, {k}{z}, zmm2/m512, imm8u","VPSHUFHW imm8u, zmm2/m512, {k}{z}, zmm1","vpshufhw imm8u, zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.WIG 70 /r ib","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSHUFLW xmm1, xmm2/m128, imm8u","VPSHUFLW imm8u, xmm2/m128, xmm1","vpshuflw imm8u, xmm2/m128, xmm1","VEX.128.F2.0F.WIG 70 /r ib","V","V","AVX","","w,r,r","",""
+"VPSHUFLW xmm1, {k}{z}, xmm2/m128, imm8u","VPSHUFLW imm8u, xmm2/m128, {k}{z}, xmm1","vpshuflw imm8u, xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.WIG 70 /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSHUFLW ymm1, ymm2/m256, imm8u","VPSHUFLW imm8u, ymm2/m256, ymm1","vpshuflw imm8u, ymm2/m256, ymm1","VEX.256.F2.0F.WIG 70 /r ib","V","V","AVX2","","w,r,r","",""
+"VPSHUFLW ymm1, {k}{z}, ymm2/m256, imm8u","VPSHUFLW imm8u, ymm2/m256, {k}{z}, ymm1","vpshuflw imm8u, ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.WIG 70 /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSHUFLW zmm1, {k}{z}, zmm2/m512, imm8u","VPSHUFLW imm8u, zmm2/m512, {k}{z}, zmm1","vpshuflw imm8u, zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.WIG 70 /r ib","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSIGNB xmm1, xmmV, xmm2/m128","VPSIGNB xmm2/m128, xmmV, xmm1","vpsignb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 08 /r","V","V","AVX","","w,r,r","",""
+"VPSIGNB ymm1, ymmV, ymm2/m256","VPSIGNB ymm2/m256, ymmV, ymm1","vpsignb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 08 /r","V","V","AVX2","","w,r,r","",""
+"VPSIGND xmm1, xmmV, xmm2/m128","VPSIGND xmm2/m128, xmmV, xmm1","vpsignd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 0A /r","V","V","AVX","","w,r,r","",""
+"VPSIGND ymm1, ymmV, ymm2/m256","VPSIGND ymm2/m256, ymmV, ymm1","vpsignd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 0A /r","V","V","AVX2","","w,r,r","",""
+"VPSIGNW xmm1, xmmV, xmm2/m128","VPSIGNW xmm2/m128, xmmV, xmm1","vpsignw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 09 /r","V","V","AVX","","w,r,r","",""
+"VPSIGNW ymm1, ymmV, ymm2/m256","VPSIGNW ymm2/m256, ymmV, ymm1","vpsignw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 09 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLD xmmV, xmm2, imm8u","VPSLLD imm8u, xmm2, xmmV","vpslld imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 72 /6 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSLLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSLLD imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","vpslld imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /6 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSLLD ymmV, ymm2, imm8u","VPSLLD imm8u, ymm2, ymmV","vpslld imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 72 /6 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSLLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSLLD imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","vpslld imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /6 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSLLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSLLD imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","vpslld imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /6 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSLLD xmm1, xmmV, xmm2/m128","VPSLLD xmm2/m128, xmmV, xmm1","vpslld xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F2 /r","V","V","AVX","","w,r,r","",""
+"VPSLLD xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLD xmm2/m128, xmmV, {k}{z}, xmm1","vpslld xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 F2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLD ymm1, ymmV, xmm2/m128","VPSLLD xmm2/m128, ymmV, ymm1","vpslld xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F2 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLD ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLD xmm2/m128, ymmV, {k}{z}, ymm1","vpslld xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 F2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLD zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLD xmm2/m128, zmmV, {k}{z}, zmm1","vpslld xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 F2 /r","V","V","AVX512F","scale16","w,r,r,r","",""
+"VPSLLDQ xmmV, xmm2, imm8u","VPSLLDQ imm8u, xmm2, xmmV","vpslldq imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 73 /7 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSLLDQ xmmV, xmm2/m128, imm8u","VPSLLDQ imm8u, xmm2/m128, xmmV","vpslldq imm8u, xmm2/m128, xmmV","EVEX.NDD.128.66.0F.WIG 73 /7 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPSLLDQ ymmV, ymm2, imm8u","VPSLLDQ imm8u, ymm2, ymmV","vpslldq imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 73 /7 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSLLDQ ymmV, ymm2/m256, imm8u","VPSLLDQ imm8u, ymm2/m256, ymmV","vpslldq imm8u, ymm2/m256, ymmV","EVEX.NDD.256.66.0F.WIG 73 /7 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VPSLLDQ zmmV, zmm2/m512, imm8u","VPSLLDQ imm8u, zmm2/m512, zmmV","vpslldq imm8u, zmm2/m512, zmmV","EVEX.NDD.512.66.0F.WIG 73 /7 ib","V","V","AVX512BW","scale64","w,r,r","",""
+"VPSLLQ xmmV, xmm2, imm8u","VPSLLQ imm8u, xmm2, xmmV","vpsllq imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 73 /6 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSLLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSLLQ imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","vpsllq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 73 /6 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSLLQ ymmV, ymm2, imm8u","VPSLLQ imm8u, ymm2, ymmV","vpsllq imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 73 /6 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSLLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSLLQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","vpsllq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 73 /6 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSLLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSLLQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","vpsllq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 73 /6 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSLLQ xmm1, xmmV, xmm2/m128","VPSLLQ xmm2/m128, xmmV, xmm1","vpsllq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F3 /r","V","V","AVX","","w,r,r","",""
+"VPSLLQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLQ xmm2/m128, xmmV, {k}{z}, xmm1","vpsllq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 F3 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLQ ymm1, ymmV, xmm2/m128","VPSLLQ xmm2/m128, ymmV, ymm1","vpsllq xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F3 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLQ xmm2/m128, ymmV, {k}{z}, ymm1","vpsllq xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 F3 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLQ xmm2/m128, zmmV, {k}{z}, zmm1","vpsllq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 F3 /r","V","V","AVX512F","scale16","w,r,r,r","",""
+"VPSLLVD xmm1, xmmV, xmm2/m128","VPSLLVD xmm2/m128, xmmV, xmm1","vpsllvd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 47 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSLLVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpsllvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 47 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSLLVD ymm1, ymmV, ymm2/m256","VPSLLVD ymm2/m256, ymmV, ymm1","vpsllvd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 47 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSLLVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpsllvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 47 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSLLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSLLVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpsllvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 47 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSLLVQ xmm1, xmmV, xmm2/m128","VPSLLVQ xmm2/m128, xmmV, xmm1","vpsllvq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 47 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSLLVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpsllvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 47 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSLLVQ ymm1, ymmV, ymm2/m256","VPSLLVQ ymm2/m256, ymmV, ymm1","vpsllvq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 47 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSLLVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpsllvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 47 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSLLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSLLVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpsllvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 47 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSLLVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLVW xmm2/m128, xmmV, {k}{z}, xmm1","vpsllvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 12 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSLLVW ymm2/m256, ymmV, {k}{z}, ymm1","vpsllvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 12 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSLLVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSLLVW zmm2/m512, zmmV, {k}{z}, zmm1","vpsllvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 12 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSLLW xmmV, xmm2, imm8u","VPSLLW imm8u, xmm2, xmmV","vpsllw imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 71 /6 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSLLW xmmV, {k}{z}, xmm2/m128, imm8u","VPSLLW imm8u, xmm2/m128, {k}{z}, xmmV","vpsllw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /6 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLW ymmV, ymm2, imm8u","VPSLLW imm8u, ymm2, ymmV","vpsllw imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 71 /6 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSLLW ymmV, {k}{z}, ymm2/m256, imm8u","VPSLLW imm8u, ymm2/m256, {k}{z}, ymmV","vpsllw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /6 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSLLW zmmV, {k}{z}, zmm2/m512, imm8u","VPSLLW imm8u, zmm2/m512, {k}{z}, zmmV","vpsllw imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /6 ib","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSLLW xmm1, xmmV, xmm2/m128","VPSLLW xmm2/m128, xmmV, xmm1","vpsllw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F1 /r","V","V","AVX","","w,r,r","",""
+"VPSLLW xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLW xmm2/m128, xmmV, {k}{z}, xmm1","vpsllw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLW ymm1, ymmV, xmm2/m128","VPSLLW xmm2/m128, ymmV, ymm1","vpsllw xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F1 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLW ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLW xmm2/m128, ymmV, {k}{z}, ymm1","vpsllw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLW zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLW xmm2/m128, zmmV, {k}{z}, zmm1","vpsllw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F1 /r","V","V","AVX512BW","scale16","w,r,r,r","",""
+"VPSRAD xmmV, xmm2, imm8u","VPSRAD imm8u, xmm2, xmmV","vpsrad imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 72 /4 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSRAD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSRAD imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","vpsrad imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /4 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSRAD ymmV, ymm2, imm8u","VPSRAD imm8u, ymm2, ymmV","vpsrad imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 72 /4 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSRAD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSRAD imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","vpsrad imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /4 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSRAD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSRAD imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","vpsrad imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /4 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSRAD xmm1, xmmV, xmm2/m128","VPSRAD xmm2/m128, xmmV, xmm1","vpsrad xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E2 /r","V","V","AVX","","w,r,r","",""
+"VPSRAD xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAD xmm2/m128, xmmV, {k}{z}, xmm1","vpsrad xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 E2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAD ymm1, ymmV, xmm2/m128","VPSRAD xmm2/m128, ymmV, ymm1","vpsrad xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E2 /r","V","V","AVX2","","w,r,r","",""
+"VPSRAD ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAD xmm2/m128, ymmV, {k}{z}, ymm1","vpsrad xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 E2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAD zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAD xmm2/m128, zmmV, {k}{z}, zmm1","vpsrad xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 E2 /r","V","V","AVX512F","scale16","w,r,r,r","",""
+"VPSRAQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSRAQ imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","vpsraq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 72 /4 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSRAQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSRAQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","vpsraq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 72 /4 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSRAQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSRAQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","vpsraq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 72 /4 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSRAQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAQ xmm2/m128, xmmV, {k}{z}, xmm1","vpsraq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 E2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAQ xmm2/m128, ymmV, {k}{z}, ymm1","vpsraq xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 E2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAQ xmm2/m128, zmmV, {k}{z}, zmm1","vpsraq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 E2 /r","V","V","AVX512F","scale16","w,r,r,r","",""
+"VPSRAVD xmm1, xmmV, xmm2/m128","VPSRAVD xmm2/m128, xmmV, xmm1","vpsravd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 46 /r","V","V","AVX2","","w,r,r","",""
+"VPSRAVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSRAVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpsravd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 46 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSRAVD ymm1, ymmV, ymm2/m256","VPSRAVD ymm2/m256, ymmV, ymm1","vpsravd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 46 /r","V","V","AVX2","","w,r,r","",""
+"VPSRAVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSRAVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpsravd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 46 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSRAVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSRAVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpsravd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 46 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSRAVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSRAVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpsravq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 46 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSRAVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSRAVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpsravq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 46 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSRAVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSRAVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpsravq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 46 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSRAVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAVW xmm2/m128, xmmV, {k}{z}, xmm1","vpsravw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 11 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSRAVW ymm2/m256, ymmV, {k}{z}, ymm1","vpsravw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 11 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSRAVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSRAVW zmm2/m512, zmmV, {k}{z}, zmm1","vpsravw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 11 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSRAW xmmV, xmm2, imm8u","VPSRAW imm8u, xmm2, xmmV","vpsraw imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 71 /4 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSRAW xmmV, {k}{z}, xmm2/m128, imm8u","VPSRAW imm8u, xmm2/m128, {k}{z}, xmmV","vpsraw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /4 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAW ymmV, ymm2, imm8u","VPSRAW imm8u, ymm2, ymmV","vpsraw imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 71 /4 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSRAW ymmV, {k}{z}, ymm2/m256, imm8u","VPSRAW imm8u, ymm2/m256, {k}{z}, ymmV","vpsraw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /4 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSRAW zmmV, {k}{z}, zmm2/m512, imm8u","VPSRAW imm8u, zmm2/m512, {k}{z}, zmmV","vpsraw imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /4 ib","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSRAW xmm1, xmmV, xmm2/m128","VPSRAW xmm2/m128, xmmV, xmm1","vpsraw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E1 /r","V","V","AVX","","w,r,r","",""
+"VPSRAW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAW xmm2/m128, xmmV, {k}{z}, xmm1","vpsraw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAW ymm1, ymmV, xmm2/m128","VPSRAW xmm2/m128, ymmV, ymm1","vpsraw xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E1 /r","V","V","AVX2","","w,r,r","",""
+"VPSRAW ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAW xmm2/m128, ymmV, {k}{z}, ymm1","vpsraw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAW zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAW xmm2/m128, zmmV, {k}{z}, zmm1","vpsraw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E1 /r","V","V","AVX512BW","scale16","w,r,r,r","",""
+"VPSRLD xmmV, xmm2, imm8u","VPSRLD imm8u, xmm2, xmmV","vpsrld imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 72 /2 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSRLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSRLD imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","vpsrld imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /2 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSRLD ymmV, ymm2, imm8u","VPSRLD imm8u, ymm2, ymmV","vpsrld imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 72 /2 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSRLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSRLD imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","vpsrld imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /2 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSRLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSRLD imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","vpsrld imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /2 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSRLD xmm1, xmmV, xmm2/m128","VPSRLD xmm2/m128, xmmV, xmm1","vpsrld xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D2 /r","V","V","AVX","","w,r,r","",""
+"VPSRLD xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLD xmm2/m128, xmmV, {k}{z}, xmm1","vpsrld xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 D2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLD ymm1, ymmV, xmm2/m128","VPSRLD xmm2/m128, ymmV, ymm1","vpsrld xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D2 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLD ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLD xmm2/m128, ymmV, {k}{z}, ymm1","vpsrld xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 D2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLD zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLD xmm2/m128, zmmV, {k}{z}, zmm1","vpsrld xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 D2 /r","V","V","AVX512F","scale16","w,r,r,r","",""
+"VPSRLDQ xmmV, xmm2, imm8u","VPSRLDQ imm8u, xmm2, xmmV","vpsrldq imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 73 /3 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSRLDQ xmmV, xmm2/m128, imm8u","VPSRLDQ imm8u, xmm2/m128, xmmV","vpsrldq imm8u, xmm2/m128, xmmV","EVEX.NDD.128.66.0F.WIG 73 /3 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPSRLDQ ymmV, ymm2, imm8u","VPSRLDQ imm8u, ymm2, ymmV","vpsrldq imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 73 /3 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSRLDQ ymmV, ymm2/m256, imm8u","VPSRLDQ imm8u, ymm2/m256, ymmV","vpsrldq imm8u, ymm2/m256, ymmV","EVEX.NDD.256.66.0F.WIG 73 /3 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VPSRLDQ zmmV, zmm2/m512, imm8u","VPSRLDQ imm8u, zmm2/m512, zmmV","vpsrldq imm8u, zmm2/m512, zmmV","EVEX.NDD.512.66.0F.WIG 73 /3 ib","V","V","AVX512BW","scale64","w,r,r","",""
+"VPSRLQ xmmV, xmm2, imm8u","VPSRLQ imm8u, xmm2, xmmV","vpsrlq imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 73 /2 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSRLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSRLQ imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","vpsrlq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 73 /2 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSRLQ ymmV, ymm2, imm8u","VPSRLQ imm8u, ymm2, ymmV","vpsrlq imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 73 /2 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSRLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSRLQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","vpsrlq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 73 /2 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSRLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSRLQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","vpsrlq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 73 /2 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSRLQ xmm1, xmmV, xmm2/m128","VPSRLQ xmm2/m128, xmmV, xmm1","vpsrlq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D3 /r","V","V","AVX","","w,r,r","",""
+"VPSRLQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLQ xmm2/m128, xmmV, {k}{z}, xmm1","vpsrlq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 D3 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLQ ymm1, ymmV, xmm2/m128","VPSRLQ xmm2/m128, ymmV, ymm1","vpsrlq xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D3 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLQ xmm2/m128, ymmV, {k}{z}, ymm1","vpsrlq xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 D3 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLQ xmm2/m128, zmmV, {k}{z}, zmm1","vpsrlq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 D3 /r","V","V","AVX512F","scale16","w,r,r,r","",""
+"VPSRLVD xmm1, xmmV, xmm2/m128","VPSRLVD xmm2/m128, xmmV, xmm1","vpsrlvd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 45 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSRLVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpsrlvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 45 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSRLVD ymm1, ymmV, ymm2/m256","VPSRLVD ymm2/m256, ymmV, ymm1","vpsrlvd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 45 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSRLVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpsrlvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 45 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSRLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSRLVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpsrlvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 45 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSRLVQ xmm1, xmmV, xmm2/m128","VPSRLVQ xmm2/m128, xmmV, xmm1","vpsrlvq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 45 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSRLVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpsrlvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 45 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSRLVQ ymm1, ymmV, ymm2/m256","VPSRLVQ ymm2/m256, ymmV, ymm1","vpsrlvq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 45 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSRLVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpsrlvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 45 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSRLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSRLVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpsrlvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 45 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSRLVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLVW xmm2/m128, xmmV, {k}{z}, xmm1","vpsrlvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 10 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSRLVW ymm2/m256, ymmV, {k}{z}, ymm1","vpsrlvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 10 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSRLVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSRLVW zmm2/m512, zmmV, {k}{z}, zmm1","vpsrlvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 10 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSRLW xmmV, xmm2, imm8u","VPSRLW imm8u, xmm2, xmmV","vpsrlw imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 71 /2 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSRLW xmmV, {k}{z}, xmm2/m128, imm8u","VPSRLW imm8u, xmm2/m128, {k}{z}, xmmV","vpsrlw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /2 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLW ymmV, ymm2, imm8u","VPSRLW imm8u, ymm2, ymmV","vpsrlw imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 71 /2 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSRLW ymmV, {k}{z}, ymm2/m256, imm8u","VPSRLW imm8u, ymm2/m256, {k}{z}, ymmV","vpsrlw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /2 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSRLW zmmV, {k}{z}, zmm2/m512, imm8u","VPSRLW imm8u, zmm2/m512, {k}{z}, zmmV","vpsrlw imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /2 ib","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSRLW xmm1, xmmV, xmm2/m128","VPSRLW xmm2/m128, xmmV, xmm1","vpsrlw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D1 /r","V","V","AVX","","w,r,r","",""
+"VPSRLW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLW xmm2/m128, xmmV, {k}{z}, xmm1","vpsrlw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLW ymm1, ymmV, xmm2/m128","VPSRLW xmm2/m128, ymmV, ymm1","vpsrlw xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D1 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLW ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLW xmm2/m128, ymmV, {k}{z}, ymm1","vpsrlw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLW zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLW xmm2/m128, zmmV, {k}{z}, zmm1","vpsrlw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D1 /r","V","V","AVX512BW","scale16","w,r,r,r","",""
+"VPSUBB xmm1, xmmV, xmm2/m128","VPSUBB xmm2/m128, xmmV, xmm1","vpsubb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F8 /r","V","V","AVX","","w,r,r","",""
+"VPSUBB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBB xmm2/m128, xmmV, {k}{z}, xmm1","vpsubb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F8 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSUBB ymm1, ymmV, ymm2/m256","VPSUBB ymm2/m256, ymmV, ymm1","vpsubb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F8 /r","V","V","AVX2","","w,r,r","",""
+"VPSUBB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBB ymm2/m256, ymmV, {k}{z}, ymm1","vpsubb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F8 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSUBB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBB zmm2/m512, zmmV, {k}{z}, zmm1","vpsubb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F8 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSUBD xmm1, xmmV, xmm2/m128","VPSUBD xmm2/m128, xmmV, xmm1","vpsubd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FA /r","V","V","AVX","","w,r,r","",""
+"VPSUBD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSUBD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpsubd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 FA /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSUBD ymm1, ymmV, ymm2/m256","VPSUBD ymm2/m256, ymmV, ymm1","vpsubd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FA /r","V","V","AVX2","","w,r,r","",""
+"VPSUBD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSUBD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpsubd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 FA /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSUBD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSUBD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpsubd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 FA /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSUBQ xmm1, xmmV, xmm2/m128","VPSUBQ xmm2/m128, xmmV, xmm1","vpsubq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FB /r","V","V","AVX","","w,r,r","",""
+"VPSUBQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSUBQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpsubq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 FB /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSUBQ ymm1, ymmV, ymm2/m256","VPSUBQ ymm2/m256, ymmV, ymm1","vpsubq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FB /r","V","V","AVX2","","w,r,r","",""
+"VPSUBQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSUBQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpsubq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 FB /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSUBQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSUBQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpsubq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 FB /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSUBSB xmm1, xmmV, xmm2/m128","VPSUBSB xmm2/m128, xmmV, xmm1","vpsubsb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E8 /r","V","V","AVX","","w,r,r","",""
+"VPSUBSB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBSB xmm2/m128, xmmV, {k}{z}, xmm1","vpsubsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E8 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSUBSB ymm1, ymmV, ymm2/m256","VPSUBSB ymm2/m256, ymmV, ymm1","vpsubsb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E8 /r","V","V","AVX2","","w,r,r","",""
+"VPSUBSB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBSB ymm2/m256, ymmV, {k}{z}, ymm1","vpsubsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E8 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSUBSB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBSB zmm2/m512, zmmV, {k}{z}, zmm1","vpsubsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E8 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSUBSW xmm1, xmmV, xmm2/m128","VPSUBSW xmm2/m128, xmmV, xmm1","vpsubsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E9 /r","V","V","AVX","","w,r,r","",""
+"VPSUBSW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBSW xmm2/m128, xmmV, {k}{z}, xmm1","vpsubsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E9 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSUBSW ymm1, ymmV, ymm2/m256","VPSUBSW ymm2/m256, ymmV, ymm1","vpsubsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E9 /r","V","V","AVX2","","w,r,r","",""
+"VPSUBSW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBSW ymm2/m256, ymmV, {k}{z}, ymm1","vpsubsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E9 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSUBSW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBSW zmm2/m512, zmmV, {k}{z}, zmm1","vpsubsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E9 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSUBUSB xmm1, xmmV, xmm2/m128","VPSUBUSB xmm2/m128, xmmV, xmm1","vpsubusb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D8 /r","V","V","AVX","","w,r,r","",""
+"VPSUBUSB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBUSB xmm2/m128, xmmV, {k}{z}, xmm1","vpsubusb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D8 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSUBUSB ymm1, ymmV, ymm2/m256","VPSUBUSB ymm2/m256, ymmV, ymm1","vpsubusb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D8 /r","V","V","AVX2","","w,r,r","",""
+"VPSUBUSB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBUSB ymm2/m256, ymmV, {k}{z}, ymm1","vpsubusb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D8 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSUBUSB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBUSB zmm2/m512, zmmV, {k}{z}, zmm1","vpsubusb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D8 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSUBUSW xmm1, xmmV, xmm2/m128","VPSUBUSW xmm2/m128, xmmV, xmm1","vpsubusw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D9 /r","V","V","AVX","","w,r,r","",""
+"VPSUBUSW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBUSW xmm2/m128, xmmV, {k}{z}, xmm1","vpsubusw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D9 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSUBUSW ymm1, ymmV, ymm2/m256","VPSUBUSW ymm2/m256, ymmV, ymm1","vpsubusw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D9 /r","V","V","AVX2","","w,r,r","",""
+"VPSUBUSW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBUSW ymm2/m256, ymmV, {k}{z}, ymm1","vpsubusw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D9 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSUBUSW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBUSW zmm2/m512, zmmV, {k}{z}, zmm1","vpsubusw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D9 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSUBW xmm1, xmmV, xmm2/m128","VPSUBW xmm2/m128, xmmV, xmm1","vpsubw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F9 /r","V","V","AVX","","w,r,r","",""
+"VPSUBW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBW xmm2/m128, xmmV, {k}{z}, xmm1","vpsubw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F9 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSUBW ymm1, ymmV, ymm2/m256","VPSUBW ymm2/m256, ymmV, ymm1","vpsubw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F9 /r","V","V","AVX2","","w,r,r","",""
+"VPSUBW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBW ymm2/m256, ymmV, {k}{z}, ymm1","vpsubw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F9 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSUBW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBW zmm2/m512, zmmV, {k}{z}, zmm1","vpsubw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F9 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPTERNLOGD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPTERNLOGD imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpternlogd imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W0 25 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r,r","",""
+"VPTERNLOGD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPTERNLOGD imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpternlogd imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W0 25 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r,r","",""
+"VPTERNLOGD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPTERNLOGD imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpternlogd imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W0 25 /r ib","V","V","AVX512F","bscale4,scale64","rw,r,r,r,r","",""
+"VPTERNLOGQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPTERNLOGQ imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpternlogq imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W1 25 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r,r","",""
+"VPTERNLOGQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPTERNLOGQ imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpternlogq imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W1 25 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r,r","",""
+"VPTERNLOGQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPTERNLOGQ imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpternlogq imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W1 25 /r ib","V","V","AVX512F","bscale8,scale64","rw,r,r,r,r","",""
+"VPTEST xmm1, xmm2/m128","VPTEST xmm2/m128, xmm1","vptest xmm2/m128, xmm1","VEX.128.66.0F38.WIG 17 /r","V","V","AVX","","r,r","",""
+"VPTEST ymm1, ymm2/m256","VPTEST ymm2/m256, ymm1","vptest ymm2/m256, ymm1","VEX.256.66.0F38.WIG 17 /r","V","V","AVX","","r,r","",""
+"VPTESTMB k1, {k}, xmmV, xmm2/m128","VPTESTMB xmm2/m128, xmmV, {k}, k1","vptestmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W0 26 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPTESTMB k1, {k}, ymmV, ymm2/m256","VPTESTMB ymm2/m256, ymmV, {k}, k1","vptestmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W0 26 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPTESTMB k1, {k}, zmmV, zmm2/m512","VPTESTMB zmm2/m512, zmmV, {k}, k1","vptestmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W0 26 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPTESTMD k1, {k}, xmmV, xmm2/m128/m32bcst","VPTESTMD xmm2/m128/m32bcst, xmmV, {k}, k1","vptestmd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPTESTMD k1, {k}, ymmV, ymm2/m256/m32bcst","VPTESTMD ymm2/m256/m32bcst, ymmV, {k}, k1","vptestmd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPTESTMD k1, {k}, zmmV, zmm2/m512/m32bcst","VPTESTMD zmm2/m512/m32bcst, zmmV, {k}, k1","vptestmd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W0 27 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPTESTMQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPTESTMQ xmm2/m128/m64bcst, xmmV, {k}, k1","vptestmq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPTESTMQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPTESTMQ ymm2/m256/m64bcst, ymmV, {k}, k1","vptestmq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPTESTMQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPTESTMQ zmm2/m512/m64bcst, zmmV, {k}, k1","vptestmq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W1 27 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPTESTMW k1, {k}, xmmV, xmm2/m128","VPTESTMW xmm2/m128, xmmV, {k}, k1","vptestmw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W1 26 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPTESTMW k1, {k}, ymmV, ymm2/m256","VPTESTMW ymm2/m256, ymmV, {k}, k1","vptestmw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W1 26 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPTESTMW k1, {k}, zmmV, zmm2/m512","VPTESTMW zmm2/m512, zmmV, {k}, k1","vptestmw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W1 26 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPTESTNMB k1, {k}, xmmV, xmm2/m128","VPTESTNMB xmm2/m128, xmmV, {k}, k1","vptestnmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W0 26 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPTESTNMB k1, {k}, ymmV, ymm2/m256","VPTESTNMB ymm2/m256, ymmV, {k}, k1","vptestnmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W0 26 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPTESTNMB k1, {k}, zmmV, zmm2/m512","VPTESTNMB zmm2/m512, zmmV, {k}, k1","vptestnmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W0 26 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPTESTNMD k1, {k}, xmmV, xmm2/m128/m32bcst","VPTESTNMD xmm2/m128/m32bcst, xmmV, {k}, k1","vptestnmd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPTESTNMD k1, {k}, ymmV, ymm2/m256/m32bcst","VPTESTNMD ymm2/m256/m32bcst, ymmV, {k}, k1","vptestnmd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPTESTNMD k1, {k}, zmmV, zmm2/m512/m32bcst","VPTESTNMD zmm2/m512/m32bcst, zmmV, {k}, k1","vptestnmd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W0 27 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPTESTNMQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPTESTNMQ xmm2/m128/m64bcst, xmmV, {k}, k1","vptestnmq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPTESTNMQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPTESTNMQ ymm2/m256/m64bcst, ymmV, {k}, k1","vptestnmq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPTESTNMQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPTESTNMQ zmm2/m512/m64bcst, zmmV, {k}, k1","vptestnmq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W1 27 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPTESTNMW k1, {k}, xmmV, xmm2/m128","VPTESTNMW xmm2/m128, xmmV, {k}, k1","vptestnmw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W1 26 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPTESTNMW k1, {k}, ymmV, ymm2/m256","VPTESTNMW ymm2/m256, ymmV, {k}, k1","vptestnmw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W1 26 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPTESTNMW k1, {k}, zmmV, zmm2/m512","VPTESTNMW zmm2/m512, zmmV, {k}, k1","vptestnmw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W1 26 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPUNPCKHBW xmm1, xmmV, xmm2/m128","VPUNPCKHBW xmm2/m128, xmmV, xmm1","vpunpckhbw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 68 /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKHBW xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKHBW xmm2/m128, xmmV, {k}{z}, xmm1","vpunpckhbw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG 68 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPUNPCKHBW ymm1, ymmV, ymm2/m256","VPUNPCKHBW ymm2/m256, ymmV, ymm1","vpunpckhbw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 68 /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKHBW ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKHBW ymm2/m256, ymmV, {k}{z}, ymm1","vpunpckhbw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG 68 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPUNPCKHBW zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKHBW zmm2/m512, zmmV, {k}{z}, zmm1","vpunpckhbw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG 68 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPUNPCKHDQ xmm1, xmmV, xmm2/m128","VPUNPCKHDQ xmm2/m128, xmmV, xmm1","vpunpckhdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6A /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKHDQ xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPUNPCKHDQ xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpunpckhdq xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 6A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPUNPCKHDQ ymm1, ymmV, ymm2/m256","VPUNPCKHDQ ymm2/m256, ymmV, ymm1","vpunpckhdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6A /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKHDQ ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPUNPCKHDQ ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpunpckhdq ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 6A /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPUNPCKHDQ zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPUNPCKHDQ zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpunpckhdq zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 6A /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPUNPCKHQDQ xmm1, xmmV, xmm2/m128","VPUNPCKHQDQ xmm2/m128, xmmV, xmm1","vpunpckhqdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6D /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKHQDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPUNPCKHQDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpunpckhqdq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 6D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPUNPCKHQDQ ymm1, ymmV, ymm2/m256","VPUNPCKHQDQ ymm2/m256, ymmV, ymm1","vpunpckhqdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6D /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKHQDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPUNPCKHQDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpunpckhqdq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 6D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPUNPCKHQDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPUNPCKHQDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpunpckhqdq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 6D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPUNPCKHWD xmm1, xmmV, xmm2/m128","VPUNPCKHWD xmm2/m128, xmmV, xmm1","vpunpckhwd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 69 /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKHWD xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKHWD xmm2/m128, xmmV, {k}{z}, xmm1","vpunpckhwd xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG 69 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPUNPCKHWD ymm1, ymmV, ymm2/m256","VPUNPCKHWD ymm2/m256, ymmV, ymm1","vpunpckhwd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 69 /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKHWD ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKHWD ymm2/m256, ymmV, {k}{z}, ymm1","vpunpckhwd ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG 69 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPUNPCKHWD zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKHWD zmm2/m512, zmmV, {k}{z}, zmm1","vpunpckhwd zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG 69 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPUNPCKLBW xmm1, xmmV, xmm2/m128","VPUNPCKLBW xmm2/m128, xmmV, xmm1","vpunpcklbw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 60 /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKLBW xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKLBW xmm2/m128, xmmV, {k}{z}, xmm1","vpunpcklbw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG 60 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPUNPCKLBW ymm1, ymmV, ymm2/m256","VPUNPCKLBW ymm2/m256, ymmV, ymm1","vpunpcklbw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 60 /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKLBW ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKLBW ymm2/m256, ymmV, {k}{z}, ymm1","vpunpcklbw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG 60 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPUNPCKLBW zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKLBW zmm2/m512, zmmV, {k}{z}, zmm1","vpunpcklbw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG 60 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPUNPCKLDQ xmm1, xmmV, xmm2/m128","VPUNPCKLDQ xmm2/m128, xmmV, xmm1","vpunpckldq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 62 /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKLDQ xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPUNPCKLDQ xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpunpckldq xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 62 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPUNPCKLDQ ymm1, ymmV, ymm2/m256","VPUNPCKLDQ ymm2/m256, ymmV, ymm1","vpunpckldq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 62 /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKLDQ ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPUNPCKLDQ ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpunpckldq ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 62 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPUNPCKLDQ zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPUNPCKLDQ zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpunpckldq zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 62 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPUNPCKLQDQ xmm1, xmmV, xmm2/m128","VPUNPCKLQDQ xmm2/m128, xmmV, xmm1","vpunpcklqdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6C /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKLQDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPUNPCKLQDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpunpcklqdq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 6C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPUNPCKLQDQ ymm1, ymmV, ymm2/m256","VPUNPCKLQDQ ymm2/m256, ymmV, ymm1","vpunpcklqdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6C /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKLQDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPUNPCKLQDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpunpcklqdq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 6C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPUNPCKLQDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPUNPCKLQDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpunpcklqdq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 6C /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPUNPCKLWD xmm1, xmmV, xmm2/m128","VPUNPCKLWD xmm2/m128, xmmV, xmm1","vpunpcklwd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 61 /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKLWD xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKLWD xmm2/m128, xmmV, {k}{z}, xmm1","vpunpcklwd xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG 61 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPUNPCKLWD ymm1, ymmV, ymm2/m256","VPUNPCKLWD ymm2/m256, ymmV, ymm1","vpunpcklwd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 61 /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKLWD ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKLWD ymm2/m256, ymmV, {k}{z}, ymm1","vpunpcklwd ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG 61 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPUNPCKLWD zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKLWD zmm2/m512, zmmV, {k}{z}, zmm1","vpunpcklwd zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG 61 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPXOR xmm1, xmmV, xmm2/m128","VPXOR xmm2/m128, xmmV, xmm1","vpxor xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EF /r","V","V","AVX","","w,r,r","",""
+"VPXOR ymm1, ymmV, ymm2/m256","VPXOR ymm2/m256, ymmV, ymm1","vpxor ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EF /r","V","V","AVX2","","w,r,r","",""
+"VPXORD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPXORD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpxord xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 EF /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPXORD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPXORD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpxord ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 EF /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPXORD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPXORD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpxord zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 EF /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPXORQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPXORQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpxorq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 EF /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPXORQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPXORQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpxorq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 EF /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPXORQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPXORQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpxorq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 EF /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VRANGEPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u:4","VRANGEPD imm8u:4, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vrangepd imm8u:4, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 50 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VRANGEPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u:4","VRANGEPD imm8u:4, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vrangepd imm8u:4, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 50 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VRANGEPD zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u:4","VRANGEPD imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}","vrangepd imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F3A.W1 50 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
+"VRANGEPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u:4","VRANGEPD imm8u:4, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vrangepd imm8u:4, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 50 /r ib","V","V","AVX512DQ","bscale8,scale64","w,r,r,r,r","",""
+"VRANGEPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u:4","VRANGEPS imm8u:4, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vrangeps imm8u:4, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 50 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VRANGEPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u:4","VRANGEPS imm8u:4, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vrangeps imm8u:4, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 50 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VRANGEPS zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u:4","VRANGEPS imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}","vrangeps imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F3A.W0 50 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
+"VRANGEPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u:4","VRANGEPS imm8u:4, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vrangeps imm8u:4, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 50 /r ib","V","V","AVX512DQ","bscale4,scale64","w,r,r,r,r","",""
+"VRANGESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VRANGESD imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","vrangesd imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W1 51 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
+"VRANGESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u:4","VRANGESD imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","vrangesd imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W1 51 /r ib","V","V","AVX512DQ","scale8","w,r,r,r,r","",""
+"VRANGESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VRANGESS imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","vrangess imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W0 51 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
+"VRANGESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u:4","VRANGESS imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","vrangess imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W0 51 /r ib","V","V","AVX512DQ","scale4","w,r,r,r,r","",""
+"VRCP14PD xmm1, {k}{z}, xmm2/m128/m64bcst","VRCP14PD xmm2/m128/m64bcst, {k}{z}, xmm1","vrcp14pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 4C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
+"VRCP14PD ymm1, {k}{z}, ymm2/m256/m64bcst","VRCP14PD ymm2/m256/m64bcst, {k}{z}, ymm1","vrcp14pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 4C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
+"VRCP14PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRCP14PD zmm2/m512/m64bcst, {k}{z}, zmm1","vrcp14pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 4C /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
+"VRCP14PS xmm1, {k}{z}, xmm2/m128/m32bcst","VRCP14PS xmm2/m128/m32bcst, {k}{z}, xmm1","vrcp14ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 4C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VRCP14PS ymm1, {k}{z}, ymm2/m256/m32bcst","VRCP14PS ymm2/m256/m32bcst, {k}{z}, ymm1","vrcp14ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 4C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VRCP14PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRCP14PS zmm2/m512/m32bcst, {k}{z}, zmm1","vrcp14ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 4C /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VRCP14SD xmm1, {k}{z}, xmmV, xmm2/m64","VRCP14SD xmm2/m64, xmmV, {k}{z}, xmm1","vrcp14sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 4D /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VRCP14SS xmm1, {k}{z}, xmmV, xmm2/m32","VRCP14SS xmm2/m32, xmmV, {k}{z}, xmm1","vrcp14ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 4D /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VRCP28PD zmm1{sae}, {k}{z}, zmm2","VRCP28PD zmm2, {k}{z}, zmm1{sae}","vrcp28pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 CA /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
+"VRCP28PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRCP28PD zmm2/m512/m64bcst, {k}{z}, zmm1","vrcp28pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 CA /r","V","V","AVX512ER","bscale8,scale64","w,r,r","",""
+"VRCP28PS zmm1{sae}, {k}{z}, zmm2","VRCP28PS zmm2, {k}{z}, zmm1{sae}","vrcp28ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 CA /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
+"VRCP28PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRCP28PS zmm2/m512/m32bcst, {k}{z}, zmm1","vrcp28ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 CA /r","V","V","AVX512ER","bscale4,scale64","w,r,r","",""
+"VRCP28SD xmm1{sae}, {k}{z}, xmmV, xmm2","VRCP28SD xmm2, xmmV, {k}{z}, xmm1{sae}","vrcp28sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W1 CB /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","",""
+"VRCP28SD xmm1, {k}{z}, xmmV, xmm2/m64","VRCP28SD xmm2/m64, xmmV, {k}{z}, xmm1","vrcp28sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 CB /r","V","V","AVX512ER","scale8","w,r,r,r","",""
+"VRCP28SS xmm1{sae}, {k}{z}, xmmV, xmm2","VRCP28SS xmm2, xmmV, {k}{z}, xmm1{sae}","vrcp28ss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W0 CB /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","",""
+"VRCP28SS xmm1, {k}{z}, xmmV, xmm2/m32","VRCP28SS xmm2/m32, xmmV, {k}{z}, xmm1","vrcp28ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 CB /r","V","V","AVX512ER","scale4","w,r,r,r","",""
+"VRCPPS xmm1, xmm2/m128","VRCPPS xmm2/m128, xmm1","vrcpps xmm2/m128, xmm1","VEX.128.0F.WIG 53 /r","V","V","AVX","","w,r","",""
+"VRCPPS ymm1, ymm2/m256","VRCPPS ymm2/m256, ymm1","vrcpps ymm2/m256, ymm1","VEX.256.0F.WIG 53 /r","V","V","AVX","","w,r","",""
+"VRCPSS xmm1, xmmV, xmm2/m32","VRCPSS xmm2/m32, xmmV, xmm1","vrcpss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 53 /r","V","V","AVX","","w,r,r","",""
+"VREDUCEPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VREDUCEPD imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","vreducepd imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W1 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VREDUCEPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VREDUCEPD imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vreducepd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VREDUCEPD zmm1{sae}, {k}{z}, zmm2, imm8u","VREDUCEPD imm8u, zmm2, {k}{z}, zmm1{sae}","vreducepd imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W1 56 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r","",""
+"VREDUCEPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VREDUCEPD imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vreducepd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 56 /r ib","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
+"VREDUCEPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VREDUCEPS imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","vreduceps imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W0 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VREDUCEPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VREDUCEPS imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","vreduceps imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W0 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VREDUCEPS zmm1{sae}, {k}{z}, zmm2, imm8u","VREDUCEPS imm8u, zmm2, {k}{z}, zmm1{sae}","vreduceps imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W0 56 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r","",""
+"VREDUCEPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VREDUCEPS imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","vreduceps imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W0 56 /r ib","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
+"VREDUCESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VREDUCESD imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vreducesd imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W1 57 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
+"VREDUCESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VREDUCESD imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","vreducesd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W1 57 /r ib","V","V","AVX512DQ","scale8","w,r,r,r,r","",""
+"VREDUCESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VREDUCESS imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vreducess imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W0 57 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
+"VREDUCESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VREDUCESS imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","vreducess imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W0 57 /r ib","V","V","AVX512DQ","scale4","w,r,r,r,r","",""
+"VRNDSCALEPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VRNDSCALEPD imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","vrndscalepd imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W1 09 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VRNDSCALEPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VRNDSCALEPD imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vrndscalepd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 09 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VRNDSCALEPD zmm1{sae}, {k}{z}, zmm2, imm8u","VRNDSCALEPD imm8u, zmm2, {k}{z}, zmm1{sae}","vrndscalepd imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W1 09 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VRNDSCALEPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VRNDSCALEPD imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vrndscalepd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 09 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VRNDSCALEPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VRNDSCALEPS imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","vrndscaleps imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W0 08 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VRNDSCALEPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VRNDSCALEPS imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","vrndscaleps imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W0 08 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VRNDSCALEPS zmm1{sae}, {k}{z}, zmm2, imm8u","VRNDSCALEPS imm8u, zmm2, {k}{z}, zmm1{sae}","vrndscaleps imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W0 08 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VRNDSCALEPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VRNDSCALEPS imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","vrndscaleps imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W0 08 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VRNDSCALESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VRNDSCALESD imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vrndscalesd imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W1 0B /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VRNDSCALESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VRNDSCALESD imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","vrndscalesd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W1 0B /r ib","V","V","AVX512F","scale8","w,r,r,r,r","",""
+"VRNDSCALESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VRNDSCALESS imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vrndscaless imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W0 0A /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VRNDSCALESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VRNDSCALESS imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","vrndscaless imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W0 0A /r ib","V","V","AVX512F","scale4","w,r,r,r,r","",""
+"VROUNDPD xmm1, xmm2/m128, imm8u","VROUNDPD imm8u, xmm2/m128, xmm1","vroundpd imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 09 /r ib","V","V","AVX","","w,r,r","",""
+"VROUNDPD ymm1, ymm2/m256, imm8u","VROUNDPD imm8u, ymm2/m256, ymm1","vroundpd imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.WIG 09 /r ib","V","V","AVX","","w,r,r","",""
+"VROUNDPS xmm1, xmm2/m128, imm8u","VROUNDPS imm8u, xmm2/m128, xmm1","vroundps imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 08 /r ib","V","V","AVX","","w,r,r","",""
+"VROUNDPS ymm1, ymm2/m256, imm8u","VROUNDPS imm8u, ymm2/m256, ymm1","vroundps imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.WIG 08 /r ib","V","V","AVX","","w,r,r","",""
+"VROUNDSD xmm1, xmmV, xmm2/m64, imm8u","VROUNDSD imm8u, xmm2/m64, xmmV, xmm1","vroundsd imm8u, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.WIG 0B /r ib","V","V","AVX","","w,r,r,r","",""
+"VROUNDSS xmm1, xmmV, xmm2/m32, imm8u","VROUNDSS imm8u, xmm2/m32, xmmV, xmm1","vroundss imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.WIG 0A /r ib","V","V","AVX","","w,r,r,r","",""
+"VRSQRT14PD xmm1, {k}{z}, xmm2/m128/m64bcst","VRSQRT14PD xmm2/m128/m64bcst, {k}{z}, xmm1","vrsqrt14pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 4E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
+"VRSQRT14PD ymm1, {k}{z}, ymm2/m256/m64bcst","VRSQRT14PD ymm2/m256/m64bcst, {k}{z}, ymm1","vrsqrt14pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 4E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
+"VRSQRT14PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRSQRT14PD zmm2/m512/m64bcst, {k}{z}, zmm1","vrsqrt14pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 4E /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
+"VRSQRT14PS xmm1, {k}{z}, xmm2/m128/m32bcst","VRSQRT14PS xmm2/m128/m32bcst, {k}{z}, xmm1","vrsqrt14ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 4E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VRSQRT14PS ymm1, {k}{z}, ymm2/m256/m32bcst","VRSQRT14PS ymm2/m256/m32bcst, {k}{z}, ymm1","vrsqrt14ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 4E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VRSQRT14PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRSQRT14PS zmm2/m512/m32bcst, {k}{z}, zmm1","vrsqrt14ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 4E /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VRSQRT14SD xmm1, {k}{z}, xmmV, xmm2/m64","VRSQRT14SD xmm2/m64, xmmV, {k}{z}, xmm1","vrsqrt14sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 4F /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VRSQRT14SS xmm1, {k}{z}, xmmV, xmm2/m32","VRSQRT14SS xmm2/m32, xmmV, {k}{z}, xmm1","vrsqrt14ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 4F /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VRSQRT28PD zmm1{sae}, {k}{z}, zmm2","VRSQRT28PD zmm2, {k}{z}, zmm1{sae}","vrsqrt28pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 CC /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
+"VRSQRT28PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRSQRT28PD zmm2/m512/m64bcst, {k}{z}, zmm1","vrsqrt28pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 CC /r","V","V","AVX512ER","bscale8,scale64","w,r,r","",""
+"VRSQRT28PS zmm1{sae}, {k}{z}, zmm2","VRSQRT28PS zmm2, {k}{z}, zmm1{sae}","vrsqrt28ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 CC /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
+"VRSQRT28PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRSQRT28PS zmm2/m512/m32bcst, {k}{z}, zmm1","vrsqrt28ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 CC /r","V","V","AVX512ER","bscale4,scale64","w,r,r","",""
+"VRSQRT28SD xmm1{sae}, {k}{z}, xmmV, xmm2","VRSQRT28SD xmm2, xmmV, {k}{z}, xmm1{sae}","vrsqrt28sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W1 CD /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","",""
+"VRSQRT28SD xmm1, {k}{z}, xmmV, xmm2/m64","VRSQRT28SD xmm2/m64, xmmV, {k}{z}, xmm1","vrsqrt28sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 CD /r","V","V","AVX512ER","scale8","w,r,r,r","",""
+"VRSQRT28SS xmm1{sae}, {k}{z}, xmmV, xmm2","VRSQRT28SS xmm2, xmmV, {k}{z}, xmm1{sae}","vrsqrt28ss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W0 CD /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","",""
+"VRSQRT28SS xmm1, {k}{z}, xmmV, xmm2/m32","VRSQRT28SS xmm2/m32, xmmV, {k}{z}, xmm1","vrsqrt28ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 CD /r","V","V","AVX512ER","scale4","w,r,r,r","",""
+"VRSQRTPS xmm1, xmm2/m128","VRSQRTPS xmm2/m128, xmm1","vrsqrtps xmm2/m128, xmm1","VEX.128.0F.WIG 52 /r","V","V","AVX","","w,r","",""
+"VRSQRTPS ymm1, ymm2/m256","VRSQRTPS ymm2/m256, ymm1","vrsqrtps ymm2/m256, ymm1","VEX.256.0F.WIG 52 /r","V","V","AVX","","w,r","",""
+"VRSQRTSS xmm1, xmmV, xmm2/m32","VRSQRTSS xmm2/m32, xmmV, xmm1","vrsqrtss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 52 /r","V","V","AVX","","w,r,r","",""
+"VSCALEFPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VSCALEFPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vscalefpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 2C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VSCALEFPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VSCALEFPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vscalefpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 2C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VSCALEFPD zmm1{er}, {k}{z}, zmmV, zmm2","VSCALEFPD zmm2, zmmV, {k}{z}, zmm1{er}","vscalefpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F38.W1 2C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSCALEFPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VSCALEFPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vscalefpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 2C /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VSCALEFPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VSCALEFPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vscalefps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 2C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VSCALEFPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VSCALEFPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vscalefps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 2C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VSCALEFPS zmm1{er}, {k}{z}, zmmV, zmm2","VSCALEFPS zmm2, zmmV, {k}{z}, zmm1{er}","vscalefps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F38.W0 2C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSCALEFPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VSCALEFPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vscalefps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 2C /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VSCALEFSD xmm1{er}, {k}{z}, xmmV, xmm2","VSCALEFSD xmm2, xmmV, {k}{z}, xmm1{er}","vscalefsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.66.0F38.W1 2D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSCALEFSD xmm1, {k}{z}, xmmV, xmm2/m64","VSCALEFSD xmm2/m64, xmmV, {k}{z}, xmm1","vscalefsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 2D /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VSCALEFSS xmm1{er}, {k}{z}, xmmV, xmm2","VSCALEFSS xmm2, xmmV, {k}{z}, xmm1{er}","vscalefss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.66.0F38.W0 2D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSCALEFSS xmm1, {k}{z}, xmmV, xmm2/m32","VSCALEFSS xmm2/m32, xmmV, {k}{z}, xmm1","vscalefss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 2D /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VSCATTERDPD vm32x, {k1-k7}, xmm1","VSCATTERDPD xmm1, {k1-k7}, vm32x","vscatterdpd xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W1 A2 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VSCATTERDPD vm32x, {k1-k7}, ymm1","VSCATTERDPD ymm1, {k1-k7}, vm32x","vscatterdpd ymm1, {k1-k7}, vm32x","EVEX.256.66.0F38.W1 A2 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VSCATTERDPD vm32y, {k1-k7}, zmm1","VSCATTERDPD zmm1, {k1-k7}, vm32y","vscatterdpd zmm1, {k1-k7}, vm32y","EVEX.512.66.0F38.W1 A2 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VSCATTERDPS vm32x, {k1-k7}, xmm1","VSCATTERDPS xmm1, {k1-k7}, vm32x","vscatterdps xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W0 A2 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VSCATTERDPS vm32y, {k1-k7}, ymm1","VSCATTERDPS ymm1, {k1-k7}, vm32y","vscatterdps ymm1, {k1-k7}, vm32y","EVEX.256.66.0F38.W0 A2 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VSCATTERDPS vm32z, {k1-k7}, zmm1","VSCATTERDPS zmm1, {k1-k7}, vm32z","vscatterdps zmm1, {k1-k7}, vm32z","EVEX.512.66.0F38.W0 A2 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VSCATTERPF0DPD vm32y, {k1-k7}","VSCATTERPF0DPD {k1-k7}, vm32y","vscatterpf0dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /5","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VSCATTERPF0DPS vm32z, {k1-k7}","VSCATTERPF0DPS {k1-k7}, vm32z","vscatterpf0dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /5","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VSCATTERPF0QPD vm64z, {k1-k7}","VSCATTERPF0QPD {k1-k7}, vm64z","vscatterpf0qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /5","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VSCATTERPF0QPS vm64z, {k1-k7}","VSCATTERPF0QPS {k1-k7}, vm64z","vscatterpf0qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /5","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VSCATTERPF1DPD vm32y, {k1-k7}","VSCATTERPF1DPD {k1-k7}, vm32y","vscatterpf1dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /6","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VSCATTERPF1DPS vm32z, {k1-k7}","VSCATTERPF1DPS {k1-k7}, vm32z","vscatterpf1dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /6","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VSCATTERPF1QPD vm64z, {k1-k7}","VSCATTERPF1QPD {k1-k7}, vm64z","vscatterpf1qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /6","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VSCATTERPF1QPS vm64z, {k1-k7}","VSCATTERPF1QPS {k1-k7}, vm64z","vscatterpf1qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /6","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VSCATTERQPD vm64x, {k1-k7}, xmm1","VSCATTERQPD xmm1, {k1-k7}, vm64x","vscatterqpd xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W1 A3 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VSCATTERQPD vm64y, {k1-k7}, ymm1","VSCATTERQPD ymm1, {k1-k7}, vm64y","vscatterqpd ymm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W1 A3 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VSCATTERQPD vm64z, {k1-k7}, zmm1","VSCATTERQPD zmm1, {k1-k7}, vm64z","vscatterqpd zmm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W1 A3 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VSCATTERQPS vm64x, {k1-k7}, xmm1","VSCATTERQPS xmm1, {k1-k7}, vm64x","vscatterqps xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W0 A3 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VSCATTERQPS vm64y, {k1-k7}, xmm1","VSCATTERQPS xmm1, {k1-k7}, vm64y","vscatterqps xmm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W0 A3 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VSCATTERQPS vm64z, {k1-k7}, ymm1","VSCATTERQPS ymm1, {k1-k7}, vm64z","vscatterqps ymm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W0 A3 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VSHUFF32X4 ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFF32X4 imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshuff32x4 imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 23 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VSHUFF32X4 zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFF32X4 imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshuff32x4 imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 23 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VSHUFF64X2 ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFF64X2 imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshuff64x2 imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 23 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VSHUFF64X2 zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFF64X2 imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshuff64x2 imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 23 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VSHUFI32X4 ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFI32X4 imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshufi32x4 imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 43 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VSHUFI32X4 zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFI32X4 imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshufi32x4 imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 43 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VSHUFI64X2 ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFI64X2 imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshufi64x2 imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 43 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VSHUFI64X2 zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFI64X2 imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshufi64x2 imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 43 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VSHUFPD xmm1, xmmV, xmm2/m128, imm8u","VSHUFPD imm8u, xmm2/m128, xmmV, xmm1","vshufpd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C6 /r ib","V","V","AVX","","w,r,r,r","",""
+"VSHUFPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VSHUFPD imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vshufpd imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 C6 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VSHUFPD ymm1, ymmV, ymm2/m256, imm8u","VSHUFPD imm8u, ymm2/m256, ymmV, ymm1","vshufpd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG C6 /r ib","V","V","AVX","","w,r,r,r","",""
+"VSHUFPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFPD imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshufpd imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 C6 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VSHUFPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFPD imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshufpd imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 C6 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VSHUFPS xmm1, xmmV, xmm2/m128, imm8u","VSHUFPS imm8u, xmm2/m128, xmmV, xmm1","vshufps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG C6 /r ib","V","V","AVX","","w,r,r,r","",""
+"VSHUFPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VSHUFPS imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vshufps imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 C6 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VSHUFPS ymm1, ymmV, ymm2/m256, imm8u","VSHUFPS imm8u, ymm2/m256, ymmV, ymm1","vshufps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG C6 /r ib","V","V","AVX","","w,r,r,r","",""
+"VSHUFPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFPS imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshufps imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 C6 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VSHUFPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFPS imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshufps imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 C6 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VSQRTPD xmm1, xmm2/m128","VSQRTPD xmm2/m128, xmm1","vsqrtpd xmm2/m128, xmm1","VEX.128.66.0F.WIG 51 /r","V","V","AVX","","w,r","",""
+"VSQRTPD xmm1, {k}{z}, xmm2/m128/m64bcst","VSQRTPD xmm2/m128/m64bcst, {k}{z}, xmm1","vsqrtpd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 51 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
+"VSQRTPD ymm1, ymm2/m256","VSQRTPD ymm2/m256, ymm1","vsqrtpd ymm2/m256, ymm1","VEX.256.66.0F.WIG 51 /r","V","V","AVX","","w,r","",""
+"VSQRTPD ymm1, {k}{z}, ymm2/m256/m64bcst","VSQRTPD ymm2/m256/m64bcst, {k}{z}, ymm1","vsqrtpd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 51 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
+"VSQRTPD zmm1{er}, {k}{z}, zmm2","VSQRTPD zmm2, {k}{z}, zmm1{er}","vsqrtpd zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 51 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VSQRTPD zmm1, {k}{z}, zmm2/m512/m64bcst","VSQRTPD zmm2/m512/m64bcst, {k}{z}, zmm1","vsqrtpd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 51 /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
+"VSQRTPS xmm1, xmm2/m128","VSQRTPS xmm2/m128, xmm1","vsqrtps xmm2/m128, xmm1","VEX.128.0F.WIG 51 /r","V","V","AVX","","w,r","",""
+"VSQRTPS xmm1, {k}{z}, xmm2/m128/m32bcst","VSQRTPS xmm2/m128/m32bcst, {k}{z}, xmm1","vsqrtps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 51 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VSQRTPS ymm1, ymm2/m256","VSQRTPS ymm2/m256, ymm1","vsqrtps ymm2/m256, ymm1","VEX.256.0F.WIG 51 /r","V","V","AVX","","w,r","",""
+"VSQRTPS ymm1, {k}{z}, ymm2/m256/m32bcst","VSQRTPS ymm2/m256/m32bcst, {k}{z}, ymm1","vsqrtps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 51 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VSQRTPS zmm1{er}, {k}{z}, zmm2","VSQRTPS zmm2, {k}{z}, zmm1{er}","vsqrtps zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 51 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VSQRTPS zmm1, {k}{z}, zmm2/m512/m32bcst","VSQRTPS zmm2/m512/m32bcst, {k}{z}, zmm1","vsqrtps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 51 /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VSQRTSD xmm1{er}, {k}{z}, xmmV, xmm2","VSQRTSD xmm2, xmmV, {k}{z}, xmm1{er}","vsqrtsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 51 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSQRTSD xmm1, xmmV, xmm2/m64","VSQRTSD xmm2/m64, xmmV, xmm1","vsqrtsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 51 /r","V","V","AVX","","w,r,r","",""
+"VSQRTSD xmm1, {k}{z}, xmmV, xmm2/m64","VSQRTSD xmm2/m64, xmmV, {k}{z}, xmm1","vsqrtsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 51 /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VSQRTSS xmm1{er}, {k}{z}, xmmV, xmm2","VSQRTSS xmm2, xmmV, {k}{z}, xmm1{er}","vsqrtss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 51 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSQRTSS xmm1, xmmV, xmm2/m32","VSQRTSS xmm2/m32, xmmV, xmm1","vsqrtss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 51 /r","V","V","AVX","","w,r,r","",""
+"VSQRTSS xmm1, {k}{z}, xmmV, xmm2/m32","VSQRTSS xmm2/m32, xmmV, {k}{z}, xmm1","vsqrtss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 51 /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VSTMXCSR m32","VSTMXCSR m32","vstmxcsr m32","VEX.128.0F.WIG AE /3","V","V","AVX","modrm_memonly","w","",""
+"VSUBPD xmm1, xmmV, xmm2/m128","VSUBPD xmm2/m128, xmmV, xmm1","vsubpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
+"VSUBPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VSUBPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vsubpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 5C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VSUBPD ymm1, ymmV, ymm2/m256","VSUBPD ymm2/m256, ymmV, ymm1","vsubpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
+"VSUBPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VSUBPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vsubpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 5C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VSUBPD zmm1{er}, {k}{z}, zmmV, zmm2","VSUBPD zmm2, zmmV, {k}{z}, zmm1{er}","vsubpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 5C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSUBPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VSUBPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vsubpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 5C /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VSUBPS xmm1, xmmV, xmm2/m128","VSUBPS xmm2/m128, xmmV, xmm1","vsubps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
+"VSUBPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VSUBPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vsubps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 5C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VSUBPS ymm1, ymmV, ymm2/m256","VSUBPS ymm2/m256, ymmV, ymm1","vsubps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
+"VSUBPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VSUBPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vsubps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 5C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VSUBPS zmm1{er}, {k}{z}, zmmV, zmm2","VSUBPS zmm2, zmmV, {k}{z}, zmm1{er}","vsubps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 5C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSUBPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VSUBPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vsubps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 5C /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VSUBSD xmm1{er}, {k}{z}, xmmV, xmm2","VSUBSD xmm2, xmmV, {k}{z}, xmm1{er}","vsubsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSUBSD xmm1, xmmV, xmm2/m64","VSUBSD xmm2/m64, xmmV, xmm1","vsubsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
+"VSUBSD xmm1, {k}{z}, xmmV, xmm2/m64","VSUBSD xmm2/m64, xmmV, {k}{z}, xmm1","vsubsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5C /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VSUBSS xmm1{er}, {k}{z}, xmmV, xmm2","VSUBSS xmm2, xmmV, {k}{z}, xmm1{er}","vsubss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 5C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSUBSS xmm1, xmmV, xmm2/m32","VSUBSS xmm2/m32, xmmV, xmm1","vsubss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
+"VSUBSS xmm1, {k}{z}, xmmV, xmm2/m32","VSUBSS xmm2/m32, xmmV, {k}{z}, xmm1","vsubss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5C /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VTESTPD xmm1, xmm2/m128","VTESTPD xmm2/m128, xmm1","vtestpd xmm2/m128, xmm1","VEX.128.66.0F38.W0 0F /r","V","V","AVX","","r,r","",""
+"VTESTPD ymm1, ymm2/m256","VTESTPD ymm2/m256, ymm1","vtestpd ymm2/m256, ymm1","VEX.256.66.0F38.W0 0F /r","V","V","AVX","","r,r","",""
+"VTESTPS xmm1, xmm2/m128","VTESTPS xmm2/m128, xmm1","vtestps xmm2/m128, xmm1","VEX.128.66.0F38.W0 0E /r","V","V","AVX","","r,r","",""
+"VTESTPS ymm1, ymm2/m256","VTESTPS ymm2/m256, ymm1","vtestps ymm2/m256, ymm1","VEX.256.66.0F38.W0 0E /r","V","V","AVX","","r,r","",""
+"VUCOMISD xmm1{sae}, xmm2","VUCOMISD xmm2, xmm1{sae}","vucomisd xmm2, xmm1{sae}","EVEX.128.66.0F.W1 2E /r","V","V","AVX512F","modrm_regonly","r,r","",""
+"VUCOMISD xmm1, xmm2/m64","VUCOMISD xmm2/m64, xmm1","vucomisd xmm2/m64, xmm1","EVEX.LIG.66.0F.W1 2E /r","V","V","AVX512F","scale8","r,r","",""
+"VUCOMISD xmm1, xmm2/m64","VUCOMISD xmm2/m64, xmm1","vucomisd xmm2/m64, xmm1","VEX.LIG.66.0F.WIG 2E /r","V","V","AVX","","r,r","",""
+"VUCOMISS xmm1{sae}, xmm2","VUCOMISS xmm2, xmm1{sae}","vucomiss xmm2, xmm1{sae}","EVEX.128.0F.W0 2E /r","V","V","AVX512F","modrm_regonly","r,r","",""
+"VUCOMISS xmm1, xmm2/m32","VUCOMISS xmm2/m32, xmm1","vucomiss xmm2/m32, xmm1","EVEX.LIG.0F.W0 2E /r","V","V","AVX512F","scale4","r,r","",""
+"VUCOMISS xmm1, xmm2/m32","VUCOMISS xmm2/m32, xmm1","vucomiss xmm2/m32, xmm1","VEX.LIG.0F.WIG 2E /r","V","V","AVX","","r,r","",""
+"VUNPCKHPD xmm1, xmmV, xmm2/m128","VUNPCKHPD xmm2/m128, xmmV, xmm1","vunpckhpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 15 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKHPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VUNPCKHPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vunpckhpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VUNPCKHPD ymm1, ymmV, ymm2/m256","VUNPCKHPD ymm2/m256, ymmV, ymm1","vunpckhpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 15 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKHPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VUNPCKHPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vunpckhpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VUNPCKHPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VUNPCKHPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vunpckhpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 15 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VUNPCKHPS xmm1, xmmV, xmm2/m128","VUNPCKHPS xmm2/m128, xmmV, xmm1","vunpckhps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 15 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKHPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VUNPCKHPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vunpckhps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VUNPCKHPS ymm1, ymmV, ymm2/m256","VUNPCKHPS ymm2/m256, ymmV, ymm1","vunpckhps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 15 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKHPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VUNPCKHPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vunpckhps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VUNPCKHPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VUNPCKHPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vunpckhps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 15 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VUNPCKLPD xmm1, xmmV, xmm2/m128","VUNPCKLPD xmm2/m128, xmmV, xmm1","vunpcklpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 14 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKLPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VUNPCKLPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vunpcklpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VUNPCKLPD ymm1, ymmV, ymm2/m256","VUNPCKLPD ymm2/m256, ymmV, ymm1","vunpcklpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 14 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKLPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VUNPCKLPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vunpcklpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VUNPCKLPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VUNPCKLPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vunpcklpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 14 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VUNPCKLPS xmm1, xmmV, xmm2/m128","VUNPCKLPS xmm2/m128, xmmV, xmm1","vunpcklps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 14 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKLPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VUNPCKLPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vunpcklps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VUNPCKLPS ymm1, ymmV, ymm2/m256","VUNPCKLPS ymm2/m256, ymmV, ymm1","vunpcklps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 14 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKLPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VUNPCKLPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vunpcklps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VUNPCKLPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VUNPCKLPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vunpcklps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 14 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VXORPD xmm1, xmmV, xmm2/m128","VXORPD xmm2/m128, xmmV, xmm1","vxorpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 57 /r","V","V","AVX","","w,r,r","",""
+"VXORPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VXORPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vxorpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 57 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VXORPD ymm1, ymmV, ymm2/m256","VXORPD ymm2/m256, ymmV, ymm1","vxorpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 57 /r","V","V","AVX","","w,r,r","",""
+"VXORPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VXORPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vxorpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 57 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VXORPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VXORPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vxorpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 57 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
+"VXORPS xmm1, xmmV, xmm2/m128","VXORPS xmm2/m128, xmmV, xmm1","vxorps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 57 /r","V","V","AVX","","w,r,r","",""
+"VXORPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VXORPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vxorps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 57 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VXORPS ymm1, ymmV, ymm2/m256","VXORPS ymm2/m256, ymmV, ymm1","vxorps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 57 /r","V","V","AVX","","w,r,r","",""
+"VXORPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VXORPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vxorps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 57 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VXORPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VXORPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vxorps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 57 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
+"VZEROALL","VZEROALL","vzeroall","VEX.256.0F.WIG 77","V","V","AVX","","","",""
+"VZEROUPPER","VZEROUPPER","vzeroupper","VEX.128.0F.WIG 77","V","V","AVX","","","",""
+"WAIT","WAIT","wait","9B","V","V","","pseudo","","",""
+"WBINVD","WBINVD","wbinvd","0F 09","V","V","486","","","",""
+"WRFSBASE rmr32","WRFSBASE rmr32","wrfsbase rmr32","F3 0F AE /2","N.S.","V","FSGSBASE","modrm_regonly,operand16,operand32","r","Y","32"
+"WRFSBASE rmr64","WRFSBASE rmr64","wrfsbase rmr64","F3 REX.W 0F AE /2","N.S.","V","FSGSBASE","modrm_regonly","r","Y","64"
+"WRGSBASE rmr32","WRGSBASE rmr32","wrgsbase rmr32","F3 0F AE /3","N.S.","V","FSGSBASE","modrm_regonly,operand16,operand32","r","Y","32"
+"WRGSBASE rmr64","WRGSBASE rmr64","wrgsbase rmr64","F3 REX.W 0F AE /3","N.S.","V","FSGSBASE","modrm_regonly","r","Y","64"
+"WRMSR","WRMSR","wrmsr","0F 30","V","V","Pentium","","","",""
+"WRPKRU","WRPKRU","wrpkru","0F 01 EF","V","V","PKU","","","",""
+"WRSSD m32, r32","WRSSD r32, m32","wrssd r32, m32","0F 38 F6 /r","V","V","CET","modrm_memonly,operand16,operand32","w,r","",""
+"WRSSQ m64, r64","WRSSQ r64, m64","wrssq r64, m64","REX.W 0F 38 F6 /r","N.S.","V","CET","modrm_memonly","w,r","",""
+"WRUSSD m32, r32","WRUSSD r32, m32","wrussd r32, m32","66 0F 38 F5 /r","V","V","CET","modrm_memonly,operand16,operand32","w,r","",""
+"WRUSSQ m64, r64","WRUSSQ r64, m64","wrussq r64, m64","66 REX.W 0F 38 F5 /r","N.S.","V","CET","modrm_memonly","w,r","",""
+"XABORT imm8u","XABORT imm8u","xabort imm8u","C6 F8 ib","V","V","RTM","modrm_regonly","r","",""
+"XACQUIRE","XACQUIRE","xacquire","F2","V","V","HLE","pseudo","","",""
+"XADD r/m8, r8","XADDB r8, r/m8","xaddb r8, r/m8","0F C0 /r","V","V","486","","rw,rw","Y","8"
+"XADD r/m8, r8","XADDB r8, r/m8","xaddb r8, r/m8","REX 0F C0 /r","N.E.","V","","pseudo64","rw,w","Y","8"
+"XADD r/m32, r32","XADDL r32, r/m32","xaddl r32, r/m32","0F C1 /r","V","V","486","operand32","rw,rw","Y","32"
+"XADD r/m64, r64","XADDQ r64, r/m64","xaddq r64, r/m64","REX.W 0F C1 /r","N.S.","V","486","","rw,rw","Y","64"
+"XADD r/m16, r16","XADDW r16, r/m16","xaddw r16, r/m16","0F C1 /r","V","V","486","operand16","rw,rw","Y","16"
+"XBEGIN rel16","XBEGIN rel16","xbegin rel16","C7 F8 cw","V","V","RTM","modrm_regonly,operand16","r","",""
+"XBEGIN rel32","XBEGIN rel32","xbegin rel32","C7 F8 cd","V","V","RTM","modrm_regonly,operand32,operand64","r","",""
+"XCHG r8, r/m8","XCHGB r/m8, r8","xchgb r/m8, r8","86 /r","V","V","","pseudo","w,r","Y","8"
+"XCHG r8, r/m8","XCHGB r/m8, r8","xchgb r/m8, r8","REX 86 /r","N.E.","V","","pseudo","w,r","Y","8"
+"XCHG r/m8, r8","XCHGB r8, r/m8","xchgb r8, r/m8","86 /r","V","V","","","rw,rw","Y","8"
+"XCHG r/m8, r8","XCHGB r8, r/m8","xchgb r8, r/m8","REX 86 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"XCHG r32op, EAX","XCHGL EAX, r32op","xchgl EAX, r32op","90+rd","V","V","","operand32","rw,rw","Y","32"
+"XCHG r32, r/m32","XCHGL r/m32, r32","xchgl r/m32, r32","87 /r","V","V","","operand32,pseudo","w,r","Y","32"
+"XCHG r/m32, r32","XCHGL r32, r/m32","xchgl r32, r/m32","87 /r","V","V","","operand32","rw,rw","Y","32"
+"XCHG EAX, r32op","XCHGL r32op, EAX","xchgl r32op, EAX","90+rd","V","V","","operand32,pseudo","rw,rw","Y","32"
+"XCHG r64op, RAX","XCHGQ RAX, r64op","xchgq RAX, r64op","REX.W 90+ro","N.S.","V","","","rw,rw","Y","64"
+"XCHG r64, r/m64","XCHGQ r/m64, r64","xchgq r/m64, r64","REX.W 87 /r","N.E.","V","","pseudo","w,r","Y","64"
+"XCHG r/m64, r64","XCHGQ r64, r/m64","xchgq r64, r/m64","REX.W 87 /r","N.S.","V","","","rw,rw","Y","64"
+"XCHG RAX, r64op","XCHGQ r64op, RAX","xchgq r64op, RAX","REX.W 90+rd","N.E.","V","","pseudo","rw,rw","Y","64"
+"XCHG r16op, AX","XCHGW AX, r16op","xchgw AX, r16op","90+rw","V","V","","operand16","rw,rw","Y","16"
+"XCHG r16, r/m16","XCHGW r/m16, r16","xchgw r/m16, r16","87 /r","V","V","","operand16,pseudo","w,r","Y","16"
+"XCHG r/m16, r16","XCHGW r16, r/m16","xchgw r16, r/m16","87 /r","V","V","","operand16","rw,rw","Y","16"
+"XCHG AX, r16op","XCHGW r16op, AX","xchgw r16op, AX","90+rw","V","V","","operand16,pseudo","rw,rw","Y","16"
+"XEND","XEND","xend","0F 01 D5","V","V","RTM","","","",""
+"XGETBV","XGETBV","xgetbv","0F 01 D0","V","V","XSAVE","","","",""
+"XLATB","XLAT","xlat","D7","V","V","","","","",""
+"XLATB","XLAT","xlat","REX.W D7","N.E.","V","","pseudo","","",""
+"XOR r/m8, imm8","XORB imm8, r/m8","xorb imm8, r/m8","REX 80 /6 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"XOR AL, imm8u","XORB imm8u, AL","xorb imm8u, AL","34 ib","V","V","","","rw,r","Y","8"
+"XOR r/m8, imm8u","XORB imm8u, r/m8","xorb imm8u, r/m8","80 /6 ib","V","V","","","rw,r","Y","8"
+"XOR r/m8, imm8u","XORB imm8u, r/m8","xorb imm8u, r/m8","82 /6 ib","V","N.S.","","","rw,r","Y","8"
+"XOR r8, r/m8","XORB r/m8, r8","xorb r/m8, r8","32 /r","V","V","","","rw,r","Y","8"
+"XOR r8, r/m8","XORB r/m8, r8","xorb r/m8, r8","REX 32 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"XOR r/m8, r8","XORB r8, r/m8","xorb r8, r/m8","30 /r","V","V","","","rw,r","Y","8"
+"XOR r/m8, r8","XORB r8, r/m8","xorb r8, r/m8","REX 30 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"XOR EAX, imm32","XORL imm32, EAX","xorl imm32, EAX","35 id","V","V","","operand32","rw,r","Y","32"
+"XOR r/m32, imm32","XORL imm32, r/m32","xorl imm32, r/m32","81 /6 id","V","V","","operand32","rw,r","Y","32"
+"XOR r/m32, imm8","XORL imm8, r/m32","xorl imm8, r/m32","83 /6 ib","V","V","","operand32","rw,r","Y","32"
+"XOR r32, r/m32","XORL r/m32, r32","xorl r/m32, r32","33 /r","V","V","","operand32","rw,r","Y","32"
+"XOR r/m32, r32","XORL r32, r/m32","xorl r32, r/m32","31 /r","V","V","","operand32","rw,r","Y","32"
+"XORPD xmm1, xmm2/m128","XORPD xmm2/m128, xmm1","xorpd xmm2/m128, xmm1","66 0F 57 /r","V","V","SSE2","","rw,r","",""
+"XORPS xmm1, xmm2/m128","XORPS xmm2/m128, xmm1","xorps xmm2/m128, xmm1","0F 57 /r","V","V","SSE","","rw,r","",""
+"XOR RAX, imm32","XORQ imm32, RAX","xorq imm32, RAX","REX.W 35 id","N.S.","V","","","rw,r","Y","64"
+"XOR r/m64, imm32","XORQ imm32, r/m64","xorq imm32, r/m64","REX.W 81 /6 id","N.S.","V","","","rw,r","Y","64"
+"XOR r/m64, imm8","XORQ imm8, r/m64","xorq imm8, r/m64","REX.W 83 /6 ib","N.S.","V","","","rw,r","Y","64"
+"XOR r64, r/m64","XORQ r/m64, r64","xorq r/m64, r64","REX.W 33 /r","N.S.","V","","","rw,r","Y","64"
+"XOR r/m64, r64","XORQ r64, r/m64","xorq r64, r/m64","REX.W 31 /r","N.S.","V","","","rw,r","Y","64"
+"XOR AX, imm16","XORW imm16, AX","xorw imm16, AX","35 iw","V","V","","operand16","rw,r","Y","16"
+"XOR r/m16, imm16","XORW imm16, r/m16","xorw imm16, r/m16","81 /6 iw","V","V","","operand16","rw,r","Y","16"
+"XOR r/m16, imm8","XORW imm8, r/m16","xorw imm8, r/m16","83 /6 ib","V","V","","operand16","rw,r","Y","16"
+"XOR r16, r/m16","XORW r/m16, r16","xorw r/m16, r16","33 /r","V","V","","operand16","rw,r","Y","16"
+"XOR r/m16, r16","XORW r16, r/m16","xorw r16, r/m16","31 /r","V","V","","operand16","rw,r","Y","16"
+"XRELEASE","XRELEASE","xrelease","F3","V","V","HLE","pseudo","","",""
+"XRSTOR mem","XRSTOR mem","xrstor mem","0F AE /5","V","V","XSAVE","modrm_memonly,operand16,operand32","r","",""
+"XRSTOR64 mem","XRSTOR64 mem","xrstor64 mem","REX.W 0F AE /5","N.S.","V","XSAVE","modrm_memonly","r","",""
+"XRSTORS mem","XRSTORS mem","xrstors mem","0F C7 /3","V","V","XSAVES","modrm_memonly,operand16,operand32","r","",""
+"XRSTORS64 mem","XRSTORS64 mem","xrstors64 mem","REX.W 0F C7 /3","N.S.","V","XSAVES","modrm_memonly","r","",""
+"XSAVE mem","XSAVE mem","xsave mem","0F AE /4","V","V","XSAVE","modrm_memonly,operand16,operand32","w","",""
+"XSAVE64 mem","XSAVE64 mem","xsave64 mem","REX.W 0F AE /4","N.S.","V","XSAVE","modrm_memonly","w","",""
+"XSAVEC mem","XSAVEC mem","xsavec mem","0F C7 /4","V","V","XSAVEC","modrm_memonly,operand16,operand32","w","",""
+"XSAVEC64 mem","XSAVEC64 mem","xsavec64 mem","REX.W 0F C7 /4","N.S.","V","XSAVEC","modrm_memonly","w","",""
+"XSAVEOPT mem","XSAVEOPT mem","xsaveopt mem","0F AE /6","V","V","XSAVEOPT","modrm_memonly,operand16,operand32","w","",""
+"XSAVEOPT64 mem","XSAVEOPT64 mem","xsaveopt64 mem","REX.W 0F AE /6","N.S.","V","XSAVEOPT","modrm_memonly","w","",""
+"XSAVES mem","XSAVES mem","xsaves mem","0F C7 /5","V","V","XSAVES","modrm_memonly,operand16,operand32","w","",""
+"XSAVES64 mem","XSAVES64 mem","xsaves64 mem","REX.W 0F C7 /5","N.S.","V","XSAVES","modrm_memonly","w","",""
+"XSETBV","XSETBV","xsetbv","0F 01 D1","V","V","XSAVE","","","",""
+"XTEST","XTEST","xtest","0F 01 D6","V","V","HLE or RTM","","","",""
-- 
2.35.2




* Re: [PATCH 2/4] TCG support for AVX
  2022-04-18 17:39 ` [PATCH 2/4] TCG support for AVX Paul Brook
@ 2022-04-18 19:33   ` Peter Maydell
  2022-04-18 19:45     ` Paul Brook
  0 siblings, 1 reply; 67+ messages in thread
From: Peter Maydell @ 2022-04-18 19:33 UTC (permalink / raw)
  To: Paul Brook; +Cc: Eduardo Habkost, Paolo Bonzini, Richard Henderson, qemu-devel

On Mon, 18 Apr 2022 at 18:48, Paul Brook <paul@nowt.org> wrote:
>
> Add TCG translation of guest AVX/AVX2 instructions
> This comprises:
>
> * VEX encodings of most (all?) "legacy" SSE operations.
>   These typically add an extra source operand, and clear the unused half
>   of the destination register (SSE encodings leave this unchanged)
>   Previously we were incorrectly translating VEX encoded instructions
>   as if they were legacy SSE encodings.
> * 256-bit variants of many instructions. AVX adds floating point
>   operations. AVX2 adds integer operations.
> * A few new instructions (VBROADCAST, VGATHER, VZERO)
>
> Signed-off-by: Paul Brook <paul@nowt.org>
> ---
>  target/i386/cpu.c            |    8 +-
>  target/i386/helper.h         |    2 +
>  target/i386/ops_sse.h        | 2606 ++++++++++++++++++++++++----------
>  target/i386/ops_sse_header.h |  364 +++--
>  target/i386/tcg/fpu_helper.c |    3 +
>  target/i386/tcg/translate.c  | 1902 +++++++++++++++++++------
>  6 files changed, 3597 insertions(+), 1288 deletions(-)
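As a rough illustration of the first bullet in the quoted commit message (VEX encodings take an extra source operand and clear the unused half of the destination, while legacy SSE encodings leave it unchanged), here is a minimal sketch. It models a 256-bit register as a Python integer; the function names are hypothetical and are not taken from the patch or from QEMU.

```python
# Hypothetical model: a ymm register is a 256-bit Python int.
MASK128 = (1 << 128) - 1
MASK256 = (1 << 256) - 1


def legacy_sse_xor(dst, src):
    """Two-operand legacy SSE form: dst is also a source, and
    bits 255:128 of dst are left unchanged."""
    low = (dst ^ src) & MASK128
    upper = dst & MASK256 & ~MASK128
    return upper | low


def vex128_xor(src1, src2):
    """Three-operand VEX.128 form: the result comes from two separate
    sources, and bits 255:128 of the destination are cleared."""
    return (src1 ^ src2) & MASK128


old = (0xDEAD << 128) | 0x0F
print(hex(legacy_sse_xor(old, 0xFF)))  # upper half kept
print(hex(vex128_xor(old, 0xFF)))      # upper half cleared
```

Translating a VEX-encoded instruction as if it were the legacy form would give the first result where the second is expected, which is the mistranslation the commit message describes.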

Massively too large for a single patch, I'm afraid. This needs
to be split, probably into at least twenty patches, which each
are a reviewable chunk of code that does one coherent thing.

(Also I think Paolo may have been looking at AVX implementation?)

thanks
-- PMM



* Re: [PATCH 2/4] TCG support for AVX
  2022-04-18 19:33   ` Peter Maydell
@ 2022-04-18 19:45     ` Paul Brook
  2022-04-18 19:50       ` Peter Maydell
                         ` (2 more replies)
  0 siblings, 3 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-18 19:45 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Eduardo Habkost, Paolo Bonzini, Richard Henderson, qemu-devel

On Mon, 2022-04-18 at 20:33 +0100, Peter Maydell wrote:
> On Mon, 18 Apr 2022 at 18:48, Paul Brook <paul@nowt.org> wrote:
> > 
> > Add TCG translation of guest AVX/AVX2 instructions
> > This comprises:
> > 
> 
> Massively too large for a single patch, I'm afraid. This needs
> to be split, probably into at least twenty patches, which each
> are a reviewable chunk of code that does one coherent thing.

Hmm, I'll see what I can do.

Unfortunately the table driven decoding means that going from two to
three operands tends to be a bit all or nothing just to get the thing
to compile.

Paul



* Re: [PATCH 2/4] TCG support for AVX
  2022-04-18 19:45     ` Paul Brook
@ 2022-04-18 19:50       ` Peter Maydell
  2022-04-18 23:14       ` Richard Henderson
  2022-04-20 14:19       ` Paolo Bonzini
  2 siblings, 0 replies; 67+ messages in thread
From: Peter Maydell @ 2022-04-18 19:50 UTC (permalink / raw)
  To: Paul Brook; +Cc: Eduardo Habkost, Paolo Bonzini, Richard Henderson, qemu-devel

On Mon, 18 Apr 2022 at 20:45, Paul Brook <paul@nowt.org> wrote:
>
> On Mon, 2022-04-18 at 20:33 +0100, Peter Maydell wrote:
> > On Mon, 18 Apr 2022 at 18:48, Paul Brook <paul@nowt.org> wrote:
> > >
> > > Add TCG translation of guest AVX/AVX2 instructions
> > > This comprises:
> > >
> >
> > Massively too large for a single patch, I'm afraid. This needs
> > to be split, probably into at least twenty patches, which each
> > are a reviewable chunk of code that does one coherent thing.
>
> > Hmm, I'll see what I can do.

Do check with Paolo to make sure we're not accidentally
duplicating work before putting too much time into the split...

thanks
-- PMM



* Re: [PATCH 2/4] TCG support for AVX
  2022-04-18 19:45     ` Paul Brook
  2022-04-18 19:50       ` Peter Maydell
@ 2022-04-18 23:14       ` Richard Henderson
  2022-04-20 14:19       ` Paolo Bonzini
  2 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2022-04-18 23:14 UTC (permalink / raw)
  To: Paul Brook, Peter Maydell; +Cc: Eduardo Habkost, Paolo Bonzini, qemu-devel

On 4/18/22 12:45, Paul Brook wrote:
> Unfortunately the table driven decoding means that going from two to
> three operands tends to be a bit all or nothing just to get the thing
> to compile.

Yes, gen_sse is awful.  Which is why the previous attempt at AVX2 rewrote the decoder:

https://lore.kernel.org/qemu-devel/20190821172951.15333-1-jan.bobek@gmail.com/

Anyway, do coordinate with Paolo on this.


r~



* Re: [PATCH 4/4] AVX tests
  2022-04-18 17:39 ` [PATCH 4/4] AVX tests Paul Brook
@ 2022-04-19 10:34   ` Alex Bennée
  0 siblings, 0 replies; 67+ messages in thread
From: Alex Bennée @ 2022-04-19 10:34 UTC (permalink / raw)
  To: Paul Brook; +Cc: Eduardo Habkost, Paolo Bonzini, Richard Henderson, qemu-devel


Paul Brook <paul@nowt.org> writes:

> Tests for correct operation of most x86-64 SSE and AVX instructions.
> It should cover all combinations of overlapping register and memory
> operands on a set of random-ish data.
>
> Results are bit-identical to an Intel i5-8500, with the exception of
> the RCPSS and RSQRT approximations where the real CPU gives less accurate
> results (the Intel spec allows relative errors up to 1.5 * 2^-12)
>
> Signed-off-by: Paul Brook <paul@nowt.org>

Acked-by: Alex Bennée <alex.bennee@linaro.org>

BTW, while I think the directed tests are excellent, you could also
enable an AVX build of the sha512 test, which runs through a sha512
test suite and should also shake out any bugs.
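For anyone comparing outputs by hand: the relative-error bound of 1.5 * 2^-12 quoted above for the reciprocal approximations can be checked with something like the sketch below. This is only an illustration of the tolerance arithmetic; the helper name is made up and nothing like it exists in the patch, which compares text output instead.

```python
# Bound quoted in the commit message: relative error up to 1.5 * 2^-12.
REL_TOL = 1.5 * 2.0 ** -12


def within_rcp_tolerance(x, approx):
    """Return True if approx is an acceptable approximation of 1/x
    under the relative-error bound above (hypothetical helper)."""
    exact = 1.0 / x
    return abs(approx - exact) <= REL_TOL * abs(exact)


# An exact reciprocal is always acceptable; a result that is off by a
# relative 2^-10 is outside the bound.
print(within_rcp_tolerance(4.0, 0.25))
print(within_rcp_tolerance(4.0, 0.25 * (1 + 2.0 ** -10)))
```

An emulator that returns the exactly rounded reciprocal stays inside this bound, so it can legitimately differ bit-for-bit from real hardware on these instructions.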

> ---
>  tests/tcg/i386/Makefile.target |   10 +-
>  tests/tcg/i386/README          |    9 +
>  tests/tcg/i386/test-avx.c      |  347 +++
>  tests/tcg/i386/test-avx.py     |  352 +++
>  tests/tcg/i386/x86.csv         | 4658 ++++++++++++++++++++++++++++++++
>  5 files changed, 5374 insertions(+), 2 deletions(-)
>  create mode 100644 tests/tcg/i386/test-avx.c
>  create mode 100755 tests/tcg/i386/test-avx.py
>  create mode 100644 tests/tcg/i386/x86.csv
>
> diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
> index e1c0310be6..f1c3275e2e 100644
> --- a/tests/tcg/i386/Makefile.target
> +++ b/tests/tcg/i386/Makefile.target
> @@ -7,8 +7,8 @@ VPATH 		+= $(I386_SRC)
>  
>  I386_SRCS=$(notdir $(wildcard $(I386_SRC)/*.c))
>  ALL_X86_TESTS=$(I386_SRCS:.c=)
> -SKIP_I386_TESTS=test-i386-ssse3
> -X86_64_TESTS:=$(filter test-i386-ssse3, $(ALL_X86_TESTS))
> +SKIP_I386_TESTS=test-i386-ssse3 test-avx
> +X86_64_TESTS:=$(filter test-i386-ssse3 test-avx, $(ALL_X86_TESTS))
>  
>  test-i386-sse-exceptions: CFLAGS += -msse4.1 -mfpmath=sse
>  run-test-i386-sse-exceptions: QEMU_OPTS += -cpu max
> @@ -80,3 +80,9 @@ run-sha512-sse: QEMU_OPTS+=-cpu max
>  run-plugin-sha512-sse-with-%: QEMU_OPTS+=-cpu max
>  
>  TESTS+=sha512-sse
> +
> +test-avx.h: test-avx.py x86.csv
> +	$(PYTHON) $(I386_SRC)/test-avx.py $(I386_SRC)/x86.csv $@
> +
> +test-avx: CFLAGS += -mavx -masm=intel -O -I.
> +test-avx: test-avx.h
> diff --git a/tests/tcg/i386/README b/tests/tcg/i386/README
> index 09e88f30dc..403d10dad8 100644
> --- a/tests/tcg/i386/README
> +++ b/tests/tcg/i386/README
> @@ -15,6 +15,15 @@ The Linux system call vm86() is used to test vm86 emulation.
>  Various exceptions are raised to test most of the x86 user space
>  exception reporting.
>  
> +test-avx
> +--------
> +
> +This program executes most SSE/AVX instructions and generates a text output,
> +for comparison with the output obtained with a real CPU or another emulator.
> +
> +test-avx.h is generated from x86.csv by test-avx.py
> +x86.csv comes from https://github.com/quasilyte/avx512test
> +
>  linux-test
>  ----------
>  
> diff --git a/tests/tcg/i386/test-avx.c b/tests/tcg/i386/test-avx.c
> new file mode 100644
> index 0000000000..953e2906fe
> --- /dev/null
> +++ b/tests/tcg/i386/test-avx.c
> @@ -0,0 +1,347 @@
> +#include <stdio.h>
> +#include <stdint.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +typedef void (*testfn)(void);
> +
> +typedef struct {
> +    uint64_t q0, q1, q2, q3;
> +} __attribute__((aligned(32))) v4di;
> +
> +typedef struct {
> +    uint64_t mm[8];
> +    v4di ymm[16];
> +    uint64_t r[16];
> +    uint64_t flags;
> +    uint32_t ff;
> +    uint64_t pad;
> +    v4di mem[4];
> +    v4di mem0[4];
> +} reg_state;
> +
> +typedef struct {
> +    int n;
> +    testfn fn;
> +    const char *s;
> +    reg_state *init;
> +} TestDef;
> +
> +reg_state initI;
> +reg_state initF32;
> +reg_state initF64;
> +
> +static void dump_ymm(const char *name, int n, const v4di *r, int ff)
> +{
> +    printf("%s%d = %016lx %016lx %016lx %016lx\n",
> +           name, n, r->q3, r->q2, r->q1, r->q0);
> +    if (ff == 64) {
> +        double v[4];
> +        memcpy(v, r, sizeof(v));
> +        printf("        %16g %16g %16g %16g\n",
> +                v[3], v[2], v[1], v[0]);
> +    } else if (ff == 32) {
> +        float v[8];
> +        memcpy(v, r, sizeof(v));
> +        printf(" %8g %8g %8g %8g %8g %8g %8g %8g\n",
> +                v[7], v[6], v[5], v[4], v[3], v[2], v[1], v[0]);
> +    }
> +}
> +
> +static void dump_regs(reg_state *s)
> +{
> +    int i;
> +
> +    for (i = 0; i < 16; i++) {
> +        dump_ymm("ymm", i, &s->ymm[i], 0);
> +    }
> +    for (i = 0; i < 4; i++) {
> +        dump_ymm("mem", i, &s->mem0[i], 0);
> +    }
> +}
> +
> +static void compare_state(const reg_state *a, const reg_state *b)
> +{
> +    int i;
> +    for (i = 0; i < 8; i++) {
> +        if (a->mm[i] != b->mm[i]) {
> +            printf("MM%d = %016lx\n", i, b->mm[i]);
> +        }
> +    }
> +    for (i = 0; i < 16; i++) {
> +        if (a->r[i] != b->r[i]) {
> +            printf("r%d = %016lx\n", i, b->r[i]);
> +        }
> +    }
> +    for (i = 0; i < 16; i++) {
> +        if (memcmp(&a->ymm[i], &b->ymm[i], 32)) {
> +            dump_ymm("ymm", i, &b->ymm[i], a->ff);
> +        }
> +    }
> +    for (i = 0; i < 4; i++) {
> +        if (memcmp(&a->mem0[i], &a->mem[i], 32)) {
> +            dump_ymm("mem", i, &a->mem[i], a->ff);
> +        }
> +    }
> +    if (a->flags != b->flags) {
> +        printf("FLAGS = %016lx\n", b->flags);
> +    }
> +}
> +
> +#define LOADMM(r, o) "movq " #r ", " #o "[%0]\n\t"
> +#define LOADYMM(r, o) "vmovdqa " #r ", " #o "[%0]\n\t"
> +#define STOREMM(r, o) "movq " #o "[%1], " #r "\n\t"
> +#define STOREYMM(r, o) "vmovdqa " #o "[%1], " #r "\n\t"
> +#define MMREG(F) \
> +    F(mm0, 0x00) \
> +    F(mm1, 0x08) \
> +    F(mm2, 0x10) \
> +    F(mm3, 0x18) \
> +    F(mm4, 0x20) \
> +    F(mm5, 0x28) \
> +    F(mm6, 0x30) \
> +    F(mm7, 0x38)
> +#define YMMREG(F) \
> +    F(ymm0, 0x040) \
> +    F(ymm1, 0x060) \
> +    F(ymm2, 0x080) \
> +    F(ymm3, 0x0a0) \
> +    F(ymm4, 0x0c0) \
> +    F(ymm5, 0x0e0) \
> +    F(ymm6, 0x100) \
> +    F(ymm7, 0x120) \
> +    F(ymm8, 0x140) \
> +    F(ymm9, 0x160) \
> +    F(ymm10, 0x180) \
> +    F(ymm11, 0x1a0) \
> +    F(ymm12, 0x1c0) \
> +    F(ymm13, 0x1e0) \
> +    F(ymm14, 0x200) \
> +    F(ymm15, 0x220)
> +#define LOADREG(r, o) "mov " #r ", " #o "[rax]\n\t"
> +#define STOREREG(r, o) "mov " #o "[rax], " #r "\n\t"
> +#define REG(F) \
> +    F(rbx, 0x248) \
> +    F(rcx, 0x250) \
> +    F(rdx, 0x258) \
> +    F(rsi, 0x260) \
> +    F(rdi, 0x268) \
> +    F(r8, 0x280) \
> +    F(r9, 0x288) \
> +    F(r10, 0x290) \
> +    F(r11, 0x298) \
> +    F(r12, 0x2a0) \
> +    F(r13, 0x2a8) \
> +    F(r14, 0x2b0) \
> +    F(r15, 0x2b8) \
> +
> +static void run_test(const TestDef *t)
> +{
> +    reg_state result;
> +    reg_state *init = t->init;
> +    memcpy(init->mem, init->mem0, sizeof(init->mem));
> +    printf("%5d %s\n", t->n, t->s);
> +    asm volatile(
> +            MMREG(LOADMM)
> +            YMMREG(LOADYMM)
> +            "sub rsp, 128\n\t"
> +            "push rax\n\t"
> +            "push rbx\n\t"
> +            "push rcx\n\t"
> +            "push rdx\n\t"
> +            "push %1\n\t"
> +            "push %2\n\t"
> +            "mov rax, %0\n\t"
> +            "pushf\n\t"
> +            "pop rbx\n\t"
> +            "shr rbx, 8\n\t"
> +            "shl rbx, 8\n\t"
> +            "mov rcx, 0x2c0[rax]\n\t"
> +            "and rcx, 0xff\n\t"
> +            "or rbx, rcx\n\t"
> +            "push rbx\n\t"
> +            "popf\n\t"
> +            REG(LOADREG)
> +            "mov rax, 0x240[rax]\n\t"
> +            "call [rsp]\n\t"
> +            "mov [rsp], rax\n\t"
> +            "mov rax, 8[rsp]\n\t"
> +            REG(STOREREG)
> +            "mov rbx, [rsp]\n\t"
> +            "mov 0x240[rax], rbx\n\t"
> +            "mov rbx, 0\n\t"
> +            "mov 0x270[rax], rbx\n\t"
> +            "mov 0x278[rax], rbx\n\t"
> +            "pushf\n\t"
> +            "pop rbx\n\t"
> +            "and rbx, 0xff\n\t"
> +            "mov 0x2c0[rax], rbx\n\t"
> +            "add rsp, 16\n\t"
> +            "pop rdx\n\t"
> +            "pop rcx\n\t"
> +            "pop rbx\n\t"
> +            "pop rax\n\t"
> +            "add rsp, 128\n\t"
> +            MMREG(STOREMM)
> +            YMMREG(STOREYMM)
> +            : : "r"(init), "r"(&result), "r"(t->fn)
> +            : "memory", "cc",
> +            "rsi", "rdi",
> +            "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
> +            "mm0", "mm1", "mm2", "mm3", "mm4", "mm5", "mm6", "mm7",
> +            "ymm0", "ymm1", "ymm2", "ymm3", "ymm4", "ymm5",
> +            "ymm6", "ymm7", "ymm8", "ymm9", "ymm10", "ymm11",
> +            "ymm12", "ymm13", "ymm14", "ymm15"
> +            );
> +    compare_state(init, &result);
> +}
> +
> +#define TEST(n, cmd, type) \
> +static void __attribute__((naked)) test_##n(void) \
> +{ \
> +    asm volatile(cmd); \
> +    asm volatile("ret"); \
> +}
> +#include "test-avx.h"
> +
> +
> +static const TestDef test_table[] = {
> +#define TEST(n, cmd, type) {n, test_##n, cmd, &init##type},
> +#include "test-avx.h"
> +    {-1, NULL, "", NULL}
> +};
> +
> +static void run_all(void)
> +{
> +    const TestDef *t;
> +    for (t = test_table; t->fn; t++) {
> +        run_test(t);
> +    }
> +}
> +
> +#define ARRAY_LEN(x) (sizeof(x) / sizeof(x[0]))
> +
> +float val_f32[] = {2.0, -1.0, 4.8, 0.8, 3, -42.0, 5e6, 7.5, 8.3};
> +double val_f64[] = {2.0, -1.0, 4.8, 0.8, 3, -42.0, 5e6, 7.5};
> +v4di val_i64[] = {
> +    {0x3d6b3b6a9e4118f2lu, 0x355ae76d2774d78clu,
> +     0xac3ff76c4daa4b28lu, 0xe7fabd204cb54083lu},
> +    {0xd851c54a56bf1f29lu, 0x4a84d1d50bf4c4fflu,
> +     0x56621e553d52b56clu, 0xd0069553da8f584alu},
> +    {0x5826475e2c5fd799lu, 0xfd32edc01243f5e9lu,
> +     0x738ba2c66d3fe126lu, 0x5707219c6e6c26b4lu},
> +};
> +
> +v4di deadbeef = {0xa5a5a5a5deadbeefull, 0xa5a5a5a5deadbeefull,
> +                 0xa5a5a5a5deadbeefull, 0xa5a5a5a5deadbeefull};
> +v4di indexq = {0x000000000000001full, 0x000000000000008full,
> +               0xffffffffffffffffull, 0xffffffffffffff5full};
> +v4di indexd = {0x00000002000000efull, 0xfffffff500000010ull,
> +               0x0000000afffffff0ull, 0x000000000000000eull};
> +
> +v4di gather_mem[0x20];
> +
> +void init_f32reg(v4di *r)
> +{
> +    static int n;
> +    float v[8];
> +    int i;
> +    for (i = 0; i < 8; i++) {
> +        v[i] = val_f32[n++];
> +        if (n == ARRAY_LEN(val_f32)) {
> +            n = 0;
> +        }
> +    }
> +    memcpy(r, v, sizeof(*r));
> +}
> +
> +void init_f64reg(v4di *r)
> +{
> +    static int n;
> +    double v[4];
> +    int i;
> +    for (i = 0; i < 4; i++) {
> +        v[i] = val_f64[n++];
> +        if (n == ARRAY_LEN(val_f64)) {
> +            n = 0;
> +        }
> +    }
> +    memcpy(r, v, sizeof(*r));
> +}
> +
> +void init_intreg(v4di *r)
> +{
> +    static uint64_t mask;
> +    static int n;
> +
> +    r->q0 = val_i64[n].q0 ^ mask;
> +    r->q1 = val_i64[n].q1 ^ mask;
> +    r->q2 = val_i64[n].q2 ^ mask;
> +    r->q3 = val_i64[n].q3 ^ mask;
> +    n++;
> +    if (n == ARRAY_LEN(val_i64)) {
> +        n = 0;
> +        mask *= 0x104C11DB7;
> +    }
> +}
> +
> +static void init_all(reg_state *s)
> +{
> +    int i;
> +
> +    s->r[3] = (uint64_t)&s->mem[0]; /* rdx */
> +    s->r[4] = (uint64_t)&gather_mem[ARRAY_LEN(gather_mem) / 2]; /* rsi */
> +    s->r[5] = (uint64_t)&s->mem[2]; /* rdi */
> +    s->flags = 2;
> +    for (i = 0; i < 16; i++) {
> +        s->ymm[i] = deadbeef;
> +    }
> +    s->ymm[13] = indexd;
> +    s->ymm[14] = indexq;
> +    for (i = 0; i < 4; i++) {
> +        s->mem0[i] = deadbeef;
> +    }
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +    int i;
> +
> +    init_all(&initI);
> +    init_intreg(&initI.ymm[10]);
> +    init_intreg(&initI.ymm[11]);
> +    init_intreg(&initI.ymm[12]);
> +    init_intreg(&initI.mem0[1]);
> +    printf("Int:\n");
> +    dump_regs(&initI);
> +
> +    init_all(&initF32);
> +    init_f32reg(&initF32.ymm[10]);
> +    init_f32reg(&initF32.ymm[11]);
> +    init_f32reg(&initF32.ymm[12]);
> +    init_f32reg(&initF32.mem0[1]);
> +    initF32.ff = 32;
> +    printf("F32:\n");
> +    dump_regs(&initF32);
> +
> +    init_all(&initF64);
> +    init_f64reg(&initF64.ymm[10]);
> +    init_f64reg(&initF64.ymm[11]);
> +    init_f64reg(&initF64.ymm[12]);
> +    init_f64reg(&initF64.mem0[1]);
> +    initF64.ff = 64;
> +    printf("F64:\n");
> +    dump_regs(&initF64);
> +
> +    for (i = 0; i < ARRAY_LEN(gather_mem); i++) {
> +        init_intreg(&gather_mem[i]);
> +    }
> +
> +    if (argc > 1) {
> +        int n = atoi(argv[1]);
> +        run_test(&test_table[n]);
> +    } else {
> +        run_all();
> +    }
> +    return 0;
> +}
> diff --git a/tests/tcg/i386/test-avx.py b/tests/tcg/i386/test-avx.py
> new file mode 100755
> index 0000000000..0b2d799c5c
> --- /dev/null
> +++ b/tests/tcg/i386/test-avx.py
> @@ -0,0 +1,352 @@
> +#! /usr/bin/env python3
> +
> +# Generate test-avx.h from x86.csv
> +
> +import csv
> +import sys
> +from fnmatch import fnmatch
> +
> +archs = [
> +    # TODO: MMX?
> +    "SSE", "SSE2", "SSE3", "SSSE3", "SSE4_1", "SSE4_2",
> +    "AVX", "AVX2", "AES+AVX", # "VAES+AVX",
> +]
> +
> +ignore = set(["FISTTP",
> +    "LDMXCSR", "VLDMXCSR", "STMXCSR", "VSTMXCSR"])
> +
> +imask = {
> +    'vBLENDPD': 0xff,
> +    'vBLENDPS': 0x0f,
> +    'CMP[PS][SD]': 0x07,
> +    'VCMP[PS][SD]': 0x1f,
> +    'vDPPD': 0x33,
> +    'vDPPS': 0xff,
> +    'vEXTRACTPS': 0x03,
> +    'vINSERTPS': 0xff,
> +    'MPSADBW': 0x7,
> +    'VMPSADBW': 0x3f,
> +    'vPALIGNR': 0x3f,
> +    'vPBLENDW': 0xff,
> +    'vPCMP[EI]STR*': 0x0f,
> +    'vPEXTRB': 0x0f,
> +    'vPEXTRW': 0x07,
> +    'vPEXTRD': 0x03,
> +    'vPEXTRQ': 0x01,
> +    'vPINSRB': 0x0f,
> +    'vPINSRW': 0x07,
> +    'vPINSRD': 0x03,
> +    'vPINSRQ': 0x01,
> +    'vPSHUF[DW]': 0xff,
> +    'vPSHUF[LH]W': 0xff,
> +    'vPS[LR][AL][WDQ]': 0x3f,
> +    'vPS[RL]LDQ': 0x1f,
> +    'vROUND[PS][SD]': 0x7,
> +    'vSHUFPD': 0x0f,
> +    'vSHUFPS': 0xff,
> +    'vAESKEYGENASSIST': 0,
> +    'VEXTRACT[FI]128': 0x01,
> +    'VINSERT[FI]128': 0x01,
> +    'VPBLENDD': 0xff,
> +    'VPERM2[FI]128': 0x33,
> +    'VPERMPD': 0xff,
> +    'VPERMQ': 0xff,
> +    'VPERMILPS': 0xff,
> +    'VPERMILPD': 0x0f,
> +    }
> +
> +def strip_comments(x):
> +    for l in x:
> +        if l != '' and l[0] != '#':
> +            yield l
> +
> +def reg_w(w):
> +    if w == 8:
> +        return 'al'
> +    elif w == 16:
> +        return 'ax'
> +    elif w == 32:
> +        return 'eax'
> +    elif w == 64:
> +        return 'rax'
> +    raise Exception("bad reg_w %d" % w)
> +
> +def mem_w(w):
> +    if w == 8:
> +        t = "BYTE"
> +    elif w == 16:
> +        t = "WORD"
> +    elif w == 32:
> +        t = "DWORD"
> +    elif w == 64:
> +        t = "QWORD"
> +    elif w == 128:
> +        t = "XMMWORD"
> +    elif w == 256:
> +        t = "YMMWORD"
> +    else:
> +        raise Exception()
> +
> +    return t + " PTR 32[rdx]"
> +
> +class XMMArg():
> +    isxmm = True
> +    def __init__(self, reg, mw):
> +        if mw not in [0, 8, 16, 32, 64, 128, 256]:
> +            raise Exception("Bad /m width: %s" % mw)
> +        self.reg = reg
> +        self.mw = mw
> +        self.ismem = mw != 0
> +    def regstr(self, n):
> +        if n < 0:
> +            return mem_w(self.mw)
> +        else:
> +            return "%smm%d" % (self.reg, n)
> +
> +class MMArg():
> +    isxmm = True
> +    ismem = False # TODO
> +    def regstr(self, n):
> +        return "mm%d" % (n & 7)
> +
> +def match(op, pattern):
> +    if pattern[0] == 'v':
> +        return fnmatch(op, pattern[1:]) or fnmatch(op, 'V'+pattern[1:])
> +    return fnmatch(op, pattern)
> +
> +class ArgVSIB():
> +    isxmm = True
> +    ismem = False
> +    def __init__(self, reg, w):
> +        if w not in [32, 64]:
> +            raise Exception("Bad vsib width: %s" % w)
> +        self.w = w
> +        self.reg = reg
> +    def regstr(self, n):
> +        reg = "%smm%d" % (self.reg, n >> 2)
> +        return "[rsi + %s * %d]" % (reg, 1 << (n & 3))
> +
> +class ArgImm8u():
> +    isxmm = False
> +    ismem = False
> +    def __init__(self, op):
> +        for k, v in imask.items():
> +            if match(op, k):
> +                self.mask = v
> +                return
> +        raise Exception("Unknown immediate")
> +    def vals(self):
> +        mask = self.mask
> +        yield 0
> +        n = 0
> +        while n != mask:
> +            n += 1
> +            while (n & ~mask) != 0:
> +                n += (n & ~mask)
> +            yield n
> +
> +class ArgRM():
> +    isxmm = False
> +    def __init__(self, rw, mw):
> +        if rw not in [8, 16, 32, 64]:
> +            raise Exception("Bad r/w width: %s" % rw)
> +        if mw not in [0, 8, 16, 32, 64]:
> +            raise Exception("Bad /m width: %s" % mw)
> +        self.rw = rw
> +        self.mw = mw
> +        self.ismem = mw != 0
> +    def regstr(self, n):
> +        if n < 0:
> +            return mem_w(self.mw)
> +        else:
> +            return reg_w(self.rw)
> +
> +class ArgMem():
> +    isxmm = False
> +    ismem = True
> +    def __init__(self, w):
> +        if w not in [8, 16, 32, 64, 128, 256]:
> +            raise Exception("Bad mem width: %s" % w)
> +        self.w = w
> +    def regstr(self, n):
> +        return mem_w(self.w)
> +
> +def ArgGenerator(arg, op):
> +    if arg[:3] == 'xmm' or arg[:3] == "ymm":
> +        if "/" in arg:
> +            r, m = arg.split('/')
> +            if (m[0] != 'm'):
> +                raise Exception("Expected /m: %s" % arg)
> +            return XMMArg(arg[0], int(m[1:]))
> +        else:
> +            return XMMArg(arg[0], 0)
> +    elif arg[:2] == 'mm':
> +        return MMArg()
> +    elif arg[:4] == 'imm8':
> +        return ArgImm8u(op)
> +    elif arg == '<XMM0>':
> +        return None
> +    elif arg[0] == 'r':
> +        if '/m' in arg:
> +            r, m = arg.split('/')
> +            if (m[0] != 'm'):
> +                raise Exception("Expected /m: %s" % arg)
> +            mw = int(m[1:])
> +            if r == 'r':
> +                rw = mw
> +            else:
> +                rw = int(r[1:])
> +            return ArgRM(rw, mw)
> +
> +        return ArgRM(int(arg[1:]), 0);
> +    elif arg[0] == 'm':
> +        return ArgMem(int(arg[1:]))
> +    elif arg[:2] == 'vm':
> +        return ArgVSIB(arg[-1], int(arg[2:-1]))
> +    else:
> +        raise Exception("Unrecognised arg: %s" % arg)
> +
> +class InsnGenerator:
> +    def __init__(self, op, args):
> +        self.op = op
> +        if op[-2:] in ["PS", "PD", "SS", "SD"]:
> +            if op[-1] == 'S':
> +                self.optype = 'F32'
> +            else:
> +                self.optype = 'F64'
> +        else:
> +            self.optype = 'I'
> +
> +        try:
> +            self.args = list(ArgGenerator(a, op) for a in args)
> +            if len(self.args) > 0 and self.args[-1] is None:
> +                self.args = self.args[:-1]
> +        except Exception as e:
> +            raise Exception("Bad arg %s: %s" % (op, e))
> +
> +    def gen(self):
> +        regs = (10, 11, 12)
> +        dest = 9
> +
> +        nreg = len(self.args)
> +        if nreg == 0:
> +            yield self.op
> +            return
> +        if isinstance(self.args[-1], ArgImm8u):
> +            nreg -= 1
> +            immarg = self.args[-1]
> +        else:
> +            immarg = None
> +        memarg = -1
> +        for n, arg in enumerate(self.args):
> +            if arg.ismem:
> +                memarg = n
> +
> +        if (self.op.startswith("VGATHER") or self.op.startswith("VPGATHER")):
> +            if "GATHERD" in self.op:
> +                ireg = 13 << 2
> +            else:
> +                ireg = 14 << 2
> +            regset = [
> +                (dest, ireg | 0, regs[0]),
> +                (dest, ireg | 1, regs[0]),
> +                (dest, ireg | 2, regs[0]),
> +                (dest, ireg | 3, regs[0]),
> +                ]
> +            if memarg >= 0:
> +                raise Exception("vsib with memory: %s" % self.op)
> +        elif nreg == 1:
> +            regset = [(regs[0],)]
> +            if memarg == 0:
> +                regset += [(-1,)]
> +        elif nreg == 2:
> +            regset = [
> +                (regs[0], regs[1]),
> +                (regs[0], regs[0]),
> +                ]
> +            if memarg == 0:
> +                regset += [(-1, regs[0])]
> +            elif memarg == 1:
> +                regset += [(dest, -1)]
> +        elif nreg == 3:
> +            regset = [
> +                (dest, regs[0], regs[1]),
> +                (dest, regs[0], regs[0]),
> +                (regs[0], regs[0], regs[1]),
> +                (regs[0], regs[1], regs[0]),
> +                (regs[0], regs[0], regs[0]),
> +                ]
> +            if memarg == 2:
> +                regset += [
> +                    (dest, regs[0], -1),
> +                    (regs[0], regs[0], -1),
> +                    ]
> +            elif memarg > 0:
> +                raise Exception("Memarg %d" % memarg)
> +        elif nreg == 4:
> +            regset = [
> +                (dest, regs[0], regs[1], regs[2]),
> +                (dest, regs[0], regs[0], regs[1]),
> +                (dest, regs[0], regs[1], regs[0]),
> +                (dest, regs[1], regs[0], regs[0]),
> +                (dest, regs[0], regs[0], regs[0]),
> +                (regs[0], regs[0], regs[1], regs[2]),
> +                (regs[0], regs[1], regs[0], regs[2]),
> +                (regs[0], regs[1], regs[2], regs[0]),
> +                (regs[0], regs[0], regs[0], regs[1]),
> +                (regs[0], regs[0], regs[1], regs[0]),
> +                (regs[0], regs[1], regs[0], regs[0]),
> +                (regs[0], regs[0], regs[0], regs[0]),
> +                ]
> +            if memarg == 2:
> +                regset += [
> +                    (dest, regs[0], -1, regs[1]),
> +                    (dest, regs[0], -1, regs[0]),
> +                    (regs[0], regs[0], -1, regs[1]),
> +                    (regs[0], regs[1], -1, regs[0]),
> +                    (regs[0], regs[0], -1, regs[0]),
> +                    ]
> +            elif memarg > 0:
> +                raise Exception("Memarg4 %d" % memarg)
> +        else:
> +            raise Exception("Too many regs: %s(%d)" % (self.op, nreg))
> +
> +        for regv in regset:
> +            argstr = []
> +            for i in range(nreg):
> +                arg = self.args[i]
> +                argstr.append(arg.regstr(regv[i]))
> +            if immarg is None:
> +                yield self.op + ' ' + ','.join(argstr)
> +            else:
> +                for immval in immarg.vals():
> +                    yield self.op + ' ' + ','.join(argstr) + ',' + str(immval)
> +
> +def split0(s):
> +    if s == '':
> +        return []
> +    return s.split(',')
> +
> +def main():
> +    n = 0
> +    if len(sys.argv) != 3:
> +        print("Usage: test-avx.py x86.csv test-avx.h")
> +        sys.exit(1)
> +    csvfile = open(sys.argv[1], 'r', newline='')
> +    with open(sys.argv[2], "w") as outf:
> +        outf.write("// Generated by test-avx.py. Do not edit.\n")
> +        for row in csv.reader(strip_comments(csvfile)):
> +            insn = row[0].replace(',', '').split()
> +            if insn[0] in ignore:
> +                continue
> +            cpuid = row[6]
> +            if cpuid in archs:
> +                g = InsnGenerator(insn[0], insn[1:])
> +                for insn in g.gen():
> +                    outf.write('TEST(%d, "%s", %s)\n' % (n, insn, g.optype))
> +                    n += 1
> +        outf.write("#undef TEST\n")
> +        csvfile.close()
> +
> +if __name__ == "__main__":
> +    main()
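
For anyone reading along, the integer r/m operand handling in ArgGenerator() above comes down to splitting the operand at the slash and taking the register width from the memory width when the register half is a bare "r". A minimal standalone sketch follows; split_rm is a hypothetical name used only for illustration and is not part of the patch.

```python
def split_rm(arg):
    """Split an Intel-syntax integer operand into (reg_width, mem_width).

    Hypothetical helper mirroring the 'r' branch of ArgGenerator() in the
    patch. Widths are in bits; a mem_width of 0 means there is no memory form.
    """
    if not arg.startswith('r'):
        raise ValueError("not an integer register operand: %s" % arg)
    if '/m' in arg:
        r, m = arg.split('/')
        mw = int(m[1:])
        # A bare "r" (as in "r/m32") takes its width from the memory form.
        rw = mw if r == 'r' else int(r[1:])
        return rw, mw
    return int(arg[1:]), 0


print(split_rm("r/m32"))    # (32, 32)
print(split_rm("r32/m16"))  # (32, 16)
print(split_rm("r64"))      # (64, 0)
```

The two-size form ("r32/m16") is the case described in the x86.csv header, where the register width differs from the memory width.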
> diff --git a/tests/tcg/i386/x86.csv b/tests/tcg/i386/x86.csv
> new file mode 100644
> index 0000000000..d5d0c17f1b
> --- /dev/null
> +++ b/tests/tcg/i386/x86.csv
> @@ -0,0 +1,4658 @@
> +# x86 instruction set description version 0.2x, 2018-05-08
> +#
> +# https://golang.org/x/arch/x86
> +#
> +# The latest version of the CSV file is
> +# available online at https://golang.org/s/x86.csv.
> +#
> +# This file contains a block of comment lines, each beginning with #,
> +# followed by entries in CSV format. All the # comments are at the top
> +# of the file, so a reader can skip past the comments and hand the
> +# rest of the file to a standard CSV reader.
> +# Each CSV line contains these fields:
> +#
> +# 1. The Intel manual instruction mnemonic. For example, "SHR r/m32, imm8".
> +#
> +# 2. The Go assembler instruction mnemonic. For example, "SHRL imm8, r/m32".
> +#
> +# 3. The GNU binutils instruction mnemonic. For example, "shrl imm8, r/m32".
> +#
> +# 4. The instruction encoding. For example, "C1 /4 ib".
> +#
> +# 5. The validity of the instruction in 32-bit (aka compatibility, legacy) mode.
> +#
> +# 6. The validity of the instruction in 64-bit mode.
> +#
> +# 7. The CPUID feature flags that signal support for the instruction.
> +#
> +# 8. Additional comma-separated tags containing hints about the instruction.
> +#
> +# 9. The read/write actions of the instruction on the arguments used in
> +# the Intel mnemonic. For example, "rw,r" to denote that "SHR r/m32, imm8"
> +# reads and writes its first argument but only reads its second argument.
> +#
> +# 10. Whether the opcode used in the Intel mnemonic has encoding forms
> +# distinguished only by operand size, like most arithmetic instructions.
> +# The string "Y" indicates yes, the string "" indicates no.
> +#
> +# 11. The data size of the operation in bits. In general this is the size corresponding
> +# to the Go and GNU assembler opcode suffix.
> +# Mnemonics (the opcode string)
> +#
> +# The instruction mnemonics are as used in the Intel manual, with a few exceptions.
> +#
> +# Mnemonics claiming general memory forms but that really require fixed addressing modes
> +# are omitted in favor of their equivalents with implicit arguments.
> +# For example, "CMPS m16, m16" (really CMPS [SI], [DI]) is omitted in favor of "CMPSW".
> +#
> +# Instruction forms with an explicit REP, REPE, or REPNE prefix are also omitted.
> +# Encoders and decoders are expected to handle those prefixes separately.
> +#
> +# Perhaps most significantly, the argument syntaxes used in the mnemonic indicate
> +# exactly how to derive the argument from the instruction encoding, or vice versa.
> +#
> +# Immediate values: imm8, imm8u, imm16, imm16u, imm32, imm64.
> +# Immediates are signed by default; the u suffix indicates an unsigned value.
> +# Immediates may have a bitfield-like modifier that specifies how many bits
> +# are used. For example, imm8u:4 is encoded like an 8-bit immediate,
> +# but only 4 bits are meaningful while the others are ignored or must be 0.
> +#
> +# Memory operands. The forms m, m128, m14/28byte, m16, m16&16, m16&32, m16&64, m16:16, m16:32,
> +# m16:64, m16int, m256, m2byte, m32, m32&32, m32fp, m32int, m512byte, m64, m64fp, m64int,
> +# m8, m80bcd, m80dec, m80fp, m94/108byte. These operands always correspond to the
> +# memory address specified by the r/m half of the modrm encoding.
> +#
> +# Integer registers.
> +# The forms r8, r16, r32, r64 indicate a register selected by the modrm reg encoding.
> +# The forms rmr16, rmr32, rmr64 indicate a register (never memory) selected by the modrm r/m encoding.
> +# The forms r/m8, r/m16, r/m32, and r/m64 indicate a register or memory selected by the modrm r/m encoding.
> +# Forms with two sizes, like r32/m16 also indicate a register or memory selected by the modrm r/m encoding,
> +# but the size for a register argument differs from the size of a memory argument.
> +# The forms r8V, r16V, r32V, r64V indicate a register selected by the VEX.vvvv bits.
> +#
> +# Multimedia registers.
> +# The forms mm1, xmm1, and ymm1 indicate a multimedia register selected by the
> +# modrm reg encoding.
> +# The forms mm2, xmm2, and ymm2 indicate a register (never memory) selected by
> +# the modrm r/m encoding.
> +# The forms mm2/m64, xmm2/m128, and so on indicate a register or memory
> +# selected by the modrm r/m encoding.
> +# The forms xmmV and ymmV indicate a register selected by the VEX.vvvv bits.
> +# The forms xmmI and ymmI indicate a register selected by the top four bits of an /is4 immediate byte.
> +#
> +# Bound registers.
> +# The form bnd1 indicates a bound register selected by the modrm reg encoding.
> +# The form bnd2 indicates a bound register (never memory) selected by the modrm r/m encoding.
> +# The forms bnd2/m64 and bnd2/m128 indicate a register or memory selected by the modrm r/m encoding.
> +# TODO: Describe mib.
> +#
> +# One-of-a-kind operands: rel8, rel16, rel32, ptr16:16, ptr16:32,
> +# moffs8, moffs16, moffs32, moffs64, vm32x, vm32y, vm64x, and vm64y
> +# are all as in the Intel manual.
> +#
> +# Encodings
> +#
> +# The encodings are also as used in the Intel manual, with automated corrections.
> +# For example, the Intel manual sometimes omits the modrm /r indicator or other trailing bytes,
> +# and it also contains typographical errors.
> +# These problems are corrected so that the CSV data may be used to generate
> +# tools for processing x86 machine code.
> +# See https://golang.org/x/arch/x86/x86map for one such generator.
> +#
> +# Valid32 and Valid64
> +#
> +# These columns hold validity abbreviations as defined in the Intel manual:
> +# V, I, N.E., N.P., N.S., or N.I.
> +# Tools processing the data are typically only concerned with whether the
> +# column is "V" (valid) or not.
> +# This data is also corrected compared to the manual.
> +# For example, the manual lists many instruction forms using REX bytes
> +# with an incorrect "V" in the Valid32 column.
> +#
> +# CPUID Feature Flags
> +#
> +# This column specifies CPUID feature flags that must be present in order
> +# to use the instruction. If multiple flags are required,
> +# they are listed separated by plus signs, as in PCLMULQDQ+AVX.
> +# The column can also list one of the values 486, Pentium, PentiumII, and P6,
> +# indicating that the instruction was introduced on that architecture version.
> +#
> +# Tags
> +#
> +# The tag column does not correspond to a traditional column in the Intel manual tables.
> +# Instead, it is itself a comma-separated list of tags or hints derived by analysis
> +# of the instruction set or the instruction encodings.
> +#
> +# The tags address16, address32, and address64 indicate that the instruction form
> +# applies when using the specified addressing size. It may therefore be necessary to use an
> +# address size prefix byte to access the instruction.
> +# If two address tags are listed, the instruction can be used with either of those
> +# address sizes. An instruction will never list all three address sizes.
> +# (In fact, today, no instruction lists two address sizes, but that may change.)
> +#
> +# The tags operand16, operand32, and operand64 indicate that the instruction form
> +# applies when using the specified operand size. It may therefore be necessary to use an
> +# operand size prefix byte to access the instruction.
> +# If two operand tags are listed, the instruction can be used with either of those
> +# operand sizes. An instruction will never list all three operand sizes.
> +# For some instructions, default64 is used instead of operand64,
> +# which specifies data promotion to 64-bit.
> +# For instructions with different possible data sizes,
> +# it also describes that default data size is 64-bit instead of 32-bit.
> +# Using refining prefix like 0x66 will lead to 32-bit operation (if supported).
> +#
> +# The tags modrm_regonly or modrm_memonly indicate that the modrm byte's
> +# r/m encoding must specify a register or memory, respectively.
> +# Especially in newer instructions, the modrm constraint may be the only way
> +# to distinguish two instruction forms. For example the MOVHLPS and MOVLPS
> +# instructions share the same encoding, except that the former requires the
> +# modrm byte's r/m to indicate a register, while the latter requires it to indicate memory.
> +#
> +# The tags pseudo and pseudo64 indicate that this instruction form is redundant
> +# with others listed in the table and should be ignored when generating disassembly
> +# or instruction scanning programs. The pseudo64 tag is reserved for the case where
> +# the manual lists an instruction twice, once with the optional 64-bit mode REX byte.
> +# Since most decoders will handle the REX byte separately, the form with the
> +# unnecessary REX is tagged pseudo64.
> +#
> +# The amd tag marks AMD-specific instructions.
> +# As an example, all SSE4a instructions carry this tag.
> +#
> +# The AVX512-specific tags: scaleX and bscaleX.
> +# scale1, scale2, scale4, scale8, scale16, scale32, scale64 specify
> +# the compressed displacement multiplier (scaling).
> +# For example, if displacement is 128 and scale32 is set,
> +# disp8 value should be calculated as 128/32.
> +# bscale4 and bscale8 have the same meaning, but are used
> +# when instruction uses embedded broadcast feature.
> +# If instruction does not have bscaleX tag, it does not support EVEX broadcasting.
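
The scaleN arithmetic described above (displacement 128 with scale32 gives a disp8 of 128/32) can be sketched as below. This is an illustrative sketch only: compress_disp8 is a hypothetical name, and returning None for displacements that cannot be compressed is an assumption about how a caller would want to signal the fallback to a wider displacement.

```python
def compress_disp8(disp, scale):
    """Return the disp8 value for `disp` under a scaleN tag, or None.

    Hypothetical helper. None means the displacement is not a multiple of
    the scale, or the quotient does not fit a signed 8-bit field, so an
    encoder would have to use a wider displacement instead.
    """
    if disp % scale != 0:
        return None
    q = disp // scale
    if -128 <= q <= 127:
        return q
    return None


print(compress_disp8(128, 32))  # 4
print(compress_disp8(100, 32))  # None (not a multiple of 32)
```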
> +#
> +# Related packages (can be a good source of additional documentation):
> +#	x86csv - read and manipulate x86.csv
> +#	x86spec - x86.csv generator
> +#	x86map - x86asm table generator based on x86.csv
> +#	x86avxgen - cmd/internal/obj/x86 optab generator based on x86.csv
> +# All listed packages are located at golang.org/x/arch/x86/.
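
As the header says, a reader can drop the comment lines and hand the rest to a standard CSV reader. A minimal sketch of that, in the same spirit as main() in test-avx.py: the body of strip_comments below is an assumption (the real definition is outside the part of the patch quoted here), and the sample row is copied from the table that follows.

```python
import csv
import io


def strip_comments(lines):
    """Yield only the lines that do not start with '#'.

    Assumed behaviour; the patch defines its own strip_comments elsewhere.
    """
    for line in lines:
        if not line.startswith('#'):
            yield line


# One comment line followed by one real row from x86.csv.
sample = io.StringIO(
    '# header comment\n'
    '"ADDPS xmm1, xmm2/m128","ADDPS xmm2/m128, xmm1","addps xmm2/m128, xmm1",'
    '"0F 58 /r","V","V","SSE","","rw,r","",""\n'
)

rows = list(csv.reader(strip_comments(sample)))
mnemonic, cpuid = rows[0][0], rows[0][6]   # fields 1 and 7 in the header's list
print(mnemonic.replace(',', '').split())   # ['ADDPS', 'xmm1', 'xmm2/m128']
print(cpuid)                               # SSE
```

Field 7 (index 6) is the CPUID column that main() compares against its list of supported architectures.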
> +"PUSH imm32","-/PUSHL/PUSHQ imm32","-/pushl/pushq imm32","68 id","V","N.S.","","operand32","r","Y",""
> +"PUSH imm32","-/PUSHL/PUSHQ imm32","-/pushl/pushq imm32","68 id","N.S.","V","","default64","r","Y",""
> +"AAA","AAA","aaa","37","V","N.S.","","","","",""
> +"AAD","AAD","aad","D5 0A","V","I","","pseudo","","",""
> +"AAD imm8u","AAD imm8u","aad imm8u","D5 ib","V","N.S.","","","r","",""
> +"AAM","AAM","aam","D4 0A","V","I","","pseudo","","",""
> +"AAM imm8u","AAM imm8u","aam imm8u","D4 ib","V","N.S.","","","r","",""
> +"AAS","AAS","aas","3F","V","N.S.","","","","",""
> +"ADC AL, imm8","ADCB imm8, AL","adcb imm8, AL","14 ib","V","V","","","rw,r","Y","8"
> +"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","80 /2 ib","V","V","","","rw,r","Y","8"
> +"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","82 /2 ib","V","N.S.","","","rw,r","Y","8"
> +"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","REX 80 /2 ib","N.E.","V","","pseudo64","rw,r","Y","8"
> +"ADC r8, r/m8","ADCB r/m8, r8","adcb r/m8, r8","12 /r","V","V","","","rw,r","Y","8"
> +"ADC r8, r/m8","ADCB r/m8, r8","adcb r/m8, r8","REX 12 /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"ADC r/m8, r8","ADCB r8, r/m8","adcb r8, r/m8","10 /r","V","V","","","rw,r","Y","8"
> +"ADC r/m8, r8","ADCB r8, r/m8","adcb r8, r/m8","REX 10 /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"ADC EAX, imm32","ADCL imm32, EAX","adcl imm32, EAX","15 id","V","V","","operand32","rw,r","Y","32"
> +"ADC r/m32, imm32","ADCL imm32, r/m32","adcl imm32, r/m32","81 /2 id","V","V","","operand32","rw,r","Y","32"
> +"ADC r/m32, imm8","ADCL imm8, r/m32","adcl imm8, r/m32","83 /2 ib","V","V","","operand32","rw,r","Y","32"
> +"ADC r32, r/m32","ADCL r/m32, r32","adcl r/m32, r32","13 /r","V","V","","operand32","rw,r","Y","32"
> +"ADC r/m32, r32","ADCL r32, r/m32","adcl r32, r/m32","11 /r","V","V","","operand32","rw,r","Y","32"
> +"ADC RAX, imm32","ADCQ imm32, RAX","adcq imm32, RAX","REX.W 15 id","N.S.","V","","","rw,r","Y","64"
> +"ADC r/m64, imm32","ADCQ imm32, r/m64","adcq imm32, r/m64","REX.W 81 /2 id","N.S.","V","","","rw,r","Y","64"
> +"ADC r/m64, imm8","ADCQ imm8, r/m64","adcq imm8, r/m64","REX.W 83 /2 ib","N.S.","V","","","rw,r","Y","64"
> +"ADC r64, r/m64","ADCQ r/m64, r64","adcq r/m64, r64","REX.W 13 /r","N.S.","V","","","rw,r","Y","64"
> +"ADC r/m64, r64","ADCQ r64, r/m64","adcq r64, r/m64","REX.W 11 /r","N.S.","V","","","rw,r","Y","64"
> +"ADC AX, imm16","ADCW imm16, AX","adcw imm16, AX","15 iw","V","V","","operand16","rw,r","Y","16"
> +"ADC r/m16, imm16","ADCW imm16, r/m16","adcw imm16, r/m16","81 /2 iw","V","V","","operand16","rw,r","Y","16"
> +"ADC r/m16, imm8","ADCW imm8, r/m16","adcw imm8, r/m16","83 /2 ib","V","V","","operand16","rw,r","Y","16"
> +"ADC r16, r/m16","ADCW r/m16, r16","adcw r/m16, r16","13 /r","V","V","","operand16","rw,r","Y","16"
> +"ADC r/m16, r16","ADCW r16, r/m16","adcw r16, r/m16","11 /r","V","V","","operand16","rw,r","Y","16"
> +"ADCX r32, r/m32","ADCXL r/m32, r32","adcxl r/m32, r32","66 0F 38 F6 /r","V","V","ADX","operand16,operand32","rw,r","Y","32"
> +"ADCX r64, r/m64","ADCXQ r/m64, r64","adcxq r/m64, r64","66 REX.W 0F 38 F6 /r","N.S.","V","ADX","","rw,r","Y","64"
> +"ADD AL, imm8","ADDB imm8, AL","addb imm8, AL","04 ib","V","V","","","rw,r","Y","8"
> +"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","80 /0 ib","V","V","","","rw,r","Y","8"
> +"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","82 /0 ib","V","N.S.","","","rw,r","Y","8"
> +"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","REX 80 /0 ib","N.E.","V","","pseudo64","rw,r","Y","8"
> +"ADD r8, r/m8","ADDB r/m8, r8","addb r/m8, r8","02 /r","V","V","","","rw,r","Y","8"
> +"ADD r8, r/m8","ADDB r/m8, r8","addb r/m8, r8","REX 02 /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"ADD r/m8, r8","ADDB r8, r/m8","addb r8, r/m8","00 /r","V","V","","","rw,r","Y","8"
> +"ADD r/m8, r8","ADDB r8, r/m8","addb r8, r/m8","REX 00 /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"ADD EAX, imm32","ADDL imm32, EAX","addl imm32, EAX","05 id","V","V","","operand32","rw,r","Y","32"
> +"ADD r/m32, imm32","ADDL imm32, r/m32","addl imm32, r/m32","81 /0 id","V","V","","operand32","rw,r","Y","32"
> +"ADD r/m32, imm8","ADDL imm8, r/m32","addl imm8, r/m32","83 /0 ib","V","V","","operand32","rw,r","Y","32"
> +"ADD r32, r/m32","ADDL r/m32, r32","addl r/m32, r32","03 /r","V","V","","operand32","rw,r","Y","32"
> +"ADD r/m32, r32","ADDL r32, r/m32","addl r32, r/m32","01 /r","V","V","","operand32","rw,r","Y","32"
> +"ADDPD xmm1, xmm2/m128","ADDPD xmm2/m128, xmm1","addpd xmm2/m128, xmm1","66 0F 58 /r","V","V","SSE2","","rw,r","",""
> +"ADDPS xmm1, xmm2/m128","ADDPS xmm2/m128, xmm1","addps xmm2/m128, xmm1","0F 58 /r","V","V","SSE","","rw,r","",""
> +"ADD RAX, imm32","ADDQ imm32, RAX","addq imm32, RAX","REX.W 05 id","N.S.","V","","","rw,r","Y","64"
> +"ADD r/m64, imm32","ADDQ imm32, r/m64","addq imm32, r/m64","REX.W 81 /0 id","N.S.","V","","","rw,r","Y","64"
> +"ADD r/m64, imm8","ADDQ imm8, r/m64","addq imm8, r/m64","REX.W 83 /0 ib","N.S.","V","","","rw,r","Y","64"
> +"ADD r64, r/m64","ADDQ r/m64, r64","addq r/m64, r64","REX.W 03 /r","N.S.","V","","","rw,r","Y","64"
> +"ADD r/m64, r64","ADDQ r64, r/m64","addq r64, r/m64","REX.W 01 /r","N.S.","V","","","rw,r","Y","64"
> +"ADDSD xmm1, xmm2/m64","ADDSD xmm2/m64, xmm1","addsd xmm2/m64, xmm1","F2 0F 58 /r","V","V","SSE2","","rw,r","",""
> +"ADDSS xmm1, xmm2/m32","ADDSS xmm2/m32, xmm1","addss xmm2/m32, xmm1","F3 0F 58 /r","V","V","SSE","","rw,r","",""
> +"ADDSUBPD xmm1, xmm2/m128","ADDSUBPD xmm2/m128, xmm1","addsubpd xmm2/m128, xmm1","66 0F D0 /r","V","V","SSE3","","rw,r","",""
> +"ADDSUBPS xmm1, xmm2/m128","ADDSUBPS xmm2/m128, xmm1","addsubps xmm2/m128, xmm1","F2 0F D0 /r","V","V","SSE3","","rw,r","",""
> +"ADD AX, imm16","ADDW imm16, AX","addw imm16, AX","05 iw","V","V","","operand16","rw,r","Y","16"
> +"ADD r/m16, imm16","ADDW imm16, r/m16","addw imm16, r/m16","81 /0 iw","V","V","","operand16","rw,r","Y","16"
> +"ADD r/m16, imm8","ADDW imm8, r/m16","addw imm8, r/m16","83 /0 ib","V","V","","operand16","rw,r","Y","16"
> +"ADD r16, r/m16","ADDW r/m16, r16","addw r/m16, r16","03 /r","V","V","","operand16","rw,r","Y","16"
> +"ADD r/m16, r16","ADDW r16, r/m16","addw r16, r/m16","01 /r","V","V","","operand16","rw,r","Y","16"
> +"ADOX r32, r/m32","ADOXL r/m32, r32","adoxl r/m32, r32","F3 0F 38 F6 /r","V","V","ADX","operand16,operand32","rw,r","Y","32"
> +"ADOX r64, r/m64","ADOXQ r/m64, r64","adoxq r/m64, r64","F3 REX.W 0F 38 F6 /r","N.S.","V","ADX","","rw,r","Y","64"
> +"AESDEC xmm1, xmm2/m128","AESDEC xmm2/m128, xmm1","aesdec xmm2/m128, xmm1","66 0F 38 DE /r","V","V","AES","","rw,r","",""
> +"AESDECLAST xmm1, xmm2/m128","AESDECLAST xmm2/m128, xmm1","aesdeclast xmm2/m128, xmm1","66 0F 38 DF /r","V","V","AES","","rw,r","",""
> +"AESENC xmm1, xmm2/m128","AESENC xmm2/m128, xmm1","aesenc xmm2/m128, xmm1","66 0F 38 DC /r","V","V","AES","","rw,r","",""
> +"AESENCLAST xmm1, xmm2/m128","AESENCLAST xmm2/m128, xmm1","aesenclast xmm2/m128, xmm1","66 0F 38 DD /r","V","V","AES","","rw,r","",""
> +"AESIMC xmm1, xmm2/m128","AESIMC xmm2/m128, xmm1","aesimc xmm2/m128, xmm1","66 0F 38 DB /r","V","V","AES","","w,r","",""
> +"AESKEYGENASSIST xmm1, xmm2/m128, imm8u","AESKEYGENASSIST imm8u, xmm2/m128, xmm1","aeskeygenassist imm8u, xmm2/m128, xmm1","66 0F 3A DF /r ib","V","V","AES","","w,r,r","",""
> +"AND AL, imm8","ANDB imm8, AL","andb imm8, AL","24 ib","V","V","","","rw,r","Y","8"
> +"AND r/m8, imm8","ANDB imm8, r/m8","andb imm8, r/m8","REX 80 /4 ib","N.E.","V","","pseudo64","rw,r","Y","8"
> +"AND r/m8, imm8u","ANDB imm8u, r/m8","andb imm8u, r/m8","80 /4 ib","V","V","","","rw,r","Y","8"
> +"AND r/m8, imm8u","ANDB imm8u, r/m8","andb imm8u, r/m8","82 /4 ib","V","N.S.","","","rw,r","Y","8"
> +"AND r8, r/m8","ANDB r/m8, r8","andb r/m8, r8","22 /r","V","V","","","rw,r","Y","8"
> +"AND r8, r/m8","ANDB r/m8, r8","andb r/m8, r8","REX 22 /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"AND r/m8, r8","ANDB r8, r/m8","andb r8, r/m8","20 /r","V","V","","","rw,r","Y","8"
> +"AND r/m8, r8","ANDB r8, r/m8","andb r8, r/m8","REX 20 /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"AND EAX, imm32","ANDL imm32, EAX","andl imm32, EAX","25 id","V","V","","operand32","rw,r","Y","32"
> +"AND r/m32, imm32","ANDL imm32, r/m32","andl imm32, r/m32","81 /4 id","V","V","","operand32","rw,r","Y","32"
> +"AND r/m32, imm8","ANDL imm8, r/m32","andl imm8, r/m32","83 /4 ib","V","V","","operand32","rw,r","Y","32"
> +"AND r32, r/m32","ANDL r/m32, r32","andl r/m32, r32","23 /r","V","V","","operand32","rw,r","Y","32"
> +"AND r/m32, r32","ANDL r32, r/m32","andl r32, r/m32","21 /r","V","V","","operand32","rw,r","Y","32"
> +"ANDN r32, r32V, r/m32","ANDNL r/m32, r32V, r32","andnl r/m32, r32V, r32","VEX.DDS.128.0F38.W0 F2 /r","V","V","BMI1","","rw,r,r","Y","32"
> +"ANDNPD xmm1, xmm2/m128","ANDNPD xmm2/m128, xmm1","andnpd xmm2/m128, xmm1","66 0F 55 /r","V","V","SSE2","","rw,r","",""
> +"ANDNPS xmm1, xmm2/m128","ANDNPS xmm2/m128, xmm1","andnps xmm2/m128, xmm1","0F 55 /r","V","V","SSE","","rw,r","",""
> +"ANDN r64, r64V, r/m64","ANDNQ r/m64, r64V, r64","andnq r/m64, r64V, r64","VEX.DDS.128.0F38.W1 F2 /r","N.S.","V","BMI1","","rw,r,r","Y","64"
> +"ANDPD xmm1, xmm2/m128","ANDPD xmm2/m128, xmm1","andpd xmm2/m128, xmm1","66 0F 54 /r","V","V","SSE2","","rw,r","",""
> +"ANDPS xmm1, xmm2/m128","ANDPS xmm2/m128, xmm1","andps xmm2/m128, xmm1","0F 54 /r","V","V","SSE","","rw,r","",""
> +"AND RAX, imm32","ANDQ imm32, RAX","andq imm32, RAX","REX.W 25 id","N.S.","V","","","rw,r","Y","64"
> +"AND r/m64, imm32","ANDQ imm32, r/m64","andq imm32, r/m64","REX.W 81 /4 id","N.S.","V","","","rw,r","Y","64"
> +"AND r/m64, imm8","ANDQ imm8, r/m64","andq imm8, r/m64","REX.W 83 /4 ib","N.S.","V","","","rw,r","Y","64"
> +"AND r64, r/m64","ANDQ r/m64, r64","andq r/m64, r64","REX.W 23 /r","N.S.","V","","","rw,r","Y","64"
> +"AND r/m64, r64","ANDQ r64, r/m64","andq r64, r/m64","REX.W 21 /r","N.S.","V","","","rw,r","Y","64"
> +"AND AX, imm16","ANDW imm16, AX","andw imm16, AX","25 iw","V","V","","operand16","rw,r","Y","16"
> +"AND r/m16, imm16","ANDW imm16, r/m16","andw imm16, r/m16","81 /4 iw","V","V","","operand16","rw,r","Y","16"
> +"AND r/m16, imm8","ANDW imm8, r/m16","andw imm8, r/m16","83 /4 ib","V","V","","operand16","rw,r","Y","16"
> +"AND r16, r/m16","ANDW r/m16, r16","andw r/m16, r16","23 /r","V","V","","operand16","rw,r","Y","16"
> +"AND r/m16, r16","ANDW r16, r/m16","andw r16, r/m16","21 /r","V","V","","operand16","rw,r","Y","16"
> +"ARPL r/m16, r16","ARPL r16, r/m16","arpl r16, r/m16","63 /r","V","N.S.","","","rw,r","",""
> +"BEXTR r32, r/m32, r32V","BEXTRL r32V, r/m32, r32","bextrl r32V, r/m32, r32","VEX.NDS.128.0F38.W0 F7 /r","V","V","BMI1","","w,r,r","Y","32"
> +"BEXTR r64, r/m64, r64V","BEXTRQ r64V, r/m64, r64","bextrq r64V, r/m64, r64","VEX.NDS.128.0F38.W1 F7 /r","N.S.","V","BMI1","","w,r,r","Y","64"
> +"BEXTR_XOP r32, r/m32, imm32u","BEXTR_XOPL imm32u, r/m32, r32","bextr_xopl imm32u, r/m32, r32","XOP.128.0A.WIG 10 /r","V","V","TBM","amd,operand16,operand32","w,r,r","Y","32"
> +"BEXTR_XOP r64, r/m64, imm32u","BEXTR_XOPQ imm32u, r/m64, r64","bextr_xopq imm32u, r/m64, r64","XOP.128.0A.WIG 10 /r","N.S.","V","TBM","amd,operand64","w,r,r","Y","64"
> +"BLCFILL r32V, r/m32","BLCFILLL r/m32, r32V","blcfill r/m32, r32V","XOP.NDD.128.09.WIG 01 /1","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
> +"BLCFILL r64V, r/m64","BLCFILLQ r/m64, r64V","blcfill r/m64, r64V","XOP.NDD.128.09.W1 01 /1","N.S.","V","TBM","amd,operand64","w,r","Y","64"
> +"BLCIC r32V, r/m32","BLCICL r/m32, r32V","blcicl r/m32, r32V","XOP.NDD.128.09.WIG 01 /5","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
> +"BLCIC r64V, r/m64","BLCICQ r/m64, r64V","blcicq r/m64, r64V","XOP.NDD.128.09.WIG 01 /5","N.S.","V","TBM","amd,operand64","w,r","Y","64"
> +"BLCI r32V, r/m32","BLCIL r/m32, r32V","blcil r/m32, r32V","XOP.NDD.128.09.WIG 02 /6","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
> +"BLCI r64V, r/m64","BLCIQ r/m64, r64V","blciq r/m64, r64V","XOP.NDD.128.09.WIG 02 /6","N.S.","V","TBM","amd,operand64","w,r","Y","64"
> +"BLCMSK r32V, r/m32","BLCMSKL r/m32, r32V","blcmskl r/m32, r32V","XOP.NDD.128.09.WIG 02 /1","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
> +"BLCMSK r64V, r/m64","BLCMSKQ r/m64, r64V","blcmskq r/m64, r64V","XOP.NDD.128.09.WIG 02 /1","N.S.","V","TBM","amd,operand64","w,r","Y","64"
> +"BLCS r32V, r/m32","BLCSL r/m32, r32V","blcsl r/m32, r32V","XOP.NDD.128.09.WIG 01 /3","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
> +"BLCS r64V, r/m64","BLCSQ r/m64, r64V","blcsq r/m64, r64V","XOP.NDD.128.09.WIG 01 /3","N.S.","V","TBM","amd,operand64","w,r","Y","64"
> +"BLENDPD xmm1, xmm2/m128, imm8u","BLENDPD imm8u, xmm2/m128, xmm1","blendpd imm8u, xmm2/m128, xmm1","66 0F 3A 0D /r ib","V","V","SSE4_1","","rw,r,r","",""
> +"BLENDPS xmm1, xmm2/m128, imm8u","BLENDPS imm8u, xmm2/m128, xmm1","blendps imm8u, xmm2/m128, xmm1","66 0F 3A 0C /r ib","V","V","SSE4_1","","rw,r,r","",""
> +"BLENDVPD xmm1, xmm2/m128, <XMM0>","BLENDVPD <XMM0>, xmm2/m128, xmm1","blendvpd <XMM0>, xmm2/m128, xmm1","66 0F 38 15 /r","V","V","SSE4_1","","rw,r,r","",""
> +"BLENDVPS xmm1, xmm2/m128, <XMM0>","BLENDVPS <XMM0>, xmm2/m128, xmm1","blendvps <XMM0>, xmm2/m128, xmm1","66 0F 38 14 /r","V","V","SSE4_1","","rw,r,r","",""
> +"BLSFILL r32V, r/m32","BLSFILLL r/m32, r32V","blsfill r/m32, r32V","XOP.NDD.128.09.WIG 01 /2","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
> +"BLSFILL r64V, r/m64","BLSFILLQ r/m64, r64V","blsfill r/m64, r64V","XOP.NDD.128.09.W1 01 /2","N.S.","V","TBM","amd,operand64","w,r","Y","64"
> +"BLSIC r32V, r/m32","BLSICL r/m32, r32V","blsicl r/m32, r32V","XOP.NDD.128.09.WIG 01 /6","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
> +"BLSIC r64V, r/m64","BLSICQ r/m64, r64V","blsicq r/m64, r64V","XOP.NDD.128.09.WIG 01 /6","N.S.","V","TBM","amd,operand64","w,r","Y","64"
> +"BLSI r32V, r/m32","BLSIL r/m32, r32V","blsil r/m32, r32V","VEX.NDD.128.0F38.W0 F3 /3","V","V","BMI1","","w,r","Y","32"
> +"BLSI r64V, r/m64","BLSIQ r/m64, r64V","blsiq r/m64, r64V","VEX.NDD.128.0F38.W1 F3 /3","N.S.","V","BMI1","","w,r","Y","64"
> +"BLSMSK r32V, r/m32","BLSMSKL r/m32, r32V","blsmskl r/m32, r32V","VEX.NDD.128.0F38.W0 F3 /2","V","V","BMI1","","w,r","Y","32"
> +"BLSMSK r64V, r/m64","BLSMSKQ r/m64, r64V","blsmskq r/m64, r64V","VEX.NDD.128.0F38.W1 F3 /2","N.S.","V","BMI1","","w,r","Y","64"
> +"BLSR r32V, r/m32","BLSRL r/m32, r32V","blsrl r/m32, r32V","VEX.NDD.128.0F38.W0 F3 /1","V","V","BMI1","","w,r","Y","32"
> +"BLSR r64V, r/m64","BLSRQ r/m64, r64V","blsrq r/m64, r64V","VEX.NDD.128.0F38.W1 F3 /1","N.S.","V","BMI1","","w,r","Y","64"
> +"BNDCL bnd1, r/m32","BNDCL r/m32, bnd1","bndcl r/m32, bnd1","F3 0F 1A /r","V","N.S.","MPX","","r,r","",""
> +"BNDCL bnd1, r/m64","BNDCL r/m64, bnd1","bndcl r/m64, bnd1","F3 0F 1A /r","N.S.","V","MPX","","r,r","",""
> +"BNDCN bnd1, r/m32","BNDCN r/m32, bnd1","bndcn r/m32, bnd1","F2 0F 1B /r","V","N.S.","MPX","","r,r","",""
> +"BNDCN bnd1, r/m64","BNDCN r/m64, bnd1","bndcn r/m64, bnd1","F2 0F 1B /r","N.S.","V","MPX","","r,r","",""
> +"BNDCU bnd1, r/m32","BNDCU r/m32, bnd1","bndcu r/m32, bnd1","F2 0F 1A /r","V","N.S.","MPX","","r,r","",""
> +"BNDCU bnd1, r/m64","BNDCU r/m64, bnd1","bndcu r/m64, bnd1","F2 0F 1A /r","N.S.","V","MPX","","r,r","",""
> +"BNDLDX bnd1, mib","BNDLDX mib, bnd1","bndldx mib, bnd1","0F 1A /r","V","V","MPX","modrm_memonly","w,r","",""
> +"BNDMK bnd1, m32","BNDMK m32, bnd1","bndmk m32, bnd1","F3 0F 1B /r","V","N.S.","MPX","modrm_memonly","w,r","",""
> +"BNDMK bnd1, m64","BNDMK m64, bnd1","bndmk m64, bnd1","F3 0F 1B /r","N.S.","V","MPX","modrm_memonly","w,r","",""
> +"BNDMOV bnd2/m128, bnd1","BNDMOV bnd1, bnd2/m128","bndmov bnd1, bnd2/m128","66 0F 1B /r","N.S.","V","MPX","","w,r","",""
> +"BNDMOV bnd2/m64, bnd1","BNDMOV bnd1, bnd2/m64","bndmov bnd1, bnd2/m64","66 0F 1B /r","V","N.S.","MPX","","w,r","",""
> +"BNDMOV bnd1, bnd2/m128","BNDMOV bnd2/m128, bnd1","bndmov bnd2/m128, bnd1","66 0F 1A /r","N.S.","V","MPX","","w,r","",""
> +"BNDMOV bnd1, bnd2/m64","BNDMOV bnd2/m64, bnd1","bndmov bnd2/m64, bnd1","66 0F 1A /r","V","N.S.","MPX","","w,r","",""
> +"BNDSTX mib, bnd1","BNDSTX bnd1, mib","bndstx bnd1, mib","0F 1B /r","V","V","MPX","modrm_memonly","w,r","",""
> +"BOUND r32, m32&32","BOUNDL m32&32, r32","boundl r32, m32&32","62 /r","V","N.S.","","modrm_memonly,operand32","r,r","Y","32"
> +"BOUND r16, m16&16","BOUNDW m16&16, r16","boundw r16, m16&16","62 /r","V","N.S.","","modrm_memonly,operand16","r,r","Y","16"
> +"BSF r32, r/m32","BSFL r/m32, r32","bsfl r/m32, r32","0F BC /r","V","V","","operand32","rw,r","Y","32"
> +"BSF r32, r/m32","BSFL r/m32, r32","bsfl r/m32, r32","F3 0F BC /r","V","V","","operand32","rw,r","Y","32"
> +"BSF r64, r/m64","BSFQ r/m64, r64","bsfq r/m64, r64","F3 REX.W 0F BC /r","N.S.","V","","","rw,r","Y","64"
> +"BSF r64, r/m64","BSFQ r/m64, r64","bsfq r/m64, r64","REX.W 0F BC /r","N.S.","V","","","rw,r","Y","64"
> +"BSF r16, r/m16","BSFW r/m16, r16","bsfw r/m16, r16","0F BC /r","V","V","","operand16","rw,r","Y","16"
> +"BSF r16, r/m16","BSFW r/m16, r16","bsfw r/m16, r16","F3 0F BC /r","V","V","","operand16","rw,r","Y","16"
> +"BSR r32, r/m32","BSRL r/m32, r32","bsrl r/m32, r32","0F BD /r","V","V","","operand32","rw,r","Y","32"
> +"BSR r32, r/m32","BSRL r/m32, r32","bsrl r/m32, r32","F3 0F BD /r","V","V","","operand32","rw,r","Y","32"
> +"BSR r64, r/m64","BSRQ r/m64, r64","bsrq r/m64, r64","F3 REX.W 0F BD /r","N.S.","V","","","rw,r","Y","64"
> +"BSR r64, r/m64","BSRQ r/m64, r64","bsrq r/m64, r64","REX.W 0F BD /r","N.S.","V","","","rw,r","Y","64"
> +"BSR r16, r/m16","BSRW r/m16, r16","bsrw r/m16, r16","0F BD /r","V","V","","operand16","rw,r","Y","16"
> +"BSR r16, r/m16","BSRW r/m16, r16","bsrw r/m16, r16","F3 0F BD /r","V","V","","operand16","rw,r","Y","16"
> +"BSWAP r32op","BSWAPL r32op","bswap r32op","0F C8+rd","V","V","486","operand32","rw","Y","32"
> +"BSWAP r64op","BSWAPQ r64op","bswap r64op","REX.W 0F C8+ro","N.S.","V","486","","rw","Y","64"
> +"BSWAP r16op","BSWAPW r16op","bswap r16op","0F C8+rw","V","V","486","operand16","rw","Y","16"
> +"BTC r/m32, imm8u","BTCL imm8u, r/m32","btcl imm8u, r/m32","0F BA /7 ib","V","V","","operand32","rw,r","Y","32"
> +"BTC r/m32, r32","BTCL r32, r/m32","btcl r32, r/m32","0F BB /r","V","V","","operand32","rw,r","Y","32"
> +"BTC r/m64, imm8u","BTCQ imm8u, r/m64","btcq imm8u, r/m64","REX.W 0F BA /7 ib","N.S.","V","","","rw,r","Y","64"
> +"BTC r/m64, r64","BTCQ r64, r/m64","btcq r64, r/m64","REX.W 0F BB /r","N.S.","V","","","rw,r","Y","64"
> +"BTC r/m16, imm8u","BTCW imm8u, r/m16","btcw imm8u, r/m16","0F BA /7 ib","V","V","","operand16","rw,r","Y","16"
> +"BTC r/m16, r16","BTCW r16, r/m16","btcw r16, r/m16","0F BB /r","V","V","","operand16","rw,r","Y","16"
> +"BT r/m32, imm8u","BTL imm8u, r/m32","btl imm8u, r/m32","0F BA /4 ib","V","V","","operand32","r,r","Y","32"
> +"BT r/m32, r32","BTL r32, r/m32","btl r32, r/m32","0F A3 /r","V","V","","operand32","r,r","Y","32"
> +"BT r/m64, imm8u","BTQ imm8u, r/m64","btq imm8u, r/m64","REX.W 0F BA /4 ib","N.S.","V","","","r,r","Y","64"
> +"BT r/m64, r64","BTQ r64, r/m64","btq r64, r/m64","REX.W 0F A3 /r","N.S.","V","","","r,r","Y","64"
> +"BTR r/m32, imm8u","BTRL imm8u, r/m32","btrl imm8u, r/m32","0F BA /6 ib","V","V","","operand32","rw,r","Y","32"
> +"BTR r/m32, r32","BTRL r32, r/m32","btrl r32, r/m32","0F B3 /r","V","V","","operand32","rw,r","Y","32"
> +"BTR r/m64, imm8u","BTRQ imm8u, r/m64","btrq imm8u, r/m64","REX.W 0F BA /6 ib","N.S.","V","","","rw,r","Y","64"
> +"BTR r/m64, r64","BTRQ r64, r/m64","btrq r64, r/m64","REX.W 0F B3 /r","N.S.","V","","","rw,r","Y","64"
> +"BTR r/m16, imm8u","BTRW imm8u, r/m16","btrw imm8u, r/m16","0F BA /6 ib","V","V","","operand16","rw,r","Y","16"
> +"BTR r/m16, r16","BTRW r16, r/m16","btrw r16, r/m16","0F B3 /r","V","V","","operand16","rw,r","Y","16"
> +"BTS r/m32, imm8u","BTSL imm8u, r/m32","btsl imm8u, r/m32","0F BA /5 ib","V","V","","operand32","rw,r","Y","32"
> +"BTS r/m32, r32","BTSL r32, r/m32","btsl r32, r/m32","0F AB /r","V","V","","operand32","rw,r","Y","32"
> +"BTS r/m64, imm8u","BTSQ imm8u, r/m64","btsq imm8u, r/m64","REX.W 0F BA /5 ib","N.S.","V","","","rw,r","Y","64"
> +"BTS r/m64, r64","BTSQ r64, r/m64","btsq r64, r/m64","REX.W 0F AB /r","N.S.","V","","","rw,r","Y","64"
> +"BTS r/m16, imm8u","BTSW imm8u, r/m16","btsw imm8u, r/m16","0F BA /5 ib","V","V","","operand16","rw,r","Y","16"
> +"BTS r/m16, r16","BTSW r16, r/m16","btsw r16, r/m16","0F AB /r","V","V","","operand16","rw,r","Y","16"
> +"BT r/m16, imm8u","BTW imm8u, r/m16","btw imm8u, r/m16","0F BA /4 ib","V","V","","operand16","r,r","Y","16"
> +"BT r/m16, r16","BTW r16, r/m16","btw r16, r/m16","0F A3 /r","V","V","","operand16","r,r","Y","16"
> +"BZHI r32, r/m32, r32V","BZHIL r32V, r/m32, r32","bzhil r32V, r/m32, r32","VEX.NDS.128.0F38.W0 F5 /r","V","V","BMI2","","w,r,r","Y","32"
> +"BZHI r64, r/m64, r64V","BZHIQ r64V, r/m64, r64","bzhiq r64V, r/m64, r64","VEX.NDS.128.0F38.W1 F5 /r","N.S.","V","BMI2","","w,r,r","Y","64"
> +"CALL rel16","CALL rel16","call rel16","E8 cw","V","N.S.","","operand16","r","Y",""
> +"CALL rel32","CALL rel32","call rel32","E8 cd","V","N.S.","","operand32","r","Y",""
> +"CALL rel32","CALL rel32","call rel32","E8 cd","N.S.","V","","default64","r","Y",""
> +"CALL r/m32","CALLL* r/m32","calll* r/m32","FF /2","V","N.S.","","operand32","r","Y","32"
> +"CALL r/m64","CALLQ* r/m64","callq* r/m64","FF /2","N.S.","V","","default64","r","Y","64"
> +"CALL r/m16","CALLW* r/m16","callw* r/m16","FF /2","V","N.S.","","operand16","r","Y","16"
> +"CBW","CBW","cbtw","98","V","V","","operand16","","",""
> +"CDQ","CDQ","cltd","99","V","V","","operand32","","",""
> +"CDQE","CDQE","cltq","REX.W 98","N.S.","V","","","","",""
> +"CLAC","CLAC","clac","0F 01 CA","V","V","","","","",""
> +"CLC","CLC","clc","F8","V","V","","","","",""
> +"CLD","CLD","cld","FC","V","V","","","","",""
> +"CLFLUSH m8","CLFLUSH m8","clflush m8","0F AE /7","V","V","","modrm_memonly","r","",""
> +"CLFLUSHOPT m8","CLFLUSHOPT m8","clflushopt m8","66 0F AE /7","V","V","","modrm_memonly","r","",""
> +"CLGI","CLGI","clgi","0F 01 DD","V","V","SVM","amd","","",""
> +"CLI","CLI","cli","FA","V","V","","","","",""
> +"CLRSSBSY m64","CLRSSBSY m64","clrssbsy m64","F3 0F AE /6","V","V","CET","modrm_memonly","w","",""
> +"CLTS","CLTS","clts","0F 06","V","V","","","","",""
> +"CLWB m8","CLWB m8","clwb m8","66 0F AE /6","V","V","CLWB","modrm_memonly","r","",""
> +"CLZERO EAX","CLZEROL EAX","clzerol EAX","0F 01 FC","V","V","CLZERO","amd,modrm_regonly,operand32","r","Y","32"
> +"CLZERO RAX","CLZEROQ RAX","clzeroq RAX","REX.W 0F 01 FC","N.S.","V","CLZERO","amd,modrm_regonly","r","Y","64"
> +"CLZERO AX","CLZEROW AX","clzerow AX","0F 01 FC","V","V","CLZERO","amd,modrm_regonly,operand16","r","Y","16"
> +"CMC","CMC","cmc","F5","V","V","","","","",""
> +"CMOVC r16, r/m16","CMOVC r/m16, r16","cmovc r/m16, r16","0F 42 /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVC r32, r/m32","CMOVC r/m32, r32","cmovc r/m32, r32","0F 42 /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVC r64, r/m64","CMOVC r/m64, r64","cmovc r/m64, r64","REX.W 0F 42 /r","N.E.","V","","pseudo","rw,r","",""
> +"CMOVAE r32, r/m32","CMOVLCC r/m32, r32","cmovael r/m32, r32","0F 43 /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVB r32, r/m32","CMOVLCS r/m32, r32","cmovbl r/m32, r32","0F 42 /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVE r32, r/m32","CMOVLEQ r/m32, r32","cmovel r/m32, r32","0F 44 /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVGE r32, r/m32","CMOVLGE r/m32, r32","cmovgel r/m32, r32","0F 4D /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVG r32, r/m32","CMOVLGT r/m32, r32","cmovgl r/m32, r32","0F 4F /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVA r32, r/m32","CMOVLHI r/m32, r32","cmoval r/m32, r32","0F 47 /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVLE r32, r/m32","CMOVLLE r/m32, r32","cmovlel r/m32, r32","0F 4E /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVBE r32, r/m32","CMOVLLS r/m32, r32","cmovbel r/m32, r32","0F 46 /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVL r32, r/m32","CMOVLLT r/m32, r32","cmovll r/m32, r32","0F 4C /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVS r32, r/m32","CMOVLMI r/m32, r32","cmovsl r/m32, r32","0F 48 /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVNE r32, r/m32","CMOVLNE r/m32, r32","cmovnel r/m32, r32","0F 45 /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVNO r32, r/m32","CMOVLOC r/m32, r32","cmovnol r/m32, r32","0F 41 /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVO r32, r/m32","CMOVLOS r/m32, r32","cmovol r/m32, r32","0F 40 /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVNP r32, r/m32","CMOVLPC r/m32, r32","cmovnpl r/m32, r32","0F 4B /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVNS r32, r/m32","CMOVLPL r/m32, r32","cmovnsl r/m32, r32","0F 49 /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVP r32, r/m32","CMOVLPS r/m32, r32","cmovpl r/m32, r32","0F 4A /r","V","V","","P6,operand32","rw,r","Y","32"
> +"CMOVNA r16, r/m16","CMOVNA r/m16, r16","cmovna r/m16, r16","0F 46 /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVNA r32, r/m32","CMOVNA r/m32, r32","cmovna r/m32, r32","0F 46 /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVNA r64, r/m64","CMOVNA r/m64, r64","cmovna r/m64, r64","REX.W 0F 46 /r","N.E.","V","","pseudo","rw,r","",""
> +"CMOVNAE r16, r/m16","CMOVNAE r/m16, r16","cmovnae r/m16, r16","0F 42 /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVNAE r32, r/m32","CMOVNAE r/m32, r32","cmovnae r/m32, r32","0F 42 /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVNAE r64, r/m64","CMOVNAE r/m64, r64","cmovnae r/m64, r64","REX.W 0F 42 /r","N.E.","V","","pseudo","rw,r","",""
> +"CMOVNB r16, r/m16","CMOVNB r/m16, r16","cmovnb r/m16, r16","0F 43 /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVNB r32, r/m32","CMOVNB r/m32, r32","cmovnb r/m32, r32","0F 43 /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVNB r64, r/m64","CMOVNB r/m64, r64","cmovnb r/m64, r64","REX.W 0F 43 /r","N.E.","V","","pseudo","rw,r","",""
> +"CMOVNBE r16, r/m16","CMOVNBE r/m16, r16","cmovnbe r/m16, r16","0F 47 /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVNBE r32, r/m32","CMOVNBE r/m32, r32","cmovnbe r/m32, r32","0F 47 /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVNBE r64, r/m64","CMOVNBE r/m64, r64","cmovnbe r/m64, r64","REX.W 0F 47 /r","N.E.","V","","pseudo","rw,r","",""
> +"CMOVNC r16, r/m16","CMOVNC r/m16, r16","cmovnc r/m16, r16","0F 43 /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVNC r32, r/m32","CMOVNC r/m32, r32","cmovnc r/m32, r32","0F 43 /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVNC r64, r/m64","CMOVNC r/m64, r64","cmovnc r/m64, r64","REX.W 0F 43 /r","N.E.","V","","pseudo","rw,r","",""
> +"CMOVNG r16, r/m16","CMOVNG r/m16, r16","cmovng r/m16, r16","0F 4E /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVNG r32, r/m32","CMOVNG r/m32, r32","cmovng r/m32, r32","0F 4E /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVNG r64, r/m64","CMOVNG r/m64, r64","cmovng r/m64, r64","REX.W 0F 4E /r","N.E.","V","","pseudo","rw,r","",""
> +"CMOVNGE r16, r/m16","CMOVNGE r/m16, r16","cmovnge r/m16, r16","0F 4C /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVNGE r32, r/m32","CMOVNGE r/m32, r32","cmovnge r/m32, r32","0F 4C /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVNGE r64, r/m64","CMOVNGE r/m64, r64","cmovnge r/m64, r64","REX.W 0F 4C /r","N.E.","V","","pseudo","rw,r","",""
> +"CMOVNL r16, r/m16","CMOVNL r/m16, r16","cmovnl r/m16, r16","0F 4D /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVNL r32, r/m32","CMOVNL r/m32, r32","cmovnl r/m32, r32","0F 4D /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVNL r64, r/m64","CMOVNL r/m64, r64","cmovnl r/m64, r64","REX.W 0F 4D /r","N.E.","V","","pseudo","rw,r","",""
> +"CMOVNLE r16, r/m16","CMOVNLE r/m16, r16","cmovnle r/m16, r16","0F 4F /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVNLE r32, r/m32","CMOVNLE r/m32, r32","cmovnle r/m32, r32","0F 4F /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVNLE r64, r/m64","CMOVNLE r/m64, r64","cmovnle r/m64, r64","REX.W 0F 4F /r","N.E.","V","","pseudo","rw,r","",""
> +"CMOVNZ r16, r/m16","CMOVNZ r/m16, r16","cmovnz r/m16, r16","0F 45 /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVNZ r32, r/m32","CMOVNZ r/m32, r32","cmovnz r/m32, r32","0F 45 /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVNZ r64, r/m64","CMOVNZ r/m64, r64","cmovnz r/m64, r64","REX.W 0F 45 /r","N.E.","V","","pseudo","rw,r","",""
> +"CMOVPE r16, r/m16","CMOVPE r/m16, r16","cmovpe r/m16, r16","0F 4A /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVPE r32, r/m32","CMOVPE r/m32, r32","cmovpe r/m32, r32","0F 4A /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVPE r64, r/m64","CMOVPE r/m64, r64","cmovpe r/m64, r64","REX.W 0F 4A /r","N.E.","V","","pseudo","rw,r","",""
> +"CMOVPO r16, r/m16","CMOVPO r/m16, r16","cmovpo r/m16, r16","0F 4B /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVPO r32, r/m32","CMOVPO r/m32, r32","cmovpo r/m32, r32","0F 4B /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVPO r64, r/m64","CMOVPO r/m64, r64","cmovpo r/m64, r64","REX.W 0F 4B /r","N.E.","V","","pseudo","rw,r","",""
> +"CMOVAE r64, r/m64","CMOVQCC r/m64, r64","cmovaeq r/m64, r64","REX.W 0F 43 /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVB r64, r/m64","CMOVQCS r/m64, r64","cmovbq r/m64, r64","REX.W 0F 42 /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVE r64, r/m64","CMOVQEQ r/m64, r64","cmoveq r/m64, r64","REX.W 0F 44 /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVGE r64, r/m64","CMOVQGE r/m64, r64","cmovgeq r/m64, r64","REX.W 0F 4D /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVG r64, r/m64","CMOVQGT r/m64, r64","cmovgq r/m64, r64","REX.W 0F 4F /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVA r64, r/m64","CMOVQHI r/m64, r64","cmovaq r/m64, r64","REX.W 0F 47 /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVLE r64, r/m64","CMOVQLE r/m64, r64","cmovleq r/m64, r64","REX.W 0F 4E /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVBE r64, r/m64","CMOVQLS r/m64, r64","cmovbeq r/m64, r64","REX.W 0F 46 /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVL r64, r/m64","CMOVQLT r/m64, r64","cmovlq r/m64, r64","REX.W 0F 4C /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVS r64, r/m64","CMOVQMI r/m64, r64","cmovsq r/m64, r64","REX.W 0F 48 /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVNE r64, r/m64","CMOVQNE r/m64, r64","cmovneq r/m64, r64","REX.W 0F 45 /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVNO r64, r/m64","CMOVQOC r/m64, r64","cmovnoq r/m64, r64","REX.W 0F 41 /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVO r64, r/m64","CMOVQOS r/m64, r64","cmovoq r/m64, r64","REX.W 0F 40 /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVNP r64, r/m64","CMOVQPC r/m64, r64","cmovnpq r/m64, r64","REX.W 0F 4B /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVNS r64, r/m64","CMOVQPL r/m64, r64","cmovnsq r/m64, r64","REX.W 0F 49 /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVP r64, r/m64","CMOVQPS r/m64, r64","cmovpq r/m64, r64","REX.W 0F 4A /r","N.S.","V","","","rw,r","Y","64"
> +"CMOVAE r16, r/m16","CMOVWCC r/m16, r16","cmovaew r/m16, r16","0F 43 /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVB r16, r/m16","CMOVWCS r/m16, r16","cmovbw r/m16, r16","0F 42 /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVE r16, r/m16","CMOVWEQ r/m16, r16","cmovew r/m16, r16","0F 44 /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVGE r16, r/m16","CMOVWGE r/m16, r16","cmovgew r/m16, r16","0F 4D /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVG r16, r/m16","CMOVWGT r/m16, r16","cmovgw r/m16, r16","0F 4F /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVA r16, r/m16","CMOVWHI r/m16, r16","cmovaw r/m16, r16","0F 47 /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVLE r16, r/m16","CMOVWLE r/m16, r16","cmovlew r/m16, r16","0F 4E /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVBE r16, r/m16","CMOVWLS r/m16, r16","cmovbew r/m16, r16","0F 46 /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVL r16, r/m16","CMOVWLT r/m16, r16","cmovlw r/m16, r16","0F 4C /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVS r16, r/m16","CMOVWMI r/m16, r16","cmovsw r/m16, r16","0F 48 /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVNE r16, r/m16","CMOVWNE r/m16, r16","cmovnew r/m16, r16","0F 45 /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVNO r16, r/m16","CMOVWOC r/m16, r16","cmovnow r/m16, r16","0F 41 /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVO r16, r/m16","CMOVWOS r/m16, r16","cmovow r/m16, r16","0F 40 /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVNP r16, r/m16","CMOVWPC r/m16, r16","cmovnpw r/m16, r16","0F 4B /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVNS r16, r/m16","CMOVWPL r/m16, r16","cmovnsw r/m16, r16","0F 49 /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVP r16, r/m16","CMOVWPS r/m16, r16","cmovpw r/m16, r16","0F 4A /r","V","V","","P6,operand16","rw,r","Y","16"
> +"CMOVZ r16, r/m16","CMOVZ r/m16, r16","cmovz r/m16, r16","0F 44 /r","V","V","","P6,operand16,pseudo","rw,r","",""
> +"CMOVZ r32, r/m32","CMOVZ r/m32, r32","cmovz r/m32, r32","0F 44 /r","V","V","","P6,operand32,pseudo","rw,r","",""
> +"CMOVZ r64, r/m64","CMOVZ r/m64, r64","cmovz r/m64, r64","REX.W 0F 44 /r","N.E.","V","","pseudo","rw,r","",""
> +"CMP AL, imm8","CMPB AL, imm8","cmpb imm8, AL","3C ib","V","V","","","r,r","Y","8"
> +"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","80 /7 ib","V","V","","","r,r","Y","8"
> +"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","82 /7 ib","V","N.S.","","","r,r","Y","8"
> +"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","REX 80 /7 ib","N.E.","V","","pseudo64","r,r","Y","8"
> +"CMP r/m8, r8","CMPB r/m8, r8","cmpb r8, r/m8","38 /r","V","V","","","r,r","Y","8"
> +"CMP r/m8, r8","CMPB r/m8, r8","cmpb r8, r/m8","REX 38 /r","N.E.","V","","pseudo64","r,r","Y","8"
> +"CMP r8, r/m8","CMPB r8, r/m8","cmpb r/m8, r8","3A /r","V","V","","","r,r","Y","8"
> +"CMP r8, r/m8","CMPB r8, r/m8","cmpb r/m8, r8","REX 3A /r","N.E.","V","","pseudo64","r,r","Y","8"
> +"CMP EAX, imm32","CMPL EAX, imm32","cmpl imm32, EAX","3D id","V","V","","operand32","r,r","Y","32"
> +"CMP r/m32, imm32","CMPL r/m32, imm32","cmpl imm32, r/m32","81 /7 id","V","V","","operand32","r,r","Y","32"
> +"CMP r/m32, imm8","CMPL r/m32, imm8","cmpl imm8, r/m32","83 /7 ib","V","V","","operand32","r,r","Y","32"
> +"CMP r/m32, r32","CMPL r/m32, r32","cmpl r32, r/m32","39 /r","V","V","","operand32","r,r","Y","32"
> +"CMP r32, r/m32","CMPL r32, r/m32","cmpl r/m32, r32","3B /r","V","V","","operand32","r,r","Y","32"
> +"CMPPD xmm1, xmm2/m128, imm8u","CMPPD imm8u, xmm1, xmm2/m128","cmppd imm8u, xmm2/m128, xmm1","66 0F C2 /r ib","V","V","SSE2","","rw,r,r","",""
> +"CMPPS xmm1, xmm2/m128, imm8u","CMPPS imm8u, xmm1, xmm2/m128","cmpps imm8u, xmm2/m128, xmm1","0F C2 /r ib","V","V","SSE","","rw,r,r","",""
> +"CMP RAX, imm32","CMPQ RAX, imm32","cmpq imm32, RAX","REX.W 3D id","N.S.","V","","","r,r","Y","64"
> +"CMP r/m64, imm32","CMPQ r/m64, imm32","cmpq imm32, r/m64","REX.W 81 /7 id","N.S.","V","","","r,r","Y","64"
> +"CMP r/m64, imm8","CMPQ r/m64, imm8","cmpq imm8, r/m64","REX.W 83 /7 ib","N.S.","V","","","r,r","Y","64"
> +"CMP r/m64, r64","CMPQ r/m64, r64","cmpq r64, r/m64","REX.W 39 /r","N.S.","V","","","r,r","Y","64"
> +"CMP r64, r/m64","CMPQ r64, r/m64","cmpq r/m64, r64","REX.W 3B /r","N.S.","V","","","r,r","Y","64"
> +"CMPSB","CMPSB","cmpsb","A6","V","V","","","","",""
> +"CMPSD xmm1, xmm2/m64, imm8u","CMPSD imm8u, xmm1, xmm2/m64","cmpsd imm8u, xmm2/m64, xmm1","F2 0F C2 /r ib","V","V","SSE2","","rw,r,r","",""
> +"CMPSD","CMPSL","cmpsl","A7","V","V","","operand32","","",""
> +"CMPSQ","CMPSQ","cmpsq","REX.W A7","N.S.","V","","","","",""
> +"CMPSS xmm1, xmm2/m32, imm8u","CMPSS imm8u, xmm1, xmm2/m32","cmpss imm8u, xmm2/m32, xmm1","F3 0F C2 /r ib","V","V","SSE","","rw,r,r","",""
> +"CMPSW","CMPSW","cmpsw","A7","V","V","","operand16","","",""
> +"CMP AX, imm16","CMPW AX, imm16","cmpw imm16, AX","3D iw","V","V","","operand16","r,r","Y","16"
> +"CMP r/m16, imm16","CMPW r/m16, imm16","cmpw imm16, r/m16","81 /7 iw","V","V","","operand16","r,r","Y","16"
> +"CMP r/m16, imm8","CMPW r/m16, imm8","cmpw imm8, r/m16","83 /7 ib","V","V","","operand16","r,r","Y","16"
> +"CMP r/m16, r16","CMPW r/m16, r16","cmpw r16, r/m16","39 /r","V","V","","operand16","r,r","Y","16"
> +"CMP r16, r/m16","CMPW r16, r/m16","cmpw r/m16, r16","3B /r","V","V","","operand16","r,r","Y","16"
> +"CMPXCHG16B m128","CMPXCHG16B m128","cmpxchg16b m128","REX.W 0F C7 /1","N.S.","V","","modrm_memonly","rw","",""
> +"CMPXCHG8B m64","CMPXCHG8B m64","cmpxchg8b m64","0F C7 /1","V","V","Pentium","modrm_memonly,operand16,operand32","rw","",""
> +"CMPXCHG r/m8, r8","CMPXCHGB r8, r/m8","cmpxchgb r8, r/m8","0F B0 /r","V","V","486","","rw,r","Y","8"
> +"CMPXCHG r/m8, r8","CMPXCHGB r8, r/m8","cmpxchgb r8, r/m8","REX 0F B0 /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"CMPXCHG r/m32, r32","CMPXCHGL r32, r/m32","cmpxchgl r32, r/m32","0F B1 /r","V","V","486","operand32","rw,r","Y","32"
> +"CMPXCHG r/m64, r64","CMPXCHGQ r64, r/m64","cmpxchgq r64, r/m64","REX.W 0F B1 /r","N.S.","V","486","","rw,r","Y","64"
> +"CMPXCHG r/m16, r16","CMPXCHGW r16, r/m16","cmpxchgw r16, r/m16","0F B1 /r","V","V","486","operand16","rw,r","Y","16"
> +"COMISD xmm1, xmm2/m64","COMISD xmm2/m64, xmm1","comisd xmm2/m64, xmm1","66 0F 2F /r","V","V","SSE2","","r,r","",""
> +"COMISS xmm1, xmm2/m32","COMISS xmm2/m32, xmm1","comiss xmm2/m32, xmm1","0F 2F /r","V","V","SSE","","r,r","",""
> +"CPUID","CPUID","cpuid","0F A2","V","V","486","","","",""
> +"CQO","CQO","cqto","REX.W 99","N.S.","V","","","","",""
> +"CRC32 r32, r/m8","CRC32B r/m8, r32","crc32b r/m8, r32","F2 0F 38 F0 /r","V","V","SSE4_2","operand16,operand32","rw,r","Y","8"
> +"CRC32 r32, r/m8","CRC32B r/m8, r32","crc32b r/m8, r32","F2 REX 0F 38 F0 /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"CRC32 r64, r/m8","CRC32B r/m8, r64","crc32b r/m8, r64","F2 REX.W 0F 38 F0 /r","N.S.","V","SSE4_2","","rw,r","Y","8"
> +"CRC32 r32, r/m32","CRC32L r/m32, r32","crc32l r/m32, r32","F2 0F 38 F1 /r","V","V","SSE4_2","operand32","rw,r","Y","32"
> +"CRC32 r64, r/m64","CRC32Q r/m64, r64","crc32q r/m64, r64","F2 REX.W 0F 38 F1 /r","N.S.","V","SSE4_2","","rw,r","Y","64"
> +"CRC32 r32, r/m16","CRC32W r/m16, r32","crc32w r/m16, r32","F2 0F 38 F1 /r","V","V","SSE4_2","operand16","rw,r","Y","16"
> +"CVTPD2PI mm1, xmm2/m128","CVTPD2PI xmm2/m128, mm1","cvtpd2pi xmm2/m128, mm1","66 0F 2D /r","V","V","SSE2","","w,r","",""
> +"CVTPD2DQ xmm1, xmm2/m128","CVTPD2PL xmm2/m128, xmm1","cvtpd2dq xmm2/m128, xmm1","F2 0F E6 /r","V","V","SSE2","","w,r","",""
> +"CVTPD2PS xmm1, xmm2/m128","CVTPD2PS xmm2/m128, xmm1","cvtpd2ps xmm2/m128, xmm1","66 0F 5A /r","V","V","SSE2","","w,r","",""
> +"CVTPI2PD xmm1, mm2/m64","CVTPI2PD mm2/m64, xmm1","cvtpi2pd mm2/m64, xmm1","66 0F 2A /r","V","V","SSE2","","w,r","",""
> +"CVTPI2PS xmm1, mm2/m64","CVTPI2PS mm2/m64, xmm1","cvtpi2ps mm2/m64, xmm1","0F 2A /r","V","V","SSE","","w,r","",""
> +"CVTDQ2PD xmm1, xmm2/m64","CVTPL2PD xmm2/m64, xmm1","cvtdq2pd xmm2/m64, xmm1","F3 0F E6 /r","V","V","SSE2","","w,r","",""
> +"CVTDQ2PS xmm1, xmm2/m128","CVTPL2PS xmm2/m128, xmm1","cvtdq2ps xmm2/m128, xmm1","0F 5B /r","V","V","SSE2","","w,r","",""
> +"CVTPS2PD xmm1, xmm2/m64","CVTPS2PD xmm2/m64, xmm1","cvtps2pd xmm2/m64, xmm1","0F 5A /r","V","V","SSE2","","w,r","",""
> +"CVTPS2PI mm1, xmm2/m64","CVTPS2PI xmm2/m64, mm1","cvtps2pi xmm2/m64, mm1","0F 2D /r","V","V","SSE","","w,r","",""
> +"CVTPS2DQ xmm1, xmm2/m128","CVTPS2PL xmm2/m128, xmm1","cvtps2dq xmm2/m128, xmm1","66 0F 5B /r","V","V","SSE2","","w,r","",""
> +"CVTSD2SI r32, xmm2/m64","CVTSD2SL xmm2/m64, r32","cvtsd2si xmm2/m64, r32","F2 0F 2D /r","V","V","SSE2","operand16,operand32","w,r","Y","32"
> +"CVTSD2SI r64, xmm2/m64","CVTSD2SL xmm2/m64, r64","cvtsd2siq xmm2/m64, r64","F2 REX.W 0F 2D /r","N.S.","V","SSE2","","w,r","Y","64"
> +"CVTSD2SS xmm1, xmm2/m64","CVTSD2SS xmm2/m64, xmm1","cvtsd2ss xmm2/m64, xmm1","F2 0F 5A /r","V","V","SSE2","","w,r","",""
> +"CVTSI2SD xmm1, r/m32","CVTSL2SD r/m32, xmm1","cvtsi2sdl r/m32, xmm1","F2 0F 2A /r","V","V","SSE2","operand16,operand32","w,r","Y","32"
> +"CVTSI2SS xmm1, r/m32","CVTSL2SS r/m32, xmm1","cvtsi2ssl r/m32, xmm1","F3 0F 2A /r","V","V","SSE","operand16,operand32","w,r","Y","32"
> +"CVTSI2SD xmm1, r/m64","CVTSQ2SD r/m64, xmm1","cvtsi2sdq r/m64, xmm1","F2 REX.W 0F 2A /r","N.S.","V","SSE2","","w,r","Y","64"
> +"CVTSI2SS xmm1, r/m64","CVTSQ2SS r/m64, xmm1","cvtsi2ssq r/m64, xmm1","F3 REX.W 0F 2A /r","N.S.","V","SSE","","w,r","Y","64"
> +"CVTSS2SD xmm1, xmm2/m32","CVTSS2SD xmm2/m32, xmm1","cvtss2sd xmm2/m32, xmm1","F3 0F 5A /r","V","V","SSE2","","w,r","",""
> +"CVTSS2SI r32, xmm2/m32","CVTSS2SL xmm2/m32, r32","cvtss2si xmm2/m32, r32","F3 0F 2D /r","V","V","SSE","operand16,operand32","w,r","Y","32"
> +"CVTSS2SI r64, xmm2/m32","CVTSS2SL xmm2/m32, r64","cvtss2siq xmm2/m32, r64","F3 REX.W 0F 2D /r","N.S.","V","SSE","","w,r","Y","64"
> +"CVTTPD2PI mm1, xmm2/m128","CVTTPD2PI xmm2/m128, mm1","cvttpd2pi xmm2/m128, mm1","66 0F 2C /r","V","V","SSE2","","w,r","",""
> +"CVTTPD2DQ xmm1, xmm2/m128","CVTTPD2PL xmm2/m128, xmm1","cvttpd2dq xmm2/m128, xmm1","66 0F E6 /r","V","V","SSE2","","w,r","",""
> +"CVTTPS2PI mm1, xmm2/m64","CVTTPS2PI xmm2/m64, mm1","cvttps2pi xmm2/m64, mm1","0F 2C /r","V","V","SSE","","w,r","",""
> +"CVTTPS2DQ xmm1, xmm2/m128","CVTTPS2PL xmm2/m128, xmm1","cvttps2dq xmm2/m128, xmm1","F3 0F 5B /r","V","V","SSE2","","w,r","",""
> +"CVTTSD2SI r32, xmm2/m64","CVTTSD2SL xmm2/m64, r32","cvttsd2si xmm2/m64, r32","F2 0F 2C /r","V","V","SSE2","operand16,operand32","w,r","Y","32"
> +"CVTTSD2SI r64, xmm2/m64","CVTTSD2SL xmm2/m64, r64","cvttsd2siq xmm2/m64, r64","F2 REX.W 0F 2C /r","N.S.","V","SSE2","","w,r","Y","64"
> +"CVTTSS2SI r32, xmm2/m32","CVTTSS2SL xmm2/m32, r32","cvttss2si xmm2/m32, r32","F3 0F 2C /r","V","V","SSE","operand16,operand32","w,r","Y","32"
> +"CVTTSS2SI r64, xmm2/m32","CVTTSS2SL xmm2/m32, r64","cvttss2siq xmm2/m32, r64","F3 REX.W 0F 2C /r","N.S.","V","SSE","","w,r","Y","64"
> +"CWD","CWD","cwtd","99","V","V","","operand16","","",""
> +"CWDE","CWDE","cwtl","98","V","V","","operand32","","",""
> +"DAA","DAA","daa","27","V","N.S.","","","","",""
> +"DAS","DAS","das","2F","V","N.S.","","","","",""
> +"DEC r/m8","DECB r/m8","decb r/m8","FE /1","V","V","","","rw","Y","8"
> +"DEC r/m8","DECB r/m8","decb r/m8","REX FE /1","N.E.","V","","pseudo64","rw","Y","8"
> +"DEC r/m32","DECL r/m32","decl r/m32","FF /1","V","V","","operand32","rw","Y","32"
> +"DEC r32op","DECL r32op","decl r32op","48+rd","V","N.S.","","operand32","rw","Y","32"
> +"DEC r/m64","DECQ r/m64","decq r/m64","REX.W FF /1","N.S.","V","","","rw","Y","64"
> +"DEC r/m16","DECW r/m16","decw r/m16","FF /1","V","V","","operand16","rw","Y","16"
> +"DEC r16op","DECW r16op","decw r16op","48+rw","V","N.S.","","operand16","rw","Y","16"
> +"DIV r/m8","DIVB r/m8","divb r/m8","F6 /6","V","V","","","r","Y","8"
> +"DIV r/m8","DIVB r/m8","divb r/m8","REX F6 /6","N.E.","V","","pseudo64","w","Y","8"
> +"DIV r/m32","DIVL r/m32","divl r/m32","F7 /6","V","V","","operand32","r","Y","32"
> +"DIVPD xmm1, xmm2/m128","DIVPD xmm2/m128, xmm1","divpd xmm2/m128, xmm1","66 0F 5E /r","V","V","SSE2","","rw,r","",""
> +"DIVPS xmm1, xmm2/m128","DIVPS xmm2/m128, xmm1","divps xmm2/m128, xmm1","0F 5E /r","V","V","SSE","","rw,r","",""
> +"DIV r/m64","DIVQ r/m64","divq r/m64","REX.W F7 /6","N.S.","V","","","r","Y","64"
> +"DIVSD xmm1, xmm2/m64","DIVSD xmm2/m64, xmm1","divsd xmm2/m64, xmm1","F2 0F 5E /r","V","V","SSE2","","rw,r","",""
> +"DIVSS xmm1, xmm2/m32","DIVSS xmm2/m32, xmm1","divss xmm2/m32, xmm1","F3 0F 5E /r","V","V","SSE","","rw,r","",""
> +"DIV r/m16","DIVW r/m16","divw r/m16","F7 /6","V","V","","operand16","r","Y","16"
> +"DPPD xmm1, xmm2/m128, imm8u","DPPD imm8u, xmm2/m128, xmm1","dppd imm8u, xmm2/m128, xmm1","66 0F 3A 41 /r ib","V","V","SSE4_1","","rw,r,r","",""
> +"DPPS xmm1, xmm2/m128, imm8u","DPPS imm8u, xmm2/m128, xmm1","dpps imm8u, xmm2/m128, xmm1","66 0F 3A 40 /r ib","V","V","SSE4_1","","rw,r,r","",""
> +"EMMS","EMMS","emms","0F 77","V","V","MMX","","","",""
> +"ENCLS","ENCLS","encls","0F 01 CF","V","V","","","","",""
> +"ENCLU","ENCLU","enclu","0F 01 D7","V","V","","","","",""
> +"ENDBR32","ENDBR32","endbr32","F3 0F 1E FB","V","V","CET","","","",""
> +"ENDBR64","ENDBR64","endbr64","F3 0F 1E FA","V","V","CET","","","Y",""
> +"ENTER imm16, 0","ENTER 0, imm16","enter imm16, 0","C8 iw 00","V","V","","pseudo","r,r","",""
> +"ENTER imm16, 1","ENTER 1, imm16","enter imm16, 1","C8 iw 01","V","V","","pseudo","r,r","",""
> +"ENTER imm16, imm8b","ENTERW/ENTERL/ENTERQ imm8b, imm16","enterw/enterl/enterq imm16, imm8b","C8 iw ib","V","V","","","r,r","",""
> +"EXTRACTPS r/m32, xmm1, imm8u:2","EXTRACTPS imm8u:2, xmm1, r/m32","extractps imm8u:2, xmm1, r/m32","66 0F 3A 17 /r ib","V","V","SSE4_1","","w,r,r","",""
> +"EXTRQ xmm1, imm8u, imm8u","EXTRQ imm8u, imm8u, xmm1","extrq imm8u, imm8u, xmm1","66 0F 78 /0 ib ib","V","V","SSE4a","amd,modrm_regonly","w,r,r","",""
> +"EXTRQ xmm1, xmm2","EXTRQ xmm2, xmm1","extrq xmm2, xmm1","66 0F 79 /r","V","V","SSE4a","amd,modrm_regonly","w,r","",""
> +"F2XM1","F2XM1","f2xm1","D9 F0","V","V","","","","",""
> +"FABS","FABS","fabs","D9 E1","V","V","","","","",""
> +"FADD ST(i), ST(0)","FADDD ST(0), ST(i)","fadd ST(0), ST(i)","DC C0+i","V","V","","","rw,r","Y",""
> +"FADD ST(0), ST(i)","FADDD ST(i), ST(0)","fadd ST(i), ST(0)","D8 C0+i","V","V","","","rw,r","Y",""
> +"FADD ST(0), m32fp","FADDD m32fp, ST(0)","fadds m32fp, ST(0)","D8 /0","V","V","","","rw,r","Y","32"
> +"FADD ST(0), m64fp","FADDD m64fp, ST(0)","faddl m64fp, ST(0)","DC /0","V","V","","","rw,r","Y","64"
> +"FADDP","FADDDP","faddp","DE C1","V","V","","pseudo","","",""
> +"FADDP ST(i), ST(0)","FADDDP ST(0), ST(i)","faddp ST(0), ST(i)","DE C0+i","V","V","","","rw,r","",""
> +"FBLD ST(0), m80dec","FBLD m80dec, ST(0)","fbld m80dec, ST(0)","DF /4","V","V","","","w,r","",""
> +"FBSTP m80dec, ST(0)","FBSTP ST(0), m80dec","fbstp ST(0), m80dec","DF /6","V","V","","","w,r","",""
> +"FCHS","FCHS","fchs","D9 E0","V","V","","","","",""
> +"FCLEX","FCLEX","fclex","9B DB E2","V","V","","pseudo","","",""
> +"FCMOVB ST(0), ST(i)","FCMOVB ST(i), ST(0)","fcmovb ST(i), ST(0)","DA C0+i","V","V","","P6","rw,r","",""
> +"FCMOVBE ST(0), ST(i)","FCMOVBE ST(i), ST(0)","fcmovbe ST(i), ST(0)","DA D0+i","V","V","","P6","rw,r","",""
> +"FCMOVE ST(0), ST(i)","FCMOVE ST(i), ST(0)","fcmove ST(i), ST(0)","DA C8+i","V","V","","P6","rw,r","",""
> +"FCMOVNB ST(0), ST(i)","FCMOVNB ST(i), ST(0)","fcmovnb ST(i), ST(0)","DB C0+i","V","V","","P6","rw,r","",""
> +"FCMOVNBE ST(0), ST(i)","FCMOVNBE ST(i), ST(0)","fcmovnbe ST(i), ST(0)","DB D0+i","V","V","","P6","rw,r","",""
> +"FCMOVNE ST(0), ST(i)","FCMOVNE ST(i), ST(0)","fcmovne ST(i), ST(0)","DB C8+i","V","V","","P6","rw,r","",""
> +"FCMOVNU ST(0), ST(i)","FCMOVNU ST(i), ST(0)","fcmovnu ST(i), ST(0)","DB D8+i","V","V","","P6","rw,r","",""
> +"FCMOVU ST(0), ST(i)","FCMOVU ST(i), ST(0)","fcmovu ST(i), ST(0)","DA D8+i","V","V","","P6","rw,r","",""
> +"FCOM","FCOMD","fcom","D8 D1","V","V","","pseudo","","Y",""
> +"FCOM ST(0), ST(i)","FCOMD ST(i), ST(0)","fcom ST(i), ST(0)","D8 D0+i","V","V","","","r,r","Y",""
> +"FCOM ST(0), ST(i)","FCOMD ST(i), ST(0)","fcom ST(i), ST(0)","DC D0+i","V","V","","","r,r","Y",""
> +"FCOM ST(0), m32fp","FCOMD m32fp, ST(0)","fcoms m32fp, ST(0)","D8 /2","V","V","","","r,r","Y","32"
> +"FCOM ST(0), m64fp","FCOMD m64fp, ST(0)","fcoml m64fp, ST(0)","DC /2","V","V","","","r,r","Y","64"
> +"FCOMP ST(0), m32fp","FCOMFP m32fp, ST(0)","fcomps m32fp, ST(0)","D8 /3","V","V","","","r,r","Y","32"
> +"FCOMI ST(0), ST(i)","FCOMI ST(i), ST(0)","fcomi ST(i), ST(0)","DB F0+i","V","V","PPRO","P6","r,r","",""
> +"FCOMIP ST(0), ST(i)","FCOMIP ST(i), ST(0)","fcomip ST(i), ST(0)","DF F0+i","V","V","PPRO","P6","r,r","",""
> +"FCOMP","FCOMP","fcomp","D8 D9","V","V","","pseudo","","Y",""
> +"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","D8 D8+i","V","V","","","r,r","Y",""
> +"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","DC D8+i","V","V","","","r,r","Y",""
> +"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","DE D0+i","V","V","","","r,r","Y",""
> +"FCOMP ST(0), m64fp","FCOMPL m64fp, ST(0)","fcompl m64fp, ST(0)","DC /3","V","V","","","r,r","Y","64"
> +"FCOMPP","FCOMPP","fcompp","DE D9","V","V","","","","",""
> +"FCOS","FCOS","fcos","D9 FF","V","V","","","","",""
> +"FDECSTP","FDECSTP","fdecstp","D9 F6","V","V","","","","",""
> +"FDISI8087_NOP","FDISI8087_NOP","fdisi8087_nop","DB E1","V","V","","","","",""
> +"FDIVR ST(i), ST(0)","FDIVD ST(0), ST(i)","fdiv ST(0), ST(i)","DC F0+i","V","V","","","rw,r","Y",""
> +"FDIV ST(i), ST(0)","FDIVD ST(0), ST(i)","fdivr ST(0), ST(i)","DC F8+i","V","V","","","rw,r","Y",""
> +"FDIV ST(0), ST(i)","FDIVD ST(i), ST(0)","fdiv ST(i), ST(0)","D8 F0+i","V","V","","","rw,r","Y",""
> +"FDIV ST(0), m32fp","FDIVD m32fp, ST(0)","fdivs m32fp, ST(0)","D8 /6","V","V","","","rw,r","Y","32"
> +"FDIV ST(0), m64fp","FDIVD m64fp, ST(0)","fdivl m64fp, ST(0)","DC /6","V","V","","","rw,r","Y","64"
> +"FDIVR ST(0), m32fp","FDIVFR m32fp, ST(0)","fdivrs m32fp, ST(0)","D8 /7","V","V","","","rw,r","Y","32"
> +"FDIVP","FDIVP","fdivp","DE F9","V","V","","pseudo","","",""
> +"FDIVRP ST(i), ST(0)","FDIVP ST(0), ST(i)","fdivp ST(0), ST(i)","DE F0+i","V","V","","","rw,r","",""
> +"FDIVR ST(0), ST(i)","FDIVR ST(i), ST(0)","fdivr ST(i), ST(0)","D8 F8+i","V","V","","","rw,r","Y",""
> +"FDIVR ST(0), m64fp","FDIVRL m64fp, ST(0)","fdivrl m64fp, ST(0)","DC /7","V","V","","","rw,r","Y","64"
> +"FDIVRP","FDIVRP","fdivrp","DE F1","V","V","","pseudo","","",""
> +"FDIVP ST(i), ST(0)","FDIVRP ST(0), ST(i)","fdivrp ST(0), ST(i)","DE F8+i","V","V","","","rw,r","",""
> +"FEMMS","FEMMS","femms","0F 0E","V","V","3DNOW","amd","","",""
> +"FENI8087_NOP","FENI8087_NOP","feni8087_nop","DB E0","V","V","","","","",""
> +"FFREE ST(i)","FFREE ST(i)","ffree ST(i)","DD C0+i","V","V","","","r","",""
> +"FFREEP ST(i)","FFREEP ST(i)","ffreep ST(i)","DF C0+i","V","V","","","r","",""
> +"FIADD ST(0), m16int","FIADD m16int, ST(0)","fiadd m16int, ST(0)","DE /0","V","V","","","rw,r","Y",""
> +"FIADD ST(0), m32int","FIADDL m32int, ST(0)","fiaddl m32int, ST(0)","DA /0","V","V","","","rw,r","Y","32"
> +"FICOM ST(0), m16int","FICOM m16int, ST(0)","ficom m16int, ST(0)","DE /2","V","V","","","r,r","Y",""
> +"FICOM ST(0), m32int","FICOML m32int, ST(0)","ficoml m32int, ST(0)","DA /2","V","V","","","r,r","Y","32"
> +"FICOMP ST(0), m16int","FICOMP m16int, ST(0)","ficomp m16int, ST(0)","DE /3","V","V","","","r,r","Y",""
> +"FICOMP ST(0), m32int","FICOMPL m32int, ST(0)","ficompl m32int, ST(0)","DA /3","V","V","","","r,r","Y","32"
> +"FIDIV ST(0), m16int","FIDIV m16int, ST(0)","fidiv m16int, ST(0)","DE /6","V","V","","","rw,r","Y",""
> +"FIDIV ST(0), m32int","FIDIVL m32int, ST(0)","fidivl m32int, ST(0)","DA /6","V","V","","","rw,r","Y","32"
> +"FIDIVR ST(0), m16int","FIDIVR m16int, ST(0)","fidivr m16int, ST(0)","DE /7","V","V","","","rw,r","Y",""
> +"FIDIVR ST(0), m32int","FIDIVRL m32int, ST(0)","fidivrl m32int, ST(0)","DA /7","V","V","","","rw,r","Y","32"
> +"FILD ST(0), m16int","FILD m16int, ST(0)","fild m16int, ST(0)","DF /0","V","V","","","w,r","Y",""
> +"FILD ST(0), m32int","FILDL m32int, ST(0)","fildl m32int, ST(0)","DB /0","V","V","","","w,r","Y","32"
> +"FILD ST(0), m64int","FILDLL m64int, ST(0)","fildll m64int, ST(0)","DF /5","V","V","","","w,r","Y","64"
> +"FIMUL ST(0), m16int","FIMUL m16int, ST(0)","fimul m16int, ST(0)","DE /1","V","V","","","rw,r","Y",""
> +"FIMUL ST(0), m32int","FIMULL m32int, ST(0)","fimull m32int, ST(0)","DA /1","V","V","","","rw,r","Y","32"
> +"FINCSTP","FINCSTP","fincstp","D9 F7","V","V","","","","",""
> +"FINIT","FINIT","finit","9B DB E3","V","V","","pseudo","","",""
> +"FIST m16int, ST(0)","FIST ST(0), m16int","fist ST(0), m16int","DF /2","V","V","","","w,r","Y",""
> +"FIST m32int, ST(0)","FISTL ST(0), m32int","fistl ST(0), m32int","DB /2","V","V","","","w,r","Y","32"
> +"FISTP m16int, ST(0)","FISTP ST(0), m16int","fistp ST(0), m16int","DF /3","V","V","","","w,r","Y",""
> +"FISTP m32int, ST(0)","FISTPL ST(0), m32int","fistpl ST(0), m32int","DB /3","V","V","","","w,r","Y","32"
> +"FISTP m64int, ST(0)","FISTPLL ST(0), m64int","fistpll ST(0), m64int","DF /7","V","V","","","w,r","Y","64"
> +"FISTTP m16int, ST(0)","FISTTP ST(0), m16int","fisttp ST(0), m16int","DF /1","V","V","SSE3","modrm_memonly","w,r","Y",""
> +"FISTTP m32int, ST(0)","FISTTPL ST(0), m32int","fisttpl ST(0), m32int","DB /1","V","V","SSE3","modrm_memonly","w,r","Y","32"
> +"FISTTP m64int, ST(0)","FISTTPLL ST(0), m64int","fisttpll ST(0), m64int","DD /1","V","V","SSE3","modrm_memonly","w,r","Y","64"
> +"FISUB ST(0), m16int","FISUB m16int, ST(0)","fisub m16int, ST(0)","DE /4","V","V","","","rw,r","Y",""
> +"FISUB ST(0), m32int","FISUBL m32int, ST(0)","fisubl m32int, ST(0)","DA /4","V","V","","","rw,r","Y","32"
> +"FISUBR ST(0), m16int","FISUBR m16int, ST(0)","fisubr m16int, ST(0)","DE /5","V","V","","","rw,r","Y",""
> +"FISUBR ST(0), m32int","FISUBRL m32int, ST(0)","fisubrl m32int, ST(0)","DA /5","V","V","","","rw,r","Y","32"
> +"FLD ST(0), ST(i)","FLD ST(i), ST(0)","fld ST(i), ST(0)","D9 C0+i","V","V","","","w,r","Y",""
> +"FLD1","FLD1","fld1","D9 E8","V","V","","","","",""
> +"FLDCW m2byte","FLDCW m2byte","fldcw m2byte","D9 /5","V","V","","","r","",""
> +"FLDENV m28byte","FLDENV m28byte","fldenv m28byte","D9 /4","V","V","","operand32,operand64","r","",""
> +"FLDENV m14byte","FLDENVS m14byte","fldenv m14byte","D9 /4","V","V","","operand16","r","",""
> +"FLD ST(0), m64fp","FLDL m64fp, ST(0)","fldl m64fp, ST(0)","DD /0","V","V","","","w,r","Y","64"
> +"FLDL2E","FLDL2E","fldl2e","D9 EA","V","V","","","","",""
> +"FLDL2T","FLDL2T","fldl2t","D9 E9","V","V","","","","",""
> +"FLDLG2","FLDLG2","fldlg2","D9 EC","V","V","","","","",""
> +"FLDLN2","FLDLN2","fldln2","D9 ED","V","V","","","","",""
> +"FLDPI","FLDPI","fldpi","D9 EB","V","V","","","","",""
> +"FLD ST(0), m32fp","FLDS m32fp, ST(0)","flds m32fp, ST(0)","D9 /0","V","V","","","w,r","Y","32"
> +"FLD ST(0), m80fp","FLDT m80fp, ST(0)","fldt m80fp, ST(0)","DB /5","V","V","","","w,r","Y","80"
> +"FLDZ","FLDZ","fldz","D9 EE","V","V","","","","",""
> +"FMUL ST(i), ST(0)","FMUL ST(0), ST(i)","fmul ST(0), ST(i)","DC C8+i","V","V","","","rw,r","Y",""
> +"FMUL ST(0), ST(i)","FMUL ST(i), ST(0)","fmul ST(i), ST(0)","D8 C8+i","V","V","","","rw,r","Y",""
> +"FMUL ST(0), m64fp","FMULL m64fp, ST(0)","fmull m64fp, ST(0)","DC /1","V","V","","","rw,r","Y","64"
> +"FMULP","FMULP","fmulp","DE C9","V","V","","pseudo","","",""
> +"FMULP ST(i), ST(0)","FMULP ST(0), ST(i)","fmulp ST(0), ST(i)","DE C8+i","V","V","","","rw,r","",""
> +"FMUL ST(0), m32fp","FMULS m32fp, ST(0)","fmuls m32fp, ST(0)","D8 /1","V","V","","","rw,r","Y","32"
> +"FNCLEX","FNCLEX","fnclex","DB E2","V","V","","","","",""
> +"FNINIT","FNINIT","fninit","DB E3","V","V","","","","",""
> +"FNOP","FNOP","fnop","D9 D0","V","V","","","","",""
> +"FNSAVE m108byte","FNSAVE m108byte","fnsave m108byte","DD /6","V","V","","operand32,operand64","w","",""
> +"FNSAVE m94byte","FNSAVES m94byte","fnsave m94byte","DD /6","V","V","","operand16","w","",""
> +"FNSTCW m2byte","FNSTCW m2byte","fnstcw m2byte","D9 /7","V","V","","","w","",""
> +"FNSTENV m28byte","FNSTENV m28byte","fnstenv m28byte","D9 /6","V","V","","operand32,operand64","w","",""
> +"FNSTENV m14byte","FNSTENVS m14byte","fnstenv m14byte","D9 /6","V","V","","operand16","w","",""
> +"FNSTSW AX","FNSTSW AX","fnstsw AX","DF E0","V","V","","","w","",""
> +"FNSTSW m2byte","FNSTSW m2byte","fnstsw m2byte","DD /7","V","V","","","w","",""
> +"FPATAN","FPATAN","fpatan","D9 F3","V","V","","","","",""
> +"FPREM","FPREM","fprem","D9 F8","V","V","","","","",""
> +"FPREM1","FPREM1","fprem1","D9 F5","V","V","","","","",""
> +"FPTAN","FPTAN","fptan","D9 F2","V","V","","","","",""
> +"FRNDINT","FRNDINT","frndint","D9 FC","V","V","","","","",""
> +"FRSTOR m108byte","FRSTOR m108byte","frstor m108byte","DD /4","V","V","","operand32,operand64","r","",""
> +"FRSTOR m94byte","FRSTORS m94byte","frstor m94byte","DD /4","V","V","","operand16","r","",""
> +"FSAVE m94/108byte","FSAVE m94/108byte","fsave m94/108byte","9B DD /6","V","V","","pseudo","w","",""
> +"FSCALE","FSCALE","fscale","D9 FD","V","V","","","","",""
> +"FSETPM287_NOP","FSETPM287_NOP","fsetpm287_nop","DB E4","V","V","","","","",""
> +"FSIN","FSIN","fsin","D9 FE","V","V","","","","",""
> +"FSINCOS","FSINCOS","fsincos","D9 FB","V","V","","","","",""
> +"FSQRT","FSQRT","fsqrt","D9 FA","V","V","","","","",""
> +"FST ST(i), ST(0)","FST ST(0), ST(i)","fst ST(0), ST(i)","DD D0+i","V","V","","","w,r","Y",""
> +"FSTCW m2byte","FSTCW m2byte","fstcw m2byte","9B D9 /7","V","V","","pseudo","w","",""
> +"FSTENV m14/28byte","FSTENV m14/28byte","fstenv m14/28byte","9B D9 /6","V","V","","pseudo","w","",""
> +"FST m64fp, ST(0)","FSTL ST(0), m64fp","fstl ST(0), m64fp","DD /2","V","V","","","w,r","Y","64"
> +"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DD D8+i","V","V","","","w,r","Y",""
> +"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DF D0+i","V","V","","","w,r","Y",""
> +"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DF D8+i","V","V","","","w,r","Y",""
> +"FSTP m64fp, ST(0)","FSTPL ST(0), m64fp","fstpl ST(0), m64fp","DD /3","V","V","","","w,r","Y","64"
> +"FSTPNCE ST(i), ST(0)","FSTPNCE ST(0), ST(i)","fstpnce ST(0), ST(i)","D9 D8+i","V","V","","","w,r","",""
> +"FSTP m32fp, ST(0)","FSTPS ST(0), m32fp","fstps ST(0), m32fp","D9 /3","V","V","","","w,r","Y","32"
> +"FSTP m80fp, ST(0)","FSTPT ST(0), m80fp","fstpt ST(0), m80fp","DB /7","V","V","","","w,r","Y","80"
> +"FST m32fp, ST(0)","FSTS ST(0), m32fp","fsts ST(0), m32fp","D9 /2","V","V","","","w,r","Y","32"
> +"FSTSW AX","FSTSW AX","fstsw AX","9B DF E0","V","V","","pseudo","w","",""
> +"FSTSW m2byte","FSTSW m2byte","fstsw m2byte","9B DD /7","V","V","","pseudo","w","",""
> +"FSUBR ST(i), ST(0)","FSUB ST(0), ST(i)","fsub ST(0), ST(i)","DC E0+i","V","V","","","rw,r","Y",""
> +"FSUB ST(0), ST(i)","FSUB ST(i), ST(0)","fsub ST(i), ST(0)","D8 E0+i","V","V","","","rw,r","Y",""
> +"FSUB ST(0), m64fp","FSUBL m64fp, ST(0)","fsubl m64fp, ST(0)","DC /4","V","V","","","rw,r","Y","64"
> +"FSUBP","FSUBP","fsubp","DE E9","V","V","","pseudo","","",""
> +"FSUBRP ST(i), ST(0)","FSUBP ST(0), ST(i)","fsubp ST(0), ST(i)","DE E0+i","V","V","","","rw,r","",""
> +"FSUB ST(i), ST(0)","FSUBR ST(0), ST(i)","fsubr ST(0), ST(i)","DC E8+i","V","V","","","rw,r","Y",""
> +"FSUBR ST(0), ST(i)","FSUBR ST(i), ST(0)","fsubr ST(i), ST(0)","D8 E8+i","V","V","","","rw,r","Y",""
> +"FSUBR ST(0), m64fp","FSUBRL m64fp, ST(0)","fsubrl m64fp, ST(0)","DC /5","V","V","","","rw,r","Y","64"
> +"FSUBRP","FSUBRP","fsubrp","DE E1","V","V","","pseudo","","",""
> +"FSUBP ST(i), ST(0)","FSUBRP ST(0), ST(i)","fsubrp ST(0), ST(i)","DE E8+i","V","V","","","rw,r","",""
> +"FSUBR ST(0), m32fp","FSUBRS m32fp, ST(0)","fsubrs m32fp, ST(0)","D8 /5","V","V","","","rw,r","Y","32"
> +"FSUB ST(0), m32fp","FSUBS m32fp, ST(0)","fsubs m32fp, ST(0)","D8 /4","V","V","","","rw,r","Y","32"
> +"FTST","FTST","ftst","D9 E4","V","V","","","","",""
> +"FUCOM","FUCOM","fucom","DD E1","V","V","","pseudo","","",""
> +"FUCOM ST(0), ST(i)","FUCOM ST(i), ST(0)","fucom ST(i), ST(0)","DD E0+i","V","V","","","r,r","",""
> +"FUCOMI ST(0), ST(i)","FUCOMI ST(i), ST(0)","fucomi ST(i), ST(0)","DB E8+i","V","V","PPRO","P6","r,r","",""
> +"FUCOMIP ST(0), ST(i)","FUCOMIP ST(i), ST(0)","fucomip ST(i), ST(0)","DF E8+i","V","V","PPRO","P6","r,r","",""
> +"FUCOMP","FUCOMP","fucomp","DD E9","V","V","","pseudo","","",""
> +"FUCOMP ST(0), ST(i)","FUCOMP ST(i), ST(0)","fucomp ST(i), ST(0)","DD E8+i","V","V","","","r,r","",""
> +"FUCOMPP","FUCOMPP","fucompp","DA E9","V","V","","","","",""
> +"FWAIT","FWAIT","fwait","9B","V","V","","","","",""
> +"FXAM","FXAM","fxam","D9 E5","V","V","","","","",""
> +"FXCH","FXCH","fxch","D9 C9","V","V","","pseudo","","",""
> +"FXCH ST(0), ST(i)","FXCH ST(i), ST(0)","fxch ST(i), ST(0)","D9 C8+i","V","V","","","rw,rw","",""
> +"FXCH_ALIAS1 ST(0), ST(i)","FXCH_ALIAS1 ST(i), ST(0)","fxch_alias1 ST(i), ST(0)","DD C8+i","V","V","","","rw,rw","",""
> +"FXCH_ALIAS2 ST(0), ST(i)","FXCH_ALIAS2 ST(i), ST(0)","fxch_alias2 ST(i), ST(0)","DF C8+i","V","V","","","rw,rw","",""
> +"FXRSTOR m512byte","FXRSTOR m512byte","fxrstor m512byte","0F AE /1","V","V","","modrm_memonly,operand16,operand32","r","",""
> +"FXRSTOR64 m512byte","FXRSTOR64 m512byte","fxrstor64 m512byte","REX.W 0F AE /1","N.S.","V","","modrm_memonly","r","",""
> +"FXSAVE m512byte","FXSAVE m512byte","fxsave m512byte","0F AE /0","V","V","","modrm_memonly,operand16,operand32","w","",""
> +"FXSAVE64 m512byte","FXSAVE64 m512byte","fxsave64 m512byte","REX.W 0F AE /0","N.S.","V","","modrm_memonly","w","",""
> +"FXTRACT","FXTRACT","fxtract","D9 F4","V","V","","","","",""
> +"FYL2X","FYL2X","fyl2x","D9 F1","V","V","","","","",""
> +"FYL2XP1","FYL2XP1","fyl2xp1","D9 F9","V","V","","","","",""
> +"GETSEC","GETSEC","getsec","0F 37","V","V","SMX","","","",""
> +"GF2P8AFFINEINVQB xmm1, xmm2/m128, imm8u","GF2P8AFFINEINVQB imm8u, xmm2/m128, xmm1","gf2p8affineinvqb imm8u, xmm2/m128, xmm1","66 0F 3A CF /r ib","V","V","GFNI","","rw,r,r","",""
> +"GF2P8AFFINEQB xmm1, xmm2/m128, imm8u","GF2P8AFFINEQB imm8u, xmm2/m128, xmm1","gf2p8affineqb imm8u, xmm2/m128, xmm1","66 0F 3A CE /r ib","V","V","GFNI","","rw,r,r","",""
> +"GF2P8MULB xmm1, xmm2/m128","GF2P8MULB xmm2/m128, xmm1","gf2p8mulb xmm2/m128, xmm1","66 0F 38 CF /r","V","V","GFNI","","rw,r","",""
> +"HADDPD xmm1, xmm2/m128","HADDPD xmm2/m128, xmm1","haddpd xmm2/m128, xmm1","66 0F 7C /r","V","V","SSE3","","rw,r","",""
> +"HADDPS xmm1, xmm2/m128","HADDPS xmm2/m128, xmm1","haddps xmm2/m128, xmm1","F2 0F 7C /r","V","V","SSE3","","rw,r","",""
> +"HLT","HLT","hlt","F4","V","V","","","","",""
> +"HSUBPD xmm1, xmm2/m128","HSUBPD xmm2/m128, xmm1","hsubpd xmm2/m128, xmm1","66 0F 7D /r","V","V","SSE3","","rw,r","",""
> +"HSUBPS xmm1, xmm2/m128","HSUBPS xmm2/m128, xmm1","hsubps xmm2/m128, xmm1","F2 0F 7D /r","V","V","SSE3","","rw,r","",""
> +"ICEBP","ICEBP","icebp","F1","V","V","","","","",""
> +"IDIV r/m8","IDIVB r/m8","idivb r/m8","F6 /7","V","V","","","r","Y","8"
> +"IDIV r/m8","IDIVB r/m8","idivb r/m8","REX F6 /7","N.E.","V","","pseudo64","r","Y","8"
> +"IDIV r/m32","IDIVL r/m32","idivl r/m32","F7 /7","V","V","","operand32","r","Y","32"
> +"IDIV r/m64","IDIVQ r/m64","idivq r/m64","REX.W F7 /7","N.S.","V","","","r","Y","64"
> +"IDIV r/m16","IDIVW r/m16","idivw r/m16","F7 /7","V","V","","operand16","r","Y","16"
> +"IMUL r32, r/m32, imm32","IMUL3 imm32, r/m32, r32","imull imm32, r/m32, r32","69 /r id","V","V","","operand32","w,r,r","Y","32"
> +"IMUL r64, r/m64, imm32","IMUL3 imm32, r/m64, r64","imulq imm32, r/m64, r64","REX.W 69 /r id","N.S.","V","","","w,r,r","Y","64"
> +"IMUL r16, r/m16, imm8","IMUL3 imm8, r/m16, r16","imulw imm8, r/m16, r16","6B /r ib","V","V","","operand16","w,r,r","Y","16"
> +"IMUL r32, r/m32, imm8","IMUL3 imm8, r/m32, r32","imull imm8, r/m32, r32","6B /r ib","V","V","","operand32","w,r,r","Y","32"
> +"IMUL r64, r/m64, imm8","IMUL3 imm8, r/m64, r64","imulq imm8, r/m64, r64","REX.W 6B /r ib","N.S.","V","","","w,r,r","Y","64"
> +"IMUL r/m8","IMULB r/m8","imulb r/m8","F6 /5","V","V","","","r","Y","8"
> +"IMUL r/m32","IMULL r/m32","imull r/m32","F7 /5","V","V","","operand32","r","Y","32"
> +"IMUL r32, r/m32","IMULL r/m32, r32","imull r/m32, r32","0F AF /r","V","V","","operand32","rw,r","Y","32"
> +"IMUL r/m64","IMULQ r/m64","imulq r/m64","REX.W F7 /5","N.S.","V","","","r","Y","64"
> +"IMUL r64, r/m64","IMULQ r/m64, r64","imulq r/m64, r64","REX.W 0F AF /r","N.S.","V","","","rw,r","Y","64"
> +"IMUL r16, r/m16, imm16","IMULW imm16, r/m16, r16","imulw imm16, r/m16, r16","69 /r iw","V","V","","operand16","w,r,r","Y","16"
> +"IMUL r/m16","IMULW r/m16","imulw r/m16","F7 /5","V","V","","operand16","r","Y","16"
> +"IMUL r16, r/m16","IMULW r/m16, r16","imulw r/m16, r16","0F AF /r","V","V","","operand16","rw,r","Y","16"
> +"IN AL, DX","INB DX, AL","inb DX, AL","EC","V","V","","","w,r","Y","8"
> +"IN AL, imm8u","INB imm8u, AL","inb imm8u, AL","E4 ib","V","V","","","w,r","Y","8"
> +"INC r/m8","INCB r/m8","incb r/m8","FE /0","V","V","","","rw","Y","8"
> +"INC r/m8","INCB r/m8","incb r/m8","REX FE /0","N.E.","V","","pseudo64","rw","Y","8"
> +"INC r/m32","INCL r/m32","incl r/m32","FF /0","V","V","","operand32","rw","Y","32"
> +"INC r32op","INCL r32op","incl r32op","40+rd","V","N.S.","","operand32","rw","Y","32"
> +"INC r/m64","INCQ r/m64","incq r/m64","REX.W FF /0","N.S.","V","","","rw","Y","64"
> +"INCSSPD rmr32","INCSSPD rmr32","incsspd rmr32","F3 0F AE /5","V","V","CET","modrm_regonly,operand16,operand32","r","",""
> +"INCSSPQ rmr64","INCSSPQ rmr64","incsspq rmr64","F3 REX.W 0F AE /5","N.S.","V","CET","modrm_regonly","r","",""
> +"INC r/m16","INCW r/m16","incw r/m16","FF /0","V","V","","operand16","rw","Y","16"
> +"INC r16op","INCW r16op","incw r16op","40+rw","V","N.S.","","operand16","rw","Y","16"
> +"IN EAX, DX","INL DX, EAX","inl DX, EAX","ED","V","V","","operand32,operand64","w,r","Y","32"
> +"IN EAX, imm8u","INL imm8u, EAX","inl imm8u, EAX","E5 ib","V","V","","operand32,operand64","w,r","Y","32"
> +"INSB","INSB","insb","6C","V","V","","","","",""
> +"INSERTPS xmm1, xmm2/m32, imm8u","INSERTPS imm8u, xmm2/m32, xmm1","insertps imm8u, xmm2/m32, xmm1","66 0F 3A 21 /r ib","V","V","SSE4_1","","rw,r,r","",""
> +"INSERTQ xmm1, xmm2, imm8u, imm8u","INSERTQ imm8u, imm8u, xmm2, xmm1","insertq imm8u, imm8u, xmm2, xmm1","F2 0F 78 /r ib ib","V","V","SSE4a","amd,modrm_regonly","w,r,r,r","",""
> +"INSERTQ xmm1, xmm2","INSERTQ xmm2, xmm1","insertq xmm2, xmm1","F2 0F 79 /r","V","V","SSE4a","amd,modrm_regonly","w,r","",""
> +"INSD","INSL","insl","6D","V","V","","operand32,operand64","","",""
> +"INSW","INSW","insw","6D","V","V","","operand16","","",""
> +"INT 3","INT 3","int 3","CC","V","V","","","r","",""
> +"INT imm8u","INT imm8u","int imm8u","CD ib","V","V","","","r","",""
> +"INTO","INTO","into","CE","V","N.S.","","","","",""
> +"INVD","INVD","invd","0F 08","V","V","486","","","",""
> +"INVEPT r32, m128","INVEPT m128, r32","invept m128, r32","66 0F 38 80 /r","V","N.S.","VTX","modrm_memonly","r,r","",""
> +"INVEPT r64, m128","INVEPT m128, r64","invept m128, r64","66 0F 38 80 /r","N.S.","V","VTX","default64,modrm_memonly","r,r","",""
> +"INVLPG m","INVLPG m","invlpg m","0F 01 /7","V","V","486","modrm_memonly","r","",""
> +"INVLPGA EAX, ECX","INVLPGAL ECX, EAX","invlpgal ECX, EAX","0F 01 DF","V","V","SVM","amd,modrm_regonly,operand32","r,r","Y","32"
> +"INVLPGA RAX, ECX","INVLPGAQ ECX, RAX","invlpgaq ECX, RAX","REX.W 0F 01 DF","N.S.","V","SVM","amd,modrm_regonly","r,r","Y","64"
> +"INVLPGA AX, ECX","INVLPGAW ECX, AX","invlpgaw ECX, AX","0F 01 DF","V","V","SVM","amd,modrm_regonly,operand16","r,r","Y","16"
> +"INVPCID r32, m128","INVPCID m128, r32","invpcid m128, r32","66 0F 38 82 /r","V","N.S.","INVPCID","modrm_memonly","r,r","",""
> +"INVPCID r64, m128","INVPCID m128, r64","invpcid m128, r64","66 0F 38 82 /r","N.S.","V","INVPCID","default64,modrm_memonly","r,r","",""
> +"INVVPID r32, m128","INVVPID m128, r32","invvpid m128, r32","66 0F 38 81 /r","V","N.S.","VTX","modrm_memonly","r,r","",""
> +"INVVPID r64, m128","INVVPID m128, r64","invvpid m128, r64","66 0F 38 81 /r","N.S.","V","VTX","default64,modrm_memonly","r,r","",""
> +"IN AX, DX","INW DX, AX","inw DX, AX","ED","V","V","","operand16","w,r","Y","16"
> +"IN AX, imm8u","INW imm8u, AX","inw imm8u, AX","E5 ib","V","V","","operand16","w,r","Y","16"
> +"IRETD","IRETL","iretl","CF","V","V","","operand32","","",""
> +"IRETQ","IRETQ","iretq","REX.W CF","N.S.","V","","","","",""
> +"IRET","IRETW","iretw","CF","V","V","","operand16","","",""
> +"JA rel16","JA rel16","ja rel16","0F 87 cw","V","N.S.","","operand16","r","",""
> +"JA rel32","JA rel32","ja rel32","0F 87 cd","V","N.S.","","operand32","r","",""
> +"JA rel32","JA rel32","ja rel32","0F 87 cd","N.S.","V","","default64","r","",""
> +"JA rel8","JA rel8","ja rel8","77 cb","N.S.","V","","default64","r","",""
> +"JA rel8","JA rel8","ja rel8","77 cb","V","N.S.","","","r","",""
> +"JAE rel16","JAE rel16","jae rel16","0F 83 cw","V","N.S.","","operand16","r","",""
> +"JAE rel32","JAE rel32","jae rel32","0F 83 cd","N.S.","V","","default64","r","",""
> +"JAE rel32","JAE rel32","jae rel32","0F 83 cd","V","N.S.","","operand32","r","",""
> +"JAE rel8","JAE rel8","jae rel8","73 cb","V","N.S.","","","r","",""
> +"JAE rel8","JAE rel8","jae rel8","73 cb","N.S.","V","","default64","r","",""
> +"JB rel16","JB rel16","jb rel16","0F 82 cw","V","N.S.","","operand16","r","",""
> +"JB rel32","JB rel32","jb rel32","0F 82 cd","V","N.S.","","operand32","r","",""
> +"JB rel32","JB rel32","jb rel32","0F 82 cd","N.S.","V","","default64","r","",""
> +"JB rel8","JB rel8","jb rel8","72 cb","N.S.","V","","default64","r","",""
> +"JB rel8","JB rel8","jb rel8","72 cb","V","N.S.","","","r","",""
> +"JBE rel16","JBE rel16","jbe rel16","0F 86 cw","V","N.S.","","operand16","r","",""
> +"JBE rel32","JBE rel32","jbe rel32","0F 86 cd","V","N.S.","","operand32","r","",""
> +"JBE rel32","JBE rel32","jbe rel32","0F 86 cd","N.S.","V","","default64","r","",""
> +"JBE rel8","JBE rel8","jbe rel8","76 cb","V","N.S.","","","r","",""
> +"JBE rel8","JBE rel8","jbe rel8","76 cb","N.S.","V","","default64","r","",""
> +"JC rel16","JC rel16","jc rel16","0F 82 cw","V","N.S.","","pseudo","r","",""
> +"JC rel32","JC rel32","jc rel32","0F 82 cd","V","V","","pseudo","r","",""
> +"JC rel8","JC rel8","jc rel8","72 cb","V","V","","pseudo","r","",""
> +"JCXZ rel8","JCXZ rel8","jcxz rel8","E3 cb","V","N.S.","","address16","r","",""
> +"JE rel16","JE rel16","je rel16","0F 84 cw","V","N.S.","","operand16","r","",""
> +"JE rel32","JE rel32","je rel32","0F 84 cd","V","N.S.","","operand32","r","",""
> +"JE rel32","JE rel32","je rel32","0F 84 cd","N.S.","V","","default64","r","",""
> +"JE rel8","JE rel8","je rel8","74 cb","N.S.","V","","default64","r","",""
> +"JE rel8","JE rel8","je rel8","74 cb","V","N.S.","","","r","",""
> +"JECXZ rel8","JECXZ rel8","jecxz rel8","E3 cb","V","V","","address32","r","",""
> +"JG rel16","JG rel16","jg rel16","0F 8F cw","V","N.S.","","operand16","r","",""
> +"JG rel32","JG rel32","jg rel32","0F 8F cd","N.S.","V","","default64","r","",""
> +"JG rel32","JG rel32","jg rel32","0F 8F cd","V","N.S.","","operand32","r","",""
> +"JG rel8","JG rel8","jg rel8","7F cb","V","N.S.","","","r","",""
> +"JG rel8","JG rel8","jg rel8","7F cb","N.S.","V","","default64","r","",""
> +"JGE rel16","JGE rel16","jge rel16","0F 8D cw","V","N.S.","","operand16","r","",""
> +"JGE rel32","JGE rel32","jge rel32","0F 8D cd","V","N.S.","","operand32","r","",""
> +"JGE rel32","JGE rel32","jge rel32","0F 8D cd","N.S.","V","","default64","r","",""
> +"JGE rel8","JGE rel8","jge rel8","7D cb","N.S.","V","","default64","r","",""
> +"JGE rel8","JGE rel8","jge rel8","7D cb","V","N.S.","","","r","",""
> +"JL rel16","JL rel16","jl rel16","0F 8C cw","V","N.S.","","operand16","r","",""
> +"JL rel32","JL rel32","jl rel32","0F 8C cd","V","N.S.","","operand32","r","",""
> +"JL rel32","JL rel32","jl rel32","0F 8C cd","N.S.","V","","default64","r","",""
> +"JL rel8","JL rel8","jl rel8","7C cb","V","N.S.","","","r","",""
> +"JL rel8","JL rel8","jl rel8","7C cb","N.S.","V","","default64","r","",""
> +"JLE rel16","JLE rel16","jle rel16","0F 8E cw","V","N.S.","","operand16","r","",""
> +"JLE rel32","JLE rel32","jle rel32","0F 8E cd","V","N.S.","","operand32","r","",""
> +"JLE rel32","JLE rel32","jle rel32","0F 8E cd","N.S.","V","","default64","r","",""
> +"JLE rel8","JLE rel8","jle rel8","7E cb","N.S.","V","","default64","r","",""
> +"JLE rel8","JLE rel8","jle rel8","7E cb","V","N.S.","","","r","",""
> +"JMP rel16","JMP rel16","jmp rel16","E9 cw","V","N.S.","","operand16","r","Y",""
> +"JMP rel32","JMP rel32","jmp rel32","E9 cd","N.S.","V","","default64","r","Y",""
> +"JMP rel32","JMP rel32","jmp rel32","E9 cd","V","N.S.","","operand32","r","Y",""
> +"JMP rel8","JMP rel8","jmp rel8","EB cb","N.S.","V","","default64","r","Y",""
> +"JMP rel8","JMP rel8","jmp rel8","EB cb","V","N.S.","","","r","Y",""
> +"JMP r/m32","JMPL* r/m32","jmpl* r/m32","FF /4","V","N.S.","","operand32","r","Y","32"
> +"JMP r/m64","JMPQ* r/m64","jmpq* r/m64","FF /4","N.S.","V","","","r","Y","64"
> +"JMP r/m16","JMPW* r/m16","jmpw* r/m16","FF /4","V","N.S.","","operand16","r","Y","16"
> +"JNA rel16","JNA rel16","jna rel16","0F 86 cw","V","N.S.","","pseudo","r","",""
> +"JNA rel32","JNA rel32","jna rel32","0F 86 cd","V","V","","pseudo","r","",""
> +"JNA rel8","JNA rel8","jna rel8","76 cb","V","V","","pseudo","r","",""
> +"JNAE rel16","JNAE rel16","jnae rel16","0F 82 cw","V","N.S.","","pseudo","r","",""
> +"JNAE rel32","JNAE rel32","jnae rel32","0F 82 cd","V","V","","pseudo","r","",""
> +"JNAE rel8","JNAE rel8","jnae rel8","72 cb","V","V","","pseudo","r","",""
> +"JNB rel16","JNB rel16","jnb rel16","0F 83 cw","V","N.S.","","pseudo","r","",""
> +"JNB rel32","JNB rel32","jnb rel32","0F 83 cd","V","V","","pseudo","r","",""
> +"JNB rel8","JNB rel8","jnb rel8","73 cb","V","V","","pseudo","r","",""
> +"JNBE rel16","JNBE rel16","jnbe rel16","0F 87 cw","V","N.S.","","pseudo","r","",""
> +"JNBE rel32","JNBE rel32","jnbe rel32","0F 87 cd","V","V","","pseudo","r","",""
> +"JNBE rel8","JNBE rel8","jnbe rel8","77 cb","V","V","","pseudo","r","",""
> +"JNC rel16","JNC rel16","jnc rel16","0F 83 cw","V","N.S.","","pseudo","r","",""
> +"JNC rel32","JNC rel32","jnc rel32","0F 83 cd","V","V","","pseudo","r","",""
> +"JNC rel8","JNC rel8","jnc rel8","73 cb","V","V","","pseudo","r","",""
> +"JNE rel16","JNE rel16","jne rel16","0F 85 cw","V","N.S.","","operand16","r","",""
> +"JNE rel32","JNE rel32","jne rel32","0F 85 cd","N.S.","V","","default64","r","",""
> +"JNE rel32","JNE rel32","jne rel32","0F 85 cd","V","N.S.","","operand32","r","",""
> +"JNE rel8","JNE rel8","jne rel8","75 cb","V","N.S.","","","r","",""
> +"JNE rel8","JNE rel8","jne rel8","75 cb","N.S.","V","","default64","r","",""
> +"JNG rel16","JNG rel16","jng rel16","0F 8E cw","V","N.S.","","pseudo","r","",""
> +"JNG rel32","JNG rel32","jng rel32","0F 8E cd","V","V","","pseudo","r","",""
> +"JNG rel8","JNG rel8","jng rel8","7E cb","V","V","","pseudo","r","",""
> +"JNGE rel16","JNGE rel16","jnge rel16","0F 8C cw","V","N.S.","","pseudo","r","",""
> +"JNGE rel32","JNGE rel32","jnge rel32","0F 8C cd","V","V","","pseudo","r","",""
> +"JNGE rel8","JNGE rel8","jnge rel8","7C cb","V","V","","pseudo","r","",""
> +"JNL rel16","JNL rel16","jnl rel16","0F 8D cw","V","N.S.","","pseudo","r","",""
> +"JNL rel32","JNL rel32","jnl rel32","0F 8D cd","V","V","","pseudo","r","",""
> +"JNL rel8","JNL rel8","jnl rel8","7D cb","V","V","","pseudo","r","",""
> +"JNLE rel16","JNLE rel16","jnle rel16","0F 8F cw","V","N.S.","","pseudo","r","",""
> +"JNLE rel32","JNLE rel32","jnle rel32","0F 8F cd","V","V","","pseudo","r","",""
> +"JNLE rel8","JNLE rel8","jnle rel8","7F cb","V","V","","pseudo","r","",""
> +"JNO rel16","JNO rel16","jno rel16","0F 81 cw","V","N.S.","","operand16","r","",""
> +"JNO rel32","JNO rel32","jno rel32","0F 81 cd","V","N.S.","","operand32","r","",""
> +"JNO rel32","JNO rel32","jno rel32","0F 81 cd","N.S.","V","","default64","r","",""
> +"JNO rel8","JNO rel8","jno rel8","71 cb","V","N.S.","","","r","",""
> +"JNO rel8","JNO rel8","jno rel8","71 cb","N.S.","V","","default64","r","",""
> +"JNP rel16","JNP rel16","jnp rel16","0F 8B cw","V","N.S.","","operand16","r","",""
> +"JNP rel32","JNP rel32","jnp rel32","0F 8B cd","V","N.S.","","operand32","r","",""
> +"JNP rel32","JNP rel32","jnp rel32","0F 8B cd","N.S.","V","","default64","r","",""
> +"JNP rel8","JNP rel8","jnp rel8","7B cb","N.S.","V","","default64","r","",""
> +"JNP rel8","JNP rel8","jnp rel8","7B cb","V","N.S.","","","r","",""
> +"JNS rel16","JNS rel16","jns rel16","0F 89 cw","V","N.S.","","operand16","r","",""
> +"JNS rel32","JNS rel32","jns rel32","0F 89 cd","N.S.","V","","default64","r","",""
> +"JNS rel32","JNS rel32","jns rel32","0F 89 cd","V","N.S.","","operand32","r","",""
> +"JNS rel8","JNS rel8","jns rel8","79 cb","V","N.S.","","","r","",""
> +"JNS rel8","JNS rel8","jns rel8","79 cb","N.S.","V","","default64","r","",""
> +"JNZ rel16","JNZ rel16","jnz rel16","0F 85 cw","V","N.S.","","pseudo","r","",""
> +"JNZ rel32","JNZ rel32","jnz rel32","0F 85 cd","V","V","","pseudo","r","",""
> +"JNZ rel8","JNZ rel8","jnz rel8","75 cb","V","V","","pseudo","r","",""
> +"JO rel16","JO rel16","jo rel16","0F 80 cw","V","N.S.","","operand16","r","",""
> +"JO rel32","JO rel32","jo rel32","0F 80 cd","V","N.S.","","operand32","r","",""
> +"JO rel32","JO rel32","jo rel32","0F 80 cd","N.S.","V","","default64","r","",""
> +"JO rel8","JO rel8","jo rel8","70 cb","V","N.S.","","","r","",""
> +"JO rel8","JO rel8","jo rel8","70 cb","N.S.","V","","default64","r","",""
> +"JP rel16","JP rel16","jp rel16","0F 8A cw","V","N.S.","","operand16","r","",""
> +"JP rel32","JP rel32","jp rel32","0F 8A cd","N.S.","V","","default64","r","",""
> +"JP rel32","JP rel32","jp rel32","0F 8A cd","V","N.S.","","operand32","r","",""
> +"JP rel8","JP rel8","jp rel8","7A cb","N.S.","V","","default64","r","",""
> +"JP rel8","JP rel8","jp rel8","7A cb","V","N.S.","","","r","",""
> +"JPE rel16","JPE rel16","jpe rel16","0F 8A cw","V","N.S.","","pseudo","r","",""
> +"JPE rel32","JPE rel32","jpe rel32","0F 8A cd","V","V","","pseudo","r","",""
> +"JPE rel8","JPE rel8","jpe rel8","7A cb","V","V","","pseudo","r","",""
> +"JPO rel16","JPO rel16","jpo rel16","0F 8B cw","V","N.S.","","pseudo","r","",""
> +"JPO rel32","JPO rel32","jpo rel32","0F 8B cd","V","V","","pseudo","r","",""
> +"JPO rel8","JPO rel8","jpo rel8","7B cb","V","V","","pseudo","r","",""
> +"JRCXZ rel8","JRCXZ rel8","jrcxz rel8","E3 cb","N.S.","V","","address64","r","",""
> +"JS rel16","JS rel16","js rel16","0F 88 cw","V","N.S.","","operand16","r","",""
> +"JS rel32","JS rel32","js rel32","0F 88 cd","V","N.S.","","operand32","r","",""
> +"JS rel32","JS rel32","js rel32","0F 88 cd","N.S.","V","","default64","r","",""
> +"JS rel8","JS rel8","js rel8","78 cb","V","N.S.","","","r","",""
> +"JS rel8","JS rel8","js rel8","78 cb","N.S.","V","","default64","r","",""
> +"JZ rel16","JZ rel16","jz rel16","0F 84 cw","V","N.S.","","operand16,pseudo","r","",""
> +"JZ rel32","JZ rel32","jz rel32","0F 84 cd","V","V","","operand32,pseudo","r","",""
> +"JZ rel8","JZ rel8","jz rel8","74 cb","V","V","","pseudo","r","",""
> +"KADDB k1, kV, k2","KADDB k2, kV, k1","kaddb k2, kV, k1","VEX.NDS.256.66.0F.W0 4A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"KADDD k1, kV, k2","KADDD k2, kV, k1","kaddd k2, kV, k1","VEX.NDS.256.66.0F.W1 4A /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KADDQ k1, kV, k2","KADDQ k2, kV, k1","kaddq k2, kV, k1","VEX.NDS.256.0F.W1 4A /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KADDW k1, kV, k2","KADDW k2, kV, k1","kaddw k2, kV, k1","VEX.NDS.256.0F.W0 4A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"KANDB k1, kV, k2","KANDB k2, kV, k1","kandb k2, kV, k1","VEX.NDS.256.66.0F.W0 41 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"KANDD k1, kV, k2","KANDD k2, kV, k1","kandd k2, kV, k1","VEX.NDS.256.66.0F.W1 41 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KANDNB k1, kV, k2","KANDNB k2, kV, k1","kandnb k2, kV, k1","VEX.NDS.256.66.0F.W0 42 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"KANDND k1, kV, k2","KANDND k2, kV, k1","kandnd k2, kV, k1","VEX.NDS.256.66.0F.W1 42 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KANDNQ k1, kV, k2","KANDNQ k2, kV, k1","kandnq k2, kV, k1","VEX.NDS.256.0F.W1 42 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KANDNW k1, kV, k2","KANDNW k2, kV, k1","kandnw k2, kV, k1","VEX.NDS.256.0F.W0 42 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"KANDQ k1, kV, k2","KANDQ k2, kV, k1","kandq k2, kV, k1","VEX.NDS.256.0F.W1 41 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KANDW k1, kV, k2","KANDW k2, kV, k1","kandw k2, kV, k1","VEX.NDS.256.0F.W0 41 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"KMOVB m8, k1","KMOVB k1, m8","kmovb k1, m8","VEX.128.66.0F.W0 91 /r","V","V","AVX512DQ","modrm_memonly","w,r","",""
> +"KMOVB r32, k2","KMOVB k2, r32","kmovb k2, r32","VEX.128.66.0F.W0 93 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
> +"KMOVB k1, k2/m8","KMOVB k2/m8, k1","kmovb k2/m8, k1","VEX.128.66.0F.W0 90 /r","V","V","AVX512DQ","","w,r","",""
> +"KMOVB k1, rmr32","KMOVB rmr32, k1","kmovb rmr32, k1","VEX.128.66.0F.W0 92 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
> +"KMOVD m32, k1","KMOVD k1, m32","kmovd k1, m32","VEX.128.66.0F.W1 91 /r","V","V","AVX512BW","modrm_memonly","w,r","",""
> +"KMOVD r32, k2","KMOVD k2, r32","kmovd k2, r32","VEX.128.F2.0F.W0 93 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
> +"KMOVD k1, k2/m32","KMOVD k2/m32, k1","kmovd k2/m32, k1","VEX.128.66.0F.W1 90 /r","V","V","AVX512BW","","w,r","",""
> +"KMOVD k1, rmr32","KMOVD rmr32, k1","kmovd rmr32, k1","VEX.128.F2.0F.W0 92 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
> +"KMOVQ m64, k1","KMOVQ k1, m64","kmovq k1, m64","VEX.128.0F.W1 91 /r","V","V","AVX512BW","modrm_memonly","w,r","",""
> +"KMOVQ r64, k2","KMOVQ k2, r64","kmovq k2, r64","VEX.128.F2.0F.W1 93 /r","N.S.","V","AVX512BW","modrm_regonly","w,r","",""
> +"KMOVQ k1, k2/m64","KMOVQ k2/m64, k1","kmovq k2/m64, k1","VEX.128.0F.W1 90 /r","V","V","AVX512BW","","w,r","",""
> +"KMOVQ k1, rmr64","KMOVQ rmr64, k1","kmovq rmr64, k1","VEX.128.F2.0F.W1 92 /r","N.S.","V","AVX512BW","modrm_regonly","w,r","",""
> +"KMOVW m16, k1","KMOVW k1, m16","kmovw k1, m16","VEX.128.0F.W0 91 /r","V","V","AVX512F","modrm_memonly","w,r","",""
> +"KMOVW r32, k2","KMOVW k2, r32","kmovw k2, r32","VEX.128.0F.W0 93 /r","V","V","AVX512F","modrm_regonly","w,r","",""
> +"KMOVW k1, k2/m16","KMOVW k2/m16, k1","kmovw k2/m16, k1","VEX.128.0F.W0 90 /r","V","V","AVX512F","","w,r","",""
> +"KMOVW k1, rmr32","KMOVW rmr32, k1","kmovw rmr32, k1","VEX.128.0F.W0 92 /r","V","V","AVX512F","modrm_regonly","w,r","",""
> +"KNOTB k1, k2","KNOTB k2, k1","knotb k2, k1","VEX.128.66.0F.W0 44 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
> +"KNOTD k1, k2","KNOTD k2, k1","knotd k2, k1","VEX.128.66.0F.W1 44 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
> +"KNOTQ k1, k2","KNOTQ k2, k1","knotq k2, k1","VEX.128.0F.W1 44 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
> +"KNOTW k1, k2","KNOTW k2, k1","knotw k2, k1","VEX.128.0F.W0 44 /r","V","V","AVX512F","modrm_regonly","w,r","",""
> +"KORB k1, kV, k2","KORB k2, kV, k1","korb k2, kV, k1","VEX.NDS.256.66.0F.W0 45 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"KORD k1, kV, k2","KORD k2, kV, k1","kord k2, kV, k1","VEX.NDS.256.66.0F.W1 45 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KORQ k1, kV, k2","KORQ k2, kV, k1","korq k2, kV, k1","VEX.NDS.256.0F.W1 45 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KORTESTB k1, k2","KORTESTB k2, k1","kortestb k2, k1","VEX.128.66.0F.W0 98 /r","V","V","AVX512DQ","modrm_regonly","r,r","",""
> +"KORTESTD k1, k2","KORTESTD k2, k1","kortestd k2, k1","VEX.128.66.0F.W1 98 /r","V","V","AVX512BW","modrm_regonly","r,r","",""
> +"KORTESTQ k1, k2","KORTESTQ k2, k1","kortestq k2, k1","VEX.128.0F.W1 98 /r","V","V","AVX512BW","modrm_regonly","r,r","",""
> +"KORTESTW k1, k2","KORTESTW k2, k1","kortestw k2, k1","VEX.128.0F.W0 98 /r","V","V","AVX512F","modrm_regonly","r,r","",""
> +"KORW k1, kV, k2","KORW k2, kV, k1","korw k2, kV, k1","VEX.NDS.256.0F.W0 45 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"KSHIFTLB k1, k2, imm8u","KSHIFTLB imm8u, k2, k1","kshiftlb imm8u, k2, k1","VEX.128.66.0F3A.W0 32 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"KSHIFTLD k1, k2, imm8u","KSHIFTLD imm8u, k2, k1","kshiftld imm8u, k2, k1","VEX.128.66.0F3A.W0 33 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KSHIFTLQ k1, k2, imm8u","KSHIFTLQ imm8u, k2, k1","kshiftlq imm8u, k2, k1","VEX.128.66.0F3A.W1 33 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KSHIFTLW k1, k2, imm8u","KSHIFTLW imm8u, k2, k1","kshiftlw imm8u, k2, k1","VEX.128.66.0F3A.W1 32 /r ib","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"KSHIFTRB k1, k2, imm8u","KSHIFTRB imm8u, k2, k1","kshiftrb imm8u, k2, k1","VEX.128.66.0F3A.W0 30 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"KSHIFTRD k1, k2, imm8u","KSHIFTRD imm8u, k2, k1","kshiftrd imm8u, k2, k1","VEX.128.66.0F3A.W0 31 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KSHIFTRQ k1, k2, imm8u","KSHIFTRQ imm8u, k2, k1","kshiftrq imm8u, k2, k1","VEX.128.66.0F3A.W1 31 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KSHIFTRW k1, k2, imm8u","KSHIFTRW imm8u, k2, k1","kshiftrw imm8u, k2, k1","VEX.128.66.0F3A.W1 30 /r ib","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"KTESTB k1, k2","KTESTB k2, k1","ktestb k2, k1","VEX.128.66.0F.W0 99 /r","V","V","AVX512DQ","modrm_regonly","r,r","",""
> +"KTESTD k1, k2","KTESTD k2, k1","ktestd k2, k1","VEX.128.66.0F.W1 99 /r","V","V","AVX512BW","modrm_regonly","r,r","",""
> +"KTESTQ k1, k2","KTESTQ k2, k1","ktestq k2, k1","VEX.128.0F.W1 99 /r","V","V","AVX512BW","modrm_regonly","r,r","",""
> +"KTESTW k1, k2","KTESTW k2, k1","ktestw k2, k1","VEX.128.0F.W0 99 /r","V","V","AVX512DQ","modrm_regonly","r,r","",""
> +"KUNPCKBW k1, kV, k2","KUNPCKBW k2, kV, k1","kunpckbw k2, kV, k1","VEX.NDS.256.66.0F.W0 4B /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"KUNPCKDQ k1, kV, k2","KUNPCKDQ k2, kV, k1","kunpckdq k2, kV, k1","VEX.NDS.256.0F.W1 4B /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KUNPCKWD k1, kV, k2","KUNPCKWD k2, kV, k1","kunpckwd k2, kV, k1","VEX.NDS.256.0F.W0 4B /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KXNORB k1, kV, k2","KXNORB k2, kV, k1","kxnorb k2, kV, k1","VEX.NDS.256.66.0F.W0 46 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"KXNORD k1, kV, k2","KXNORD k2, kV, k1","kxnord k2, kV, k1","VEX.NDS.256.66.0F.W1 46 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KXNORQ k1, kV, k2","KXNORQ k2, kV, k1","kxnorq k2, kV, k1","VEX.NDS.256.0F.W1 46 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KXNORW k1, kV, k2","KXNORW k2, kV, k1","kxnorw k2, kV, k1","VEX.NDS.256.0F.W0 46 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"KXORB k1, kV, k2","KXORB k2, kV, k1","kxorb k2, kV, k1","VEX.NDS.256.66.0F.W0 47 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"KXORD k1, kV, k2","KXORD k2, kV, k1","kxord k2, kV, k1","VEX.NDS.256.66.0F.W1 47 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KXORQ k1, kV, k2","KXORQ k2, kV, k1","kxorq k2, kV, k1","VEX.NDS.256.0F.W1 47 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"KXORW k1, kV, k2","KXORW k2, kV, k1","kxorw k2, kV, k1","VEX.NDS.256.0F.W0 47 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"LAHF","LAHF","lahf","9F","V","V","LAHFSAHF","","","",""
> +"LAR r32, r32/m16","LARL r32/m16, r32","larl r32/m16, r32","0F 02 /r","V","V","","operand32","rw,r","Y","32"
> +"LAR r64, r64/m16","LARQ r64/m16, r64","larq r64/m16, r64","REX.W 0F 02 /r","N.S.","V","","","rw,r","Y","64"
> +"LAR r16, r/m16","LARW r/m16, r16","larw r/m16, r16","0F 02 /r","V","V","","operand16","rw,r","Y","16"
> +"CALL_FAR ptr16:32","LCALLL ptr16:32","lcalll ptr16:32","9A cd iw","V","N.S.","","operand32","r","Y",""
> +"CALL_FAR m16:32","LCALLL* m16:32","lcalll* m16:32","FF /3","V","V","","modrm_memonly,operand32","r","Y",""
> +"CALL_FAR m16:64","LCALLQ* m16:64","lcallq* m16:64","REX.W FF /3","N.S.","V","","modrm_memonly","r","Y",""
> +"CALL_FAR ptr16:16","LCALLW ptr16:16","lcallw ptr16:16","9A cw iw","V","N.S.","","operand16","r","Y",""
> +"CALL_FAR m16:16","LCALLW* m16:16","lcallw* m16:16","FF /3","V","V","","modrm_memonly,operand16","r","Y",""
> +"LDDQU xmm1, m128","LDDQU m128, xmm1","lddqu m128, xmm1","F2 0F F0 /r","V","V","SSE3","modrm_memonly","w,r","",""
> +"LDMXCSR m32","LDMXCSR m32","ldmxcsr m32","0F AE /2","V","V","SSE","modrm_memonly","r","",""
> +"LDS r32, m16:32","LDSL m16:32, r32","ldsl m16:32, r32","C5 /r","V","N.S.","","modrm_memonly,operand32","w,r","Y","32"
> +"LDS r16, m16:16","LDSW m16:16, r16","ldsw m16:16, r16","C5 /r","V","N.S.","","modrm_memonly,operand16","w,r","Y","16"
> +"LEA r32, m","LEAL m, r32","leal m, r32","8D /r","V","V","","modrm_memonly,operand32","w,r","Y","32"
> +"LEA r64, m","LEAQ m, r64","leaq m, r64","REX.W 8D /r","N.S.","V","","modrm_memonly","w,r","Y","64"
> +"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","N.S.","V","","default64","","Y",""
> +"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","V","N.S.","","operand32","","Y",""
> +"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","V","V","","operand16","","Y",""
> +"LEA r16, m","LEAW m, r16","leaw m, r16","8D /r","V","V","","modrm_memonly,operand16","w,r","Y","16"
> +"LES r32, m16:32","LESL m16:32, r32","lesl m16:32, r32","C4 /r","V","N.S.","","modrm_memonly,operand32","w,r","Y","32"
> +"LES r16, m16:16","LESW m16:16, r16","lesw m16:16, r16","C4 /r","V","N.S.","","modrm_memonly,operand16","w,r","Y","16"
> +"LFENCE","LFENCE","lfence","0F AE /5","V","V","SSE2","","","",""
> +"LFS r32, m16:32","LFSL m16:32, r32","lfsl m16:32, r32","0F B4 /r","V","V","","modrm_memonly,operand32","w,r","Y","32"
> +"LFS r64, m16:64","LFSQ m16:64, r64","lfsq m16:64, r64","REX.W 0F B4 /r","N.S.","V","","modrm_memonly","w,r","Y","64"
> +"LFS r16, m16:16","LFSW m16:16, r16","lfsw m16:16, r16","0F B4 /r","V","V","","modrm_memonly,operand16","w,r","Y","16"
> +"LGDT m16&64","LGDT m16&64","lgdt m16&64","0F 01 /2","N.S.","V","","default64,modrm_memonly","r","",""
> +"LGDT m16&32","LGDTW/LGDTL m16&32","lgdtw/lgdtl m16&32","0F 01 /2","V","N.S.","","modrm_memonly","r","",""
> +"LGS r32, m16:32","LGSL m16:32, r32","lgsl m16:32, r32","0F B5 /r","V","V","","modrm_memonly,operand32","w,r","Y","32"
> +"LGS r64, m16:64","LGSQ m16:64, r64","lgsq m16:64, r64","REX.W 0F B5 /r","N.S.","V","","modrm_memonly","w,r","Y","64"
> +"LGS r16, m16:16","LGSW m16:16, r16","lgsw m16:16, r16","0F B5 /r","V","V","","modrm_memonly,operand16","w,r","Y","16"
> +"LIDT m16&64","LIDT m16&64","lidt m16&64","0F 01 /3","N.S.","V","","default64,modrm_memonly","r","",""
> +"LIDT m16&32","LIDTW/LIDTL m16&32","lidtw/lidtl m16&32","0F 01 /3","V","N.S.","","modrm_memonly","r","",""
> +"JMP_FAR ptr16:32","LJMPL ptr16:32","ljmpl ptr16:32","EA cd iw","V","N.S.","","operand32","r","Y",""
> +"JMP_FAR m16:32","LJMPL* m16:32","ljmpl* m16:32","FF /5","V","V","","modrm_memonly,operand32","r","Y",""
> +"JMP_FAR m16:64","LJMPQ* m16:64","ljmpq* m16:64","REX.W FF /5","N.S.","V","","modrm_memonly","r","Y",""
> +"JMP_FAR ptr16:16","LJMPW ptr16:16","ljmpw ptr16:16","EA cw iw","V","N.S.","","operand16","r","Y",""
> +"JMP_FAR m16:16","LJMPW* m16:16","ljmpw* m16:16","FF /5","V","V","","modrm_memonly,operand16","r","Y",""
> +"LLDT r/m16","LLDT r/m16","lldt r/m16","0F 00 /2","V","V","","","r","",""
> +"LLWPCB rmr32","LLWPCBL rmr32","llwpcbl rmr32","XOP.128.09.W0 12 /0","V","V","XOP","amd,modrm_regonly,operand16,operand32","w","Y","32"
> +"LLWPCB rmr64","LLWPCBQ rmr64","llwpcbq rmr64","XOP.128.09.W0 12 /0","N.S.","V","XOP","amd,modrm_regonly,operand64","w","Y","64"
> +"LMSW r/m16","LMSW r/m16","lmsw r/m16","0F 01 /6","V","V","","","r","",""
> +"LOCK","LOCK","lock","F0","V","V","","pseudo","","",""
> +"LODSB","LODSB","lodsb","AC","V","V","","","","",""
> +"LODSD","LODSL","lodsl","AD","V","V","","operand32","","",""
> +"LODSQ","LODSQ","lodsq","REX.W AD","N.S.","V","","","","",""
> +"LODSW","LODSW","lodsw","AD","V","V","","operand16","","",""
> +"LOOP rel8","LOOP rel8","loop rel8","E2 cb","V","V","","","r","",""
> +"LOOPE rel8","LOOPEQ rel8","loope rel8","E1 cb","V","V","","","r","",""
> +"LOOPNE rel8","LOOPNE rel8","loopne rel8","E0 cb","V","V","","","r","",""
> +"LSL r32, r32/m16","LSLL r32/m16, r32","lsll r32/m16, r32","0F 03 /r","V","V","","operand32","rw,r","Y","32"
> +"LSL r64, r32/m16","LSLQ r32/m16, r64","lslq r32/m16, r64","REX.W 0F 03 /r","N.S.","V","","","rw,r","Y","64"
> +"LSL r16, r/m16","LSLW r/m16, r16","lslw r/m16, r16","0F 03 /r","V","V","","operand16","rw,r","Y","16"
> +"LSS r32, m16:32","LSSL m16:32, r32","lssl m16:32, r32","0F B2 /r","V","V","","modrm_memonly,operand32","w,r","Y","32"
> +"LSS r64, m16:64","LSSQ m16:64, r64","lssq m16:64, r64","REX.W 0F B2 /r","N.S.","V","","modrm_memonly","w,r","Y","64"
> +"LSS r16, m16:16","LSSW m16:16, r16","lssw m16:16, r16","0F B2 /r","V","V","","modrm_memonly,operand16","w,r","Y","16"
> +"LTR r/m16","LTR r/m16","ltr r/m16","0F 00 /3","V","V","","","r","",""
> +"LWPINS r32V, r/m32, imm32u","LWPINS imm32u, r/m32, r32V","lwpins imm32u, r/m32, r32V","XOP.NDD.128.0A.W0 12 /0","V","V","XOP","amd,operand16,operand32","w,r,r","",""
> +"LWPINS r64V, r64/m32, imm32u","LWPINS imm32u, r64/m32, r64V","lwpins imm32u, r64/m32, r64V","XOP.NDD.128.0A.W0 12 /0","N.S.","V","XOP","amd,operand64","w,r,r","",""
> +"LWPVAL r32V, r/m32, imm32u","LWPVAL imm32u, r/m32, r32V","lwpval imm32u, r/m32, r32V","XOP.NDD.128.0A.W0 12 /1","V","V","XOP","amd,operand16,operand32","w,r,r","",""
> +"LWPVAL r64V, r64/m32, imm32u","LWPVAL imm32u, r64/m32, r64V","lwpval imm32u, r64/m32, r64V","XOP.NDD.128.0A.W0 12 /1","N.S.","V","XOP","amd,operand64","w,r,r","",""
> +"LZCNT r32, r/m32","LZCNTL r/m32, r32","lzcntl r/m32, r32","F3 0F BD /r","V","V","LZCNT","operand32","w,r","Y","32"
> +"LZCNT r32, r/m32","LZCNTL r/m32, r32","lzcntl r/m32, r32","F3 0F BD /r","V","V","AMD","amd,operand32","w,r","Y","32"
> +"LZCNT r64, r/m64","LZCNTQ r/m64, r64","lzcntq r/m64, r64","F3 REX.W 0F BD /r","N.S.","V","AMD","amd","w,r","Y","64"
> +"LZCNT r64, r/m64","LZCNTQ r/m64, r64","lzcntq r/m64, r64","F3 REX.W 0F BD /r","N.S.","V","LZCNT","","w,r","Y","64"
> +"LZCNT r16, r/m16","LZCNTW r/m16, r16","lzcntw r/m16, r16","F3 0F BD /r","V","V","AMD","amd,operand16","w,r","Y","16"
> +"LZCNT r16, r/m16","LZCNTW r/m16, r16","lzcntw r/m16, r16","F3 0F BD /r","V","V","LZCNT","operand16","w,r","Y","16"
> +"MASKMOVDQU xmm1, xmm2","MASKMOVOU xmm2, xmm1","maskmovdqu xmm2, xmm1","66 0F F7 /r","V","V","SSE2","modrm_regonly","r,r","",""
> +"MASKMOVQ mm1, mm2","MASKMOVQ mm2, mm1","maskmovq mm2, mm1","0F F7 /r","V","V","MMX","modrm_regonly","r,r","",""
> +"MAXPD xmm1, xmm2/m128","MAXPD xmm2/m128, xmm1","maxpd xmm2/m128, xmm1","66 0F 5F /r","V","V","SSE2","","rw,r","",""
> +"MAXPS xmm1, xmm2/m128","MAXPS xmm2/m128, xmm1","maxps xmm2/m128, xmm1","0F 5F /r","V","V","SSE","","rw,r","",""
> +"MAXSD xmm1, xmm2/m64","MAXSD xmm2/m64, xmm1","maxsd xmm2/m64, xmm1","F2 0F 5F /r","V","V","SSE2","","rw,r","",""
> +"MAXSS xmm1, xmm2/m32","MAXSS xmm2/m32, xmm1","maxss xmm2/m32, xmm1","F3 0F 5F /r","V","V","SSE","","rw,r","",""
> +"MFENCE","MFENCE","mfence","0F AE /6","V","V","SSE2","","","",""
> +"MINPD xmm1, xmm2/m128","MINPD xmm2/m128, xmm1","minpd xmm2/m128, xmm1","66 0F 5D /r","V","V","SSE2","","rw,r","",""
> +"MINPS xmm1, xmm2/m128","MINPS xmm2/m128, xmm1","minps xmm2/m128, xmm1","0F 5D /r","V","V","SSE","","rw,r","",""
> +"MINSD xmm1, xmm2/m64","MINSD xmm2/m64, xmm1","minsd xmm2/m64, xmm1","F2 0F 5D /r","V","V","SSE2","","rw,r","",""
> +"MINSS xmm1, xmm2/m32","MINSS xmm2/m32, xmm1","minss xmm2/m32, xmm1","F3 0F 5D /r","V","V","SSE","","rw,r","",""
> +"MONITOR","MONITOR","monitor","0F 01 C8","V","V","MONITOR","","","",""
> +"MOVAPD xmm2/m128, xmm1","MOVAPD xmm1, xmm2/m128","movapd xmm1, xmm2/m128","66 0F 29 /r","V","V","SSE2","","w,r","",""
> +"MOVAPD xmm1, xmm2/m128","MOVAPD xmm2/m128, xmm1","movapd xmm2/m128, xmm1","66 0F 28 /r","V","V","SSE2","","w,r","",""
> +"MOVAPS xmm2/m128, xmm1","MOVAPS xmm1, xmm2/m128","movaps xmm1, xmm2/m128","0F 29 /r","V","V","SSE","","w,r","",""
> +"MOVAPS xmm1, xmm2/m128","MOVAPS xmm2/m128, xmm1","movaps xmm2/m128, xmm1","0F 28 /r","V","V","SSE","","w,r","",""
> +"MOV r/m8, imm8u","MOVB imm8u, r/m8","movb imm8u, r/m8","C6 /0 ib","V","V","","","w,r","Y","8"
> +"MOV r/m8, imm8u","MOVB imm8u, r/m8","movb imm8u, r/m8","REX C6 /0 ib","N.E.","V","","pseudo64","w,r","Y","8"
> +"MOV r8op, imm8u","MOVB imm8u, r8op","movb imm8u, r8op","B0+rb ib","V","V","","","w,r","Y","8"
> +"MOV r8op, imm8u","MOVB imm8u, r8op","movb imm8u, r8op","REX B0+rb ib","N.E.","V","","pseudo64","w,r","Y","8"
> +"MOV r8, r/m8","MOVB r/m8, r8","movb r/m8, r8","8A /r","V","V","","","w,r","Y","8"
> +"MOV r8, r/m8","MOVB r/m8, r8","movb r/m8, r8","REX 8A /r","N.E.","V","","pseudo64","w,r","Y","8"
> +"MOV r/m8, r8","MOVB r8, r/m8","movb r8, r/m8","88 /r","V","V","","","w,r","Y","8"
> +"MOV r/m8, r8","MOVB r8, r/m8","movb r8, r/m8","REX 88 /r","N.E.","V","","pseudo64","w,r","Y","8"
> +"MOV moffs8, AL","MOVB/MOVB/MOVABSB AL, moffs8","movb/movb/movabsb AL, moffs8","A2 cm","V","V","","","w,r","Y","8"
> +"MOV moffs8, AL","MOVB/MOVB/MOVABSB AL, moffs8","movb/movb/movabsb AL, moffs8","REX.W A2 cm","N.E.","V","","pseudo","w,r","Y","8"
> +"MOV AL, moffs8","MOVB/MOVB/MOVABSB moffs8, AL","movb/movb/movabsb moffs8, AL","A0 cm","V","V","","","w,r","Y","8"
> +"MOV AL, moffs8","MOVB/MOVB/MOVABSB moffs8, AL","movb/movb/movabsb moffs8, AL","REX.W A0 cm","N.E.","V","","pseudo","w,r","Y","8"
> +"MOVBE r32, m32","MOVBELL m32, r32","movbell m32, r32","0F 38 F0 /r","V","V","MOVBE","modrm_memonly,operand32","w,r","Y","32"
> +"MOVBE m32, r32","MOVBELL r32, m32","movbell r32, m32","0F 38 F1 /r","V","V","MOVBE","modrm_memonly,operand32","w,r","Y","32"
> +"MOVBE r64, m64","MOVBEQQ m64, r64","movbeqq m64, r64","REX.W 0F 38 F0 /r","N.S.","V","MOVBE","modrm_memonly","w,r","Y","64"
> +"MOVBE m64, r64","MOVBEQQ r64, m64","movbeqq r64, m64","REX.W 0F 38 F1 /r","N.S.","V","MOVBE","modrm_memonly","w,r","Y","64"
> +"MOVBE r16, m16","MOVBEWW m16, r16","movbeww m16, r16","0F 38 F0 /r","V","V","MOVBE","modrm_memonly,operand16","w,r","Y","16"
> +"MOVBE m16, r16","MOVBEWW r16, m16","movbeww r16, m16","0F 38 F1 /r","V","V","MOVBE","modrm_memonly,operand16","w,r","Y","16"
> +"MOVSX r32, r/m8","MOVBLSX r/m8, r32","movsbl r/m8, r32","0F BE /r","V","V","","operand32","w,r","Y","32"
> +"MOVZX r32, r/m8","MOVBLZX r/m8, r32","movzbl r/m8, r32","0F B6 /r","V","V","","operand32","w,r","Y","32"
> +"MOVSX r64, r/m8","MOVBQSX r/m8, r64","movsbq r/m8, r64","REX.W 0F BE /r","N.S.","V","","","w,r","Y","64"
> +"MOVZX r64, r/m8","MOVBQZX r/m8, r64","movzbq r/m8, r64","REX.W 0F B6 /r","N.S.","V","","","w,r","Y","64"
> +"MOVSX r16, r/m8","MOVBWSX r/m8, r16","movsbw r/m8, r16","0F BE /r","V","V","","operand16","w,r","Y","16"
> +"MOVZX r16, r/m8","MOVBWZX r/m8, r16","movzbw r/m8, r16","0F B6 /r","V","V","","operand16","w,r","Y","16"
> +"MOVD r/m32, mm1","MOVD mm1, r/m32","movd mm1, r/m32","0F 7E /r","V","V","MMX","operand16,operand32","w,r","",""
> +"MOVD mm1, r/m32","MOVD r/m32, mm1","movd r/m32, mm1","0F 6E /r","V","V","MMX","operand16,operand32","w,r","",""
> +"MOVD xmm1, r/m32","MOVD r/m32, xmm1","movd r/m32, xmm1","66 0F 6E /r","V","V","SSE2","operand16,operand32","w,r","",""
> +"MOVD r/m32, xmm1","MOVD xmm1, r/m32","movd xmm1, r/m32","66 0F 7E /r","V","V","SSE2","operand16,operand32","w,r","",""
> +"MOVDDUP xmm1, xmm2/m64","MOVDDUP xmm2/m64, xmm1","movddup xmm2/m64, xmm1","F2 0F 12 /r","V","V","SSE3","","w,r","",""
> +"MOVHLPS xmm1, xmm2","MOVHLPS xmm2, xmm1","movhlps xmm2, xmm1","0F 12 /r","V","V","SSE","modrm_regonly","w,r","",""
> +"MOVHPD xmm1, m64","MOVHPD m64, xmm1","movhpd m64, xmm1","66 0F 16 /r","V","V","SSE2","modrm_memonly","w,r","",""
> +"MOVHPD m64, xmm1","MOVHPD xmm1, m64","movhpd xmm1, m64","66 0F 17 /r","V","V","SSE2","modrm_memonly","w,r","",""
> +"MOVHPS xmm1, m64","MOVHPS m64, xmm1","movhps m64, xmm1","0F 16 /r","V","V","SSE","modrm_memonly","w,r","",""
> +"MOVHPS m64, xmm1","MOVHPS xmm1, m64","movhps xmm1, m64","0F 17 /r","V","V","SSE","modrm_memonly","w,r","",""
> +"MOV rmr32, CR0-CR7","MOVL CR0-CR7, rmr32","movl CR0-CR7, rmr32","0F 20 /r","V","N.S.","","","w,r","Y","32"
> +"MOV rmr32, DR0-DR7","MOVL DR0-DR7, rmr32","movl DR0-DR7, rmr32","0F 21 /r","V","N.S.","","","w,r","Y","32"
> +"MOV moffs32, EAX","MOVL EAX, moffs32","movl EAX, moffs32","A3 cm","V","V","","operand32","w,r","Y","32"
> +"MOV r/m32, imm32","MOVL imm32, r/m32","movl imm32, r/m32","C7 /0 id","V","V","","operand32","w,r","Y","32"
> +"MOV r32op, imm32u","MOVL imm32u, r32op","movl imm32u, r32op","B8+rd id","V","V","","operand32","w,r","Y","32"
> +"MOV EAX, moffs32","MOVL moffs32, EAX","movl moffs32, EAX","A1 cm","V","V","","operand32","w,r","Y","32"
> +"MOV r32, r/m32","MOVL r/m32, r32","movl r/m32, r32","8B /r","V","V","","operand32","w,r","Y","32"
> +"MOV r/m32, r32","MOVL r32, r/m32","movl r32, r/m32","89 /r","V","V","","operand32","w,r","Y","32"
> +"MOV CR0-CR7, rmr32","MOVL rmr32, CR0-CR7","movl rmr32, CR0-CR7","0F 22 /r","V","N.S.","","","w,r","Y","32"
> +"MOV DR0-DR7, rmr32","MOVL rmr32, DR0-DR7","movl rmr32, DR0-DR7","0F 23 /r","V","N.S.","","","w,r","Y","32"
> +"MOVLHPS xmm1, xmm2","MOVLHPS xmm2, xmm1","movlhps xmm2, xmm1","0F 16 /r","V","V","SSE","modrm_regonly","w,r","",""
> +"MOVLPD xmm1, m64","MOVLPD m64, xmm1","movlpd m64, xmm1","66 0F 12 /r","V","V","SSE2","modrm_memonly","w,r","",""
> +"MOVLPD m64, xmm1","MOVLPD xmm1, m64","movlpd xmm1, m64","66 0F 13 /r","V","V","SSE2","modrm_memonly","w,r","",""
> +"MOVLPS xmm1, m64","MOVLPS m64, xmm1","movlps m64, xmm1","0F 12 /r","V","V","SSE","modrm_memonly","w,r","",""
> +"MOVLPS m64, xmm1","MOVLPS xmm1, m64","movlps xmm1, m64","0F 13 /r","V","V","SSE","modrm_memonly","w,r","",""
> +"MOVSXD r32, r/m32","MOVLQSX r/m32, r32","movsxdl r/m32, r32","63 /r","N.S.","V","","operand32","w,r","Y","32"
> +"MOVSXD r64, r/m32","MOVLQSX r/m32, r64","movslq r/m32, r64","REX.W 63 /r","N.S.","V","","","w,r","Y","64"
> +"MOVMSKPD r32, xmm2","MOVMSKPD xmm2, r32","movmskpd xmm2, r32","66 0F 50 /r","V","V","SSE2","modrm_regonly","w,r","",""
> +"MOVMSKPS r32, xmm2","MOVMSKPS xmm2, r32","movmskps xmm2, r32","0F 50 /r","V","V","SSE","modrm_regonly","w,r","",""
> +"MOVNTDQA xmm1, m128","MOVNTDQA m128, xmm1","movntdqa m128, xmm1","66 0F 38 2A /r","V","V","SSE4_1","modrm_memonly","w,r","",""
> +"MOVNTI m32, r32","MOVNTIL r32, m32","movntil r32, m32","0F C3 /r","V","V","SSE2","modrm_memonly,operand16,operand32","w,r","Y","32"
> +"MOVNTI m64, r64","MOVNTIQ r64, m64","movntiq r64, m64","REX.W 0F C3 /r","N.S.","V","SSE2","modrm_memonly","w,r","Y","64"
> +"MOVNTDQ m128, xmm1","MOVNTO xmm1, m128","movntdq xmm1, m128","66 0F E7 /r","V","V","SSE2","modrm_memonly","w,r","",""
> +"MOVNTPD m128, xmm1","MOVNTPD xmm1, m128","movntpd xmm1, m128","66 0F 2B /r","V","V","SSE2","modrm_memonly","w,r","",""
> +"MOVNTPS m128, xmm1","MOVNTPS xmm1, m128","movntps xmm1, m128","0F 2B /r","V","V","SSE","modrm_memonly","w,r","",""
> +"MOVNTQ m64, mm1","MOVNTQ mm1, m64","movntq mm1, m64","0F E7 /r","V","V","MMX","modrm_memonly","w,r","",""
> +"MOVNTSD m64, xmm1","MOVNTSD xmm1, m64","movntsd xmm1, m64","F2 0F 2B /r","V","V","SSE4a","amd,modrm_memonly","w,r","",""
> +"MOVNTSS m32, xmm1","MOVNTSS xmm1, m32","movntss xmm1, m32","F3 0F 2B /r","V","V","SSE4a","amd,modrm_memonly","w,r","",""
> +"MOVDQA xmm2/m128, xmm1","MOVO xmm1, xmm2/m128","movdqa xmm1, xmm2/m128","66 0F 7F /r","V","V","SSE2","","w,r","",""
> +"MOVDQA xmm1, xmm2/m128","MOVO xmm2/m128, xmm1","movdqa xmm2/m128, xmm1","66 0F 6F /r","V","V","SSE2","","w,r","",""
> +"MOVDQU xmm2/m128, xmm1","MOVOU xmm1, xmm2/m128","movdqu xmm1, xmm2/m128","F3 0F 7F /r","V","V","SSE2","","w,r","",""
> +"MOVDQU xmm1, xmm2/m128","MOVOU xmm2/m128, xmm1","movdqu xmm2/m128, xmm1","F3 0F 6F /r","V","V","SSE2","","w,r","",""
> +"MOV rmr64, CR0-CR7","MOVQ CR0-CR7, rmr64","movq CR0-CR7, rmr64","0F 20 /r","N.S.","V","","default64","w,r","Y","64"
> +"MOV rmr64, CR8","MOVQ CR8, rmr64","movq CR8, rmr64","REX.R + 0F 20 /0","N.E.","V","","modrm_regonly,pseudo","w,r","Y","64"
> +"MOV rmr64, DR0-DR7","MOVQ DR0-DR7, rmr64","movq DR0-DR7, rmr64","0F 21 /r","N.S.","V","","default64","w,r","Y","64"
> +"MOV moffs64, RAX","MOVQ RAX, moffs64","movabsq RAX, moffs64","REX.W A3 cm","N.S.","V","","","w,r","Y","64"
> +"MOV r/m64, imm32","MOVQ imm32, r/m64","movq imm32, r/m64","REX.W C7 /0 id","N.S.","V","","","w,r","Y","64"
> +"MOV r64op, imm64u","MOVQ imm64u, r64op","movq imm64u, r64op","REX.W B8+ro io","N.S.","V","","","w,r","Y","64"
> +"MOVQ mm2/m64, mm1","MOVQ mm1, mm2/m64","movq mm1, mm2/m64","0F 7F /r","V","V","MMX","","w,r","",""
> +"MOVQ r/m64, mm1","MOVQ mm1, r/m64","movq mm1, r/m64","REX.W 0F 7E /r","N.S.","V","MMX","","w,r","",""
> +"MOVQ mm1, mm2/m64","MOVQ mm2/m64, mm1","movq mm2/m64, mm1","0F 6F /r","V","V","MMX","","w,r","",""
> +"MOV RAX, moffs64","MOVQ moffs64, RAX","movabsq moffs64, RAX","REX.W A1 cm","N.S.","V","","","w,r","Y","64"
> +"MOVQ mm1, r/m64","MOVQ r/m64, mm1","movq r/m64, mm1","REX.W 0F 6E /r","N.S.","V","MMX","","w,r","",""
> +"MOV r64, r/m64","MOVQ r/m64, r64","movq r/m64, r64","REX.W 8B /r","N.S.","V","","","w,r","Y","64"
> +"MOVQ xmm1, r/m64","MOVQ r/m64, xmm1","movq r/m64, xmm1","66 REX.W 0F 6E /r","N.S.","V","SSE2","","w,r","",""
> +"MOV r/m64, r64","MOVQ r64, r/m64","movq r64, r/m64","REX.W 89 /r","N.S.","V","","","w,r","Y","64"
> +"MOV CR0-CR7, rmr64","MOVQ rmr64, CR0-CR7","movq rmr64, CR0-CR7","0F 22 /r","N.S.","V","","default64","w,r","Y","64"
> +"MOV CR8, rmr64","MOVQ rmr64, CR8","movq rmr64, CR8","REX.R + 0F 22 /0","N.E.","V","","modrm_regonly,pseudo","w,r","Y","64"
> +"MOV DR0-DR7, rmr64","MOVQ rmr64, DR0-DR7","movq rmr64, DR0-DR7","0F 23 /r","N.S.","V","","default64","w,r","Y","64"
> +"MOVQ r/m64, xmm1","MOVQ xmm1, r/m64","movq xmm1, r/m64","66 REX.W 0F 7E /r","N.S.","V","SSE2","","w,r","",""
> +"MOVQ xmm2/m64, xmm1","MOVQ xmm1, xmm2/m64","movq xmm1, xmm2/m64","66 0F D6 /r","V","V","SSE2","","w,r","",""
> +"MOVDQ2Q mm1, xmm2","MOVQ xmm2, mm1","movdq2q xmm2, mm1","F2 0F D6 /r","V","V","SSE2","modrm_regonly","w,r","",""
> +"MOVQ xmm1, xmm2/m64","MOVQ xmm2/m64, xmm1","movq xmm2/m64, xmm1","F3 0F 7E /r","V","V","SSE2","","w,r","",""
> +"MOVQ2DQ xmm1, mm2","MOVQOZX mm2, xmm1","movq2dq mm2, xmm1","F3 0F D6 /r","V","V","SSE2","modrm_regonly","w,r","",""
> +"MOVSB","MOVSB","movsb","A4","V","V","","","","",""
> +"MOVSD xmm2/m64, xmm1","MOVSD xmm1, xmm2/m64","movsd xmm1, xmm2/m64","F2 0F 11 /r","V","V","SSE2","","w,r","",""
> +"MOVSD xmm1, xmm2/m64","MOVSD xmm2/m64, xmm1","movsd xmm2/m64, xmm1","F2 0F 10 /r","V","V","SSE2","","w,r","",""
> +"MOVSHDUP xmm1, xmm2/m128","MOVSHDUP xmm2/m128, xmm1","movshdup xmm2/m128, xmm1","F3 0F 16 /r","V","V","SSE3","","w,r","",""
> +"MOVSD","MOVSL","movsl","A5","V","V","","operand32","","",""
> +"MOVSLDUP xmm1, xmm2/m128","MOVSLDUP xmm2/m128, xmm1","movsldup xmm2/m128, xmm1","F3 0F 12 /r","V","V","SSE3","","w,r","",""
> +"MOVSQ","MOVSQ","movsq","REX.W A5","N.S.","V","","","","",""
> +"MOVSS xmm2/m32, xmm1","MOVSS xmm1, xmm2/m32","movss xmm1, xmm2/m32","F3 0F 11 /r","V","V","SSE","","w,r","",""
> +"MOVSS xmm1, xmm2/m32","MOVSS xmm2/m32, xmm1","movss xmm2/m32, xmm1","F3 0F 10 /r","V","V","SSE","","w,r","",""
> +"MOVSW","MOVSW","movsw","A5","V","V","","operand16","","",""
> +"MOVSX r16, r/m16","MOVSWW r/m16, r16","movsww r/m16, r16","0F BF /r","V","V","","operand16","w,r","Y","16"
> +"MOVUPD xmm2/m128, xmm1","MOVUPD xmm1, xmm2/m128","movupd xmm1, xmm2/m128","66 0F 11 /r","V","V","SSE2","","w,r","",""
> +"MOVUPD xmm1, xmm2/m128","MOVUPD xmm2/m128, xmm1","movupd xmm2/m128, xmm1","66 0F 10 /r","V","V","SSE2","","w,r","",""
> +"MOVUPS xmm2/m128, xmm1","MOVUPS xmm1, xmm2/m128","movups xmm1, xmm2/m128","0F 11 /r","V","V","SSE","","w,r","",""
> +"MOVUPS xmm1, xmm2/m128","MOVUPS xmm2/m128, xmm1","movups xmm2/m128, xmm1","0F 10 /r","V","V","SSE","","w,r","",""
> +"MOV moffs16, AX","MOVW AX, moffs16","movw AX, moffs16","A3 cm","V","V","","operand16","w,r","Y","16"
> +"MOV r/m16, Sreg","MOVW Sreg, r/m16","movw Sreg, r/m16","8C /r","V","V","","operand16","w,r","Y","16"
> +"MOV r/m16, imm16","MOVW imm16, r/m16","movw imm16, r/m16","C7 /0 iw","V","V","","operand16","w,r","Y","16"
> +"MOV r16op, imm16u","MOVW imm16u, r16op","movw imm16u, r16op","B8+rw iw","V","V","","operand16","w,r","Y","16"
> +"MOV AX, moffs16","MOVW moffs16, AX","movw moffs16, AX","A1 cm","V","V","","operand16","w,r","Y","16"
> +"MOV Sreg, r/m16","MOVW r/m16, Sreg","movw r/m16, Sreg","8E /r","V","V","","","w,r","Y","16"
> +"MOV r16, r/m16","MOVW r/m16, r16","movw r/m16, r16","8B /r","V","V","","operand16","w,r","Y","16"
> +"MOV r/m16, r16","MOVW r16, r/m16","movw r16, r/m16","89 /r","V","V","","operand16","w,r","Y","16"
> +"MOVSX r32, r/m16","MOVWLSX r/m16, r32","movswl r/m16, r32","0F BF /r","V","V","","operand32","w,r","Y","32"
> +"MOVZX r32, r/m16","MOVWLZX r/m16, r32","movzwl r/m16, r32","0F B7 /r","V","V","","operand32","w,r","Y","32"
> +"MOVSX r64, r/m16","MOVWQSX r/m16, r64","movswq r/m16, r64","REX.W 0F BF /r","N.S.","V","","","w,r","Y","64"
> +"MOVSXD r16, r/m32","MOVWQSX r/m32, r16","movsxdw r/m32, r16","63 /r","N.S.","V","","operand16","w,r","Y","16"
> +"MOVZX r64, r/m16","MOVWQZX r/m16, r64","movzwq r/m16, r64","REX.W 0F B7 /r","N.S.","V","","","w,r","Y","64"
> +"MOVZX r16, r/m16","MOVZWW r/m16, r16","movzww r/m16, r16","0F B7 /r","V","V","","operand16","w,r","Y","16"
> +"MOV r32/m16, Sreg","MOV{L/W} Sreg, r32/m16","mov{l/w} Sreg, r32/m16","8C /r","V","V","","operand32","w,r","Y",""
> +"MOV r64/m16, Sreg","MOV{Q/W} Sreg, r64/m16","mov{q/w} Sreg, r64/m16","REX.W 8C /r","N.S.","V","","","w,r","Y",""
> +"MPSADBW xmm1, xmm2/m128, imm8u","MPSADBW imm8u, xmm2/m128, xmm1","mpsadbw imm8u, xmm2/m128, xmm1","66 0F 3A 42 /r ib","V","V","SSE4_1","","rw,r,r","",""
> +"MUL r/m8","MULB r/m8","mulb r/m8","F6 /4","V","V","","","r","Y","8"
> +"MUL r/m8","MULB r/m8","mulb r/m8","REX F6 /4","N.E.","V","","pseudo64","r","Y","8"
> +"MUL r/m32","MULL r/m32","mull r/m32","F7 /4","V","V","","operand32","r","Y","32"
> +"MULPD xmm1, xmm2/m128","MULPD xmm2/m128, xmm1","mulpd xmm2/m128, xmm1","66 0F 59 /r","V","V","SSE2","","rw,r","",""
> +"MULPS xmm1, xmm2/m128","MULPS xmm2/m128, xmm1","mulps xmm2/m128, xmm1","0F 59 /r","V","V","SSE","","rw,r","",""
> +"MUL r/m64","MULQ r/m64","mulq r/m64","REX.W F7 /4","N.S.","V","","","r","Y","64"
> +"MULSD xmm1, xmm2/m64","MULSD xmm2/m64, xmm1","mulsd xmm2/m64, xmm1","F2 0F 59 /r","V","V","SSE2","","rw,r","",""
> +"MULSS xmm1, xmm2/m32","MULSS xmm2/m32, xmm1","mulss xmm2/m32, xmm1","F3 0F 59 /r","V","V","SSE","","rw,r","",""
> +"MUL r/m16","MULW r/m16","mulw r/m16","F7 /4","V","V","","operand16","r","Y","16"
> +"MULX r32, r32V, r/m32","MULXL r/m32, r32V, r32","mulxl r/m32, r32V, r32","VEX.NDD.128.F2.0F38.W0 F6 /r","V","V","BMI2","","w,w,r","Y","32"
> +"MULX r64, r64V, r/m64","MULXQ r/m64, r64V, r64","mulxq r/m64, r64V, r64","VEX.NDD.128.F2.0F38.W1 F6 /r","N.S.","V","BMI2","","w,w,r","Y","64"
> +"MWAIT","MWAIT","mwait","0F 01 C9","V","V","MONITOR","","","",""
> +"NEG r/m8","NEGB r/m8","negb r/m8","F6 /3","V","V","","","rw","Y","8"
> +"NEG r/m8","NEGB r/m8","negb r/m8","REX F6 /3","N.E.","V","","pseudo64","rw","Y","8"
> +"NEG r/m32","NEGL r/m32","negl r/m32","F7 /3","V","V","","operand32","rw","Y","32"
> +"NEG r/m64","NEGQ r/m64","negq r/m64","REX.W F7 /3","N.S.","V","","","rw","Y","64"
> +"NEG r/m16","NEGW r/m16","negw r/m16","F7 /3","V","V","","operand16","rw","Y","16"
> +"NOP","NOP","nop","90","V","V","","pseudo","","Y",""
> +"NOP","NOP","nop","90+rd","V","V","","operand32,operand64","","Y",""
> +"NOP","NOP","nop","90+rw","V","V","","operand16,operand64","","Y",""
> +"NOP","NOP","nop","F3 90+rd","V","V","","operand32","","Y",""
> +"NOP","NOP","nop","F3 90+rw","V","V","","operand16","","Y",""
> +"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /4","V","V","","operand32","r","Y","32"
> +"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /5","V","V","","operand32","r","Y","32"
> +"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /6","V","V","","operand32","r","Y","32"
> +"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /7","V","V","","operand32","r","Y","32"
> +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 19 /r","V","V","","operand32","r,r","Y","32"
> +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1A /r","V","V","","operand32","r,r","Y","32"
> +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1B /r","V","V","","operand32","r,r","Y","32"
> +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1C /r","V","V","","operand32","r,r","Y","32"
> +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1D /r","V","V","","operand32","r,r","Y","32"
> +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1E /r","V","V","PPRO","operand32","r,r","Y","32"
> +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1E /r","V","V","","operand32","r,r","Y","32"
> +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1F /r","V","V","","operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 0D /r","V","V","PRFCHW","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 1A /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 1B /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","66 0F 1E /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F2 0F 1E /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1B /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /0","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /1","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /2","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /3","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /4","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /5","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /6","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E F8","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E F9","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FA","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FB","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FC","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FD","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FE","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FF","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
> +"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /0","V","V","","modrm_regonly,operand32","r","Y","32"
> +"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /1","V","V","","modrm_regonly,operand32","r","Y","32"
> +"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /2","V","V","","modrm_regonly,operand32","r","Y","32"
> +"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /3","V","V","","modrm_regonly,operand32","r","Y","32"
> +"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /4","N.S.","V","","","r","Y","64"
> +"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /5","N.S.","V","","","r","Y","64"
> +"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /6","N.S.","V","","","r","Y","64"
> +"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /7","N.S.","V","","","r","Y","64"
> +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 19 /r","N.S.","V","","","r,r","Y","64"
> +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1A /r","N.S.","V","","","r,r","Y","64"
> +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1B /r","N.S.","V","","","r,r","Y","64"
> +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1C /r","N.S.","V","","","r,r","Y","64"
> +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1D /r","N.S.","V","","","r,r","Y","64"
> +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1E /r","N.S.","V","","","r,r","Y","64"
> +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1E /r","N.S.","V","PPRO","","r,r","Y","64"
> +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1F /r","N.S.","V","","","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","66 REX.W 0F 1E /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F2 REX.W 0F 1E /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1B /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /0","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /1","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /2","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /3","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /4","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /5","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /6","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E F8","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E F9","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FA","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FB","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FC","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FD","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FE","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FF","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 0D /r","N.S.","V","PRFCHW","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 1A /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 1B /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
> +"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /0","N.S.","V","","modrm_regonly","r","Y","64"
> +"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /1","N.S.","V","","modrm_regonly","r","Y","64"
> +"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /2","N.S.","V","","modrm_regonly","r","Y","64"
> +"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /3","N.S.","V","","modrm_regonly","r","Y","64"
> +"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /4","V","V","","operand16","r","Y","16"
> +"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /5","V","V","","operand16","r","Y","16"
> +"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /6","V","V","","operand16","r","Y","16"
> +"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /7","V","V","","operand16","r","Y","16"
> +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 19 /r","V","V","","operand16","r,r","Y","16"
> +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1A /r","V","V","","operand16","r,r","Y","16"
> +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1B /r","V","V","","operand16","r,r","Y","16"
> +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1C /r","V","V","","operand16","r,r","Y","16"
> +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1D /r","V","V","","operand16","r,r","Y","16"
> +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1E /r","V","V","","operand16","r,r","Y","16"
> +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1E /r","V","V","PPRO","operand16","r,r","Y","16"
> +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1F /r","V","V","","operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 0D /r","V","V","PRFCHW","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 1A /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 1B /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","66 0F 1E /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F2 0F 1E /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1B /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /0","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /1","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /2","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /3","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /4","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /5","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /6","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E F8","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E F9","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FA","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FB","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FC","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FD","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FE","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FF","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
> +"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /0","V","V","","modrm_regonly,operand16","r","Y","16"
> +"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /1","V","V","","modrm_regonly,operand16","r","Y","16"
> +"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /2","V","V","","modrm_regonly,operand16","r","Y","16"
> +"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /3","V","V","","modrm_regonly,operand16","r","Y","16"
> +"NOT r/m8","NOTB r/m8","notb r/m8","F6 /2","V","V","","","rw","Y","8"
> +"NOT r/m8","NOTB r/m8","notb r/m8","REX F6 /2","N.E.","V","","pseudo64","rw","Y","8"
> +"NOT r/m32","NOTL r/m32","notl r/m32","F7 /2","V","V","","operand32","rw","Y","32"
> +"NOT r/m64","NOTQ r/m64","notq r/m64","REX.W F7 /2","N.S.","V","","","rw","Y","64"
> +"NOT r/m16","NOTW r/m16","notw r/m16","F7 /2","V","V","","operand16","rw","Y","16"
> +"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","80 /1 ib","V","V","","","rw,r","Y","8"
> +"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","82 /1 ib","V","N.S.","","","rw,r","Y","8"
> +"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","REX 80 /1 ib","N.E.","V","","pseudo64","rw,r","Y","8"
> +"OR AL, imm8u","ORB imm8u, AL","orb imm8u, AL","0C ib","V","V","","","rw,r","Y","8"
> +"OR r8, r/m8","ORB r/m8, r8","orb r/m8, r8","0A /r","V","V","","","rw,r","Y","8"
> +"OR r8, r/m8","ORB r/m8, r8","orb r/m8, r8","REX 0A /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"OR r/m8, r8","ORB r8, r/m8","orb r8, r/m8","08 /r","V","V","","","rw,r","Y","8"
> +"OR r/m8, r8","ORB r8, r/m8","orb r8, r/m8","REX 08 /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"OR EAX, imm32","ORL imm32, EAX","orl imm32, EAX","0D id","V","V","","operand32","rw,r","Y","32"
> +"OR r/m32, imm32","ORL imm32, r/m32","orl imm32, r/m32","81 /1 id","V","V","","operand32","rw,r","Y","32"
> +"OR r/m32, imm8","ORL imm8, r/m32","orl imm8, r/m32","83 /1 ib","V","V","","operand32","rw,r","Y","32"
> +"OR r32, r/m32","ORL r/m32, r32","orl r/m32, r32","0B /r","V","V","","operand32","rw,r","Y","32"
> +"OR r/m32, r32","ORL r32, r/m32","orl r32, r/m32","09 /r","V","V","","operand32","rw,r","Y","32"
> +"ORPD xmm1, xmm2/m128","ORPD xmm2/m128, xmm1","orpd xmm2/m128, xmm1","66 0F 56 /r","V","V","SSE2","","rw,r","",""
> +"ORPS xmm1, xmm2/m128","ORPS xmm2/m128, xmm1","orps xmm2/m128, xmm1","0F 56 /r","V","V","SSE","","rw,r","",""
> +"OR RAX, imm32","ORQ imm32, RAX","orq imm32, RAX","REX.W 0D id","N.S.","V","","","rw,r","Y","64"
> +"OR r/m64, imm32","ORQ imm32, r/m64","orq imm32, r/m64","REX.W 81 /1 id","N.S.","V","","","rw,r","Y","64"
> +"OR r/m64, imm8","ORQ imm8, r/m64","orq imm8, r/m64","REX.W 83 /1 ib","N.S.","V","","","rw,r","Y","64"
> +"OR r64, r/m64","ORQ r/m64, r64","orq r/m64, r64","REX.W 0B /r","N.S.","V","","","rw,r","Y","64"
> +"OR r/m64, r64","ORQ r64, r/m64","orq r64, r/m64","REX.W 09 /r","N.S.","V","","","rw,r","Y","64"
> +"OR AX, imm16","ORW imm16, AX","orw imm16, AX","0D iw","V","V","","operand16","rw,r","Y","16"
> +"OR r/m16, imm16","ORW imm16, r/m16","orw imm16, r/m16","81 /1 iw","V","V","","operand16","rw,r","Y","16"
> +"OR r/m16, imm8","ORW imm8, r/m16","orw imm8, r/m16","83 /1 ib","V","V","","operand16","rw,r","Y","16"
> +"OR r16, r/m16","ORW r/m16, r16","orw r/m16, r16","0B /r","V","V","","operand16","rw,r","Y","16"
> +"OR r/m16, r16","ORW r16, r/m16","orw r16, r/m16","09 /r","V","V","","operand16","rw,r","Y","16"
> +"OUT DX, AL","OUTB AL, DX","outb AL, DX","EE","V","V","","","r,r","Y","8"
> +"OUT imm8u, AL","OUTB AL, imm8u","outb AL, imm8u","E6 ib","V","V","","","r,r","Y","8"
> +"OUT DX, EAX","OUTL EAX, DX","outl EAX, DX","EF","V","V","","operand32,operand64","r,r","Y","32"
> +"OUT imm8u, EAX","OUTL EAX, imm8u","outl EAX, imm8u","E7 ib","V","V","","operand32,operand64","r,r","Y","32"
> +"OUTSB","OUTSB","outsb","6E","V","V","","","","",""
> +"OUTSD","OUTSL","outsl","6F","V","V","","operand32,operand64","","",""
> +"OUTSW","OUTSW","outsw","6F","V","V","","operand16","","",""
> +"OUT DX, AX","OUTW AX, DX","outw AX, DX","EF","V","V","","operand16","r,r","Y","16"
> +"OUT imm8u, AX","OUTW AX, imm8u","outw AX, imm8u","E7 ib","V","V","","operand16","r,r","Y","16"
> +"PABSB mm1, mm2/m64","PABSB mm2/m64, mm1","pabsb mm2/m64, mm1","0F 38 1C /r","V","V","SSSE3","","w,r","",""
> +"PABSB xmm1, xmm2/m128","PABSB xmm2/m128, xmm1","pabsb xmm2/m128, xmm1","66 0F 38 1C /r","V","V","SSSE3","","w,r","",""
> +"PABSD mm1, mm2/m64","PABSD mm2/m64, mm1","pabsd mm2/m64, mm1","0F 38 1E /r","V","V","SSSE3","","w,r","",""
> +"PABSD xmm1, xmm2/m128","PABSD xmm2/m128, xmm1","pabsd xmm2/m128, xmm1","66 0F 38 1E /r","V","V","SSSE3","","w,r","",""
> +"PABSW mm1, mm2/m64","PABSW mm2/m64, mm1","pabsw mm2/m64, mm1","0F 38 1D /r","V","V","SSSE3","","w,r","",""
> +"PABSW xmm1, xmm2/m128","PABSW xmm2/m128, xmm1","pabsw xmm2/m128, xmm1","66 0F 38 1D /r","V","V","SSSE3","","w,r","",""
> +"PACKSSDW mm1, mm2/m64","PACKSSLW mm2/m64, mm1","packssdw mm2/m64, mm1","0F 6B /r","V","V","MMX","","rw,r","",""
> +"PACKSSDW xmm1, xmm2/m128","PACKSSLW xmm2/m128, xmm1","packssdw xmm2/m128, xmm1","66 0F 6B /r","V","V","SSE2","","rw,r","",""
> +"PACKSSWB mm1, mm2/m64","PACKSSWB mm2/m64, mm1","packsswb mm2/m64, mm1","0F 63 /r","V","V","MMX","","rw,r","",""
> +"PACKSSWB xmm1, xmm2/m128","PACKSSWB xmm2/m128, xmm1","packsswb xmm2/m128, xmm1","66 0F 63 /r","V","V","SSE2","","rw,r","",""
> +"PACKUSDW xmm1, xmm2/m128","PACKUSDW xmm2/m128, xmm1","packusdw xmm2/m128, xmm1","66 0F 38 2B /r","V","V","SSE4_1","","rw,r","",""
> +"PACKUSWB mm1, mm2/m64","PACKUSWB mm2/m64, mm1","packuswb mm2/m64, mm1","0F 67 /r","V","V","MMX","","rw,r","",""
> +"PACKUSWB xmm1, xmm2/m128","PACKUSWB xmm2/m128, xmm1","packuswb xmm2/m128, xmm1","66 0F 67 /r","V","V","SSE2","","rw,r","",""
> +"PADDB mm1, mm2/m64","PADDB mm2/m64, mm1","paddb mm2/m64, mm1","0F FC /r","V","V","MMX","","rw,r","",""
> +"PADDB xmm1, xmm2/m128","PADDB xmm2/m128, xmm1","paddb xmm2/m128, xmm1","66 0F FC /r","V","V","SSE2","","rw,r","",""
> +"PADDD mm1, mm2/m64","PADDL mm2/m64, mm1","paddd mm2/m64, mm1","0F FE /r","V","V","MMX","","rw,r","",""
> +"PADDD xmm1, xmm2/m128","PADDL xmm2/m128, xmm1","paddd xmm2/m128, xmm1","66 0F FE /r","V","V","SSE2","","rw,r","",""
> +"PADDQ mm1, mm2/m64","PADDQ mm2/m64, mm1","paddq mm2/m64, mm1","0F D4 /r","V","V","SSE2","","rw,r","",""
> +"PADDQ xmm1, xmm2/m128","PADDQ xmm2/m128, xmm1","paddq xmm2/m128, xmm1","66 0F D4 /r","V","V","SSE2","","rw,r","",""
> +"PADDSB mm1, mm2/m64","PADDSB mm2/m64, mm1","paddsb mm2/m64, mm1","0F EC /r","V","V","MMX","","rw,r","",""
> +"PADDSB xmm1, xmm2/m128","PADDSB xmm2/m128, xmm1","paddsb xmm2/m128, xmm1","66 0F EC /r","V","V","SSE2","","rw,r","",""
> +"PADDSW mm1, mm2/m64","PADDSW mm2/m64, mm1","paddsw mm2/m64, mm1","0F ED /r","V","V","MMX","","rw,r","",""
> +"PADDSW xmm1, xmm2/m128","PADDSW xmm2/m128, xmm1","paddsw xmm2/m128, xmm1","66 0F ED /r","V","V","SSE2","","rw,r","",""
> +"PADDUSB mm1, mm2/m64","PADDUSB mm2/m64, mm1","paddusb mm2/m64, mm1","0F DC /r","V","V","MMX","","rw,r","",""
> +"PADDUSB xmm1, xmm2/m128","PADDUSB xmm2/m128, xmm1","paddusb xmm2/m128, xmm1","66 0F DC /r","V","V","SSE2","","rw,r","",""
> +"PADDUSW mm1, mm2/m64","PADDUSW mm2/m64, mm1","paddusw mm2/m64, mm1","0F DD /r","V","V","MMX","","rw,r","",""
> +"PADDUSW xmm1, xmm2/m128","PADDUSW xmm2/m128, xmm1","paddusw xmm2/m128, xmm1","66 0F DD /r","V","V","SSE2","","rw,r","",""
> +"PADDW mm1, mm2/m64","PADDW mm2/m64, mm1","paddw mm2/m64, mm1","0F FD /r","V","V","MMX","","rw,r","",""
> +"PADDW xmm1, xmm2/m128","PADDW xmm2/m128, xmm1","paddw xmm2/m128, xmm1","66 0F FD /r","V","V","SSE2","","rw,r","",""
> +"PALIGNR mm1, mm2/m64, imm8u","PALIGNR imm8u, mm2/m64, mm1","palignr imm8u, mm2/m64, mm1","0F 3A 0F /r ib","V","V","SSSE3","","rw,r,r","",""
> +"PALIGNR xmm1, xmm2/m128, imm8u","PALIGNR imm8u, xmm2/m128, xmm1","palignr imm8u, xmm2/m128, xmm1","66 0F 3A 0F /r ib","V","V","SSSE3","","rw,r,r","",""
> +"PAND mm1, mm2/m64","PAND mm2/m64, mm1","pand mm2/m64, mm1","0F DB /r","V","V","MMX","","rw,r","",""
> +"PAND xmm1, xmm2/m128","PAND xmm2/m128, xmm1","pand xmm2/m128, xmm1","66 0F DB /r","V","V","SSE2","","rw,r","",""
> +"PANDN mm1, mm2/m64","PANDN mm2/m64, mm1","pandn mm2/m64, mm1","0F DF /r","V","V","MMX","","rw,r","",""
> +"PANDN xmm1, xmm2/m128","PANDN xmm2/m128, xmm1","pandn xmm2/m128, xmm1","66 0F DF /r","V","V","SSE2","","rw,r","",""
> +"PAUSE","PAUSE","pause","F3 90","V","V","","pseudo","","",""
> +"PAUSE","PAUSE","pause","F3 90+rd","V","V","","operand32","","Y",""
> +"PAUSE","PAUSE","pause","F3 90+rw","V","V","","operand16,operand64","","Y",""
> +"PAVGB mm1, mm2/m64","PAVGB mm2/m64, mm1","pavgb mm2/m64, mm1","0F E0 /r","V","V","MMX","","rw,r","",""
> +"PAVGB xmm1, xmm2/m128","PAVGB xmm2/m128, xmm1","pavgb xmm2/m128, xmm1","66 0F E0 /r","V","V","SSE2","","rw,r","",""
> +"PAVGUSB mm1, mm2/m64","PAVGUSB mm2/m64, mm1","pavgusb mm2/m64, mm1","0F 0F BF /r","V","V","3DNOW","amd","rw,r","",""
> +"PAVGW mm1, mm2/m64","PAVGW mm2/m64, mm1","pavgw mm2/m64, mm1","0F E3 /r","V","V","MMX","","rw,r","",""
> +"PAVGW xmm1, xmm2/m128","PAVGW xmm2/m128, xmm1","pavgw xmm2/m128, xmm1","66 0F E3 /r","V","V","SSE2","","rw,r","",""
> +"PBLENDVB xmm1, xmm2/m128, <XMM0>","PBLENDVB <XMM0>, xmm2/m128, xmm1","pblendvb <XMM0>, xmm2/m128, xmm1","66 0F 38 10 /r","V","V","SSE4_1","","rw,r,r","",""
> +"PBLENDW xmm1, xmm2/m128, imm8u","PBLENDW imm8u, xmm2/m128, xmm1","pblendw imm8u, xmm2/m128, xmm1","66 0F 3A 0E /r ib","V","V","SSE4_1","","rw,r,r","",""
> +"PCLMULQDQ xmm1, xmm2/m128, imm8u","PCLMULQDQ imm8u, xmm2/m128, xmm1","pclmulqdq imm8u, xmm2/m128, xmm1","66 0F 3A 44 /r ib","V","V","PCLMULQDQ","","rw,r,r","",""
> +"PCMPEQB mm1, mm2/m64","PCMPEQB mm2/m64, mm1","pcmpeqb mm2/m64, mm1","0F 74 /r","V","V","MMX","","rw,r","",""
> +"PCMPEQB xmm1, xmm2/m128","PCMPEQB xmm2/m128, xmm1","pcmpeqb xmm2/m128, xmm1","66 0F 74 /r","V","V","SSE2","","rw,r","",""
> +"PCMPEQD mm1, mm2/m64","PCMPEQL mm2/m64, mm1","pcmpeqd mm2/m64, mm1","0F 76 /r","V","V","MMX","","rw,r","",""
> +"PCMPEQD xmm1, xmm2/m128","PCMPEQL xmm2/m128, xmm1","pcmpeqd xmm2/m128, xmm1","66 0F 76 /r","V","V","SSE2","","rw,r","",""
> +"PCMPEQQ xmm1, xmm2/m128","PCMPEQQ xmm2/m128, xmm1","pcmpeqq xmm2/m128, xmm1","66 0F 38 29 /r","V","V","SSE4_1","","rw,r","",""
> +"PCMPEQW mm1, mm2/m64","PCMPEQW mm2/m64, mm1","pcmpeqw mm2/m64, mm1","0F 75 /r","V","V","MMX","","rw,r","",""
> +"PCMPEQW xmm1, xmm2/m128","PCMPEQW xmm2/m128, xmm1","pcmpeqw xmm2/m128, xmm1","66 0F 75 /r","V","V","SSE2","","rw,r","",""
> +"PCMPESTRI xmm1, xmm2/m128, imm8u","PCMPESTRI imm8u, xmm2/m128, xmm1","pcmpestri imm8u, xmm2/m128, xmm1","66 0F 3A 61 /r ib","V","V","SSE4_2","","r,r,r","",""
> +"PCMPESTRM xmm1, xmm2/m128, imm8u","PCMPESTRM imm8u, xmm2/m128, xmm1","pcmpestrm imm8u, xmm2/m128, xmm1","66 0F 3A 60 /r ib","V","V","SSE4_2","","r,r,r","",""
> +"PCMPGTB mm1, mm2/m64","PCMPGTB mm2/m64, mm1","pcmpgtb mm2/m64, mm1","0F 64 /r","V","V","MMX","","rw,r","",""
> +"PCMPGTB xmm1, xmm2/m128","PCMPGTB xmm2/m128, xmm1","pcmpgtb xmm2/m128, xmm1","66 0F 64 /r","V","V","SSE2","","rw,r","",""
> +"PCMPGTD mm1, mm2/m64","PCMPGTL mm2/m64, mm1","pcmpgtd mm2/m64, mm1","0F 66 /r","V","V","MMX","","rw,r","",""
> +"PCMPGTD xmm1, xmm2/m128","PCMPGTL xmm2/m128, xmm1","pcmpgtd xmm2/m128, xmm1","66 0F 66 /r","V","V","SSE2","","rw,r","",""
> +"PCMPGTQ xmm1, xmm2/m128","PCMPGTQ xmm2/m128, xmm1","pcmpgtq xmm2/m128, xmm1","66 0F 38 37 /r","V","V","SSE4_2","","rw,r","",""
> +"PCMPGTW mm1, mm2/m64","PCMPGTW mm2/m64, mm1","pcmpgtw mm2/m64, mm1","0F 65 /r","V","V","MMX","","rw,r","",""
> +"PCMPGTW xmm1, xmm2/m128","PCMPGTW xmm2/m128, xmm1","pcmpgtw xmm2/m128, xmm1","66 0F 65 /r","V","V","SSE2","","rw,r","",""
> +"PCMPISTRI xmm1, xmm2/m128, imm8u","PCMPISTRI imm8u, xmm2/m128, xmm1","pcmpistri imm8u, xmm2/m128, xmm1","66 0F 3A 63 /r ib","V","V","SSE4_2","","r,r,r","",""
> +"PCMPISTRM xmm1, xmm2/m128, imm8u","PCMPISTRM imm8u, xmm2/m128, xmm1","pcmpistrm imm8u, xmm2/m128, xmm1","66 0F 3A 62 /r ib","V","V","SSE4_2","","r,r,r","",""
> +"PDEP r32, r32V, r/m32","PDEPL r/m32, r32V, r32","pdepl r/m32, r32V, r32","VEX.DDS.128.F2.0F38.W0 F5 /r","V","V","BMI2","","rw,r,r","Y","32"
> +"PDEP r64, r64V, r/m64","PDEPQ r/m64, r64V, r64","pdepq r/m64, r64V, r64","VEX.DDS.128.F2.0F38.W1 F5 /r","N.S.","V","BMI2","","rw,r,r","Y","64"
> +"PEXT r32, r32V, r/m32","PEXTL r/m32, r32V, r32","pextl r/m32, r32V, r32","VEX.DDS.128.F3.0F38.W0 F5 /r","V","V","BMI2","","rw,r,r","Y","32"
> +"PEXT r64, r64V, r/m64","PEXTQ r/m64, r64V, r64","pextq r/m64, r64V, r64","VEX.DDS.128.F3.0F38.W1 F5 /r","N.S.","V","BMI2","","rw,r,r","Y","64"
> +"PEXTRB r32/m8, xmm1, imm8u","PEXTRB imm8u, xmm1, r32/m8","pextrb imm8u, xmm1, r32/m8","66 0F 3A 14 /r ib","V","V","SSE4_1","","w,r,r","",""
> +"PEXTRD r/m32, xmm1, imm8u","PEXTRD imm8u, xmm1, r/m32","pextrd imm8u, xmm1, r/m32","66 0F 3A 16 /r ib","V","V","SSE4_1","operand16,operand32","w,r,r","",""
> +"PEXTRQ r/m64, xmm1, imm8u","PEXTRQ imm8u, xmm1, r/m64","pextrq imm8u, xmm1, r/m64","66 REX.W 0F 3A 16 /r ib","N.S.","V","SSE4_1","","w,r,r","",""
> +"PEXTRW r32, mm2, imm8u","PEXTRW imm8u, mm2, r32","pextrw imm8u, mm2, r32","0F C5 /r ib","V","V","MMX","modrm_regonly","w,r,r","",""
> +"PEXTRW r32/m16, xmm1, imm8u","PEXTRW imm8u, xmm1, r32/m16","pextrw imm8u, xmm1, r32/m16","66 0F 3A 15 /r ib","V","V","SSE4_1","","w,r,r","",""
> +"PEXTRW r32, xmm2, imm8u","PEXTRW imm8u, xmm2, r32","pextrw imm8u, xmm2, r32","66 0F C5 /r ib","V","V","SSE2","modrm_regonly","w,r,r","",""
> +"PF2ID mm1, mm2/m64","PF2ID mm2/m64, mm1","pf2id mm2/m64, mm1","0F 0F 1D /r","V","V","3DNOW","amd","rw,r","",""
> +"PF2IW mm1, mm2/m64","PF2IW mm2/m64, mm1","pf2iw mm2/m64, mm1","0F 0F 1C /r","V","V","3DNOW","amd","rw,r","",""
> +"PFACC mm1, mm2/m64","PFACC mm2/m64, mm1","pfacc mm2/m64, mm1","0F 0F AE /r","V","V","3DNOW","amd","rw,r","",""
> +"PFADD mm1, mm2/m64","PFADD mm2/m64, mm1","pfadd mm2/m64, mm1","0F 0F 9E /r","V","V","3DNOW","amd","rw,r","",""
> +"PFCMPEQ mm1, mm2/m64","PFCMPEQ mm2/m64, mm1","pfcmpeq mm2/m64, mm1","0F 0F B0 /r","V","V","3DNOW","amd","rw,r","",""
> +"PFCMPGE mm1, mm2/m64","PFCMPGE mm2/m64, mm1","pfcmpge mm2/m64, mm1","0F 0F 90 /r","V","V","3DNOW","amd","rw,r","",""
> +"PFCMPGT mm1, mm2/m64","PFCMPGT mm2/m64, mm1","pfcmpgt mm2/m64, mm1","0F 0F A0 /r","V","V","3DNOW","amd","rw,r","",""
> +"PFCPIT1 mm1, mm2/m64","PFCPIT1 mm2/m64, mm1","pfcpit1 mm2/m64, mm1","0F 0F A6 /r","V","V","3DNOW","amd","rw,r","",""
> +"PFMAX mm1, mm2/m64","PFMAX mm2/m64, mm1","pfmax mm2/m64, mm1","0F 0F A4 /r","V","V","3DNOW","amd","rw,r","",""
> +"PFMIN mm1, mm2/m64","PFMIN mm2/m64, mm1","pfmin mm2/m64, mm1","0F 0F 94 /r","V","V","3DNOW","amd","rw,r","",""
> +"PFMUL mm1, mm2/m64","PFMUL mm2/m64, mm1","pfmul mm2/m64, mm1","0F 0F B4 /r","V","V","3DNOW","amd","rw,r","",""
> +"PFNACC mm1, mm2/m64","PFNACC mm2/m64, mm1","pfnacc mm2/m64, mm1","0F 0F 8A /r","V","V","3DNOW","amd","rw,r","",""
> +"PFPNACC mm1, mm2/m64","PFPNACC mm2/m64, mm1","pfpnacc mm2/m64, mm1","0F 0F 8E /r","V","V","3DNOW","amd","rw,r","",""
> +"PFRCP mm1, mm2/m64","PFRCP mm2/m64, mm1","pfrcp mm2/m64, mm1","0F 0F 96 /r","V","V","3DNOW","amd","rw,r","",""
> +"PFRCPIT2 mm1, mm2/m64","PFRCPIT2 mm2/m64, mm1","pfrcpit2 mm2/m64, mm1","0F 0F B6 /r","V","V","3DNOW","amd","rw,r","",""
> +"PFRSQIT1 mm1, mm2/m64","PFRSQIT1 mm2/m64, mm1","pfrsqit1 mm2/m64, mm1","0F 0F A7 /r","V","V","3DNOW","amd","rw,r","",""
> +"PFSQRT mm1, mm2/m64","PFSQRT mm2/m64, mm1","pfsqrt mm2/m64, mm1","0F 0F 97 /r","V","V","3DNOW","amd","rw,r","",""
> +"PFSUB mm1, mm2/m64","PFSUB mm2/m64, mm1","pfsub mm2/m64, mm1","0F 0F 9A /r","V","V","3DNOW","amd","rw,r","",""
> +"PFSUBR mm1, mm2/m64","PFSUBR mm2/m64, mm1","pfsubr mm2/m64, mm1","0F 0F AA /r","V","V","3DNOW","amd","rw,r","",""
> +"PHADDD mm1, mm2/m64","PHADDD mm2/m64, mm1","phaddd mm2/m64, mm1","0F 38 02 /r","V","V","SSSE3","","rw,r","",""
> +"PHADDD xmm1, xmm2/m128","PHADDD xmm2/m128, xmm1","phaddd xmm2/m128, xmm1","66 0F 38 02 /r","V","V","SSSE3","","rw,r","",""
> +"PHADDSW mm1, mm2/m64","PHADDSW mm2/m64, mm1","phaddsw mm2/m64, mm1","0F 38 03 /r","V","V","SSSE3","","rw,r","",""
> +"PHADDSW xmm1, xmm2/m128","PHADDSW xmm2/m128, xmm1","phaddsw xmm2/m128, xmm1","66 0F 38 03 /r","V","V","SSSE3","","rw,r","",""
> +"PHADDW mm1, mm2/m64","PHADDW mm2/m64, mm1","phaddw mm2/m64, mm1","0F 38 01 /r","V","V","SSSE3","","rw,r","",""
> +"PHADDW xmm1, xmm2/m128","PHADDW xmm2/m128, xmm1","phaddw xmm2/m128, xmm1","66 0F 38 01 /r","V","V","SSSE3","","rw,r","",""
> +"PHMINPOSUW xmm1, xmm2/m128","PHMINPOSUW xmm2/m128, xmm1","phminposuw xmm2/m128, xmm1","66 0F 38 41 /r","V","V","SSE4_1","","w,r","",""
> +"PHSUBD mm1, mm2/m64","PHSUBD mm2/m64, mm1","phsubd mm2/m64, mm1","0F 38 06 /r","V","V","SSSE3","","rw,r","",""
> +"PHSUBD xmm1, xmm2/m128","PHSUBD xmm2/m128, xmm1","phsubd xmm2/m128, xmm1","66 0F 38 06 /r","V","V","SSSE3","","rw,r","",""
> +"PHSUBSW mm1, mm2/m64","PHSUBSW mm2/m64, mm1","phsubsw mm2/m64, mm1","0F 38 07 /r","V","V","SSSE3","","rw,r","",""
> +"PHSUBSW xmm1, xmm2/m128","PHSUBSW xmm2/m128, xmm1","phsubsw xmm2/m128, xmm1","66 0F 38 07 /r","V","V","SSSE3","","rw,r","",""
> +"PHSUBW mm1, mm2/m64","PHSUBW mm2/m64, mm1","phsubw mm2/m64, mm1","0F 38 05 /r","V","V","SSSE3","","rw,r","",""
> +"PHSUBW xmm1, xmm2/m128","PHSUBW xmm2/m128, xmm1","phsubw xmm2/m128, xmm1","66 0F 38 05 /r","V","V","SSSE3","","rw,r","",""
> +"PI2FD mm1, mm2/m64","PI2FD mm2/m64, mm1","pi2fd mm2/m64, mm1","0F 0F 0D /r","V","V","3DNOW","amd","rw,r","",""
> +"PI2FW mm1, mm2/m64","PI2FW mm2/m64, mm1","pi2fw mm2/m64, mm1","0F 0F 0C /r","V","V","3DNOW","amd","rw,r","",""
> +"PINSRB xmm1, r32/m8, imm8u","PINSRB imm8u, r32/m8, xmm1","pinsrb imm8u, r32/m8, xmm1","66 0F 3A 20 /r ib","V","V","SSE4_1","","rw,r,r","",""
> +"PINSRD xmm1, r/m32, imm8u","PINSRD imm8u, r/m32, xmm1","pinsrd imm8u, r/m32, xmm1","66 0F 3A 22 /r ib","V","V","SSE4_1","operand16,operand32","rw,r,r","",""
> +"PINSRQ xmm1, r/m64, imm8u","PINSRQ imm8u, r/m64, xmm1","pinsrq imm8u, r/m64, xmm1","66 REX.W 0F 3A 22 /r ib","N.S.","V","SSE4_1","","rw,r,r","",""
> +"PINSRW mm1, r32/m16, imm8u","PINSRW imm8u, r32/m16, mm1","pinsrw imm8u, r32/m16, mm1","0F C4 /r ib","V","V","MMX","","rw,r,r","",""
> +"PINSRW xmm1, r32/m16, imm8u","PINSRW imm8u, r32/m16, xmm1","pinsrw imm8u, r32/m16, xmm1","66 0F C4 /r ib","V","V","SSE2","","rw,r,r","",""
> +"PMADDUBSW mm1, mm2/m64","PMADDUBSW mm2/m64, mm1","pmaddubsw mm2/m64, mm1","0F 38 04 /r","V","V","SSSE3","","rw,r","",""
> +"PMADDUBSW xmm1, xmm2/m128","PMADDUBSW xmm2/m128, xmm1","pmaddubsw xmm2/m128, xmm1","66 0F 38 04 /r","V","V","SSSE3","","rw,r","",""
> +"PMADDWD mm1, mm2/m64","PMADDWL mm2/m64, mm1","pmaddwd mm2/m64, mm1","0F F5 /r","V","V","MMX","","rw,r","",""
> +"PMADDWD xmm1, xmm2/m128","PMADDWL xmm2/m128, xmm1","pmaddwd xmm2/m128, xmm1","66 0F F5 /r","V","V","SSE2","","rw,r","",""
> +"PMAXSB xmm1, xmm2/m128","PMAXSB xmm2/m128, xmm1","pmaxsb xmm2/m128, xmm1","66 0F 38 3C /r","V","V","SSE4_1","","rw,r","",""
> +"PMAXSD xmm1, xmm2/m128","PMAXSD xmm2/m128, xmm1","pmaxsd xmm2/m128, xmm1","66 0F 38 3D /r","V","V","SSE4_1","","rw,r","",""
> +"PMAXSW mm1, mm2/m64","PMAXSW mm2/m64, mm1","pmaxsw mm2/m64, mm1","0F EE /r","V","V","MMX","","rw,r","",""
> +"PMAXSW xmm1, xmm2/m128","PMAXSW xmm2/m128, xmm1","pmaxsw xmm2/m128, xmm1","66 0F EE /r","V","V","SSE2","","rw,r","",""
> +"PMAXUB mm1, mm2/m64","PMAXUB mm2/m64, mm1","pmaxub mm2/m64, mm1","0F DE /r","V","V","MMX","","rw,r","",""
> +"PMAXUB xmm1, xmm2/m128","PMAXUB xmm2/m128, xmm1","pmaxub xmm2/m128, xmm1","66 0F DE /r","V","V","SSE2","","rw,r","",""
> +"PMAXUD xmm1, xmm2/m128","PMAXUD xmm2/m128, xmm1","pmaxud xmm2/m128, xmm1","66 0F 38 3F /r","V","V","SSE4_1","","rw,r","",""
> +"PMAXUW xmm1, xmm2/m128","PMAXUW xmm2/m128, xmm1","pmaxuw xmm2/m128, xmm1","66 0F 38 3E /r","V","V","SSE4_1","","rw,r","",""
> +"PMINSB xmm1, xmm2/m128","PMINSB xmm2/m128, xmm1","pminsb xmm2/m128, xmm1","66 0F 38 38 /r","V","V","SSE4_1","","rw,r","",""
> +"PMINSD xmm1, xmm2/m128","PMINSD xmm2/m128, xmm1","pminsd xmm2/m128, xmm1","66 0F 38 39 /r","V","V","SSE4_1","","rw,r","",""
> +"PMINSW mm1, mm2/m64","PMINSW mm2/m64, mm1","pminsw mm2/m64, mm1","0F EA /r","V","V","MMX","","rw,r","",""
> +"PMINSW xmm1, xmm2/m128","PMINSW xmm2/m128, xmm1","pminsw xmm2/m128, xmm1","66 0F EA /r","V","V","SSE2","","rw,r","",""
> +"PMINUB mm1, mm2/m64","PMINUB mm2/m64, mm1","pminub mm2/m64, mm1","0F DA /r","V","V","MMX","","rw,r","",""
> +"PMINUB xmm1, xmm2/m128","PMINUB xmm2/m128, xmm1","pminub xmm2/m128, xmm1","66 0F DA /r","V","V","SSE2","","rw,r","",""
> +"PMINUD xmm1, xmm2/m128","PMINUD xmm2/m128, xmm1","pminud xmm2/m128, xmm1","66 0F 38 3B /r","V","V","SSE4_1","","rw,r","",""
> +"PMINUW xmm1, xmm2/m128","PMINUW xmm2/m128, xmm1","pminuw xmm2/m128, xmm1","66 0F 38 3A /r","V","V","SSE4_1","","rw,r","",""
> +"PMOVMSKB r32, mm2","PMOVMSKB mm2, r32","pmovmskb mm2, r32","0F D7 /r","V","V","SSE","modrm_regonly","w,r","",""
> +"PMOVMSKB r32, xmm2","PMOVMSKB xmm2, r32","pmovmskb xmm2, r32","66 0F D7 /r","V","V","SSE2","modrm_regonly","w,r","",""
> +"PMOVSXBD xmm1, xmm2/m32","PMOVSXBD xmm2/m32, xmm1","pmovsxbd xmm2/m32, xmm1","66 0F 38 21 /r","V","V","SSE4_1","","w,r","",""
> +"PMOVSXBQ xmm1, xmm2/m16","PMOVSXBQ xmm2/m16, xmm1","pmovsxbq xmm2/m16, xmm1","66 0F 38 22 /r","V","V","SSE4_1","","w,r","",""
> +"PMOVSXBW xmm1, xmm2/m64","PMOVSXBW xmm2/m64, xmm1","pmovsxbw xmm2/m64, xmm1","66 0F 38 20 /r","V","V","SSE4_1","","w,r","",""
> +"PMOVSXDQ xmm1, xmm2/m64","PMOVSXDQ xmm2/m64, xmm1","pmovsxdq xmm2/m64, xmm1","66 0F 38 25 /r","V","V","SSE4_1","","w,r","",""
> +"PMOVSXWD xmm1, xmm2/m64","PMOVSXWD xmm2/m64, xmm1","pmovsxwd xmm2/m64, xmm1","66 0F 38 23 /r","V","V","SSE4_1","","w,r","",""
> +"PMOVSXWQ xmm1, xmm2/m32","PMOVSXWQ xmm2/m32, xmm1","pmovsxwq xmm2/m32, xmm1","66 0F 38 24 /r","V","V","SSE4_1","","w,r","",""
> +"PMOVZXBD xmm1, xmm2/m32","PMOVZXBD xmm2/m32, xmm1","pmovzxbd xmm2/m32, xmm1","66 0F 38 31 /r","V","V","SSE4_1","","w,r","",""
> +"PMOVZXBQ xmm1, xmm2/m16","PMOVZXBQ xmm2/m16, xmm1","pmovzxbq xmm2/m16, xmm1","66 0F 38 32 /r","V","V","SSE4_1","","w,r","",""
> +"PMOVZXBW xmm1, xmm2/m64","PMOVZXBW xmm2/m64, xmm1","pmovzxbw xmm2/m64, xmm1","66 0F 38 30 /r","V","V","SSE4_1","","w,r","",""
> +"PMOVZXDQ xmm1, xmm2/m64","PMOVZXDQ xmm2/m64, xmm1","pmovzxdq xmm2/m64, xmm1","66 0F 38 35 /r","V","V","SSE4_1","","w,r","",""
> +"PMOVZXWD xmm1, xmm2/m64","PMOVZXWD xmm2/m64, xmm1","pmovzxwd xmm2/m64, xmm1","66 0F 38 33 /r","V","V","SSE4_1","","w,r","",""
> +"PMOVZXWQ xmm1, xmm2/m32","PMOVZXWQ xmm2/m32, xmm1","pmovzxwq xmm2/m32, xmm1","66 0F 38 34 /r","V","V","SSE4_1","","w,r","",""
> +"PMULDQ xmm1, xmm2/m128","PMULDQ xmm2/m128, xmm1","pmuldq xmm2/m128, xmm1","66 0F 38 28 /r","V","V","SSE4_1","","rw,r","",""
> +"PMULHRSW mm1, mm2/m64","PMULHRSW mm2/m64, mm1","pmulhrsw mm2/m64, mm1","0F 38 0B /r","V","V","SSSE3","","rw,r","",""
> +"PMULHRSW xmm1, xmm2/m128","PMULHRSW xmm2/m128, xmm1","pmulhrsw xmm2/m128, xmm1","66 0F 38 0B /r","V","V","SSSE3","","rw,r","",""
> +"PMULHRW mm1, mm2/m64","PMULHRW mm2/m64, mm1","pmulhrw mm2/m64, mm1","0F 0F B7 /r","V","V","3DNOW","amd","rw,r","",""
> +"PMULHUW mm1, mm2/m64","PMULHUW mm2/m64, mm1","pmulhuw mm2/m64, mm1","0F E4 /r","V","V","MMX","","rw,r","",""
> +"PMULHUW xmm1, xmm2/m128","PMULHUW xmm2/m128, xmm1","pmulhuw xmm2/m128, xmm1","66 0F E4 /r","V","V","SSE2","","rw,r","",""
> +"PMULHW mm1, mm2/m64","PMULHW mm2/m64, mm1","pmulhw mm2/m64, mm1","0F E5 /r","V","V","MMX","","rw,r","",""
> +"PMULHW xmm1, xmm2/m128","PMULHW xmm2/m128, xmm1","pmulhw xmm2/m128, xmm1","66 0F E5 /r","V","V","SSE2","","rw,r","",""
> +"PMULLD xmm1, xmm2/m128","PMULLD xmm2/m128, xmm1","pmulld xmm2/m128, xmm1","66 0F 38 40 /r","V","V","SSE4_1","","rw,r","",""
> +"PMULLW mm1, mm2/m64","PMULLW mm2/m64, mm1","pmullw mm2/m64, mm1","0F D5 /r","V","V","MMX","","rw,r","",""
> +"PMULLW xmm1, xmm2/m128","PMULLW xmm2/m128, xmm1","pmullw xmm2/m128, xmm1","66 0F D5 /r","V","V","SSE2","","rw,r","",""
> +"PMULUDQ mm1, mm2/m64","PMULULQ mm2/m64, mm1","pmuludq mm2/m64, mm1","0F F4 /r","V","V","SSE2","","rw,r","",""
> +"PMULUDQ xmm1, xmm2/m128","PMULULQ xmm2/m128, xmm1","pmuludq xmm2/m128, xmm1","66 0F F4 /r","V","V","SSE2","","rw,r","",""
> +"POPAD","POPAL","popal","61","V","N.S.","","operand32","","",""
> +"POPA","POPAW","popaw","61","V","N.S.","","operand16","","",""
> +"POPCNT r32, r/m32","POPCNTL r/m32, r32","popcntl r/m32, r32","F3 0F B8 /r","V","V","POPCNT","operand32","w,r","Y","32"
> +"POPCNT r64, r/m64","POPCNTQ r/m64, r64","popcntq r/m64, r64","F3 REX.W 0F B8 /r","N.S.","V","POPCNT","","w,r","Y","64"
> +"POPCNT r16, r/m16","POPCNTW r/m16, r16","popcntw r/m16, r16","F3 0F B8 /r","V","V","POPCNT","operand16","w,r","Y","16"
> +"POPFD","POPFL","popfl","9D","V","N.S.","","operand32","","",""
> +"POPFQ","POPFQ","popfq","9D","N.S.","V","","default64","","",""
> +"POPF","POPFW","popfw","9D","V","V","","operand16","","",""
> +"POP r/m32","POPL r/m32","popl r/m32","8F /0","V","N.S.","","operand32","w","Y","32"
> +"POP r32op","POPL r32op","popl r32op","58+rd","V","N.S.","","operand32","w","Y","32"
> +"POP r/m64","POPQ r/m64","popq r/m64","8F /0","N.S.","V","","default64","w","Y","64"
> +"POP r64op","POPQ r64op","popq r64op","58+ro","N.S.","V","","default64","w","Y","64"
> +"POP r/m16","POPW r/m16","popw r/m16","8F /0","V","V","","operand16","w","Y","16"
> +"POP r16op","POPW r16op","popw r16op","58+rw","V","V","","operand16","w","Y","16"
> +"POP DS","POPW/POPL/POPQ DS","popw/popl/popq DS","1F","V","N.S.","","","w","Y",""
> +"POP ES","POPW/POPL/POPQ ES","popw/popl/popq ES","07","V","N.S.","","","w","Y",""
> +"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","N.S.","V","","default64","w","Y",""
> +"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","V","N.S.","","operand32","w","Y",""
> +"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","V","V","","operand16","w","Y",""
> +"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","N.S.","V","","default64","w","Y",""
> +"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","V","V","","operand16","w","Y",""
> +"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","V","N.S.","","operand32","w","Y",""
> +"POP SS","POPW/POPL/POPQ SS","popw/popl/popq SS","17","V","N.S.","","","w","Y",""
> +"POR mm1, mm2/m64","POR mm2/m64, mm1","por mm2/m64, mm1","0F EB /r","V","V","MMX","","rw,r","",""
> +"POR xmm1, xmm2/m128","POR xmm2/m128, xmm1","por xmm2/m128, xmm1","66 0F EB /r","V","V","SSE2","","rw,r","",""
> +"PREFETCHNTA m8","PREFETCHNTA m8","prefetchnta m8","0F 18 /0","V","V","","modrm_memonly","r","",""
> +"PREFETCHT0 m8","PREFETCHT0 m8","prefetcht0 m8","0F 18 /1","V","V","","modrm_memonly","r","",""
> +"PREFETCHT1 m8","PREFETCHT1 m8","prefetcht1 m8","0F 18 /2","V","V","","modrm_memonly","r","",""
> +"PREFETCHT2 m8","PREFETCHT2 m8","prefetcht2 m8","0F 18 /3","V","V","","modrm_memonly","r","",""
> +"PREFETCHW m8","PREFETCHW m8","prefetchw m8","0F 0D /1","V","V","PRFCHW","modrm_memonly","r","",""
> +"PREFETCHWT1 m8","PREFETCHWT1 m8","prefetchwt1 m8","0F 0D /2","V","V","PREFETCHWT1","modrm_memonly","r","",""
> +"PREFETCHW_ALIAS m8","PREFETCHW_ALIAS m8","prefetchw_alias m8","0F 0D /3","V","V","PRFCHW","modrm_memonly","r","",""
> +"PREFETCH_EXCLUSIVE m8","PREFETCH_EXCLUSIVE m8","prefetch_exclusive m8","0F 0D /0","V","V","PRFCHW","modrm_memonly","r","",""
> +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /2","V","V","PRFCHW","modrm_memonly","r","Y",""
> +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /4","V","V","PRFCHW","modrm_memonly","r","Y",""
> +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /5","V","V","PRFCHW","modrm_memonly","r","Y",""
> +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /6","V","V","PRFCHW","modrm_memonly","r","Y",""
> +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /7","V","V","PRFCHW","modrm_memonly","r","Y",""
> +"PSADBW mm1, mm2/m64","PSADBW mm2/m64, mm1","psadbw mm2/m64, mm1","0F F6 /r","V","V","MMX","","rw,r","",""
> +"PSADBW xmm1, xmm2/m128","PSADBW xmm2/m128, xmm1","psadbw xmm2/m128, xmm1","66 0F F6 /r","V","V","SSE2","","rw,r","",""
> +"PSHUFB mm1, mm2/m64","PSHUFB mm2/m64, mm1","pshufb mm2/m64, mm1","0F 38 00 /r","V","V","SSSE3","","rw,r","",""
> +"PSHUFB xmm1, xmm2/m128","PSHUFB xmm2/m128, xmm1","pshufb xmm2/m128, xmm1","66 0F 38 00 /r","V","V","SSSE3","","rw,r","",""
> +"PSHUFD xmm1, xmm2/m128, imm8u","PSHUFD imm8u, xmm2/m128, xmm1","pshufd imm8u, xmm2/m128, xmm1","66 0F 70 /r ib","V","V","SSE2","","w,r,r","",""
> +"PSHUFHW xmm1, xmm2/m128, imm8u","PSHUFHW imm8u, xmm2/m128, xmm1","pshufhw imm8u, xmm2/m128, xmm1","F3 0F 70 /r ib","V","V","SSE2","","w,r,r","",""
> +"PSHUFLW xmm1, xmm2/m128, imm8u","PSHUFLW imm8u, xmm2/m128, xmm1","pshuflw imm8u, xmm2/m128, xmm1","F2 0F 70 /r ib","V","V","SSE2","","w,r,r","",""
> +"PSHUFW mm1, mm2/m64, imm8u","PSHUFW imm8u, mm2/m64, mm1","pshufw imm8u, mm2/m64, mm1","0F 70 /r ib","V","V","MMX","","w,r,r","",""
> +"PSIGNB mm1, mm2/m64","PSIGNB mm2/m64, mm1","psignb mm2/m64, mm1","0F 38 08 /r","V","V","SSSE3","","rw,r","",""
> +"PSIGNB xmm1, xmm2/m128","PSIGNB xmm2/m128, xmm1","psignb xmm2/m128, xmm1","66 0F 38 08 /r","V","V","SSSE3","","rw,r","",""
> +"PSIGND mm1, mm2/m64","PSIGND mm2/m64, mm1","psignd mm2/m64, mm1","0F 38 0A /r","V","V","SSSE3","","rw,r","",""
> +"PSIGND xmm1, xmm2/m128","PSIGND xmm2/m128, xmm1","psignd xmm2/m128, xmm1","66 0F 38 0A /r","V","V","SSSE3","","rw,r","",""
> +"PSIGNW mm1, mm2/m64","PSIGNW mm2/m64, mm1","psignw mm2/m64, mm1","0F 38 09 /r","V","V","SSSE3","","rw,r","",""
> +"PSIGNW xmm1, xmm2/m128","PSIGNW xmm2/m128, xmm1","psignw xmm2/m128, xmm1","66 0F 38 09 /r","V","V","SSSE3","","rw,r","",""
> +"PSLLD mm2, imm8u","PSLLL imm8u, mm2","pslld imm8u, mm2","0F 72 /6 ib","V","V","MMX","modrm_regonly","rw,r","",""
> +"PSLLD xmm2, imm8u","PSLLL imm8u, xmm2","pslld imm8u, xmm2","66 0F 72 /6 ib","V","V","SSE2","modrm_regonly","rw,r","",""
> +"PSLLD mm1, mm2/m64","PSLLL mm2/m64, mm1","pslld mm2/m64, mm1","0F F2 /r","V","V","MMX","","rw,r","",""
> +"PSLLD xmm1, xmm2/m128","PSLLL xmm2/m128, xmm1","pslld xmm2/m128, xmm1","66 0F F2 /r","V","V","SSE2","","rw,r","",""
> +"PSLLDQ xmm2, imm8u","PSLLO imm8u, xmm2","pslldq imm8u, xmm2","66 0F 73 /7 ib","V","V","SSE2","modrm_regonly","rw,r","",""
> +"PSLLQ mm2, imm8u","PSLLQ imm8u, mm2","psllq imm8u, mm2","0F 73 /6 ib","V","V","MMX","modrm_regonly","rw,r","",""
> +"PSLLQ xmm2, imm8u","PSLLQ imm8u, xmm2","psllq imm8u, xmm2","66 0F 73 /6 ib","V","V","SSE2","modrm_regonly","rw,r","",""
> +"PSLLQ mm1, mm2/m64","PSLLQ mm2/m64, mm1","psllq mm2/m64, mm1","0F F3 /r","V","V","MMX","","rw,r","",""
> +"PSLLQ xmm1, xmm2/m128","PSLLQ xmm2/m128, xmm1","psllq xmm2/m128, xmm1","66 0F F3 /r","V","V","SSE2","","rw,r","",""
> +"PSLLW mm2, imm8u","PSLLW imm8u, mm2","psllw imm8u, mm2","0F 71 /6 ib","V","V","MMX","modrm_regonly","rw,r","",""
> +"PSLLW xmm2, imm8u","PSLLW imm8u, xmm2","psllw imm8u, xmm2","66 0F 71 /6 ib","V","V","SSE2","modrm_regonly","rw,r","",""
> +"PSLLW mm1, mm2/m64","PSLLW mm2/m64, mm1","psllw mm2/m64, mm1","0F F1 /r","V","V","MMX","","rw,r","",""
> +"PSLLW xmm1, xmm2/m128","PSLLW xmm2/m128, xmm1","psllw xmm2/m128, xmm1","66 0F F1 /r","V","V","SSE2","","rw,r","",""
> +"PSRAD mm2, imm8u","PSRAL imm8u, mm2","psrad imm8u, mm2","0F 72 /4 ib","V","V","MMX","modrm_regonly","rw,r","",""
> +"PSRAD xmm2, imm8u","PSRAL imm8u, xmm2","psrad imm8u, xmm2","66 0F 72 /4 ib","V","V","SSE2","modrm_regonly","rw,r","",""
> +"PSRAD mm1, mm2/m64","PSRAL mm2/m64, mm1","psrad mm2/m64, mm1","0F E2 /r","V","V","MMX","","rw,r","",""
> +"PSRAD xmm1, xmm2/m128","PSRAL xmm2/m128, xmm1","psrad xmm2/m128, xmm1","66 0F E2 /r","V","V","SSE2","","rw,r","",""
> +"PSRAW mm2, imm8u","PSRAW imm8u, mm2","psraw imm8u, mm2","0F 71 /4 ib","V","V","MMX","modrm_regonly","rw,r","",""
> +"PSRAW xmm2, imm8u","PSRAW imm8u, xmm2","psraw imm8u, xmm2","66 0F 71 /4 ib","V","V","SSE2","modrm_regonly","rw,r","",""
> +"PSRAW mm1, mm2/m64","PSRAW mm2/m64, mm1","psraw mm2/m64, mm1","0F E1 /r","V","V","MMX","","rw,r","",""
> +"PSRAW xmm1, xmm2/m128","PSRAW xmm2/m128, xmm1","psraw xmm2/m128, xmm1","66 0F E1 /r","V","V","SSE2","","rw,r","",""
> +"PSRLD mm2, imm8u","PSRLL imm8u, mm2","psrld imm8u, mm2","0F 72 /2 ib","V","V","MMX","modrm_regonly","rw,r","",""
> +"PSRLD xmm2, imm8u","PSRLL imm8u, xmm2","psrld imm8u, xmm2","66 0F 72 /2 ib","V","V","SSE2","modrm_regonly","rw,r","",""
> +"PSRLD mm1, mm2/m64","PSRLL mm2/m64, mm1","psrld mm2/m64, mm1","0F D2 /r","V","V","MMX","","rw,r","",""
> +"PSRLD xmm1, xmm2/m128","PSRLL xmm2/m128, xmm1","psrld xmm2/m128, xmm1","66 0F D2 /r","V","V","SSE2","","rw,r","",""
> +"PSRLDQ xmm2, imm8u","PSRLO imm8u, xmm2","psrldq imm8u, xmm2","66 0F 73 /3 ib","V","V","SSE2","modrm_regonly","rw,r","",""
> +"PSRLQ mm2, imm8u","PSRLQ imm8u, mm2","psrlq imm8u, mm2","0F 73 /2 ib","V","V","MMX","modrm_regonly","rw,r","",""
> +"PSRLQ xmm2, imm8u","PSRLQ imm8u, xmm2","psrlq imm8u, xmm2","66 0F 73 /2 ib","V","V","SSE2","modrm_regonly","rw,r","",""
> +"PSRLQ mm1, mm2/m64","PSRLQ mm2/m64, mm1","psrlq mm2/m64, mm1","0F D3 /r","V","V","MMX","","rw,r","",""
> +"PSRLQ xmm1, xmm2/m128","PSRLQ xmm2/m128, xmm1","psrlq xmm2/m128, xmm1","66 0F D3 /r","V","V","SSE2","","rw,r","",""
> +"PSRLW mm2, imm8u","PSRLW imm8u, mm2","psrlw imm8u, mm2","0F 71 /2 ib","V","V","MMX","modrm_regonly","rw,r","",""
> +"PSRLW xmm2, imm8u","PSRLW imm8u, xmm2","psrlw imm8u, xmm2","66 0F 71 /2 ib","V","V","SSE2","modrm_regonly","rw,r","",""
> +"PSRLW mm1, mm2/m64","PSRLW mm2/m64, mm1","psrlw mm2/m64, mm1","0F D1 /r","V","V","MMX","","rw,r","",""
> +"PSRLW xmm1, xmm2/m128","PSRLW xmm2/m128, xmm1","psrlw xmm2/m128, xmm1","66 0F D1 /r","V","V","SSE2","","rw,r","",""
> +"PSUBB mm1, mm2/m64","PSUBB mm2/m64, mm1","psubb mm2/m64, mm1","0F F8 /r","V","V","MMX","","rw,r","",""
> +"PSUBB xmm1, xmm2/m128","PSUBB xmm2/m128, xmm1","psubb xmm2/m128, xmm1","66 0F F8 /r","V","V","SSE2","","rw,r","",""
> +"PSUBD mm1, mm2/m64","PSUBL mm2/m64, mm1","psubd mm2/m64, mm1","0F FA /r","V","V","MMX","","rw,r","",""
> +"PSUBD xmm1, xmm2/m128","PSUBL xmm2/m128, xmm1","psubd xmm2/m128, xmm1","66 0F FA /r","V","V","SSE2","","rw,r","",""
> +"PSUBQ mm1, mm2/m64","PSUBQ mm2/m64, mm1","psubq mm2/m64, mm1","0F FB /r","V","V","SSE2","","rw,r","",""
> +"PSUBQ xmm1, xmm2/m128","PSUBQ xmm2/m128, xmm1","psubq xmm2/m128, xmm1","66 0F FB /r","V","V","SSE2","","rw,r","",""
> +"PSUBSB mm1, mm2/m64","PSUBSB mm2/m64, mm1","psubsb mm2/m64, mm1","0F E8 /r","V","V","MMX","","rw,r","",""
> +"PSUBSB xmm1, xmm2/m128","PSUBSB xmm2/m128, xmm1","psubsb xmm2/m128, xmm1","66 0F E8 /r","V","V","SSE2","","rw,r","",""
> +"PSUBSW mm1, mm2/m64","PSUBSW mm2/m64, mm1","psubsw mm2/m64, mm1","0F E9 /r","V","V","MMX","","rw,r","",""
> +"PSUBSW xmm1, xmm2/m128","PSUBSW xmm2/m128, xmm1","psubsw xmm2/m128, xmm1","66 0F E9 /r","V","V","SSE2","","rw,r","",""
> +"PSUBUSB mm1, mm2/m64","PSUBUSB mm2/m64, mm1","psubusb mm2/m64, mm1","0F D8 /r","V","V","MMX","","rw,r","",""
> +"PSUBUSB xmm1, xmm2/m128","PSUBUSB xmm2/m128, xmm1","psubusb xmm2/m128, xmm1","66 0F D8 /r","V","V","SSE2","","rw,r","",""
> +"PSUBUSW mm1, mm2/m64","PSUBUSW mm2/m64, mm1","psubusw mm2/m64, mm1","0F D9 /r","V","V","MMX","","rw,r","",""
> +"PSUBUSW xmm1, xmm2/m128","PSUBUSW xmm2/m128, xmm1","psubusw xmm2/m128, xmm1","66 0F D9 /r","V","V","SSE2","","rw,r","",""
> +"PSUBW mm1, mm2/m64","PSUBW mm2/m64, mm1","psubw mm2/m64, mm1","0F F9 /r","V","V","MMX","","rw,r","",""
> +"PSUBW xmm1, xmm2/m128","PSUBW xmm2/m128, xmm1","psubw xmm2/m128, xmm1","66 0F F9 /r","V","V","SSE2","","rw,r","",""
> +"PSWAPD mm1, mm2/m64","PSWAPD mm2/m64, mm1","pswapd mm2/m64, mm1","0F 0F BB /r","V","V","3DNOW","amd","rw,r","",""
> +"PTEST xmm1, xmm2/m128","PTEST xmm2/m128, xmm1","ptest xmm2/m128, xmm1","66 0F 38 17 /r","V","V","SSE4_1","","r,r","",""
> +"PTWRITE r/m32","PTWRITEL r/m32","ptwritel r/m32","F3 0F AE /4","V","V","","operand16,operand32","r","Y","32"
> +"PTWRITE r/m64","PTWRITEQ r/m64","ptwriteq r/m64","F3 REX.W 0F AE /4","N.S.","V","","","r","Y","64"
> +"PUNPCKHBW mm1, mm2/m64","PUNPCKHBW mm2/m64, mm1","punpckhbw mm2/m64, mm1","0F 68 /r","V","V","MMX","","rw,r","",""
> +"PUNPCKHBW xmm1, xmm2/m128","PUNPCKHBW xmm2/m128, xmm1","punpckhbw xmm2/m128, xmm1","66 0F 68 /r","V","V","SSE2","","rw,r","",""
> +"PUNPCKHDQ mm1, mm2/m64","PUNPCKHLQ mm2/m64, mm1","punpckhdq mm2/m64, mm1","0F 6A /r","V","V","MMX","","rw,r","",""
> +"PUNPCKHDQ xmm1, xmm2/m128","PUNPCKHLQ xmm2/m128, xmm1","punpckhdq xmm2/m128, xmm1","66 0F 6A /r","V","V","SSE2","","rw,r","",""
> +"PUNPCKHQDQ xmm1, xmm2/m128","PUNPCKHQDQ xmm2/m128, xmm1","punpckhqdq xmm2/m128, xmm1","66 0F 6D /r","V","V","SSE2","","rw,r","",""
> +"PUNPCKHWD mm1, mm2/m64","PUNPCKHWL mm2/m64, mm1","punpckhwd mm2/m64, mm1","0F 69 /r","V","V","MMX","","rw,r","",""
> +"PUNPCKHWD xmm1, xmm2/m128","PUNPCKHWL xmm2/m128, xmm1","punpckhwd xmm2/m128, xmm1","66 0F 69 /r","V","V","SSE2","","rw,r","",""
> +"PUNPCKLBW mm1, mm2/m32","PUNPCKLBW mm2/m32, mm1","punpcklbw mm2/m32, mm1","0F 60 /r","V","V","MMX","","rw,r","",""
> +"PUNPCKLBW xmm1, xmm2/m128","PUNPCKLBW xmm2/m128, xmm1","punpcklbw xmm2/m128, xmm1","66 0F 60 /r","V","V","SSE2","","rw,r","",""
> +"PUNPCKLDQ mm1, mm2/m32","PUNPCKLLQ mm2/m32, mm1","punpckldq mm2/m32, mm1","0F 62 /r","V","V","MMX","","rw,r","",""
> +"PUNPCKLDQ xmm1, xmm2/m128","PUNPCKLLQ xmm2/m128, xmm1","punpckldq xmm2/m128, xmm1","66 0F 62 /r","V","V","SSE2","","rw,r","",""
> +"PUNPCKLQDQ xmm1, xmm2/m128","PUNPCKLQDQ xmm2/m128, xmm1","punpcklqdq xmm2/m128, xmm1","66 0F 6C /r","V","V","SSE2","","rw,r","",""
> +"PUNPCKLWD mm1, mm2/m32","PUNPCKLWL mm2/m32, mm1","punpcklwd mm2/m32, mm1","0F 61 /r","V","V","MMX","","rw,r","",""
> +"PUNPCKLWD xmm1, xmm2/m128","PUNPCKLWL xmm2/m128, xmm1","punpcklwd xmm2/m128, xmm1","66 0F 61 /r","V","V","SSE2","","rw,r","",""
> +"PUSHAD","PUSHAL","pushal","60","V","N.S.","","operand32","","",""
> +"PUSHA","PUSHAW","pushaw","60","V","N.S.","","operand16","","",""
> +"PUSHFD","PUSHFL","pushfl","9C","V","N.S.","","operand32","","",""
> +"PUSHFQ","PUSHFQ","pushfq","9C","N.S.","V","","default64","","",""
> +"PUSHF","PUSHFW","pushfw","9C","V","V","","operand16","","",""
> +"PUSH r/m32","PUSHL r/m32","pushl r/m32","FF /6","V","N.S.","","operand32","r","Y","32"
> +"PUSH r32op","PUSHL r32op","pushl r32op","50+rd","V","N.S.","","operand32","r","Y","32"
> +"PUSH r/m64","PUSHQ r/m64","pushq r/m64","FF /6","N.S.","V","","default64","r","Y","64"
> +"PUSH r64op","PUSHQ r64op","pushq r64op","50+ro","N.S.","V","","default64","r","Y","64"
> +"PUSH imm16","PUSHW imm16","pushw imm16","68 iw","V","V","","operand16","r","Y",""
> +"PUSH r/m16","PUSHW r/m16","pushw r/m16","FF /6","V","V","","operand16","r","Y","16"
> +"PUSH r16op","PUSHW r16op","pushw r16op","50+rw","V","V","","operand16","r","Y","16"
> +"PUSH CS","PUSHW/PUSHL/PUSHQ CS","pushw/pushl/pushq CS","0E","V","N.S.","","","r","Y",""
> +"PUSH DS","PUSHW/PUSHL/PUSHQ DS","pushw/pushl/pushq DS","1E","V","N.S.","","","r","Y",""
> +"PUSH ES","PUSHW/PUSHL/PUSHQ ES","pushw/pushl/pushq ES","06","V","N.S.","","","r","Y",""
> +"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","V","V","","operand16","r","Y",""
> +"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","N.S.","V","","default64","r","Y",""
> +"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","V","N.S.","","operand32","r","Y",""
> +"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","N.S.","V","","default64","r","Y",""
> +"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","V","N.S.","","operand32","r","Y",""
> +"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","V","V","","operand16","r","Y",""
> +"PUSH SS","PUSHW/PUSHL/PUSHQ SS","pushw/pushl/pushq SS","16","V","N.S.","","","r","Y",""
> +"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","V","N.S.","","operand32","r","Y",""
> +"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","N.S.","V","","default64","r","Y",""
> +"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","V","V","","operand16","r","Y",""
> +"PXOR mm1, mm2/m64","PXOR mm2/m64, mm1","pxor mm2/m64, mm1","0F EF /r","V","V","MMX","","rw,r","",""
> +"PXOR xmm1, xmm2/m128","PXOR xmm2/m128, xmm1","pxor xmm2/m128, xmm1","66 0F EF /r","V","V","SSE2","","rw,r","",""
> +"RCL r/m8, 1","RCLB 1, r/m8","rclb 1, r/m8","D0 /2","V","V","","","rw,r","Y","8"
> +"RCL r/m8, 1","RCLB 1, r/m8","rclb 1, r/m8","REX D0 /2","N.E.","V","","pseudo64","w,r","Y","8"
> +"RCL r/m8, CL","RCLB CL, r/m8","rclb CL, r/m8","D2 /2","V","V","","","rw,r","Y","8"
> +"RCL r/m8, CL","RCLB CL, r/m8","rclb CL, r/m8","REX D2 /2","N.E.","V","","pseudo64","w,r","Y","8"
> +"RCL r/m8, imm8","RCLB imm8, r/m8","rclb imm8, r/m8","REX C0 /2 ib","N.E.","V","","pseudo64","w,r","Y","8"
> +"RCL r/m8, imm8u","RCLB imm8u, r/m8","rclb imm8u, r/m8","C0 /2 ib","V","V","","","rw,r","Y","8"
> +"RCL r/m32, 1","RCLL 1, r/m32","rcll 1, r/m32","D1 /2","V","V","","operand32","rw,r","Y","32"
> +"RCL r/m32, CL","RCLL CL, r/m32","rcll CL, r/m32","D3 /2","V","V","","operand32","rw,r","Y","32"
> +"RCL r/m32, imm8u","RCLL imm8u, r/m32","rcll imm8u, r/m32","C1 /2 ib","V","V","","operand32","rw,r","Y","32"
> +"RCL r/m64, 1","RCLQ 1, r/m64","rclq 1, r/m64","REX.W D1 /2","N.S.","V","","","rw,r","Y","64"
> +"RCL r/m64, CL","RCLQ CL, r/m64","rclq CL, r/m64","REX.W D3 /2","N.S.","V","","","rw,r","Y","64"
> +"RCL r/m64, imm8u","RCLQ imm8u, r/m64","rclq imm8u, r/m64","REX.W C1 /2 ib","N.S.","V","","","rw,r","Y","64"
> +"RCL r/m16, 1","RCLW 1, r/m16","rclw 1, r/m16","D1 /2","V","V","","operand16","rw,r","Y","16"
> +"RCL r/m16, CL","RCLW CL, r/m16","rclw CL, r/m16","D3 /2","V","V","","operand16","rw,r","Y","16"
> +"RCL r/m16, imm8u","RCLW imm8u, r/m16","rclw imm8u, r/m16","C1 /2 ib","V","V","","operand16","rw,r","Y","16"
> +"RCPPS xmm1, xmm2/m128","RCPPS xmm2/m128, xmm1","rcpps xmm2/m128, xmm1","0F 53 /r","V","V","SSE","","w,r","",""
> +"RCPSS xmm1, xmm2/m32","RCPSS xmm2/m32, xmm1","rcpss xmm2/m32, xmm1","F3 0F 53 /r","V","V","SSE","","w,r","",""
> +"RCR r/m8, 1","RCRB 1, r/m8","rcrb 1, r/m8","D0 /3","V","V","","","rw,r","Y","8"
> +"RCR r/m8, 1","RCRB 1, r/m8","rcrb 1, r/m8","REX D0 /3","N.E.","V","","pseudo64","w,r","Y","8"
> +"RCR r/m8, CL","RCRB CL, r/m8","rcrb CL, r/m8","D2 /3","V","V","","","rw,r","Y","8"
> +"RCR r/m8, CL","RCRB CL, r/m8","rcrb CL, r/m8","REX D2 /3","N.E.","V","","pseudo64","w,r","Y","8"
> +"RCR r/m8, imm8","RCRB imm8, r/m8","rcrb imm8, r/m8","REX C0 /3 ib","N.E.","V","","pseudo64","w,r","Y","8"
> +"RCR r/m8, imm8u","RCRB imm8u, r/m8","rcrb imm8u, r/m8","C0 /3 ib","V","V","","","rw,r","Y","8"
> +"RCR r/m32, 1","RCRL 1, r/m32","rcrl 1, r/m32","D1 /3","V","V","","operand32","rw,r","Y","32"
> +"RCR r/m32, CL","RCRL CL, r/m32","rcrl CL, r/m32","D3 /3","V","V","","operand32","rw,r","Y","32"
> +"RCR r/m32, imm8u","RCRL imm8u, r/m32","rcrl imm8u, r/m32","C1 /3 ib","V","V","","operand32","rw,r","Y","32"
> +"RCR r/m64, 1","RCRQ 1, r/m64","rcrq 1, r/m64","REX.W D1 /3","N.S.","V","","","rw,r","Y","64"
> +"RCR r/m64, CL","RCRQ CL, r/m64","rcrq CL, r/m64","REX.W D3 /3","N.S.","V","","","rw,r","Y","64"
> +"RCR r/m64, imm8u","RCRQ imm8u, r/m64","rcrq imm8u, r/m64","REX.W C1 /3 ib","N.S.","V","","","rw,r","Y","64"
> +"RCR r/m16, 1","RCRW 1, r/m16","rcrw 1, r/m16","D1 /3","V","V","","operand16","rw,r","Y","16"
> +"RCR r/m16, CL","RCRW CL, r/m16","rcrw CL, r/m16","D3 /3","V","V","","operand16","rw,r","Y","16"
> +"RCR r/m16, imm8u","RCRW imm8u, r/m16","rcrw imm8u, r/m16","C1 /3 ib","V","V","","operand16","rw,r","Y","16"
> +"RDFSBASE rmr32","RDFSBASEL rmr32","rdfsbase rmr32","F3 0F AE /0","N.S.","V","FSGSBASE","modrm_regonly,operand16,operand32","w","Y","32"
> +"RDFSBASE rmr64","RDFSBASEQ rmr64","rdfsbase rmr64","F3 REX.W 0F AE /0","N.S.","V","FSGSBASE","modrm_regonly","w","Y","64"
> +"RDGSBASE rmr32","RDGSBASEL rmr32","rdgsbase rmr32","F3 0F AE /1","N.S.","V","FSGSBASE","modrm_regonly,operand16,operand32","w","Y","32"
> +"RDGSBASE rmr64","RDGSBASEQ rmr64","rdgsbase rmr64","F3 REX.W 0F AE /1","N.S.","V","FSGSBASE","modrm_regonly","w","Y","64"
> +"RDMSR","RDMSR","rdmsr","0F 32","V","V","Pentium","","","",""
> +"RDPKRU","RDPKRU","rdpkru","0F 01 EE","V","V","PKU","","","",""
> +"RDPMC","RDPMC","rdpmc","0F 33","V","V","","","","",""
> +"RDRAND rmr32","RDRANDL rmr32","rdrand rmr32","0F C7 /6","V","V","RDRAND","modrm_regonly,operand32","w","Y","32"
> +"RDRAND rmr64","RDRANDQ rmr64","rdrand rmr64","REX.W 0F C7 /6","N.S.","V","RDRAND","modrm_regonly","w","Y","64"
> +"RDRAND rmr16","RDRANDW rmr16","rdrand rmr16","0F C7 /6","V","V","RDRAND","modrm_regonly,operand16","w","Y","16"
> +"RDSEED rmr32","RDSEEDL rmr32","rdseed rmr32","0F C7 /7","V","V","RDSEED","modrm_regonly,operand32","w","Y","32"
> +"RDSEED rmr64","RDSEEDQ rmr64","rdseed rmr64","REX.W 0F C7 /7","N.S.","V","RDSEED","modrm_regonly","w","Y","64"
> +"RDSEED rmr16","RDSEEDW rmr16","rdseed rmr16","0F C7 /7","V","V","RDSEED","modrm_regonly,operand16","w","Y","16"
> +"RDSSPD rmr32","RDSSPD rmr32","rdsspd rmr32","F3 0F 1E /1","V","V","CET","modrm_regonly,operand16,operand32","w","",""
> +"RDSSPQ rmr64","RDSSPQ rmr64","rdsspq rmr64","F3 REX.W 0F 1E /1","N.S.","V","CET","modrm_regonly","w","",""
> +"RDTSC","RDTSC","rdtsc","0F 31","V","V","Pentium","","","",""
> +"RDTSCP","RDTSCP","rdtscp","0F 01 F9","V","V","RDTSCP","","","",""
> +"RET_FAR","RETFW/RETFL/RETFQ","lretw/lretl/lretl","CB","V","V","","","","",""
> +"RET_FAR imm16u","RETFW/RETFL/RETFQ imm16u","lretw/lretl/lretl imm16u","CA iw","V","V","","","r","",""
> +"RET","RETW/RETL/RETQ","retw/retl/retq","C3","N.S.","V","","default64","","",""
> +"RET","RETW/RETL/RETQ","retw/retl/retq","C3","V","N.S.","","","","",""
> +"RET imm16u","RETW/RETL/RETQ imm16u","retw/retl/retq imm16u","C2 iw","N.S.","V","","default64","r","",""
> +"RET imm16u","RETW/RETL/RETQ imm16u","retw/retl/retq imm16u","C2 iw","V","N.S.","","","r","",""
> +"ROL r/m8, 1","ROLB 1, r/m8","rolb 1, r/m8","D0 /0","V","V","","","rw,r","Y","8"
> +"ROL r/m8, 1","ROLB 1, r/m8","rolb 1, r/m8","REX D0 /0","N.E.","V","","pseudo64","w,r","Y","8"
> +"ROL r/m8, CL","ROLB CL, r/m8","rolb CL, r/m8","D2 /0","V","V","","","rw,r","Y","8"
> +"ROL r/m8, CL","ROLB CL, r/m8","rolb CL, r/m8","REX D2 /0","N.E.","V","","pseudo64","w,r","Y","8"
> +"ROL r/m8, imm8","ROLB imm8, r/m8","rolb imm8, r/m8","REX C0 /0 ib","N.E.","V","","pseudo64","w,r","Y","8"
> +"ROL r/m8, imm8u","ROLB imm8u, r/m8","rolb imm8u, r/m8","C0 /0 ib","V","V","","","rw,r","Y","8"
> +"ROL r/m32, 1","ROLL 1, r/m32","roll 1, r/m32","D1 /0","V","V","","operand32","rw,r","Y","32"
> +"ROL r/m32, CL","ROLL CL, r/m32","roll CL, r/m32","D3 /0","V","V","","operand32","rw,r","Y","32"
> +"ROL r/m32, imm8u","ROLL imm8u, r/m32","roll imm8u, r/m32","C1 /0 ib","V","V","","operand32","rw,r","Y","32"
> +"ROL r/m64, 1","ROLQ 1, r/m64","rolq 1, r/m64","REX.W D1 /0","N.S.","V","","","rw,r","Y","64"
> +"ROL r/m64, CL","ROLQ CL, r/m64","rolq CL, r/m64","REX.W D3 /0","N.S.","V","","","rw,r","Y","64"
> +"ROL r/m64, imm8u","ROLQ imm8u, r/m64","rolq imm8u, r/m64","REX.W C1 /0 ib","N.S.","V","","","rw,r","Y","64"
> +"ROL r/m16, 1","ROLW 1, r/m16","rolw 1, r/m16","D1 /0","V","V","","operand16","rw,r","Y","16"
> +"ROL r/m16, CL","ROLW CL, r/m16","rolw CL, r/m16","D3 /0","V","V","","operand16","rw,r","Y","16"
> +"ROL r/m16, imm8u","ROLW imm8u, r/m16","rolw imm8u, r/m16","C1 /0 ib","V","V","","operand16","rw,r","Y","16"
> +"ROR r/m8, 1","RORB 1, r/m8","rorb 1, r/m8","D0 /1","V","V","","","rw,r","Y","8"
> +"ROR r/m8, 1","RORB 1, r/m8","rorb 1, r/m8","REX D0 /1","N.E.","V","","pseudo64","w,r","Y","8"
> +"ROR r/m8, CL","RORB CL, r/m8","rorb CL, r/m8","D2 /1","V","V","","","rw,r","Y","8"
> +"ROR r/m8, CL","RORB CL, r/m8","rorb CL, r/m8","REX D2 /1","N.E.","V","","pseudo64","w,r","Y","8"
> +"ROR r/m8, imm8","RORB imm8, r/m8","rorb imm8, r/m8","REX C0 /1 ib","N.E.","V","","pseudo64","w,r","Y","8"
> +"ROR r/m8, imm8u","RORB imm8u, r/m8","rorb imm8u, r/m8","C0 /1 ib","V","V","","","rw,r","Y","8"
> +"ROR r/m32, 1","RORL 1, r/m32","rorl 1, r/m32","D1 /1","V","V","","operand32","rw,r","Y","32"
> +"ROR r/m32, CL","RORL CL, r/m32","rorl CL, r/m32","D3 /1","V","V","","operand32","rw,r","Y","32"
> +"ROR r/m32, imm8u","RORL imm8u, r/m32","rorl imm8u, r/m32","C1 /1 ib","V","V","","operand32","rw,r","Y","32"
> +"ROR r/m64, 1","RORQ 1, r/m64","rorq 1, r/m64","REX.W D1 /1","N.S.","V","","","rw,r","Y","64"
> +"ROR r/m64, CL","RORQ CL, r/m64","rorq CL, r/m64","REX.W D3 /1","N.S.","V","","","rw,r","Y","64"
> +"ROR r/m64, imm8u","RORQ imm8u, r/m64","rorq imm8u, r/m64","REX.W C1 /1 ib","N.S.","V","","","rw,r","Y","64"
> +"ROR r/m16, 1","RORW 1, r/m16","rorw 1, r/m16","D1 /1","V","V","","operand16","rw,r","Y","16"
> +"ROR r/m16, CL","RORW CL, r/m16","rorw CL, r/m16","D3 /1","V","V","","operand16","rw,r","Y","16"
> +"ROR r/m16, imm8u","RORW imm8u, r/m16","rorw imm8u, r/m16","C1 /1 ib","V","V","","operand16","rw,r","Y","16"
> +"RORX r32, r/m32, imm8u","RORXL imm8u, r/m32, r32","rorxl imm8u, r/m32, r32","VEX.128.F2.0F3A.W0 F0 /r ib","V","V","BMI2","","w,r,r","Y","32"
> +"RORX r64, r/m64, imm8u","RORXQ imm8u, r/m64, r64","rorxq imm8u, r/m64, r64","VEX.128.F2.0F3A.W1 F0 /r ib","N.S.","V","BMI2","","w,r,r","Y","64"
> +"ROUNDPD xmm1, xmm2/m128, imm8u","ROUNDPD imm8u, xmm2/m128, xmm1","roundpd imm8u, xmm2/m128, xmm1","66 0F 3A 09 /r ib","V","V","SSE4_1","","w,r,r","",""
> +"ROUNDPS xmm1, xmm2/m128, imm8u","ROUNDPS imm8u, xmm2/m128, xmm1","roundps imm8u, xmm2/m128, xmm1","66 0F 3A 08 /r ib","V","V","SSE4_1","","w,r,r","",""
> +"ROUNDSD xmm1, xmm2/m64, imm8u","ROUNDSD imm8u, xmm2/m64, xmm1","roundsd imm8u, xmm2/m64, xmm1","66 0F 3A 0B /r ib","V","V","SSE4_1","","w,r,r","",""
> +"ROUNDSS xmm1, xmm2/m32, imm8u","ROUNDSS imm8u, xmm2/m32, xmm1","roundss imm8u, xmm2/m32, xmm1","66 0F 3A 0A /r ib","V","V","SSE4_1","","w,r,r","",""
> +"RSM","RSM","rsm","0F AA","V","V","","","","",""
> +"RSQRTPS xmm1, xmm2/m128","RSQRTPS xmm2/m128, xmm1","rsqrtps xmm2/m128, xmm1","0F 52 /r","V","V","SSE","","w,r","",""
> +"RSQRTSS xmm1, xmm2/m32","RSQRTSS xmm2/m32, xmm1","rsqrtss xmm2/m32, xmm1","F3 0F 52 /r","V","V","SSE","","w,r","",""
> +"RSTORSSP m64","RSTORSSP m64","rstorssp m64","F3 0F 01 /5","V","V","CET","modrm_memonly","rw","",""
> +"SAHF","SAHF","sahf","9E","V","V","LAHFSAHF","","","",""
> +"SAL r/m8, 1","SALB 1, r/m8","salb 1, r/m8","D0 /4","V","V","","pseudo","rw,r","Y","8"
> +"SAL r/m8, 1","SALB 1, r/m8","salb 1, r/m8","REX D0 /4","N.E.","V","","pseudo","rw,r","Y","8"
> +"SAL r/m8, CL","SALB CL, r/m8","salb CL, r/m8","D2 /4","V","V","","pseudo","rw,r","Y","8"
> +"SAL r/m8, CL","SALB CL, r/m8","salb CL, r/m8","REX D2 /4","N.E.","V","","pseudo","rw,r","Y","8"
> +"SAL r/m8, imm8","SALB imm8, r/m8","salb imm8, r/m8","C0 /4 ib","V","V","","pseudo","rw,r","Y","8"
> +"SAL r/m8, imm8","SALB imm8, r/m8","salb imm8, r/m8","REX C0 /4 ib","N.E.","V","","pseudo","rw,r","Y","8"
> +"SALC","SALC","salc","D6","V","N.S.","","","","",""
> +"SAL r/m32, 1","SALL 1, r/m32","sall 1, r/m32","D1 /4","V","V","","operand32,pseudo","rw,r","Y","32"
> +"SAL r/m32, CL","SALL CL, r/m32","sall CL, r/m32","D3 /4","V","V","","operand32,pseudo","rw,r","Y","32"
> +"SAL r/m32, imm8","SALL imm8, r/m32","sall imm8, r/m32","C1 /4 ib","V","V","","operand32,pseudo","rw,r","Y","32"
> +"SAL r/m64, 1","SALQ 1, r/m64","salq 1, r/m64","REX.W D1 /4","N.E.","V","","pseudo","rw,r","Y","64"
> +"SAL r/m64, CL","SALQ CL, r/m64","salq CL, r/m64","REX.W D3 /4","N.E.","V","","pseudo","rw,r","Y","64"
> +"SAL r/m64, imm8","SALQ imm8, r/m64","salq imm8, r/m64","REX.W C1 /4 ib","N.E.","V","","pseudo","rw,r","Y","64"
> +"SAL r/m16, 1","SALW 1, r/m16","salw 1, r/m16","D1 /4","V","V","","operand16,pseudo","rw,r","Y","16"
> +"SAL r/m16, CL","SALW CL, r/m16","salw CL, r/m16","D3 /4","V","V","","operand16,pseudo","rw,r","Y","16"
> +"SAL r/m16, imm8","SALW imm8, r/m16","salw imm8, r/m16","C1 /4 ib","V","V","","operand16,pseudo","rw,r","Y","16"
> +"SAR r/m8, 1","SARB 1, r/m8","sarb 1, r/m8","D0 /7","V","V","","","rw,r","Y","8"
> +"SAR r/m8, 1","SARB 1, r/m8","sarb 1, r/m8","REX D0 /7","N.E.","V","","pseudo64","rw,r","Y","8"
> +"SAR r/m8, CL","SARB CL, r/m8","sarb CL, r/m8","D2 /7","V","V","","","rw,r","Y","8"
> +"SAR r/m8, CL","SARB CL, r/m8","sarb CL, r/m8","REX D2 /7","N.E.","V","","pseudo64","rw,r","Y","8"
> +"SAR r/m8, imm8","SARB imm8, r/m8","sarb imm8, r/m8","REX C0 /7 ib","N.E.","V","","pseudo64","rw,r","Y","8"
> +"SAR r/m8, imm8u","SARB imm8u, r/m8","sarb imm8u, r/m8","C0 /7 ib","V","V","","","rw,r","Y","8"
> +"SAR r/m32, 1","SARL 1, r/m32","sarl 1, r/m32","D1 /7","V","V","","operand32","rw,r","Y","32"
> +"SAR r/m32, CL","SARL CL, r/m32","sarl CL, r/m32","D3 /7","V","V","","operand32","rw,r","Y","32"
> +"SAR r/m32, imm8u","SARL imm8u, r/m32","sarl imm8u, r/m32","C1 /7 ib","V","V","","operand32","rw,r","Y","32"
> +"SAR r/m64, 1","SARQ 1, r/m64","sarq 1, r/m64","REX.W D1 /7","N.S.","V","","","rw,r","Y","64"
> +"SAR r/m64, CL","SARQ CL, r/m64","sarq CL, r/m64","REX.W D3 /7","N.S.","V","","","rw,r","Y","64"
> +"SAR r/m64, imm8u","SARQ imm8u, r/m64","sarq imm8u, r/m64","REX.W C1 /7 ib","N.S.","V","","","rw,r","Y","64"
> +"SAR r/m16, 1","SARW 1, r/m16","sarw 1, r/m16","D1 /7","V","V","","operand16","rw,r","Y","16"
> +"SAR r/m16, CL","SARW CL, r/m16","sarw CL, r/m16","D3 /7","V","V","","operand16","rw,r","Y","16"
> +"SAR r/m16, imm8u","SARW imm8u, r/m16","sarw imm8u, r/m16","C1 /7 ib","V","V","","operand16","rw,r","Y","16"
> +"SARX r32, r/m32, r32V","SARXL r32V, r/m32, r32","sarxl r32V, r/m32, r32","VEX.NDS.128.F3.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32"
> +"SARX r64, r/m64, r64V","SARXQ r64V, r/m64, r64","sarxq r64V, r/m64, r64","VEX.NDS.128.F3.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64"
> +"SAVESSP","SAVESSP","savessp","F3 0F 01 EA","V","V","CET","","","",""
> +"SBB AL, imm8","SBBB imm8, AL","sbbb imm8, AL","1C ib","V","V","","","rw,r","Y","8"
> +"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","80 /3 ib","V","V","","","rw,r","Y","8"
> +"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","82 /3 ib","V","N.S.","","","rw,r","Y","8"
> +"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","REX 80 /3 ib","N.E.","V","","pseudo64","w,r","Y","8"
> +"SBB r8, r/m8","SBBB r/m8, r8","sbbb r/m8, r8","1A /r","V","V","","","rw,r","Y","8"
> +"SBB r8, r/m8","SBBB r/m8, r8","sbbb r/m8, r8","REX 1A /r","N.E.","V","","pseudo64","w,r","Y","8"
> +"SBB r/m8, r8","SBBB r8, r/m8","sbbb r8, r/m8","18 /r","V","V","","","rw,r","Y","8"
> +"SBB r/m8, r8","SBBB r8, r/m8","sbbb r8, r/m8","REX 18 /r","N.E.","V","","pseudo64","w,r","Y","8"
> +"SBB EAX, imm32","SBBL imm32, EAX","sbbl imm32, EAX","1D id","V","V","","operand32","rw,r","Y","32"
> +"SBB r/m32, imm32","SBBL imm32, r/m32","sbbl imm32, r/m32","81 /3 id","V","V","","operand32","rw,r","Y","32"
> +"SBB r/m32, imm8","SBBL imm8, r/m32","sbbl imm8, r/m32","83 /3 ib","V","V","","operand32","rw,r","Y","32"
> +"SBB r32, r/m32","SBBL r/m32, r32","sbbl r/m32, r32","1B /r","V","V","","operand32","rw,r","Y","32"
> +"SBB r/m32, r32","SBBL r32, r/m32","sbbl r32, r/m32","19 /r","V","V","","operand32","rw,r","Y","32"
> +"SBB RAX, imm32","SBBQ imm32, RAX","sbbq imm32, RAX","REX.W 1D id","N.S.","V","","","rw,r","Y","64"
> +"SBB r/m64, imm32","SBBQ imm32, r/m64","sbbq imm32, r/m64","REX.W 81 /3 id","N.S.","V","","","rw,r","Y","64"
> +"SBB r/m64, imm8","SBBQ imm8, r/m64","sbbq imm8, r/m64","REX.W 83 /3 ib","N.S.","V","","","rw,r","Y","64"
> +"SBB r64, r/m64","SBBQ r/m64, r64","sbbq r/m64, r64","REX.W 1B /r","N.S.","V","","","rw,r","Y","64"
> +"SBB r/m64, r64","SBBQ r64, r/m64","sbbq r64, r/m64","REX.W 19 /r","N.S.","V","","","rw,r","Y","64"
> +"SBB AX, imm16","SBBW imm16, AX","sbbw imm16, AX","1D iw","V","V","","operand16","rw,r","Y","16"
> +"SBB r/m16, imm16","SBBW imm16, r/m16","sbbw imm16, r/m16","81 /3 iw","V","V","","operand16","rw,r","Y","16"
> +"SBB r/m16, imm8","SBBW imm8, r/m16","sbbw imm8, r/m16","83 /3 ib","V","V","","operand16","rw,r","Y","16"
> +"SBB r16, r/m16","SBBW r/m16, r16","sbbw r/m16, r16","1B /r","V","V","","operand16","rw,r","Y","16"
> +"SBB r/m16, r16","SBBW r16, r/m16","sbbw r16, r/m16","19 /r","V","V","","operand16","rw,r","Y","16"
> +"SCASB","SCASB","scasb","AE","V","V","","","","",""
> +"SCASD","SCASL","scasl","AF","V","V","","operand32","","",""
> +"SCASQ","SCASQ","scasq","REX.W AF","N.S.","V","","","","",""
> +"SCASW","SCASW","scasw","AF","V","V","","operand16","","",""
> +"SETAE r/m8","SETCC r/m8","setae r/m8","0F 93 /r","V","V","","","w","",""
> +"SETNB r/m8","SETCC r/m8","setnb r/m8","0F 93 /r","V","V","","pseudo","r","",""
> +"SETNC r/m8","SETCC r/m8","setnc r/m8","0F 93 /r","V","V","","pseudo","r","",""
> +"SETAE r/m8","SETCC r/m8","setae r/m8","REX 0F 93 /r","N.E.","V","","pseudo64","r","",""
> +"SETNB r/m8","SETCC r/m8","setnb r/m8","REX 0F 93 /r","N.E.","V","","pseudo","r","",""
> +"SETNC r/m8","SETCC r/m8","setnc r/m8","REX 0F 93 /r","N.E.","V","","pseudo","r","",""
> +"SETB r/m8","SETCS r/m8","setb r/m8","0F 92 /r","V","V","","","w","",""
> +"SETC r/m8","SETCS r/m8","setc r/m8","0F 92 /r","V","V","","pseudo","r","",""
> +"SETNAE r/m8","SETCS r/m8","setnae r/m8","0F 92 /r","V","V","","pseudo","r","",""
> +"SETB r/m8","SETCS r/m8","setb r/m8","REX 0F 92 /r","N.E.","V","","pseudo64","r","",""
> +"SETC r/m8","SETCS r/m8","setc r/m8","REX 0F 92 /r","N.E.","V","","pseudo","r","",""
> +"SETNAE r/m8","SETCS r/m8","setnae r/m8","REX 0F 92 /r","N.E.","V","","pseudo","r","",""
> +"SETE r/m8","SETEQ r/m8","sete r/m8","0F 94 /r","V","V","","","w","",""
> +"SETZ r/m8","SETEQ r/m8","setz r/m8","0F 94 /r","V","V","","pseudo","r","",""
> +"SETE r/m8","SETEQ r/m8","sete r/m8","REX 0F 94 /r","N.E.","V","","pseudo64","r","",""
> +"SETZ r/m8","SETEQ r/m8","setz r/m8","REX 0F 94 /r","N.E.","V","","pseudo","r","",""
> +"SETGE r/m8","SETGE r/m8","setge r/m8","0F 9D /r","V","V","","","w","",""
> +"SETNL r/m8","SETGE r/m8","setnl r/m8","0F 9D /r","V","V","","pseudo","r","",""
> +"SETGE r/m8","SETGE r/m8","setge r/m8","REX 0F 9D /r","N.E.","V","","pseudo64","r","",""
> +"SETNL r/m8","SETGE r/m8","setnl r/m8","REX 0F 9D /r","N.E.","V","","pseudo","r","",""
> +"SETG r/m8","SETGT r/m8","setg r/m8","0F 9F /r","V","V","","","w","",""
> +"SETNLE r/m8","SETGT r/m8","setnle r/m8","0F 9F /r","V","V","","pseudo","r","",""
> +"SETG r/m8","SETGT r/m8","setg r/m8","REX 0F 9F /r","N.E.","V","","pseudo64","r","",""
> +"SETNLE r/m8","SETGT r/m8","setnle r/m8","REX 0F 9F /r","N.E.","V","","pseudo","r","",""
> +"SETA r/m8","SETHI r/m8","seta r/m8","0F 97 /r","V","V","","","w","",""
> +"SETNBE r/m8","SETHI r/m8","setnbe r/m8","0F 97 /r","V","V","","pseudo","r","",""
> +"SETA r/m8","SETHI r/m8","seta r/m8","REX 0F 97 /r","N.E.","V","","pseudo64","r","",""
> +"SETNBE r/m8","SETHI r/m8","setnbe r/m8","REX 0F 97 /r","N.E.","V","","pseudo","r","",""
> +"SETLE r/m8","SETLE r/m8","setle r/m8","0F 9E /r","V","V","","","w","",""
> +"SETNG r/m8","SETLE r/m8","setng r/m8","0F 9E /r","V","V","","pseudo","r","",""
> +"SETLE r/m8","SETLE r/m8","setle r/m8","REX 0F 9E /r","N.E.","V","","pseudo64","r","",""
> +"SETNG r/m8","SETLE r/m8","setng r/m8","REX 0F 9E /r","N.E.","V","","pseudo","r","",""
> +"SETBE r/m8","SETLS r/m8","setbe r/m8","0F 96 /r","V","V","","","w","",""
> +"SETNA r/m8","SETLS r/m8","setna r/m8","0F 96 /r","V","V","","pseudo","r","",""
> +"SETBE r/m8","SETLS r/m8","setbe r/m8","REX 0F 96 /r","N.E.","V","","pseudo64","r","",""
> +"SETNA r/m8","SETLS r/m8","setna r/m8","REX 0F 96 /r","N.E.","V","","pseudo","r","",""
> +"SETL r/m8","SETLT r/m8","setl r/m8","0F 9C /r","V","V","","","w","",""
> +"SETNGE r/m8","SETLT r/m8","setnge r/m8","0F 9C /r","V","V","","pseudo","r","",""
> +"SETL r/m8","SETLT r/m8","setl r/m8","REX 0F 9C /r","N.E.","V","","pseudo64","r","",""
> +"SETNGE r/m8","SETLT r/m8","setnge r/m8","REX 0F 9C /r","N.E.","V","","pseudo","r","",""
> +"SETS r/m8","SETMI r/m8","sets r/m8","0F 98 /r","V","V","","","w","",""
> +"SETS r/m8","SETMI r/m8","sets r/m8","REX 0F 98 /r","N.E.","V","","pseudo64","r","",""
> +"SETNE r/m8","SETNE r/m8","setne r/m8","0F 95 /r","V","V","","","w","",""
> +"SETNZ r/m8","SETNE r/m8","setnz r/m8","0F 95 /r","V","V","","pseudo","r","",""
> +"SETNE r/m8","SETNE r/m8","setne r/m8","REX 0F 95 /r","N.E.","V","","pseudo64","r","",""
> +"SETNZ r/m8","SETNE r/m8","setnz r/m8","REX 0F 95 /r","N.E.","V","","pseudo","r","",""
> +"SETNO r/m8","SETOC r/m8","setno r/m8","0F 91 /r","V","V","","","w","",""
> +"SETNO r/m8","SETOC r/m8","setno r/m8","REX 0F 91 /r","N.E.","V","","pseudo64","r","",""
> +"SETO r/m8","SETOS r/m8","seto r/m8","0F 90 /r","V","V","","","w","",""
> +"SETO r/m8","SETOS r/m8","seto r/m8","REX 0F 90 /r","N.E.","V","","pseudo64","r","",""
> +"SETNP r/m8","SETPC r/m8","setnp r/m8","0F 9B /r","V","V","","","w","",""
> +"SETPO r/m8","SETPC r/m8","setpo r/m8","0F 9B /r","V","V","","pseudo","r","",""
> +"SETNP r/m8","SETPC r/m8","setnp r/m8","REX 0F 9B /r","N.E.","V","","pseudo64","r","",""
> +"SETPO r/m8","SETPC r/m8","setpo r/m8","REX 0F 9B /r","N.E.","V","","pseudo","r","",""
> +"SETNS r/m8","SETPL r/m8","setns r/m8","0F 99 /r","V","V","","","w","",""
> +"SETNS r/m8","SETPL r/m8","setns r/m8","REX 0F 99 /r","N.E.","V","","pseudo64","r","",""
> +"SETP r/m8","SETPS r/m8","setp r/m8","0F 9A /r","V","V","","","w","",""
> +"SETPE r/m8","SETPS r/m8","setpe r/m8","0F 9A /r","V","V","","pseudo","r","",""
> +"SETP r/m8","SETPS r/m8","setp r/m8","REX 0F 9A /r","N.E.","V","","pseudo64","r","",""
> +"SETPE r/m8","SETPS r/m8","setpe r/m8","REX 0F 9A /r","N.E.","V","","pseudo","r","",""
> +"SETSSBSY","SETSSBSY","setssbsy","F3 0F 01 E8","V","V","CET","","","",""
> +"SFENCE","SFENCE","sfence","0F AE /7","V","V","SSE","","","",""
> +"SGDT m16&32","SGDT m16&32","sgdt m16&32","0F 01 /0","V","N.S.","","modrm_memonly","w","",""
> +"SGDT m16&64","SGDT m16&64","sgdt m16&64","0F 01 /0","N.S.","V","","default64,modrm_memonly","w","",""
> +"SHA1MSG1 xmm1, xmm2/m128","SHA1MSG1 xmm2/m128, xmm1","sha1msg1 xmm2/m128, xmm1","0F 38 C9 /r","V","V","SHA","","rw,r","",""
> +"SHA1MSG2 xmm1, xmm2/m128","SHA1MSG2 xmm2/m128, xmm1","sha1msg2 xmm2/m128, xmm1","0F 38 CA /r","V","V","SHA","","rw,r","",""
> +"SHA1NEXTE xmm1, xmm2/m128","SHA1NEXTE xmm2/m128, xmm1","sha1nexte xmm2/m128, xmm1","0F 38 C8 /r","V","V","SHA","","rw,r","",""
> +"SHA1RNDS4 xmm1, xmm2/m128, imm8u:2","SHA1RNDS4 imm8u:2, xmm2/m128, xmm1","sha1rnds4 imm8u:2, xmm2/m128, xmm1","0F 3A CC /r ib","V","V","SHA","","rw,r,r","",""
> +"SHA256MSG1 xmm1, xmm2/m128","SHA256MSG1 xmm2/m128, xmm1","sha256msg1 xmm2/m128, xmm1","0F 38 CC /r","V","V","SHA","","rw,r","",""
> +"SHA256MSG2 xmm1, xmm2/m128","SHA256MSG2 xmm2/m128, xmm1","sha256msg2 xmm2/m128, xmm1","0F 38 CD /r","V","V","SHA","","rw,r","",""
> +"SHA256RNDS2 xmm1, xmm2/m128, <XMM0>","SHA256RNDS2 <XMM0>, xmm2/m128, xmm1","sha256rnds2 <XMM0>, xmm2/m128, xmm1","0F 38 CB /r","V","V","SHA","","rw,r,r","",""
> +"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","D0 /4","V","V","","","rw,r","Y","8"
> +"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","D0 /6","V","V","","","rw,r","Y","8"
> +"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","REX D0 /4","N.E.","V","","pseudo64","rw,r","Y","8"
> +"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","D2 /4","V","V","","","rw,r","Y","8"
> +"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","D2 /6","V","V","","","rw,r","Y","8"
> +"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","REX D2 /4","N.E.","V","","pseudo64","rw,r","Y","8"
> +"SHL r/m8, imm8","SHLB imm8, r/m8","shlb imm8, r/m8","REX C0 /4 ib","N.E.","V","","pseudo64","rw,r","Y","8"
> +"SHL r/m8, imm8u","SHLB imm8u, r/m8","shlb imm8u, r/m8","C0 /4 ib","V","V","","","rw,r","Y","8"
> +"SHL r/m8, imm8u","SHLB imm8u, r/m8","shlb imm8u, r/m8","C0 /6 ib","V","V","","","rw,r","Y","8"
> +"SHL r/m32, 1","SHLL 1, r/m32","shll 1, r/m32","D1 /4","V","V","","operand32","rw,r","Y","32"
> +"SHL r/m32, 1","SHLL 1, r/m32","shll 1, r/m32","D1 /6","V","V","","operand32","rw,r","Y","32"
> +"SHL r/m32, CL","SHLL CL, r/m32","shll CL, r/m32","D3 /4","V","V","","operand32","rw,r","Y","32"
> +"SHL r/m32, CL","SHLL CL, r/m32","shll CL, r/m32","D3 /6","V","V","","operand32","rw,r","Y","32"
> +"SHLD r/m32, r32, CL","SHLL CL, r32, r/m32","shldl CL, r32, r/m32","0F A5 /r","V","V","","operand32","rw,r,r","Y","32"
> +"SHL r/m32, imm8u","SHLL imm8u, r/m32","shll imm8u, r/m32","C1 /4 ib","V","V","","operand32","rw,r","Y","32"
> +"SHL r/m32, imm8u","SHLL imm8u, r/m32","shll imm8u, r/m32","C1 /6 ib","V","V","","operand32","rw,r","Y","32"
> +"SHLD r/m32, r32, imm8u","SHLL imm8u, r32, r/m32","shldl imm8u, r32, r/m32","0F A4 /r ib","V","V","","operand32","rw,r,r","Y","32"
> +"SHL r/m64, 1","SHLQ 1, r/m64","shlq 1, r/m64","REX.W D1 /4","N.S.","V","","","rw,r","Y","64"
> +"SHL r/m64, 1","SHLQ 1, r/m64","shlq 1, r/m64","REX.W D1 /6","N.S.","V","","","rw,r","Y","64"
> +"SHL r/m64, CL","SHLQ CL, r/m64","shlq CL, r/m64","REX.W D3 /4","N.S.","V","","","rw,r","Y","64"
> +"SHL r/m64, CL","SHLQ CL, r/m64","shlq CL, r/m64","REX.W D3 /6","N.S.","V","","","rw,r","Y","64"
> +"SHLD r/m64, r64, CL","SHLQ CL, r64, r/m64","shldq CL, r64, r/m64","REX.W 0F A5 /r","N.S.","V","","","rw,r,r","Y","64"
> +"SHL r/m64, imm8u","SHLQ imm8u, r/m64","shlq imm8u, r/m64","REX.W C1 /4 ib","N.S.","V","","","rw,r","Y","64"
> +"SHL r/m64, imm8u","SHLQ imm8u, r/m64","shlq imm8u, r/m64","REX.W C1 /6 ib","N.S.","V","","","rw,r","Y","64"
> +"SHLD r/m64, r64, imm8u","SHLQ imm8u, r64, r/m64","shldq imm8u, r64, r/m64","REX.W 0F A4 /r ib","N.S.","V","","","rw,r,r","Y","64"
> +"SHL r/m16, 1","SHLW 1, r/m16","shlw 1, r/m16","D1 /4","V","V","","operand16","rw,r","Y","16"
> +"SHL r/m16, 1","SHLW 1, r/m16","shlw 1, r/m16","D1 /6","V","V","","operand16","rw,r","Y","16"
> +"SHL r/m16, CL","SHLW CL, r/m16","shlw CL, r/m16","D3 /4","V","V","","operand16","rw,r","Y","16"
> +"SHL r/m16, CL","SHLW CL, r/m16","shlw CL, r/m16","D3 /6","V","V","","operand16","rw,r","Y","16"
> +"SHLD r/m16, r16, CL","SHLW CL, r16, r/m16","shldw CL, r16, r/m16","0F A5 /r","V","V","","operand16","rw,r,r","Y","16"
> +"SHL r/m16, imm8u","SHLW imm8u, r/m16","shlw imm8u, r/m16","C1 /4 ib","V","V","","operand16","rw,r","Y","16"
> +"SHL r/m16, imm8u","SHLW imm8u, r/m16","shlw imm8u, r/m16","C1 /6 ib","V","V","","operand16","rw,r","Y","16"
> +"SHLD r/m16, r16, imm8u","SHLW imm8u, r16, r/m16","shldw imm8u, r16, r/m16","0F A4 /r ib","V","V","","operand16","rw,r,r","Y","16"
> +"SHLX r32, r/m32, r32V","SHLXL r32V, r/m32, r32","shlxl r32V, r/m32, r32","VEX.NDS.128.66.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32"
> +"SHLX r64, r/m64, r64V","SHLXQ r64V, r/m64, r64","shlxq r64V, r/m64, r64","VEX.NDS.128.66.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64"
> +"SHR r/m8, 1","SHRB 1, r/m8","shrb 1, r/m8","D0 /5","V","V","","","rw,r","Y","8"
> +"SHR r/m8, 1","SHRB 1, r/m8","shrb 1, r/m8","REX D0 /5","N.E.","V","","pseudo64","rw,r","Y","8"
> +"SHR r/m8, CL","SHRB CL, r/m8","shrb CL, r/m8","D2 /5","V","V","","","rw,r","Y","8"
> +"SHR r/m8, CL","SHRB CL, r/m8","shrb CL, r/m8","REX D2 /5","N.E.","V","","pseudo64","rw,r","Y","8"
> +"SHR r/m8, imm8","SHRB imm8, r/m8","shrb imm8, r/m8","REX C0 /5 ib","N.E.","V","","pseudo64","rw,r","Y","8"
> +"SHR r/m8, imm8u","SHRB imm8u, r/m8","shrb imm8u, r/m8","C0 /5 ib","V","V","","","rw,r","Y","8"
> +"SHR r/m32, 1","SHRL 1, r/m32","shrl 1, r/m32","D1 /5","V","V","","operand32","rw,r","Y","32"
> +"SHR r/m32, CL","SHRL CL, r/m32","shrl CL, r/m32","D3 /5","V","V","","operand32","rw,r","Y","32"
> +"SHRD r/m32, r32, CL","SHRL CL, r32, r/m32","shrdl CL, r32, r/m32","0F AD /r","V","V","","operand32","rw,r,r","Y","32"
> +"SHR r/m32, imm8u","SHRL imm8u, r/m32","shrl imm8u, r/m32","C1 /5 ib","V","V","","operand32","rw,r","Y","32"
> +"SHRD r/m32, r32, imm8u","SHRL imm8u, r32, r/m32","shrdl imm8u, r32, r/m32","0F AC /r ib","V","V","","operand32","rw,r,r","Y","32"
> +"SHR r/m64, 1","SHRQ 1, r/m64","shrq 1, r/m64","REX.W D1 /5","N.S.","V","","","rw,r","Y","64"
> +"SHR r/m64, CL","SHRQ CL, r/m64","shrq CL, r/m64","REX.W D3 /5","N.S.","V","","","rw,r","Y","64"
> +"SHRD r/m64, r64, CL","SHRQ CL, r64, r/m64","shrdq CL, r64, r/m64","REX.W 0F AD /r","N.S.","V","","","rw,r,r","Y","64"
> +"SHR r/m64, imm8u","SHRQ imm8u, r/m64","shrq imm8u, r/m64","REX.W C1 /5 ib","N.S.","V","","","rw,r","Y","64"
> +"SHRD r/m64, r64, imm8u","SHRQ imm8u, r64, r/m64","shrdq imm8u, r64, r/m64","REX.W 0F AC /r ib","N.S.","V","","","rw,r,r","Y","64"
> +"SHR r/m16, 1","SHRW 1, r/m16","shrw 1, r/m16","D1 /5","V","V","","operand16","rw,r","Y","16"
> +"SHR r/m16, CL","SHRW CL, r/m16","shrw CL, r/m16","D3 /5","V","V","","operand16","rw,r","Y","16"
> +"SHRD r/m16, r16, CL","SHRW CL, r16, r/m16","shrdw CL, r16, r/m16","0F AD /r","V","V","","operand16","rw,r,r","Y","16"
> +"SHR r/m16, imm8u","SHRW imm8u, r/m16","shrw imm8u, r/m16","C1 /5 ib","V","V","","operand16","rw,r","Y","16"
> +"SHRD r/m16, r16, imm8u","SHRW imm8u, r16, r/m16","shrdw imm8u, r16, r/m16","0F AC /r ib","V","V","","operand16","rw,r,r","Y","16"
> +"SHRX r32, r/m32, r32V","SHRXL r32V, r/m32, r32","shrxl r32V, r/m32, r32","VEX.NDS.128.F2.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32"
> +"SHRX r64, r/m64, r64V","SHRXQ r64V, r/m64, r64","shrxq r64V, r/m64, r64","VEX.NDS.128.F2.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64"
> +"SHUFPD xmm1, xmm2/m128, imm8u","SHUFPD imm8u, xmm2/m128, xmm1","shufpd imm8u, xmm2/m128, xmm1","66 0F C6 /r ib","V","V","SSE2","","rw,r,r","",""
> +"SHUFPS xmm1, xmm2/m128, imm8u","SHUFPS imm8u, xmm2/m128, xmm1","shufps imm8u, xmm2/m128, xmm1","0F C6 /r ib","V","V","SSE","","rw,r,r","",""
> +"SIDT m16&32","SIDT m16&32","sidt m16&32","0F 01 /1","V","N.S.","","modrm_memonly","w","",""
> +"SIDT m16&64","SIDT m16&64","sidt m16&64","0F 01 /1","N.S.","V","","default64,modrm_memonly","w","",""
> +"SKINIT EAX","SKINIT EAX","skinit EAX","0F 01 DE","V","V","SVM","amd,modrm_regonly","r","",""
> +"SLDT r/m16","SLDTW r/m16","sldtw r/m16","0F 00 /0","V","V","","operand16","w","Y","16"
> +"SLDT r32/m16","SLDT{L/W} r32/m16","sldt{l/w} r32/m16","0F 00 /0","V","V","","operand32","w","Y",""
> +"SLDT r64/m16","SLDT{Q/W} r64/m16","sldt{q/w} r64/m16","REX.W 0F 00 /0","N.S.","V","","","w","Y",""
> +"SLWPCB rmr32","SLWPCBL rmr32","slwpcbl rmr32","XOP.128.09.W0 12 /1","V","V","XOP","amd,modrm_regonly,operand16,operand32","w","Y","32"
> +"SLWPCB rmr64","SLWPCBQ rmr64","slwpcbq rmr64","XOP.128.09.W0 12 /1","N.S.","V","XOP","amd,modrm_regonly,operand64","w","Y","64"
> +"SMSW r/m16","SMSWW r/m16","smsww r/m16","0F 01 /4","V","V","","operand16","w","Y","16"
> +"SMSW r32/m16","SMSW{L/W} r32/m16","smsw{l/w} r32/m16","0F 01 /4","V","V","","operand32","w","Y",""
> +"SMSW r64/m16","SMSW{Q/W} r64/m16","smsw{q/w} r64/m16","REX.W 0F 01 /4","N.S.","V","","","w","Y",""
> +"SQRTPD xmm1, xmm2/m128","SQRTPD xmm2/m128, xmm1","sqrtpd xmm2/m128, xmm1","66 0F 51 /r","V","V","SSE2","","w,r","",""
> +"SQRTPS xmm1, xmm2/m128","SQRTPS xmm2/m128, xmm1","sqrtps xmm2/m128, xmm1","0F 51 /r","V","V","SSE","","w,r","",""
> +"SQRTSD xmm1, xmm2/m64","SQRTSD xmm2/m64, xmm1","sqrtsd xmm2/m64, xmm1","F2 0F 51 /r","V","V","SSE2","","w,r","",""
> +"SQRTSS xmm1, xmm2/m32","SQRTSS xmm2/m32, xmm1","sqrtss xmm2/m32, xmm1","F3 0F 51 /r","V","V","SSE","","w,r","",""
> +"STAC","STAC","stac","0F 01 CB","V","V","","","","",""
> +"STC","STC","stc","F9","V","V","","","","",""
> +"STD","STD","std","FD","V","V","","","","",""
> +"STGI","STGI","stgi","0F 01 DC","V","V","SVM","amd","","",""
> +"STI","STI","sti","FB","V","V","","","","",""
> +"STMXCSR m32","STMXCSR m32","stmxcsr m32","0F AE /3","V","V","SSE","modrm_memonly","w","",""
> +"STOSB","STOSB","stosb","AA","V","V","","","","",""
> +"STOSD","STOSL","stosl","AB","V","V","","operand32","","",""
> +"STOSQ","STOSQ","stosq","REX.W AB","N.S.","V","","","","",""
> +"STOSW","STOSW","stosw","AB","V","V","","operand16","","",""
> +"STR r/m16","STRW r/m16","strw r/m16","0F 00 /1","V","V","","operand16","w","Y","16"
> +"STR r32/m16","STR{L/W} r32/m16","str{l/w} r32/m16","0F 00 /1","V","V","","operand32","w","Y",""
> +"STR r64/m16","STR{Q/W} r64/m16","str{q/w} r64/m16","REX.W 0F 00 /1","N.S.","V","","","w","Y",""
> +"SUB AL, imm8","SUBB imm8, AL","subb imm8, AL","2C ib","V","V","","","rw,r","Y","8"
> +"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","80 /5 ib","V","V","","","rw,r","Y","8"
> +"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","82 /5 ib","V","N.S.","","","rw,r","Y","8"
> +"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","REX 80 /5 ib","N.E.","V","","pseudo64","rw,r","Y","8"
> +"SUB r8, r/m8","SUBB r/m8, r8","subb r/m8, r8","2A /r","V","V","","","rw,r","Y","8"
> +"SUB r8, r/m8","SUBB r/m8, r8","subb r/m8, r8","REX 2A /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"SUB r/m8, r8","SUBB r8, r/m8","subb r8, r/m8","28 /r","V","V","","","rw,r","Y","8"
> +"SUB r/m8, r8","SUBB r8, r/m8","subb r8, r/m8","REX 28 /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"SUB EAX, imm32","SUBL imm32, EAX","subl imm32, EAX","2D id","V","V","","operand32","rw,r","Y","32"
> +"SUB r/m32, imm32","SUBL imm32, r/m32","subl imm32, r/m32","81 /5 id","V","V","","operand32","rw,r","Y","32"
> +"SUB r/m32, imm8","SUBL imm8, r/m32","subl imm8, r/m32","83 /5 ib","V","V","","operand32","rw,r","Y","32"
> +"SUB r32, r/m32","SUBL r/m32, r32","subl r/m32, r32","2B /r","V","V","","operand32","rw,r","Y","32"
> +"SUB r/m32, r32","SUBL r32, r/m32","subl r32, r/m32","29 /r","V","V","","operand32","rw,r","Y","32"
> +"SUBPD xmm1, xmm2/m128","SUBPD xmm2/m128, xmm1","subpd xmm2/m128, xmm1","66 0F 5C /r","V","V","SSE2","","rw,r","",""
> +"SUBPS xmm1, xmm2/m128","SUBPS xmm2/m128, xmm1","subps xmm2/m128, xmm1","0F 5C /r","V","V","SSE","","rw,r","",""
> +"SUB RAX, imm32","SUBQ imm32, RAX","subq imm32, RAX","REX.W 2D id","N.S.","V","","","rw,r","Y","64"
> +"SUB r/m64, imm32","SUBQ imm32, r/m64","subq imm32, r/m64","REX.W 81 /5 id","N.S.","V","","","rw,r","Y","64"
> +"SUB r/m64, imm8","SUBQ imm8, r/m64","subq imm8, r/m64","REX.W 83 /5 ib","N.S.","V","","","rw,r","Y","64"
> +"SUB r64, r/m64","SUBQ r/m64, r64","subq r/m64, r64","REX.W 2B /r","N.S.","V","","","rw,r","Y","64"
> +"SUB r/m64, r64","SUBQ r64, r/m64","subq r64, r/m64","REX.W 29 /r","N.S.","V","","","rw,r","Y","64"
> +"SUBSD xmm1, xmm2/m64","SUBSD xmm2/m64, xmm1","subsd xmm2/m64, xmm1","F2 0F 5C /r","V","V","SSE2","","rw,r","",""
> +"SUBSS xmm1, xmm2/m32","SUBSS xmm2/m32, xmm1","subss xmm2/m32, xmm1","F3 0F 5C /r","V","V","SSE","","rw,r","",""
> +"SUB AX, imm16","SUBW imm16, AX","subw imm16, AX","2D iw","V","V","","operand16","rw,r","Y","16"
> +"SUB r/m16, imm16","SUBW imm16, r/m16","subw imm16, r/m16","81 /5 iw","V","V","","operand16","rw,r","Y","16"
> +"SUB r/m16, imm8","SUBW imm8, r/m16","subw imm8, r/m16","83 /5 ib","V","V","","operand16","rw,r","Y","16"
> +"SUB r16, r/m16","SUBW r/m16, r16","subw r/m16, r16","2B /r","V","V","","operand16","rw,r","Y","16"
> +"SUB r/m16, r16","SUBW r16, r/m16","subw r16, r/m16","29 /r","V","V","","operand16","rw,r","Y","16"
> +"SWAPGS","SWAPGS","swapgs","0F 01 F8","N.S.","V","","","","",""
> +"SYSCALL","SYSCALL","syscall","0F 05","N.S.","V","","default64","","",""
> +"SYSCALL","SYSCALL","syscall","0F 05","V","N.S.","AMD","amd","","",""
> +"SYSENTER","SYSENTER","sysenter","0F 34","V","V","PPRO","","","",""
> +"SYSEXIT","SYSEXIT","sysexit","0F 35","V","V","PPRO","","","",""
> +"SYSEXIT","SYSEXIT","sysexit","REX.W 0F 35","N.E.","V","","pseudo","","",""
> +"SYSRET","SYSRET","sysretw/sysretl/sysretl","0F 07","V","N.S.","AMD","amd","","",""
> +"SYSRET","SYSRET","sysretw/sysretl/sysretl","0F 07","N.S.","V","","operand32,operand64","","",""
> +"SYSRET","SYSRET","sysretw/sysretl/sysretl","REX.W 0F 07","I","V","","pseudo","","",""
> +"T1MSKC r32V, r/m32","T1MSKCL r/m32, r32V","t1mskcl r/m32, r32V","XOP.NDD.128.09.WIG 01 /7","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
> +"T1MSKC r64V, r/m64","T1MSKCQ r/m64, r64V","t1mskcq r/m64, r64V","XOP.NDD.128.09.WIG 01 /7","N.S.","V","TBM","amd,operand64","w,r","Y","64"
> +"TEST AL, imm8","TESTB imm8, AL","testb imm8, AL","A8 ib","V","V","","","r,r","Y","8"
> +"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","F6 /0 ib","V","V","","","r,r","Y","8"
> +"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","F6 /1 ib","V","V","","","r,r","Y","8"
> +"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","REX F6 /0 ib","N.E.","V","","pseudo64","r,r","Y","8"
> +"TEST r/m8, r8","TESTB r8, r/m8","testb r8, r/m8","84 /r","V","V","","","r,r","Y","8"
> +"TEST r/m8, r8","TESTB r8, r/m8","testb r8, r/m8","REX 84 /r","N.E.","V","","pseudo64","r,r","Y","8"
> +"TEST EAX, imm32","TESTL imm32, EAX","testl imm32, EAX","A9 id","V","V","","operand32","r,r","Y","32"
> +"TEST r/m32, imm32","TESTL imm32, r/m32","testl imm32, r/m32","F7 /0 id","V","V","","operand32","r,r","Y","32"
> +"TEST r/m32, imm32","TESTL imm32, r/m32","testl imm32, r/m32","F7 /1 id","V","V","","operand32","r,r","Y","32"
> +"TEST r/m32, r32","TESTL r32, r/m32","testl r32, r/m32","85 /r","V","V","","operand32","r,r","Y","32"
> +"TEST RAX, imm32","TESTQ imm32, RAX","testq imm32, RAX","REX.W A9 id","N.S.","V","","","r,r","Y","64"
> +"TEST r/m64, imm32","TESTQ imm32, r/m64","testq imm32, r/m64","REX.W F7 /0 id","N.S.","V","","","r,r","Y","64"
> +"TEST r/m64, imm32","TESTQ imm32, r/m64","testq imm32, r/m64","REX.W F7 /1 id","N.S.","V","","","r,r","Y","64"
> +"TEST r/m64, r64","TESTQ r64, r/m64","testq r64, r/m64","REX.W 85 /r","N.S.","V","","","r,r","Y","64"
> +"TEST AX, imm16","TESTW imm16, AX","testw imm16, AX","A9 iw","V","V","","operand16","r,r","Y","16"
> +"TEST r/m16, imm16","TESTW imm16, r/m16","testw imm16, r/m16","F7 /0 iw","V","V","","operand16","r,r","Y","16"
> +"TEST r/m16, imm16","TESTW imm16, r/m16","testw imm16, r/m16","F7 /1 iw","V","V","","operand16","r,r","Y","16"
> +"TEST r/m16, r16","TESTW r16, r/m16","testw r16, r/m16","85 /r","V","V","","operand16","r,r","Y","16"
> +"TZCNT r32, r/m32","TZCNTL r/m32, r32","tzcntl r/m32, r32","F3 0F BC /r","V","V","BMI1","operand32","w,r","Y","32"
> +"TZCNT r64, r/m64","TZCNTQ r/m64, r64","tzcntq r/m64, r64","F3 REX.W 0F BC /r","N.S.","V","BMI1","","w,r","Y","64"
> +"TZCNT r16, r/m16","TZCNTW r/m16, r16","tzcntw r/m16, r16","F3 0F BC /r","V","V","BMI1","operand16","w,r","Y","16"
> +"TZMSK r32V, r/m32","TZMSKL r/m32, r32V","tzmskl r/m32, r32V","XOP.NDD.128.09.WIG 01 /4","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
> +"TZMSK r64V, r/m64","TZMSKQ r/m64, r64V","tzmskq r/m64, r64V","XOP.NDD.128.09.WIG 01 /4","N.S.","V","TBM","amd,operand64","w,r","Y","64"
> +"UCOMISD xmm1, xmm2/m64","UCOMISD xmm2/m64, xmm1","ucomisd xmm2/m64, xmm1","66 0F 2E /r","V","V","SSE2","","r,r","",""
> +"UCOMISS xmm1, xmm2/m32","UCOMISS xmm2/m32, xmm1","ucomiss xmm2/m32, xmm1","0F 2E /r","V","V","SSE","","r,r","",""
> +"UD0 r32, r/m32","UD0 r/m32, r32","ud0 r/m32, r32","0F FF /r","V","V","PPRO","","r,r","",""
> +"UD1 r32, r/m32","UD1 r/m32, r32","ud1 r/m32, r32","0F B9 /r","V","V","PPRO","","r,r","",""
> +"UD2","UD2","ud2","0F 0B","V","V","PPRO","","","",""
> +"UNPCKHPD xmm1, xmm2/m128","UNPCKHPD xmm2/m128, xmm1","unpckhpd xmm2/m128, xmm1","66 0F 15 /r","V","V","SSE2","","rw,r","",""
> +"UNPCKHPS xmm1, xmm2/m128","UNPCKHPS xmm2/m128, xmm1","unpckhps xmm2/m128, xmm1","0F 15 /r","V","V","SSE","","rw,r","",""
> +"UNPCKLPD xmm1, xmm2/m128","UNPCKLPD xmm2/m128, xmm1","unpcklpd xmm2/m128, xmm1","66 0F 14 /r","V","V","SSE2","","rw,r","",""
> +"UNPCKLPS xmm1, xmm2/m128","UNPCKLPS xmm2/m128, xmm1","unpcklps xmm2/m128, xmm1","0F 14 /r","V","V","SSE","","rw,r","",""
> +"V4FMADDPS zmm1, {k}{z}, zmmV+3, m128","V4FMADDPS m128, zmmV+3, {k}{z}, zmm1","v4fmaddps m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 9A /r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","",""
> +"V4FMADDSS xmm1, {k}{z}, xmmV+3, m128","V4FMADDSS m128, xmmV+3, {k}{z}, xmm1","v4fmaddss m128, xmmV+3, {k}{z}, xmm1","EVEX.DDS.LIG.F2.0F38.W0 9B /r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","",""
> +"V4FNMADDPS zmm1, {k}{z}, zmmV+3, m128","V4FNMADDPS m128, zmmV+3, {k}{z}, zmm1","v4fnmaddps m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 AA /r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","",""
> +"V4FNMADDSS xmm1, {k}{z}, xmmV+3, m128","V4FNMADDSS m128, xmmV+3, {k}{z}, xmm1","v4fnmaddss m128, xmmV+3, {k}{z}, xmm1","EVEX.DDS.LIG.F2.0F38.W0 AB /r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","",""
> +"VADDPD xmm1, xmmV, xmm2/m128","VADDPD xmm2/m128, xmmV, xmm1","vaddpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
> +"VADDPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VADDPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vaddpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 58 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VADDPD ymm1, ymmV, ymm2/m256","VADDPD ymm2/m256, ymmV, ymm1","vaddpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
> +"VADDPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VADDPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vaddpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 58 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VADDPD zmm1{er}, {k}{z}, zmmV, zmm2","VADDPD zmm2, zmmV, {k}{z}, zmm1{er}","vaddpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 58 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VADDPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VADDPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vaddpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 58 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VADDPS xmm1, xmmV, xmm2/m128","VADDPS xmm2/m128, xmmV, xmm1","vaddps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
> +"VADDPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VADDPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vaddps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 58 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VADDPS ymm1, ymmV, ymm2/m256","VADDPS ymm2/m256, ymmV, ymm1","vaddps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
> +"VADDPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VADDPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vaddps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 58 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VADDPS zmm1{er}, {k}{z}, zmmV, zmm2","VADDPS zmm2, zmmV, {k}{z}, zmm1{er}","vaddps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 58 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VADDPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VADDPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vaddps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 58 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VADDSD xmm1{er}, {k}{z}, xmmV, xmm2","VADDSD xmm2, xmmV, {k}{z}, xmm1{er}","vaddsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 58 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VADDSD xmm1, xmmV, xmm2/m64","VADDSD xmm2/m64, xmmV, xmm1","vaddsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
> +"VADDSD xmm1, {k}{z}, xmmV, xmm2/m64","VADDSD xmm2/m64, xmmV, {k}{z}, xmm1","vaddsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 58 /r","V","V","AVX512F","scale8","w,r,r,r","",""
> +"VADDSS xmm1{er}, {k}{z}, xmmV, xmm2","VADDSS xmm2, xmmV, {k}{z}, xmm1{er}","vaddss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 58 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VADDSS xmm1, xmmV, xmm2/m32","VADDSS xmm2/m32, xmmV, xmm1","vaddss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
> +"VADDSS xmm1, {k}{z}, xmmV, xmm2/m32","VADDSS xmm2/m32, xmmV, {k}{z}, xmm1","vaddss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 58 /r","V","V","AVX512F","scale4","w,r,r,r","",""
> +"VADDSUBPD xmm1, xmmV, xmm2/m128","VADDSUBPD xmm2/m128, xmmV, xmm1","vaddsubpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D0 /r","V","V","AVX","","w,r,r","",""
> +"VADDSUBPD ymm1, ymmV, ymm2/m256","VADDSUBPD ymm2/m256, ymmV, ymm1","vaddsubpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D0 /r","V","V","AVX","","w,r,r","",""
> +"VADDSUBPS xmm1, xmmV, xmm2/m128","VADDSUBPS xmm2/m128, xmmV, xmm1","vaddsubps xmm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG D0 /r","V","V","AVX","","w,r,r","",""
> +"VADDSUBPS ymm1, ymmV, ymm2/m256","VADDSUBPS ymm2/m256, ymmV, ymm1","vaddsubps ymm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG D0 /r","V","V","AVX","","w,r,r","",""
> +"VAESDEC xmm1, xmmV, xmm2/m128","VAESDEC xmm2/m128, xmmV, xmm1","vaesdec xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DE /r","V","V","AES+AVX512VL","scale16","w,r,r","",""
> +"VAESDEC xmm1, xmmV, xmm2/m128","VAESDEC xmm2/m128, xmmV, xmm1","vaesdec xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DE /r","V","V","AES+AVX","","w,r,r","",""
> +"VAESDEC ymm1, ymmV, ymm2/m256","VAESDEC ymm2/m256, ymmV, ymm1","vaesdec ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DE /r","V","V","AES+AVX512VL","scale32","w,r,r","",""
> +"VAESDEC ymm1, ymmV, ymm2/m256","VAESDEC ymm2/m256, ymmV, ymm1","vaesdec ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DE /r","V","V","VAES+AVX","","w,r,r","",""
> +"VAESDEC zmm1, zmmV, zmm2/m512","VAESDEC zmm2/m512, zmmV, zmm1","vaesdec zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DE /r","V","V","AES+AVX512F","scale64","w,r,r","",""
> +"VAESDECLAST xmm1, xmmV, xmm2/m128","VAESDECLAST xmm2/m128, xmmV, xmm1","vaesdeclast xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DF /r","V","V","AES+AVX512VL","scale16","w,r,r","",""
> +"VAESDECLAST xmm1, xmmV, xmm2/m128","VAESDECLAST xmm2/m128, xmmV, xmm1","vaesdeclast xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DF /r","V","V","AES+AVX","","w,r,r","",""
> +"VAESDECLAST ymm1, ymmV, ymm2/m256","VAESDECLAST ymm2/m256, ymmV, ymm1","vaesdeclast ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DF /r","V","V","AES+AVX512VL","scale32","w,r,r","",""
> +"VAESDECLAST ymm1, ymmV, ymm2/m256","VAESDECLAST ymm2/m256, ymmV, ymm1","vaesdeclast ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DF /r","V","V","VAES+AVX","","w,r,r","",""
> +"VAESDECLAST zmm1, zmmV, zmm2/m512","VAESDECLAST zmm2/m512, zmmV, zmm1","vaesdeclast zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DF /r","V","V","AES+AVX512F","scale64","w,r,r","",""
> +"VAESENC xmm1, xmmV, xmm2/m128","VAESENC xmm2/m128, xmmV, xmm1","vaesenc xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DC /r","V","V","AES+AVX512VL","scale16","w,r,r","",""
> +"VAESENC xmm1, xmmV, xmm2/m128","VAESENC xmm2/m128, xmmV, xmm1","vaesenc xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DC /r","V","V","AES+AVX","","w,r,r","",""
> +"VAESENC ymm1, ymmV, ymm2/m256","VAESENC ymm2/m256, ymmV, ymm1","vaesenc ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DC /r","V","V","AES+AVX512VL","scale32","w,r,r","",""
> +"VAESENC ymm1, ymmV, ymm2/m256","VAESENC ymm2/m256, ymmV, ymm1","vaesenc ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DC /r","V","V","VAES+AVX","","w,r,r","",""
> +"VAESENC zmm1, zmmV, zmm2/m512","VAESENC zmm2/m512, zmmV, zmm1","vaesenc zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DC /r","V","V","AES+AVX512F","scale64","w,r,r","",""
> +"VAESENCLAST xmm1, xmmV, xmm2/m128","VAESENCLAST xmm2/m128, xmmV, xmm1","vaesenclast xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DD /r","V","V","AES+AVX512VL","scale16","w,r,r","",""
> +"VAESENCLAST xmm1, xmmV, xmm2/m128","VAESENCLAST xmm2/m128, xmmV, xmm1","vaesenclast xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DD /r","V","V","AES+AVX","","w,r,r","",""
> +"VAESENCLAST ymm1, ymmV, ymm2/m256","VAESENCLAST ymm2/m256, ymmV, ymm1","vaesenclast ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DD /r","V","V","AES+AVX512VL","scale32","w,r,r","",""
> +"VAESENCLAST ymm1, ymmV, ymm2/m256","VAESENCLAST ymm2/m256, ymmV, ymm1","vaesenclast ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DD /r","V","V","VAES+AVX","","w,r,r","",""
> +"VAESENCLAST zmm1, zmmV, zmm2/m512","VAESENCLAST zmm2/m512, zmmV, zmm1","vaesenclast zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DD /r","V","V","AES+AVX512F","scale64","w,r,r","",""
> +"VAESIMC xmm1, xmm2/m128","VAESIMC xmm2/m128, xmm1","vaesimc xmm2/m128, xmm1","VEX.128.66.0F38.WIG DB /r","V","V","AES+AVX","","w,r","",""
> +"VAESKEYGENASSIST xmm1, xmm2/m128, imm8u","VAESKEYGENASSIST imm8u, xmm2/m128, xmm1","vaeskeygenassist imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG DF /r ib","V","V","AES+AVX","","w,r,r","",""
> +"VALIGND xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VALIGND imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","valignd imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 03 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
> +"VALIGND ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VALIGND imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","valignd imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 03 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
> +"VALIGND zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VALIGND imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","valignd imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 03 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
> +"VALIGNQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VALIGNQ imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","valignq imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 03 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
> +"VALIGNQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VALIGNQ imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","valignq imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 03 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
> +"VALIGNQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VALIGNQ imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","valignq imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 03 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
> +"VANDNPD xmm1, xmmV, xmm2/m128","VANDNPD xmm2/m128, xmmV, xmm1","vandnpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 55 /r","V","V","AVX","","w,r,r","",""
> +"VANDNPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VANDNPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vandnpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 55 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VANDNPD ymm1, ymmV, ymm2/m256","VANDNPD ymm2/m256, ymmV, ymm1","vandnpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 55 /r","V","V","AVX","","w,r,r","",""
> +"VANDNPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VANDNPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vandnpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 55 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VANDNPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VANDNPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vandnpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 55 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
> +"VANDNPS xmm1, xmmV, xmm2/m128","VANDNPS xmm2/m128, xmmV, xmm1","vandnps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 55 /r","V","V","AVX","","w,r,r","",""
> +"VANDNPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VANDNPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vandnps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 55 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VANDNPS ymm1, ymmV, ymm2/m256","VANDNPS ymm2/m256, ymmV, ymm1","vandnps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 55 /r","V","V","AVX","","w,r,r","",""
> +"VANDNPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VANDNPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vandnps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 55 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VANDNPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VANDNPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vandnps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 55 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
> +"VANDPD xmm1, xmmV, xmm2/m128","VANDPD xmm2/m128, xmmV, xmm1","vandpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 54 /r","V","V","AVX","","w,r,r","",""
> +"VANDPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VANDPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vandpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 54 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VANDPD ymm1, ymmV, ymm2/m256","VANDPD ymm2/m256, ymmV, ymm1","vandpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 54 /r","V","V","AVX","","w,r,r","",""
> +"VANDPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VANDPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vandpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 54 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VANDPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VANDPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vandpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 54 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
> +"VANDPS xmm1, xmmV, xmm2/m128","VANDPS xmm2/m128, xmmV, xmm1","vandps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 54 /r","V","V","AVX","","w,r,r","",""
> +"VANDPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VANDPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vandps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 54 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VANDPS ymm1, ymmV, ymm2/m256","VANDPS ymm2/m256, ymmV, ymm1","vandps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 54 /r","V","V","AVX","","w,r,r","",""
> +"VANDPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VANDPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vandps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 54 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VANDPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VANDPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vandps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 54 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
> +"VBLENDMPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VBLENDMPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vblendmpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 65 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VBLENDMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VBLENDMPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vblendmpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 65 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VBLENDMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VBLENDMPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vblendmpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 65 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VBLENDMPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VBLENDMPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vblendmps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 65 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VBLENDMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VBLENDMPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vblendmps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 65 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VBLENDMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VBLENDMPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vblendmps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 65 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VBLENDPD xmm1, xmmV, xmm2/m128, imm8u","VBLENDPD imm8u, xmm2/m128, xmmV, xmm1","vblendpd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0D /r ib","V","V","AVX","","w,r,r,r","",""
> +"VBLENDPD ymm1, ymmV, ymm2/m256, imm8u","VBLENDPD imm8u, ymm2/m256, ymmV, ymm1","vblendpd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0D /r ib","V","V","AVX","","w,r,r,r","",""
> +"VBLENDPS xmm1, xmmV, xmm2/m128, imm8u","VBLENDPS imm8u, xmm2/m128, xmmV, xmm1","vblendps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0C /r ib","V","V","AVX","","w,r,r,r","",""
> +"VBLENDPS ymm1, ymmV, ymm2/m256, imm8u","VBLENDPS imm8u, ymm2/m256, ymmV, ymm1","vblendps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0C /r ib","V","V","AVX","","w,r,r,r","",""
> +"VBLENDVPD xmm1, xmmV, xmm2/m128, xmmIH","VBLENDVPD xmmIH, xmm2/m128, xmmV, xmm1","vblendvpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4B /r /is4","V","V","AVX","","w,r,r,r","",""
> +"VBLENDVPD ymm1, ymmV, ymm2/m256, ymmIH","VBLENDVPD ymmIH, ymm2/m256, ymmV, ymm1","vblendvpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4B /r /is4","V","V","AVX","","w,r,r,r","",""
> +"VBLENDVPS xmm1, xmmV, xmm2/m128, xmmIH","VBLENDVPS xmmIH, xmm2/m128, xmmV, xmm1","vblendvps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4A /r /is4","V","V","AVX","","w,r,r,r","",""
> +"VBLENDVPS ymm1, ymmV, ymm2/m256, ymmIH","VBLENDVPS ymmIH, ymm2/m256, ymmV, ymm1","vblendvps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4A /r /is4","V","V","AVX","","w,r,r,r","",""
> +"VBROADCASTF128 ymm1, m128","VBROADCASTF128 m128, ymm1","vbroadcastf128 m128, ymm1","VEX.256.66.0F38.W0 1A /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VBROADCASTF32X2 ymm1, {k}{z}, xmm2/m64","VBROADCASTF32X2 xmm2/m64, {k}{z}, ymm1","vbroadcastf32x2 xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W0 19 /r","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","",""
> +"VBROADCASTF32X2 zmm1, {k}{z}, xmm2/m64","VBROADCASTF32X2 xmm2/m64, {k}{z}, zmm1","vbroadcastf32x2 xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W0 19 /r","V","V","AVX512DQ","scale8","w,r,r","",""
> +"VBROADCASTF32X4 ymm1, {k}{z}, m128","VBROADCASTF32X4 m128, {k}{z}, ymm1","vbroadcastf32x4 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 1A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r,r","",""
> +"VBROADCASTF32X4 zmm1, {k}{z}, m128","VBROADCASTF32X4 m128, {k}{z}, zmm1","vbroadcastf32x4 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1A /r","V","V","AVX512F","modrm_memonly,scale16","w,r,r","",""
> +"VBROADCASTF32X8 zmm1, {k}{z}, m256","VBROADCASTF32X8 m256, {k}{z}, zmm1","vbroadcastf32x8 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1B /r","V","V","AVX512DQ","modrm_memonly,scale32","w,r,r","",""
> +"VBROADCASTF64X2 ymm1, {k}{z}, m128","VBROADCASTF64X2 m128, {k}{z}, ymm1","vbroadcastf64x2 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W1 1A /r","V","V","AVX512DQ+AVX512VL","modrm_memonly,scale16","w,r,r","",""
> +"VBROADCASTF64X2 zmm1, {k}{z}, m128","VBROADCASTF64X2 m128, {k}{z}, zmm1","vbroadcastf64x2 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W1 1A /r","V","V","AVX512DQ","modrm_memonly,scale16","w,r,r","",""
> +"VBROADCASTF64X4 zmm1, {k}{z}, m256","VBROADCASTF64X4 m256, {k}{z}, zmm1","vbroadcastf64x4 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W1 1B /r","V","V","AVX512F","modrm_memonly,scale32","w,r,r","",""
> +"VBROADCASTI128 ymm1, m128","VBROADCASTI128 m128, ymm1","vbroadcasti128 m128, ymm1","VEX.256.66.0F38.W0 5A /r","V","V","AVX2","modrm_memonly","w,r","",""
> +"VBROADCASTI32X2 xmm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}, xmm1","vbroadcasti32x2 xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 59 /r","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","",""
> +"VBROADCASTI32X2 ymm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}, ymm1","vbroadcasti32x2 xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W0 59 /r","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","",""
> +"VBROADCASTI32X2 zmm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}, zmm1","vbroadcasti32x2 xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W0 59 /r","V","V","AVX512DQ","scale8","w,r,r","",""
> +"VBROADCASTI32X4 ymm1, {k}{z}, m128","VBROADCASTI32X4 m128, {k}{z}, ymm1","vbroadcasti32x4 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 5A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r,r","",""
> +"VBROADCASTI32X4 zmm1, {k}{z}, m128","VBROADCASTI32X4 m128, {k}{z}, zmm1","vbroadcasti32x4 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W0 5A /r","V","V","AVX512F","modrm_memonly,scale16","w,r,r","",""
> +"VBROADCASTI32X8 zmm1, {k}{z}, m256","VBROADCASTI32X8 m256, {k}{z}, zmm1","vbroadcasti32x8 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 5B /r","V","V","AVX512DQ","modrm_memonly,scale32","w,r,r","",""
> +"VBROADCASTI64X2 ymm1, {k}{z}, m128","VBROADCASTI64X2 m128, {k}{z}, ymm1","vbroadcasti64x2 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W1 5A /r","V","V","AVX512DQ+AVX512VL","modrm_memonly,scale16","w,r,r","",""
> +"VBROADCASTI64X2 zmm1, {k}{z}, m128","VBROADCASTI64X2 m128, {k}{z}, zmm1","vbroadcasti64x2 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W1 5A /r","V","V","AVX512DQ","modrm_memonly,scale16","w,r,r","",""
> +"VBROADCASTI64X4 zmm1, {k}{z}, m256","VBROADCASTI64X4 m256, {k}{z}, zmm1","vbroadcasti64x4 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W1 5B /r","V","V","AVX512F","modrm_memonly,scale32","w,r,r","",""
> +"VBROADCASTSD ymm1, m64","VBROADCASTSD m64, ymm1","vbroadcastsd m64, ymm1","VEX.256.66.0F38.W0 19 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VBROADCASTSD ymm1, xmm2","VBROADCASTSD xmm2, ymm1","vbroadcastsd xmm2, ymm1","VEX.256.66.0F38.W0 19 /r","V","V","AVX2","modrm_regonly","w,r","",""
> +"VBROADCASTSD ymm1, {k}{z}, xmm2/m64","VBROADCASTSD xmm2/m64, {k}{z}, ymm1","vbroadcastsd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 19 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VBROADCASTSD zmm1, {k}{z}, xmm2/m64","VBROADCASTSD xmm2/m64, {k}{z}, zmm1","vbroadcastsd xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 19 /r","V","V","AVX512F","scale8","w,r,r","",""
> +"VBROADCASTSS xmm1, m32","VBROADCASTSS m32, xmm1","vbroadcastss m32, xmm1","VEX.128.66.0F38.W0 18 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VBROADCASTSS ymm1, m32","VBROADCASTSS m32, ymm1","vbroadcastss m32, ymm1","VEX.256.66.0F38.W0 18 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VBROADCASTSS xmm1, xmm2","VBROADCASTSS xmm2, xmm1","vbroadcastss xmm2, xmm1","VEX.128.66.0F38.W0 18 /r","V","V","AVX2","modrm_regonly","w,r","",""
> +"VBROADCASTSS ymm1, xmm2","VBROADCASTSS xmm2, ymm1","vbroadcastss xmm2, ymm1","VEX.256.66.0F38.W0 18 /r","V","V","AVX2","modrm_regonly","w,r","",""
> +"VBROADCASTSS xmm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, xmm1","vbroadcastss xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 18 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VBROADCASTSS ymm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, ymm1","vbroadcastss xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 18 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VBROADCASTSS zmm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, zmm1","vbroadcastss xmm2/m32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 18 /r","V","V","AVX512F","scale4","w,r,r","",""
> +"VCMPPD xmm1, xmmV, xmm2/m128, imm8u","VCMPPD imm8u, xmm2/m128, xmmV, xmm1","vcmppd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VCMPPD k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VCMPPD imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","vcmppd imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F.W1 C2 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
> +"VCMPPD ymm1, ymmV, ymm2/m256, imm8u","VCMPPD imm8u, ymm2/m256, ymmV, ymm1","vcmppd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VCMPPD k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VCMPPD imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","vcmppd imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F.W1 C2 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
> +"VCMPPD k1{sae}, {k}, zmmV, zmm2, imm8u","VCMPPD imm8u, zmm2, zmmV, {k}, k1{sae}","vcmppd imm8u, zmm2, zmmV, {k}, k1{sae}","EVEX.NDS.512.66.0F.W1 C2 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
> +"VCMPPD k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VCMPPD imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","vcmppd imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F.W1 C2 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
> +"VCMPPS xmm1, xmmV, xmm2/m128, imm8u","VCMPPS imm8u, xmm2/m128, xmmV, xmm1","vcmpps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VCMPPS k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VCMPPS imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","vcmpps imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.0F.W0 C2 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
> +"VCMPPS ymm1, ymmV, ymm2/m256, imm8u","VCMPPS imm8u, ymm2/m256, ymmV, ymm1","vcmpps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VCMPPS k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VCMPPS imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","vcmpps imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.0F.W0 C2 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
> +"VCMPPS k1{sae}, {k}, zmmV, zmm2, imm8u","VCMPPS imm8u, zmm2, zmmV, {k}, k1{sae}","vcmpps imm8u, zmm2, zmmV, {k}, k1{sae}","EVEX.NDS.512.0F.W0 C2 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
> +"VCMPPS k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VCMPPS imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","vcmpps imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.0F.W0 C2 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
> +"VCMPSD k1{sae}, {k}, xmmV, xmm2, imm8u","VCMPSD imm8u, xmm2, xmmV, {k}, k1{sae}","vcmpsd imm8u, xmm2, xmmV, {k}, k1{sae}","EVEX.NDS.128.F2.0F.W1 C2 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
> +"VCMPSD xmm1, xmmV, xmm2/m64, imm8u","VCMPSD imm8u, xmm2/m64, xmmV, xmm1","vcmpsd imm8u, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VCMPSD k1, {k}, xmmV, xmm2/m64, imm8u","VCMPSD imm8u, xmm2/m64, xmmV, {k}, k1","vcmpsd imm8u, xmm2/m64, xmmV, {k}, k1","EVEX.NDS.LIG.F2.0F.W1 C2 /r ib","V","V","AVX512F","scale8","w,r,r,r,r","",""
> +"VCMPSS k1{sae}, {k}, xmmV, xmm2, imm8u","VCMPSS imm8u, xmm2, xmmV, {k}, k1{sae}","vcmpss imm8u, xmm2, xmmV, {k}, k1{sae}","EVEX.NDS.128.F3.0F.W0 C2 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
> +"VCMPSS xmm1, xmmV, xmm2/m32, imm8u","VCMPSS imm8u, xmm2/m32, xmmV, xmm1","vcmpss imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VCMPSS k1, {k}, xmmV, xmm2/m32, imm8u","VCMPSS imm8u, xmm2/m32, xmmV, {k}, k1","vcmpss imm8u, xmm2/m32, xmmV, {k}, k1","EVEX.NDS.LIG.F3.0F.W0 C2 /r ib","V","V","AVX512F","scale4","w,r,r,r,r","",""
> +"VCOMISD xmm1{sae}, xmm2","VCOMISD xmm2, xmm1{sae}","vcomisd xmm2, xmm1{sae}","EVEX.128.66.0F.W1 2F /r","V","V","AVX512F","modrm_regonly","r,r","",""
> +"VCOMISD xmm1, xmm2/m64","VCOMISD xmm2/m64, xmm1","vcomisd xmm2/m64, xmm1","EVEX.LIG.66.0F.W1 2F /r","V","V","AVX512F","scale8","r,r","",""
> +"VCOMISD xmm1, xmm2/m64","VCOMISD xmm2/m64, xmm1","vcomisd xmm2/m64, xmm1","VEX.LIG.66.0F.WIG 2F /r","V","V","AVX","","r,r","",""
> +"VCOMISS xmm1{sae}, xmm2","VCOMISS xmm2, xmm1{sae}","vcomiss xmm2, xmm1{sae}","EVEX.128.0F.W0 2F /r","V","V","AVX512F","modrm_regonly","r,r","",""
> +"VCOMISS xmm1, xmm2/m32","VCOMISS xmm2/m32, xmm1","vcomiss xmm2/m32, xmm1","EVEX.LIG.0F.W0 2F /r","V","V","AVX512F","scale4","r,r","",""
> +"VCOMISS xmm1, xmm2/m32","VCOMISS xmm2/m32, xmm1","vcomiss xmm2/m32, xmm1","VEX.LIG.0F.WIG 2F /r","V","V","AVX","","r,r","",""
> +"VCOMPRESSPD xmm2/m128, {k}{z}, xmm1","VCOMPRESSPD xmm1, {k}{z}, xmm2/m128","vcompresspd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 8A /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VCOMPRESSPD ymm2/m256, {k}{z}, ymm1","VCOMPRESSPD ymm1, {k}{z}, ymm2/m256","vcompresspd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 8A /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VCOMPRESSPD zmm2/m512, {k}{z}, zmm1","VCOMPRESSPD zmm1, {k}{z}, zmm2/m512","vcompresspd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 8A /r","V","V","AVX512F","scale8","w,r,r","",""
> +"VCOMPRESSPS xmm2/m128, {k}{z}, xmm1","VCOMPRESSPS xmm1, {k}{z}, xmm2/m128","vcompressps xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 8A /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VCOMPRESSPS ymm2/m256, {k}{z}, ymm1","VCOMPRESSPS ymm1, {k}{z}, ymm2/m256","vcompressps ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 8A /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VCOMPRESSPS zmm2/m512, {k}{z}, zmm1","VCOMPRESSPS zmm1, {k}{z}, zmm2/m512","vcompressps zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 8A /r","V","V","AVX512F","scale4","w,r,r","",""
> +"VCVTDQ2PD ymm1, xmm2/m128","VCVTDQ2PD xmm2/m128, ymm1","vcvtdq2pd xmm2/m128, ymm1","VEX.256.F3.0F.WIG E6 /r","V","V","AVX","","w,r","",""
> +"VCVTDQ2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTDQ2PD xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtdq2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W0 E6 /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","",""
> +"VCVTDQ2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTDQ2PD xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtdq2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W0 E6 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VCVTDQ2PD xmm1, xmm2/m64","VCVTDQ2PD xmm2/m64, xmm1","vcvtdq2pd xmm2/m64, xmm1","VEX.128.F3.0F.WIG E6 /r","V","V","AVX","","w,r","",""
> +"VCVTDQ2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTDQ2PD ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtdq2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W0 E6 /r","V","V","AVX512F","bscale4,scale32","w,r,r","",""
> +"VCVTDQ2PS xmm1, xmm2/m128","VCVTDQ2PS xmm2/m128, xmm1","vcvtdq2ps xmm2/m128, xmm1","VEX.128.0F.WIG 5B /r","V","V","AVX","","w,r","",""
> +"VCVTDQ2PS xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTDQ2PS xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtdq2ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VCVTDQ2PS ymm1, ymm2/m256","VCVTDQ2PS ymm2/m256, ymm1","vcvtdq2ps ymm2/m256, ymm1","VEX.256.0F.WIG 5B /r","V","V","AVX","","w,r","",""
> +"VCVTDQ2PS ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTDQ2PS ymm2/m256/m32bcst, {k}{z}, ymm1","vcvtdq2ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VCVTDQ2PS zmm1{er}, {k}{z}, zmm2","VCVTDQ2PS zmm2, {k}{z}, zmm1{er}","vcvtdq2ps zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 5B /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VCVTDQ2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTDQ2PS zmm2/m512/m32bcst, {k}{z}, zmm1","vcvtdq2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 5B /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
> +"VCVTPD2DQ ymm1{er}, {k}{z}, zmm2","VCVTPD2DQ zmm2, {k}{z}, ymm1{er}","vcvtpd2dq zmm2, {k}{z}, ymm1{er}","EVEX.512.F2.0F.W1 E6 /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
> +"VCVTPD2DQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2DQ zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtpd2dq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.F2.0F.W1 E6 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
> +"VCVTPD2DQ xmm1, xmm2/m128","VCVTPD2DQX xmm2/m128, xmm1","vcvtpd2dqx xmm2/m128, xmm1","VEX.128.F2.0F.WIG E6 /r","V","V","AVX","","w,r","Y","128"
> +"VCVTPD2DQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2DQX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2dqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F2.0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
> +"VCVTPD2DQ xmm1, ymm2/m256","VCVTPD2DQY ymm2/m256, xmm1","vcvtpd2dqy ymm2/m256, xmm1","VEX.256.F2.0F.WIG E6 /r","V","V","AVX","","w,r","Y","256"
> +"VCVTPD2DQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2DQY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtpd2dqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.F2.0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
> +"VCVTPD2PS ymm1{er}, {k}{z}, zmm2","VCVTPD2PS zmm2, {k}{z}, ymm1{er}","vcvtpd2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.66.0F.W1 5A /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
> +"VCVTPD2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2PS zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtpd2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.66.0F.W1 5A /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
> +"VCVTPD2PS xmm1, xmm2/m128","VCVTPD2PSX xmm2/m128, xmm1","vcvtpd2psx xmm2/m128, xmm1","VEX.128.66.0F.WIG 5A /r","V","V","AVX","","w,r","Y","128"
> +"VCVTPD2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2PSX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 5A /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
> +"VCVTPD2PS xmm1, ymm2/m256","VCVTPD2PSY ymm2/m256, xmm1","vcvtpd2psy ymm2/m256, xmm1","VEX.256.66.0F.WIG 5A /r","V","V","AVX","","w,r","Y","256"
> +"VCVTPD2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2PSY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtpd2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.66.0F.W1 5A /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
> +"VCVTPD2QQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2QQ xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2qq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 7B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VCVTPD2QQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2QQ ymm2/m256/m64bcst, {k}{z}, ymm1","vcvtpd2qq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 7B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VCVTPD2QQ zmm1{er}, {k}{z}, zmm2","VCVTPD2QQ zmm2, {k}{z}, zmm1{er}","vcvtpd2qq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 7B /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"VCVTPD2QQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2QQ zmm2/m512/m64bcst, {k}{z}, zmm1","vcvtpd2qq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 7B /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
> +"VCVTPD2UDQ ymm1{er}, {k}{z}, zmm2","VCVTPD2UDQ zmm2, {k}{z}, ymm1{er}","vcvtpd2udq zmm2, {k}{z}, ymm1{er}","EVEX.512.0F.W1 79 /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
> +"VCVTPD2UDQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2UDQ zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtpd2udq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0F.W1 79 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
> +"VCVTPD2UDQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2UDQX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2udqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.0F.W1 79 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
> +"VCVTPD2UDQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2UDQY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtpd2udqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.0F.W1 79 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
> +"VCVTPD2UQQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2UQQ xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2uqq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 79 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VCVTPD2UQQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2UQQ ymm2/m256/m64bcst, {k}{z}, ymm1","vcvtpd2uqq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 79 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VCVTPD2UQQ zmm1{er}, {k}{z}, zmm2","VCVTPD2UQQ zmm2, {k}{z}, zmm1{er}","vcvtpd2uqq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 79 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"VCVTPD2UQQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2UQQ zmm2/m512/m64bcst, {k}{z}, zmm1","vcvtpd2uqq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 79 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
> +"VCVTPH2PS ymm1, xmm2/m128","VCVTPH2PS xmm2/m128, ymm1","vcvtph2ps xmm2/m128, ymm1","VEX.256.66.0F38.W0 13 /r","V","V","F16C","","w,r","",""
> +"VCVTPH2PS ymm1, {k}{z}, xmm2/m128","VCVTPH2PS xmm2/m128, {k}{z}, ymm1","vcvtph2ps xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 13 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VCVTPH2PS xmm1, xmm2/m64","VCVTPH2PS xmm2/m64, xmm1","vcvtph2ps xmm2/m64, xmm1","VEX.128.66.0F38.W0 13 /r","V","V","F16C","","w,r","",""
> +"VCVTPH2PS xmm1, {k}{z}, xmm2/m64","VCVTPH2PS xmm2/m64, {k}{z}, xmm1","vcvtph2ps xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 13 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VCVTPH2PS zmm1{sae}, {k}{z}, ymm2","VCVTPH2PS ymm2, {k}{z}, zmm1{sae}","vcvtph2ps ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 13 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VCVTPH2PS zmm1, {k}{z}, ymm2/m256","VCVTPH2PS ymm2/m256, {k}{z}, zmm1","vcvtph2ps ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 13 /r","V","V","AVX512F","scale32","w,r,r","",""
> +"VCVTPS2DQ xmm1, xmm2/m128","VCVTPS2DQ xmm2/m128, xmm1","vcvtps2dq xmm2/m128, xmm1","VEX.128.66.0F.WIG 5B /r","V","V","AVX","","w,r","",""
> +"VCVTPS2DQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2DQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2dq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VCVTPS2DQ ymm1, ymm2/m256","VCVTPS2DQ ymm2/m256, ymm1","vcvtps2dq ymm2/m256, ymm1","VEX.256.66.0F.WIG 5B /r","V","V","AVX","","w,r","",""
> +"VCVTPS2DQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTPS2DQ ymm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2dq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VCVTPS2DQ zmm1{er}, {k}{z}, zmm2","VCVTPS2DQ zmm2, {k}{z}, zmm1{er}","vcvtps2dq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 5B /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VCVTPS2DQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTPS2DQ zmm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2dq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 5B /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
> +"VCVTPS2PD ymm1, xmm2/m128","VCVTPS2PD xmm2/m128, ymm1","vcvtps2pd xmm2/m128, ymm1","VEX.256.0F.WIG 5A /r","V","V","AVX","","w,r","",""
> +"VCVTPS2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2PD xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 5A /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","",""
> +"VCVTPS2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2PD xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 5A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VCVTPS2PD xmm1, xmm2/m64","VCVTPS2PD xmm2/m64, xmm1","vcvtps2pd xmm2/m64, xmm1","VEX.128.0F.WIG 5A /r","V","V","AVX","","w,r","",""
> +"VCVTPS2PD zmm1{sae}, {k}{z}, ymm2","VCVTPS2PD ymm2, {k}{z}, zmm1{sae}","vcvtps2pd ymm2, {k}{z}, zmm1{sae}","EVEX.512.0F.W0 5A /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VCVTPS2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2PD ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 5A /r","V","V","AVX512F","bscale4,scale32","w,r,r","",""
> +"VCVTPS2PH xmm2/m64, xmm1, imm8u","VCVTPS2PH imm8u, xmm1, xmm2/m64","vcvtps2ph imm8u, xmm1, xmm2/m64","VEX.128.66.0F3A.W0 1D /r ib","V","V","F16C","","w,r,r","",""
> +"VCVTPS2PH xmm2/m64, {k}{z}, xmm1, imm8u","VCVTPS2PH imm8u, xmm1, {k}{z}, xmm2/m64","vcvtps2ph imm8u, xmm1, {k}{z}, xmm2/m64","EVEX.128.66.0F3A.W0 1D /r ib","V","V","AVX512F+AVX512VL","scale8","w,r,r,r","",""
> +"VCVTPS2PH xmm2/m128, ymm1, imm8u","VCVTPS2PH imm8u, ymm1, xmm2/m128","vcvtps2ph imm8u, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 1D /r ib","V","V","F16C","","w,r,r","",""
> +"VCVTPS2PH xmm2/m128, {k}{z}, ymm1, imm8u","VCVTPS2PH imm8u, ymm1, {k}{z}, xmm2/m128","vcvtps2ph imm8u, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W0 1D /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VCVTPS2PH ymm2/m256, {k}{z}, zmm1, imm8u","VCVTPS2PH imm8u, zmm1, {k}{z}, ymm2/m256","vcvtps2ph imm8u, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W0 1D /r ib","V","V","AVX512F","scale32","w,r,r,r","",""
> +"VCVTPS2PH ymm2{sae}, {k}{z}, zmm1, imm8u","VCVTPS2PH imm8u, zmm1, {k}{z}, ymm2{sae}","vcvtps2ph imm8u, zmm1, {k}{z}, ymm2{sae}","EVEX.512.66.0F3A.W0 1D /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VCVTPS2QQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2QQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2qq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 7B /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","",""
> +"VCVTPS2QQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2QQ xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2qq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 7B /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VCVTPS2QQ zmm1{er}, {k}{z}, ymm2","VCVTPS2QQ ymm2, {k}{z}, zmm1{er}","vcvtps2qq ymm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 7B /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"VCVTPS2QQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2QQ ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2qq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 7B /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","",""
> +"VCVTPS2UDQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2UDQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2udq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 79 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VCVTPS2UDQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTPS2UDQ ymm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2udq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 79 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VCVTPS2UDQ zmm1{er}, {k}{z}, zmm2","VCVTPS2UDQ zmm2, {k}{z}, zmm1{er}","vcvtps2udq zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 79 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VCVTPS2UDQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTPS2UDQ zmm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2udq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 79 /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
> +"VCVTPS2UQQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2UQQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2uqq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 79 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","",""
> +"VCVTPS2UQQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2UQQ xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2uqq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 79 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VCVTPS2UQQ zmm1{er}, {k}{z}, ymm2","VCVTPS2UQQ ymm2, {k}{z}, zmm1{er}","vcvtps2uqq ymm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 79 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"VCVTPS2UQQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2UQQ ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2uqq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 79 /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","",""
> +"VCVTQQ2PD xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTQQ2PD xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtqq2pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W1 E6 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VCVTQQ2PD ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTQQ2PD ymm2/m256/m64bcst, {k}{z}, ymm1","vcvtqq2pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W1 E6 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VCVTQQ2PD zmm1{er}, {k}{z}, zmm2","VCVTQQ2PD zmm2, {k}{z}, zmm1{er}","vcvtqq2pd zmm2, {k}{z}, zmm1{er}","EVEX.512.F3.0F.W1 E6 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"VCVTQQ2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTQQ2PD zmm2/m512/m64bcst, {k}{z}, zmm1","vcvtqq2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W1 E6 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
> +"VCVTQQ2PS ymm1{er}, {k}{z}, zmm2","VCVTQQ2PS zmm2, {k}{z}, ymm1{er}","vcvtqq2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.0F.W1 5B /r","V","V","AVX512DQ","modrm_regonly","w,r,r","Y",""
> +"VCVTQQ2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTQQ2PS zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtqq2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0F.W1 5B /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","Y","512"
> +"VCVTQQ2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTQQ2PSX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtqq2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.0F.W1 5B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","Y","128"
> +"VCVTQQ2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTQQ2PSY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtqq2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.0F.W1 5B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","Y","256"
> +"VCVTSD2SI r32{er}, xmm2","VCVTSD2SI xmm2, r32{er}","vcvtsd2si xmm2, r32{er}","EVEX.128.F2.0F.W0 2D /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
> +"VCVTSD2SI r32, xmm2/m64","VCVTSD2SI xmm2/m64, r32","vcvtsd2si xmm2/m64, r32","EVEX.LIG.F2.0F.W0 2D /r","V","V","AVX512F","scale8","w,r","Y","32"
> +"VCVTSD2SI r32, xmm2/m64","VCVTSD2SI xmm2/m64, r32","vcvtsd2si xmm2/m64, r32","VEX.LIG.F2.0F.W0 2D /r","V","V","AVX","","w,r","Y","32"
> +"VCVTSD2SI r64{er}, xmm2","VCVTSD2SIQ xmm2, r64{er}","vcvtsd2siq xmm2, r64{er}","EVEX.128.F2.0F.W1 2D /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
> +"VCVTSD2SI r64, xmm2/m64","VCVTSD2SIQ xmm2/m64, r64","vcvtsd2siq xmm2/m64, r64","EVEX.LIG.F2.0F.W1 2D /r","N.S.","V","AVX512F","scale8","w,r","Y","64"
> +"VCVTSD2SI r64, xmm2/m64","VCVTSD2SIQ xmm2/m64, r64","vcvtsd2siq xmm2/m64, r64","VEX.LIG.F2.0F.W1 2D /r","N.S.","V","AVX","","w,r","Y","64"
> +"VCVTSD2SS xmm1{er}, {k}{z}, xmmV, xmm2","VCVTSD2SS xmm2, xmmV, {k}{z}, xmm1{er}","vcvtsd2ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5A /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VCVTSD2SS xmm1, xmmV, xmm2/m64","VCVTSD2SS xmm2/m64, xmmV, xmm1","vcvtsd2ss xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5A /r","V","V","AVX","","w,r,r","",""
> +"VCVTSD2SS xmm1, {k}{z}, xmmV, xmm2/m64","VCVTSD2SS xmm2/m64, xmmV, {k}{z}, xmm1","vcvtsd2ss xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5A /r","V","V","AVX512F","scale8","w,r,r,r","",""
> +"VCVTSD2USI r32{er}, xmm2","VCVTSD2USIL xmm2, r32{er}","vcvtsd2usi xmm2, r32{er}","EVEX.128.F2.0F.W0 79 /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
> +"VCVTSD2USI r32, xmm2/m64","VCVTSD2USIL xmm2/m64, r32","vcvtsd2usi xmm2/m64, r32","EVEX.LIG.F2.0F.W0 79 /r","V","V","AVX512F","scale8","w,r","Y","32"
> +"VCVTSD2USI r64{er}, xmm2","VCVTSD2USIQ xmm2, r64{er}","vcvtsd2usi xmm2, r64{er}","EVEX.128.F2.0F.W1 79 /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
> +"VCVTSD2USI r64, xmm2/m64","VCVTSD2USIQ xmm2/m64, r64","vcvtsd2usi xmm2/m64, r64","EVEX.LIG.F2.0F.W1 79 /r","N.S.","V","AVX512F","scale8","w,r","Y","64"
> +"VCVTSI2SD xmm1, xmmV, r/m32","VCVTSI2SDL r/m32, xmmV, xmm1","vcvtsi2sdl r/m32, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W0 2A /r","V","V","AVX512F","scale4","w,r,r","Y","32"
> +"VCVTSI2SD xmm1, xmmV, r/m32","VCVTSI2SDL r/m32, xmmV, xmm1","vcvtsi2sdl r/m32, xmmV, xmm1","VEX.NDS.LIG.F2.0F.W0 2A /r","V","V","AVX","","w,r,r","Y","32"
> +"VCVTSI2SD xmm1, xmmV, r/m64","VCVTSI2SDQ r/m64, xmmV, xmm1","vcvtsi2sdq r/m64, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W1 2A /r","N.S.","V","AVX512F","scale8","w,r,r","Y","64"
> +"VCVTSI2SD xmm1, xmmV, r/m64","VCVTSI2SDQ r/m64, xmmV, xmm1","vcvtsi2sdq r/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.W1 2A /r","N.S.","V","AVX","","w,r,r","Y","64"
> +"VCVTSI2SD xmm1{er}, xmmV, rmr64","VCVTSI2SDQ rmr64, xmmV, xmm1{er}","vcvtsi2sdq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F2.0F.W1 2A /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","Y","64"
> +"VCVTSI2SS xmm1, xmmV, r/m32","VCVTSI2SSL r/m32, xmmV, xmm1","vcvtsi2ssl r/m32, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W0 2A /r","V","V","AVX512F","scale4","w,r,r","Y","32"
> +"VCVTSI2SS xmm1, xmmV, r/m32","VCVTSI2SSL r/m32, xmmV, xmm1","vcvtsi2ssl r/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.W0 2A /r","V","V","AVX","","w,r,r","Y","32"
> +"VCVTSI2SS xmm1{er}, xmmV, rmr32","VCVTSI2SSL rmr32, xmmV, xmm1{er}","vcvtsi2ssl rmr32, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W0 2A /r","V","V","AVX512F","modrm_regonly","w,r,r","Y","32"
> +"VCVTSI2SS xmm1, xmmV, r/m64","VCVTSI2SSQ r/m64, xmmV, xmm1","vcvtsi2ssq r/m64, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W1 2A /r","N.S.","V","AVX512F","scale8","w,r,r","Y","64"
> +"VCVTSI2SS xmm1, xmmV, r/m64","VCVTSI2SSQ r/m64, xmmV, xmm1","vcvtsi2ssq r/m64, xmmV, xmm1","VEX.NDS.LIG.F3.0F.W1 2A /r","N.S.","V","AVX","","w,r,r","Y","64"
> +"VCVTSI2SS xmm1{er}, xmmV, rmr64","VCVTSI2SSQ rmr64, xmmV, xmm1{er}","vcvtsi2ssq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W1 2A /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","Y","64"
> +"VCVTSS2SD xmm1{sae}, {k}{z}, xmmV, xmm2","VCVTSS2SD xmm2, xmmV, {k}{z}, xmm1{sae}","vcvtss2sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 5A /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VCVTSS2SD xmm1, xmmV, xmm2/m32","VCVTSS2SD xmm2/m32, xmmV, xmm1","vcvtss2sd xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5A /r","V","V","AVX","","w,r,r","",""
> +"VCVTSS2SD xmm1, {k}{z}, xmmV, xmm2/m32","VCVTSS2SD xmm2/m32, xmmV, {k}{z}, xmm1","vcvtss2sd xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5A /r","V","V","AVX512F","scale4","w,r,r,r","",""
> +"VCVTSS2SI r32{er}, xmm2","VCVTSS2SI xmm2, r32{er}","vcvtss2si xmm2, r32{er}","EVEX.128.F3.0F.W0 2D /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
> +"VCVTSS2SI r32, xmm2/m32","VCVTSS2SI xmm2/m32, r32","vcvtss2si xmm2/m32, r32","EVEX.LIG.F3.0F.W0 2D /r","V","V","AVX512F","scale4","w,r","Y","32"
> +"VCVTSS2SI r32, xmm2/m32","VCVTSS2SI xmm2/m32, r32","vcvtss2si xmm2/m32, r32","VEX.LIG.F3.0F.W0 2D /r","V","V","AVX","","w,r","Y","32"
> +"VCVTSS2SI r64{er}, xmm2","VCVTSS2SIQ xmm2, r64{er}","vcvtss2siq xmm2, r64{er}","EVEX.128.F3.0F.W1 2D /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
> +"VCVTSS2SI r64, xmm2/m32","VCVTSS2SIQ xmm2/m32, r64","vcvtss2siq xmm2/m32, r64","EVEX.LIG.F3.0F.W1 2D /r","N.S.","V","AVX512F","scale4","w,r","Y","64"
> +"VCVTSS2SI r64, xmm2/m32","VCVTSS2SIQ xmm2/m32, r64","vcvtss2siq xmm2/m32, r64","VEX.LIG.F3.0F.W1 2D /r","N.S.","V","AVX","","w,r","Y","64"
> +"VCVTSS2USI r32{er}, xmm2","VCVTSS2USIL xmm2, r32{er}","vcvtss2usil xmm2, r32{er}","EVEX.128.F3.0F.W0 79 /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
> +"VCVTSS2USI r32, xmm2/m32","VCVTSS2USIL xmm2/m32, r32","vcvtss2usil xmm2/m32, r32","EVEX.LIG.F3.0F.W0 79 /r","V","V","AVX512F","scale4","w,r","Y","32"
> +"VCVTSS2USI r64{er}, xmm2","VCVTSS2USIQ xmm2, r64{er}","vcvtss2usiq xmm2, r64{er}","EVEX.128.F3.0F.W1 79 /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
> +"VCVTSS2USI r64, xmm2/m32","VCVTSS2USIQ xmm2/m32, r64","vcvtss2usiq xmm2/m32, r64","EVEX.LIG.F3.0F.W1 79 /r","N.S.","V","AVX512F","scale4","w,r","Y","64"
> +"VCVTTPD2DQ ymm1{sae}, {k}{z}, zmm2","VCVTTPD2DQ zmm2, {k}{z}, ymm1{sae}","vcvttpd2dq zmm2, {k}{z}, ymm1{sae}","EVEX.512.66.0F.W1 E6 /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
> +"VCVTTPD2DQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2DQ zmm2/m512/m64bcst, {k}{z}, ymm1","vcvttpd2dq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.66.0F.W1 E6 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
> +"VCVTTPD2DQ xmm1, xmm2/m128","VCVTTPD2DQX xmm2/m128, xmm1","vcvttpd2dqx xmm2/m128, xmm1","VEX.128.66.0F.WIG E6 /r","V","V","AVX","","w,r","Y","128"
> +"VCVTTPD2DQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2DQX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvttpd2dqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
> +"VCVTTPD2DQ xmm1, ymm2/m256","VCVTTPD2DQY ymm2/m256, xmm1","vcvttpd2dqy ymm2/m256, xmm1","VEX.256.66.0F.WIG E6 /r","V","V","AVX","","w,r","Y","256"
> +"VCVTTPD2DQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2DQY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvttpd2dqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.66.0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
> +"VCVTTPD2QQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2QQ xmm2/m128/m64bcst, {k}{z}, xmm1","vcvttpd2qq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VCVTTPD2QQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2QQ ymm2/m256/m64bcst, {k}{z}, ymm1","vcvttpd2qq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VCVTTPD2QQ zmm1{sae}, {k}{z}, zmm2","VCVTTPD2QQ zmm2, {k}{z}, zmm1{sae}","vcvttpd2qq zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W1 7A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"VCVTTPD2QQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2QQ zmm2/m512/m64bcst, {k}{z}, zmm1","vcvttpd2qq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
> +"VCVTTPD2UDQ ymm1{sae}, {k}{z}, zmm2","VCVTTPD2UDQ zmm2, {k}{z}, ymm1{sae}","vcvttpd2udq zmm2, {k}{z}, ymm1{sae}","EVEX.512.0F.W1 78 /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
> +"VCVTTPD2UDQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2UDQ zmm2/m512/m64bcst, {k}{z}, ymm1","vcvttpd2udq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0F.W1 78 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
> +"VCVTTPD2UDQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2UDQX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvttpd2udqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.0F.W1 78 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
> +"VCVTTPD2UDQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2UDQY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvttpd2udqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.0F.W1 78 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
> +"VCVTTPD2UQQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2UQQ xmm2/m128/m64bcst, {k}{z}, xmm1","vcvttpd2uqq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 78 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VCVTTPD2UQQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2UQQ ymm2/m256/m64bcst, {k}{z}, ymm1","vcvttpd2uqq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 78 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VCVTTPD2UQQ zmm1{sae}, {k}{z}, zmm2","VCVTTPD2UQQ zmm2, {k}{z}, zmm1{sae}","vcvttpd2uqq zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W1 78 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"VCVTTPD2UQQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2UQQ zmm2/m512/m64bcst, {k}{z}, zmm1","vcvttpd2uqq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 78 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
> +"VCVTTPS2DQ xmm1, xmm2/m128","VCVTTPS2DQ xmm2/m128, xmm1","vcvttps2dq xmm2/m128, xmm1","VEX.128.F3.0F.WIG 5B /r","V","V","AVX","","w,r","",""
> +"VCVTTPS2DQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2DQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvttps2dq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VCVTTPS2DQ ymm1, ymm2/m256","VCVTTPS2DQ ymm2/m256, ymm1","vcvttps2dq ymm2/m256, ymm1","VEX.256.F3.0F.WIG 5B /r","V","V","AVX","","w,r","",""
> +"VCVTTPS2DQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTTPS2DQ ymm2/m256/m32bcst, {k}{z}, ymm1","vcvttps2dq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VCVTTPS2DQ zmm1{sae}, {k}{z}, zmm2","VCVTTPS2DQ zmm2, {k}{z}, zmm1{sae}","vcvttps2dq zmm2, {k}{z}, zmm1{sae}","EVEX.512.F3.0F.W0 5B /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VCVTTPS2DQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTTPS2DQ zmm2/m512/m32bcst, {k}{z}, zmm1","vcvttps2dq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W0 5B /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
> +"VCVTTPS2QQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2QQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvttps2qq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 7A /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","",""
> +"VCVTTPS2QQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTTPS2QQ xmm2/m256/m32bcst, {k}{z}, ymm1","vcvttps2qq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 7A /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VCVTTPS2QQ zmm1{sae}, {k}{z}, ymm2","VCVTTPS2QQ ymm2, {k}{z}, zmm1{sae}","vcvttps2qq ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W0 7A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"VCVTTPS2QQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTTPS2QQ ymm2/m512/m32bcst, {k}{z}, zmm1","vcvttps2qq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 7A /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","",""
> +"VCVTTPS2UDQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2UDQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvttps2udq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 78 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VCVTTPS2UDQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTTPS2UDQ ymm2/m256/m32bcst, {k}{z}, ymm1","vcvttps2udq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 78 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VCVTTPS2UDQ zmm1{sae}, {k}{z}, zmm2","VCVTTPS2UDQ zmm2, {k}{z}, zmm1{sae}","vcvttps2udq zmm2, {k}{z}, zmm1{sae}","EVEX.512.0F.W0 78 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VCVTTPS2UDQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTTPS2UDQ zmm2/m512/m32bcst, {k}{z}, zmm1","vcvttps2udq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 78 /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
> +"VCVTTPS2UQQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2UQQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvttps2uqq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 78 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","",""
> +"VCVTTPS2UQQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTTPS2UQQ xmm2/m256/m32bcst, {k}{z}, ymm1","vcvttps2uqq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 78 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VCVTTPS2UQQ zmm1{sae}, {k}{z}, ymm2","VCVTTPS2UQQ ymm2, {k}{z}, zmm1{sae}","vcvttps2uqq ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W0 78 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"VCVTTPS2UQQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTTPS2UQQ ymm2/m512/m32bcst, {k}{z}, zmm1","vcvttps2uqq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 78 /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","",""
> +"VCVTTSD2SI r32{sae}, xmm2","VCVTTSD2SI xmm2, r32{sae}","vcvttsd2si xmm2, r32{sae}","EVEX.128.F2.0F.W0 2C /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
> +"VCVTTSD2SI r32, xmm2/m64","VCVTTSD2SI xmm2/m64, r32","vcvttsd2si xmm2/m64, r32","EVEX.LIG.F2.0F.W0 2C /r","V","V","AVX512F","scale8","w,r","Y","32"
> +"VCVTTSD2SI r32, xmm2/m64","VCVTTSD2SI xmm2/m64, r32","vcvttsd2si xmm2/m64, r32","VEX.LIG.F2.0F.W0 2C /r","V","V","AVX","","w,r","Y","32"
> +"VCVTTSD2SI r64{sae}, xmm2","VCVTTSD2SIQ xmm2, r64{sae}","vcvttsd2siq xmm2, r64{sae}","EVEX.128.F2.0F.W1 2C /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
> +"VCVTTSD2SI r64, xmm2/m64","VCVTTSD2SIQ xmm2/m64, r64","vcvttsd2siq xmm2/m64, r64","EVEX.LIG.F2.0F.W1 2C /r","N.S.","V","AVX512F","scale8","w,r","Y","64"
> +"VCVTTSD2SI r64, xmm2/m64","VCVTTSD2SIQ xmm2/m64, r64","vcvttsd2siq xmm2/m64, r64","VEX.LIG.F2.0F.W1 2C /r","N.S.","V","AVX","","w,r","Y","64"
> +"VCVTTSD2USI r32{sae}, xmm2","VCVTTSD2USIL xmm2, r32{sae}","vcvttsd2usil xmm2, r32{sae}","EVEX.128.F2.0F.W0 78 /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
> +"VCVTTSD2USI r32, xmm2/m64","VCVTTSD2USIL xmm2/m64, r32","vcvttsd2usil xmm2/m64, r32","EVEX.LIG.F2.0F.W0 78 /r","V","V","AVX512F","scale8","w,r","Y","32"
> +"VCVTTSD2USI r64{sae}, xmm2","VCVTTSD2USIQ xmm2, r64{sae}","vcvttsd2usiq xmm2, r64{sae}","EVEX.128.F2.0F.W1 78 /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
> +"VCVTTSD2USI r64, xmm2/m64","VCVTTSD2USIQ xmm2/m64, r64","vcvttsd2usiq xmm2/m64, r64","EVEX.LIG.F2.0F.W1 78 /r","N.S.","V","AVX512F","scale8","w,r","Y","64"
> +"VCVTTSS2SI r32{sae}, xmm2","VCVTTSS2SI xmm2, r32{sae}","vcvttss2si xmm2, r32{sae}","EVEX.128.F3.0F.W0 2C /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
> +"VCVTTSS2SI r32, xmm2/m32","VCVTTSS2SI xmm2/m32, r32","vcvttss2si xmm2/m32, r32","EVEX.LIG.F3.0F.W0 2C /r","V","V","AVX512F","scale4","w,r","Y","32"
> +"VCVTTSS2SI r32, xmm2/m32","VCVTTSS2SI xmm2/m32, r32","vcvttss2si xmm2/m32, r32","VEX.LIG.F3.0F.W0 2C /r","V","V","AVX","","w,r","Y","32"
> +"VCVTTSS2SI r64{sae}, xmm2","VCVTTSS2SIQ xmm2, r64{sae}","vcvttss2siq xmm2, r64{sae}","EVEX.128.F3.0F.W1 2C /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
> +"VCVTTSS2SI r64, xmm2/m32","VCVTTSS2SIQ xmm2/m32, r64","vcvttss2siq xmm2/m32, r64","EVEX.LIG.F3.0F.W1 2C /r","N.S.","V","AVX512F","scale4","w,r","Y","64"
> +"VCVTTSS2SI r64, xmm2/m32","VCVTTSS2SIQ xmm2/m32, r64","vcvttss2siq xmm2/m32, r64","VEX.LIG.F3.0F.W1 2C /r","N.S.","V","AVX","","w,r","Y","64"
> +"VCVTTSS2USI r32{sae}, xmm2","VCVTTSS2USIL xmm2, r32{sae}","vcvttss2usil xmm2, r32{sae}","EVEX.128.F3.0F.W0 78 /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
> +"VCVTTSS2USI r32, xmm2/m32","VCVTTSS2USIL xmm2/m32, r32","vcvttss2usil xmm2/m32, r32","EVEX.LIG.F3.0F.W0 78 /r","V","V","AVX512F","scale4","w,r","Y","32"
> +"VCVTTSS2USI r64{sae}, xmm2","VCVTTSS2USIQ xmm2, r64{sae}","vcvttss2usiq xmm2, r64{sae}","EVEX.128.F3.0F.W1 78 /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
> +"VCVTTSS2USI r64, xmm2/m32","VCVTTSS2USIQ xmm2/m32, r64","vcvttss2usiq xmm2/m32, r64","EVEX.LIG.F3.0F.W1 78 /r","N.S.","V","AVX512F","scale4","w,r","Y","64"
> +"VCVTUDQ2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTUDQ2PD xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtudq2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","",""
> +"VCVTUDQ2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTUDQ2PD xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtudq2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VCVTUDQ2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTUDQ2PD ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtudq2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W0 7A /r","V","V","AVX512F","bscale4,scale32","w,r,r","",""
> +"VCVTUDQ2PS xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTUDQ2PS xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtudq2ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F2.0F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VCVTUDQ2PS ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTUDQ2PS ymm2/m256/m32bcst, {k}{z}, ymm1","vcvtudq2ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F2.0F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VCVTUDQ2PS zmm1{er}, {k}{z}, zmm2","VCVTUDQ2PS zmm2, {k}{z}, zmm1{er}","vcvtudq2ps zmm2, {k}{z}, zmm1{er}","EVEX.512.F2.0F.W0 7A /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VCVTUDQ2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTUDQ2PS zmm2/m512/m32bcst, {k}{z}, zmm1","vcvtudq2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F2.0F.W0 7A /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
> +"VCVTUQQ2PD xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTUQQ2PD xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtuqq2pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VCVTUQQ2PD ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTUQQ2PD ymm2/m256/m64bcst, {k}{z}, ymm1","vcvtuqq2pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VCVTUQQ2PD zmm1{er}, {k}{z}, zmm2","VCVTUQQ2PD zmm2, {k}{z}, zmm1{er}","vcvtuqq2pd zmm2, {k}{z}, zmm1{er}","EVEX.512.F3.0F.W1 7A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
> +"VCVTUQQ2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTUQQ2PD zmm2/m512/m64bcst, {k}{z}, zmm1","vcvtuqq2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
> +"VCVTUQQ2PS ymm1{er}, {k}{z}, zmm2","VCVTUQQ2PS zmm2, {k}{z}, ymm1{er}","vcvtuqq2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.F2.0F.W1 7A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","Y",""
> +"VCVTUQQ2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTUQQ2PS zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtuqq2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.F2.0F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","Y","512"
> +"VCVTUQQ2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTUQQ2PSX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtuqq2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F2.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","Y","128"
> +"VCVTUQQ2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTUQQ2PSY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtuqq2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.F2.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","Y","256"
> +"VCVTUSI2SD xmm1, xmmV, r/m32","VCVTUSI2SDL r/m32, xmmV, xmm1","vcvtusi2sd r/m32, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W0 7B /r","V","V","AVX512F","scale4","w,r,r","Y","32"
> +"VCVTUSI2SD xmm1, xmmV, r/m64","VCVTUSI2SDQ r/m64, xmmV, xmm1","vcvtusi2sd r/m64, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W1 7B /r","N.S.","V","AVX512F","scale8","w,r,r","Y","64"
> +"VCVTUSI2SD xmm1{er}, xmmV, rmr64","VCVTUSI2SDQ rmr64, xmmV, xmm1{er}","vcvtusi2sd rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F2.0F.W1 7B /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","Y","64"
> +"VCVTUSI2SS xmm1, xmmV, r/m32","VCVTUSI2SSL r/m32, xmmV, xmm1","vcvtusi2ssl r/m32, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W0 7B /r","V","V","AVX512F","scale4","w,r,r","Y","32"
> +"VCVTUSI2SS xmm1{er}, xmmV, rmr32","VCVTUSI2SSL rmr32, xmmV, xmm1{er}","vcvtusi2ssl rmr32, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W0 7B /r","V","V","AVX512F","modrm_regonly","w,r,r","Y","32"
> +"VCVTUSI2SS xmm1, xmmV, r/m64","VCVTUSI2SSQ r/m64, xmmV, xmm1","vcvtusi2ssq r/m64, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W1 7B /r","N.S.","V","AVX512F","scale8","w,r,r","Y","64"
> +"VCVTUSI2SS xmm1{er}, xmmV, rmr64","VCVTUSI2SSQ rmr64, xmmV, xmm1{er}","vcvtusi2ssq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W1 7B /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","Y","64"
> +"VDBPSADBW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VDBPSADBW imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","vdbpsadbw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 42 /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
> +"VDBPSADBW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VDBPSADBW imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","vdbpsadbw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 42 /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
> +"VDBPSADBW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VDBPSADBW imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","vdbpsadbw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 42 /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
> +"VDIVPD xmm1, xmmV, xmm2/m128","VDIVPD xmm2/m128, xmmV, xmm1","vdivpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
> +"VDIVPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VDIVPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vdivpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 5E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VDIVPD ymm1, ymmV, ymm2/m256","VDIVPD ymm2/m256, ymmV, ymm1","vdivpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
> +"VDIVPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VDIVPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vdivpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 5E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VDIVPD zmm1{er}, {k}{z}, zmmV, zmm2","VDIVPD zmm2, zmmV, {k}{z}, zmm1{er}","vdivpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 5E /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VDIVPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VDIVPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vdivpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 5E /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VDIVPS xmm1, xmmV, xmm2/m128","VDIVPS xmm2/m128, xmmV, xmm1","vdivps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
> +"VDIVPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VDIVPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vdivps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 5E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VDIVPS ymm1, ymmV, ymm2/m256","VDIVPS ymm2/m256, ymmV, ymm1","vdivps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
> +"VDIVPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VDIVPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vdivps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 5E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VDIVPS zmm1{er}, {k}{z}, zmmV, zmm2","VDIVPS zmm2, zmmV, {k}{z}, zmm1{er}","vdivps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 5E /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VDIVPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VDIVPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vdivps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 5E /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VDIVSD xmm1{er}, {k}{z}, xmmV, xmm2","VDIVSD xmm2, xmmV, {k}{z}, xmm1{er}","vdivsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5E /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VDIVSD xmm1, xmmV, xmm2/m64","VDIVSD xmm2/m64, xmmV, xmm1","vdivsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
> +"VDIVSD xmm1, {k}{z}, xmmV, xmm2/m64","VDIVSD xmm2/m64, xmmV, {k}{z}, xmm1","vdivsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5E /r","V","V","AVX512F","scale8","w,r,r,r","",""
> +"VDIVSS xmm1{er}, {k}{z}, xmmV, xmm2","VDIVSS xmm2, xmmV, {k}{z}, xmm1{er}","vdivss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 5E /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VDIVSS xmm1, xmmV, xmm2/m32","VDIVSS xmm2/m32, xmmV, xmm1","vdivss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
> +"VDIVSS xmm1, {k}{z}, xmmV, xmm2/m32","VDIVSS xmm2/m32, xmmV, {k}{z}, xmm1","vdivss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5E /r","V","V","AVX512F","scale4","w,r,r,r","",""
> +"VDPPD xmm1, xmmV, xmm2/m128, imm8u","VDPPD imm8u, xmm2/m128, xmmV, xmm1","vdppd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 41 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VDPPS xmm1, xmmV, xmm2/m128, imm8u","VDPPS imm8u, xmm2/m128, xmmV, xmm1","vdpps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 40 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VDPPS ymm1, ymmV, ymm2/m256, imm8u","VDPPS imm8u, ymm2/m256, ymmV, ymm1","vdpps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 40 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VERR r/m16","VERR r/m16","verr r/m16","0F 00 /4","V","V","","","r","",""
> +"VERW r/m16","VERW r/m16","verw r/m16","0F 00 /5","V","V","","","r","",""
> +"VEXP2PD zmm1{sae}, {k}{z}, zmm2","VEXP2PD zmm2, {k}{z}, zmm1{sae}","vexp2pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 C8 /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
> +"VEXP2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VEXP2PD zmm2/m512/m64bcst, {k}{z}, zmm1","vexp2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 C8 /r","V","V","AVX512ER","bscale8,scale64","w,r,r","",""
> +"VEXP2PS zmm1{sae}, {k}{z}, zmm2","VEXP2PS zmm2, {k}{z}, zmm1{sae}","vexp2ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 C8 /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
> +"VEXP2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VEXP2PS zmm2/m512/m32bcst, {k}{z}, zmm1","vexp2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 C8 /r","V","V","AVX512ER","bscale4,scale64","w,r,r","",""
> +"VEXPANDPD xmm1, {k}{z}, xmm2/m128","VEXPANDPD xmm2/m128, {k}{z}, xmm1","vexpandpd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 88 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VEXPANDPD ymm1, {k}{z}, ymm2/m256","VEXPANDPD ymm2/m256, {k}{z}, ymm1","vexpandpd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 88 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VEXPANDPD zmm1, {k}{z}, zmm2/m512","VEXPANDPD zmm2/m512, {k}{z}, zmm1","vexpandpd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 88 /r","V","V","AVX512F","scale8","w,r,r","",""
> +"VEXPANDPS xmm1, {k}{z}, xmm2/m128","VEXPANDPS xmm2/m128, {k}{z}, xmm1","vexpandps xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 88 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VEXPANDPS ymm1, {k}{z}, ymm2/m256","VEXPANDPS ymm2/m256, {k}{z}, ymm1","vexpandps ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 88 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VEXPANDPS zmm1, {k}{z}, zmm2/m512","VEXPANDPS zmm2/m512, {k}{z}, zmm1","vexpandps zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 88 /r","V","V","AVX512F","scale4","w,r,r","",""
> +"VEXTRACTF128 xmm2/m128, ymm1, imm8u:1","VEXTRACTF128 imm8u:1, ymm1, xmm2/m128","vextractf128 imm8u:1, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 19 /r ib","V","V","AVX","","w,r,r","",""
> +"VEXTRACTF32X4 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTF32X4 imm8u:1, ymm1, {k}{z}, xmm2/m128","vextractf32x4 imm8u:1, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W0 19 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VEXTRACTF32X4 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTF32X4 imm8u:2, zmm1, {k}{z}, xmm2/m128","vextractf32x4 imm8u:2, zmm1, {k}{z}, xmm2/m128","EVEX.512.66.0F3A.W0 19 /r ib","V","V","AVX512F","scale16","w,r,r,r","",""
> +"VEXTRACTF32X8 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTF32X8 imm8u:1, zmm1, {k}{z}, ymm2/m256","vextractf32x8 imm8u:1, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W0 1B /r ib","V","V","AVX512DQ","scale32","w,r,r,r","",""
> +"VEXTRACTF64X2 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTF64X2 imm8u:1, ymm1, {k}{z}, xmm2/m128","vextractf64x2 imm8u:1, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W1 19 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r","",""
> +"VEXTRACTF64X2 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTF64X2 imm8u:2, zmm1, {k}{z}, xmm2/m128","vextractf64x2 imm8u:2, zmm1, {k}{z}, xmm2/m128","EVEX.512.66.0F3A.W1 19 /r ib","V","V","AVX512DQ","scale16","w,r,r,r","",""
> +"VEXTRACTF64X4 ymm2/m256, {k}{z}, zmm1, imm8u","VEXTRACTF64X4 imm8u, zmm1, {k}{z}, ymm2/m256","vextractf64x4 imm8u, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W1 1B /r ib","V","V","AVX512F","scale32","w,r,r,r","",""
> +"VEXTRACTI128 xmm2/m128, ymm1, imm8u:1","VEXTRACTI128 imm8u:1, ymm1, xmm2/m128","vextracti128 imm8u:1, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 39 /r ib","V","V","AVX2","","w,r,r","",""
> +"VEXTRACTI32X4 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTI32X4 imm8u:1, ymm1, {k}{z}, xmm2/m128","vextracti32x4 imm8u:1, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W0 39 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VEXTRACTI32X4 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTI32X4 imm8u:2, zmm1, {k}{z}, xmm2/m128","vextracti32x4 imm8u:2, zmm1, {k}{z}, xmm2/m128","EVEX.512.66.0F3A.W0 39 /r ib","V","V","AVX512F","scale16","w,r,r,r","",""
> +"VEXTRACTI32X8 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTI32X8 imm8u:1, zmm1, {k}{z}, ymm2/m256","vextracti32x8 imm8u:1, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W0 3B /r ib","V","V","AVX512DQ","scale32","w,r,r,r","",""
> +"VEXTRACTI64X2 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTI64X2 imm8u:1, ymm1, {k}{z}, xmm2/m128","vextracti64x2 imm8u:1, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W1 39 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r","",""
> +"VEXTRACTI64X2 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTI64X2 imm8u:2, zmm1, {k}{z}, xmm2/m128","vextracti64x2 imm8u:2, zmm1, {k}{z}, xmm2/m128","EVEX.512.66.0F3A.W1 39 /r ib","V","V","AVX512DQ","scale16","w,r,r,r","",""
> +"VEXTRACTI64X4 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTI64X4 imm8u:1, zmm1, {k}{z}, ymm2/m256","vextracti64x4 imm8u:1, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W1 3B /r ib","V","V","AVX512F","scale32","w,r,r,r","",""
> +"VEXTRACTPS r/m32, xmm1, imm8u:2","VEXTRACTPS imm8u:2, xmm1, r/m32","vextractps imm8u:2, xmm1, r/m32","EVEX.128.66.0F3A.WIG 17 /r ib","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VEXTRACTPS r/m32, xmm1, imm8u:2","VEXTRACTPS imm8u:2, xmm1, r/m32","vextractps imm8u:2, xmm1, r/m32","VEX.128.66.0F3A.WIG 17 /r ib","V","V","AVX","","w,r,r","",""
> +"VFIXUPIMMPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VFIXUPIMMPD imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfixupimmpd imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W1 54 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r,r","",""
> +"VFIXUPIMMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VFIXUPIMMPD imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfixupimmpd imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W1 54 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r,r","",""
> +"VFIXUPIMMPD zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u","VFIXUPIMMPD imm8u, zmm2, zmmV, {k}{z}, zmm1{sae}","vfixupimmpd imm8u, zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.DDS.512.66.0F3A.W1 54 /r ib","V","V","AVX512F","modrm_regonly","rw,r,r,r,r","",""
> +"VFIXUPIMMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VFIXUPIMMPD imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfixupimmpd imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W1 54 /r ib","V","V","AVX512F","bscale8,scale64","rw,r,r,r,r","",""
> +"VFIXUPIMMPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VFIXUPIMMPS imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfixupimmps imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W0 54 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r,r","",""
> +"VFIXUPIMMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VFIXUPIMMPS imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfixupimmps imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W0 54 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r,r","",""
> +"VFIXUPIMMPS zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u","VFIXUPIMMPS imm8u, zmm2, zmmV, {k}{z}, zmm1{sae}","vfixupimmps imm8u, zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.DDS.512.66.0F3A.W0 54 /r ib","V","V","AVX512F","modrm_regonly","rw,r,r,r,r","",""
> +"VFIXUPIMMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VFIXUPIMMPS imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfixupimmps imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W0 54 /r ib","V","V","AVX512F","bscale4,scale64","rw,r,r,r,r","",""
> +"VFIXUPIMMSD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VFIXUPIMMSD imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vfixupimmsd imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.DDS.128.66.0F3A.W1 55 /r ib","V","V","AVX512F","modrm_regonly","rw,r,r,r,r","",""
> +"VFIXUPIMMSD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VFIXUPIMMSD imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","vfixupimmsd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3A.W1 55 /r ib","V","V","AVX512F","scale8","rw,r,r,r,r","",""
> +"VFIXUPIMMSS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VFIXUPIMMSS imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vfixupimmss imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.DDS.128.66.0F3A.W0 55 /r ib","V","V","AVX512F","modrm_regonly","rw,r,r,r,r","",""
> +"VFIXUPIMMSS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VFIXUPIMMSS imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","vfixupimmss imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3A.W0 55 /r ib","V","V","AVX512F","scale4","rw,r,r,r,r","",""
> +"VFMADD132PD xmm1, xmmV, xmm2/m128","VFMADD132PD xmm2/m128, xmmV, xmm1","vfmadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 98 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmadd132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 98 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFMADD132PD ymm1, ymmV, ymm2/m256","VFMADD132PD ymm2/m256, ymmV, ymm1","vfmadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 98 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmadd132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 98 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFMADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 98 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmadd132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 98 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFMADD132PS xmm1, xmmV, xmm2/m128","VFMADD132PS xmm2/m128, xmmV, xmm1","vfmadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 98 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmadd132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 98 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFMADD132PS ymm1, ymmV, ymm2/m256","VFMADD132PS ymm2/m256, ymmV, ymm1","vfmadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 98 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmadd132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 98 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFMADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 98 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmadd132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 98 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFMADD132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD132SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 99 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADD132SD xmm1, xmmV, xmm2/m64","VFMADD132SD xmm2/m64, xmmV, xmm1","vfmadd132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 99 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD132SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmadd132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 99 /r","V","V","AVX512F","scale8","rw,r,r,r","",""
> +"VFMADD132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD132SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 99 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADD132SS xmm1, xmmV, xmm2/m32","VFMADD132SS xmm2/m32, xmmV, xmm1","vfmadd132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 99 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD132SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmadd132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 99 /r","V","V","AVX512F","scale4","rw,r,r,r","",""
> +"VFMADD213PD xmm1, xmmV, xmm2/m128","VFMADD213PD xmm2/m128, xmmV, xmm1","vfmadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A8 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmadd213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 A8 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFMADD213PD ymm1, ymmV, ymm2/m256","VFMADD213PD ymm2/m256, ymmV, ymm1","vfmadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A8 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmadd213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 A8 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFMADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 A8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmadd213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 A8 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFMADD213PS xmm1, xmmV, xmm2/m128","VFMADD213PS xmm2/m128, xmmV, xmm1","vfmadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A8 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmadd213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 A8 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFMADD213PS ymm1, ymmV, ymm2/m256","VFMADD213PS ymm2/m256, ymmV, ymm1","vfmadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A8 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmadd213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 A8 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFMADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 A8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmadd213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 A8 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFMADD213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD213SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 A9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADD213SD xmm1, xmmV, xmm2/m64","VFMADD213SD xmm2/m64, xmmV, xmm1","vfmadd213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 A9 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD213SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmadd213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 A9 /r","V","V","AVX512F","scale8","rw,r,r,r","",""
> +"VFMADD213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD213SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 A9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADD213SS xmm1, xmmV, xmm2/m32","VFMADD213SS xmm2/m32, xmmV, xmm1","vfmadd213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 A9 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD213SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmadd213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 A9 /r","V","V","AVX512F","scale4","rw,r,r,r","",""
> +"VFMADD231PD xmm1, xmmV, xmm2/m128","VFMADD231PD xmm2/m128, xmmV, xmm1","vfmadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B8 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmadd231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B8 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFMADD231PD ymm1, ymmV, ymm2/m256","VFMADD231PD ymm2/m256, ymmV, ymm1","vfmadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B8 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmadd231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B8 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFMADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 B8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmadd231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B8 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFMADD231PS xmm1, xmmV, xmm2/m128","VFMADD231PS xmm2/m128, xmmV, xmm1","vfmadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B8 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmadd231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 B8 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFMADD231PS ymm1, ymmV, ymm2/m256","VFMADD231PS ymm2/m256, ymmV, ymm1","vfmadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B8 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmadd231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 B8 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFMADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 B8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmadd231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 B8 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFMADD231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD231SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 B9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADD231SD xmm1, xmmV, xmm2/m64","VFMADD231SD xmm2/m64, xmmV, xmm1","vfmadd231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 B9 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD231SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmadd231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 B9 /r","V","V","AVX512F","scale8","rw,r,r,r","",""
> +"VFMADD231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD231SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 B9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADD231SS xmm1, xmmV, xmm2/m32","VFMADD231SS xmm2/m32, xmmV, xmm1","vfmadd231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 B9 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADD231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD231SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmadd231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 B9 /r","V","V","AVX512F","scale4","rw,r,r,r","",""
> +"VFMADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFMADDPD xmm2/m128, xmmIH, xmmV, xmm1","vfmaddpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 69 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFMADDPD xmmIH, xmm2/m128, xmmV, xmm1","vfmaddpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 69 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFMADDPD ymm2/m256, ymmIH, ymmV, ymm1","vfmaddpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 69 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFMADDPD ymmIH, ymm2/m256, ymmV, ymm1","vfmaddpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 69 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFMADDPS xmm2/m128, xmmIH, xmmV, xmm1","vfmaddps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 68 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFMADDPS xmmIH, xmm2/m128, xmmV, xmm1","vfmaddps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 68 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFMADDPS ymm2/m256, ymmIH, ymmV, ymm1","vfmaddps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 68 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFMADDPS ymmIH, ymm2/m256, ymmV, ymm1","vfmaddps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 68 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDSD xmm1, xmmV, xmmIH, xmm2/m64","VFMADDSD xmm2/m64, xmmIH, xmmV, xmm1","vfmaddsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6B /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDSD xmm1, xmmV, xmm2/m64, xmmIH","VFMADDSD xmmIH, xmm2/m64, xmmV, xmm1","vfmaddsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6B /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDSS xmm1, xmmV, xmmIH, xmm2/m32","VFMADDSS xmm2/m32, xmmIH, xmmV, xmm1","vfmaddss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6A /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDSS xmm1, xmmV, xmm2/m32, xmmIH","VFMADDSS xmmIH, xmm2/m32, xmmV, xmm1","vfmaddss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6A /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDSUB132PD xmm1, xmmV, xmm2/m128","VFMADDSUB132PD xmm2/m128, xmmV, xmm1","vfmaddsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 96 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADDSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 96 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFMADDSUB132PD ymm1, ymmV, ymm2/m256","VFMADDSUB132PD ymm2/m256, ymmV, ymm1","vfmaddsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 96 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADDSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 96 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFMADDSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 96 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADDSUB132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 96 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFMADDSUB132PS xmm1, xmmV, xmm2/m128","VFMADDSUB132PS xmm2/m128, xmmV, xmm1","vfmaddsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 96 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADDSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 96 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFMADDSUB132PS ymm1, ymmV, ymm2/m256","VFMADDSUB132PS ymm2/m256, ymmV, ymm1","vfmaddsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 96 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADDSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 96 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFMADDSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 96 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADDSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 96 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFMADDSUB213PD xmm1, xmmV, xmm2/m128","VFMADDSUB213PD xmm2/m128, xmmV, xmm1","vfmaddsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A6 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADDSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 A6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFMADDSUB213PD ymm1, ymmV, ymm2/m256","VFMADDSUB213PD ymm2/m256, ymmV, ymm1","vfmaddsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A6 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADDSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 A6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFMADDSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 A6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADDSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 A6 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFMADDSUB213PS xmm1, xmmV, xmm2/m128","VFMADDSUB213PS xmm2/m128, xmmV, xmm1","vfmaddsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A6 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADDSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 A6 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFMADDSUB213PS ymm1, ymmV, ymm2/m256","VFMADDSUB213PS ymm2/m256, ymmV, ymm1","vfmaddsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A6 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADDSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 A6 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFMADDSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 A6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADDSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 A6 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFMADDSUB231PD xmm1, xmmV, xmm2/m128","VFMADDSUB231PD xmm2/m128, xmmV, xmm1","vfmaddsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B6 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADDSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFMADDSUB231PD ymm1, ymmV, ymm2/m256","VFMADDSUB231PD ymm2/m256, ymmV, ymm1","vfmaddsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B6 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADDSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFMADDSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 B6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADDSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B6 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFMADDSUB231PS xmm1, xmmV, xmm2/m128","VFMADDSUB231PS xmm2/m128, xmmV, xmm1","vfmaddsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B6 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADDSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 B6 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFMADDSUB231PS ymm1, ymmV, ymm2/m256","VFMADDSUB231PS ymm2/m256, ymmV, ymm1","vfmaddsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B6 /r","V","V","FMA","","rw,r,r","",""
> +"VFMADDSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 B6 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFMADDSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 B6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMADDSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 B6 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFMADDSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFMADDSUBPD xmm2/m128, xmmIH, xmmV, xmm1","vfmaddsubpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 5D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFMADDSUBPD xmmIH, xmm2/m128, xmmV, xmm1","vfmaddsubpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 5D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFMADDSUBPD ymm2/m256, ymmIH, ymmV, ymm1","vfmaddsubpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 5D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFMADDSUBPD ymmIH, ymm2/m256, ymmV, ymm1","vfmaddsubpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 5D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFMADDSUBPS xmm2/m128,
> xmmIH, xmmV, xmm1","vfmaddsubps xmm2/m128, xmmIH, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W1 5C /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFMADDSUBPS xmmIH,
> xmm2/m128, xmmV, xmm1","vfmaddsubps xmmIH, xmm2/m128, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W0 5C /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFMADDSUBPS ymm2/m256,
> ymmIH, ymmV, ymm1","vfmaddsubps ymm2/m256, ymmIH, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W1 5C /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMADDSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFMADDSUBPS ymmIH,
> ymm2/m256, ymmV, ymm1","vfmaddsubps ymmIH, ymm2/m256, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W0 5C /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUB132PD xmm1, xmmV, xmm2/m128","VFMSUB132PD xmm2/m128, xmmV, xmm1","vfmsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9A /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB132PD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsub132pd xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 9A
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFMSUB132PD ymm1, ymmV, ymm2/m256","VFMSUB132PD ymm2/m256, ymmV, ymm1","vfmsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9A /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB132PD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsub132pd ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 9A
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFMSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB132PD zmm2, zmmV,
> {k}{z}, zmm1{er}","vfmsub132pd zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W1 9A
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUB132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB132PD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsub132pd zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 9A
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFMSUB132PS xmm1, xmmV, xmm2/m128","VFMSUB132PS xmm2/m128, xmmV, xmm1","vfmsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9A /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB132PS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsub132ps xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 9A
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFMSUB132PS ymm1, ymmV, ymm2/m256","VFMSUB132PS ymm2/m256, ymmV, ymm1","vfmsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9A /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB132PS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsub132ps ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 9A
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFMSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB132PS zmm2, zmmV,
> {k}{z}, zmm1{er}","vfmsub132ps zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W0 9A
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB132PS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsub132ps zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 9A
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFMSUB132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB132SD xmm2, xmmV,
> {k}{z}, xmm1{er}","vfmsub132sd xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W1 9B
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUB132SD xmm1, xmmV, xmm2/m64","VFMSUB132SD xmm2/m64, xmmV, xmm1","vfmsub132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9B /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB132SD xmm2/m64,
> xmmV, {k}{z}, xmm1","vfmsub132sd xmm2/m64, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W1 9B
> /r","V","V","AVX512F","scale8","rw,r,r,r","",""
> +"VFMSUB132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB132SS xmm2, xmmV,
> {k}{z}, xmm1{er}","vfmsub132ss xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W0 9B
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUB132SS xmm1, xmmV, xmm2/m32","VFMSUB132SS xmm2/m32, xmmV, xmm1","vfmsub132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9B /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB132SS xmm2/m32,
> xmmV, {k}{z}, xmm1","vfmsub132ss xmm2/m32, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W0 9B
> /r","V","V","AVX512F","scale4","rw,r,r,r","",""
> +"VFMSUB213PD xmm1, xmmV, xmm2/m128","VFMSUB213PD xmm2/m128, xmmV, xmm1","vfmsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AA /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB213PD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsub213pd xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 AA
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFMSUB213PD ymm1, ymmV, ymm2/m256","VFMSUB213PD ymm2/m256, ymmV, ymm1","vfmsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AA /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB213PD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsub213pd ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 AA
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFMSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB213PD zmm2, zmmV,
> {k}{z}, zmm1{er}","vfmsub213pd zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W1 AA
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB213PD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsub213pd zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 AA
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFMSUB213PS xmm1, xmmV, xmm2/m128","VFMSUB213PS xmm2/m128, xmmV, xmm1","vfmsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AA /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB213PS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsub213ps xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 AA
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFMSUB213PS ymm1, ymmV, ymm2/m256","VFMSUB213PS ymm2/m256, ymmV, ymm1","vfmsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AA /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB213PS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsub213ps ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 AA
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFMSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB213PS zmm2, zmmV,
> {k}{z}, zmm1{er}","vfmsub213ps zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W0 AA
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB213PS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsub213ps zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 AA
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFMSUB213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB213SD xmm2, xmmV,
> {k}{z}, xmm1{er}","vfmsub213sd xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W1 AB
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUB213SD xmm1, xmmV, xmm2/m64","VFMSUB213SD xmm2/m64, xmmV, xmm1","vfmsub213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AB /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB213SD xmm2/m64,
> xmmV, {k}{z}, xmm1","vfmsub213sd xmm2/m64, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W1 AB
> /r","V","V","AVX512F","scale8","rw,r,r,r","",""
> +"VFMSUB213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB213SS xmm2, xmmV,
> {k}{z}, xmm1{er}","vfmsub213ss xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W0 AB
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUB213SS xmm1, xmmV, xmm2/m32","VFMSUB213SS xmm2/m32, xmmV, xmm1","vfmsub213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AB /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB213SS xmm2/m32,
> xmmV, {k}{z}, xmm1","vfmsub213ss xmm2/m32, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W0 AB
> /r","V","V","AVX512F","scale4","rw,r,r,r","",""
> +"VFMSUB231PD xmm1, xmmV, xmm2/m128","VFMSUB231PD xmm2/m128, xmmV, xmm1","vfmsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BA /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB231PD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsub231pd xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 BA
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFMSUB231PD ymm1, ymmV, ymm2/m256","VFMSUB231PD ymm2/m256, ymmV, ymm1","vfmsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BA /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB231PD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsub231pd ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 BA
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFMSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB231PD zmm2, zmmV,
> {k}{z}, zmm1{er}","vfmsub231pd zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W1 BA
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB231PD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsub231pd zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 BA
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFMSUB231PS xmm1, xmmV, xmm2/m128","VFMSUB231PS xmm2/m128, xmmV, xmm1","vfmsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BA /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB231PS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsub231ps xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 BA
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFMSUB231PS ymm1, ymmV, ymm2/m256","VFMSUB231PS ymm2/m256, ymmV, ymm1","vfmsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BA /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB231PS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsub231ps ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 BA
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFMSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB231PS zmm2, zmmV,
> {k}{z}, zmm1{er}","vfmsub231ps zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W0 BA
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB231PS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsub231ps zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 BA
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFMSUB231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB231SD xmm2, xmmV,
> {k}{z}, xmm1{er}","vfmsub231sd xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W1 BB
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUB231SD xmm1, xmmV, xmm2/m64","VFMSUB231SD xmm2/m64, xmmV, xmm1","vfmsub231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BB /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB231SD xmm2/m64,
> xmmV, {k}{z}, xmm1","vfmsub231sd xmm2/m64, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W1 BB
> /r","V","V","AVX512F","scale8","rw,r,r,r","",""
> +"VFMSUB231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB231SS xmm2, xmmV,
> {k}{z}, xmm1{er}","vfmsub231ss xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W0 BB
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUB231SS xmm1, xmmV, xmm2/m32","VFMSUB231SS xmm2/m32, xmmV, xmm1","vfmsub231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BB /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUB231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB231SS xmm2/m32,
> xmmV, {k}{z}, xmm1","vfmsub231ss xmm2/m32, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W0 BB
> /r","V","V","AVX512F","scale4","rw,r,r,r","",""
> +"VFMSUBADD132PD xmm1, xmmV, xmm2/m128","VFMSUBADD132PD xmm2/m128,
> xmmV, xmm1","vfmsubadd132pd xmm2/m128, xmmV,
> xmm1","VEX.DDS.128.66.0F38.W1 97 /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUBADD132PD xmm1, {k}{z}, xmmV,
> xmm2/m128/m64bcst","VFMSUBADD132PD xmm2/m128/m64bcst, xmmV, {k}{z},
> xmm1","vfmsubadd132pd xmm2/m128/m64bcst, xmmV, {k}{z},
> xmm1","EVEX.DDS.128.66.0F38.W1 97
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFMSUBADD132PD ymm1, ymmV, ymm2/m256","VFMSUBADD132PD ymm2/m256,
> ymmV, ymm1","vfmsubadd132pd ymm2/m256, ymmV,
> ymm1","VEX.DDS.256.66.0F38.W1 97 /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUBADD132PD ymm1, {k}{z}, ymmV,
> ymm2/m256/m64bcst","VFMSUBADD132PD ymm2/m256/m64bcst, ymmV, {k}{z},
> ymm1","vfmsubadd132pd ymm2/m256/m64bcst, ymmV, {k}{z},
> ymm1","EVEX.DDS.256.66.0F38.W1 97
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFMSUBADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD132PD zmm2,
> zmmV, {k}{z}, zmm1{er}","vfmsubadd132pd zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W1 97
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUBADD132PD zmm1, {k}{z}, zmmV,
> zmm2/m512/m64bcst","VFMSUBADD132PD zmm2/m512/m64bcst, zmmV, {k}{z},
> zmm1","vfmsubadd132pd zmm2/m512/m64bcst, zmmV, {k}{z},
> zmm1","EVEX.DDS.512.66.0F38.W1 97
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFMSUBADD132PS xmm1, xmmV, xmm2/m128","VFMSUBADD132PS xmm2/m128,
> xmmV, xmm1","vfmsubadd132ps xmm2/m128, xmmV,
> xmm1","VEX.DDS.128.66.0F38.W0 97 /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUBADD132PS xmm1, {k}{z}, xmmV,
> xmm2/m128/m32bcst","VFMSUBADD132PS xmm2/m128/m32bcst, xmmV, {k}{z},
> xmm1","vfmsubadd132ps xmm2/m128/m32bcst, xmmV, {k}{z},
> xmm1","EVEX.DDS.128.66.0F38.W0 97
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFMSUBADD132PS ymm1, ymmV, ymm2/m256","VFMSUBADD132PS ymm2/m256,
> ymmV, ymm1","vfmsubadd132ps ymm2/m256, ymmV,
> ymm1","VEX.DDS.256.66.0F38.W0 97 /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUBADD132PS ymm1, {k}{z}, ymmV,
> ymm2/m256/m32bcst","VFMSUBADD132PS ymm2/m256/m32bcst, ymmV, {k}{z},
> ymm1","vfmsubadd132ps ymm2/m256/m32bcst, ymmV, {k}{z},
> ymm1","EVEX.DDS.256.66.0F38.W0 97
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFMSUBADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD132PS zmm2,
> zmmV, {k}{z}, zmm1{er}","vfmsubadd132ps zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W0 97
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUBADD132PS zmm1, {k}{z}, zmmV,
> zmm2/m512/m32bcst","VFMSUBADD132PS zmm2/m512/m32bcst, zmmV, {k}{z},
> zmm1","vfmsubadd132ps zmm2/m512/m32bcst, zmmV, {k}{z},
> zmm1","EVEX.DDS.512.66.0F38.W0 97
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFMSUBADD213PD xmm1, xmmV, xmm2/m128","VFMSUBADD213PD xmm2/m128,
> xmmV, xmm1","vfmsubadd213pd xmm2/m128, xmmV,
> xmm1","VEX.DDS.128.66.0F38.W1 A7 /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUBADD213PD xmm1, {k}{z}, xmmV,
> xmm2/m128/m64bcst","VFMSUBADD213PD xmm2/m128/m64bcst, xmmV, {k}{z},
> xmm1","vfmsubadd213pd xmm2/m128/m64bcst, xmmV, {k}{z},
> xmm1","EVEX.DDS.128.66.0F38.W1 A7
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFMSUBADD213PD ymm1, ymmV, ymm2/m256","VFMSUBADD213PD ymm2/m256,
> ymmV, ymm1","vfmsubadd213pd ymm2/m256, ymmV,
> ymm1","VEX.DDS.256.66.0F38.W1 A7 /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUBADD213PD ymm1, {k}{z}, ymmV,
> ymm2/m256/m64bcst","VFMSUBADD213PD ymm2/m256/m64bcst, ymmV, {k}{z},
> ymm1","vfmsubadd213pd ymm2/m256/m64bcst, ymmV, {k}{z},
> ymm1","EVEX.DDS.256.66.0F38.W1 A7
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFMSUBADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD213PD zmm2,
> zmmV, {k}{z}, zmm1{er}","vfmsubadd213pd zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W1 A7
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUBADD213PD zmm1, {k}{z}, zmmV,
> zmm2/m512/m64bcst","VFMSUBADD213PD zmm2/m512/m64bcst, zmmV, {k}{z},
> zmm1","vfmsubadd213pd zmm2/m512/m64bcst, zmmV, {k}{z},
> zmm1","EVEX.DDS.512.66.0F38.W1 A7
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFMSUBADD213PS xmm1, xmmV, xmm2/m128","VFMSUBADD213PS xmm2/m128,
> xmmV, xmm1","vfmsubadd213ps xmm2/m128, xmmV,
> xmm1","VEX.DDS.128.66.0F38.W0 A7 /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUBADD213PS xmm1, {k}{z}, xmmV,
> xmm2/m128/m32bcst","VFMSUBADD213PS xmm2/m128/m32bcst, xmmV, {k}{z},
> xmm1","vfmsubadd213ps xmm2/m128/m32bcst, xmmV, {k}{z},
> xmm1","EVEX.DDS.128.66.0F38.W0 A7
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFMSUBADD213PS ymm1, ymmV, ymm2/m256","VFMSUBADD213PS ymm2/m256,
> ymmV, ymm1","vfmsubadd213ps ymm2/m256, ymmV,
> ymm1","VEX.DDS.256.66.0F38.W0 A7 /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUBADD213PS ymm1, {k}{z}, ymmV,
> ymm2/m256/m32bcst","VFMSUBADD213PS ymm2/m256/m32bcst, ymmV, {k}{z},
> ymm1","vfmsubadd213ps ymm2/m256/m32bcst, ymmV, {k}{z},
> ymm1","EVEX.DDS.256.66.0F38.W0 A7
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFMSUBADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD213PS zmm2,
> zmmV, {k}{z}, zmm1{er}","vfmsubadd213ps zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W0 A7
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUBADD213PS zmm1, {k}{z}, zmmV,
> zmm2/m512/m32bcst","VFMSUBADD213PS zmm2/m512/m32bcst, zmmV, {k}{z},
> zmm1","vfmsubadd213ps zmm2/m512/m32bcst, zmmV, {k}{z},
> zmm1","EVEX.DDS.512.66.0F38.W0 A7
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFMSUBADD231PD xmm1, xmmV, xmm2/m128","VFMSUBADD231PD xmm2/m128,
> xmmV, xmm1","vfmsubadd231pd xmm2/m128, xmmV,
> xmm1","VEX.DDS.128.66.0F38.W1 B7 /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUBADD231PD xmm1, {k}{z}, xmmV,
> xmm2/m128/m64bcst","VFMSUBADD231PD xmm2/m128/m64bcst, xmmV, {k}{z},
> xmm1","vfmsubadd231pd xmm2/m128/m64bcst, xmmV, {k}{z},
> xmm1","EVEX.DDS.128.66.0F38.W1 B7
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFMSUBADD231PD ymm1, ymmV, ymm2/m256","VFMSUBADD231PD ymm2/m256,
> ymmV, ymm1","vfmsubadd231pd ymm2/m256, ymmV,
> ymm1","VEX.DDS.256.66.0F38.W1 B7 /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUBADD231PD ymm1, {k}{z}, ymmV,
> ymm2/m256/m64bcst","VFMSUBADD231PD ymm2/m256/m64bcst, ymmV, {k}{z},
> ymm1","vfmsubadd231pd ymm2/m256/m64bcst, ymmV, {k}{z},
> ymm1","EVEX.DDS.256.66.0F38.W1 B7
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFMSUBADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD231PD zmm2,
> zmmV, {k}{z}, zmm1{er}","vfmsubadd231pd zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W1 B7
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUBADD231PD zmm1, {k}{z}, zmmV,
> zmm2/m512/m64bcst","VFMSUBADD231PD zmm2/m512/m64bcst, zmmV, {k}{z},
> zmm1","vfmsubadd231pd zmm2/m512/m64bcst, zmmV, {k}{z},
> zmm1","EVEX.DDS.512.66.0F38.W1 B7
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFMSUBADD231PS xmm1, xmmV, xmm2/m128","VFMSUBADD231PS xmm2/m128,
> xmmV, xmm1","vfmsubadd231ps xmm2/m128, xmmV,
> xmm1","VEX.DDS.128.66.0F38.W0 B7 /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUBADD231PS xmm1, {k}{z}, xmmV,
> xmm2/m128/m32bcst","VFMSUBADD231PS xmm2/m128/m32bcst, xmmV, {k}{z},
> xmm1","vfmsubadd231ps xmm2/m128/m32bcst, xmmV, {k}{z},
> xmm1","EVEX.DDS.128.66.0F38.W0 B7
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFMSUBADD231PS ymm1, ymmV, ymm2/m256","VFMSUBADD231PS ymm2/m256,
> ymmV, ymm1","vfmsubadd231ps ymm2/m256, ymmV,
> ymm1","VEX.DDS.256.66.0F38.W0 B7 /r","V","V","FMA","","rw,r,r","",""
> +"VFMSUBADD231PS ymm1, {k}{z}, ymmV,
> ymm2/m256/m32bcst","VFMSUBADD231PS ymm2/m256/m32bcst, ymmV, {k}{z},
> ymm1","vfmsubadd231ps ymm2/m256/m32bcst, ymmV, {k}{z},
> ymm1","EVEX.DDS.256.66.0F38.W0 B7
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFMSUBADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD231PS zmm2,
> zmmV, {k}{z}, zmm1{er}","vfmsubadd231ps zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W0 B7
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFMSUBADD231PS zmm1, {k}{z}, zmmV,
> zmm2/m512/m32bcst","VFMSUBADD231PS zmm2/m512/m32bcst, zmmV, {k}{z},
> zmm1","vfmsubadd231ps zmm2/m512/m32bcst, zmmV, {k}{z},
> zmm1","EVEX.DDS.512.66.0F38.W0 B7
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFMSUBADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBADDPD xmm2/m128,
> xmmIH, xmmV, xmm1","vfmsubaddpd xmm2/m128, xmmIH, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W1 5F /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBADDPD xmmIH,
> xmm2/m128, xmmV, xmm1","vfmsubaddpd xmmIH, xmm2/m128, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W0 5F /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBADDPD ymm2/m256,
> ymmIH, ymmV, ymm1","vfmsubaddpd ymm2/m256, ymmIH, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W1 5F /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBADDPD ymmIH,
> ymm2/m256, ymmV, ymm1","vfmsubaddpd ymmIH, ymm2/m256, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W0 5F /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBADDPS xmm2/m128,
> xmmIH, xmmV, xmm1","vfmsubaddps xmm2/m128, xmmIH, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W1 5E /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBADDPS xmmIH,
> xmm2/m128, xmmV, xmm1","vfmsubaddps xmmIH, xmm2/m128, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W0 5E /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBADDPS ymm2/m256,
> ymmIH, ymmV, ymm1","vfmsubaddps ymm2/m256, ymmIH, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W1 5E /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBADDPS ymmIH,
> ymm2/m256, ymmV, ymm1","vfmsubaddps ymmIH, ymm2/m256, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W0 5E /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBPD xmm2/m128, xmmIH,
> xmmV, xmm1","vfmsubpd xmm2/m128, xmmIH, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W1 6D /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBPD xmmIH, xmm2/m128,
> xmmV, xmm1","vfmsubpd xmmIH, xmm2/m128, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W0 6D /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBPD ymm2/m256, ymmIH,
> ymmV, ymm1","vfmsubpd ymm2/m256, ymmIH, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W1 6D /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBPD ymmIH, ymm2/m256,
> ymmV, ymm1","vfmsubpd ymmIH, ymm2/m256, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W0 6D /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBPS xmm2/m128, xmmIH,
> xmmV, xmm1","vfmsubps xmm2/m128, xmmIH, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W1 6C /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBPS xmmIH, xmm2/m128,
> xmmV, xmm1","vfmsubps xmmIH, xmm2/m128, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W0 6C /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBPS ymm2/m256, ymmIH,
> ymmV, ymm1","vfmsubps ymm2/m256, ymmIH, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W1 6C /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBPS ymmIH, ymm2/m256,
> ymmV, ymm1","vfmsubps ymmIH, ymm2/m256, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W0 6C /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBSD xmm1, xmmV, xmmIH, xmm2/m64","VFMSUBSD xmm2/m64, xmmIH,
> xmmV, xmm1","vfmsubsd xmm2/m64, xmmIH, xmmV,
> xmm1","VEX.NDS.LIG.66.0F3A.W1 6F /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBSD xmm1, xmmV, xmm2/m64, xmmIH","VFMSUBSD xmmIH, xmm2/m64,
> xmmV, xmm1","vfmsubsd xmmIH, xmm2/m64, xmmV,
> xmm1","VEX.NDS.LIG.66.0F3A.W0 6F /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBSS xmm1, xmmV, xmmIH, xmm2/m32","VFMSUBSS xmm2/m32, xmmIH,
> xmmV, xmm1","vfmsubss xmm2/m32, xmmIH, xmmV,
> xmm1","VEX.NDS.LIG.66.0F3A.W1 6E /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFMSUBSS xmm1, xmmV, xmm2/m32, xmmIH","VFMSUBSS xmmIH, xmm2/m32,
> xmmV, xmm1","vfmsubss xmmIH, xmm2/m32, xmmV,
> xmm1","VEX.NDS.LIG.66.0F3A.W0 6E /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMADD132PD xmm1, xmmV, xmm2/m128","VFNMADD132PD xmm2/m128, xmmV,
> xmm1","vfnmadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9C
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD132PD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd132pd
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 9C
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFNMADD132PD ymm1, ymmV, ymm2/m256","VFNMADD132PD ymm2/m256, ymmV,
> ymm1","vfnmadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9C
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD132PD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd132pd
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 9C
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFNMADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD132PD zmm2, zmmV,
> {k}{z}, zmm1{er}","vfnmadd132pd zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W1 9C
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD132PD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd132pd
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 9C
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFNMADD132PS xmm1, xmmV, xmm2/m128","VFNMADD132PS xmm2/m128, xmmV,
> xmm1","vfnmadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9C
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD132PS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd132ps
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 9C
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFNMADD132PS ymm1, ymmV, ymm2/m256","VFNMADD132PS ymm2/m256, ymmV,
> ymm1","vfnmadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9C
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD132PS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd132ps
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 9C
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFNMADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD132PS zmm2, zmmV,
> {k}{z}, zmm1{er}","vfnmadd132ps zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W0 9C
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD132PS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd132ps
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 9C
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFNMADD132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD132SD xmm2, xmmV,
> {k}{z}, xmm1{er}","vfnmadd132sd xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W1 9D
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMADD132SD xmm1, xmmV, xmm2/m64","VFNMADD132SD xmm2/m64, xmmV, xmm1","vfnmadd132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9D /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD132SD xmm2/m64,
> xmmV, {k}{z}, xmm1","vfnmadd132sd xmm2/m64, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W1 9D
> /r","V","V","AVX512F","scale8","rw,r,r,r","",""
> +"VFNMADD132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD132SS xmm2, xmmV,
> {k}{z}, xmm1{er}","vfnmadd132ss xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W0 9D
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMADD132SS xmm1, xmmV, xmm2/m32","VFNMADD132SS xmm2/m32, xmmV, xmm1","vfnmadd132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9D /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD132SS xmm2/m32,
> xmmV, {k}{z}, xmm1","vfnmadd132ss xmm2/m32, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W0 9D
> /r","V","V","AVX512F","scale4","rw,r,r,r","",""
> +"VFNMADD213PD xmm1, xmmV, xmm2/m128","VFNMADD213PD xmm2/m128, xmmV,
> xmm1","vfnmadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AC
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD213PD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd213pd
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 AC
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFNMADD213PD ymm1, ymmV, ymm2/m256","VFNMADD213PD ymm2/m256, ymmV,
> ymm1","vfnmadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AC
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD213PD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd213pd
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 AC
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFNMADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD213PD zmm2, zmmV,
> {k}{z}, zmm1{er}","vfnmadd213pd zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W1 AC
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD213PD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd213pd
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 AC
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFNMADD213PS xmm1, xmmV, xmm2/m128","VFNMADD213PS xmm2/m128, xmmV,
> xmm1","vfnmadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AC
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD213PS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd213ps
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 AC
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFNMADD213PS ymm1, ymmV, ymm2/m256","VFNMADD213PS ymm2/m256, ymmV,
> ymm1","vfnmadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AC
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD213PS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd213ps
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 AC
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFNMADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD213PS zmm2, zmmV,
> {k}{z}, zmm1{er}","vfnmadd213ps zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W0 AC
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD213PS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd213ps
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 AC
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFNMADD213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD213SD xmm2, xmmV,
> {k}{z}, xmm1{er}","vfnmadd213sd xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W1 AD
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMADD213SD xmm1, xmmV, xmm2/m64","VFNMADD213SD xmm2/m64, xmmV, xmm1","vfnmadd213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AD /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD213SD xmm2/m64,
> xmmV, {k}{z}, xmm1","vfnmadd213sd xmm2/m64, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W1 AD
> /r","V","V","AVX512F","scale8","rw,r,r,r","",""
> +"VFNMADD213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD213SS xmm2, xmmV,
> {k}{z}, xmm1{er}","vfnmadd213ss xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W0 AD
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMADD213SS xmm1, xmmV, xmm2/m32","VFNMADD213SS xmm2/m32, xmmV, xmm1","vfnmadd213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AD /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD213SS xmm2/m32,
> xmmV, {k}{z}, xmm1","vfnmadd213ss xmm2/m32, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W0 AD
> /r","V","V","AVX512F","scale4","rw,r,r,r","",""
> +"VFNMADD231PD xmm1, xmmV, xmm2/m128","VFNMADD231PD xmm2/m128, xmmV,
> xmm1","vfnmadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BC
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD231PD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd231pd
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 BC
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFNMADD231PD ymm1, ymmV, ymm2/m256","VFNMADD231PD ymm2/m256, ymmV,
> ymm1","vfnmadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BC
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD231PD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd231pd
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 BC
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFNMADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD231PD zmm2, zmmV,
> {k}{z}, zmm1{er}","vfnmadd231pd zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W1 BC
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD231PD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd231pd
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 BC
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFNMADD231PS xmm1, xmmV, xmm2/m128","VFNMADD231PS xmm2/m128, xmmV,
> xmm1","vfnmadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BC
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD231PS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd231ps
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 BC
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFNMADD231PS ymm1, ymmV, ymm2/m256","VFNMADD231PS ymm2/m256, ymmV,
> ymm1","vfnmadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BC
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD231PS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd231ps
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 BC
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFNMADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD231PS zmm2, zmmV,
> {k}{z}, zmm1{er}","vfnmadd231ps zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W0 BC
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD231PS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd231ps
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 BC
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFNMADD231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD231SD xmm2, xmmV,
> {k}{z}, xmm1{er}","vfnmadd231sd xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W1 BD
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMADD231SD xmm1, xmmV, xmm2/m64","VFNMADD231SD xmm2/m64, xmmV, xmm1","vfnmadd231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BD /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD231SD xmm2/m64,
> xmmV, {k}{z}, xmm1","vfnmadd231sd xmm2/m64, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W1 BD
> /r","V","V","AVX512F","scale8","rw,r,r,r","",""
> +"VFNMADD231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD231SS xmm2, xmmV,
> {k}{z}, xmm1{er}","vfnmadd231ss xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W0 BD
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMADD231SS xmm1, xmmV, xmm2/m32","VFNMADD231SS xmm2/m32, xmmV, xmm1","vfnmadd231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BD /r","V","V","FMA","","rw,r,r","",""
> +"VFNMADD231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD231SS xmm2/m32,
> xmmV, {k}{z}, xmm1","vfnmadd231ss xmm2/m32, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W0 BD
> /r","V","V","AVX512F","scale4","rw,r,r,r","",""
> +"VFNMADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFNMADDPD xmm2/m128, xmmIH,
> xmmV, xmm1","vfnmaddpd xmm2/m128, xmmIH, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W1 79 /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFNMADDPD xmmIH, xmm2/m128,
> xmmV, xmm1","vfnmaddpd xmmIH, xmm2/m128, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W0 79 /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFNMADDPD ymm2/m256, ymmIH,
> ymmV, ymm1","vfnmaddpd ymm2/m256, ymmIH, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W1 79 /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFNMADDPD ymmIH, ymm2/m256,
> ymmV, ymm1","vfnmaddpd ymmIH, ymm2/m256, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W0 79 /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFNMADDPS xmm2/m128, xmmIH,
> xmmV, xmm1","vfnmaddps xmm2/m128, xmmIH, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W1 78 /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFNMADDPS xmmIH, xmm2/m128,
> xmmV, xmm1","vfnmaddps xmmIH, xmm2/m128, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W0 78 /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFNMADDPS ymm2/m256, ymmIH,
> ymmV, ymm1","vfnmaddps ymm2/m256, ymmIH, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W1 78 /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFNMADDPS ymmIH, ymm2/m256,
> ymmV, ymm1","vfnmaddps ymmIH, ymm2/m256, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W0 78 /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMADDSD xmm1, xmmV, xmmIH, xmm2/m64","VFNMADDSD xmm2/m64, xmmIH,
> xmmV, xmm1","vfnmaddsd xmm2/m64, xmmIH, xmmV,
> xmm1","VEX.NDS.LIG.66.0F3A.W1 7B /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMADDSD xmm1, xmmV, xmm2/m64, xmmIH","VFNMADDSD xmmIH, xmm2/m64,
> xmmV, xmm1","vfnmaddsd xmmIH, xmm2/m64, xmmV,
> xmm1","VEX.NDS.LIG.66.0F3A.W0 7B /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMADDSS xmm1, xmmV, xmmIH, xmm2/m32","VFNMADDSS xmm2/m32, xmmIH,
> xmmV, xmm1","vfnmaddss xmm2/m32, xmmIH, xmmV,
> xmm1","VEX.NDS.LIG.66.0F3A.W1 7A /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMADDSS xmm1, xmmV, xmm2/m32, xmmIH","VFNMADDSS xmmIH, xmm2/m32,
> xmmV, xmm1","vfnmaddss xmmIH, xmm2/m32, xmmV,
> xmm1","VEX.NDS.LIG.66.0F3A.W0 7A /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMSUB132PD xmm1, xmmV, xmm2/m128","VFNMSUB132PD xmm2/m128, xmmV,
> xmm1","vfnmsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9E
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB132PD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub132pd
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 9E
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFNMSUB132PD ymm1, ymmV, ymm2/m256","VFNMSUB132PD ymm2/m256, ymmV,
> ymm1","vfnmsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9E
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB132PD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub132pd
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 9E
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFNMSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB132PD zmm2, zmmV,
> {k}{z}, zmm1{er}","vfnmsub132pd zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W1 9E
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMSUB132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB132PD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub132pd
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 9E
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFNMSUB132PS xmm1, xmmV, xmm2/m128","VFNMSUB132PS xmm2/m128, xmmV,
> xmm1","vfnmsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9E
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB132PS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub132ps
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 9E
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFNMSUB132PS ymm1, ymmV, ymm2/m256","VFNMSUB132PS ymm2/m256, ymmV,
> ymm1","vfnmsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9E
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB132PS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub132ps
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 9E
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFNMSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB132PS zmm2, zmmV,
> {k}{z}, zmm1{er}","vfnmsub132ps zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W0 9E
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB132PS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub132ps
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 9E
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFNMSUB132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB132SD xmm2, xmmV,
> {k}{z}, xmm1{er}","vfnmsub132sd xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W1 9F
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMSUB132SD xmm1, xmmV, xmm2/m64","VFNMSUB132SD xmm2/m64, xmmV, xmm1","vfnmsub132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9F /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB132SD xmm2/m64,
> xmmV, {k}{z}, xmm1","vfnmsub132sd xmm2/m64, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W1 9F
> /r","V","V","AVX512F","scale8","rw,r,r,r","",""
> +"VFNMSUB132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB132SS xmm2, xmmV,
> {k}{z}, xmm1{er}","vfnmsub132ss xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W0 9F
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMSUB132SS xmm1, xmmV, xmm2/m32","VFNMSUB132SS xmm2/m32, xmmV, xmm1","vfnmsub132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9F /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB132SS xmm2/m32,
> xmmV, {k}{z}, xmm1","vfnmsub132ss xmm2/m32, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W0 9F
> /r","V","V","AVX512F","scale4","rw,r,r,r","",""
> +"VFNMSUB213PD xmm1, xmmV, xmm2/m128","VFNMSUB213PD xmm2/m128, xmmV,
> xmm1","vfnmsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AE
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB213PD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub213pd
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 AE
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFNMSUB213PD ymm1, ymmV, ymm2/m256","VFNMSUB213PD ymm2/m256, ymmV,
> ymm1","vfnmsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AE
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB213PD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub213pd
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 AE
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFNMSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB213PD zmm2, zmmV,
> {k}{z}, zmm1{er}","vfnmsub213pd zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W1 AE
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB213PD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub213pd
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 AE
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFNMSUB213PS xmm1, xmmV, xmm2/m128","VFNMSUB213PS xmm2/m128, xmmV,
> xmm1","vfnmsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AE
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB213PS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub213ps
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 AE
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFNMSUB213PS ymm1, ymmV, ymm2/m256","VFNMSUB213PS ymm2/m256, ymmV,
> ymm1","vfnmsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AE
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB213PS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub213ps
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 AE
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFNMSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB213PS zmm2, zmmV,
> {k}{z}, zmm1{er}","vfnmsub213ps zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W0 AE
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB213PS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub213ps
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 AE
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFNMSUB213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB213SD xmm2, xmmV,
> {k}{z}, xmm1{er}","vfnmsub213sd xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W1 AF
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMSUB213SD xmm1, xmmV, xmm2/m64","VFNMSUB213SD xmm2/m64, xmmV, xmm1","vfnmsub213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AF /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB213SD xmm2/m64,
> xmmV, {k}{z}, xmm1","vfnmsub213sd xmm2/m64, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W1 AF
> /r","V","V","AVX512F","scale8","rw,r,r,r","",""
> +"VFNMSUB213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB213SS xmm2, xmmV,
> {k}{z}, xmm1{er}","vfnmsub213ss xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W0 AF
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMSUB213SS xmm1, xmmV, xmm2/m32","VFNMSUB213SS xmm2/m32, xmmV, xmm1","vfnmsub213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AF /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB213SS xmm2/m32,
> xmmV, {k}{z}, xmm1","vfnmsub213ss xmm2/m32, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W0 AF
> /r","V","V","AVX512F","scale4","rw,r,r,r","",""
> +"VFNMSUB231PD xmm1, xmmV, xmm2/m128","VFNMSUB231PD xmm2/m128, xmmV,
> xmm1","vfnmsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BE
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB231PD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub231pd
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 BE
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VFNMSUB231PD ymm1, ymmV, ymm2/m256","VFNMSUB231PD ymm2/m256, ymmV,
> ymm1","vfnmsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BE
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB231PD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub231pd
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 BE
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VFNMSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB231PD zmm2, zmmV,
> {k}{z}, zmm1{er}","vfnmsub231pd zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W1 BE
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB231PD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub231pd
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 BE
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VFNMSUB231PS xmm1, xmmV, xmm2/m128","VFNMSUB231PS xmm2/m128, xmmV,
> xmm1","vfnmsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BE
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB231PS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub231ps
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 BE
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VFNMSUB231PS ymm1, ymmV, ymm2/m256","VFNMSUB231PS ymm2/m256, ymmV,
> ymm1","vfnmsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BE
> /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB231PS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub231ps
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 BE
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VFNMSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB231PS zmm2, zmmV,
> {k}{z}, zmm1{er}","vfnmsub231ps zmm2, zmmV, {k}{z},
> zmm1{er}","EVEX.DDS.512.66.0F38.W0 BE
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB231PS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub231ps
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 BE
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VFNMSUB231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB231SD xmm2, xmmV,
> {k}{z}, xmm1{er}","vfnmsub231sd xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W1 BF
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMSUB231SD xmm1, xmmV, xmm2/m64","VFNMSUB231SD xmm2/m64, xmmV, xmm1","vfnmsub231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BF /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB231SD xmm2/m64,
> xmmV, {k}{z}, xmm1","vfnmsub231sd xmm2/m64, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W1 BF
> /r","V","V","AVX512F","scale8","rw,r,r,r","",""
> +"VFNMSUB231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB231SS xmm2, xmmV,
> {k}{z}, xmm1{er}","vfnmsub231ss xmm2, xmmV, {k}{z},
> xmm1{er}","EVEX.DDS.128.66.0F38.W0 BF
> /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
> +"VFNMSUB231SS xmm1, xmmV, xmm2/m32","VFNMSUB231SS xmm2/m32, xmmV, xmm1","vfnmsub231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BF /r","V","V","FMA","","rw,r,r","",""
> +"VFNMSUB231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB231SS xmm2/m32,
> xmmV, {k}{z}, xmm1","vfnmsub231ss xmm2/m32, xmmV, {k}{z},
> xmm1","EVEX.DDS.LIG.66.0F38.W0 BF
> /r","V","V","AVX512F","scale4","rw,r,r,r","",""
> +"VFNMSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFNMSUBPD xmm2/m128, xmmIH,
> xmmV, xmm1","vfnmsubpd xmm2/m128, xmmIH, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W1 7D /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFNMSUBPD xmmIH, xmm2/m128,
> xmmV, xmm1","vfnmsubpd xmmIH, xmm2/m128, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W0 7D /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFNMSUBPD ymm2/m256, ymmIH,
> ymmV, ymm1","vfnmsubpd ymm2/m256, ymmIH, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W1 7D /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFNMSUBPD ymmIH, ymm2/m256,
> ymmV, ymm1","vfnmsubpd ymmIH, ymm2/m256, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W0 7D /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFNMSUBPS xmm2/m128, xmmIH,
> xmmV, xmm1","vfnmsubps xmm2/m128, xmmIH, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W1 7C /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFNMSUBPS xmmIH, xmm2/m128,
> xmmV, xmm1","vfnmsubps xmmIH, xmm2/m128, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.W0 7C /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFNMSUBPS ymm2/m256, ymmIH,
> ymmV, ymm1","vfnmsubps ymm2/m256, ymmIH, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W1 7C /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFNMSUBPS ymmIH, ymm2/m256,
> ymmV, ymm1","vfnmsubps ymmIH, ymm2/m256, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.W0 7C /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMSUBSD xmm1, xmmV, xmmIH, xmm2/m64","VFNMSUBSD xmm2/m64, xmmIH,
> xmmV, xmm1","vfnmsubsd xmm2/m64, xmmIH, xmmV,
> xmm1","VEX.NDS.LIG.66.0F3A.W1 7F /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMSUBSD xmm1, xmmV, xmm2/m64, xmmIH","VFNMSUBSD xmmIH, xmm2/m64,
> xmmV, xmm1","vfnmsubsd xmmIH, xmm2/m64, xmmV,
> xmm1","VEX.NDS.LIG.66.0F3A.W0 7F /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMSUBSS xmm1, xmmV, xmmIH, xmm2/m32","VFNMSUBSS xmm2/m32, xmmIH,
> xmmV, xmm1","vfnmsubss xmm2/m32, xmmIH, xmmV,
> xmm1","VEX.NDS.LIG.66.0F3A.W1 7E /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFNMSUBSS xmm1, xmmV, xmm2/m32, xmmIH","VFNMSUBSS xmmIH, xmm2/m32,
> xmmV, xmm1","vfnmsubss xmmIH, xmm2/m32, xmmV,
> xmm1","VEX.NDS.LIG.66.0F3A.W0 7E /r
> /is4","V","V","FMA4","amd","w,r,r,r","",""
> +"VFPCLASSPD k1, {k}, xmm2/m128/m64bcst, imm8u","VFPCLASSPDX imm8u,
> xmm2/m128/m64bcst, {k}, k1","vfpclasspdx imm8u, xmm2/m128/m64bcst,
> {k}, k1","EVEX.128.66.0F3A.W1 66 /r
> ib","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","Y","128"
> +"VFPCLASSPD k1, {k}, ymm2/m256/m64bcst, imm8u","VFPCLASSPDY imm8u,
> ymm2/m256/m64bcst, {k}, k1","vfpclasspdy imm8u, ymm2/m256/m64bcst,
> {k}, k1","EVEX.256.66.0F3A.W1 66 /r
> ib","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","Y","256"
> +"VFPCLASSPD k1, {k}, zmm2/m512/m64bcst, imm8u","VFPCLASSPDZ imm8u,
> zmm2/m512/m64bcst, {k}, k1","vfpclasspdz imm8u, zmm2/m512/m64bcst,
> {k}, k1","EVEX.512.66.0F3A.W1 66 /r
> ib","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","Y","512"
> +"VFPCLASSPS k1, {k}, xmm2/m128/m32bcst, imm8u","VFPCLASSPSX imm8u,
> xmm2/m128/m32bcst, {k}, k1","vfpclasspsx imm8u, xmm2/m128/m32bcst,
> {k}, k1","EVEX.128.66.0F3A.W0 66 /r
> ib","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","Y","128"
> +"VFPCLASSPS k1, {k}, ymm2/m256/m32bcst, imm8u","VFPCLASSPSY imm8u,
> ymm2/m256/m32bcst, {k}, k1","vfpclasspsy imm8u, ymm2/m256/m32bcst,
> {k}, k1","EVEX.256.66.0F3A.W0 66 /r
> ib","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","Y","256"
> +"VFPCLASSPS k1, {k}, zmm2/m512/m32bcst, imm8u","VFPCLASSPSZ imm8u,
> zmm2/m512/m32bcst, {k}, k1","vfpclasspsz imm8u, zmm2/m512/m32bcst,
> {k}, k1","EVEX.512.66.0F3A.W0 66 /r
> ib","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","Y","512"
> +"VFPCLASSSD k1, {k}, xmm2/m64, imm8u","VFPCLASSSD imm8u, xmm2/m64,
> {k}, k1","vfpclasssd imm8u, xmm2/m64, {k}, k1","EVEX.LIG.66.0F3A.W1 67
> /r ib","V","V","AVX512DQ","scale8","w,r,r,r","",""
> +"VFPCLASSSS k1, {k}, xmm2/m32, imm8u","VFPCLASSSS imm8u, xmm2/m32,
> {k}, k1","vfpclassss imm8u, xmm2/m32, {k}, k1","EVEX.LIG.66.0F3A.W0 67
> /r ib","V","V","AVX512DQ","scale4","w,r,r,r","",""
> +"VFRCZPD xmm1, xmm2/m128","VFRCZPD xmm2/m128, xmm1","vfrczpd xmm2/m128, xmm1","XOP.128.09.W0 81 /r","V","V","XOP","amd","w,r","",""
> +"VFRCZPD ymm1, ymm2/m256","VFRCZPD ymm2/m256, ymm1","vfrczpd ymm2/m256, ymm1","XOP.256.09.W0 81 /r","V","V","XOP","amd","w,r","",""
> +"VFRCZPS xmm1, xmm2/m128","VFRCZPS xmm2/m128, xmm1","vfrczps xmm2/m128, xmm1","XOP.128.09.W0 80 /r","V","V","XOP","amd","w,r","",""
> +"VFRCZPS ymm1, ymm2/m256","VFRCZPS ymm2/m256, ymm1","vfrczps ymm2/m256, ymm1","XOP.256.09.W0 80 /r","V","V","XOP","amd","w,r","",""
> +"VFRCZSD xmm1, xmm2/m64","VFRCZSD xmm2/m64, xmm1","vfrczsd xmm2/m64, xmm1","XOP.128.09.W0 83 /r","V","V","XOP","amd","w,r","",""
> +"VFRCZSS xmm1, xmm2/m32","VFRCZSS xmm2/m32, xmm1","vfrczss xmm2/m32, xmm1","XOP.128.09.W0 82 /r","V","V","XOP","amd","w,r","",""
> +"VGATHERDPD xmm1, {k1-k7}, vm32x","VGATHERDPD vm32x, {k1-k7},
> xmm1","vgatherdpd vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 92
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VGATHERDPD ymm1, {k1-k7}, vm32x","VGATHERDPD vm32x, {k1-k7},
> ymm1","vgatherdpd vm32x, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 92
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VGATHERDPD zmm1, {k1-k7}, vm32y","VGATHERDPD vm32y, {k1-k7},
> zmm1","vgatherdpd vm32y, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 92
> /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
> +"VGATHERDPD xmm1, vm32x, xmmV","VGATHERDPD xmmV, vm32x, xmm1","vgatherdpd xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W1 92 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VGATHERDPD ymm1, vm32x, ymmV","VGATHERDPD ymmV, vm32x, ymm1","vgatherdpd ymmV, vm32x, ymm1","VEX.DDS.256.66.0F38.W1 92 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VGATHERDPS xmm1, {k1-k7}, vm32x","VGATHERDPS vm32x, {k1-k7},
> xmm1","vgatherdps vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 92
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VGATHERDPS ymm1, {k1-k7}, vm32y","VGATHERDPS vm32y, {k1-k7},
> ymm1","vgatherdps vm32y, {k1-k7}, ymm1","EVEX.256.66.0F38.W0 92
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VGATHERDPS zmm1, {k1-k7}, vm32z","VGATHERDPS vm32z, {k1-k7},
> zmm1","vgatherdps vm32z, {k1-k7}, zmm1","EVEX.512.66.0F38.W0 92
> /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
> +"VGATHERDPS xmm1, vm32x, xmmV","VGATHERDPS xmmV, vm32x, xmm1","vgatherdps xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W0 92 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VGATHERDPS ymm1, vm32y, ymmV","VGATHERDPS ymmV, vm32y, ymm1","vgatherdps ymmV, vm32y, ymm1","VEX.DDS.256.66.0F38.W0 92 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VGATHERPF0DPD vm32y, {k1-k7}","VGATHERPF0DPD {k1-k7},
> vm32y","vgatherpf0dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6
> /1","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
> +"VGATHERPF0DPS vm32z, {k1-k7}","VGATHERPF0DPS {k1-k7},
> vm32z","vgatherpf0dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6
> /1","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
> +"VGATHERPF0QPD vm64z, {k1-k7}","VGATHERPF0QPD {k1-k7},
> vm64z","vgatherpf0qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7
> /1","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
> +"VGATHERPF0QPS vm64z, {k1-k7}","VGATHERPF0QPS {k1-k7},
> vm64z","vgatherpf0qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7
> /1","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
> +"VGATHERPF1DPD vm32y, {k1-k7}","VGATHERPF1DPD {k1-k7},
> vm32y","vgatherpf1dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6
> /2","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
> +"VGATHERPF1DPS vm32z, {k1-k7}","VGATHERPF1DPS {k1-k7},
> vm32z","vgatherpf1dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6
> /2","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
> +"VGATHERPF1QPD vm64z, {k1-k7}","VGATHERPF1QPD {k1-k7},
> vm64z","vgatherpf1qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7
> /2","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
> +"VGATHERPF1QPS vm64z, {k1-k7}","VGATHERPF1QPS {k1-k7},
> vm64z","vgatherpf1qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7
> /2","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
> +"VGATHERQPD xmm1, {k1-k7}, vm64x","VGATHERQPD vm64x, {k1-k7},
> xmm1","vgatherqpd vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 93
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VGATHERQPD ymm1, {k1-k7}, vm64y","VGATHERQPD vm64y, {k1-k7},
> ymm1","vgatherqpd vm64y, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 93
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VGATHERQPD zmm1, {k1-k7}, vm64z","VGATHERQPD vm64z, {k1-k7},
> zmm1","vgatherqpd vm64z, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 93
> /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
> +"VGATHERQPD xmm1, vm64x, xmmV","VGATHERQPD xmmV, vm64x, xmm1","vgatherqpd xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W1 93 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VGATHERQPD ymm1, vm64y, ymmV","VGATHERQPD ymmV, vm64y, ymm1","vgatherqpd ymmV, vm64y, ymm1","VEX.DDS.256.66.0F38.W1 93 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VGATHERQPS xmm1, {k1-k7}, vm64x","VGATHERQPS vm64x, {k1-k7},
> xmm1","vgatherqps vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 93
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VGATHERQPS xmm1, {k1-k7}, vm64y","VGATHERQPS vm64y, {k1-k7},
> xmm1","vgatherqps vm64y, {k1-k7}, xmm1","EVEX.256.66.0F38.W0 93
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VGATHERQPS ymm1, {k1-k7}, vm64z","VGATHERQPS vm64z, {k1-k7},
> ymm1","vgatherqps vm64z, {k1-k7}, ymm1","EVEX.512.66.0F38.W0 93
> /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
> +"VGATHERQPS xmm1, vm64x, xmmV","VGATHERQPS xmmV, vm64x, xmm1","vgatherqps xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W0 93 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VGATHERQPS xmm1, vm64y, xmmV","VGATHERQPS xmmV, vm64y, xmm1","vgatherqps xmmV, vm64y, xmm1","VEX.DDS.256.66.0F38.W0 93 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VGETEXPPD xmm1, {k}{z}, xmm2/m128/m64bcst","VGETEXPPD
> xmm2/m128/m64bcst, {k}{z}, xmm1","vgetexppd xmm2/m128/m64bcst, {k}{z},
> xmm1","EVEX.128.66.0F38.W1 42
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VGETEXPPD ymm1, {k}{z}, ymm2/m256/m64bcst","VGETEXPPD
> ymm2/m256/m64bcst, {k}{z}, ymm1","vgetexppd ymm2/m256/m64bcst, {k}{z},
> ymm1","EVEX.256.66.0F38.W1 42
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VGETEXPPD zmm1{sae}, {k}{z}, zmm2","VGETEXPPD zmm2, {k}{z},
> zmm1{sae}","vgetexppd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 42
> /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VGETEXPPD zmm1, {k}{z}, zmm2/m512/m64bcst","VGETEXPPD
> zmm2/m512/m64bcst, {k}{z}, zmm1","vgetexppd zmm2/m512/m64bcst, {k}{z},
> zmm1","EVEX.512.66.0F38.W1 42
> /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
> +"VGETEXPPS xmm1, {k}{z}, xmm2/m128/m32bcst","VGETEXPPS
> xmm2/m128/m32bcst, {k}{z}, xmm1","vgetexpps xmm2/m128/m32bcst, {k}{z},
> xmm1","EVEX.128.66.0F38.W0 42
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VGETEXPPS ymm1, {k}{z}, ymm2/m256/m32bcst","VGETEXPPS
> ymm2/m256/m32bcst, {k}{z}, ymm1","vgetexpps ymm2/m256/m32bcst, {k}{z},
> ymm1","EVEX.256.66.0F38.W0 42
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VGETEXPPS zmm1{sae}, {k}{z}, zmm2","VGETEXPPS zmm2, {k}{z},
> zmm1{sae}","vgetexpps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 42
> /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VGETEXPPS zmm1, {k}{z}, zmm2/m512/m32bcst","VGETEXPPS
> zmm2/m512/m32bcst, {k}{z}, zmm1","vgetexpps zmm2/m512/m32bcst, {k}{z},
> zmm1","EVEX.512.66.0F38.W0 42
> /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
> +"VGETEXPSD xmm1{sae}, {k}{z}, xmmV, xmm2","VGETEXPSD xmm2, xmmV, {k}{z}, xmm1{sae}","vgetexpsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W1 43 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VGETEXPSD xmm1, {k}{z}, xmmV, xmm2/m64","VGETEXPSD xmm2/m64, xmmV, {k}{z}, xmm1","vgetexpsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 43 /r","V","V","AVX512F","scale8","w,r,r,r","",""
> +"VGETEXPSS xmm1{sae}, {k}{z}, xmmV, xmm2","VGETEXPSS xmm2, xmmV, {k}{z}, xmm1{sae}","vgetexpss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W0 43 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VGETEXPSS xmm1, {k}{z}, xmmV, xmm2/m32","VGETEXPSS xmm2/m32, xmmV, {k}{z}, xmm1","vgetexpss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 43 /r","V","V","AVX512F","scale4","w,r,r,r","",""
> +"VGETMANTPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u:4","VGETMANTPD imm8u:4, xmm2/m128/m64bcst, {k}{z}, xmm1","vgetmantpd imm8u:4, xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W1 26 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VGETMANTPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u:4","VGETMANTPD imm8u:4, ymm2/m256/m64bcst, {k}{z}, ymm1","vgetmantpd imm8u:4, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 26 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VGETMANTPD zmm1{sae}, {k}{z}, zmm2, imm8u:4","VGETMANTPD imm8u:4, zmm2, {k}{z}, zmm1{sae}","vgetmantpd imm8u:4, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W1 26 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VGETMANTPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u:4","VGETMANTPD imm8u:4, zmm2/m512/m64bcst, {k}{z}, zmm1","vgetmantpd imm8u:4, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 26 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VGETMANTPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u:4","VGETMANTPS imm8u:4, xmm2/m128/m32bcst, {k}{z}, xmm1","vgetmantps imm8u:4, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W0 26 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VGETMANTPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u:4","VGETMANTPS imm8u:4, ymm2/m256/m32bcst, {k}{z}, ymm1","vgetmantps imm8u:4, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W0 26 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VGETMANTPS zmm1{sae}, {k}{z}, zmm2, imm8u:4","VGETMANTPS imm8u:4, zmm2, {k}{z}, zmm1{sae}","vgetmantps imm8u:4, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W0 26 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VGETMANTPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u:4","VGETMANTPS imm8u:4, zmm2/m512/m32bcst, {k}{z}, zmm1","vgetmantps imm8u:4, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W0 26 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VGETMANTSD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VGETMANTSD imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","vgetmantsd imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W1 27 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
> +"VGETMANTSD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u:4","VGETMANTSD imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","vgetmantsd imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W1 27 /r ib","V","V","AVX512F","scale8","w,r,r,r,r","",""
> +"VGETMANTSS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VGETMANTSS imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","vgetmantss imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W0 27 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
> +"VGETMANTSS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u:4","VGETMANTSS imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","vgetmantss imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W0 27 /r ib","V","V","AVX512F","scale4","w,r,r,r,r","",""
> +"VGF2P8AFFINEINVQB xmm1, xmmV, xmm2/m128, imm8u","VGF2P8AFFINEINVQB imm8u, xmm2/m128, xmmV, xmm1","vgf2p8affineinvqb imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX","","w,r,r,r","",""
> +"VGF2P8AFFINEINVQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VGF2P8AFFINEINVQB imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vgf2p8affineinvqb imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
> +"VGF2P8AFFINEINVQB ymm1, ymmV, ymm2/m256, imm8u","VGF2P8AFFINEINVQB imm8u, ymm2/m256, ymmV, ymm1","vgf2p8affineinvqb imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX","","w,r,r,r","",""
> +"VGF2P8AFFINEINVQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VGF2P8AFFINEINVQB imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vgf2p8affineinvqb imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
> +"VGF2P8AFFINEINVQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VGF2P8AFFINEINVQB imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vgf2p8affineinvqb imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX512F","bscale8,scale64","w,r,r,r,r","",""
> +"VGF2P8AFFINEQB xmm1, xmmV, xmm2/m128, imm8u","VGF2P8AFFINEQB imm8u, xmm2/m128, xmmV, xmm1","vgf2p8affineqb imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX","","w,r,r,r","",""
> +"VGF2P8AFFINEQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VGF2P8AFFINEQB imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vgf2p8affineqb imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
> +"VGF2P8AFFINEQB ymm1, ymmV, ymm2/m256, imm8u","VGF2P8AFFINEQB imm8u, ymm2/m256, ymmV, ymm1","vgf2p8affineqb imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX","","w,r,r,r","",""
> +"VGF2P8AFFINEQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VGF2P8AFFINEQB imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vgf2p8affineqb imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
> +"VGF2P8AFFINEQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VGF2P8AFFINEQB imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vgf2p8affineqb imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX512F","bscale8,scale64","w,r,r,r,r","",""
> +"VGF2P8MULB xmm1, xmmV, xmm2/m128","VGF2P8MULB xmm2/m128, xmmV, xmm1","vgf2p8mulb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 CF /r","V","V","GFNI+AVX","","w,r,r","",""
> +"VGF2P8MULB xmm1, {k}{z}, xmmV, xmm2/m128","VGF2P8MULB xmm2/m128, xmmV, {k}{z}, xmm1","vgf2p8mulb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 CF /r","V","V","GFNI+AVX512VL","scale16","w,r,r,r","",""
> +"VGF2P8MULB ymm1, ymmV, ymm2/m256","VGF2P8MULB ymm2/m256, ymmV, ymm1","vgf2p8mulb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 CF /r","V","V","GFNI+AVX","","w,r,r","",""
> +"VGF2P8MULB ymm1, {k}{z}, ymmV, ymm2/m256","VGF2P8MULB ymm2/m256, ymmV, {k}{z}, ymm1","vgf2p8mulb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 CF /r","V","V","GFNI+AVX512VL","scale32","w,r,r,r","",""
> +"VGF2P8MULB zmm1, {k}{z}, zmmV, zmm2/m512","VGF2P8MULB zmm2/m512, zmmV, {k}{z}, zmm1","vgf2p8mulb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 CF /r","V","V","GFNI+AVX512F","scale64","w,r,r,r","",""
> +"VHADDPD xmm1, xmmV, xmm2/m128","VHADDPD xmm2/m128, xmmV, xmm1","vhaddpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 7C /r","V","V","AVX","","w,r,r","",""
> +"VHADDPD ymm1, ymmV, ymm2/m256","VHADDPD ymm2/m256, ymmV, ymm1","vhaddpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 7C /r","V","V","AVX","","w,r,r","",""
> +"VHADDPS xmm1, xmmV, xmm2/m128","VHADDPS xmm2/m128, xmmV, xmm1","vhaddps xmm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG 7C /r","V","V","AVX","","w,r,r","",""
> +"VHADDPS ymm1, ymmV, ymm2/m256","VHADDPS ymm2/m256, ymmV, ymm1","vhaddps ymm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG 7C /r","V","V","AVX","","w,r,r","",""
> +"VHSUBPD xmm1, xmmV, xmm2/m128","VHSUBPD xmm2/m128, xmmV, xmm1","vhsubpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 7D /r","V","V","AVX","","w,r,r","",""
> +"VHSUBPD ymm1, ymmV, ymm2/m256","VHSUBPD ymm2/m256, ymmV, ymm1","vhsubpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 7D /r","V","V","AVX","","w,r,r","",""
> +"VHSUBPS xmm1, xmmV, xmm2/m128","VHSUBPS xmm2/m128, xmmV, xmm1","vhsubps xmm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG 7D /r","V","V","AVX","","w,r,r","",""
> +"VHSUBPS ymm1, ymmV, ymm2/m256","VHSUBPS ymm2/m256, ymmV, ymm1","vhsubps ymm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG 7D /r","V","V","AVX","","w,r,r","",""
> +"VINSERTF128 ymm1, ymmV, xmm2/m128, imm8u:1","VINSERTF128 imm8u:1, xmm2/m128, ymmV, ymm1","vinsertf128 imm8u:1, xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 18 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VINSERTF32X4 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTF32X4 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","vinsertf32x4 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 18 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r,r","",""
> +"VINSERTF32X4 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTF32X4 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","vinsertf32x4 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 18 /r ib","V","V","AVX512F","scale16","w,r,r,r,r","",""
> +"VINSERTF32X8 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTF32X8 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","vinsertf32x8 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 1A /r ib","V","V","AVX512DQ","scale32","w,r,r,r,r","",""
> +"VINSERTF64X2 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTF64X2 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","vinsertf64x2 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 18 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r,r","",""
> +"VINSERTF64X2 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTF64X2 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","vinsertf64x2 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 18 /r ib","V","V","AVX512DQ","scale16","w,r,r,r,r","",""
> +"VINSERTF64X4 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTF64X4 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","vinsertf64x4 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 1A /r ib","V","V","AVX512F","scale32","w,r,r,r,r","",""
> +"VINSERTI128 ymm1, ymmV, xmm2/m128, imm8u:1","VINSERTI128 imm8u:1, xmm2/m128, ymmV, ymm1","vinserti128 imm8u:1, xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 38 /r ib","V","V","AVX2","","w,r,r,r","",""
> +"VINSERTI32X4 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTI32X4 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","vinserti32x4 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 38 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r,r","",""
> +"VINSERTI32X4 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTI32X4 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","vinserti32x4 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 38 /r ib","V","V","AVX512F","scale16","w,r,r,r,r","",""
> +"VINSERTI32X8 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTI32X8 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","vinserti32x8 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 3A /r ib","V","V","AVX512DQ","scale32","w,r,r,r,r","",""
> +"VINSERTI64X2 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTI64X2 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","vinserti64x2 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 38 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r,r","",""
> +"VINSERTI64X2 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTI64X2 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","vinserti64x2 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 38 /r ib","V","V","AVX512DQ","scale16","w,r,r,r,r","",""
> +"VINSERTI64X4 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTI64X4 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","vinserti64x4 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 3A /r ib","V","V","AVX512F","scale32","w,r,r,r,r","",""
> +"VINSERTPS xmm1, xmmV, xmm2/m32, imm8u","VINSERTPS imm8u, xmm2/m32, xmmV, xmm1","vinsertps imm8u, xmm2/m32, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W0 21 /r ib","V","V","AVX512F+AVX512VL","scale4","w,r,r,r","",""
> +"VINSERTPS xmm1, xmmV, xmm2/m32, imm8u","VINSERTPS imm8u, xmm2/m32, xmmV, xmm1","vinsertps imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 21 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VLDDQU xmm1, m128","VLDDQU m128, xmm1","vlddqu m128, xmm1","VEX.128.F2.0F.WIG F0 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VLDDQU ymm1, m256","VLDDQU m256, ymm1","vlddqu m256, ymm1","VEX.256.F2.0F.WIG F0 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VLDMXCSR m32","VLDMXCSR m32","vldmxcsr m32","VEX.128.0F.WIG AE /2","V","V","AVX","modrm_memonly","r","",""
> +"VMASKMOVDQU xmm1, xmm2","VMASKMOVDQU xmm2, xmm1","vmaskmovdqu xmm2, xmm1","VEX.128.66.0F.WIG F7 /r","V","V","AVX","modrm_regonly","r,r","",""
> +"VMASKMOVPD xmm1, xmmV, m128","VMASKMOVPD m128, xmmV, xmm1","vmaskmovpd m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 2D /r","V","V","AVX","modrm_memonly","w,r,r","",""
> +"VMASKMOVPD ymm1, ymmV, m256","VMASKMOVPD m256, ymmV, ymm1","vmaskmovpd m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 2D /r","V","V","AVX","modrm_memonly","w,r,r","",""
> +"VMASKMOVPD m128, xmmV, xmm1","VMASKMOVPD xmm1, xmmV, m128","vmaskmovpd xmm1, xmmV, m128","VEX.NDS.128.66.0F38.W0 2F /r","V","V","AVX","modrm_memonly","w,r,r","",""
> +"VMASKMOVPD m256, ymmV, ymm1","VMASKMOVPD ymm1, ymmV, m256","vmaskmovpd ymm1, ymmV, m256","VEX.NDS.256.66.0F38.W0 2F /r","V","V","AVX","modrm_memonly","w,r,r","",""
> +"VMASKMOVPS xmm1, xmmV, m128","VMASKMOVPS m128, xmmV, xmm1","vmaskmovps m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 2C /r","V","V","AVX","modrm_memonly","w,r,r","",""
> +"VMASKMOVPS ymm1, ymmV, m256","VMASKMOVPS m256, ymmV, ymm1","vmaskmovps m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 2C /r","V","V","AVX","modrm_memonly","w,r,r","",""
> +"VMASKMOVPS m128, xmmV, xmm1","VMASKMOVPS xmm1, xmmV, m128","vmaskmovps xmm1, xmmV, m128","VEX.NDS.128.66.0F38.W0 2E /r","V","V","AVX","modrm_memonly","w,r,r","",""
> +"VMASKMOVPS m256, ymmV, ymm1","VMASKMOVPS ymm1, ymmV, m256","vmaskmovps ymm1, ymmV, m256","VEX.NDS.256.66.0F38.W0 2E /r","V","V","AVX","modrm_memonly","w,r,r","",""
> +"VMAXPD xmm1, xmmV, xmm2/m128","VMAXPD xmm2/m128, xmmV, xmm1","vmaxpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
> +"VMAXPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMAXPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vmaxpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 5F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VMAXPD ymm1, ymmV, ymm2/m256","VMAXPD ymm2/m256, ymmV, ymm1","vmaxpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
> +"VMAXPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMAXPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vmaxpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 5F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VMAXPD zmm1{sae}, {k}{z}, zmmV, zmm2","VMAXPD zmm2, zmmV, {k}{z}, zmm1{sae}","vmaxpd zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F.W1 5F /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMAXPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMAXPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vmaxpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 5F /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VMAXPS xmm1, xmmV, xmm2/m128","VMAXPS xmm2/m128, xmmV, xmm1","vmaxps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
> +"VMAXPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMAXPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vmaxps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 5F /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VMAXPS ymm1, ymmV, ymm2/m256","VMAXPS ymm2/m256, ymmV, ymm1","vmaxps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
> +"VMAXPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMAXPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vmaxps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 5F /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VMAXPS zmm1{sae}, {k}{z}, zmmV, zmm2","VMAXPS zmm2, zmmV, {k}{z}, zmm1{sae}","vmaxps zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.0F.W0 5F /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMAXPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMAXPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vmaxps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 5F /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VMAXSD xmm1{sae}, {k}{z}, xmmV, xmm2","VMAXSD xmm2, xmmV, {k}{z}, xmm1{sae}","vmaxsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F2.0F.W1 5F /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMAXSD xmm1, xmmV, xmm2/m64","VMAXSD xmm2/m64, xmmV, xmm1","vmaxsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
> +"VMAXSD xmm1, {k}{z}, xmmV, xmm2/m64","VMAXSD xmm2/m64, xmmV, {k}{z}, xmm1","vmaxsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5F /r","V","V","AVX512F","scale8","w,r,r,r","",""
> +"VMAXSS xmm1{sae}, {k}{z}, xmmV, xmm2","VMAXSS xmm2, xmmV, {k}{z}, xmm1{sae}","vmaxss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 5F /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMAXSS xmm1, xmmV, xmm2/m32","VMAXSS xmm2/m32, xmmV, xmm1","vmaxss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
> +"VMAXSS xmm1, {k}{z}, xmmV, xmm2/m32","VMAXSS xmm2/m32, xmmV, {k}{z}, xmm1","vmaxss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5F /r","V","V","AVX512F","scale4","w,r,r,r","",""
> +"VMCALL","VMCALL","vmcall","0F 01 C1","V","V","VTX","","","",""
> +"VMCLEAR m64","VMCLEAR m64","vmclear m64","66 0F C7 /6","V","V","VTX","modrm_memonly","r","",""
> +"VMFUNC","VMFUNC","vmfunc","0F 01 D4","V","V","","","","",""
> +"VMINPD xmm1, xmmV, xmm2/m128","VMINPD xmm2/m128, xmmV, xmm1","vminpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
> +"VMINPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMINPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vminpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 5D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VMINPD ymm1, ymmV, ymm2/m256","VMINPD ymm2/m256, ymmV, ymm1","vminpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
> +"VMINPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMINPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vminpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 5D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VMINPD zmm1{sae}, {k}{z}, zmmV, zmm2","VMINPD zmm2, zmmV, {k}{z}, zmm1{sae}","vminpd zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F.W1 5D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMINPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMINPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vminpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 5D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VMINPS xmm1, xmmV, xmm2/m128","VMINPS xmm2/m128, xmmV, xmm1","vminps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
> +"VMINPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMINPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vminps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 5D /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VMINPS ymm1, ymmV, ymm2/m256","VMINPS ymm2/m256, ymmV, ymm1","vminps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
> +"VMINPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMINPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vminps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 5D /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VMINPS zmm1{sae}, {k}{z}, zmmV, zmm2","VMINPS zmm2, zmmV, {k}{z}, zmm1{sae}","vminps zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.0F.W0 5D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMINPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMINPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vminps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 5D /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VMINSD xmm1{sae}, {k}{z}, xmmV, xmm2","VMINSD xmm2, xmmV, {k}{z}, xmm1{sae}","vminsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F2.0F.W1 5D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMINSD xmm1, xmmV, xmm2/m64","VMINSD xmm2/m64, xmmV, xmm1","vminsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
> +"VMINSD xmm1, {k}{z}, xmmV, xmm2/m64","VMINSD xmm2/m64, xmmV, {k}{z}, xmm1","vminsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5D /r","V","V","AVX512F","scale8","w,r,r,r","",""
> +"VMINSS xmm1{sae}, {k}{z}, xmmV, xmm2","VMINSS xmm2, xmmV, {k}{z}, xmm1{sae}","vminss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 5D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMINSS xmm1, xmmV, xmm2/m32","VMINSS xmm2/m32, xmmV, xmm1","vminss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
> +"VMINSS xmm1, {k}{z}, xmmV, xmm2/m32","VMINSS xmm2/m32, xmmV, {k}{z}, xmm1","vminss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5D /r","V","V","AVX512F","scale4","w,r,r,r","",""
> +"VMLAUNCH","VMLAUNCH","vmlaunch","0F 01 C2","V","V","VTX","","","",""
> +"VMLOAD EAX","VMLOADL EAX","vmloadl EAX","0F 01 DA","V","V","SVM","amd,modrm_regonly,operand32","r","Y","32"
> +"VMLOAD RAX","VMLOADQ RAX","vmloadq RAX","REX.W 0F 01 DA","N.S.","V","SVM","amd,modrm_regonly","r","Y","64"
> +"VMLOAD AX","VMLOADW AX","vmloadw AX","0F 01 DA","V","V","SVM","amd,modrm_regonly,operand16","r","Y","16"
> +"VMMCALL","VMMCALL","vmmcall","0F 01 D9","V","V","SVM","amd","","",""
> +"VMOVAPD xmm2/m128, xmm1","VMOVAPD xmm1, xmm2/m128","vmovapd xmm1, xmm2/m128","VEX.128.66.0F.WIG 29 /r","V","V","AVX","","w,r","",""
> +"VMOVAPD xmm2/m128, {k}{z}, xmm1","VMOVAPD xmm1, {k}{z}, xmm2/m128","vmovapd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 29 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VMOVAPD xmm1, xmm2/m128","VMOVAPD xmm2/m128, xmm1","vmovapd xmm2/m128, xmm1","VEX.128.66.0F.WIG 28 /r","V","V","AVX","","w,r","",""
> +"VMOVAPD xmm1, {k}{z}, xmm2/m128","VMOVAPD xmm2/m128, {k}{z}, xmm1","vmovapd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 28 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VMOVAPD ymm2/m256, ymm1","VMOVAPD ymm1, ymm2/m256","vmovapd ymm1, ymm2/m256","VEX.256.66.0F.WIG 29 /r","V","V","AVX","","w,r","",""
> +"VMOVAPD ymm2/m256, {k}{z}, ymm1","VMOVAPD ymm1, {k}{z}, ymm2/m256","vmovapd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 29 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
> +"VMOVAPD ymm1, ymm2/m256","VMOVAPD ymm2/m256, ymm1","vmovapd ymm2/m256, ymm1","VEX.256.66.0F.WIG 28 /r","V","V","AVX","","w,r","",""
> +"VMOVAPD ymm1, {k}{z}, ymm2/m256","VMOVAPD ymm2/m256, {k}{z}, ymm1","vmovapd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 28 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
> +"VMOVAPD zmm2/m512, {k}{z}, zmm1","VMOVAPD zmm1, {k}{z}, zmm2/m512","vmovapd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 29 /r","V","V","AVX512F","scale64","w,r,r","",""
> +"VMOVAPD zmm1, {k}{z}, zmm2/m512","VMOVAPD zmm2/m512, {k}{z}, zmm1","vmovapd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 28 /r","V","V","AVX512F","scale64","w,r,r","",""
> +"VMOVAPS xmm2/m128, xmm1","VMOVAPS xmm1, xmm2/m128","vmovaps xmm1, xmm2/m128","VEX.128.0F.WIG 29 /r","V","V","AVX","","w,r","",""
> +"VMOVAPS xmm2/m128, {k}{z}, xmm1","VMOVAPS xmm1, {k}{z}, xmm2/m128","vmovaps xmm1, {k}{z}, xmm2/m128","EVEX.128.0F.W0 29 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VMOVAPS xmm1, xmm2/m128","VMOVAPS xmm2/m128, xmm1","vmovaps xmm2/m128, xmm1","VEX.128.0F.WIG 28 /r","V","V","AVX","","w,r","",""
> +"VMOVAPS xmm1, {k}{z}, xmm2/m128","VMOVAPS xmm2/m128, {k}{z}, xmm1","vmovaps xmm2/m128, {k}{z}, xmm1","EVEX.128.0F.W0 28 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VMOVAPS ymm2/m256, ymm1","VMOVAPS ymm1, ymm2/m256","vmovaps ymm1, ymm2/m256","VEX.256.0F.WIG 29 /r","V","V","AVX","","w,r","",""
> +"VMOVAPS ymm2/m256, {k}{z}, ymm1","VMOVAPS ymm1, {k}{z}, ymm2/m256","vmovaps ymm1, {k}{z}, ymm2/m256","EVEX.256.0F.W0 29 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
> +"VMOVAPS ymm1, ymm2/m256","VMOVAPS ymm2/m256, ymm1","vmovaps ymm2/m256, ymm1","VEX.256.0F.WIG 28 /r","V","V","AVX","","w,r","",""
> +"VMOVAPS ymm1, {k}{z}, ymm2/m256","VMOVAPS ymm2/m256, {k}{z}, ymm1","vmovaps ymm2/m256, {k}{z}, ymm1","EVEX.256.0F.W0 28 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
> +"VMOVAPS zmm2/m512, {k}{z}, zmm1","VMOVAPS zmm1, {k}{z}, zmm2/m512","vmovaps zmm1, {k}{z}, zmm2/m512","EVEX.512.0F.W0 29 /r","V","V","AVX512F","scale64","w,r,r","",""
> +"VMOVAPS zmm1, {k}{z}, zmm2/m512","VMOVAPS zmm2/m512, {k}{z}, zmm1","vmovaps zmm2/m512, {k}{z}, zmm1","EVEX.512.0F.W0 28 /r","V","V","AVX512F","scale64","w,r,r","",""
> +"VMOVD xmm1, r/m32","VMOVD r/m32, xmm1","vmovd r/m32, xmm1","EVEX.128.66.0F.W0 6E /r","V","V","AVX512F+AVX512VL","scale4","w,r","",""
> +"VMOVD xmm1, r/m32","VMOVD r/m32, xmm1","vmovd r/m32, xmm1","VEX.128.66.0F.W0 6E /r","V","V","AVX","","w,r","",""
> +"VMOVD r/m32, xmm1","VMOVD xmm1, r/m32","vmovd xmm1, r/m32","EVEX.128.66.0F.W0 7E /r","V","V","AVX512F+AVX512VL","scale4","w,r","",""
> +"VMOVD r/m32, xmm1","VMOVD xmm1, r/m32","vmovd xmm1, r/m32","VEX.128.66.0F.W0 7E /r","V","V","AVX","","w,r","",""
> +"VMOVDDUP xmm1, xmm2/m64","VMOVDDUP xmm2/m64, xmm1","vmovddup xmm2/m64, xmm1","VEX.128.F2.0F.WIG 12 /r","V","V","AVX","","w,r","",""
> +"VMOVDDUP xmm1, {k}{z}, xmm2/m64","VMOVDDUP xmm2/m64, {k}{z}, xmm1","vmovddup xmm2/m64, {k}{z}, xmm1","EVEX.128.F2.0F.W1 12 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VMOVDDUP ymm1, ymm2/m256","VMOVDDUP ymm2/m256, ymm1","vmovddup ymm2/m256, ymm1","VEX.256.F2.0F.WIG 12 /r","V","V","AVX","","w,r","",""
> +"VMOVDDUP ymm1, {k}{z}, ymm2/m256","VMOVDDUP ymm2/m256, {k}{z}, ymm1","vmovddup ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W1 12 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
> +"VMOVDDUP zmm1, {k}{z}, zmm2/m512","VMOVDDUP zmm2/m512, {k}{z}, zmm1","vmovddup zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W1 12 /r","V","V","AVX512F","scale64","w,r,r","",""
> +"VMOVDQA xmm2/m128, xmm1","VMOVDQA xmm1, xmm2/m128","vmovdqa xmm1, xmm2/m128","VEX.128.66.0F.WIG 7F /r","V","V","AVX","","w,r","",""
> +"VMOVDQA xmm1, xmm2/m128","VMOVDQA xmm2/m128, xmm1","vmovdqa xmm2/m128, xmm1","VEX.128.66.0F.WIG 6F /r","V","V","AVX","","w,r","",""
> +"VMOVDQA ymm2/m256, ymm1","VMOVDQA ymm1, ymm2/m256","vmovdqa ymm1, ymm2/m256","VEX.256.66.0F.WIG 7F /r","V","V","AVX","","w,r","",""
> +"VMOVDQA ymm1, ymm2/m256","VMOVDQA ymm2/m256, ymm1","vmovdqa ymm2/m256, ymm1","VEX.256.66.0F.WIG 6F /r","V","V","AVX","","w,r","",""
> +"VMOVDQA32 xmm2/m128, {k}{z}, xmm1","VMOVDQA32 xmm1, {k}{z}, xmm2/m128","vmovdqa32 xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W0 7F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VMOVDQA32 xmm1, {k}{z}, xmm2/m128","VMOVDQA32 xmm2/m128, {k}{z}, xmm1","vmovdqa32 xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W0 6F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VMOVDQA32 ymm2/m256, {k}{z}, ymm1","VMOVDQA32 ymm1, {k}{z}, ymm2/m256","vmovdqa32 ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W0 7F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
> +"VMOVDQA32 ymm1, {k}{z}, ymm2/m256","VMOVDQA32 ymm2/m256, {k}{z}, ymm1","vmovdqa32 ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W0 6F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
> +"VMOVDQA32 zmm2/m512, {k}{z}, zmm1","VMOVDQA32 zmm1, {k}{z}, zmm2/m512","vmovdqa32 zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W0 7F /r","V","V","AVX512F","scale64","w,r,r","",""
> +"VMOVDQA32 zmm1, {k}{z}, zmm2/m512","VMOVDQA32 zmm2/m512, {k}{z}, zmm1","vmovdqa32 zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W0 6F /r","V","V","AVX512F","scale64","w,r,r","",""
> +"VMOVDQA64 xmm2/m128, {k}{z}, xmm1","VMOVDQA64 xmm1, {k}{z}, xmm2/m128","vmovdqa64 xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 7F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
> +"VMOVDQA64 xmm1, {k}{z}, xmm2/m128","VMOVDQA64 xmm2/m128, {k}{z}, xmm1","vmovdqa64 xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 6F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
> +"VMOVDQA64 ymm2/m256, {k}{z}, ymm1","VMOVDQA64 ymm1, {k}{z}, ymm2/m256","vmovdqa64 ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 7F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
> +"VMOVDQA64 ymm1, {k}{z}, ymm2/m256","VMOVDQA64 ymm2/m256, {k}{z}, ymm1","vmovdqa64 ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 6F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
> +"VMOVDQA64 zmm2/m512, {k}{z}, zmm1","VMOVDQA64 zmm1, {k}{z}, zmm2/m512","vmovdqa64 zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 7F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
> +"VMOVDQA64 zmm1, {k}{z}, zmm2/m512","VMOVDQA64 zmm2/m512, {k}{z}, zmm1","vmovdqa64 zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 6F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
> +"VMOVDQU xmm2/m128, xmm1","VMOVDQU xmm1, xmm2/m128","vmovdqu xmm1, xmm2/m128","VEX.128.F3.0F.WIG 7F /r","V","V","AVX","","w,r","",""
> +"VMOVDQU xmm1, xmm2/m128","VMOVDQU xmm2/m128, xmm1","vmovdqu xmm2/m128, xmm1","VEX.128.F3.0F.WIG 6F /r","V","V","AVX","","w,r","",""
> +"VMOVDQU ymm2/m256, ymm1","VMOVDQU ymm1, ymm2/m256","vmovdqu ymm1, ymm2/m256","VEX.256.F3.0F.WIG 7F /r","V","V","AVX","","w,r","",""
> +"VMOVDQU ymm1, ymm2/m256","VMOVDQU ymm2/m256, ymm1","vmovdqu ymm2/m256, ymm1","VEX.256.F3.0F.WIG 6F /r","V","V","AVX","","w,r","",""
> +"VMOVDQU16 xmm2/m128, {k}{z}, xmm1","VMOVDQU16 xmm1, {k}{z}, xmm2/m128","vmovdqu16 xmm1, {k}{z}, xmm2/m128","EVEX.128.F2.0F.W1 7F /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
> +"VMOVDQU16 xmm1, {k}{z}, xmm2/m128","VMOVDQU16 xmm2/m128, {k}{z}, xmm1","vmovdqu16 xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.W1 6F /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
> +"VMOVDQU16 ymm2/m256, {k}{z}, ymm1","VMOVDQU16 ymm1, {k}{z}, ymm2/m256","vmovdqu16 ymm1, {k}{z}, ymm2/m256","EVEX.256.F2.0F.W1 7F /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
> +"VMOVDQU16 ymm1, {k}{z}, ymm2/m256","VMOVDQU16 ymm2/m256, {k}{z}, ymm1","vmovdqu16 ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W1 6F /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
> +"VMOVDQU16 zmm2/m512, {k}{z}, zmm1","VMOVDQU16 zmm1, {k}{z}, zmm2/m512","vmovdqu16 zmm1, {k}{z}, zmm2/m512","EVEX.512.F2.0F.W1 7F /r","V","V","AVX512BW","scale64","w,r,r","",""
> +"VMOVDQU16 zmm1, {k}{z}, zmm2/m512","VMOVDQU16 zmm2/m512, {k}{z}, zmm1","vmovdqu16 zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W1 6F /r","V","V","AVX512BW","scale64","w,r,r","",""
> +"VMOVDQU32 xmm2/m128, {k}{z}, xmm1","VMOVDQU32 xmm1, {k}{z}, xmm2/m128","vmovdqu32 xmm1, {k}{z}, xmm2/m128","EVEX.128.F3.0F.W0 7F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
> +"VMOVDQU32 xmm1, {k}{z}, xmm2/m128","VMOVDQU32 xmm2/m128, {k}{z}, xmm1","vmovdqu32 xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 6F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
> +"VMOVDQU32 ymm2/m256, {k}{z}, ymm1","VMOVDQU32 ymm1, {k}{z}, ymm2/m256","vmovdqu32 ymm1, {k}{z}, ymm2/m256","EVEX.256.F3.0F.W0 7F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
> +"VMOVDQU32 ymm1, {k}{z}, ymm2/m256","VMOVDQU32 ymm2/m256, {k}{z}, ymm1","vmovdqu32 ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 6F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
> +"VMOVDQU32 zmm2/m512, {k}{z}, zmm1","VMOVDQU32 zmm1, {k}{z}, zmm2/m512","vmovdqu32 zmm1, {k}{z}, zmm2/m512","EVEX.512.F3.0F.W0 7F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
> +"VMOVDQU32 zmm1, {k}{z}, zmm2/m512","VMOVDQU32 zmm2/m512, {k}{z}, zmm1","vmovdqu32 zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 6F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
> +"VMOVDQU64 xmm2/m128, {k}{z}, xmm1","VMOVDQU64 xmm1, {k}{z},
> xmm2/m128","vmovdqu64 xmm1, {k}{z}, xmm2/m128","EVEX.128.F3.0F.W1 7F
> /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
> +"VMOVDQU64 xmm1, {k}{z}, xmm2/m128","VMOVDQU64 xmm2/m128, {k}{z},
> xmm1","vmovdqu64 xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W1 6F
> /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
> +"VMOVDQU64 ymm2/m256, {k}{z}, ymm1","VMOVDQU64 ymm1, {k}{z},
> ymm2/m256","vmovdqu64 ymm1, {k}{z}, ymm2/m256","EVEX.256.F3.0F.W1 7F
> /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
> +"VMOVDQU64 ymm1, {k}{z}, ymm2/m256","VMOVDQU64 ymm2/m256, {k}{z},
> ymm1","vmovdqu64 ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W1 6F
> /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
> +"VMOVDQU64 zmm2/m512, {k}{z}, zmm1","VMOVDQU64 zmm1, {k}{z},
> zmm2/m512","vmovdqu64 zmm1, {k}{z}, zmm2/m512","EVEX.512.F3.0F.W1 7F
> /r","V","V","AVX512F","scale64","w,r,r","Y","512"
> +"VMOVDQU64 zmm1, {k}{z}, zmm2/m512","VMOVDQU64 zmm2/m512, {k}{z},
> zmm1","vmovdqu64 zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W1 6F
> /r","V","V","AVX512F","scale64","w,r,r","Y","512"
> +"VMOVDQU8 xmm2/m128, {k}{z}, xmm1","VMOVDQU8 xmm1, {k}{z},
> xmm2/m128","vmovdqu8 xmm1, {k}{z}, xmm2/m128","EVEX.128.F2.0F.W0 7F
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","Y","128"
> +"VMOVDQU8 xmm1, {k}{z}, xmm2/m128","VMOVDQU8 xmm2/m128, {k}{z},
> xmm1","vmovdqu8 xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.W0 6F
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","Y","128"
> +"VMOVDQU8 ymm2/m256, {k}{z}, ymm1","VMOVDQU8 ymm1, {k}{z},
> ymm2/m256","vmovdqu8 ymm1, {k}{z}, ymm2/m256","EVEX.256.F2.0F.W0 7F
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","Y","256"
> +"VMOVDQU8 ymm1, {k}{z}, ymm2/m256","VMOVDQU8 ymm2/m256, {k}{z},
> ymm1","vmovdqu8 ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W0 6F
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","Y","256"
> +"VMOVDQU8 zmm2/m512, {k}{z}, zmm1","VMOVDQU8 zmm1, {k}{z},
> zmm2/m512","vmovdqu8 zmm1, {k}{z}, zmm2/m512","EVEX.512.F2.0F.W0 7F
> /r","V","V","AVX512BW","scale64","w,r,r","Y","512"
> +"VMOVDQU8 zmm1, {k}{z}, zmm2/m512","VMOVDQU8 zmm2/m512, {k}{z},
> zmm1","vmovdqu8 zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W0 6F
> /r","V","V","AVX512BW","scale64","w,r,r","Y","512"
> +"VMOVHLPS xmm1, xmmV, xmm2","VMOVHLPS xmm2, xmmV, xmm1","vmovhlps xmm2, xmmV, xmm1","EVEX.NDS.128.0F.W0 12 /r","V","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
> +"VMOVHLPS xmm1, xmmV, xmm2","VMOVHLPS xmm2, xmmV, xmm1","vmovhlps xmm2, xmmV, xmm1","VEX.NDS.128.0F.WIG 12 /r","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VMOVHPD xmm1, xmmV, m64","VMOVHPD m64, xmmV, xmm1","vmovhpd m64, xmmV, xmm1","EVEX.NDS.LIG.66.0F.W1 16 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r,r","",""
> +"VMOVHPD xmm1, xmmV, m64","VMOVHPD m64, xmmV, xmm1","vmovhpd m64, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 16 /r","V","V","AVX","modrm_memonly","w,r,r","",""
> +"VMOVHPD m64, xmm1","VMOVHPD xmm1, m64","vmovhpd xmm1, m64","EVEX.LIG.66.0F.W1 17 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","",""
> +"VMOVHPD m64, xmm1","VMOVHPD xmm1, m64","vmovhpd xmm1, m64","VEX.128.66.0F.WIG 17 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVHPS xmm1, xmmV, m64","VMOVHPS m64, xmmV, xmm1","vmovhps m64, xmmV, xmm1","EVEX.NDS.128.0F.W0 16 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r,r","",""
> +"VMOVHPS xmm1, xmmV, m64","VMOVHPS m64, xmmV, xmm1","vmovhps m64, xmmV, xmm1","VEX.NDS.128.0F.WIG 16 /r","V","V","AVX","modrm_memonly","w,r,r","",""
> +"VMOVHPS m64, xmm1","VMOVHPS xmm1, m64","vmovhps xmm1, m64","EVEX.128.0F.W0 17 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","",""
> +"VMOVHPS m64, xmm1","VMOVHPS xmm1, m64","vmovhps xmm1, m64","VEX.128.0F.WIG 17 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVLHPS xmm1, xmmV, xmm2","VMOVLHPS xmm2, xmmV, xmm1","vmovlhps xmm2, xmmV, xmm1","EVEX.NDS.128.0F.W0 16 /r","V","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
> +"VMOVLHPS xmm1, xmmV, xmm2","VMOVLHPS xmm2, xmmV, xmm1","vmovlhps xmm2, xmmV, xmm1","VEX.NDS.128.0F.WIG 16 /r","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VMOVLPD xmm1, xmmV, m64","VMOVLPD m64, xmmV, xmm1","vmovlpd m64, xmmV, xmm1","EVEX.NDS.LIG.66.0F.W1 12 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r,r","",""
> +"VMOVLPD xmm1, xmmV, m64","VMOVLPD m64, xmmV, xmm1","vmovlpd m64, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 12 /r","V","V","AVX","modrm_memonly","w,r,r","",""
> +"VMOVLPD m64, xmm1","VMOVLPD xmm1, m64","vmovlpd xmm1, m64","EVEX.LIG.66.0F.W1 13 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","",""
> +"VMOVLPD m64, xmm1","VMOVLPD xmm1, m64","vmovlpd xmm1, m64","VEX.128.66.0F.WIG 13 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVLPS xmm1, xmmV, m64","VMOVLPS m64, xmmV, xmm1","vmovlps m64, xmmV, xmm1","EVEX.NDS.128.0F.W0 12 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r,r","",""
> +"VMOVLPS xmm1, xmmV, m64","VMOVLPS m64, xmmV, xmm1","vmovlps m64, xmmV, xmm1","VEX.NDS.128.0F.WIG 12 /r","V","V","AVX","modrm_memonly","w,r,r","",""
> +"VMOVLPS m64, xmm1","VMOVLPS xmm1, m64","vmovlps xmm1, m64","EVEX.128.0F.W0 13 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","",""
> +"VMOVLPS m64, xmm1","VMOVLPS xmm1, m64","vmovlps xmm1, m64","VEX.128.0F.WIG 13 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVMSKPD r32, xmm2","VMOVMSKPD xmm2, r32","vmovmskpd xmm2, r32","VEX.128.66.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","",""
> +"VMOVMSKPD r32, ymm2","VMOVMSKPD ymm2, r32","vmovmskpd ymm2, r32","VEX.256.66.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","",""
> +"VMOVMSKPS r32, xmm2","VMOVMSKPS xmm2, r32","vmovmskps xmm2, r32","VEX.128.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","",""
> +"VMOVMSKPS r32, ymm2","VMOVMSKPS ymm2, r32","vmovmskps ymm2, r32","VEX.256.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","",""
> +"VMOVNTDQ m128, xmm1","VMOVNTDQ xmm1, m128","vmovntdq xmm1, m128","EVEX.128.66.0F.W0 E7 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",""
> +"VMOVNTDQ m128, xmm1","VMOVNTDQ xmm1, m128","vmovntdq xmm1, m128","VEX.128.66.0F.WIG E7 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVNTDQ m256, ymm1","VMOVNTDQ ymm1, m256","vmovntdq ymm1, m256","EVEX.256.66.0F.W0 E7 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",""
> +"VMOVNTDQ m256, ymm1","VMOVNTDQ ymm1, m256","vmovntdq ymm1, m256","VEX.256.66.0F.WIG E7 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVNTDQ m512, zmm1","VMOVNTDQ zmm1, m512","vmovntdq zmm1, m512","EVEX.512.66.0F.W0 E7 /r","V","V","AVX512F","modrm_memonly,scale64","w,r","",""
> +"VMOVNTDQA xmm1, m128","VMOVNTDQA m128, xmm1","vmovntdqa m128, xmm1","EVEX.128.66.0F38.W0 2A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",""
> +"VMOVNTDQA xmm1, m128","VMOVNTDQA m128, xmm1","vmovntdqa m128, xmm1","VEX.128.66.0F38.WIG 2A /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVNTDQA ymm1, m256","VMOVNTDQA m256, ymm1","vmovntdqa m256, ymm1","EVEX.256.66.0F38.W0 2A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",""
> +"VMOVNTDQA ymm1, m256","VMOVNTDQA m256, ymm1","vmovntdqa m256, ymm1","VEX.256.66.0F38.WIG 2A /r","V","V","AVX2","modrm_memonly","w,r","",""
> +"VMOVNTDQA zmm1, m512","VMOVNTDQA m512, zmm1","vmovntdqa m512, zmm1","EVEX.512.66.0F38.W0 2A /r","V","V","AVX512F","modrm_memonly,scale64","w,r","",""
> +"VMOVNTPD m128, xmm1","VMOVNTPD xmm1, m128","vmovntpd xmm1, m128","EVEX.128.66.0F.W1 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",""
> +"VMOVNTPD m128, xmm1","VMOVNTPD xmm1, m128","vmovntpd xmm1, m128","VEX.128.66.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVNTPD m256, ymm1","VMOVNTPD ymm1, m256","vmovntpd ymm1, m256","EVEX.256.66.0F.W1 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",""
> +"VMOVNTPD m256, ymm1","VMOVNTPD ymm1, m256","vmovntpd ymm1, m256","VEX.256.66.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVNTPD m512, zmm1","VMOVNTPD zmm1, m512","vmovntpd zmm1, m512","EVEX.512.66.0F.W1 2B /r","V","V","AVX512F","modrm_memonly,scale64","w,r","",""
> +"VMOVNTPS m128, xmm1","VMOVNTPS xmm1, m128","vmovntps xmm1, m128","EVEX.128.0F.W0 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",""
> +"VMOVNTPS m128, xmm1","VMOVNTPS xmm1, m128","vmovntps xmm1, m128","VEX.128.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVNTPS m256, ymm1","VMOVNTPS ymm1, m256","vmovntps ymm1, m256","EVEX.256.0F.W0 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",""
> +"VMOVNTPS m256, ymm1","VMOVNTPS ymm1, m256","vmovntps ymm1, m256","VEX.256.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVNTPS m512, zmm1","VMOVNTPS zmm1, m512","vmovntps zmm1, m512","EVEX.512.0F.W0 2B /r","V","V","AVX512F","modrm_memonly,scale64","w,r","",""
> +"VMOVQ xmm1, r/m64","VMOVQ r/m64, xmm1","vmovq r/m64, xmm1","EVEX.128.66.0F.W1 6E /r","N.S.","V","AVX512F+AVX512VL","scale8","w,r","",""
> +"VMOVQ xmm1, r/m64","VMOVQ r/m64, xmm1","vmovq r/m64, xmm1","VEX.128.66.0F.W1 6E /r","N.S.","V","AVX","","w,r","",""
> +"VMOVQ r/m64, xmm1","VMOVQ xmm1, r/m64","vmovq xmm1, r/m64","EVEX.128.66.0F.W1 7E /r","N.S.","V","AVX512F+AVX512VL","scale8","w,r","",""
> +"VMOVQ r/m64, xmm1","VMOVQ xmm1, r/m64","vmovq xmm1, r/m64","VEX.128.66.0F.W1 7E /r","N.S.","V","AVX","","w,r","",""
> +"VMOVQ xmm2/m64, xmm1","VMOVQ xmm1, xmm2/m64","vmovq xmm1, xmm2/m64","EVEX.LIG.66.0F.W1 D6 /r","V","V","AVX512F+AVX512VL","scale8","w,r","",""
> +"VMOVQ xmm2/m64, xmm1","VMOVQ xmm1, xmm2/m64","vmovq xmm1, xmm2/m64","VEX.128.66.0F.WIG D6 /r","V","V","AVX","","w,r","",""
> +"VMOVQ xmm1, xmm2/m64","VMOVQ xmm2/m64, xmm1","vmovq xmm2/m64, xmm1","EVEX.LIG.F3.0F.W1 7E /r","V","V","AVX512F+AVX512VL","scale8","w,r","",""
> +"VMOVQ xmm1, xmm2/m64","VMOVQ xmm2/m64, xmm1","vmovq xmm2/m64, xmm1","VEX.128.F3.0F.WIG 7E /r","V","V","AVX","","w,r","",""
> +"VMOVSD xmm1, m64","VMOVSD m64, xmm1","vmovsd m64, xmm1","VEX.LIG.F2.0F.WIG 10 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVSD xmm1, {k}{z}, m64","VMOVSD m64, {k}{z}, xmm1","vmovsd m64, {k}{z}, xmm1","EVEX.LIG.F2.0F.W1 10 /r","V","V","AVX512F","modrm_memonly,scale8","w,r,r","",""
> +"VMOVSD m64, xmm1","VMOVSD xmm1, m64","vmovsd xmm1, m64","VEX.LIG.F2.0F.WIG 11 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVSD xmm2, xmmV, xmm1","VMOVSD xmm1, xmmV, xmm2","vmovsd xmm1, xmmV, xmm2","VEX.NDS.LIG.F2.0F.WIG 11 /r","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VMOVSD xmm2, {k}{z}, xmmV, xmm1","VMOVSD xmm1, xmmV, {k}{z},
> xmm2","vmovsd xmm1, xmmV, {k}{z}, xmm2","EVEX.NDS.LIG.F2.0F.W1 11
> /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMOVSD m64, {k}, xmm1","VMOVSD xmm1, {k}, m64","vmovsd xmm1, {k}, m64","EVEX.LIG.F2.0F.W1 11 /r","V","V","AVX512F","modrm_memonly,scale8","w,r,r","",""
> +"VMOVSD xmm1, xmmV, xmm2","VMOVSD xmm2, xmmV, xmm1","vmovsd xmm2, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 10 /r","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VMOVSD xmm1, {k}{z}, xmmV, xmm2","VMOVSD xmm2, xmmV, {k}{z},
> xmm1","vmovsd xmm2, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 10
> /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMOVSHDUP xmm1, xmm2/m128","VMOVSHDUP xmm2/m128, xmm1","vmovshdup xmm2/m128, xmm1","VEX.128.F3.0F.WIG 16 /r","V","V","AVX","","w,r","",""
> +"VMOVSHDUP xmm1, {k}{z}, xmm2/m128","VMOVSHDUP xmm2/m128, {k}{z},
> xmm1","vmovshdup xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 16
> /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VMOVSHDUP ymm1, ymm2/m256","VMOVSHDUP ymm2/m256, ymm1","vmovshdup ymm2/m256, ymm1","VEX.256.F3.0F.WIG 16 /r","V","V","AVX","","w,r","",""
> +"VMOVSHDUP ymm1, {k}{z}, ymm2/m256","VMOVSHDUP ymm2/m256, {k}{z},
> ymm1","vmovshdup ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 16
> /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
> +"VMOVSHDUP zmm1, {k}{z}, zmm2/m512","VMOVSHDUP zmm2/m512, {k}{z},
> zmm1","vmovshdup zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 16
> /r","V","V","AVX512F","scale64","w,r,r","",""
> +"VMOVSLDUP xmm1, xmm2/m128","VMOVSLDUP xmm2/m128, xmm1","vmovsldup xmm2/m128, xmm1","VEX.128.F3.0F.WIG 12 /r","V","V","AVX","","w,r","",""
> +"VMOVSLDUP xmm1, {k}{z}, xmm2/m128","VMOVSLDUP xmm2/m128, {k}{z},
> xmm1","vmovsldup xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 12
> /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VMOVSLDUP ymm1, ymm2/m256","VMOVSLDUP ymm2/m256, ymm1","vmovsldup ymm2/m256, ymm1","VEX.256.F3.0F.WIG 12 /r","V","V","AVX","","w,r","",""
> +"VMOVSLDUP ymm1, {k}{z}, ymm2/m256","VMOVSLDUP ymm2/m256, {k}{z},
> ymm1","vmovsldup ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 12
> /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
> +"VMOVSLDUP zmm1, {k}{z}, zmm2/m512","VMOVSLDUP zmm2/m512, {k}{z},
> zmm1","vmovsldup zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 12
> /r","V","V","AVX512F","scale64","w,r,r","",""
> +"VMOVSS xmm1, m32","VMOVSS m32, xmm1","vmovss m32, xmm1","VEX.LIG.F3.0F.WIG 10 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVSS xmm1, {k}{z}, m32","VMOVSS m32, {k}{z}, xmm1","vmovss m32, {k}{z}, xmm1","EVEX.LIG.F3.0F.W0 10 /r","V","V","AVX512F","modrm_memonly,scale4","w,r,r","",""
> +"VMOVSS m32, xmm1","VMOVSS xmm1, m32","vmovss xmm1, m32","VEX.LIG.F3.0F.WIG 11 /r","V","V","AVX","modrm_memonly","w,r","",""
> +"VMOVSS xmm2, xmmV, xmm1","VMOVSS xmm1, xmmV, xmm2","vmovss xmm1, xmmV, xmm2","VEX.NDS.LIG.F3.0F.WIG 11 /r","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VMOVSS xmm2, {k}{z}, xmmV, xmm1","VMOVSS xmm1, xmmV, {k}{z},
> xmm2","vmovss xmm1, xmmV, {k}{z}, xmm2","EVEX.NDS.LIG.F3.0F.W0 11
> /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMOVSS m32, {k}, xmm1","VMOVSS xmm1, {k}, m32","vmovss xmm1, {k}, m32","EVEX.LIG.F3.0F.W0 11 /r","V","V","AVX512F","modrm_memonly,scale4","w,r,r","",""
> +"VMOVSS xmm1, xmmV, xmm2","VMOVSS xmm2, xmmV, xmm1","vmovss xmm2, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 10 /r","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VMOVSS xmm1, {k}{z}, xmmV, xmm2","VMOVSS xmm2, xmmV, {k}{z},
> xmm1","vmovss xmm2, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 10
> /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMOVUPD xmm2/m128, xmm1","VMOVUPD xmm1, xmm2/m128","vmovupd xmm1, xmm2/m128","VEX.128.66.0F.WIG 11 /r","V","V","AVX","","w,r","",""
> +"VMOVUPD xmm2/m128, {k}{z}, xmm1","VMOVUPD xmm1, {k}{z},
> xmm2/m128","vmovupd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 11
> /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VMOVUPD xmm1, xmm2/m128","VMOVUPD xmm2/m128, xmm1","vmovupd xmm2/m128, xmm1","VEX.128.66.0F.WIG 10 /r","V","V","AVX","","w,r","",""
> +"VMOVUPD xmm1, {k}{z}, xmm2/m128","VMOVUPD xmm2/m128, {k}{z},
> xmm1","vmovupd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 10
> /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VMOVUPD ymm2/m256, ymm1","VMOVUPD ymm1, ymm2/m256","vmovupd ymm1, ymm2/m256","VEX.256.66.0F.WIG 11 /r","V","V","AVX","","w,r","",""
> +"VMOVUPD ymm2/m256, {k}{z}, ymm1","VMOVUPD ymm1, {k}{z},
> ymm2/m256","vmovupd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 11
> /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
> +"VMOVUPD ymm1, ymm2/m256","VMOVUPD ymm2/m256, ymm1","vmovupd ymm2/m256, ymm1","VEX.256.66.0F.WIG 10 /r","V","V","AVX","","w,r","",""
> +"VMOVUPD ymm1, {k}{z}, ymm2/m256","VMOVUPD ymm2/m256, {k}{z},
> ymm1","vmovupd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 10
> /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
> +"VMOVUPD zmm2/m512, {k}{z}, zmm1","VMOVUPD zmm1, {k}{z}, zmm2/m512","vmovupd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 11 /r","V","V","AVX512F","scale64","w,r,r","",""
> +"VMOVUPD zmm1, {k}{z}, zmm2/m512","VMOVUPD zmm2/m512, {k}{z}, zmm1","vmovupd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 10 /r","V","V","AVX512F","scale64","w,r,r","",""
> +"VMOVUPS xmm2/m128, xmm1","VMOVUPS xmm1, xmm2/m128","vmovups xmm1, xmm2/m128","VEX.128.0F.WIG 11 /r","V","V","AVX","","w,r","",""
> +"VMOVUPS xmm2/m128, {k}{z}, xmm1","VMOVUPS xmm1, {k}{z},
> xmm2/m128","vmovups xmm1, {k}{z}, xmm2/m128","EVEX.128.0F.W0 11
> /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VMOVUPS xmm1, xmm2/m128","VMOVUPS xmm2/m128, xmm1","vmovups xmm2/m128, xmm1","VEX.128.0F.WIG 10 /r","V","V","AVX","","w,r","",""
> +"VMOVUPS xmm1, {k}{z}, xmm2/m128","VMOVUPS xmm2/m128, {k}{z},
> xmm1","vmovups xmm2/m128, {k}{z}, xmm1","EVEX.128.0F.W0 10
> /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VMOVUPS ymm2/m256, ymm1","VMOVUPS ymm1, ymm2/m256","vmovups ymm1, ymm2/m256","VEX.256.0F.WIG 11 /r","V","V","AVX","","w,r","",""
> +"VMOVUPS ymm2/m256, {k}{z}, ymm1","VMOVUPS ymm1, {k}{z},
> ymm2/m256","vmovups ymm1, {k}{z}, ymm2/m256","EVEX.256.0F.W0 11
> /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
> +"VMOVUPS ymm1, ymm2/m256","VMOVUPS ymm2/m256, ymm1","vmovups ymm2/m256, ymm1","VEX.256.0F.WIG 10 /r","V","V","AVX","","w,r","",""
> +"VMOVUPS ymm1, {k}{z}, ymm2/m256","VMOVUPS ymm2/m256, {k}{z},
> ymm1","vmovups ymm2/m256, {k}{z}, ymm1","EVEX.256.0F.W0 10
> /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
> +"VMOVUPS zmm2/m512, {k}{z}, zmm1","VMOVUPS zmm1, {k}{z}, zmm2/m512","vmovups zmm1, {k}{z}, zmm2/m512","EVEX.512.0F.W0 11 /r","V","V","AVX512F","scale64","w,r,r","",""
> +"VMOVUPS zmm1, {k}{z}, zmm2/m512","VMOVUPS zmm2/m512, {k}{z}, zmm1","vmovups zmm2/m512, {k}{z}, zmm1","EVEX.512.0F.W0 10 /r","V","V","AVX512F","scale64","w,r,r","",""
> +"VMPSADBW xmm1, xmmV, xmm2/m128, imm8u","VMPSADBW imm8u, xmm2/m128,
> xmmV, xmm1","vmpsadbw imm8u, xmm2/m128, xmmV,
> xmm1","VEX.NDS.128.66.0F3A.WIG 42 /r
> ib","V","V","AVX","","w,r,r,r","",""
> +"VMPSADBW ymm1, ymmV, ymm2/m256, imm8u","VMPSADBW imm8u, ymm2/m256,
> ymmV, ymm1","vmpsadbw imm8u, ymm2/m256, ymmV,
> ymm1","VEX.NDS.256.66.0F3A.WIG 42 /r
> ib","V","V","AVX2","","w,r,r,r","",""
> +"VMPTRLD m64","VMPTRLD m64","vmptrld m64","0F C7 /6","V","V","VTX","modrm_memonly","r","",""
> +"VMPTRST m64","VMPTRST m64","vmptrst m64","0F C7 /7","V","V","VTX","modrm_memonly","w","",""
> +"VMREAD r/m32, r32","VMREAD r32, r/m32","vmread r32, r/m32","0F 78 /r","V","N.S.","VTX","","rw,r","",""
> +"VMREAD r/m64, r64","VMREAD r64, r/m64","vmread r64, r/m64","0F 78 /r","N.S.","V","VTX","default64","rw,r","",""
> +"VMRESUME","VMRESUME","vmresume","0F 01 C3","V","V","VTX","","","",""
> +"VMRUN EAX","VMRUNL EAX","vmrunl EAX","0F 01 D8","V","V","SVM","amd,modrm_regonly,operand32","r","Y","32"
> +"VMRUN RAX","VMRUNQ RAX","vmrunq RAX","REX.W 0F 01 D8","N.S.","V","SVM","amd,modrm_regonly","r","Y","64"
> +"VMRUN AX","VMRUNW AX","vmrunw AX","0F 01 D8","V","V","SVM","amd,modrm_regonly,operand16","r","Y","16"
> +"VMSAVE","VMSAVE","vmsave","0F 01 DB","V","V","SVM","amd","","",""
> +"VMULPD xmm1, xmmV, xmm2/m128","VMULPD xmm2/m128, xmmV, xmm1","vmulpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
> +"VMULPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMULPD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vmulpd xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 59
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VMULPD ymm1, ymmV, ymm2/m256","VMULPD ymm2/m256, ymmV, ymm1","vmulpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
> +"VMULPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMULPD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vmulpd ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 59
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VMULPD zmm1{er}, {k}{z}, zmmV, zmm2","VMULPD zmm2, zmmV, {k}{z},
> zmm1{er}","vmulpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1
> 59 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMULPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMULPD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vmulpd zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 59
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VMULPS xmm1, xmmV, xmm2/m128","VMULPS xmm2/m128, xmmV, xmm1","vmulps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
> +"VMULPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMULPS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vmulps xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 59
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VMULPS ymm1, ymmV, ymm2/m256","VMULPS ymm2/m256, ymmV, ymm1","vmulps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
> +"VMULPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMULPS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vmulps ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 59
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VMULPS zmm1{er}, {k}{z}, zmmV, zmm2","VMULPS zmm2, zmmV, {k}{z},
> zmm1{er}","vmulps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 59
> /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMULPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMULPS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vmulps zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 59
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VMULSD xmm1{er}, {k}{z}, xmmV, xmm2","VMULSD xmm2, xmmV, {k}{z},
> xmm1{er}","vmulsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1
> 59 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMULSD xmm1, xmmV, xmm2/m64","VMULSD xmm2/m64, xmmV, xmm1","vmulsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
> +"VMULSD xmm1, {k}{z}, xmmV, xmm2/m64","VMULSD xmm2/m64, xmmV, {k}{z},
> xmm1","vmulsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 59
> /r","V","V","AVX512F","scale8","w,r,r,r","",""
> +"VMULSS xmm1{er}, {k}{z}, xmmV, xmm2","VMULSS xmm2, xmmV, {k}{z},
> xmm1{er}","vmulss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0
> 59 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VMULSS xmm1, xmmV, xmm2/m32","VMULSS xmm2/m32, xmmV, xmm1","vmulss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
> +"VMULSS xmm1, {k}{z}, xmmV, xmm2/m32","VMULSS xmm2/m32, xmmV, {k}{z},
> xmm1","vmulss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 59
> /r","V","V","AVX512F","scale4","w,r,r,r","",""
> +"VMWRITE r32, r/m32","VMWRITE r/m32, r32","vmwrite r/m32, r32","0F 79 /r","V","N.S.","VTX","","r,r","",""
> +"VMWRITE r64, r/m64","VMWRITE r/m64, r64","vmwrite r/m64, r64","0F 79 /r","N.S.","V","VTX","default64","r,r","",""
> +"VMXOFF","VMXOFF","vmxoff","0F 01 C4","V","V","VTX","","","",""
> +"VMXON m64","VMXON m64","vmxon m64","F3 0F C7 /6","V","V","VTX","modrm_memonly","r","",""
> +"VORPD xmm1, xmmV, xmm2/m128","VORPD xmm2/m128, xmmV, xmm1","vorpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 56 /r","V","V","AVX","","w,r,r","",""
> +"VORPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VORPD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vorpd xmm2/m128/m64bcst, xmmV,
> {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 56
> /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VORPD ymm1, ymmV, ymm2/m256","VORPD ymm2/m256, ymmV, ymm1","vorpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 56 /r","V","V","AVX","","w,r,r","",""
> +"VORPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VORPD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vorpd ymm2/m256/m64bcst, ymmV,
> {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 56
> /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VORPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VORPD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vorpd zmm2/m512/m64bcst, zmmV,
> {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 56
> /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
> +"VORPS xmm1, xmmV, xmm2/m128","VORPS xmm2/m128, xmmV, xmm1","vorps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 56 /r","V","V","AVX","","w,r,r","",""
> +"VORPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VORPS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vorps xmm2/m128/m32bcst, xmmV,
> {k}{z}, xmm1","EVEX.NDS.128.0F.W0 56
> /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VORPS ymm1, ymmV, ymm2/m256","VORPS ymm2/m256, ymmV, ymm1","vorps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 56 /r","V","V","AVX","","w,r,r","",""
> +"VORPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VORPS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vorps ymm2/m256/m32bcst, ymmV,
> {k}{z}, ymm1","EVEX.NDS.256.0F.W0 56
> /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VORPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VORPS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vorps zmm2/m512/m32bcst, zmmV,
> {k}{z}, zmm1","EVEX.NDS.512.0F.W0 56
> /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
> +"VP4DPWSSD zmm1, {k}{z}, zmmV+3, m128","VP4DPWSSD m128, zmmV+3,
> {k}{z}, zmm1","vp4dpwssd m128, zmmV+3, {k}{z},
> zmm1","EVEX.DDS.512.F2.0F38.W0 52
> /r","V","V","AVX512_4VNNIW","modrm_memonly,scale16","rw,r,r,r","",""
> +"VP4DPWSSDS zmm1, {k}{z}, zmmV+3, m128","VP4DPWSSDS m128, zmmV+3,
> {k}{z}, zmm1","vp4dpwssds m128, zmmV+3, {k}{z},
> zmm1","EVEX.DDS.512.F2.0F38.W0 53
> /r","V","V","AVX512_4VNNIW","modrm_memonly,scale16","rw,r,r,r","",""
> +"VPABSB xmm1, xmm2/m128","VPABSB xmm2/m128, xmm1","vpabsb xmm2/m128, xmm1","VEX.128.66.0F38.WIG 1C /r","V","V","AVX","","w,r","",""
> +"VPABSB xmm1, {k}{z}, xmm2/m128","VPABSB xmm2/m128, {k}{z},
> xmm1","vpabsb xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 1C
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
> +"VPABSB ymm1, ymm2/m256","VPABSB ymm2/m256, ymm1","vpabsb ymm2/m256, ymm1","VEX.256.66.0F38.WIG 1C /r","V","V","AVX2","","w,r","",""
> +"VPABSB ymm1, {k}{z}, ymm2/m256","VPABSB ymm2/m256, {k}{z},
> ymm1","vpabsb ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 1C
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
> +"VPABSB zmm1, {k}{z}, zmm2/m512","VPABSB zmm2/m512, {k}{z}, zmm1","vpabsb zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 1C /r","V","V","AVX512BW","scale64","w,r,r","",""
> +"VPABSD xmm1, xmm2/m128","VPABSD xmm2/m128, xmm1","vpabsd xmm2/m128, xmm1","VEX.128.66.0F38.WIG 1E /r","V","V","AVX","","w,r","",""
> +"VPABSD xmm1, {k}{z}, xmm2/m128/m32bcst","VPABSD xmm2/m128/m32bcst,
> {k}{z}, xmm1","vpabsd xmm2/m128/m32bcst, {k}{z},
> xmm1","EVEX.128.66.0F38.W0 1E
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VPABSD ymm1, ymm2/m256","VPABSD ymm2/m256, ymm1","vpabsd ymm2/m256, ymm1","VEX.256.66.0F38.WIG 1E /r","V","V","AVX2","","w,r","",""
> +"VPABSD ymm1, {k}{z}, ymm2/m256/m32bcst","VPABSD ymm2/m256/m32bcst,
> {k}{z}, ymm1","vpabsd ymm2/m256/m32bcst, {k}{z},
> ymm1","EVEX.256.66.0F38.W0 1E
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VPABSD zmm1, {k}{z}, zmm2/m512/m32bcst","VPABSD zmm2/m512/m32bcst,
> {k}{z}, zmm1","vpabsd zmm2/m512/m32bcst, {k}{z},
> zmm1","EVEX.512.66.0F38.W0 1E
> /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
> +"VPABSQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPABSQ xmm2/m128/m64bcst,
> {k}{z}, xmm1","vpabsq xmm2/m128/m64bcst, {k}{z},
> xmm1","EVEX.128.66.0F38.W1 1F
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VPABSQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPABSQ ymm2/m256/m64bcst,
> {k}{z}, ymm1","vpabsq ymm2/m256/m64bcst, {k}{z},
> ymm1","EVEX.256.66.0F38.W1 1F
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VPABSQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPABSQ zmm2/m512/m64bcst,
> {k}{z}, zmm1","vpabsq zmm2/m512/m64bcst, {k}{z},
> zmm1","EVEX.512.66.0F38.W1 1F
> /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
> +"VPABSW xmm1, xmm2/m128","VPABSW xmm2/m128, xmm1","vpabsw xmm2/m128, xmm1","VEX.128.66.0F38.WIG 1D /r","V","V","AVX","","w,r","",""
> +"VPABSW xmm1, {k}{z}, xmm2/m128","VPABSW xmm2/m128, {k}{z},
> xmm1","vpabsw xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 1D
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
> +"VPABSW ymm1, ymm2/m256","VPABSW ymm2/m256, ymm1","vpabsw ymm2/m256, ymm1","VEX.256.66.0F38.WIG 1D /r","V","V","AVX2","","w,r","",""
> +"VPABSW ymm1, {k}{z}, ymm2/m256","VPABSW ymm2/m256, {k}{z},
> ymm1","vpabsw ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 1D
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
> +"VPABSW zmm1, {k}{z}, zmm2/m512","VPABSW zmm2/m512, {k}{z}, zmm1","vpabsw zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 1D /r","V","V","AVX512BW","scale64","w,r,r","",""
> +"VPACKSSDW xmm1, xmmV, xmm2/m128","VPACKSSDW xmm2/m128, xmmV, xmm1","vpackssdw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6B /r","V","V","AVX","","w,r,r","",""
> +"VPACKSSDW xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPACKSSDW
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpackssdw xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 6B
> /r","V","V","AVX512BW+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPACKSSDW ymm1, ymmV, ymm2/m256","VPACKSSDW ymm2/m256, ymmV, ymm1","vpackssdw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6B /r","V","V","AVX2","","w,r,r","",""
> +"VPACKSSDW ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPACKSSDW
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpackssdw ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 6B
> /r","V","V","AVX512BW+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPACKSSDW zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPACKSSDW
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpackssdw zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 6B
> /r","V","V","AVX512BW","bscale4,scale64","w,r,r,r","",""
> +"VPACKSSWB xmm1, xmmV, xmm2/m128","VPACKSSWB xmm2/m128, xmmV, xmm1","vpacksswb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 63 /r","V","V","AVX","","w,r,r","",""
> +"VPACKSSWB xmm1, {k}{z}, xmmV, xmm2/m128","VPACKSSWB xmm2/m128, xmmV,
> {k}{z}, xmm1","vpacksswb xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG 63
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPACKSSWB ymm1, ymmV, ymm2/m256","VPACKSSWB ymm2/m256, ymmV, ymm1","vpacksswb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 63 /r","V","V","AVX2","","w,r,r","",""
> +"VPACKSSWB ymm1, {k}{z}, ymmV, ymm2/m256","VPACKSSWB ymm2/m256, ymmV,
> {k}{z}, ymm1","vpacksswb ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG 63
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPACKSSWB zmm1, {k}{z}, zmmV, zmm2/m512","VPACKSSWB zmm2/m512, zmmV,
> {k}{z}, zmm1","vpacksswb zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG 63
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPACKUSDW xmm1, xmmV, xmm2/m128","VPACKUSDW xmm2/m128, xmmV, xmm1","vpackusdw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 2B /r","V","V","AVX","","w,r,r","",""
> +"VPACKUSDW xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPACKUSDW
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpackusdw xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 2B
> /r","V","V","AVX512BW+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPACKUSDW ymm1, ymmV, ymm2/m256","VPACKUSDW ymm2/m256, ymmV, ymm1","vpackusdw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 2B /r","V","V","AVX2","","w,r,r","",""
> +"VPACKUSDW ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPACKUSDW
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpackusdw ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 2B
> /r","V","V","AVX512BW+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPACKUSDW zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPACKUSDW
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpackusdw zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 2B
> /r","V","V","AVX512BW","bscale4,scale64","w,r,r,r","",""
> +"VPACKUSWB xmm1, xmmV, xmm2/m128","VPACKUSWB xmm2/m128, xmmV, xmm1","vpackuswb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 67 /r","V","V","AVX","","w,r,r","",""
> +"VPACKUSWB xmm1, {k}{z}, xmmV, xmm2/m128","VPACKUSWB xmm2/m128, xmmV, {k}{z}, xmm1","vpackuswb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG 67 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPACKUSWB ymm1, ymmV, ymm2/m256","VPACKUSWB ymm2/m256, ymmV, ymm1","vpackuswb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 67 /r","V","V","AVX2","","w,r,r","",""
> +"VPACKUSWB ymm1, {k}{z}, ymmV, ymm2/m256","VPACKUSWB ymm2/m256, ymmV, {k}{z}, ymm1","vpackuswb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG 67 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPACKUSWB zmm1, {k}{z}, zmmV, zmm2/m512","VPACKUSWB zmm2/m512, zmmV, {k}{z}, zmm1","vpackuswb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG 67 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPADDB xmm1, xmmV, xmm2/m128","VPADDB xmm2/m128, xmmV, xmm1","vpaddb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FC /r","V","V","AVX","","w,r,r","",""
> +"VPADDB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDB xmm2/m128, xmmV, {k}{z}, xmm1","vpaddb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG FC /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPADDB ymm1, ymmV, ymm2/m256","VPADDB ymm2/m256, ymmV, ymm1","vpaddb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FC /r","V","V","AVX2","","w,r,r","",""
> +"VPADDB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDB ymm2/m256, ymmV, {k}{z}, ymm1","vpaddb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG FC /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPADDB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDB zmm2/m512, zmmV, {k}{z}, zmm1","vpaddb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG FC /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPADDD xmm1, xmmV, xmm2/m128","VPADDD xmm2/m128, xmmV, xmm1","vpaddd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FE /r","V","V","AVX","","w,r,r","",""
> +"VPADDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPADDD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpaddd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 FE /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPADDD ymm1, ymmV, ymm2/m256","VPADDD ymm2/m256, ymmV, ymm1","vpaddd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FE /r","V","V","AVX2","","w,r,r","",""
> +"VPADDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPADDD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpaddd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 FE /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPADDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPADDD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpaddd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 FE /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPADDQ xmm1, xmmV, xmm2/m128","VPADDQ xmm2/m128, xmmV, xmm1","vpaddq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D4 /r","V","V","AVX","","w,r,r","",""
> +"VPADDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPADDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpaddq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 D4 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPADDQ ymm1, ymmV, ymm2/m256","VPADDQ ymm2/m256, ymmV, ymm1","vpaddq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D4 /r","V","V","AVX2","","w,r,r","",""
> +"VPADDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPADDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpaddq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 D4 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPADDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPADDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpaddq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 D4 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPADDSB xmm1, xmmV, xmm2/m128","VPADDSB xmm2/m128, xmmV, xmm1","vpaddsb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EC /r","V","V","AVX","","w,r,r","",""
> +"VPADDSB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDSB xmm2/m128, xmmV, {k}{z}, xmm1","vpaddsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EC /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPADDSB ymm1, ymmV, ymm2/m256","VPADDSB ymm2/m256, ymmV, ymm1","vpaddsb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EC /r","V","V","AVX2","","w,r,r","",""
> +"VPADDSB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDSB ymm2/m256, ymmV, {k}{z}, ymm1","vpaddsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EC /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPADDSB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDSB zmm2/m512, zmmV, {k}{z}, zmm1","vpaddsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EC /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPADDSW xmm1, xmmV, xmm2/m128","VPADDSW xmm2/m128, xmmV, xmm1","vpaddsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG ED /r","V","V","AVX","","w,r,r","",""
> +"VPADDSW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDSW xmm2/m128, xmmV, {k}{z}, xmm1","vpaddsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG ED /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPADDSW ymm1, ymmV, ymm2/m256","VPADDSW ymm2/m256, ymmV, ymm1","vpaddsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG ED /r","V","V","AVX2","","w,r,r","",""
> +"VPADDSW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDSW ymm2/m256, ymmV, {k}{z}, ymm1","vpaddsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG ED /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPADDSW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDSW zmm2/m512, zmmV, {k}{z}, zmm1","vpaddsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG ED /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPADDUSB xmm1, xmmV, xmm2/m128","VPADDUSB xmm2/m128, xmmV, xmm1","vpaddusb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DC /r","V","V","AVX","","w,r,r","",""
> +"VPADDUSB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDUSB xmm2/m128, xmmV, {k}{z}, xmm1","vpaddusb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DC /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPADDUSB ymm1, ymmV, ymm2/m256","VPADDUSB ymm2/m256, ymmV, ymm1","vpaddusb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DC /r","V","V","AVX2","","w,r,r","",""
> +"VPADDUSB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDUSB ymm2/m256, ymmV, {k}{z}, ymm1","vpaddusb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DC /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPADDUSB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDUSB zmm2/m512, zmmV, {k}{z}, zmm1","vpaddusb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DC /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPADDUSW xmm1, xmmV, xmm2/m128","VPADDUSW xmm2/m128, xmmV, xmm1","vpaddusw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DD /r","V","V","AVX","","w,r,r","",""
> +"VPADDUSW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDUSW xmm2/m128, xmmV, {k}{z}, xmm1","vpaddusw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DD /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPADDUSW ymm1, ymmV, ymm2/m256","VPADDUSW ymm2/m256, ymmV, ymm1","vpaddusw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DD /r","V","V","AVX2","","w,r,r","",""
> +"VPADDUSW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDUSW ymm2/m256, ymmV, {k}{z}, ymm1","vpaddusw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DD /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPADDUSW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDUSW zmm2/m512, zmmV, {k}{z}, zmm1","vpaddusw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DD /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPADDW xmm1, xmmV, xmm2/m128","VPADDW xmm2/m128, xmmV, xmm1","vpaddw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FD /r","V","V","AVX","","w,r,r","",""
> +"VPADDW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDW xmm2/m128, xmmV, {k}{z}, xmm1","vpaddw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG FD /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPADDW ymm1, ymmV, ymm2/m256","VPADDW ymm2/m256, ymmV, ymm1","vpaddw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FD /r","V","V","AVX2","","w,r,r","",""
> +"VPADDW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDW ymm2/m256, ymmV, {k}{z}, ymm1","vpaddw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG FD /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPADDW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDW zmm2/m512, zmmV, {k}{z}, zmm1","vpaddw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG FD /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPALIGNR xmm1, xmmV, xmm2/m128, imm8u","VPALIGNR imm8u, xmm2/m128, xmmV, xmm1","vpalignr imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0F /r ib","V","V","AVX","","w,r,r,r","",""
> +"VPALIGNR xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPALIGNR imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","vpalignr imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.WIG 0F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
> +"VPALIGNR ymm1, ymmV, ymm2/m256, imm8u","VPALIGNR imm8u, ymm2/m256, ymmV, ymm1","vpalignr imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0F /r ib","V","V","AVX2","","w,r,r,r","",""
> +"VPALIGNR ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPALIGNR imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","vpalignr imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.WIG 0F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
> +"VPALIGNR zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPALIGNR imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","vpalignr imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.WIG 0F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
> +"VPAND xmm1, xmmV, xmm2/m128","VPAND xmm2/m128, xmmV, xmm1","vpand xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DB /r","V","V","AVX","","w,r,r","",""
> +"VPAND ymm1, ymmV, ymm2/m256","VPAND ymm2/m256, ymmV, ymm1","vpand ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DB /r","V","V","AVX2","","w,r,r","",""
> +"VPANDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPANDD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpandd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 DB /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPANDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPANDD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpandd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 DB /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPANDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPANDD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpandd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 DB /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPANDN xmm1, xmmV, xmm2/m128","VPANDN xmm2/m128, xmmV, xmm1","vpandn xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DF /r","V","V","AVX","","w,r,r","",""
> +"VPANDN ymm1, ymmV, ymm2/m256","VPANDN ymm2/m256, ymmV, ymm1","vpandn ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DF /r","V","V","AVX2","","w,r,r","",""
> +"VPANDND xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPANDND xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpandnd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 DF /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPANDND ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPANDND ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpandnd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 DF /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPANDND zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPANDND zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpandnd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 DF /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPANDNQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPANDNQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpandnq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 DF /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPANDNQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPANDNQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpandnq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 DF /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPANDNQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPANDNQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpandnq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 DF /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPANDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPANDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpandq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 DB /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPANDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPANDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpandq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 DB /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPANDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPANDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpandq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 DB /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPAVGB xmm1, xmmV, xmm2/m128","VPAVGB xmm2/m128, xmmV, xmm1","vpavgb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E0 /r","V","V","AVX","","w,r,r","",""
> +"VPAVGB xmm1, {k}{z}, xmmV, xmm2/m128","VPAVGB xmm2/m128, xmmV, {k}{z}, xmm1","vpavgb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E0 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPAVGB ymm1, ymmV, ymm2/m256","VPAVGB ymm2/m256, ymmV, ymm1","vpavgb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E0 /r","V","V","AVX2","","w,r,r","",""
> +"VPAVGB ymm1, {k}{z}, ymmV, ymm2/m256","VPAVGB ymm2/m256, ymmV, {k}{z}, ymm1","vpavgb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E0 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPAVGB zmm1, {k}{z}, zmmV, zmm2/m512","VPAVGB zmm2/m512, zmmV, {k}{z}, zmm1","vpavgb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E0 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPAVGW xmm1, xmmV, xmm2/m128","VPAVGW xmm2/m128, xmmV, xmm1","vpavgw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E3 /r","V","V","AVX","","w,r,r","",""
> +"VPAVGW xmm1, {k}{z}, xmmV, xmm2/m128","VPAVGW xmm2/m128, xmmV, {k}{z}, xmm1","vpavgw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E3 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPAVGW ymm1, ymmV, ymm2/m256","VPAVGW ymm2/m256, ymmV, ymm1","vpavgw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E3 /r","V","V","AVX2","","w,r,r","",""
> +"VPAVGW ymm1, {k}{z}, ymmV, ymm2/m256","VPAVGW ymm2/m256, ymmV, {k}{z}, ymm1","vpavgw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E3 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPAVGW zmm1, {k}{z}, zmmV, zmm2/m512","VPAVGW zmm2/m512, zmmV, {k}{z}, zmm1","vpavgw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E3 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPBLENDD xmm1, xmmV, xmm2/m128, imm8u","VPBLENDD imm8u, xmm2/m128, xmmV, xmm1","vpblendd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 02 /r ib","V","V","AVX2","","w,r,r,r","",""
> +"VPBLENDD ymm1, ymmV, ymm2/m256, imm8u","VPBLENDD imm8u, ymm2/m256, ymmV, ymm1","vpblendd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 02 /r ib","V","V","AVX2","","w,r,r,r","",""
> +"VPBLENDMB xmm1, {k}{z}, xmmV, xmm2/m128","VPBLENDMB xmm2/m128, xmmV, {k}{z}, xmm1","vpblendmb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 66 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPBLENDMB ymm1, {k}{z}, ymmV, ymm2/m256","VPBLENDMB ymm2/m256, ymmV, {k}{z}, ymm1","vpblendmb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 66 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPBLENDMB zmm1, {k}{z}, zmmV, zmm2/m512","VPBLENDMB zmm2/m512, zmmV, {k}{z}, zmm1","vpblendmb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 66 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPBLENDMD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPBLENDMD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpblendmd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 64 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPBLENDMD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPBLENDMD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpblendmd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 64 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPBLENDMD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPBLENDMD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpblendmd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 64 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPBLENDMQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPBLENDMQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpblendmq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 64 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPBLENDMQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPBLENDMQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpblendmq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 64 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPBLENDMQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPBLENDMQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpblendmq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 64 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPBLENDMW xmm1, {k}{z}, xmmV, xmm2/m128","VPBLENDMW xmm2/m128, xmmV, {k}{z}, xmm1","vpblendmw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 66 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPBLENDMW ymm1, {k}{z}, ymmV, ymm2/m256","VPBLENDMW ymm2/m256, ymmV, {k}{z}, ymm1","vpblendmw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 66 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPBLENDMW zmm1, {k}{z}, zmmV, zmm2/m512","VPBLENDMW zmm2/m512, zmmV, {k}{z}, zmm1","vpblendmw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 66 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPBLENDVB xmm1, xmmV, xmm2/m128, xmmIH","VPBLENDVB xmmIH, xmm2/m128, xmmV, xmm1","vpblendvb xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4C /r /is4","V","V","AVX","","w,r,r,r","",""
> +"VPBLENDVB ymm1, ymmV, ymm2/m256, ymmIH","VPBLENDVB ymmIH, ymm2/m256, ymmV, ymm1","vpblendvb ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4C /r /is4","V","V","AVX2","","w,r,r,r","",""
> +"VPBLENDW xmm1, xmmV, xmm2/m128, imm8u","VPBLENDW imm8u, xmm2/m128, xmmV, xmm1","vpblendw imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0E /r ib","V","V","AVX","","w,r,r,r","",""
> +"VPBLENDW ymm1, ymmV, ymm2/m256, imm8u","VPBLENDW imm8u, ymm2/m256, ymmV, ymm1","vpblendw imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0E /r ib","V","V","AVX2","","w,r,r,r","",""
> +"VPBROADCASTB xmm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, xmm1","vpbroadcastb rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7A /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
> +"VPBROADCASTB ymm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, ymm1","vpbroadcastb rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7A /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
> +"VPBROADCASTB zmm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, zmm1","vpbroadcastb rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7A /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"VPBROADCASTB xmm1, xmm2/m8","VPBROADCASTB xmm2/m8, xmm1","vpbroadcastb xmm2/m8, xmm1","VEX.128.66.0F38.W0 78 /r","V","V","AVX2","","w,r","",""
> +"VPBROADCASTB ymm1, xmm2/m8","VPBROADCASTB xmm2/m8, ymm1","vpbroadcastb xmm2/m8, ymm1","VEX.256.66.0F38.W0 78 /r","V","V","AVX2","","w,r","",""
> +"VPBROADCASTB xmm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, xmm1","vpbroadcastb xmm2/m8, {k}{z}, xmm1","EVEX.128.66.0F38.W0 78 /r","V","V","AVX512BW+AVX512VL","scale1","w,r,r","",""
> +"VPBROADCASTB ymm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, ymm1","vpbroadcastb xmm2/m8, {k}{z}, ymm1","EVEX.256.66.0F38.W0 78 /r","V","V","AVX512BW+AVX512VL","scale1","w,r,r","",""
> +"VPBROADCASTB zmm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, zmm1","vpbroadcastb xmm2/m8, {k}{z}, zmm1","EVEX.512.66.0F38.W0 78 /r","V","V","AVX512BW","scale1","w,r,r","",""
> +"VPBROADCASTD xmm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, xmm1","vpbroadcastd rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7C /r","V","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
> +"VPBROADCASTD ymm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, ymm1","vpbroadcastd rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7C /r","V","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
> +"VPBROADCASTD zmm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, zmm1","vpbroadcastd rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7C /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VPBROADCASTD xmm1, xmm2/m32","VPBROADCASTD xmm2/m32, xmm1","vpbroadcastd xmm2/m32, xmm1","VEX.128.66.0F38.W0 58 /r","V","V","AVX2","","w,r","",""
> +"VPBROADCASTD ymm1, xmm2/m32","VPBROADCASTD xmm2/m32, ymm1","vpbroadcastd xmm2/m32, ymm1","VEX.256.66.0F38.W0 58 /r","V","V","AVX2","","w,r","",""
> +"VPBROADCASTD xmm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, xmm1","vpbroadcastd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 58 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPBROADCASTD ymm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, ymm1","vpbroadcastd xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 58 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPBROADCASTD zmm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, zmm1","vpbroadcastd xmm2/m32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 58 /r","V","V","AVX512F","scale4","w,r,r","",""
> +"VPBROADCASTMB2Q xmm1, k2","VPBROADCASTMB2Q k2, xmm1","vpbroadcastmb2q k2, xmm1","EVEX.128.F3.0F38.W1 2A /r","V","V","AVX512CD+AVX512VL","modrm_regonly","w,r","",""
> +"VPBROADCASTMB2Q ymm1, k2","VPBROADCASTMB2Q k2, ymm1","vpbroadcastmb2q k2, ymm1","EVEX.256.F3.0F38.W1 2A /r","V","V","AVX512CD+AVX512VL","modrm_regonly","w,r","",""
> +"VPBROADCASTMB2Q zmm1, k2","VPBROADCASTMB2Q k2, zmm1","vpbroadcastmb2q k2, zmm1","EVEX.512.F3.0F38.W1 2A /r","V","V","AVX512CD","modrm_regonly","w,r","",""
> +"VPBROADCASTMW2D xmm1, k2","VPBROADCASTMW2D k2, xmm1","vpbroadcastmw2d k2, xmm1","EVEX.128.F3.0F38.W0 3A /r","V","V","AVX512CD+AVX512VL","modrm_regonly","w,r","",""
> +"VPBROADCASTMW2D ymm1, k2","VPBROADCASTMW2D k2, ymm1","vpbroadcastmw2d k2, ymm1","EVEX.256.F3.0F38.W0 3A /r","V","V","AVX512CD+AVX512VL","modrm_regonly","w,r","",""
> +"VPBROADCASTMW2D zmm1, k2","VPBROADCASTMW2D k2, zmm1","vpbroadcastmw2d k2, zmm1","EVEX.512.F3.0F38.W0 3A /r","V","V","AVX512CD","modrm_regonly","w,r","",""
> +"VPBROADCASTQ xmm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, xmm1","vpbroadcastq rmr64, {k}{z}, xmm1","EVEX.128.66.0F38.W1 7C /r","N.S.","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
> +"VPBROADCASTQ ymm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, ymm1","vpbroadcastq rmr64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 7C /r","N.S.","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
> +"VPBROADCASTQ zmm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, zmm1","vpbroadcastq rmr64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 7C /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VPBROADCASTQ xmm1, xmm2/m64","VPBROADCASTQ xmm2/m64, xmm1","vpbroadcastq xmm2/m64, xmm1","VEX.128.66.0F38.W0 59 /r","V","V","AVX2","","w,r","",""
> +"VPBROADCASTQ ymm1, xmm2/m64","VPBROADCASTQ xmm2/m64, ymm1","vpbroadcastq xmm2/m64, ymm1","VEX.256.66.0F38.W0 59 /r","V","V","AVX2","","w,r","",""
> +"VPBROADCASTQ xmm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, xmm1","vpbroadcastq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W1 59 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPBROADCASTQ ymm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, ymm1","vpbroadcastq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 59 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPBROADCASTQ zmm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, zmm1","vpbroadcastq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 59 /r","V","V","AVX512F","scale8","w,r,r","",""
> +"VPBROADCASTW xmm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, xmm1","vpbroadcastw rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7B /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
> +"VPBROADCASTW ymm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, ymm1","vpbroadcastw rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7B /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
> +"VPBROADCASTW zmm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, zmm1","vpbroadcastw rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7B /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
> +"VPBROADCASTW xmm1, xmm2/m16","VPBROADCASTW xmm2/m16, xmm1","vpbroadcastw xmm2/m16, xmm1","VEX.128.66.0F38.W0 79 /r","V","V","AVX2","","w,r","",""
> +"VPBROADCASTW ymm1, xmm2/m16","VPBROADCASTW xmm2/m16, ymm1","vpbroadcastw xmm2/m16, ymm1","VEX.256.66.0F38.W0 79 /r","V","V","AVX2","","w,r","",""
> +"VPBROADCASTW xmm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, xmm1","vpbroadcastw xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.W0 79 /r","V","V","AVX512BW+AVX512VL","scale2","w,r,r","",""
> +"VPBROADCASTW ymm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, ymm1","vpbroadcastw xmm2/m16, {k}{z}, ymm1","EVEX.256.66.0F38.W0 79 /r","V","V","AVX512BW+AVX512VL","scale2","w,r,r","",""
> +"VPBROADCASTW zmm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, zmm1","vpbroadcastw xmm2/m16, {k}{z}, zmm1","EVEX.512.66.0F38.W0 79 /r","V","V","AVX512BW","scale2","w,r,r","",""
> +"VPCLMULQDQ xmm1, xmmV, xmm2/m128, imm8u","VPCLMULQDQ imm8u, xmm2/m128, xmmV, xmm1","vpclmulqdq imm8u, xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F3A.WIG 44 /r ib","V","V","VPCLMULQDQ+AVX512VL","scale16","w,r,r,r","",""
> +"VPCLMULQDQ xmm1, xmmV, xmm2/m128, imm8u","VPCLMULQDQ imm8u, xmm2/m128, xmmV, xmm1","vpclmulqdq imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 44 /r ib","V","V","PCLMULQDQ+AVX","","w,r,r,r","",""
> +"VPCLMULQDQ ymm1, ymmV, ymm2/m256, imm8u","VPCLMULQDQ imm8u, ymm2/m256, ymmV, ymm1","vpclmulqdq imm8u, ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F3A.WIG 44 /r ib","V","V","VPCLMULQDQ+AVX512VL","scale32","w,r,r,r","",""
> +"VPCLMULQDQ ymm1, ymmV, ymm2/m256, imm8u","VPCLMULQDQ imm8u, ymm2/m256, ymmV, ymm1","vpclmulqdq imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 44 /r ib","V","V","VPCLMULQDQ","","w,r,r,r","",""
> +"VPCLMULQDQ zmm1, zmmV, zmm2/m512, imm8u","VPCLMULQDQ imm8u, zmm2/m512, zmmV, zmm1","vpclmulqdq imm8u, zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F3A.WIG 44 /r ib","V","V","VPCLMULQDQ+AVX512F","scale64","w,r,r,r","",""
> +"VPCMOV xmm1, xmmV, xmmIH, xmm2/m128","VPCMOV xmm2/m128, xmmIH, xmmV, xmm1","vpcmov xmm2/m128, xmmIH, xmmV, xmm1","XOP.NDS.128.08.W1 A2 /r /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPCMOV xmm1, xmmV, xmm2/m128, xmmIH","VPCMOV xmmIH, xmm2/m128, xmmV, xmm1","vpcmov xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A2 /r /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPCMOV ymm1, ymmV, ymmIH, ymm2/m256","VPCMOV ymm2/m256, ymmIH, ymmV, ymm1","vpcmov ymm2/m256, ymmIH, ymmV, ymm1","XOP.NDS.256.08.W1 A2 /r /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPCMOV ymm1, ymmV, ymm2/m256, ymmIH","VPCMOV ymmIH, ymm2/m256, ymmV, ymm1","vpcmov ymmIH, ymm2/m256, ymmV, ymm1","XOP.NDS.256.08.W0 A2 /r /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPCMPB k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPB imm8u, xmm2/m128, xmmV, {k}, k1","vpcmpb imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 3F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
> +"VPCMPB k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPB imm8u, ymm2/m256, ymmV, {k}, k1","vpcmpb imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 3F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
> +"VPCMPB k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPB imm8u, zmm2/m512, zmmV, {k}, k1","vpcmpb imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 3F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
> +"VPCMPD k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VPCMPD imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","vpcmpd imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 1F /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
> +"VPCMPD k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VPCMPD imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","vpcmpd imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 1F /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
> +"VPCMPD k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VPCMPD imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","vpcmpd imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 1F /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
> +"VPCMPEQB xmm1, xmmV, xmm2/m128","VPCMPEQB xmm2/m128, xmmV, xmm1","vpcmpeqb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 74 /r","V","V","AVX","","w,r,r","",""
> +"VPCMPEQB k1, {k}, xmmV, xmm2/m128","VPCMPEQB xmm2/m128, xmmV, {k}, k1","vpcmpeqb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 74 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPCMPEQB ymm1, ymmV, ymm2/m256","VPCMPEQB ymm2/m256, ymmV, ymm1","vpcmpeqb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 74 /r","V","V","AVX2","","w,r,r","",""
> +"VPCMPEQB k1, {k}, ymmV, ymm2/m256","VPCMPEQB ymm2/m256, ymmV, {k}, k1","vpcmpeqb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 74 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPCMPEQB k1, {k}, zmmV, zmm2/m512","VPCMPEQB zmm2/m512, zmmV, {k}, k1","vpcmpeqb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 74 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPCMPEQD xmm1, xmmV, xmm2/m128","VPCMPEQD xmm2/m128, xmmV, xmm1","vpcmpeqd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 76 /r","V","V","AVX","","w,r,r","",""
> +"VPCMPEQD k1, {k}, xmmV, xmm2/m128/m32bcst","VPCMPEQD xmm2/m128/m32bcst, xmmV, {k}, k1","vpcmpeqd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPCMPEQD ymm1, ymmV, ymm2/m256","VPCMPEQD ymm2/m256, ymmV, ymm1","vpcmpeqd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 76 /r","V","V","AVX2","","w,r,r","",""
> +"VPCMPEQD k1, {k}, ymmV, ymm2/m256/m32bcst","VPCMPEQD ymm2/m256/m32bcst, ymmV, {k}, k1","vpcmpeqd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPCMPEQD k1, {k}, zmmV, zmm2/m512/m32bcst","VPCMPEQD zmm2/m512/m32bcst, zmmV, {k}, k1","vpcmpeqd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F.W0 76 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPCMPEQQ xmm1, xmmV, xmm2/m128","VPCMPEQQ xmm2/m128, xmmV, xmm1","vpcmpeqq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 29 /r","V","V","AVX","","w,r,r","",""
> +"VPCMPEQQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPCMPEQQ xmm2/m128/m64bcst, xmmV, {k}, k1","vpcmpeqq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W1 29 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPCMPEQQ ymm1, ymmV, ymm2/m256","VPCMPEQQ ymm2/m256, ymmV, ymm1","vpcmpeqq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 29 /r","V","V","AVX2","","w,r,r","",""
> +"VPCMPEQQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPCMPEQQ ymm2/m256/m64bcst, ymmV, {k}, k1","vpcmpeqq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W1 29 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPCMPEQQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPCMPEQQ zmm2/m512/m64bcst, zmmV, {k}, k1","vpcmpeqq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W1 29 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPCMPEQW xmm1, xmmV, xmm2/m128","VPCMPEQW xmm2/m128, xmmV, xmm1","vpcmpeqw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 75 /r","V","V","AVX","","w,r,r","",""
> +"VPCMPEQW k1, {k}, xmmV, xmm2/m128","VPCMPEQW xmm2/m128, xmmV, {k}, k1","vpcmpeqw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 75 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPCMPEQW ymm1, ymmV, ymm2/m256","VPCMPEQW ymm2/m256, ymmV, ymm1","vpcmpeqw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 75 /r","V","V","AVX2","","w,r,r","",""
> +"VPCMPEQW k1, {k}, ymmV, ymm2/m256","VPCMPEQW ymm2/m256, ymmV, {k}, k1","vpcmpeqw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 75 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPCMPEQW k1, {k}, zmmV, zmm2/m512","VPCMPEQW zmm2/m512, zmmV, {k}, k1","vpcmpeqw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 75 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPCMPESTRI xmm1, xmm2/m128, imm8u","VPCMPESTRI imm8u, xmm2/m128, xmm1","vpcmpestri imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 61 /r ib","V","V","AVX","","r,r,r","",""
> +"VPCMPESTRM xmm1, xmm2/m128, imm8u","VPCMPESTRM imm8u, xmm2/m128, xmm1","vpcmpestrm imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 60 /r ib","V","V","AVX","","r,r,r","",""
> +"VPCMPGTB xmm1, xmmV, xmm2/m128","VPCMPGTB xmm2/m128, xmmV, xmm1","vpcmpgtb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 64 /r","V","V","AVX","","w,r,r","",""
> +"VPCMPGTB k1, {k}, xmmV, xmm2/m128","VPCMPGTB xmm2/m128, xmmV, {k}, k1","vpcmpgtb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 64 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPCMPGTB ymm1, ymmV, ymm2/m256","VPCMPGTB ymm2/m256, ymmV, ymm1","vpcmpgtb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 64 /r","V","V","AVX2","","w,r,r","",""
> +"VPCMPGTB k1, {k}, ymmV, ymm2/m256","VPCMPGTB ymm2/m256, ymmV, {k}, k1","vpcmpgtb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 64 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPCMPGTB k1, {k}, zmmV, zmm2/m512","VPCMPGTB zmm2/m512, zmmV, {k}, k1","vpcmpgtb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 64 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPCMPGTD xmm1, xmmV, xmm2/m128","VPCMPGTD xmm2/m128, xmmV, xmm1","vpcmpgtd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 66 /r","V","V","AVX","","w,r,r","",""
> +"VPCMPGTD k1, {k}, xmmV, xmm2/m128/m32bcst","VPCMPGTD xmm2/m128/m32bcst, xmmV, {k}, k1","vpcmpgtd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F.W0 66 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPCMPGTD ymm1, ymmV, ymm2/m256","VPCMPGTD ymm2/m256, ymmV, ymm1","vpcmpgtd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 66 /r","V","V","AVX2","","w,r,r","",""
> +"VPCMPGTD k1, {k}, ymmV, ymm2/m256/m32bcst","VPCMPGTD ymm2/m256/m32bcst, ymmV, {k}, k1","vpcmpgtd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F.W0 66 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPCMPGTD k1, {k}, zmmV, zmm2/m512/m32bcst","VPCMPGTD zmm2/m512/m32bcst, zmmV, {k}, k1","vpcmpgtd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F.W0 66 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPCMPGTQ xmm1, xmmV, xmm2/m128","VPCMPGTQ xmm2/m128, xmmV, xmm1","vpcmpgtq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 37 /r","V","V","AVX","","w,r,r","",""
> +"VPCMPGTQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPCMPGTQ xmm2/m128/m64bcst, xmmV, {k}, k1","vpcmpgtq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W1 37 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPCMPGTQ ymm1, ymmV, ymm2/m256","VPCMPGTQ ymm2/m256, ymmV, ymm1","vpcmpgtq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 37 /r","V","V","AVX2","","w,r,r","",""
> +"VPCMPGTQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPCMPGTQ ymm2/m256/m64bcst, ymmV, {k}, k1","vpcmpgtq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W1 37 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPCMPGTQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPCMPGTQ zmm2/m512/m64bcst, zmmV, {k}, k1","vpcmpgtq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W1 37 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPCMPGTW xmm1, xmmV, xmm2/m128","VPCMPGTW xmm2/m128, xmmV, xmm1","vpcmpgtw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 65 /r","V","V","AVX","","w,r,r","",""
> +"VPCMPGTW k1, {k}, xmmV, xmm2/m128","VPCMPGTW xmm2/m128, xmmV, {k}, k1","vpcmpgtw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 65 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPCMPGTW ymm1, ymmV, ymm2/m256","VPCMPGTW ymm2/m256, ymmV, ymm1","vpcmpgtw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 65 /r","V","V","AVX2","","w,r,r","",""
> +"VPCMPGTW k1, {k}, ymmV, ymm2/m256","VPCMPGTW ymm2/m256, ymmV, {k}, k1","vpcmpgtw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 65 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPCMPGTW k1, {k}, zmmV, zmm2/m512","VPCMPGTW zmm2/m512, zmmV, {k}, k1","vpcmpgtw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 65 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPCMPISTRI xmm1, xmm2/m128, imm8u","VPCMPISTRI imm8u, xmm2/m128, xmm1","vpcmpistri imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 63 /r ib","V","V","AVX","","r,r,r","",""
> +"VPCMPISTRM xmm1, xmm2/m128, imm8u","VPCMPISTRM imm8u, xmm2/m128, xmm1","vpcmpistrm imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 62 /r ib","V","V","AVX","","r,r,r","",""
> +"VPCMPQ k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VPCMPQ imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","vpcmpq imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 1F /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
> +"VPCMPQ k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VPCMPQ imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","vpcmpq imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 1F /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
> +"VPCMPQ k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VPCMPQ imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","vpcmpq imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 1F /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
> +"VPCMPUB k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPUB imm8u, xmm2/m128, xmmV, {k}, k1","vpcmpub imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 3E /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
> +"VPCMPUB k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPUB imm8u, ymm2/m256, ymmV, {k}, k1","vpcmpub imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 3E /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
> +"VPCMPUB k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPUB imm8u, zmm2/m512, zmmV, {k}, k1","vpcmpub imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 3E /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
> +"VPCMPUD k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VPCMPUD imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","vpcmpud imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 1E /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
> +"VPCMPUD k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VPCMPUD imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","vpcmpud imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 1E /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
> +"VPCMPUD k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VPCMPUD imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","vpcmpud imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 1E /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
> +"VPCMPUQ k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VPCMPUQ imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","vpcmpuq imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 1E /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
> +"VPCMPUQ k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VPCMPUQ imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","vpcmpuq imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 1E /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
> +"VPCMPUQ k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VPCMPUQ imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","vpcmpuq imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 1E /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
> +"VPCMPUW k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPUW imm8u, xmm2/m128, xmmV, {k}, k1","vpcmpuw imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 3E /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
> +"VPCMPUW k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPUW imm8u, ymm2/m256, ymmV, {k}, k1","vpcmpuw imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 3E /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
> +"VPCMPUW k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPUW imm8u, zmm2/m512, zmmV, {k}, k1","vpcmpuw imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 3E /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
> +"VPCMPW k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPW imm8u, xmm2/m128, xmmV, {k}, k1","vpcmpw imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 3F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
> +"VPCMPW k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPW imm8u, ymm2/m256, ymmV, {k}, k1","vpcmpw imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 3F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
> +"VPCMPW k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPW imm8u, zmm2/m512, zmmV, {k}, k1","vpcmpw imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 3F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
> +"VPCOMB xmm1, xmmV, xmm2/m128, imm8u","VPCOMB imm8u, xmm2/m128, xmmV, xmm1","vpcomb imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CC /r ib","V","V","XOP","amd","w,r,r,r","",""
> +"VPCOMD xmm1, xmmV, xmm2/m128, imm8u","VPCOMD imm8u, xmm2/m128, xmmV, xmm1","vpcomd imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CE /r ib","V","V","XOP","amd","w,r,r,r","",""
> +"VPCOMPRESSB xmm2/m128, {k}{z}, xmm1","VPCOMPRESSB xmm1, {k}{z}, xmm2/m128","vpcompressb xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 63 /r","V","V","AVX512_VBMI2+AVX512VL","scale1","w,r,r","",""
> +"VPCOMPRESSB ymm2/m256, {k}{z}, ymm1","VPCOMPRESSB ymm1, {k}{z}, ymm2/m256","vpcompressb ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 63 /r","V","V","AVX512_VBMI2+AVX512VL","scale1","w,r,r","",""
> +"VPCOMPRESSB zmm2/m512, {k}{z}, zmm1","VPCOMPRESSB zmm1, {k}{z}, zmm2/m512","vpcompressb zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 63 /r","V","V","AVX512_VBMI2","scale1","w,r,r","",""
> +"VPCOMPRESSD xmm2/m128, {k}{z}, xmm1","VPCOMPRESSD xmm1, {k}{z}, xmm2/m128","vpcompressd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 8B /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPCOMPRESSD ymm2/m256, {k}{z}, ymm1","VPCOMPRESSD ymm1, {k}{z}, ymm2/m256","vpcompressd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 8B /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPCOMPRESSD zmm2/m512, {k}{z}, zmm1","VPCOMPRESSD zmm1, {k}{z}, zmm2/m512","vpcompressd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 8B /r","V","V","AVX512F","scale4","w,r,r","",""
> +"VPCOMPRESSQ xmm2/m128, {k}{z}, xmm1","VPCOMPRESSQ xmm1, {k}{z}, xmm2/m128","vpcompressq xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 8B /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPCOMPRESSQ ymm2/m256, {k}{z}, ymm1","VPCOMPRESSQ ymm1, {k}{z}, ymm2/m256","vpcompressq ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 8B /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPCOMPRESSQ zmm2/m512, {k}{z}, zmm1","VPCOMPRESSQ zmm1, {k}{z}, zmm2/m512","vpcompressq zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 8B /r","V","V","AVX512F","scale8","w,r,r","",""
> +"VPCOMPRESSW xmm2/m128, {k}{z}, xmm1","VPCOMPRESSW xmm1, {k}{z}, xmm2/m128","vpcompressw xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 63 /r","V","V","AVX512_VBMI2+AVX512VL","scale2","w,r,r","",""
> +"VPCOMPRESSW ymm2/m256, {k}{z}, ymm1","VPCOMPRESSW ymm1, {k}{z}, ymm2/m256","vpcompressw ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 63 /r","V","V","AVX512_VBMI2+AVX512VL","scale2","w,r,r","",""
> +"VPCOMPRESSW zmm2/m512, {k}{z}, zmm1","VPCOMPRESSW zmm1, {k}{z}, zmm2/m512","vpcompressw zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 63 /r","V","V","AVX512_VBMI2","scale2","w,r,r","",""
> +"VPCOMQ xmm1, xmmV, xmm2/m128, imm8u","VPCOMQ imm8u, xmm2/m128, xmmV, xmm1","vpcomq imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CF /r ib","V","V","XOP","amd","w,r,r,r","",""
> +"VPCOMUB xmm1, xmmV, xmm2/m128, imm8u","VPCOMUB imm8u, xmm2/m128, xmmV, xmm1","vpcomub imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EC /r ib","V","V","XOP","amd","w,r,r,r","",""
> +"VPCOMUD xmm1, xmmV, xmm2/m128, imm8u","VPCOMUD imm8u, xmm2/m128, xmmV, xmm1","vpcomud imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EE /r ib","V","V","XOP","amd","w,r,r,r","",""
> +"VPCOMUQ xmm1, xmmV, xmm2/m128, imm8u","VPCOMUQ imm8u, xmm2/m128, xmmV, xmm1","vpcomuq imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EF /r ib","V","V","XOP","amd","w,r,r,r","",""
> +"VPCOMUW xmm1, xmmV, xmm2/m128, imm8u","VPCOMUW imm8u, xmm2/m128, xmmV, xmm1","vpcomuw imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 ED /r ib","V","V","XOP","amd","w,r,r,r","",""
> +"VPCOMW xmm1, xmmV, xmm2/m128, imm8u","VPCOMW imm8u, xmm2/m128, xmmV, xmm1","vpcomw imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CD /r ib","V","V","XOP","amd","w,r,r,r","",""
> +"VPCONFLICTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPCONFLICTD xmm2/m128/m32bcst, {k}{z}, xmm1","vpconflictd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 C4 /r","V","V","AVX512CD+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VPCONFLICTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPCONFLICTD ymm2/m256/m32bcst, {k}{z}, ymm1","vpconflictd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 C4 /r","V","V","AVX512CD+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VPCONFLICTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPCONFLICTD zmm2/m512/m32bcst, {k}{z}, zmm1","vpconflictd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 C4 /r","V","V","AVX512CD","bscale4,scale64","w,r,r","",""
> +"VPCONFLICTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPCONFLICTQ xmm2/m128/m64bcst, {k}{z}, xmm1","vpconflictq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 C4 /r","V","V","AVX512CD+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VPCONFLICTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPCONFLICTQ ymm2/m256/m64bcst, {k}{z}, ymm1","vpconflictq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 C4 /r","V","V","AVX512CD+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VPCONFLICTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPCONFLICTQ zmm2/m512/m64bcst, {k}{z}, zmm1","vpconflictq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 C4 /r","V","V","AVX512CD","bscale8,scale64","w,r,r","",""
> +"VPDPBUSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPBUSD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpdpbusd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 50 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VPDPBUSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPBUSD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpdpbusd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 50 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VPDPBUSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPBUSD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpdpbusd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 50 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r,r,r","",""
> +"VPDPBUSDS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPBUSDS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpdpbusds xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 51 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VPDPBUSDS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPBUSDS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpdpbusds ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 51 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VPDPBUSDS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPBUSDS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpdpbusds zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 51 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r,r,r","",""
> +"VPDPWSSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPWSSD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpdpwssd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 52 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VPDPWSSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPWSSD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpdpwssd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 52 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VPDPWSSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPWSSD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpdpwssd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 52 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r,r,r","",""
> +"VPDPWSSDS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPWSSDS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpdpwssds xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 53 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VPDPWSSDS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPWSSDS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpdpwssds ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 53 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VPDPWSSDS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPWSSDS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpdpwssds zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 53 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r,r,r","",""
> +"VPERM2F128 ymm1, ymmV, ymm2/m256, imm8u","VPERM2F128 imm8u, ymm2/m256, ymmV, ymm1","vperm2f128 imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 06 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VPERM2I128 ymm1, ymmV, ymm2/m256, imm8u","VPERM2I128 imm8u, ymm2/m256, ymmV, ymm1","vperm2i128 imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 46 /r ib","V","V","AVX2","","w,r,r,r","",""
> +"VPERMB xmm1, {k}{z}, xmmV, xmm2/m128","VPERMB xmm2/m128, xmmV, {k}{z}, xmm1","vpermb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 8D /r","V","V","AVX512_VBMI+AVX512VL","scale16","w,r,r,r","",""
> +"VPERMB ymm1, {k}{z}, ymmV, ymm2/m256","VPERMB ymm2/m256, ymmV, {k}{z}, ymm1","vpermb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 8D /r","V","V","AVX512_VBMI+AVX512VL","scale32","w,r,r,r","",""
> +"VPERMB zmm1, {k}{z}, zmmV, zmm2/m512","VPERMB zmm2/m512, zmmV, {k}{z}, zmm1","vpermb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 8D /r","V","V","AVX512_VBMI","scale64","w,r,r,r","",""
> +"VPERMD ymm1, ymmV, ymm2/m256","VPERMD ymm2/m256, ymmV, ymm1","vpermd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 36 /r","V","V","AVX2","","w,r,r","",""
> +"VPERMD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 36 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPERMD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 36 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPERMI2B xmm1, {k}{z}, xmmV, xmm2/m128","VPERMI2B xmm2/m128, xmmV, {k}{z}, xmm1","vpermi2b xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 75 /r","V","V","AVX512_VBMI+AVX512VL","scale16","rw,r,r,r","",""
> +"VPERMI2B ymm1, {k}{z}, ymmV, ymm2/m256","VPERMI2B ymm2/m256, ymmV, {k}{z}, ymm1","vpermi2b ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 75 /r","V","V","AVX512_VBMI+AVX512VL","scale32","rw,r,r,r","",""
> +"VPERMI2B zmm1, {k}{z}, zmmV, zmm2/m512","VPERMI2B zmm2/m512, zmmV, {k}{z}, zmm1","vpermi2b zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 75 /r","V","V","AVX512_VBMI","scale64","rw,r,r,r","",""
> +"VPERMI2D xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMI2D xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermi2d xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VPERMI2D ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMI2D ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermi2d ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VPERMI2D zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMI2D zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermi2d zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 76 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VPERMI2PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMI2PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermi2pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 77 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VPERMI2PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMI2PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermi2pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 77 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VPERMI2PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMI2PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermi2pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 77 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VPERMI2PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMI2PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermi2ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 77 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VPERMI2PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMI2PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermi2ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 77 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VPERMI2PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMI2PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermi2ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 77 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VPERMI2Q xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMI2Q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermi2q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 76 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VPERMI2Q ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMI2Q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermi2q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 76 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VPERMI2Q zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMI2Q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermi2q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 76 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VPERMI2W xmm1, {k}{z}, xmmV, xmm2/m128","VPERMI2W xmm2/m128, xmmV, {k}{z}, xmm1","vpermi2w xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 75 /r","V","V","AVX512BW+AVX512VL","scale16","rw,r,r,r","",""
> +"VPERMI2W ymm1, {k}{z}, ymmV, ymm2/m256","VPERMI2W ymm2/m256, ymmV, {k}{z}, ymm1","vpermi2w ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 75 /r","V","V","AVX512BW+AVX512VL","scale32","rw,r,r,r","",""
> +"VPERMI2W zmm1, {k}{z}, zmmV, zmm2/m512","VPERMI2W zmm2/m512, zmmV, {k}{z}, zmm1","vpermi2w zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 75 /r","V","V","AVX512BW","scale64","rw,r,r,r","",""
> +"VPERMIL2PD xmm1, xmmV, xmmIH, xmm2/m128, imm8u","VPERMIL2PD imm8u, xmm2/m128, xmmIH, xmmV, xmm1","vpermil2pd imm8u, xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
> +"VPERMIL2PD xmm1, xmmV, xmm2/m128, xmmIH, imm8u","VPERMIL2PD imm8u, xmmIH, xmm2/m128, xmmV, xmm1","vpermil2pd imm8u, xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
> +"VPERMIL2PD ymm1, ymmV, ymmIH, ymm2/m256, imm8u","VPERMIL2PD imm8u, ymm2/m256, ymmIH, ymmV, ymm1","vpermil2pd imm8u, ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
> +"VPERMIL2PD ymm1, ymmV, ymm2/m256, ymmIH, imm8u","VPERMIL2PD imm8u, ymmIH, ymm2/m256, ymmV, ymm1","vpermil2pd imm8u, ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
> +"VPERMIL2PS xmm1, xmmV, xmmIH, xmm2/m128, imm8u","VPERMIL2PS imm8u, xmm2/m128, xmmIH, xmmV, xmm1","vpermil2ps imm8u, xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
> +"VPERMIL2PS xmm1, xmmV, xmm2/m128, xmmIH, imm8u","VPERMIL2PS imm8u, xmmIH, xmm2/m128, xmmV, xmm1","vpermil2ps imm8u, xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
> +"VPERMIL2PS ymm1, ymmV, ymmIH, ymm2/m256, imm8u","VPERMIL2PS imm8u, ymm2/m256, ymmIH, ymmV, ymm1","vpermil2ps imm8u, ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
> +"VPERMIL2PS ymm1, ymmV, ymm2/m256, ymmIH, imm8u","VPERMIL2PS imm8u, ymmIH, ymm2/m256, ymmV, ymm1","vpermil2ps imm8u, ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
> +"VPERMILPD xmm1, xmm2/m128, imm8u","VPERMILPD imm8u, xmm2/m128, xmm1","vpermilpd imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.W0 05 /r ib","V","V","AVX","","w,r,r","",""
> +"VPERMILPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VPERMILPD imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","vpermilpd imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W1 05 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPERMILPD ymm1, ymm2/m256, imm8u","VPERMILPD imm8u, ymm2/m256, ymm1","vpermilpd imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W0 05 /r ib","V","V","AVX","","w,r,r","",""
> +"VPERMILPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMILPD imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vpermilpd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 05 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPERMILPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMILPD imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vpermilpd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 05 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPERMILPD xmm1, xmmV, xmm2/m128","VPERMILPD xmm2/m128, xmmV, xmm1","vpermilpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 0D /r","V","V","AVX","","w,r,r","",""
> +"VPERMILPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMILPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermilpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 0D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPERMILPD ymm1, ymmV, ymm2/m256","VPERMILPD ymm2/m256, ymmV, ymm1","vpermilpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 0D /r","V","V","AVX","","w,r,r","",""
> +"VPERMILPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMILPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermilpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 0D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPERMILPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMILPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermilpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 0D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPERMILPS xmm1, xmm2/m128, imm8u","VPERMILPS imm8u, xmm2/m128, xmm1","vpermilps imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.W0 04 /r ib","V","V","AVX","","w,r,r","",""
> +"VPERMILPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VPERMILPS imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","vpermilps imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W0 04 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPERMILPS ymm1, ymm2/m256, imm8u","VPERMILPS imm8u, ymm2/m256, ymm1","vpermilps imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W0 04 /r ib","V","V","AVX","","w,r,r","",""
> +"VPERMILPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VPERMILPS imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","vpermilps imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W0 04 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPERMILPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VPERMILPS imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","vpermilps imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W0 04 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPERMILPS xmm1, xmmV, xmm2/m128","VPERMILPS xmm2/m128, xmmV, xmm1","vpermilps xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 0C /r","V","V","AVX","","w,r,r","",""
> +"VPERMILPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMILPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermilps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 0C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPERMILPS ymm1, ymmV, ymm2/m256","VPERMILPS ymm2/m256, ymmV, ymm1","vpermilps ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 0C /r","V","V","AVX","","w,r,r","",""
> +"VPERMILPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMILPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermilps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 0C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPERMILPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMILPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermilps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 0C /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPERMPD ymm1, ymm2/m256, imm8u","VPERMPD imm8u, ymm2/m256, ymm1","vpermpd imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W1 01 /r ib","V","V","AVX2","","w,r,r","",""
> +"VPERMPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMPD imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vpermpd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 01 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPERMPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMPD imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vpermpd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 01 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPERMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 16 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPERMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 16 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPERMPS ymm1, ymmV, ymm2/m256","VPERMPS ymm2/m256, ymmV, ymm1","vpermps ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 16 /r","V","V","AVX2","","w,r,r","",""
> +"VPERMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 16 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPERMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 16 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPERMQ ymm1, ymm2/m256, imm8u","VPERMQ imm8u, ymm2/m256, ymm1","vpermq imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W1 00 /r ib","V","V","AVX2","","w,r,r","",""
> +"VPERMQ ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMQ imm8u,
> ymm2/m256/m64bcst, {k}{z}, ymm1","vpermq imm8u, ymm2/m256/m64bcst,
> {k}{z}, ymm1","EVEX.256.66.0F3A.W1 00 /r
> ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPERMQ zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMQ imm8u,
> zmm2/m512/m64bcst, {k}{z}, zmm1","vpermq imm8u, zmm2/m512/m64bcst,
> {k}{z}, zmm1","EVEX.512.66.0F3A.W1 00 /r
> ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPERMQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 36
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPERMQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 36
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPERMT2B xmm1, {k}{z}, xmmV, xmm2/m128","VPERMT2B xmm2/m128, xmmV,
> {k}{z}, xmm1","vpermt2b xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.DDS.128.66.0F38.W0 7D
> /r","V","V","AVX512_VBMI+AVX512VL","scale16","rw,r,r,r","",""
> +"VPERMT2B ymm1, {k}{z}, ymmV, ymm2/m256","VPERMT2B ymm2/m256, ymmV,
> {k}{z}, ymm1","vpermt2b ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.DDS.256.66.0F38.W0 7D
> /r","V","V","AVX512_VBMI+AVX512VL","scale32","rw,r,r,r","",""
> +"VPERMT2B zmm1, {k}{z}, zmmV, zmm2/m512","VPERMT2B zmm2/m512, zmmV,
> {k}{z}, zmm1","vpermt2b zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.DDS.512.66.0F38.W0 7D
> /r","V","V","AVX512_VBMI","scale64","rw,r,r,r","",""
> +"VPERMT2D xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMT2D
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermt2d xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 7E
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VPERMT2D ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMT2D
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermt2d ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 7E
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VPERMT2D zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMT2D
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermt2d zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 7E
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VPERMT2PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMT2PD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermt2pd xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7F
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VPERMT2PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMT2PD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermt2pd ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7F
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VPERMT2PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMT2PD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermt2pd zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7F
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VPERMT2PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMT2PS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermt2ps xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 7F
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VPERMT2PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMT2PS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermt2ps ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 7F
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VPERMT2PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMT2PS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermt2ps zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 7F
> /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
> +"VPERMT2Q xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMT2Q
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermt2q xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7E
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VPERMT2Q ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMT2Q
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermt2q ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7E
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VPERMT2Q zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMT2Q
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermt2q zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7E
> /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
> +"VPERMT2W xmm1, {k}{z}, xmmV, xmm2/m128","VPERMT2W xmm2/m128, xmmV,
> {k}{z}, xmm1","vpermt2w xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.DDS.128.66.0F38.W1 7D
> /r","V","V","AVX512BW+AVX512VL","scale16","rw,r,r,r","",""
> +"VPERMT2W ymm1, {k}{z}, ymmV, ymm2/m256","VPERMT2W ymm2/m256, ymmV,
> {k}{z}, ymm1","vpermt2w ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.DDS.256.66.0F38.W1 7D
> /r","V","V","AVX512BW+AVX512VL","scale32","rw,r,r,r","",""
> +"VPERMT2W zmm1, {k}{z}, zmmV, zmm2/m512","VPERMT2W zmm2/m512, zmmV,
> {k}{z}, zmm1","vpermt2w zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.DDS.512.66.0F38.W1 7D
> /r","V","V","AVX512BW","scale64","rw,r,r,r","",""
> +"VPERMW xmm1, {k}{z}, xmmV, xmm2/m128","VPERMW xmm2/m128, xmmV,
> {k}{z}, xmm1","vpermw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F38.W1 8D
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPERMW ymm1, {k}{z}, ymmV, ymm2/m256","VPERMW ymm2/m256, ymmV,
> {k}{z}, ymm1","vpermw ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F38.W1 8D
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPERMW zmm1, {k}{z}, zmmV, zmm2/m512","VPERMW zmm2/m512, zmmV,
> {k}{z}, zmm1","vpermw zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F38.W1 8D
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPEXPANDB xmm1, {k}{z}, xmm2/m128","VPEXPANDB xmm2/m128, {k}{z},
> xmm1","vpexpandb xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 62
> /r","V","V","AVX512_VBMI2+AVX512VL","scale1","w,r,r","",""
> +"VPEXPANDB ymm1, {k}{z}, ymm2/m256","VPEXPANDB ymm2/m256, {k}{z},
> ymm1","vpexpandb ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 62
> /r","V","V","AVX512_VBMI2+AVX512VL","scale1","w,r,r","",""
> +"VPEXPANDB zmm1, {k}{z}, zmm2/m512","VPEXPANDB zmm2/m512, {k}{z},
> zmm1","vpexpandb zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 62
> /r","V","V","AVX512_VBMI2","scale1","w,r,r","",""
> +"VPEXPANDD xmm1, {k}{z}, xmm2/m128","VPEXPANDD xmm2/m128, {k}{z},
> xmm1","vpexpandd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 89
> /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPEXPANDD ymm1, {k}{z}, ymm2/m256","VPEXPANDD ymm2/m256, {k}{z},
> ymm1","vpexpandd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 89
> /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPEXPANDD zmm1, {k}{z}, zmm2/m512","VPEXPANDD zmm2/m512, {k}{z},
> zmm1","vpexpandd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 89
> /r","V","V","AVX512F","scale4","w,r,r","",""
> +"VPEXPANDQ xmm1, {k}{z}, xmm2/m128","VPEXPANDQ xmm2/m128, {k}{z},
> xmm1","vpexpandq xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 89
> /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPEXPANDQ ymm1, {k}{z}, ymm2/m256","VPEXPANDQ ymm2/m256, {k}{z},
> ymm1","vpexpandq ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 89
> /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPEXPANDQ zmm1, {k}{z}, zmm2/m512","VPEXPANDQ zmm2/m512, {k}{z},
> zmm1","vpexpandq zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 89
> /r","V","V","AVX512F","scale8","w,r,r","",""
> +"VPEXPANDW xmm1, {k}{z}, xmm2/m128","VPEXPANDW xmm2/m128, {k}{z},
> xmm1","vpexpandw xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 62
> /r","V","V","AVX512_VBMI2+AVX512VL","scale2","w,r,r","",""
> +"VPEXPANDW ymm1, {k}{z}, ymm2/m256","VPEXPANDW ymm2/m256, {k}{z},
> ymm1","vpexpandw ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 62
> /r","V","V","AVX512_VBMI2+AVX512VL","scale2","w,r,r","",""
> +"VPEXPANDW zmm1, {k}{z}, zmm2/m512","VPEXPANDW zmm2/m512, {k}{z},
> zmm1","vpexpandw zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 62
> /r","V","V","AVX512_VBMI2","scale2","w,r,r","",""
> +"VPEXTRB r32/m8, xmm1, imm8u","VPEXTRB imm8u, xmm1, r32/m8","vpextrb
> imm8u, xmm1, r32/m8","EVEX.128.66.0F3A.WIG 14 /r
> ib","V","V","AVX512BW+AVX512VL","scale1","w,r,r","",""
> +"VPEXTRB r32/m8, xmm1, imm8u","VPEXTRB imm8u, xmm1, r32/m8","vpextrb imm8u, xmm1, r32/m8","VEX.128.66.0F3A.WIG 14 /r ib","V","V","AVX","","w,r,r","",""
> +"VPEXTRD r/m32, xmm1, imm8u","VPEXTRD imm8u, xmm1, r/m32","vpextrd imm8u, xmm1, r/m32","EVEX.128.66.0F3A.W0 16 /r ib","V","V","AVX512DQ+AVX512VL","scale4","w,r,r","",""
> +"VPEXTRD r/m32, xmm1, imm8u","VPEXTRD imm8u, xmm1, r/m32","vpextrd imm8u, xmm1, r/m32","VEX.128.66.0F3A.W0 16 /r ib","V","V","AVX","","w,r,r","",""
> +"VPEXTRQ r/m64, xmm1, imm8u","VPEXTRQ imm8u, xmm1, r/m64","vpextrq imm8u, xmm1, r/m64","EVEX.128.66.0F3A.W1 16 /r ib","N.S.","V","AVX512DQ+AVX512VL","scale8","w,r,r","",""
> +"VPEXTRQ r/m64, xmm1, imm8u","VPEXTRQ imm8u, xmm1, r/m64","vpextrq imm8u, xmm1, r/m64","VEX.128.66.0F3A.W1 16 /r ib","N.S.","V","AVX","","w,r,r","",""
> +"VPEXTRW r32/m16, xmm1, imm8u","VPEXTRW imm8u, xmm1,
> r32/m16","vpextrw imm8u, xmm1, r32/m16","EVEX.128.66.0F3A.WIG 15 /r
> ib","V","V","AVX512BW+AVX512VL","scale2","w,r,r","",""
> +"VPEXTRW r32/m16, xmm1, imm8u","VPEXTRW imm8u, xmm1, r32/m16","vpextrw imm8u, xmm1, r32/m16","VEX.128.66.0F3A.WIG 15 /r ib","V","V","AVX","","w,r,r","",""
> +"VPEXTRW r32, xmm2, imm8u","VPEXTRW imm8u, xmm2, r32","vpextrw imm8u, xmm2, r32","EVEX.128.66.0F.WIG C5 /r ib","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
> +"VPEXTRW r32, xmm2, imm8u","VPEXTRW imm8u, xmm2, r32","vpextrw imm8u, xmm2, r32","VEX.128.66.0F.WIG C5 /r ib","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VPGATHERDD xmm1, {k1-k7}, vm32x","VPGATHERDD vm32x, {k1-k7},
> xmm1","vpgatherdd vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 90
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VPGATHERDD ymm1, {k1-k7}, vm32y","VPGATHERDD vm32y, {k1-k7},
> ymm1","vpgatherdd vm32y, {k1-k7}, ymm1","EVEX.256.66.0F38.W0 90
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VPGATHERDD zmm1, {k1-k7}, vm32z","VPGATHERDD vm32z, {k1-k7},
> zmm1","vpgatherdd vm32z, {k1-k7}, zmm1","EVEX.512.66.0F38.W0 90
> /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
> +"VPGATHERDD xmm1, vm32x, xmmV","VPGATHERDD xmmV, vm32x, xmm1","vpgatherdd xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W0 90 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VPGATHERDD ymm1, vm32y, ymmV","VPGATHERDD ymmV, vm32y, ymm1","vpgatherdd ymmV, vm32y, ymm1","VEX.DDS.256.66.0F38.W0 90 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VPGATHERDQ xmm1, {k1-k7}, vm32x","VPGATHERDQ vm32x, {k1-k7},
> xmm1","vpgatherdq vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 90
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VPGATHERDQ ymm1, {k1-k7}, vm32x","VPGATHERDQ vm32x, {k1-k7},
> ymm1","vpgatherdq vm32x, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 90
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VPGATHERDQ zmm1, {k1-k7}, vm32y","VPGATHERDQ vm32y, {k1-k7},
> zmm1","vpgatherdq vm32y, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 90
> /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
> +"VPGATHERDQ xmm1, vm32x, xmmV","VPGATHERDQ xmmV, vm32x, xmm1","vpgatherdq xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W1 90 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VPGATHERDQ ymm1, vm32x, ymmV","VPGATHERDQ ymmV, vm32x, ymm1","vpgatherdq ymmV, vm32x, ymm1","VEX.DDS.256.66.0F38.W1 90 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VPGATHERQD xmm1, {k1-k7}, vm64x","VPGATHERQD vm64x, {k1-k7},
> xmm1","vpgatherqd vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 91
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VPGATHERQD xmm1, {k1-k7}, vm64y","VPGATHERQD vm64y, {k1-k7},
> xmm1","vpgatherqd vm64y, {k1-k7}, xmm1","EVEX.256.66.0F38.W0 91
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VPGATHERQD ymm1, {k1-k7}, vm64z","VPGATHERQD vm64z, {k1-k7},
> ymm1","vpgatherqd vm64z, {k1-k7}, ymm1","EVEX.512.66.0F38.W0 91
> /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
> +"VPGATHERQD xmm1, vm64x, xmmV","VPGATHERQD xmmV, vm64x, xmm1","vpgatherqd xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W0 91 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VPGATHERQD xmm1, vm64y, xmmV","VPGATHERQD xmmV, vm64y, xmm1","vpgatherqd xmmV, vm64y, xmm1","VEX.DDS.256.66.0F38.W0 91 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VPGATHERQQ xmm1, {k1-k7}, vm64x","VPGATHERQQ vm64x, {k1-k7},
> xmm1","vpgatherqq vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 91
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VPGATHERQQ ymm1, {k1-k7}, vm64y","VPGATHERQQ vm64y, {k1-k7},
> ymm1","vpgatherqq vm64y, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 91
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VPGATHERQQ zmm1, {k1-k7}, vm64z","VPGATHERQQ vm64z, {k1-k7},
> zmm1","vpgatherqq vm64z, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 91
> /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
> +"VPGATHERQQ xmm1, vm64x, xmmV","VPGATHERQQ xmmV, vm64x, xmm1","vpgatherqq xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W1 91 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VPGATHERQQ ymm1, vm64y, ymmV","VPGATHERQQ ymmV, vm64y, ymm1","vpgatherqq ymmV, vm64y, ymm1","VEX.DDS.256.66.0F38.W1 91 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
> +"VPHADDBD xmm1, xmm2/m128","VPHADDBD xmm2/m128, xmm1","vphaddbd xmm2/m128, xmm1","XOP.128.09.W0 C2 /r","V","V","XOP","amd","w,r","",""
> +"VPHADDBQ xmm1, xmm2/m128","VPHADDBQ xmm2/m128, xmm1","vphaddbq xmm2/m128, xmm1","XOP.128.09.W0 C3 /r","V","V","XOP","amd","w,r","",""
> +"VPHADDBW xmm1, xmm2/m128","VPHADDBW xmm2/m128, xmm1","vphaddbw xmm2/m128, xmm1","XOP.128.09.W0 C1 /r","V","V","XOP","amd","w,r","",""
> +"VPHADDD xmm1, xmmV, xmm2/m128","VPHADDD xmm2/m128, xmmV, xmm1","vphaddd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 02 /r","V","V","AVX","","w,r,r","",""
> +"VPHADDD ymm1, ymmV, ymm2/m256","VPHADDD ymm2/m256, ymmV, ymm1","vphaddd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 02 /r","V","V","AVX2","","w,r,r","",""
> +"VPHADDDQ xmm1, xmm2/m128","VPHADDDQ xmm2/m128, xmm1","vphadddq xmm2/m128, xmm1","XOP.128.09.W0 CB /r","V","V","XOP","amd","w,r","",""
> +"VPHADDSW xmm1, xmmV, xmm2/m128","VPHADDSW xmm2/m128, xmmV, xmm1","vphaddsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 03 /r","V","V","AVX","","w,r,r","",""
> +"VPHADDSW ymm1, ymmV, ymm2/m256","VPHADDSW ymm2/m256, ymmV, ymm1","vphaddsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 03 /r","V","V","AVX2","","w,r,r","",""
> +"VPHADDUBD xmm1, xmm2/m128","VPHADDUBD xmm2/m128, xmm1","vphaddubd xmm2/m128, xmm1","XOP.128.09.W0 D2 /r","V","V","XOP","amd","w,r","",""
> +"VPHADDUBQ xmm1, xmm2/m128","VPHADDUBQ xmm2/m128, xmm1","vphaddubq xmm2/m128, xmm1","XOP.128.09.W0 D3 /r","V","V","XOP","amd","w,r","",""
> +"VPHADDUBW xmm1, xmm2/m128","VPHADDUBW xmm2/m128, xmm1","vphaddubw xmm2/m128, xmm1","XOP.128.09.W0 D1 /r","V","V","XOP","amd","w,r","",""
> +"VPHADDUDQ xmm1, xmm2/m128","VPHADDUDQ xmm2/m128, xmm1","vphaddudq xmm2/m128, xmm1","XOP.128.09.W0 DB /r","V","V","XOP","amd","w,r","",""
> +"VPHADDUWD xmm1, xmm2/m128","VPHADDUWD xmm2/m128, xmm1","vphadduwd xmm2/m128, xmm1","XOP.128.09.W0 D6 /r","V","V","XOP","amd","w,r","",""
> +"VPHADDUWQ xmm1, xmm2/m128","VPHADDUWQ xmm2/m128, xmm1","vphadduwq xmm2/m128, xmm1","XOP.128.09.W0 D7 /r","V","V","XOP","amd","w,r","",""
> +"VPHADDW xmm1, xmmV, xmm2/m128","VPHADDW xmm2/m128, xmmV, xmm1","vphaddw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 01 /r","V","V","AVX","","w,r,r","",""
> +"VPHADDW ymm1, ymmV, ymm2/m256","VPHADDW ymm2/m256, ymmV, ymm1","vphaddw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 01 /r","V","V","AVX2","","w,r,r","",""
> +"VPHADDWD xmm1, xmm2/m128","VPHADDWD xmm2/m128, xmm1","vphaddwd xmm2/m128, xmm1","XOP.128.09.W0 C6 /r","V","V","XOP","amd","w,r","",""
> +"VPHADDWQ xmm1, xmm2/m128","VPHADDWQ xmm2/m128, xmm1","vphaddwq xmm2/m128, xmm1","XOP.128.09.W0 C7 /r","V","V","XOP","amd","w,r","",""
> +"VPHMINPOSUW xmm1, xmm2/m128","VPHMINPOSUW xmm2/m128, xmm1","vphminposuw xmm2/m128, xmm1","VEX.128.66.0F38.WIG 41 /r","V","V","AVX","","w,r","",""
> +"VPHSUBBW xmm1, xmm2/m128","VPHSUBBW xmm2/m128, xmm1","vphsubbw xmm2/m128, xmm1","XOP.128.09.W0 E1 /r","V","V","XOP","amd","w,r","",""
> +"VPHSUBD xmm1, xmmV, xmm2/m128","VPHSUBD xmm2/m128, xmmV, xmm1","vphsubd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 06 /r","V","V","AVX","","w,r,r","",""
> +"VPHSUBD ymm1, ymmV, ymm2/m256","VPHSUBD ymm2/m256, ymmV, ymm1","vphsubd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 06 /r","V","V","AVX2","","w,r,r","",""
> +"VPHSUBDQ xmm1, xmm2/m128","VPHSUBDQ xmm2/m128, xmm1","vphsubdq xmm2/m128, xmm1","XOP.128.09.W0 E3 /r","V","V","XOP","amd","w,r","",""
> +"VPHSUBSW xmm1, xmmV, xmm2/m128","VPHSUBSW xmm2/m128, xmmV, xmm1","vphsubsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 07 /r","V","V","AVX","","w,r,r","",""
> +"VPHSUBSW ymm1, ymmV, ymm2/m256","VPHSUBSW ymm2/m256, ymmV, ymm1","vphsubsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 07 /r","V","V","AVX2","","w,r,r","",""
> +"VPHSUBW xmm1, xmmV, xmm2/m128","VPHSUBW xmm2/m128, xmmV, xmm1","vphsubw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 05 /r","V","V","AVX","","w,r,r","",""
> +"VPHSUBW ymm1, ymmV, ymm2/m256","VPHSUBW ymm2/m256, ymmV, ymm1","vphsubw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 05 /r","V","V","AVX2","","w,r,r","",""
> +"VPHSUBWD xmm1, xmm2/m128","VPHSUBWD xmm2/m128, xmm1","vphsubwd xmm2/m128, xmm1","XOP.128.09.W0 E2 /r","V","V","XOP","amd","w,r","",""
> +"VPINSRB xmm1, xmmV, r32/m8, imm8u","VPINSRB imm8u, r32/m8, xmmV,
> xmm1","vpinsrb imm8u, r32/m8, xmmV, xmm1","EVEX.NDS.128.66.0F3A.WIG 20
> /r ib","V","V","AVX512BW+AVX512VL","scale1","w,r,r,r","",""
> +"VPINSRB xmm1, xmmV, r32/m8, imm8u","VPINSRB imm8u, r32/m8, xmmV,
> xmm1","vpinsrb imm8u, r32/m8, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 20
> /r ib","V","V","AVX","","w,r,r,r","",""
> +"VPINSRD xmm1, xmmV, r/m32, imm8u","VPINSRD imm8u, r/m32, xmmV,
> xmm1","vpinsrd imm8u, r/m32, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W0 22
> /r ib","V","V","AVX512DQ+AVX512VL","scale4","w,r,r,r","",""
> +"VPINSRD xmm1, xmmV, r/m32, imm8u","VPINSRD imm8u, r/m32, xmmV, xmm1","vpinsrd imm8u, r/m32, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 22 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VPINSRQ xmm1, xmmV, r/m64, imm8u","VPINSRQ imm8u, r/m64, xmmV,
> xmm1","vpinsrq imm8u, r/m64, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W1 22
> /r ib","N.S.","V","AVX512DQ+AVX512VL","scale8","w,r,r,r","",""
> +"VPINSRQ xmm1, xmmV, r/m64, imm8u","VPINSRQ imm8u, r/m64, xmmV,
> xmm1","vpinsrq imm8u, r/m64, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 22 /r
> ib","N.S.","V","AVX","","w,r,r,r","",""
> +"VPINSRW xmm1, xmmV, r32/m16, imm8u","VPINSRW imm8u, r32/m16, xmmV,
> xmm1","vpinsrw imm8u, r32/m16, xmmV, xmm1","EVEX.NDS.128.66.0F.WIG C4
> /r ib","V","V","AVX512BW+AVX512VL","scale2","w,r,r,r","",""
> +"VPINSRW xmm1, xmmV, r32/m16, imm8u","VPINSRW imm8u, r32/m16, xmmV,
> xmm1","vpinsrw imm8u, r32/m16, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C4
> /r ib","V","V","AVX","","w,r,r,r","",""
> +"VPLZCNTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPLZCNTD
> xmm2/m128/m32bcst, {k}{z}, xmm1","vplzcntd xmm2/m128/m32bcst, {k}{z},
> xmm1","EVEX.128.66.0F38.W0 44
> /r","V","V","AVX512CD+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VPLZCNTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPLZCNTD
> ymm2/m256/m32bcst, {k}{z}, ymm1","vplzcntd ymm2/m256/m32bcst, {k}{z},
> ymm1","EVEX.256.66.0F38.W0 44
> /r","V","V","AVX512CD+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VPLZCNTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPLZCNTD
> zmm2/m512/m32bcst, {k}{z}, zmm1","vplzcntd zmm2/m512/m32bcst, {k}{z},
> zmm1","EVEX.512.66.0F38.W0 44
> /r","V","V","AVX512CD","bscale4,scale64","w,r,r","",""
> +"VPLZCNTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPLZCNTQ
> xmm2/m128/m64bcst, {k}{z}, xmm1","vplzcntq xmm2/m128/m64bcst, {k}{z},
> xmm1","EVEX.128.66.0F38.W1 44
> /r","V","V","AVX512CD+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VPLZCNTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPLZCNTQ
> ymm2/m256/m64bcst, {k}{z}, ymm1","vplzcntq ymm2/m256/m64bcst, {k}{z},
> ymm1","EVEX.256.66.0F38.W1 44
> /r","V","V","AVX512CD+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VPLZCNTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPLZCNTQ
> zmm2/m512/m64bcst, {k}{z}, zmm1","vplzcntq zmm2/m512/m64bcst, {k}{z},
> zmm1","EVEX.512.66.0F38.W1 44
> /r","V","V","AVX512CD","bscale8,scale64","w,r,r","",""
> +"VPMACSDD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDD xmmIH, xmm2/m128,
> xmmV, xmm1","vpmacsdd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0
> 9E /r /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPMACSDQH xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDQH xmmIH, xmm2/m128,
> xmmV, xmm1","vpmacsdqh xmmIH, xmm2/m128, xmmV,
> xmm1","XOP.NDS.128.08.W0 9F /r
> /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPMACSDQL xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDQL xmmIH, xmm2/m128,
> xmmV, xmm1","vpmacsdql xmmIH, xmm2/m128, xmmV,
> xmm1","XOP.NDS.128.08.W0 97 /r
> /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPMACSSDD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDD xmmIH, xmm2/m128,
> xmmV, xmm1","vpmacssdd xmmIH, xmm2/m128, xmmV,
> xmm1","XOP.NDS.128.08.W0 8E /r
> /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPMACSSDQH xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDQH xmmIH,
> xmm2/m128, xmmV, xmm1","vpmacssdqh xmmIH, xmm2/m128, xmmV,
> xmm1","XOP.NDS.128.08.W0 8F /r
> /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPMACSSDQL xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDQL xmmIH,
> xmm2/m128, xmmV, xmm1","vpmacssdql xmmIH, xmm2/m128, xmmV,
> xmm1","XOP.NDS.128.08.W0 87 /r
> /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPMACSSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSWD xmmIH, xmm2/m128,
> xmmV, xmm1","vpmacsswd xmmIH, xmm2/m128, xmmV,
> xmm1","XOP.NDS.128.08.W0 86 /r
> /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPMACSSWW xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSWW xmmIH, xmm2/m128,
> xmmV, xmm1","vpmacssww xmmIH, xmm2/m128, xmmV,
> xmm1","XOP.NDS.128.08.W0 85 /r
> /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPMACSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSWD xmmIH, xmm2/m128,
> xmmV, xmm1","vpmacswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0
> 96 /r /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPMACSWW xmm1, xmmV, xmm2/m128, xmmIH","VPMACSWW xmmIH, xmm2/m128,
> xmmV, xmm1","vpmacsww xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0
> 95 /r /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPMADCSSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMADCSSWD xmmIH,
> xmm2/m128, xmmV, xmm1","vpmadcsswd xmmIH, xmm2/m128, xmmV,
> xmm1","XOP.NDS.128.08.W0 A6 /r
> /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPMADCSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMADCSWD xmmIH, xmm2/m128,
> xmmV, xmm1","vpmadcswd xmmIH, xmm2/m128, xmmV,
> xmm1","XOP.NDS.128.08.W0 B6 /r
> /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPMADD52HUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMADD52HUQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmadd52huq xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B5
> /r","V","V","AVX512_IFMA+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VPMADD52HUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMADD52HUQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmadd52huq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B5
> /r","V","V","AVX512_IFMA+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VPMADD52HUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMADD52HUQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmadd52huq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B5
> /r","V","V","AVX512_IFMA","bscale8,scale64","rw,r,r,r","",""
> +"VPMADD52LUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMADD52LUQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmadd52luq xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B4
> /r","V","V","AVX512_IFMA+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VPMADD52LUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMADD52LUQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmadd52luq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B4
> /r","V","V","AVX512_IFMA+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VPMADD52LUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMADD52LUQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmadd52luq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B4
> /r","V","V","AVX512_IFMA","bscale8,scale64","rw,r,r,r","",""
> +"VPMADDUBSW xmm1, xmmV, xmm2/m128","VPMADDUBSW xmm2/m128, xmmV, xmm1","vpmaddubsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 04 /r","V","V","AVX","","w,r,r","",""
> +"VPMADDUBSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMADDUBSW xmm2/m128,
> xmmV, {k}{z}, xmm1","vpmaddubsw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F38.WIG 04
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMADDUBSW ymm1, ymmV, ymm2/m256","VPMADDUBSW ymm2/m256, ymmV, ymm1","vpmaddubsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 04 /r","V","V","AVX2","","w,r,r","",""
> +"VPMADDUBSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMADDUBSW ymm2/m256,
> ymmV, {k}{z}, ymm1","vpmaddubsw ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F38.WIG 04
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMADDUBSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMADDUBSW zmm2/m512,
> zmmV, {k}{z}, zmm1","vpmaddubsw zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F38.WIG 04
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMADDWD xmm1, xmmV, xmm2/m128","VPMADDWD xmm2/m128, xmmV, xmm1","vpmaddwd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F5 /r","V","V","AVX","","w,r,r","",""
> +"VPMADDWD xmm1, {k}{z}, xmmV, xmm2/m128","VPMADDWD xmm2/m128, xmmV,
> {k}{z}, xmm1","vpmaddwd xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG F5
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMADDWD ymm1, ymmV, ymm2/m256","VPMADDWD ymm2/m256, ymmV, ymm1","vpmaddwd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F5 /r","V","V","AVX2","","w,r,r","",""
> +"VPMADDWD ymm1, {k}{z}, ymmV, ymm2/m256","VPMADDWD ymm2/m256, ymmV,
> {k}{z}, ymm1","vpmaddwd ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG F5
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMADDWD zmm1, {k}{z}, zmmV, zmm2/m512","VPMADDWD zmm2/m512, zmmV,
> {k}{z}, zmm1","vpmaddwd zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG F5
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMASKMOVD xmm1, xmmV, m128","VPMASKMOVD m128, xmmV, xmm1","vpmaskmovd m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 8C /r","V","V","AVX2","modrm_memonly","w,r,r","",""
> +"VPMASKMOVD ymm1, ymmV, m256","VPMASKMOVD m256, ymmV, ymm1","vpmaskmovd m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 8C /r","V","V","AVX2","modrm_memonly","w,r,r","",""
> +"VPMASKMOVD m128, xmmV, xmm1","VPMASKMOVD xmm1, xmmV, m128","vpmaskmovd xmm1, xmmV, m128","VEX.NDS.128.66.0F38.W0 8E /r","V","V","AVX2","modrm_memonly","w,r,r","",""
> +"VPMASKMOVD m256, ymmV, ymm1","VPMASKMOVD ymm1, ymmV, m256","vpmaskmovd ymm1, ymmV, m256","VEX.NDS.256.66.0F38.W0 8E /r","V","V","AVX2","modrm_memonly","w,r,r","",""
> +"VPMASKMOVQ xmm1, xmmV, m128","VPMASKMOVQ m128, xmmV, xmm1","vpmaskmovq m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 8C /r","V","V","AVX2","modrm_memonly","w,r,r","",""
> +"VPMASKMOVQ ymm1, ymmV, m256","VPMASKMOVQ m256, ymmV, ymm1","vpmaskmovq m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 8C /r","V","V","AVX2","modrm_memonly","w,r,r","",""
> +"VPMASKMOVQ m128, xmmV, xmm1","VPMASKMOVQ xmm1, xmmV, m128","vpmaskmovq xmm1, xmmV, m128","VEX.NDS.128.66.0F38.W1 8E /r","V","V","AVX2","modrm_memonly","w,r,r","",""
> +"VPMASKMOVQ m256, ymmV, ymm1","VPMASKMOVQ ymm1, ymmV, m256","vpmaskmovq ymm1, ymmV, m256","VEX.NDS.256.66.0F38.W1 8E /r","V","V","AVX2","modrm_memonly","w,r,r","",""
> +"VPMAXSB xmm1, xmmV, xmm2/m128","VPMAXSB xmm2/m128, xmmV, xmm1","vpmaxsb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3C /r","V","V","AVX","","w,r,r","",""
> +"VPMAXSB xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXSB xmm2/m128, xmmV,
> {k}{z}, xmm1","vpmaxsb xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F38.WIG 3C
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMAXSB ymm1, ymmV, ymm2/m256","VPMAXSB ymm2/m256, ymmV, ymm1","vpmaxsb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3C /r","V","V","AVX2","","w,r,r","",""
> +"VPMAXSB ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXSB ymm2/m256, ymmV,
> {k}{z}, ymm1","vpmaxsb ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F38.WIG 3C
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMAXSB zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXSB zmm2/m512, zmmV,
> {k}{z}, zmm1","vpmaxsb zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F38.WIG 3C
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMAXSD xmm1, xmmV, xmm2/m128","VPMAXSD xmm2/m128, xmmV, xmm1","vpmaxsd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3D /r","V","V","AVX","","w,r,r","",""
> +"VPMAXSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMAXSD
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpmaxsd xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 3D
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPMAXSD ymm1, ymmV, ymm2/m256","VPMAXSD ymm2/m256, ymmV, ymm1","vpmaxsd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3D /r","V","V","AVX2","","w,r,r","",""
> +"VPMAXSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMAXSD
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpmaxsd ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 3D
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPMAXSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMAXSD
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpmaxsd zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 3D
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPMAXSQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMAXSQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmaxsq xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 3D
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPMAXSQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMAXSQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmaxsq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 3D
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPMAXSQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMAXSQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmaxsq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 3D
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPMAXSW xmm1, xmmV, xmm2/m128","VPMAXSW xmm2/m128, xmmV, xmm1","vpmaxsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EE /r","V","V","AVX","","w,r,r","",""
> +"VPMAXSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXSW xmm2/m128, xmmV,
> {k}{z}, xmm1","vpmaxsw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG EE
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMAXSW ymm1, ymmV, ymm2/m256","VPMAXSW ymm2/m256, ymmV, ymm1","vpmaxsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EE /r","V","V","AVX2","","w,r,r","",""
> +"VPMAXSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXSW ymm2/m256, ymmV,
> {k}{z}, ymm1","vpmaxsw ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG EE
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMAXSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXSW zmm2/m512, zmmV,
> {k}{z}, zmm1","vpmaxsw zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG EE
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMAXUB xmm1, xmmV, xmm2/m128","VPMAXUB xmm2/m128, xmmV, xmm1","vpmaxub xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DE /r","V","V","AVX","","w,r,r","",""
> +"VPMAXUB xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXUB xmm2/m128, xmmV, {k}{z}, xmm1","vpmaxub xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DE /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMAXUB ymm1, ymmV, ymm2/m256","VPMAXUB ymm2/m256, ymmV, ymm1","vpmaxub ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DE /r","V","V","AVX2","","w,r,r","",""
> +"VPMAXUB ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXUB ymm2/m256, ymmV, {k}{z}, ymm1","vpmaxub ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DE /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMAXUB zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXUB zmm2/m512, zmmV, {k}{z}, zmm1","vpmaxub zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DE /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMAXUD xmm1, xmmV, xmm2/m128","VPMAXUD xmm2/m128, xmmV, xmm1","vpmaxud xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3F /r","V","V","AVX","","w,r,r","",""
> +"VPMAXUD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMAXUD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpmaxud xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 3F /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPMAXUD ymm1, ymmV, ymm2/m256","VPMAXUD ymm2/m256, ymmV, ymm1","vpmaxud ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3F /r","V","V","AVX2","","w,r,r","",""
> +"VPMAXUD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMAXUD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpmaxud ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 3F /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPMAXUD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMAXUD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpmaxud zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 3F /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPMAXUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMAXUQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmaxuq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 3F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPMAXUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMAXUQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmaxuq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 3F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPMAXUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMAXUQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmaxuq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 3F /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPMAXUW xmm1, xmmV, xmm2/m128","VPMAXUW xmm2/m128, xmmV, xmm1","vpmaxuw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3E /r","V","V","AVX","","w,r,r","",""
> +"VPMAXUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXUW xmm2/m128, xmmV, {k}{z}, xmm1","vpmaxuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3E /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMAXUW ymm1, ymmV, ymm2/m256","VPMAXUW ymm2/m256, ymmV, ymm1","vpmaxuw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3E /r","V","V","AVX2","","w,r,r","",""
> +"VPMAXUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXUW ymm2/m256, ymmV, {k}{z}, ymm1","vpmaxuw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3E /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMAXUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXUW zmm2/m512, zmmV, {k}{z}, zmm1","vpmaxuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3E /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMINSB xmm1, xmmV, xmm2/m128","VPMINSB xmm2/m128, xmmV, xmm1","vpminsb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 38 /r","V","V","AVX","","w,r,r","",""
> +"VPMINSB xmm1, {k}{z}, xmmV, xmm2/m128","VPMINSB xmm2/m128, xmmV, {k}{z}, xmm1","vpminsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 38 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMINSB ymm1, ymmV, ymm2/m256","VPMINSB ymm2/m256, ymmV, ymm1","vpminsb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 38 /r","V","V","AVX2","","w,r,r","",""
> +"VPMINSB ymm1, {k}{z}, ymmV, ymm2/m256","VPMINSB ymm2/m256, ymmV, {k}{z}, ymm1","vpminsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 38 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMINSB zmm1, {k}{z}, zmmV, zmm2/m512","VPMINSB zmm2/m512, zmmV, {k}{z}, zmm1","vpminsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 38 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMINSD xmm1, xmmV, xmm2/m128","VPMINSD xmm2/m128, xmmV, xmm1","vpminsd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 39 /r","V","V","AVX","","w,r,r","",""
> +"VPMINSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMINSD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpminsd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 39 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPMINSD ymm1, ymmV, ymm2/m256","VPMINSD ymm2/m256, ymmV, ymm1","vpminsd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 39 /r","V","V","AVX2","","w,r,r","",""
> +"VPMINSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMINSD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpminsd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 39 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPMINSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMINSD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpminsd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 39 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPMINSQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMINSQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpminsq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 39 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPMINSQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMINSQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpminsq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 39 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPMINSQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMINSQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpminsq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 39 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPMINSW xmm1, xmmV, xmm2/m128","VPMINSW xmm2/m128, xmmV, xmm1","vpminsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EA /r","V","V","AVX","","w,r,r","",""
> +"VPMINSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMINSW xmm2/m128, xmmV, {k}{z}, xmm1","vpminsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EA /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMINSW ymm1, ymmV, ymm2/m256","VPMINSW ymm2/m256, ymmV, ymm1","vpminsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EA /r","V","V","AVX2","","w,r,r","",""
> +"VPMINSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMINSW ymm2/m256, ymmV, {k}{z}, ymm1","vpminsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EA /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMINSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMINSW zmm2/m512, zmmV, {k}{z}, zmm1","vpminsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EA /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMINUB xmm1, xmmV, xmm2/m128","VPMINUB xmm2/m128, xmmV, xmm1","vpminub xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DA /r","V","V","AVX","","w,r,r","",""
> +"VPMINUB xmm1, {k}{z}, xmmV, xmm2/m128","VPMINUB xmm2/m128, xmmV, {k}{z}, xmm1","vpminub xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DA /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMINUB ymm1, ymmV, ymm2/m256","VPMINUB ymm2/m256, ymmV, ymm1","vpminub ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DA /r","V","V","AVX2","","w,r,r","",""
> +"VPMINUB ymm1, {k}{z}, ymmV, ymm2/m256","VPMINUB ymm2/m256, ymmV, {k}{z}, ymm1","vpminub ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DA /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMINUB zmm1, {k}{z}, zmmV, zmm2/m512","VPMINUB zmm2/m512, zmmV, {k}{z}, zmm1","vpminub zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DA /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMINUD xmm1, xmmV, xmm2/m128","VPMINUD xmm2/m128, xmmV, xmm1","vpminud xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3B /r","V","V","AVX","","w,r,r","",""
> +"VPMINUD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMINUD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpminud xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 3B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPMINUD ymm1, ymmV, ymm2/m256","VPMINUD ymm2/m256, ymmV, ymm1","vpminud ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3B /r","V","V","AVX2","","w,r,r","",""
> +"VPMINUD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMINUD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpminud ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 3B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPMINUD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMINUD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpminud zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 3B /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPMINUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMINUQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpminuq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 3B /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPMINUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMINUQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpminuq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 3B /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPMINUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMINUQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpminuq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 3B /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPMINUW xmm1, xmmV, xmm2/m128","VPMINUW xmm2/m128, xmmV, xmm1","vpminuw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3A /r","V","V","AVX","","w,r,r","",""
> +"VPMINUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMINUW xmm2/m128, xmmV, {k}{z}, xmm1","vpminuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3A /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMINUW ymm1, ymmV, ymm2/m256","VPMINUW ymm2/m256, ymmV, ymm1","vpminuw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3A /r","V","V","AVX2","","w,r,r","",""
> +"VPMINUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMINUW ymm2/m256, ymmV, {k}{z}, ymm1","vpminuw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3A /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMINUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMINUW zmm2/m512, zmmV, {k}{z}, zmm1","vpminuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3A /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMOVB2M k1, xmm2","VPMOVB2M xmm2, k1","vpmovb2m xmm2, k1","EVEX.128.F3.0F38.W0 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVB2M k1, ymm2","VPMOVB2M ymm2, k1","vpmovb2m ymm2, k1","EVEX.256.F3.0F38.W0 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVB2M k1, zmm2","VPMOVB2M zmm2, k1","vpmovb2m zmm2, k1","EVEX.512.F3.0F38.W0 29 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
> +"VPMOVD2M k1, xmm2","VPMOVD2M xmm2, k1","vpmovd2m xmm2, k1","EVEX.128.F3.0F38.W0 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVD2M k1, ymm2","VPMOVD2M ymm2, k1","vpmovd2m ymm2, k1","EVEX.256.F3.0F38.W0 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVD2M k1, zmm2","VPMOVD2M zmm2, k1","vpmovd2m zmm2, k1","EVEX.512.F3.0F38.W0 39 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
> +"VPMOVDB xmm2/m32, {k}{z}, xmm1","VPMOVDB xmm1, {k}{z}, xmm2/m32","vpmovdb xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 31 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVDB xmm2/m64, {k}{z}, ymm1","VPMOVDB ymm1, {k}{z}, xmm2/m64","vpmovdb ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 31 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVDB xmm2/m128, {k}{z}, zmm1","VPMOVDB zmm1, {k}{z}, xmm2/m128","vpmovdb zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 31 /r","V","V","AVX512F","scale16","w,r,r","",""
> +"VPMOVDW xmm2/m64, {k}{z}, xmm1","VPMOVDW xmm1, {k}{z}, xmm2/m64","vpmovdw xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 33 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVDW xmm2/m128, {k}{z}, ymm1","VPMOVDW ymm1, {k}{z}, xmm2/m128","vpmovdw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 33 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VPMOVDW ymm2/m256, {k}{z}, zmm1","VPMOVDW zmm1, {k}{z}, ymm2/m256","vpmovdw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 33 /r","V","V","AVX512F","scale32","w,r,r","",""
> +"VPMOVM2B xmm1, k2","VPMOVM2B k2, xmm1","vpmovm2b k2, xmm1","EVEX.128.F3.0F38.W0 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVM2B ymm1, k2","VPMOVM2B k2, ymm1","vpmovm2b k2, ymm1","EVEX.256.F3.0F38.W0 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVM2B zmm1, k2","VPMOVM2B k2, zmm1","vpmovm2b k2, zmm1","EVEX.512.F3.0F38.W0 28 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
> +"VPMOVM2D xmm1, k2","VPMOVM2D k2, xmm1","vpmovm2d k2, xmm1","EVEX.128.F3.0F38.W0 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVM2D ymm1, k2","VPMOVM2D k2, ymm1","vpmovm2d k2, ymm1","EVEX.256.F3.0F38.W0 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVM2D zmm1, k2","VPMOVM2D k2, zmm1","vpmovm2d k2, zmm1","EVEX.512.F3.0F38.W0 38 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
> +"VPMOVM2Q xmm1, k2","VPMOVM2Q k2, xmm1","vpmovm2q k2, xmm1","EVEX.128.F3.0F38.W1 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVM2Q ymm1, k2","VPMOVM2Q k2, ymm1","vpmovm2q k2, ymm1","EVEX.256.F3.0F38.W1 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVM2Q zmm1, k2","VPMOVM2Q k2, zmm1","vpmovm2q k2, zmm1","EVEX.512.F3.0F38.W1 38 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
> +"VPMOVM2W xmm1, k2","VPMOVM2W k2, xmm1","vpmovm2w k2, xmm1","EVEX.128.F3.0F38.W1 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVM2W ymm1, k2","VPMOVM2W k2, ymm1","vpmovm2w k2, ymm1","EVEX.256.F3.0F38.W1 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVM2W zmm1, k2","VPMOVM2W k2, zmm1","vpmovm2w k2, zmm1","EVEX.512.F3.0F38.W1 28 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
> +"VPMOVMSKB r32, xmm2","VPMOVMSKB xmm2, r32","vpmovmskb xmm2, r32","VEX.128.66.0F.WIG D7 /r","V","V","AVX","modrm_regonly","w,r","",""
> +"VPMOVMSKB r32, ymm2","VPMOVMSKB ymm2, r32","vpmovmskb ymm2, r32","VEX.256.66.0F.WIG D7 /r","V","V","AVX2","modrm_regonly","w,r","",""
> +"VPMOVQ2M k1, xmm2","VPMOVQ2M xmm2, k1","vpmovq2m xmm2, k1","EVEX.128.F3.0F38.W1 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVQ2M k1, ymm2","VPMOVQ2M ymm2, k1","vpmovq2m ymm2, k1","EVEX.256.F3.0F38.W1 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVQ2M k1, zmm2","VPMOVQ2M zmm2, k1","vpmovq2m zmm2, k1","EVEX.512.F3.0F38.W1 39 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
> +"VPMOVQB xmm2/m16, {k}{z}, xmm1","VPMOVQB xmm1, {k}{z}, xmm2/m16","vpmovqb xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 32 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
> +"VPMOVQB xmm2/m32, {k}{z}, ymm1","VPMOVQB ymm1, {k}{z}, xmm2/m32","vpmovqb ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 32 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVQB xmm2/m64, {k}{z}, zmm1","VPMOVQB zmm1, {k}{z}, xmm2/m64","vpmovqb zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 32 /r","V","V","AVX512F","scale8","w,r,r","",""
> +"VPMOVQD xmm2/m64, {k}{z}, xmm1","VPMOVQD xmm1, {k}{z}, xmm2/m64","vpmovqd xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 35 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVQD xmm2/m128, {k}{z}, ymm1","VPMOVQD ymm1, {k}{z}, xmm2/m128","vpmovqd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 35 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VPMOVQD ymm2/m256, {k}{z}, zmm1","VPMOVQD zmm1, {k}{z}, ymm2/m256","vpmovqd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 35 /r","V","V","AVX512F","scale32","w,r,r","",""
> +"VPMOVQW xmm2/m32, {k}{z}, xmm1","VPMOVQW xmm1, {k}{z}, xmm2/m32","vpmovqw xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 34 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVQW xmm2/m64, {k}{z}, ymm1","VPMOVQW ymm1, {k}{z}, xmm2/m64","vpmovqw ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 34 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVQW xmm2/m128, {k}{z}, zmm1","VPMOVQW zmm1, {k}{z}, xmm2/m128","vpmovqw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 34 /r","V","V","AVX512F","scale16","w,r,r","",""
> +"VPMOVSDB xmm2/m32, {k}{z}, xmm1","VPMOVSDB xmm1, {k}{z}, xmm2/m32","vpmovsdb xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 21 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVSDB xmm2/m64, {k}{z}, ymm1","VPMOVSDB ymm1, {k}{z}, xmm2/m64","vpmovsdb ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 21 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVSDB xmm2/m128, {k}{z}, zmm1","VPMOVSDB zmm1, {k}{z}, xmm2/m128","vpmovsdb zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 21 /r","V","V","AVX512F","scale16","w,r,r","",""
> +"VPMOVSDW xmm2/m64, {k}{z}, xmm1","VPMOVSDW xmm1, {k}{z}, xmm2/m64","vpmovsdw xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 23 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVSDW xmm2/m128, {k}{z}, ymm1","VPMOVSDW ymm1, {k}{z}, xmm2/m128","vpmovsdw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 23 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VPMOVSDW ymm2/m256, {k}{z}, zmm1","VPMOVSDW zmm1, {k}{z}, ymm2/m256","vpmovsdw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 23 /r","V","V","AVX512F","scale32","w,r,r","",""
> +"VPMOVSQB xmm2/m16, {k}{z}, xmm1","VPMOVSQB xmm1, {k}{z}, xmm2/m16","vpmovsqb xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 22 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
> +"VPMOVSQB xmm2/m32, {k}{z}, ymm1","VPMOVSQB ymm1, {k}{z}, xmm2/m32","vpmovsqb ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 22 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVSQB xmm2/m64, {k}{z}, zmm1","VPMOVSQB zmm1, {k}{z}, xmm2/m64","vpmovsqb zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 22 /r","V","V","AVX512F","scale8","w,r,r","",""
> +"VPMOVSQD xmm2/m64, {k}{z}, xmm1","VPMOVSQD xmm1, {k}{z}, xmm2/m64","vpmovsqd xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 25 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVSQD xmm2/m128, {k}{z}, ymm1","VPMOVSQD ymm1, {k}{z}, xmm2/m128","vpmovsqd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 25 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VPMOVSQD ymm2/m256, {k}{z}, zmm1","VPMOVSQD zmm1, {k}{z}, ymm2/m256","vpmovsqd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 25 /r","V","V","AVX512F","scale32","w,r,r","",""
> +"VPMOVSQW xmm2/m32, {k}{z}, xmm1","VPMOVSQW xmm1, {k}{z}, xmm2/m32","vpmovsqw xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 24 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVSQW xmm2/m64, {k}{z}, ymm1","VPMOVSQW ymm1, {k}{z}, xmm2/m64","vpmovsqw ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 24 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVSQW xmm2/m128, {k}{z}, zmm1","VPMOVSQW zmm1, {k}{z}, xmm2/m128","vpmovsqw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 24 /r","V","V","AVX512F","scale16","w,r,r","",""
> +"VPMOVSWB xmm2/m64, {k}{z}, xmm1","VPMOVSWB xmm1, {k}{z}, xmm2/m64","vpmovswb xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 20 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
> +"VPMOVSWB xmm2/m128, {k}{z}, ymm1","VPMOVSWB ymm1, {k}{z}, xmm2/m128","vpmovswb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 20 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
> +"VPMOVSWB ymm2/m256, {k}{z}, zmm1","VPMOVSWB zmm1, {k}{z}, ymm2/m256","vpmovswb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 20 /r","V","V","AVX512BW","scale32","w,r,r","",""
> +"VPMOVSXBD zmm1, {k}{z}, xmm2/m128","VPMOVSXBD xmm2/m128, {k}{z}, zmm1","vpmovsxbd xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 21 /r","V","V","AVX512F","scale16","w,r,r","",""
> +"VPMOVSXBD xmm1, xmm2/m32","VPMOVSXBD xmm2/m32, xmm1","vpmovsxbd xmm2/m32, xmm1","VEX.128.66.0F38.WIG 21 /r","V","V","AVX","","w,r","",""
> +"VPMOVSXBD xmm1, {k}{z}, xmm2/m32","VPMOVSXBD xmm2/m32, {k}{z}, xmm1","vpmovsxbd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 21 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVSXBD ymm1, xmm2/m64","VPMOVSXBD xmm2/m64, ymm1","vpmovsxbd xmm2/m64, ymm1","VEX.256.66.0F38.WIG 21 /r","V","V","AVX2","","w,r","",""
> +"VPMOVSXBD ymm1, {k}{z}, xmm2/m64","VPMOVSXBD xmm2/m64, {k}{z}, ymm1","vpmovsxbd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 21 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVSXBQ xmm1, xmm2/m16","VPMOVSXBQ xmm2/m16, xmm1","vpmovsxbq xmm2/m16, xmm1","VEX.128.66.0F38.WIG 22 /r","V","V","AVX","","w,r","",""
> +"VPMOVSXBQ xmm1, {k}{z}, xmm2/m16","VPMOVSXBQ xmm2/m16, {k}{z}, xmm1","vpmovsxbq xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 22 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
> +"VPMOVSXBQ ymm1, xmm2/m32","VPMOVSXBQ xmm2/m32, ymm1","vpmovsxbq xmm2/m32, ymm1","VEX.256.66.0F38.WIG 22 /r","V","V","AVX2","","w,r","",""
> +"VPMOVSXBQ ymm1, {k}{z}, xmm2/m32","VPMOVSXBQ xmm2/m32, {k}{z}, ymm1","vpmovsxbq xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 22 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVSXBQ zmm1, {k}{z}, xmm2/m64","VPMOVSXBQ xmm2/m64, {k}{z}, zmm1","vpmovsxbq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 22 /r","V","V","AVX512F","scale8","w,r,r","",""
> +"VPMOVSXBW ymm1, xmm2/m128","VPMOVSXBW xmm2/m128, ymm1","vpmovsxbw xmm2/m128, ymm1","VEX.256.66.0F38.WIG 20 /r","V","V","AVX2","","w,r","",""
> +"VPMOVSXBW ymm1, {k}{z}, xmm2/m128","VPMOVSXBW xmm2/m128, {k}{z}, ymm1","vpmovsxbw xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 20 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
> +"VPMOVSXBW xmm1, xmm2/m64","VPMOVSXBW xmm2/m64, xmm1","vpmovsxbw xmm2/m64, xmm1","VEX.128.66.0F38.WIG 20 /r","V","V","AVX","","w,r","",""
> +"VPMOVSXBW xmm1, {k}{z}, xmm2/m64","VPMOVSXBW xmm2/m64, {k}{z}, xmm1","vpmovsxbw xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 20 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
> +"VPMOVSXBW zmm1, {k}{z}, ymm2/m256","VPMOVSXBW ymm2/m256, {k}{z}, zmm1","vpmovsxbw ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 20 /r","V","V","AVX512BW","scale32","w,r,r","",""
> +"VPMOVSXDQ ymm1, xmm2/m128","VPMOVSXDQ xmm2/m128, ymm1","vpmovsxdq xmm2/m128, ymm1","VEX.256.66.0F38.WIG 25 /r","V","V","AVX2","","w,r","",""
> +"VPMOVSXDQ ymm1, {k}{z}, xmm2/m128","VPMOVSXDQ xmm2/m128, {k}{z}, ymm1","vpmovsxdq xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 25 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VPMOVSXDQ xmm1, xmm2/m64","VPMOVSXDQ xmm2/m64, xmm1","vpmovsxdq xmm2/m64, xmm1","VEX.128.66.0F38.WIG 25 /r","V","V","AVX","","w,r","",""
> +"VPMOVSXDQ xmm1, {k}{z}, xmm2/m64","VPMOVSXDQ xmm2/m64, {k}{z}, xmm1","vpmovsxdq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 25 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVSXDQ zmm1, {k}{z}, ymm2/m256","VPMOVSXDQ ymm2/m256, {k}{z}, zmm1","vpmovsxdq ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 25 /r","V","V","AVX512F","scale32","w,r,r","",""
> +"VPMOVSXWD ymm1, xmm2/m128","VPMOVSXWD xmm2/m128, ymm1","vpmovsxwd xmm2/m128, ymm1","VEX.256.66.0F38.WIG 23 /r","V","V","AVX2","","w,r","",""
> +"VPMOVSXWD ymm1, {k}{z}, xmm2/m128","VPMOVSXWD xmm2/m128, {k}{z}, ymm1","vpmovsxwd xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 23 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VPMOVSXWD xmm1, xmm2/m64","VPMOVSXWD xmm2/m64, xmm1","vpmovsxwd xmm2/m64, xmm1","VEX.128.66.0F38.WIG 23 /r","V","V","AVX","","w,r","",""
> +"VPMOVSXWD xmm1, {k}{z}, xmm2/m64","VPMOVSXWD xmm2/m64, {k}{z}, xmm1","vpmovsxwd xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 23 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVSXWD zmm1, {k}{z}, ymm2/m256","VPMOVSXWD ymm2/m256, {k}{z}, zmm1","vpmovsxwd ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 23 /r","V","V","AVX512F","scale32","w,r,r","",""
> +"VPMOVSXWQ zmm1, {k}{z}, xmm2/m128","VPMOVSXWQ xmm2/m128, {k}{z}, zmm1","vpmovsxwq xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 24 /r","V","V","AVX512F","scale16","w,r,r","",""
> +"VPMOVSXWQ xmm1, xmm2/m32","VPMOVSXWQ xmm2/m32, xmm1","vpmovsxwq xmm2/m32, xmm1","VEX.128.66.0F38.WIG 24 /r","V","V","AVX","","w,r","",""
> +"VPMOVSXWQ xmm1, {k}{z}, xmm2/m32","VPMOVSXWQ xmm2/m32, {k}{z}, xmm1","vpmovsxwq xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 24 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVSXWQ ymm1, xmm2/m64","VPMOVSXWQ xmm2/m64, ymm1","vpmovsxwq xmm2/m64, ymm1","VEX.256.66.0F38.WIG 24 /r","V","V","AVX2","","w,r","",""
> +"VPMOVSXWQ ymm1, {k}{z}, xmm2/m64","VPMOVSXWQ xmm2/m64, {k}{z}, ymm1","vpmovsxwq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 24 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVUSDB xmm2/m32, {k}{z}, xmm1","VPMOVUSDB xmm1, {k}{z}, xmm2/m32","vpmovusdb xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 11 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVUSDB xmm2/m64, {k}{z}, ymm1","VPMOVUSDB ymm1, {k}{z}, xmm2/m64","vpmovusdb ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 11 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVUSDB xmm2/m128, {k}{z}, zmm1","VPMOVUSDB zmm1, {k}{z}, xmm2/m128","vpmovusdb zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 11 /r","V","V","AVX512F","scale16","w,r,r","",""
> +"VPMOVUSDW xmm2/m64, {k}{z}, xmm1","VPMOVUSDW xmm1, {k}{z}, xmm2/m64","vpmovusdw xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 13 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVUSDW xmm2/m128, {k}{z}, ymm1","VPMOVUSDW ymm1, {k}{z}, xmm2/m128","vpmovusdw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 13 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VPMOVUSDW ymm2/m256, {k}{z}, zmm1","VPMOVUSDW zmm1, {k}{z}, ymm2/m256","vpmovusdw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 13 /r","V","V","AVX512F","scale32","w,r,r","",""
> +"VPMOVUSQB xmm2/m16, {k}{z}, xmm1","VPMOVUSQB xmm1, {k}{z}, xmm2/m16","vpmovusqb xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 12 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
> +"VPMOVUSQB xmm2/m32, {k}{z}, ymm1","VPMOVUSQB ymm1, {k}{z}, xmm2/m32","vpmovusqb ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 12 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVUSQB xmm2/m64, {k}{z}, zmm1","VPMOVUSQB zmm1, {k}{z}, xmm2/m64","vpmovusqb zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 12 /r","V","V","AVX512F","scale8","w,r,r","",""
> +"VPMOVUSQD xmm2/m64, {k}{z}, xmm1","VPMOVUSQD xmm1, {k}{z}, xmm2/m64","vpmovusqd xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVUSQD xmm2/m128, {k}{z}, ymm1","VPMOVUSQD ymm1, {k}{z}, xmm2/m128","vpmovusqd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VPMOVUSQD ymm2/m256, {k}{z}, zmm1","VPMOVUSQD zmm1, {k}{z}, ymm2/m256","vpmovusqd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 15 /r","V","V","AVX512F","scale32","w,r,r","",""
> +"VPMOVUSQW xmm2/m32, {k}{z}, xmm1","VPMOVUSQW xmm1, {k}{z}, xmm2/m32","vpmovusqw xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVUSQW xmm2/m64, {k}{z}, ymm1","VPMOVUSQW ymm1, {k}{z}, xmm2/m64","vpmovusqw ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVUSQW xmm2/m128, {k}{z}, zmm1","VPMOVUSQW zmm1, {k}{z}, xmm2/m128","vpmovusqw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 14 /r","V","V","AVX512F","scale16","w,r,r","",""
> +"VPMOVUSWB xmm2/m64, {k}{z}, xmm1","VPMOVUSWB xmm1, {k}{z}, xmm2/m64","vpmovuswb xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 10 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
> +"VPMOVUSWB xmm2/m128, {k}{z}, ymm1","VPMOVUSWB ymm1, {k}{z}, xmm2/m128","vpmovuswb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 10 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
> +"VPMOVUSWB ymm2/m256, {k}{z}, zmm1","VPMOVUSWB zmm1, {k}{z}, ymm2/m256","vpmovuswb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 10 /r","V","V","AVX512BW","scale32","w,r,r","",""
> +"VPMOVW2M k1, xmm2","VPMOVW2M xmm2, k1","vpmovw2m xmm2, k1","EVEX.128.F3.0F38.W1 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVW2M k1, ymm2","VPMOVW2M ymm2, k1","vpmovw2m ymm2, k1","EVEX.256.F3.0F38.W1 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
> +"VPMOVW2M k1, zmm2","VPMOVW2M zmm2, k1","vpmovw2m zmm2, k1","EVEX.512.F3.0F38.W1 29 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
> +"VPMOVWB xmm2/m64, {k}{z}, xmm1","VPMOVWB xmm1, {k}{z}, xmm2/m64","vpmovwb xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 30 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
> +"VPMOVWB xmm2/m128, {k}{z}, ymm1","VPMOVWB ymm1, {k}{z}, xmm2/m128","vpmovwb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 30 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
> +"VPMOVWB ymm2/m256, {k}{z}, zmm1","VPMOVWB zmm1, {k}{z}, ymm2/m256","vpmovwb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 30 /r","V","V","AVX512BW","scale32","w,r,r","",""
> +"VPMOVZXBD zmm1, {k}{z}, xmm2/m128","VPMOVZXBD xmm2/m128, {k}{z}, zmm1","vpmovzxbd xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 31 /r","V","V","AVX512F","scale16","w,r,r","",""
> +"VPMOVZXBD xmm1, xmm2/m32","VPMOVZXBD xmm2/m32, xmm1","vpmovzxbd xmm2/m32, xmm1","VEX.128.66.0F38.WIG 31 /r","V","V","AVX","","w,r","",""
> +"VPMOVZXBD xmm1, {k}{z}, xmm2/m32","VPMOVZXBD xmm2/m32, {k}{z}, xmm1","vpmovzxbd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 31 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVZXBD ymm1, xmm2/m64","VPMOVZXBD xmm2/m64, ymm1","vpmovzxbd xmm2/m64, ymm1","VEX.256.66.0F38.WIG 31 /r","V","V","AVX2","","w,r","",""
> +"VPMOVZXBD ymm1, {k}{z}, xmm2/m64","VPMOVZXBD xmm2/m64, {k}{z}, ymm1","vpmovzxbd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 31 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVZXBQ xmm1, xmm2/m16","VPMOVZXBQ xmm2/m16, xmm1","vpmovzxbq xmm2/m16, xmm1","VEX.128.66.0F38.WIG 32 /r","V","V","AVX","","w,r","",""
> +"VPMOVZXBQ xmm1, {k}{z}, xmm2/m16","VPMOVZXBQ xmm2/m16, {k}{z}, xmm1","vpmovzxbq xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 32 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
> +"VPMOVZXBQ ymm1, xmm2/m32","VPMOVZXBQ xmm2/m32, ymm1","vpmovzxbq xmm2/m32, ymm1","VEX.256.66.0F38.WIG 32 /r","V","V","AVX2","","w,r","",""
> +"VPMOVZXBQ ymm1, {k}{z}, xmm2/m32","VPMOVZXBQ xmm2/m32, {k}{z}, ymm1","vpmovzxbq xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 32 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVZXBQ zmm1, {k}{z}, xmm2/m64","VPMOVZXBQ xmm2/m64, {k}{z}, zmm1","vpmovzxbq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 32 /r","V","V","AVX512F","scale8","w,r,r","",""
> +"VPMOVZXBW ymm1, xmm2/m128","VPMOVZXBW xmm2/m128, ymm1","vpmovzxbw xmm2/m128, ymm1","VEX.256.66.0F38.WIG 30 /r","V","V","AVX2","","w,r","",""
> +"VPMOVZXBW ymm1, {k}{z}, xmm2/m128","VPMOVZXBW xmm2/m128, {k}{z}, ymm1","vpmovzxbw xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 30 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
> +"VPMOVZXBW xmm1, xmm2/m64","VPMOVZXBW xmm2/m64, xmm1","vpmovzxbw xmm2/m64, xmm1","VEX.128.66.0F38.WIG 30 /r","V","V","AVX","","w,r","",""
> +"VPMOVZXBW xmm1, {k}{z}, xmm2/m64","VPMOVZXBW xmm2/m64, {k}{z}, xmm1","vpmovzxbw xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 30 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
> +"VPMOVZXBW zmm1, {k}{z}, ymm2/m256","VPMOVZXBW ymm2/m256, {k}{z}, zmm1","vpmovzxbw ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 30 /r","V","V","AVX512BW","scale32","w,r,r","",""
> +"VPMOVZXDQ ymm1, xmm2/m128","VPMOVZXDQ xmm2/m128, ymm1","vpmovzxdq xmm2/m128, ymm1","VEX.256.66.0F38.WIG 35 /r","V","V","AVX2","","w,r","",""
> +"VPMOVZXDQ ymm1, {k}{z}, xmm2/m128","VPMOVZXDQ xmm2/m128, {k}{z},
> ymm1","vpmovzxdq xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 35
> /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VPMOVZXDQ xmm1, xmm2/m64","VPMOVZXDQ xmm2/m64, xmm1","vpmovzxdq xmm2/m64, xmm1","VEX.128.66.0F38.WIG 35 /r","V","V","AVX","","w,r","",""
> +"VPMOVZXDQ xmm1, {k}{z}, xmm2/m64","VPMOVZXDQ xmm2/m64, {k}{z},
> xmm1","vpmovzxdq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 35
> /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVZXDQ zmm1, {k}{z}, ymm2/m256","VPMOVZXDQ ymm2/m256, {k}{z},
> zmm1","vpmovzxdq ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 35
> /r","V","V","AVX512F","scale32","w,r,r","",""
> +"VPMOVZXWD ymm1, xmm2/m128","VPMOVZXWD xmm2/m128, ymm1","vpmovzxwd xmm2/m128, ymm1","VEX.256.66.0F38.WIG 33 /r","V","V","AVX2","","w,r","",""
> +"VPMOVZXWD ymm1, {k}{z}, xmm2/m128","VPMOVZXWD xmm2/m128, {k}{z},
> ymm1","vpmovzxwd xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 33
> /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
> +"VPMOVZXWD xmm1, xmm2/m64","VPMOVZXWD xmm2/m64, xmm1","vpmovzxwd xmm2/m64, xmm1","VEX.128.66.0F38.WIG 33 /r","V","V","AVX","","w,r","",""
> +"VPMOVZXWD xmm1, {k}{z}, xmm2/m64","VPMOVZXWD xmm2/m64, {k}{z},
> xmm1","vpmovzxwd xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 33
> /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMOVZXWD zmm1, {k}{z}, ymm2/m256","VPMOVZXWD ymm2/m256, {k}{z},
> zmm1","vpmovzxwd ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 33
> /r","V","V","AVX512F","scale32","w,r,r","",""
> +"VPMOVZXWQ zmm1, {k}{z}, xmm2/m128","VPMOVZXWQ xmm2/m128, {k}{z},
> zmm1","vpmovzxwq xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 34
> /r","V","V","AVX512F","scale16","w,r,r","",""
> +"VPMOVZXWQ xmm1, xmm2/m32","VPMOVZXWQ xmm2/m32, xmm1","vpmovzxwq xmm2/m32, xmm1","VEX.128.66.0F38.WIG 34 /r","V","V","AVX","","w,r","",""
> +"VPMOVZXWQ xmm1, {k}{z}, xmm2/m32","VPMOVZXWQ xmm2/m32, {k}{z},
> xmm1","vpmovzxwq xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 34
> /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
> +"VPMOVZXWQ ymm1, xmm2/m64","VPMOVZXWQ xmm2/m64, ymm1","vpmovzxwq xmm2/m64, ymm1","VEX.256.66.0F38.WIG 34 /r","V","V","AVX2","","w,r","",""
> +"VPMOVZXWQ ymm1, {k}{z}, xmm2/m64","VPMOVZXWQ xmm2/m64, {k}{z},
> ymm1","vpmovzxwq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 34
> /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
> +"VPMULDQ xmm1, xmmV, xmm2/m128","VPMULDQ xmm2/m128, xmmV, xmm1","vpmuldq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 28 /r","V","V","AVX","","w,r,r","",""
> +"VPMULDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULDQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmuldq xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 28
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPMULDQ ymm1, ymmV, ymm2/m256","VPMULDQ ymm2/m256, ymmV, ymm1","vpmuldq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 28 /r","V","V","AVX2","","w,r,r","",""
> +"VPMULDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULDQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmuldq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 28
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPMULDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULDQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmuldq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 28
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPMULHRSW xmm1, xmmV, xmm2/m128","VPMULHRSW xmm2/m128, xmmV, xmm1","vpmulhrsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 0B /r","V","V","AVX","","w,r,r","",""
> +"VPMULHRSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHRSW xmm2/m128, xmmV,
> {k}{z}, xmm1","vpmulhrsw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F38.WIG 0B
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMULHRSW ymm1, ymmV, ymm2/m256","VPMULHRSW ymm2/m256, ymmV, ymm1","vpmulhrsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 0B /r","V","V","AVX2","","w,r,r","",""
> +"VPMULHRSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHRSW ymm2/m256, ymmV,
> {k}{z}, ymm1","vpmulhrsw ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F38.WIG 0B
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMULHRSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHRSW zmm2/m512, zmmV,
> {k}{z}, zmm1","vpmulhrsw zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F38.WIG 0B
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMULHUW xmm1, xmmV, xmm2/m128","VPMULHUW xmm2/m128, xmmV, xmm1","vpmulhuw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E4 /r","V","V","AVX","","w,r,r","",""
> +"VPMULHUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHUW xmm2/m128, xmmV,
> {k}{z}, xmm1","vpmulhuw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG E4
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMULHUW ymm1, ymmV, ymm2/m256","VPMULHUW ymm2/m256, ymmV, ymm1","vpmulhuw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E4 /r","V","V","AVX2","","w,r,r","",""
> +"VPMULHUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHUW ymm2/m256, ymmV,
> {k}{z}, ymm1","vpmulhuw ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG E4
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMULHUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHUW zmm2/m512, zmmV,
> {k}{z}, zmm1","vpmulhuw zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG E4
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMULHW xmm1, xmmV, xmm2/m128","VPMULHW xmm2/m128, xmmV, xmm1","vpmulhw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E5 /r","V","V","AVX","","w,r,r","",""
> +"VPMULHW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHW xmm2/m128, xmmV,
> {k}{z}, xmm1","vpmulhw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG E5
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMULHW ymm1, ymmV, ymm2/m256","VPMULHW ymm2/m256, ymmV, ymm1","vpmulhw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E5 /r","V","V","AVX2","","w,r,r","",""
> +"VPMULHW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHW ymm2/m256, ymmV,
> {k}{z}, ymm1","vpmulhw ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG E5
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMULHW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHW zmm2/m512, zmmV,
> {k}{z}, zmm1","vpmulhw zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG E5
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMULLD xmm1, xmmV, xmm2/m128","VPMULLD xmm2/m128, xmmV, xmm1","vpmulld xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 40 /r","V","V","AVX","","w,r,r","",""
> +"VPMULLD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMULLD
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpmulld xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 40
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPMULLD ymm1, ymmV, ymm2/m256","VPMULLD ymm2/m256, ymmV, ymm1","vpmulld ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 40 /r","V","V","AVX2","","w,r,r","",""
> +"VPMULLD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMULLD
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpmulld ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 40
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPMULLD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMULLD
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpmulld zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 40
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPMULLQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULLQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmullq xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 40
> /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPMULLQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULLQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmullq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 40
> /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPMULLQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULLQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmullq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 40
> /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
> +"VPMULLW xmm1, xmmV, xmm2/m128","VPMULLW xmm2/m128, xmmV, xmm1","vpmullw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D5 /r","V","V","AVX","","w,r,r","",""
> +"VPMULLW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULLW xmm2/m128, xmmV,
> {k}{z}, xmm1","vpmullw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG D5
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPMULLW ymm1, ymmV, ymm2/m256","VPMULLW ymm2/m256, ymmV, ymm1","vpmullw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D5 /r","V","V","AVX2","","w,r,r","",""
> +"VPMULLW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULLW ymm2/m256, ymmV,
> {k}{z}, ymm1","vpmullw ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG D5
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPMULLW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULLW zmm2/m512, zmmV,
> {k}{z}, zmm1","vpmullw zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG D5
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPMULTISHIFTQB xmm1, {k}{z}, xmmV,
> xmm2/m128/m64bcst","VPMULTISHIFTQB xmm2/m128/m64bcst, xmmV, {k}{z},
> xmm1","vpmultishiftqb xmm2/m128/m64bcst, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F38.W1 83
> /r","V","V","AVX512_VBMI+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPMULTISHIFTQB ymm1, {k}{z}, ymmV,
> ymm2/m256/m64bcst","VPMULTISHIFTQB ymm2/m256/m64bcst, ymmV, {k}{z},
> ymm1","vpmultishiftqb ymm2/m256/m64bcst, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F38.W1 83
> /r","V","V","AVX512_VBMI+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPMULTISHIFTQB zmm1, {k}{z}, zmmV,
> zmm2/m512/m64bcst","VPMULTISHIFTQB zmm2/m512/m64bcst, zmmV, {k}{z},
> zmm1","vpmultishiftqb zmm2/m512/m64bcst, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F38.W1 83
> /r","V","V","AVX512_VBMI","bscale8,scale64","w,r,r,r","",""
> +"VPMULUDQ xmm1, xmmV, xmm2/m128","VPMULUDQ xmm2/m128, xmmV, xmm1","vpmuludq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F4 /r","V","V","AVX","","w,r,r","",""
> +"VPMULUDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULUDQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmuludq xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 F4
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPMULUDQ ymm1, ymmV, ymm2/m256","VPMULUDQ ymm2/m256, ymmV, ymm1","vpmuludq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F4 /r","V","V","AVX2","","w,r,r","",""
> +"VPMULUDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULUDQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmuludq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 F4
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPMULUDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULUDQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmuludq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 F4
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPOPCNTB xmm1, {k}{z}, xmm2/m128","VPOPCNTB xmm2/m128, {k}{z},
> xmm1","vpopcntb xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 54
> /r","V","V","AVX512_BITALG+AVX512VL","scale16","w,r,r","",""
> +"VPOPCNTB ymm1, {k}{z}, ymm2/m256","VPOPCNTB ymm2/m256, {k}{z},
> ymm1","vpopcntb ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 54
> /r","V","V","AVX512_BITALG+AVX512VL","scale32","w,r,r","",""
> +"VPOPCNTB zmm1, {k}{z}, zmm2/m512","VPOPCNTB zmm2/m512, {k}{z},
> zmm1","vpopcntb zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 54
> /r","V","V","AVX512_BITALG","scale64","w,r,r","",""
> +"VPOPCNTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPOPCNTD
> xmm2/m128/m32bcst, {k}{z}, xmm1","vpopcntd xmm2/m128/m32bcst, {k}{z},
> xmm1","EVEX.128.66.0F38.W0 55
> /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VPOPCNTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPOPCNTD
> ymm2/m256/m32bcst, {k}{z}, ymm1","vpopcntd ymm2/m256/m32bcst, {k}{z},
> ymm1","EVEX.256.66.0F38.W0 55
> /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VPOPCNTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPOPCNTD
> zmm2/m512/m32bcst, {k}{z}, zmm1","vpopcntd zmm2/m512/m32bcst, {k}{z},
> zmm1","EVEX.512.66.0F38.W0 55
> /r","V","V","AVX512_VPOPCNTDQ","bscale4,scale64","w,r,r","",""
> +"VPOPCNTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPOPCNTQ
> xmm2/m128/m64bcst, {k}{z}, xmm1","vpopcntq xmm2/m128/m64bcst, {k}{z},
> xmm1","EVEX.128.66.0F38.W1 55
> /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VPOPCNTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPOPCNTQ
> ymm2/m256/m64bcst, {k}{z}, ymm1","vpopcntq ymm2/m256/m64bcst, {k}{z},
> ymm1","EVEX.256.66.0F38.W1 55
> /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VPOPCNTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPOPCNTQ
> zmm2/m512/m64bcst, {k}{z}, zmm1","vpopcntq zmm2/m512/m64bcst, {k}{z},
> zmm1","EVEX.512.66.0F38.W1 55
> /r","V","V","AVX512_VPOPCNTDQ","bscale8,scale64","w,r,r","",""
> +"VPOPCNTW xmm1, {k}{z}, xmm2/m128","VPOPCNTW xmm2/m128, {k}{z},
> xmm1","vpopcntw xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 54
> /r","V","V","AVX512_BITALG+AVX512VL","scale16","w,r,r","",""
> +"VPOPCNTW ymm1, {k}{z}, ymm2/m256","VPOPCNTW ymm2/m256, {k}{z},
> ymm1","vpopcntw ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 54
> /r","V","V","AVX512_BITALG+AVX512VL","scale32","w,r,r","",""
> +"VPOPCNTW zmm1, {k}{z}, zmm2/m512","VPOPCNTW zmm2/m512, {k}{z},
> zmm1","vpopcntw zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 54
> /r","V","V","AVX512_BITALG","scale64","w,r,r","",""
> +"VPOR xmm1, xmmV, xmm2/m128","VPOR xmm2/m128, xmmV, xmm1","vpor xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EB /r","V","V","AVX","","w,r,r","",""
> +"VPOR ymm1, ymmV, ymm2/m256","VPOR ymm2/m256, ymmV, ymm1","vpor ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EB /r","V","V","AVX2","","w,r,r","",""
> +"VPORD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPORD
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpord xmm2/m128/m32bcst, xmmV,
> {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 EB
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPORD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPORD
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpord ymm2/m256/m32bcst, ymmV,
> {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 EB
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPORD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPORD
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpord zmm2/m512/m32bcst, zmmV,
> {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 EB
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPORQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPORQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vporq xmm2/m128/m64bcst, xmmV,
> {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 EB
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPORQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPORQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vporq ymm2/m256/m64bcst, ymmV,
> {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 EB
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPORQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPORQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vporq zmm2/m512/m64bcst, zmmV,
> {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 EB
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPPERM xmm1, xmmV, xmmIH, xmm2/m128","VPPERM xmm2/m128, xmmIH, xmmV,
> xmm1","vpperm xmm2/m128, xmmIH, xmmV, xmm1","XOP.NDS.128.08.W1 A3 /r
> /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPPERM xmm1, xmmV, xmm2/m128, xmmIH","VPPERM xmmIH, xmm2/m128, xmmV,
> xmm1","vpperm xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A3 /r
> /is4","V","V","XOP","amd","w,r,r,r","",""
> +"VPROLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPROLD imm8u,
> xmm2/m128/m32bcst, {k}{z}, xmmV","vprold imm8u, xmm2/m128/m32bcst,
> {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /1
> ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPROLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPROLD imm8u,
> ymm2/m256/m32bcst, {k}{z}, ymmV","vprold imm8u, ymm2/m256/m32bcst,
> {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /1
> ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPROLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPROLD imm8u,
> zmm2/m512/m32bcst, {k}{z}, zmmV","vprold imm8u, zmm2/m512/m32bcst,
> {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /1
> ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPROLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPROLQ imm8u,
> xmm2/m128/m64bcst, {k}{z}, xmmV","vprolq imm8u, xmm2/m128/m64bcst,
> {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 72 /1
> ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPROLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPROLQ imm8u,
> ymm2/m256/m64bcst, {k}{z}, ymmV","vprolq imm8u, ymm2/m256/m64bcst,
> {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 72 /1
> ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPROLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPROLQ imm8u,
> zmm2/m512/m64bcst, {k}{z}, zmmV","vprolq imm8u, zmm2/m512/m64bcst,
> {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 72 /1
> ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPROLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPROLVD
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vprolvd xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 15
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPROLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPROLVD
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vprolvd ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 15
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPROLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPROLVD
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vprolvd zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 15
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPROLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPROLVQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vprolvq xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 15
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPROLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPROLVQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vprolvq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 15
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPROLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPROLVQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vprolvq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 15
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPRORD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPRORD imm8u,
> xmm2/m128/m32bcst, {k}{z}, xmmV","vprord imm8u, xmm2/m128/m32bcst,
> {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /0
> ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPRORD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPRORD imm8u,
> ymm2/m256/m32bcst, {k}{z}, ymmV","vprord imm8u, ymm2/m256/m32bcst,
> {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /0
> ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPRORD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPRORD imm8u,
> zmm2/m512/m32bcst, {k}{z}, zmmV","vprord imm8u, zmm2/m512/m32bcst,
> {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /0
> ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPRORQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPRORQ imm8u,
> xmm2/m128/m64bcst, {k}{z}, xmmV","vprorq imm8u, xmm2/m128/m64bcst,
> {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 72 /0
> ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPRORQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPRORQ imm8u,
> ymm2/m256/m64bcst, {k}{z}, ymmV","vprorq imm8u, ymm2/m256/m64bcst,
> {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 72 /0
> ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPRORQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPRORQ imm8u,
> zmm2/m512/m64bcst, {k}{z}, zmmV","vprorq imm8u, zmm2/m512/m64bcst,
> {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 72 /0
> ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPRORVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPRORVD
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vprorvd xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 14
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPRORVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPRORVD
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vprorvd ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 14
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPRORVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPRORVD
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vprorvd zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 14
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPRORVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPRORVQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vprorvq xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 14
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPRORVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPRORVQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vprorvq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 14
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPRORVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPRORVQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vprorvq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 14
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPROTB xmm1, xmm2/m128, imm8u","VPROTB imm8u, xmm2/m128, xmm1","vprotb imm8u, xmm2/m128, xmm1","XOP.128.08.W0 C0 /r ib","V","V","XOP","amd","w,r,r","",""
> +"VPROTB xmm1, xmmV, xmm2/m128","VPROTB xmm2/m128, xmmV, xmm1","vprotb xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 90 /r","V","V","XOP","amd","w,r,r","",""
> +"VPROTB xmm1, xmm2/m128, xmmV","VPROTB xmmV, xmm2/m128, xmm1","vprotb xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 90 /r","V","V","XOP","amd","w,r,r","",""
> +"VPROTD xmm1, xmm2/m128, imm8u","VPROTD imm8u, xmm2/m128, xmm1","vprotd imm8u, xmm2/m128, xmm1","XOP.128.08.W0 C2 /r ib","V","V","XOP","amd","w,r,r","",""
> +"VPROTD xmm1, xmmV, xmm2/m128","VPROTD xmm2/m128, xmmV, xmm1","vprotd xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 92 /r","V","V","XOP","amd","w,r,r","",""
> +"VPROTD xmm1, xmm2/m128, xmmV","VPROTD xmmV, xmm2/m128, xmm1","vprotd xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 92 /r","V","V","XOP","amd","w,r,r","",""
> +"VPROTQ xmm1, xmm2/m128, imm8u","VPROTQ imm8u, xmm2/m128, xmm1","vprotq imm8u, xmm2/m128, xmm1","XOP.128.08.W0 C3 /r ib","V","V","XOP","amd","w,r,r","",""
> +"VPROTQ xmm1, xmmV, xmm2/m128","VPROTQ xmm2/m128, xmmV, xmm1","vprotq xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 93 /r","V","V","XOP","amd","w,r,r","",""
> +"VPROTQ xmm1, xmm2/m128, xmmV","VPROTQ xmmV, xmm2/m128, xmm1","vprotq xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 93 /r","V","V","XOP","amd","w,r,r","",""
> +"VPROTW xmm1, xmm2/m128, imm8u","VPROTW imm8u, xmm2/m128, xmm1","vprotw imm8u, xmm2/m128, xmm1","XOP.128.08.W0 C1 /r ib","V","V","XOP","amd","w,r,r","",""
> +"VPROTW xmm1, xmmV, xmm2/m128","VPROTW xmm2/m128, xmmV, xmm1","vprotw xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 91 /r","V","V","XOP","amd","w,r,r","",""
> +"VPROTW xmm1, xmm2/m128, xmmV","VPROTW xmmV, xmm2/m128, xmm1","vprotw xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 91 /r","V","V","XOP","amd","w,r,r","",""
> +"VPSADBW xmm1, xmmV, xmm2/m128","VPSADBW xmm2/m128, xmmV,
> xmm1","vpsadbw xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F.WIG F6
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
> +"VPSADBW xmm1, xmmV, xmm2/m128","VPSADBW xmm2/m128, xmmV, xmm1","vpsadbw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F6 /r","V","V","AVX","","w,r,r","",""
> +"VPSADBW ymm1, ymmV, ymm2/m256","VPSADBW ymm2/m256, ymmV,
> ymm1","vpsadbw ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F.WIG F6
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
> +"VPSADBW ymm1, ymmV, ymm2/m256","VPSADBW ymm2/m256, ymmV, ymm1","vpsadbw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F6 /r","V","V","AVX2","","w,r,r","",""
> +"VPSADBW zmm1, zmmV, zmm2/m512","VPSADBW zmm2/m512, zmmV, zmm1","vpsadbw zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F.WIG F6 /r","V","V","AVX512BW","scale64","w,r,r","",""
> +"VPSCATTERDD vm32x, {k1-k7}, xmm1","VPSCATTERDD xmm1, {k1-k7},
> vm32x","vpscatterdd xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W0 A0
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VPSCATTERDD vm32y, {k1-k7}, ymm1","VPSCATTERDD ymm1, {k1-k7},
> vm32y","vpscatterdd ymm1, {k1-k7}, vm32y","EVEX.256.66.0F38.W0 A0
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VPSCATTERDD vm32z, {k1-k7}, zmm1","VPSCATTERDD zmm1, {k1-k7},
> vm32z","vpscatterdd zmm1, {k1-k7}, vm32z","EVEX.512.66.0F38.W0 A0
> /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
> +"VPSCATTERDQ vm32x, {k1-k7}, xmm1","VPSCATTERDQ xmm1, {k1-k7},
> vm32x","vpscatterdq xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W1 A0
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VPSCATTERDQ vm32x, {k1-k7}, ymm1","VPSCATTERDQ ymm1, {k1-k7},
> vm32x","vpscatterdq ymm1, {k1-k7}, vm32x","EVEX.256.66.0F38.W1 A0
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VPSCATTERDQ vm32y, {k1-k7}, zmm1","VPSCATTERDQ zmm1, {k1-k7},
> vm32y","vpscatterdq zmm1, {k1-k7}, vm32y","EVEX.512.66.0F38.W1 A0
> /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
> +"VPSCATTERQD vm64x, {k1-k7}, xmm1","VPSCATTERQD xmm1, {k1-k7},
> vm64x","vpscatterqd xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W0 A1
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VPSCATTERQD vm64y, {k1-k7}, xmm1","VPSCATTERQD xmm1, {k1-k7},
> vm64y","vpscatterqd xmm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W0 A1
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VPSCATTERQD vm64z, {k1-k7}, ymm1","VPSCATTERQD ymm1, {k1-k7},
> vm64z","vpscatterqd ymm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W0 A1
> /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
> +"VPSCATTERQQ vm64x, {k1-k7}, xmm1","VPSCATTERQQ xmm1, {k1-k7},
> vm64x","vpscatterqq xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W1 A1
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VPSCATTERQQ vm64y, {k1-k7}, ymm1","VPSCATTERQQ ymm1, {k1-k7},
> vm64y","vpscatterqq ymm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W1 A1
> /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VPSCATTERQQ vm64z, {k1-k7}, zmm1","VPSCATTERQQ zmm1, {k1-k7},
> vm64z","vpscatterqq zmm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W1 A1
> /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
> +"VPSHAB xmm1, xmmV, xmm2/m128","VPSHAB xmm2/m128, xmmV, xmm1","vpshab xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 98 /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHAB xmm1, xmm2/m128, xmmV","VPSHAB xmmV, xmm2/m128, xmm1","vpshab xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 98 /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHAD xmm1, xmmV, xmm2/m128","VPSHAD xmm2/m128, xmmV, xmm1","vpshad xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 9A /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHAD xmm1, xmm2/m128, xmmV","VPSHAD xmmV, xmm2/m128, xmm1","vpshad xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 9A /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHAQ xmm1, xmmV, xmm2/m128","VPSHAQ xmm2/m128, xmmV, xmm1","vpshaq xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 9B /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHAQ xmm1, xmm2/m128, xmmV","VPSHAQ xmmV, xmm2/m128, xmm1","vpshaq xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 9B /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHAW xmm1, xmmV, xmm2/m128","VPSHAW xmm2/m128, xmmV, xmm1","vpshaw xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 99 /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHAW xmm1, xmm2/m128, xmmV","VPSHAW xmmV, xmm2/m128, xmm1","vpshaw xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 99 /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHLB xmm1, xmmV, xmm2/m128","VPSHLB xmm2/m128, xmmV, xmm1","vpshlb xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 94 /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHLB xmm1, xmm2/m128, xmmV","VPSHLB xmmV, xmm2/m128, xmm1","vpshlb xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 94 /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHLD xmm1, xmmV, xmm2/m128","VPSHLD xmm2/m128, xmmV, xmm1","vpshld xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 96 /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHLD xmm1, xmm2/m128, xmmV","VPSHLD xmmV, xmm2/m128, xmm1","vpshld xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 96 /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHLDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPSHLDD
> imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshldd imm8u,
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 71 /r
> ib","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
> +"VPSHLDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPSHLDD
> imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshldd imm8u,
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 71 /r
> ib","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
> +"VPSHLDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPSHLDD
> imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshldd imm8u,
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 71 /r
> ib","V","V","AVX512_VBMI2","bscale4,scale64","w,r,r,r,r","",""
> +"VPSHLDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPSHLDQ
> imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshldq imm8u,
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 71 /r
> ib","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
> +"VPSHLDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPSHLDQ imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshldq imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
> +"VPSHLDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPSHLDQ imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshldq imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2","bscale8,scale64","w,r,r,r,r","",""
> +"VPSHLDVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSHLDVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshldvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VPSHLDVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSHLDVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshldvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VPSHLDVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSHLDVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshldvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 71 /r","V","V","AVX512_VBMI2","bscale4,scale64","rw,r,r,r","",""
> +"VPSHLDVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSHLDVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshldvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VPSHLDVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSHLDVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshldvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VPSHLDVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSHLDVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshldvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 71 /r","V","V","AVX512_VBMI2","bscale8,scale64","rw,r,r,r","",""
> +"VPSHLDVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSHLDVW xmm2/m128, xmmV, {k}{z}, xmm1","vpshldvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 70 /r","V","V","AVX512_VBMI2+AVX512VL","scale16","rw,r,r,r","",""
> +"VPSHLDVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSHLDVW ymm2/m256, ymmV, {k}{z}, ymm1","vpshldvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 70 /r","V","V","AVX512_VBMI2+AVX512VL","scale32","rw,r,r,r","",""
> +"VPSHLDVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSHLDVW zmm2/m512, zmmV, {k}{z}, zmm1","vpshldvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 70 /r","V","V","AVX512_VBMI2","scale64","rw,r,r,r","",""
> +"VPSHLDW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPSHLDW imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","vpshldw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale16","w,r,r,r,r","",""
> +"VPSHLDW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPSHLDW imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","vpshldw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale32","w,r,r,r,r","",""
> +"VPSHLDW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPSHLDW imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","vpshldw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2","scale64","w,r,r,r,r","",""
> +"VPSHLQ xmm1, xmmV, xmm2/m128","VPSHLQ xmm2/m128, xmmV, xmm1","vpshlq xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 97 /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHLQ xmm1, xmm2/m128, xmmV","VPSHLQ xmmV, xmm2/m128, xmm1","vpshlq xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 97 /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHLW xmm1, xmmV, xmm2/m128","VPSHLW xmm2/m128, xmmV, xmm1","vpshlw xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 95 /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHLW xmm1, xmm2/m128, xmmV","VPSHLW xmmV, xmm2/m128, xmm1","vpshlw xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 95 /r","V","V","XOP","amd","w,r,r","",""
> +"VPSHRDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPSHRDD imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshrdd imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
> +"VPSHRDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPSHRDD imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshrdd imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
> +"VPSHRDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPSHRDD imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshrdd imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2","bscale4,scale64","w,r,r,r,r","",""
> +"VPSHRDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPSHRDQ imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshrdq imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
> +"VPSHRDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPSHRDQ imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshrdq imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
> +"VPSHRDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPSHRDQ imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshrdq imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2","bscale8,scale64","w,r,r,r,r","",""
> +"VPSHRDVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSHRDVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshrdvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale16","rw,r,r,r","",""
> +"VPSHRDVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSHRDVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshrdvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale32","rw,r,r,r","",""
> +"VPSHRDVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSHRDVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshrdvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 73 /r","V","V","AVX512_VBMI2","bscale4,scale64","rw,r,r,r","",""
> +"VPSHRDVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSHRDVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshrdvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale16","rw,r,r,r","",""
> +"VPSHRDVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSHRDVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshrdvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale32","rw,r,r,r","",""
> +"VPSHRDVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSHRDVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshrdvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 73 /r","V","V","AVX512_VBMI2","bscale8,scale64","rw,r,r,r","",""
> +"VPSHRDVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSHRDVW xmm2/m128, xmmV, {k}{z}, xmm1","vpshrdvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 72 /r","V","V","AVX512_VBMI2+AVX512VL","scale16","rw,r,r,r","",""
> +"VPSHRDVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSHRDVW ymm2/m256, ymmV, {k}{z}, ymm1","vpshrdvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 72 /r","V","V","AVX512_VBMI2+AVX512VL","scale32","rw,r,r,r","",""
> +"VPSHRDVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSHRDVW zmm2/m512, zmmV, {k}{z}, zmm1","vpshrdvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 72 /r","V","V","AVX512_VBMI2","scale64","rw,r,r,r","",""
> +"VPSHRDW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPSHRDW imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","vpshrdw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale16","w,r,r,r,r","",""
> +"VPSHRDW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPSHRDW imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","vpshrdw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale32","w,r,r,r,r","",""
> +"VPSHRDW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPSHRDW imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","vpshrdw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2","scale64","w,r,r,r,r","",""
> +"VPSHUFB xmm1, xmmV, xmm2/m128","VPSHUFB xmm2/m128, xmmV, xmm1","vpshufb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 00 /r","V","V","AVX","","w,r,r","",""
> +"VPSHUFB xmm1, {k}{z}, xmmV, xmm2/m128","VPSHUFB xmm2/m128, xmmV, {k}{z}, xmm1","vpshufb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 00 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSHUFB ymm1, ymmV, ymm2/m256","VPSHUFB ymm2/m256, ymmV, ymm1","vpshufb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 00 /r","V","V","AVX2","","w,r,r","",""
> +"VPSHUFB ymm1, {k}{z}, ymmV, ymm2/m256","VPSHUFB ymm2/m256, ymmV, {k}{z}, ymm1","vpshufb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 00 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSHUFB zmm1, {k}{z}, zmmV, zmm2/m512","VPSHUFB zmm2/m512, zmmV, {k}{z}, zmm1","vpshufb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 00 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSHUFBITQMB k1, {k}, xmmV, xmm2/m128","VPSHUFBITQMB xmm2/m128, xmmV, {k}, k1","vpshufbitqmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W0 8F /r","V","V","AVX512_BITALG+AVX512VL","scale16","w,r,r,r","",""
> +"VPSHUFBITQMB k1, {k}, ymmV, ymm2/m256","VPSHUFBITQMB ymm2/m256, ymmV, {k}, k1","vpshufbitqmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W0 8F /r","V","V","AVX512_BITALG+AVX512VL","scale32","w,r,r,r","",""
> +"VPSHUFBITQMB k1, {k}, zmmV, zmm2/m512","VPSHUFBITQMB zmm2/m512, zmmV, {k}, k1","vpshufbitqmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W0 8F /r","V","V","AVX512_BITALG","scale64","w,r,r,r","",""
> +"VPSHUFD xmm1, xmm2/m128, imm8u","VPSHUFD imm8u, xmm2/m128, xmm1","vpshufd imm8u, xmm2/m128, xmm1","VEX.128.66.0F.WIG 70 /r ib","V","V","AVX","","w,r,r","",""
> +"VPSHUFD xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSHUFD imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","vpshufd imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 70 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPSHUFD ymm1, ymm2/m256, imm8u","VPSHUFD imm8u, ymm2/m256, ymm1","vpshufd imm8u, ymm2/m256, ymm1","VEX.256.66.0F.WIG 70 /r ib","V","V","AVX2","","w,r,r","",""
> +"VPSHUFD ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSHUFD imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","vpshufd imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 70 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPSHUFD zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSHUFD imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","vpshufd imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 70 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPSHUFHW xmm1, xmm2/m128, imm8u","VPSHUFHW imm8u, xmm2/m128, xmm1","vpshufhw imm8u, xmm2/m128, xmm1","VEX.128.F3.0F.WIG 70 /r ib","V","V","AVX","","w,r,r","",""
> +"VPSHUFHW xmm1, {k}{z}, xmm2/m128, imm8u","VPSHUFHW imm8u, xmm2/m128, {k}{z}, xmm1","vpshufhw imm8u, xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.WIG 70 /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSHUFHW ymm1, ymm2/m256, imm8u","VPSHUFHW imm8u, ymm2/m256, ymm1","vpshufhw imm8u, ymm2/m256, ymm1","VEX.256.F3.0F.WIG 70 /r ib","V","V","AVX2","","w,r,r","",""
> +"VPSHUFHW ymm1, {k}{z}, ymm2/m256, imm8u","VPSHUFHW imm8u, ymm2/m256, {k}{z}, ymm1","vpshufhw imm8u, ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.WIG 70 /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSHUFHW zmm1, {k}{z}, zmm2/m512, imm8u","VPSHUFHW imm8u, zmm2/m512, {k}{z}, zmm1","vpshufhw imm8u, zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.WIG 70 /r ib","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSHUFLW xmm1, xmm2/m128, imm8u","VPSHUFLW imm8u, xmm2/m128, xmm1","vpshuflw imm8u, xmm2/m128, xmm1","VEX.128.F2.0F.WIG 70 /r ib","V","V","AVX","","w,r,r","",""
> +"VPSHUFLW xmm1, {k}{z}, xmm2/m128, imm8u","VPSHUFLW imm8u, xmm2/m128, {k}{z}, xmm1","vpshuflw imm8u, xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.WIG 70 /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSHUFLW ymm1, ymm2/m256, imm8u","VPSHUFLW imm8u, ymm2/m256, ymm1","vpshuflw imm8u, ymm2/m256, ymm1","VEX.256.F2.0F.WIG 70 /r ib","V","V","AVX2","","w,r,r","",""
> +"VPSHUFLW ymm1, {k}{z}, ymm2/m256, imm8u","VPSHUFLW imm8u, ymm2/m256, {k}{z}, ymm1","vpshuflw imm8u, ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.WIG 70 /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSHUFLW zmm1, {k}{z}, zmm2/m512, imm8u","VPSHUFLW imm8u, zmm2/m512, {k}{z}, zmm1","vpshuflw imm8u, zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.WIG 70 /r ib","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSIGNB xmm1, xmmV, xmm2/m128","VPSIGNB xmm2/m128, xmmV, xmm1","vpsignb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 08 /r","V","V","AVX","","w,r,r","",""
> +"VPSIGNB ymm1, ymmV, ymm2/m256","VPSIGNB ymm2/m256, ymmV, ymm1","vpsignb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 08 /r","V","V","AVX2","","w,r,r","",""
> +"VPSIGND xmm1, xmmV, xmm2/m128","VPSIGND xmm2/m128, xmmV, xmm1","vpsignd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 0A /r","V","V","AVX","","w,r,r","",""
> +"VPSIGND ymm1, ymmV, ymm2/m256","VPSIGND ymm2/m256, ymmV, ymm1","vpsignd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 0A /r","V","V","AVX2","","w,r,r","",""
> +"VPSIGNW xmm1, xmmV, xmm2/m128","VPSIGNW xmm2/m128, xmmV, xmm1","vpsignw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 09 /r","V","V","AVX","","w,r,r","",""
> +"VPSIGNW ymm1, ymmV, ymm2/m256","VPSIGNW ymm2/m256, ymmV, ymm1","vpsignw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 09 /r","V","V","AVX2","","w,r,r","",""
> +"VPSLLD xmmV, xmm2, imm8u","VPSLLD imm8u, xmm2, xmmV","vpslld imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 72 /6 ib","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VPSLLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSLLD imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","vpslld imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /6 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPSLLD ymmV, ymm2, imm8u","VPSLLD imm8u, ymm2, ymmV","vpslld imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 72 /6 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
> +"VPSLLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSLLD imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","vpslld imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /6 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPSLLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSLLD imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","vpslld imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /6 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPSLLD xmm1, xmmV, xmm2/m128","VPSLLD xmm2/m128, xmmV, xmm1","vpslld xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F2 /r","V","V","AVX","","w,r,r","",""
> +"VPSLLD xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLD xmm2/m128, xmmV, {k}{z}, xmm1","vpslld xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 F2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VPSLLD ymm1, ymmV, xmm2/m128","VPSLLD xmm2/m128, ymmV, ymm1","vpslld xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F2 /r","V","V","AVX2","","w,r,r","",""
> +"VPSLLD ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLD xmm2/m128, ymmV, {k}{z}, ymm1","vpslld xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 F2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VPSLLD zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLD xmm2/m128, zmmV, {k}{z}, zmm1","vpslld xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 F2 /r","V","V","AVX512F","scale16","w,r,r,r","",""
> +"VPSLLDQ xmmV, xmm2, imm8u","VPSLLDQ imm8u, xmm2, xmmV","vpslldq imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 73 /7 ib","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VPSLLDQ xmmV, xmm2/m128, imm8u","VPSLLDQ imm8u, xmm2/m128, xmmV","vpslldq imm8u, xmm2/m128, xmmV","EVEX.NDD.128.66.0F.WIG 73 /7 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
> +"VPSLLDQ ymmV, ymm2, imm8u","VPSLLDQ imm8u, ymm2, ymmV","vpslldq imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 73 /7 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
> +"VPSLLDQ ymmV, ymm2/m256, imm8u","VPSLLDQ imm8u, ymm2/m256, ymmV","vpslldq imm8u, ymm2/m256, ymmV","EVEX.NDD.256.66.0F.WIG 73 /7 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
> +"VPSLLDQ zmmV, zmm2/m512, imm8u","VPSLLDQ imm8u, zmm2/m512, zmmV","vpslldq imm8u, zmm2/m512, zmmV","EVEX.NDD.512.66.0F.WIG 73 /7 ib","V","V","AVX512BW","scale64","w,r,r","",""
> +"VPSLLQ xmmV, xmm2, imm8u","VPSLLQ imm8u, xmm2, xmmV","vpsllq imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 73 /6 ib","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VPSLLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSLLQ imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","vpsllq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 73 /6 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPSLLQ ymmV, ymm2, imm8u","VPSLLQ imm8u, ymm2, ymmV","vpsllq imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 73 /6 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
> +"VPSLLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSLLQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","vpsllq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 73 /6 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPSLLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSLLQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","vpsllq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 73 /6 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPSLLQ xmm1, xmmV, xmm2/m128","VPSLLQ xmm2/m128, xmmV, xmm1","vpsllq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F3 /r","V","V","AVX","","w,r,r","",""
> +"VPSLLQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLQ xmm2/m128, xmmV, {k}{z}, xmm1","vpsllq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 F3 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VPSLLQ ymm1, ymmV, xmm2/m128","VPSLLQ xmm2/m128, ymmV, ymm1","vpsllq xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F3 /r","V","V","AVX2","","w,r,r","",""
> +"VPSLLQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLQ xmm2/m128, ymmV, {k}{z}, ymm1","vpsllq xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 F3 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VPSLLQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLQ xmm2/m128, zmmV, {k}{z}, zmm1","vpsllq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 F3 /r","V","V","AVX512F","scale16","w,r,r,r","",""
> +"VPSLLVD xmm1, xmmV, xmm2/m128","VPSLLVD xmm2/m128, xmmV, xmm1","vpsllvd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 47 /r","V","V","AVX2","","w,r,r","",""
> +"VPSLLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSLLVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpsllvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 47 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPSLLVD ymm1, ymmV, ymm2/m256","VPSLLVD ymm2/m256, ymmV, ymm1","vpsllvd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 47 /r","V","V","AVX2","","w,r,r","",""
> +"VPSLLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSLLVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpsllvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 47 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPSLLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSLLVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpsllvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 47 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPSLLVQ xmm1, xmmV, xmm2/m128","VPSLLVQ xmm2/m128, xmmV, xmm1","vpsllvq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 47 /r","V","V","AVX2","","w,r,r","",""
> +"VPSLLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSLLVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpsllvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 47 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPSLLVQ ymm1, ymmV, ymm2/m256","VPSLLVQ ymm2/m256, ymmV, ymm1","vpsllvq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 47 /r","V","V","AVX2","","w,r,r","",""
> +"VPSLLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSLLVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpsllvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 47 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPSLLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSLLVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpsllvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 47 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPSLLVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLVW xmm2/m128, xmmV, {k}{z}, xmm1","vpsllvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 12 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSLLVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSLLVW ymm2/m256, ymmV, {k}{z}, ymm1","vpsllvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 12 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSLLVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSLLVW zmm2/m512, zmmV, {k}{z}, zmm1","vpsllvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 12 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSLLW xmmV, xmm2, imm8u","VPSLLW imm8u, xmm2, xmmV","vpsllw imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 71 /6 ib","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VPSLLW xmmV, {k}{z}, xmm2/m128, imm8u","VPSLLW imm8u, xmm2/m128, {k}{z}, xmmV","vpsllw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /6 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSLLW ymmV, ymm2, imm8u","VPSLLW imm8u, ymm2, ymmV","vpsllw imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 71 /6 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
> +"VPSLLW ymmV, {k}{z}, ymm2/m256, imm8u","VPSLLW imm8u, ymm2/m256, {k}{z}, ymmV","vpsllw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /6 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSLLW zmmV, {k}{z}, zmm2/m512, imm8u","VPSLLW imm8u, zmm2/m512, {k}{z}, zmmV","vpsllw imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /6 ib","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSLLW xmm1, xmmV, xmm2/m128","VPSLLW xmm2/m128, xmmV, xmm1","vpsllw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F1 /r","V","V","AVX","","w,r,r","",""
> +"VPSLLW xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLW xmm2/m128, xmmV, {k}{z}, xmm1","vpsllw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSLLW ymm1, ymmV, xmm2/m128","VPSLLW xmm2/m128, ymmV, ymm1","vpsllw xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F1 /r","V","V","AVX2","","w,r,r","",""
> +"VPSLLW ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLW xmm2/m128, ymmV, {k}{z}, ymm1","vpsllw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSLLW zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLW xmm2/m128, zmmV, {k}{z}, zmm1","vpsllw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F1 /r","V","V","AVX512BW","scale16","w,r,r,r","",""
> +"VPSRAD xmmV, xmm2, imm8u","VPSRAD imm8u, xmm2, xmmV","vpsrad imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 72 /4 ib","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VPSRAD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSRAD imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","vpsrad imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /4 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPSRAD ymmV, ymm2, imm8u","VPSRAD imm8u, ymm2, ymmV","vpsrad imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 72 /4 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
> +"VPSRAD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSRAD imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","vpsrad imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /4 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPSRAD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSRAD imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","vpsrad imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /4 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPSRAD xmm1, xmmV, xmm2/m128","VPSRAD xmm2/m128, xmmV, xmm1","vpsrad xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E2 /r","V","V","AVX","","w,r,r","",""
> +"VPSRAD xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAD xmm2/m128, xmmV, {k}{z}, xmm1","vpsrad xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 E2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRAD ymm1, ymmV, xmm2/m128","VPSRAD xmm2/m128, ymmV, ymm1","vpsrad xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E2 /r","V","V","AVX2","","w,r,r","",""
> +"VPSRAD ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAD xmm2/m128, ymmV, {k}{z}, ymm1","vpsrad xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 E2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRAD zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAD xmm2/m128, zmmV, {k}{z}, zmm1","vpsrad xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 E2 /r","V","V","AVX512F","scale16","w,r,r,r","",""
> +"VPSRAQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSRAQ imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","vpsraq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 72 /4 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPSRAQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSRAQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","vpsraq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 72 /4 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPSRAQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSRAQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","vpsraq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 72 /4 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPSRAQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAQ xmm2/m128, xmmV, {k}{z}, xmm1","vpsraq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 E2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRAQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAQ xmm2/m128, ymmV, {k}{z}, ymm1","vpsraq xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 E2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRAQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAQ xmm2/m128, zmmV, {k}{z}, zmm1","vpsraq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 E2 /r","V","V","AVX512F","scale16","w,r,r,r","",""
> +"VPSRAVD xmm1, xmmV, xmm2/m128","VPSRAVD xmm2/m128, xmmV, xmm1","vpsravd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 46 /r","V","V","AVX2","","w,r,r","",""
> +"VPSRAVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSRAVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpsravd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 46 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPSRAVD ymm1, ymmV, ymm2/m256","VPSRAVD ymm2/m256, ymmV, ymm1","vpsravd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 46 /r","V","V","AVX2","","w,r,r","",""
> +"VPSRAVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSRAVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpsravd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 46 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPSRAVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSRAVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpsravd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 46 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPSRAVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSRAVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpsravq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 46 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPSRAVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSRAVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpsravq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 46 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPSRAVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSRAVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpsravq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 46 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPSRAVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAVW xmm2/m128, xmmV, {k}{z}, xmm1","vpsravw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 11 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRAVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSRAVW ymm2/m256, ymmV, {k}{z}, ymm1","vpsravw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 11 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSRAVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSRAVW zmm2/m512, zmmV, {k}{z}, zmm1","vpsravw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 11 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSRAW xmmV, xmm2, imm8u","VPSRAW imm8u, xmm2, xmmV","vpsraw imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 71 /4 ib","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VPSRAW xmmV, {k}{z}, xmm2/m128, imm8u","VPSRAW imm8u, xmm2/m128, {k}{z}, xmmV","vpsraw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /4 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRAW ymmV, ymm2, imm8u","VPSRAW imm8u, ymm2, ymmV","vpsraw imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 71 /4 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
> +"VPSRAW ymmV, {k}{z}, ymm2/m256, imm8u","VPSRAW imm8u, ymm2/m256, {k}{z}, ymmV","vpsraw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /4 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSRAW zmmV, {k}{z}, zmm2/m512, imm8u","VPSRAW imm8u, zmm2/m512, {k}{z}, zmmV","vpsraw imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /4 ib","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSRAW xmm1, xmmV, xmm2/m128","VPSRAW xmm2/m128, xmmV, xmm1","vpsraw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E1 /r","V","V","AVX","","w,r,r","",""
> +"VPSRAW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAW xmm2/m128, xmmV, {k}{z}, xmm1","vpsraw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRAW ymm1, ymmV, xmm2/m128","VPSRAW xmm2/m128, ymmV, ymm1","vpsraw xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E1 /r","V","V","AVX2","","w,r,r","",""
> +"VPSRAW ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAW xmm2/m128, ymmV, {k}{z}, ymm1","vpsraw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRAW zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAW xmm2/m128, zmmV, {k}{z}, zmm1","vpsraw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E1 /r","V","V","AVX512BW","scale16","w,r,r,r","",""
> +"VPSRLD xmmV, xmm2, imm8u","VPSRLD imm8u, xmm2, xmmV","vpsrld imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 72 /2 ib","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VPSRLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSRLD imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","vpsrld imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /2 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPSRLD ymmV, ymm2, imm8u","VPSRLD imm8u, ymm2, ymmV","vpsrld imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 72 /2 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
> +"VPSRLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSRLD imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","vpsrld imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /2 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPSRLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSRLD imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","vpsrld imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /2 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPSRLD xmm1, xmmV, xmm2/m128","VPSRLD xmm2/m128, xmmV, xmm1","vpsrld xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D2 /r","V","V","AVX","","w,r,r","",""
> +"VPSRLD xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLD xmm2/m128, xmmV, {k}{z}, xmm1","vpsrld xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 D2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRLD ymm1, ymmV, xmm2/m128","VPSRLD xmm2/m128, ymmV, ymm1","vpsrld xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D2 /r","V","V","AVX2","","w,r,r","",""
> +"VPSRLD ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLD xmm2/m128, ymmV, {k}{z}, ymm1","vpsrld xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 D2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRLD zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLD xmm2/m128, zmmV, {k}{z}, zmm1","vpsrld xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 D2 /r","V","V","AVX512F","scale16","w,r,r,r","",""
> +"VPSRLDQ xmmV, xmm2, imm8u","VPSRLDQ imm8u, xmm2, xmmV","vpsrldq imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 73 /3 ib","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VPSRLDQ xmmV, xmm2/m128, imm8u","VPSRLDQ imm8u, xmm2/m128, xmmV","vpsrldq imm8u, xmm2/m128, xmmV","EVEX.NDD.128.66.0F.WIG 73 /3 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
> +"VPSRLDQ ymmV, ymm2, imm8u","VPSRLDQ imm8u, ymm2, ymmV","vpsrldq imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 73 /3 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
> +"VPSRLDQ ymmV, ymm2/m256, imm8u","VPSRLDQ imm8u, ymm2/m256, ymmV","vpsrldq imm8u, ymm2/m256, ymmV","EVEX.NDD.256.66.0F.WIG 73 /3 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
> +"VPSRLDQ zmmV, zmm2/m512, imm8u","VPSRLDQ imm8u, zmm2/m512, zmmV","vpsrldq imm8u, zmm2/m512, zmmV","EVEX.NDD.512.66.0F.WIG 73 /3 ib","V","V","AVX512BW","scale64","w,r,r","",""
> +"VPSRLQ xmmV, xmm2, imm8u","VPSRLQ imm8u, xmm2, xmmV","vpsrlq imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 73 /2 ib","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VPSRLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSRLQ imm8u,
> xmm2/m128/m64bcst, {k}{z}, xmmV","vpsrlq imm8u, xmm2/m128/m64bcst,
> {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 73 /2
> ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPSRLQ ymmV, ymm2, imm8u","VPSRLQ imm8u, ymm2, ymmV","vpsrlq imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 73 /2 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
> +"VPSRLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSRLQ imm8u,
> ymm2/m256/m64bcst, {k}{z}, ymmV","vpsrlq imm8u, ymm2/m256/m64bcst,
> {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 73 /2
> ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPSRLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSRLQ imm8u,
> zmm2/m512/m64bcst, {k}{z}, zmmV","vpsrlq imm8u, zmm2/m512/m64bcst,
> {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 73 /2
> ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPSRLQ xmm1, xmmV, xmm2/m128","VPSRLQ xmm2/m128, xmmV, xmm1","vpsrlq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D3 /r","V","V","AVX","","w,r,r","",""
> +"VPSRLQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLQ xmm2/m128, xmmV,
> {k}{z}, xmm1","vpsrlq xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.W1 D3
> /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRLQ ymm1, ymmV, xmm2/m128","VPSRLQ xmm2/m128, ymmV, ymm1","vpsrlq xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D3 /r","V","V","AVX2","","w,r,r","",""
> +"VPSRLQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLQ xmm2/m128, ymmV,
> {k}{z}, ymm1","vpsrlq xmm2/m128, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.W1 D3
> /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRLQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLQ xmm2/m128, zmmV,
> {k}{z}, zmm1","vpsrlq xmm2/m128, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.W1 D3
> /r","V","V","AVX512F","scale16","w,r,r,r","",""
> +"VPSRLVD xmm1, xmmV, xmm2/m128","VPSRLVD xmm2/m128, xmmV, xmm1","vpsrlvd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 45 /r","V","V","AVX2","","w,r,r","",""
> +"VPSRLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSRLVD
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpsrlvd xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 45
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPSRLVD ymm1, ymmV, ymm2/m256","VPSRLVD ymm2/m256, ymmV, ymm1","vpsrlvd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 45 /r","V","V","AVX2","","w,r,r","",""
> +"VPSRLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSRLVD
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpsrlvd ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 45
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPSRLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSRLVD
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpsrlvd zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 45
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPSRLVQ xmm1, xmmV, xmm2/m128","VPSRLVQ xmm2/m128, xmmV, xmm1","vpsrlvq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 45 /r","V","V","AVX2","","w,r,r","",""
> +"VPSRLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSRLVQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpsrlvq xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 45
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPSRLVQ ymm1, ymmV, ymm2/m256","VPSRLVQ ymm2/m256, ymmV, ymm1","vpsrlvq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 45 /r","V","V","AVX2","","w,r,r","",""
> +"VPSRLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSRLVQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpsrlvq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 45
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPSRLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSRLVQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpsrlvq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 45
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPSRLVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLVW xmm2/m128, xmmV,
> {k}{z}, xmm1","vpsrlvw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F38.W1 10
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRLVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSRLVW ymm2/m256, ymmV,
> {k}{z}, ymm1","vpsrlvw ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F38.W1 10
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSRLVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSRLVW zmm2/m512, zmmV,
> {k}{z}, zmm1","vpsrlvw zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F38.W1 10
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSRLW xmmV, xmm2, imm8u","VPSRLW imm8u, xmm2, xmmV","vpsrlw imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 71 /2 ib","V","V","AVX","modrm_regonly","w,r,r","",""
> +"VPSRLW xmmV, {k}{z}, xmm2/m128, imm8u","VPSRLW imm8u, xmm2/m128,
> {k}{z}, xmmV","vpsrlw imm8u, xmm2/m128, {k}{z},
> xmmV","EVEX.NDD.128.66.0F.WIG 71 /2
> ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRLW ymmV, ymm2, imm8u","VPSRLW imm8u, ymm2, ymmV","vpsrlw imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 71 /2 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
> +"VPSRLW ymmV, {k}{z}, ymm2/m256, imm8u","VPSRLW imm8u, ymm2/m256,
> {k}{z}, ymmV","vpsrlw imm8u, ymm2/m256, {k}{z},
> ymmV","EVEX.NDD.256.66.0F.WIG 71 /2
> ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSRLW zmmV, {k}{z}, zmm2/m512, imm8u","VPSRLW imm8u, zmm2/m512,
> {k}{z}, zmmV","vpsrlw imm8u, zmm2/m512, {k}{z},
> zmmV","EVEX.NDD.512.66.0F.WIG 71 /2
> ib","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSRLW xmm1, xmmV, xmm2/m128","VPSRLW xmm2/m128, xmmV, xmm1","vpsrlw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D1 /r","V","V","AVX","","w,r,r","",""
> +"VPSRLW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLW xmm2/m128, xmmV,
> {k}{z}, xmm1","vpsrlw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG D1
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRLW ymm1, ymmV, xmm2/m128","VPSRLW xmm2/m128, ymmV, ymm1","vpsrlw xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D1 /r","V","V","AVX2","","w,r,r","",""
> +"VPSRLW ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLW xmm2/m128, ymmV,
> {k}{z}, ymm1","vpsrlw xmm2/m128, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG D1
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSRLW zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLW xmm2/m128, zmmV,
> {k}{z}, zmm1","vpsrlw xmm2/m128, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG D1
> /r","V","V","AVX512BW","scale16","w,r,r,r","",""
> +"VPSUBB xmm1, xmmV, xmm2/m128","VPSUBB xmm2/m128, xmmV, xmm1","vpsubb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F8 /r","V","V","AVX","","w,r,r","",""
> +"VPSUBB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBB xmm2/m128, xmmV,
> {k}{z}, xmm1","vpsubb xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG F8
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSUBB ymm1, ymmV, ymm2/m256","VPSUBB ymm2/m256, ymmV, ymm1","vpsubb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F8 /r","V","V","AVX2","","w,r,r","",""
> +"VPSUBB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBB ymm2/m256, ymmV,
> {k}{z}, ymm1","vpsubb ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG F8
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSUBB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBB zmm2/m512, zmmV,
> {k}{z}, zmm1","vpsubb zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG F8
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSUBD xmm1, xmmV, xmm2/m128","VPSUBD xmm2/m128, xmmV, xmm1","vpsubd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FA /r","V","V","AVX","","w,r,r","",""
> +"VPSUBD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSUBD
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpsubd xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 FA
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPSUBD ymm1, ymmV, ymm2/m256","VPSUBD ymm2/m256, ymmV, ymm1","vpsubd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FA /r","V","V","AVX2","","w,r,r","",""
> +"VPSUBD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSUBD
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpsubd ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 FA
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPSUBD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSUBD
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpsubd zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 FA
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPSUBQ xmm1, xmmV, xmm2/m128","VPSUBQ xmm2/m128, xmmV, xmm1","vpsubq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FB /r","V","V","AVX","","w,r,r","",""
> +"VPSUBQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSUBQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpsubq xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 FB
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPSUBQ ymm1, ymmV, ymm2/m256","VPSUBQ ymm2/m256, ymmV, ymm1","vpsubq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FB /r","V","V","AVX2","","w,r,r","",""
> +"VPSUBQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSUBQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpsubq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 FB
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPSUBQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSUBQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpsubq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 FB
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPSUBSB xmm1, xmmV, xmm2/m128","VPSUBSB xmm2/m128, xmmV, xmm1","vpsubsb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E8 /r","V","V","AVX","","w,r,r","",""
> +"VPSUBSB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBSB xmm2/m128, xmmV,
> {k}{z}, xmm1","vpsubsb xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG E8
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSUBSB ymm1, ymmV, ymm2/m256","VPSUBSB ymm2/m256, ymmV, ymm1","vpsubsb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E8 /r","V","V","AVX2","","w,r,r","",""
> +"VPSUBSB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBSB ymm2/m256, ymmV,
> {k}{z}, ymm1","vpsubsb ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG E8
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSUBSB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBSB zmm2/m512, zmmV,
> {k}{z}, zmm1","vpsubsb zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG E8
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSUBSW xmm1, xmmV, xmm2/m128","VPSUBSW xmm2/m128, xmmV, xmm1","vpsubsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E9 /r","V","V","AVX","","w,r,r","",""
> +"VPSUBSW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBSW xmm2/m128, xmmV,
> {k}{z}, xmm1","vpsubsw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG E9
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSUBSW ymm1, ymmV, ymm2/m256","VPSUBSW ymm2/m256, ymmV, ymm1","vpsubsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E9 /r","V","V","AVX2","","w,r,r","",""
> +"VPSUBSW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBSW ymm2/m256, ymmV,
> {k}{z}, ymm1","vpsubsw ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG E9
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSUBSW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBSW zmm2/m512, zmmV,
> {k}{z}, zmm1","vpsubsw zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG E9
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSUBUSB xmm1, xmmV, xmm2/m128","VPSUBUSB xmm2/m128, xmmV, xmm1","vpsubusb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D8 /r","V","V","AVX","","w,r,r","",""
> +"VPSUBUSB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBUSB xmm2/m128, xmmV,
> {k}{z}, xmm1","vpsubusb xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG D8
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSUBUSB ymm1, ymmV, ymm2/m256","VPSUBUSB ymm2/m256, ymmV, ymm1","vpsubusb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D8 /r","V","V","AVX2","","w,r,r","",""
> +"VPSUBUSB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBUSB ymm2/m256, ymmV,
> {k}{z}, ymm1","vpsubusb ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG D8
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSUBUSB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBUSB zmm2/m512, zmmV,
> {k}{z}, zmm1","vpsubusb zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG D8
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSUBUSW xmm1, xmmV, xmm2/m128","VPSUBUSW xmm2/m128, xmmV, xmm1","vpsubusw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D9 /r","V","V","AVX","","w,r,r","",""
> +"VPSUBUSW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBUSW xmm2/m128, xmmV,
> {k}{z}, xmm1","vpsubusw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG D9
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSUBUSW ymm1, ymmV, ymm2/m256","VPSUBUSW ymm2/m256, ymmV, ymm1","vpsubusw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D9 /r","V","V","AVX2","","w,r,r","",""
> +"VPSUBUSW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBUSW ymm2/m256, ymmV,
> {k}{z}, ymm1","vpsubusw ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG D9
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSUBUSW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBUSW zmm2/m512, zmmV,
> {k}{z}, zmm1","vpsubusw zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG D9
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPSUBW xmm1, xmmV, xmm2/m128","VPSUBW xmm2/m128, xmmV, xmm1","vpsubw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F9 /r","V","V","AVX","","w,r,r","",""
> +"VPSUBW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBW xmm2/m128, xmmV,
> {k}{z}, xmm1","vpsubw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG F9
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPSUBW ymm1, ymmV, ymm2/m256","VPSUBW ymm2/m256, ymmV, ymm1","vpsubw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F9 /r","V","V","AVX2","","w,r,r","",""
> +"VPSUBW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBW ymm2/m256, ymmV,
> {k}{z}, ymm1","vpsubw ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG F9
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPSUBW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBW zmm2/m512, zmmV,
> {k}{z}, zmm1","vpsubw zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG F9
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPTERNLOGD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPTERNLOGD
> imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpternlogd imm8u,
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W0 25 /r
> ib","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r,r","",""
> +"VPTERNLOGD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPTERNLOGD
> imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpternlogd imm8u,
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W0 25 /r
> ib","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r,r","",""
> +"VPTERNLOGD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPTERNLOGD
> imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpternlogd imm8u,
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W0 25 /r
> ib","V","V","AVX512F","bscale4,scale64","rw,r,r,r,r","",""
> +"VPTERNLOGQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPTERNLOGQ
> imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpternlogq imm8u,
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W1 25 /r
> ib","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r,r","",""
> +"VPTERNLOGQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPTERNLOGQ
> imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpternlogq imm8u,
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W1 25 /r
> ib","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r,r","",""
> +"VPTERNLOGQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPTERNLOGQ
> imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpternlogq imm8u,
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W1 25 /r
> ib","V","V","AVX512F","bscale8,scale64","rw,r,r,r,r","",""
> +"VPTEST xmm1, xmm2/m128","VPTEST xmm2/m128, xmm1","vptest xmm2/m128, xmm1","VEX.128.66.0F38.WIG 17 /r","V","V","AVX","","r,r","",""
> +"VPTEST ymm1, ymm2/m256","VPTEST ymm2/m256, ymm1","vptest ymm2/m256, ymm1","VEX.256.66.0F38.WIG 17 /r","V","V","AVX","","r,r","",""
> +"VPTESTMB k1, {k}, xmmV, xmm2/m128","VPTESTMB xmm2/m128, xmmV, {k},
> k1","vptestmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W0 26
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPTESTMB k1, {k}, ymmV, ymm2/m256","VPTESTMB ymm2/m256, ymmV, {k},
> k1","vptestmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W0 26
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPTESTMB k1, {k}, zmmV, zmm2/m512","VPTESTMB zmm2/m512, zmmV, {k},
> k1","vptestmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W0 26
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPTESTMD k1, {k}, xmmV, xmm2/m128/m32bcst","VPTESTMD
> xmm2/m128/m32bcst, xmmV, {k}, k1","vptestmd xmm2/m128/m32bcst, xmmV,
> {k}, k1","EVEX.NDS.128.66.0F38.W0 27
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPTESTMD k1, {k}, ymmV, ymm2/m256/m32bcst","VPTESTMD
> ymm2/m256/m32bcst, ymmV, {k}, k1","vptestmd ymm2/m256/m32bcst, ymmV,
> {k}, k1","EVEX.NDS.256.66.0F38.W0 27
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPTESTMD k1, {k}, zmmV, zmm2/m512/m32bcst","VPTESTMD
> zmm2/m512/m32bcst, zmmV, {k}, k1","vptestmd zmm2/m512/m32bcst, zmmV,
> {k}, k1","EVEX.NDS.512.66.0F38.W0 27
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPTESTMQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPTESTMQ
> xmm2/m128/m64bcst, xmmV, {k}, k1","vptestmq xmm2/m128/m64bcst, xmmV,
> {k}, k1","EVEX.NDS.128.66.0F38.W1 27
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPTESTMQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPTESTMQ
> ymm2/m256/m64bcst, ymmV, {k}, k1","vptestmq ymm2/m256/m64bcst, ymmV,
> {k}, k1","EVEX.NDS.256.66.0F38.W1 27
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPTESTMQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPTESTMQ
> zmm2/m512/m64bcst, zmmV, {k}, k1","vptestmq zmm2/m512/m64bcst, zmmV,
> {k}, k1","EVEX.NDS.512.66.0F38.W1 27
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPTESTMW k1, {k}, xmmV, xmm2/m128","VPTESTMW xmm2/m128, xmmV, {k},
> k1","vptestmw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W1 26
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPTESTMW k1, {k}, ymmV, ymm2/m256","VPTESTMW ymm2/m256, ymmV, {k},
> k1","vptestmw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W1 26
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPTESTMW k1, {k}, zmmV, zmm2/m512","VPTESTMW zmm2/m512, zmmV, {k},
> k1","vptestmw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W1 26
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPTESTNMB k1, {k}, xmmV, xmm2/m128","VPTESTNMB xmm2/m128, xmmV, {k},
> k1","vptestnmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W0 26
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPTESTNMB k1, {k}, ymmV, ymm2/m256","VPTESTNMB ymm2/m256, ymmV, {k},
> k1","vptestnmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W0 26
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPTESTNMB k1, {k}, zmmV, zmm2/m512","VPTESTNMB zmm2/m512, zmmV, {k},
> k1","vptestnmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W0 26
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPTESTNMD k1, {k}, xmmV, xmm2/m128/m32bcst","VPTESTNMD
> xmm2/m128/m32bcst, xmmV, {k}, k1","vptestnmd xmm2/m128/m32bcst, xmmV,
> {k}, k1","EVEX.NDS.128.F3.0F38.W0 27
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPTESTNMD k1, {k}, ymmV, ymm2/m256/m32bcst","VPTESTNMD
> ymm2/m256/m32bcst, ymmV, {k}, k1","vptestnmd ymm2/m256/m32bcst, ymmV,
> {k}, k1","EVEX.NDS.256.F3.0F38.W0 27
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPTESTNMD k1, {k}, zmmV, zmm2/m512/m32bcst","VPTESTNMD
> zmm2/m512/m32bcst, zmmV, {k}, k1","vptestnmd zmm2/m512/m32bcst, zmmV,
> {k}, k1","EVEX.NDS.512.F3.0F38.W0 27
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPTESTNMQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPTESTNMQ
> xmm2/m128/m64bcst, xmmV, {k}, k1","vptestnmq xmm2/m128/m64bcst, xmmV,
> {k}, k1","EVEX.NDS.128.F3.0F38.W1 27
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPTESTNMQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPTESTNMQ
> ymm2/m256/m64bcst, ymmV, {k}, k1","vptestnmq ymm2/m256/m64bcst, ymmV,
> {k}, k1","EVEX.NDS.256.F3.0F38.W1 27
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPTESTNMQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPTESTNMQ
> zmm2/m512/m64bcst, zmmV, {k}, k1","vptestnmq zmm2/m512/m64bcst, zmmV,
> {k}, k1","EVEX.NDS.512.F3.0F38.W1 27
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPTESTNMW k1, {k}, xmmV, xmm2/m128","VPTESTNMW xmm2/m128, xmmV, {k},
> k1","vptestnmw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W1 26
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPTESTNMW k1, {k}, ymmV, ymm2/m256","VPTESTNMW ymm2/m256, ymmV, {k},
> k1","vptestnmw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W1 26
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPTESTNMW k1, {k}, zmmV, zmm2/m512","VPTESTNMW zmm2/m512, zmmV, {k},
> k1","vptestnmw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W1 26
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPUNPCKHBW xmm1, xmmV, xmm2/m128","VPUNPCKHBW xmm2/m128, xmmV, xmm1","vpunpckhbw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 68 /r","V","V","AVX","","w,r,r","",""
> +"VPUNPCKHBW xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKHBW xmm2/m128,
> xmmV, {k}{z}, xmm1","vpunpckhbw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG 68
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPUNPCKHBW ymm1, ymmV, ymm2/m256","VPUNPCKHBW ymm2/m256, ymmV, ymm1","vpunpckhbw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 68 /r","V","V","AVX2","","w,r,r","",""
> +"VPUNPCKHBW ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKHBW ymm2/m256,
> ymmV, {k}{z}, ymm1","vpunpckhbw ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG 68
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPUNPCKHBW zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKHBW zmm2/m512,
> zmmV, {k}{z}, zmm1","vpunpckhbw zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG 68
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPUNPCKHDQ xmm1, xmmV, xmm2/m128","VPUNPCKHDQ xmm2/m128, xmmV, xmm1","vpunpckhdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6A /r","V","V","AVX","","w,r,r","",""
> +"VPUNPCKHDQ xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPUNPCKHDQ
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpunpckhdq xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 6A
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPUNPCKHDQ ymm1, ymmV, ymm2/m256","VPUNPCKHDQ ymm2/m256, ymmV, ymm1","vpunpckhdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6A /r","V","V","AVX2","","w,r,r","",""
> +"VPUNPCKHDQ ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPUNPCKHDQ
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpunpckhdq ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 6A
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPUNPCKHDQ zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPUNPCKHDQ
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpunpckhdq zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 6A
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPUNPCKHQDQ xmm1, xmmV, xmm2/m128","VPUNPCKHQDQ xmm2/m128, xmmV, xmm1","vpunpckhqdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6D /r","V","V","AVX","","w,r,r","",""
> +"VPUNPCKHQDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPUNPCKHQDQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpunpckhqdq xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 6D
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPUNPCKHQDQ ymm1, ymmV, ymm2/m256","VPUNPCKHQDQ ymm2/m256, ymmV, ymm1","vpunpckhqdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6D /r","V","V","AVX2","","w,r,r","",""
> +"VPUNPCKHQDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPUNPCKHQDQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpunpckhqdq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 6D
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPUNPCKHQDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPUNPCKHQDQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpunpckhqdq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 6D
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPUNPCKHWD xmm1, xmmV, xmm2/m128","VPUNPCKHWD xmm2/m128, xmmV, xmm1","vpunpckhwd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 69 /r","V","V","AVX","","w,r,r","",""
> +"VPUNPCKHWD xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKHWD xmm2/m128,
> xmmV, {k}{z}, xmm1","vpunpckhwd xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG 69
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPUNPCKHWD ymm1, ymmV, ymm2/m256","VPUNPCKHWD ymm2/m256, ymmV, ymm1","vpunpckhwd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 69 /r","V","V","AVX2","","w,r,r","",""
> +"VPUNPCKHWD ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKHWD ymm2/m256,
> ymmV, {k}{z}, ymm1","vpunpckhwd ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG 69
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPUNPCKHWD zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKHWD zmm2/m512,
> zmmV, {k}{z}, zmm1","vpunpckhwd zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG 69
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPUNPCKLBW xmm1, xmmV, xmm2/m128","VPUNPCKLBW xmm2/m128, xmmV, xmm1","vpunpcklbw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 60 /r","V","V","AVX","","w,r,r","",""
> +"VPUNPCKLBW xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKLBW xmm2/m128,
> xmmV, {k}{z}, xmm1","vpunpcklbw xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG 60
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPUNPCKLBW ymm1, ymmV, ymm2/m256","VPUNPCKLBW ymm2/m256, ymmV, ymm1","vpunpcklbw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 60 /r","V","V","AVX2","","w,r,r","",""
> +"VPUNPCKLBW ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKLBW ymm2/m256,
> ymmV, {k}{z}, ymm1","vpunpcklbw ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG 60
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPUNPCKLBW zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKLBW zmm2/m512,
> zmmV, {k}{z}, zmm1","vpunpcklbw zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG 60
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPUNPCKLDQ xmm1, xmmV, xmm2/m128","VPUNPCKLDQ xmm2/m128, xmmV, xmm1","vpunpckldq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 62 /r","V","V","AVX","","w,r,r","",""
> +"VPUNPCKLDQ xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPUNPCKLDQ
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpunpckldq xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 62
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPUNPCKLDQ ymm1, ymmV, ymm2/m256","VPUNPCKLDQ ymm2/m256, ymmV, ymm1","vpunpckldq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 62 /r","V","V","AVX2","","w,r,r","",""
> +"VPUNPCKLDQ ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPUNPCKLDQ
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpunpckldq ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 62
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPUNPCKLDQ zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPUNPCKLDQ
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpunpckldq zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 62
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPUNPCKLQDQ xmm1, xmmV, xmm2/m128","VPUNPCKLQDQ xmm2/m128, xmmV, xmm1","vpunpcklqdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6C /r","V","V","AVX","","w,r,r","",""
> +"VPUNPCKLQDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPUNPCKLQDQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpunpcklqdq xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 6C
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPUNPCKLQDQ ymm1, ymmV, ymm2/m256","VPUNPCKLQDQ ymm2/m256, ymmV, ymm1","vpunpcklqdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6C /r","V","V","AVX2","","w,r,r","",""
> +"VPUNPCKLQDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPUNPCKLQDQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpunpcklqdq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 6C
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPUNPCKLQDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPUNPCKLQDQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpunpcklqdq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 6C
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VPUNPCKLWD xmm1, xmmV, xmm2/m128","VPUNPCKLWD xmm2/m128, xmmV, xmm1","vpunpcklwd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 61 /r","V","V","AVX","","w,r,r","",""
> +"VPUNPCKLWD xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKLWD xmm2/m128,
> xmmV, {k}{z}, xmm1","vpunpcklwd xmm2/m128, xmmV, {k}{z},
> xmm1","EVEX.NDS.128.66.0F.WIG 61
> /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
> +"VPUNPCKLWD ymm1, ymmV, ymm2/m256","VPUNPCKLWD ymm2/m256, ymmV, ymm1","vpunpcklwd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 61 /r","V","V","AVX2","","w,r,r","",""
> +"VPUNPCKLWD ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKLWD ymm2/m256,
> ymmV, {k}{z}, ymm1","vpunpcklwd ymm2/m256, ymmV, {k}{z},
> ymm1","EVEX.NDS.256.66.0F.WIG 61
> /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
> +"VPUNPCKLWD zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKLWD zmm2/m512,
> zmmV, {k}{z}, zmm1","vpunpcklwd zmm2/m512, zmmV, {k}{z},
> zmm1","EVEX.NDS.512.66.0F.WIG 61
> /r","V","V","AVX512BW","scale64","w,r,r,r","",""
> +"VPXOR xmm1, xmmV, xmm2/m128","VPXOR xmm2/m128, xmmV, xmm1","vpxor xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EF /r","V","V","AVX","","w,r,r","",""
> +"VPXOR ymm1, ymmV, ymm2/m256","VPXOR ymm2/m256, ymmV, ymm1","vpxor ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EF /r","V","V","AVX2","","w,r,r","",""
> +"VPXORD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPXORD
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpxord xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 EF
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VPXORD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPXORD
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpxord ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 EF
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VPXORD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPXORD
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpxord zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 EF
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VPXORQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPXORQ
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpxorq xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 EF
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VPXORQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPXORQ
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpxorq ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 EF
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VPXORQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPXORQ
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpxorq zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 EF
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VRANGEPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u:4","VRANGEPD imm8u:4, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vrangepd imm8u:4, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 50 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
> +"VRANGEPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u:4","VRANGEPD imm8u:4, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vrangepd imm8u:4, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 50 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
> +"VRANGEPD zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u:4","VRANGEPD imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}","vrangepd imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F3A.W1 50 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
> +"VRANGEPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u:4","VRANGEPD imm8u:4, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vrangepd imm8u:4, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 50 /r ib","V","V","AVX512DQ","bscale8,scale64","w,r,r,r,r","",""
> +"VRANGEPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u:4","VRANGEPS imm8u:4, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vrangeps imm8u:4, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 50 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
> +"VRANGEPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u:4","VRANGEPS imm8u:4, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vrangeps imm8u:4, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 50 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
> +"VRANGEPS zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u:4","VRANGEPS imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}","vrangeps imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F3A.W0 50 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
> +"VRANGEPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u:4","VRANGEPS imm8u:4, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vrangeps imm8u:4, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 50 /r ib","V","V","AVX512DQ","bscale4,scale64","w,r,r,r,r","",""
> +"VRANGESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VRANGESD imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","vrangesd imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W1 51 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
> +"VRANGESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u:4","VRANGESD imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","vrangesd imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W1 51 /r ib","V","V","AVX512DQ","scale8","w,r,r,r,r","",""
> +"VRANGESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VRANGESS imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","vrangess imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W0 51 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
> +"VRANGESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u:4","VRANGESS imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","vrangess imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W0 51 /r ib","V","V","AVX512DQ","scale4","w,r,r,r,r","",""
> +"VRCP14PD xmm1, {k}{z}, xmm2/m128/m64bcst","VRCP14PD xmm2/m128/m64bcst, {k}{z}, xmm1","vrcp14pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 4C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VRCP14PD ymm1, {k}{z}, ymm2/m256/m64bcst","VRCP14PD ymm2/m256/m64bcst, {k}{z}, ymm1","vrcp14pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 4C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VRCP14PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRCP14PD zmm2/m512/m64bcst, {k}{z}, zmm1","vrcp14pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 4C /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
> +"VRCP14PS xmm1, {k}{z}, xmm2/m128/m32bcst","VRCP14PS xmm2/m128/m32bcst, {k}{z}, xmm1","vrcp14ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 4C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VRCP14PS ymm1, {k}{z}, ymm2/m256/m32bcst","VRCP14PS ymm2/m256/m32bcst, {k}{z}, ymm1","vrcp14ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 4C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VRCP14PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRCP14PS zmm2/m512/m32bcst, {k}{z}, zmm1","vrcp14ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 4C /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
> +"VRCP14SD xmm1, {k}{z}, xmmV, xmm2/m64","VRCP14SD xmm2/m64, xmmV, {k}{z}, xmm1","vrcp14sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 4D /r","V","V","AVX512F","scale8","w,r,r,r","",""
> +"VRCP14SS xmm1, {k}{z}, xmmV, xmm2/m32","VRCP14SS xmm2/m32, xmmV, {k}{z}, xmm1","vrcp14ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 4D /r","V","V","AVX512F","scale4","w,r,r,r","",""
> +"VRCP28PD zmm1{sae}, {k}{z}, zmm2","VRCP28PD zmm2, {k}{z}, zmm1{sae}","vrcp28pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 CA /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
> +"VRCP28PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRCP28PD zmm2/m512/m64bcst, {k}{z}, zmm1","vrcp28pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 CA /r","V","V","AVX512ER","bscale8,scale64","w,r,r","",""
> +"VRCP28PS zmm1{sae}, {k}{z}, zmm2","VRCP28PS zmm2, {k}{z}, zmm1{sae}","vrcp28ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 CA /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
> +"VRCP28PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRCP28PS zmm2/m512/m32bcst, {k}{z}, zmm1","vrcp28ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 CA /r","V","V","AVX512ER","bscale4,scale64","w,r,r","",""
> +"VRCP28SD xmm1{sae}, {k}{z}, xmmV, xmm2","VRCP28SD xmm2, xmmV, {k}{z}, xmm1{sae}","vrcp28sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W1 CB /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","",""
> +"VRCP28SD xmm1, {k}{z}, xmmV, xmm2/m64","VRCP28SD xmm2/m64, xmmV, {k}{z}, xmm1","vrcp28sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 CB /r","V","V","AVX512ER","scale8","w,r,r,r","",""
> +"VRCP28SS xmm1{sae}, {k}{z}, xmmV, xmm2","VRCP28SS xmm2, xmmV, {k}{z}, xmm1{sae}","vrcp28ss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W0 CB /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","",""
> +"VRCP28SS xmm1, {k}{z}, xmmV, xmm2/m32","VRCP28SS xmm2/m32, xmmV, {k}{z}, xmm1","vrcp28ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 CB /r","V","V","AVX512ER","scale4","w,r,r,r","",""
> +"VRCPPS xmm1, xmm2/m128","VRCPPS xmm2/m128, xmm1","vrcpps xmm2/m128, xmm1","VEX.128.0F.WIG 53 /r","V","V","AVX","","w,r","",""
> +"VRCPPS ymm1, ymm2/m256","VRCPPS ymm2/m256, ymm1","vrcpps ymm2/m256, ymm1","VEX.256.0F.WIG 53 /r","V","V","AVX","","w,r","",""
> +"VRCPSS xmm1, xmmV, xmm2/m32","VRCPSS xmm2/m32, xmmV, xmm1","vrcpss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 53 /r","V","V","AVX","","w,r,r","",""
> +"VREDUCEPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VREDUCEPD imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","vreducepd imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W1 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VREDUCEPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VREDUCEPD imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vreducepd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VREDUCEPD zmm1{sae}, {k}{z}, zmm2, imm8u","VREDUCEPD imm8u, zmm2, {k}{z}, zmm1{sae}","vreducepd imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W1 56 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r","",""
> +"VREDUCEPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VREDUCEPD imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vreducepd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 56 /r ib","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
> +"VREDUCEPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VREDUCEPS imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","vreduceps imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W0 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VREDUCEPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VREDUCEPS imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","vreduceps imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W0 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VREDUCEPS zmm1{sae}, {k}{z}, zmm2, imm8u","VREDUCEPS imm8u, zmm2, {k}{z}, zmm1{sae}","vreduceps imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W0 56 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r","",""
> +"VREDUCEPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VREDUCEPS imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","vreduceps imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W0 56 /r ib","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
> +"VREDUCESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VREDUCESD imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vreducesd imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W1 57 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
> +"VREDUCESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VREDUCESD imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","vreducesd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W1 57 /r ib","V","V","AVX512DQ","scale8","w,r,r,r,r","",""
> +"VREDUCESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VREDUCESS imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vreducess imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W0 57 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
> +"VREDUCESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VREDUCESS imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","vreducess imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W0 57 /r ib","V","V","AVX512DQ","scale4","w,r,r,r,r","",""
> +"VRNDSCALEPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VRNDSCALEPD imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","vrndscalepd imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W1 09 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VRNDSCALEPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VRNDSCALEPD imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vrndscalepd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 09 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VRNDSCALEPD zmm1{sae}, {k}{z}, zmm2, imm8u","VRNDSCALEPD imm8u, zmm2, {k}{z}, zmm1{sae}","vrndscalepd imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W1 09 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VRNDSCALEPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VRNDSCALEPD imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vrndscalepd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 09 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VRNDSCALEPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VRNDSCALEPS imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","vrndscaleps imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W0 08 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VRNDSCALEPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VRNDSCALEPS imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","vrndscaleps imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W0 08 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VRNDSCALEPS zmm1{sae}, {k}{z}, zmm2, imm8u","VRNDSCALEPS imm8u, zmm2, {k}{z}, zmm1{sae}","vrndscaleps imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W0 08 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VRNDSCALEPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VRNDSCALEPS imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","vrndscaleps imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W0 08 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VRNDSCALESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VRNDSCALESD imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vrndscalesd imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W1 0B /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
> +"VRNDSCALESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VRNDSCALESD imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","vrndscalesd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W1 0B /r ib","V","V","AVX512F","scale8","w,r,r,r,r","",""
> +"VRNDSCALESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VRNDSCALESS imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vrndscaless imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W0 0A /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
> +"VRNDSCALESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VRNDSCALESS imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","vrndscaless imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W0 0A /r ib","V","V","AVX512F","scale4","w,r,r,r,r","",""
> +"VROUNDPD xmm1, xmm2/m128, imm8u","VROUNDPD imm8u, xmm2/m128, xmm1","vroundpd imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 09 /r ib","V","V","AVX","","w,r,r","",""
> +"VROUNDPD ymm1, ymm2/m256, imm8u","VROUNDPD imm8u, ymm2/m256, ymm1","vroundpd imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.WIG 09 /r ib","V","V","AVX","","w,r,r","",""
> +"VROUNDPS xmm1, xmm2/m128, imm8u","VROUNDPS imm8u, xmm2/m128, xmm1","vroundps imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 08 /r ib","V","V","AVX","","w,r,r","",""
> +"VROUNDPS ymm1, ymm2/m256, imm8u","VROUNDPS imm8u, ymm2/m256, ymm1","vroundps imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.WIG 08 /r ib","V","V","AVX","","w,r,r","",""
> +"VROUNDSD xmm1, xmmV, xmm2/m64, imm8u","VROUNDSD imm8u, xmm2/m64, xmmV, xmm1","vroundsd imm8u, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.WIG 0B /r ib","V","V","AVX","","w,r,r,r","",""
> +"VROUNDSS xmm1, xmmV, xmm2/m32, imm8u","VROUNDSS imm8u, xmm2/m32, xmmV, xmm1","vroundss imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.WIG 0A /r ib","V","V","AVX","","w,r,r,r","",""
> +"VRSQRT14PD xmm1, {k}{z}, xmm2/m128/m64bcst","VRSQRT14PD xmm2/m128/m64bcst, {k}{z}, xmm1","vrsqrt14pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 4E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VRSQRT14PD ymm1, {k}{z}, ymm2/m256/m64bcst","VRSQRT14PD ymm2/m256/m64bcst, {k}{z}, ymm1","vrsqrt14pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 4E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VRSQRT14PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRSQRT14PD zmm2/m512/m64bcst, {k}{z}, zmm1","vrsqrt14pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 4E /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
> +"VRSQRT14PS xmm1, {k}{z}, xmm2/m128/m32bcst","VRSQRT14PS xmm2/m128/m32bcst, {k}{z}, xmm1","vrsqrt14ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 4E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VRSQRT14PS ymm1, {k}{z}, ymm2/m256/m32bcst","VRSQRT14PS ymm2/m256/m32bcst, {k}{z}, ymm1","vrsqrt14ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 4E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VRSQRT14PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRSQRT14PS zmm2/m512/m32bcst, {k}{z}, zmm1","vrsqrt14ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 4E /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
> +"VRSQRT14SD xmm1, {k}{z}, xmmV, xmm2/m64","VRSQRT14SD xmm2/m64, xmmV, {k}{z}, xmm1","vrsqrt14sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 4F /r","V","V","AVX512F","scale8","w,r,r,r","",""
> +"VRSQRT14SS xmm1, {k}{z}, xmmV, xmm2/m32","VRSQRT14SS xmm2/m32, xmmV, {k}{z}, xmm1","vrsqrt14ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 4F /r","V","V","AVX512F","scale4","w,r,r,r","",""
> +"VRSQRT28PD zmm1{sae}, {k}{z}, zmm2","VRSQRT28PD zmm2, {k}{z}, zmm1{sae}","vrsqrt28pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 CC /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
> +"VRSQRT28PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRSQRT28PD zmm2/m512/m64bcst, {k}{z}, zmm1","vrsqrt28pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 CC /r","V","V","AVX512ER","bscale8,scale64","w,r,r","",""
> +"VRSQRT28PS zmm1{sae}, {k}{z}, zmm2","VRSQRT28PS zmm2, {k}{z}, zmm1{sae}","vrsqrt28ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 CC /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
> +"VRSQRT28PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRSQRT28PS zmm2/m512/m32bcst, {k}{z}, zmm1","vrsqrt28ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 CC /r","V","V","AVX512ER","bscale4,scale64","w,r,r","",""
> +"VRSQRT28SD xmm1{sae}, {k}{z}, xmmV, xmm2","VRSQRT28SD xmm2, xmmV, {k}{z}, xmm1{sae}","vrsqrt28sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W1 CD /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","",""
> +"VRSQRT28SD xmm1, {k}{z}, xmmV, xmm2/m64","VRSQRT28SD xmm2/m64, xmmV, {k}{z}, xmm1","vrsqrt28sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 CD /r","V","V","AVX512ER","scale8","w,r,r,r","",""
> +"VRSQRT28SS xmm1{sae}, {k}{z}, xmmV, xmm2","VRSQRT28SS xmm2, xmmV, {k}{z}, xmm1{sae}","vrsqrt28ss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W0 CD /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","",""
> +"VRSQRT28SS xmm1, {k}{z}, xmmV, xmm2/m32","VRSQRT28SS xmm2/m32, xmmV, {k}{z}, xmm1","vrsqrt28ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 CD /r","V","V","AVX512ER","scale4","w,r,r,r","",""
> +"VRSQRTPS xmm1, xmm2/m128","VRSQRTPS xmm2/m128, xmm1","vrsqrtps xmm2/m128, xmm1","VEX.128.0F.WIG 52 /r","V","V","AVX","","w,r","",""
> +"VRSQRTPS ymm1, ymm2/m256","VRSQRTPS ymm2/m256, ymm1","vrsqrtps ymm2/m256, ymm1","VEX.256.0F.WIG 52 /r","V","V","AVX","","w,r","",""
> +"VRSQRTSS xmm1, xmmV, xmm2/m32","VRSQRTSS xmm2/m32, xmmV, xmm1","vrsqrtss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 52 /r","V","V","AVX","","w,r,r","",""
> +"VSCALEFPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VSCALEFPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vscalefpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 2C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VSCALEFPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VSCALEFPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vscalefpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 2C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VSCALEFPD zmm1{er}, {k}{z}, zmmV, zmm2","VSCALEFPD zmm2, zmmV, {k}{z}, zmm1{er}","vscalefpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F38.W1 2C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VSCALEFPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VSCALEFPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vscalefpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 2C /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VSCALEFPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VSCALEFPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vscalefps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 2C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VSCALEFPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VSCALEFPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vscalefps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 2C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VSCALEFPS zmm1{er}, {k}{z}, zmmV, zmm2","VSCALEFPS zmm2, zmmV, {k}{z}, zmm1{er}","vscalefps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F38.W0 2C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VSCALEFPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VSCALEFPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vscalefps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 2C /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VSCALEFSD xmm1{er}, {k}{z}, xmmV, xmm2","VSCALEFSD xmm2, xmmV, {k}{z}, xmm1{er}","vscalefsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.66.0F38.W1 2D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VSCALEFSD xmm1, {k}{z}, xmmV, xmm2/m64","VSCALEFSD xmm2/m64, xmmV, {k}{z}, xmm1","vscalefsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 2D /r","V","V","AVX512F","scale8","w,r,r,r","",""
> +"VSCALEFSS xmm1{er}, {k}{z}, xmmV, xmm2","VSCALEFSS xmm2, xmmV, {k}{z}, xmm1{er}","vscalefss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.66.0F38.W0 2D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VSCALEFSS xmm1, {k}{z}, xmmV, xmm2/m32","VSCALEFSS xmm2/m32, xmmV, {k}{z}, xmm1","vscalefss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 2D /r","V","V","AVX512F","scale4","w,r,r,r","",""
> +"VSCATTERDPD vm32x, {k1-k7}, xmm1","VSCATTERDPD xmm1, {k1-k7}, vm32x","vscatterdpd xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W1 A2 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VSCATTERDPD vm32x, {k1-k7}, ymm1","VSCATTERDPD ymm1, {k1-k7}, vm32x","vscatterdpd ymm1, {k1-k7}, vm32x","EVEX.256.66.0F38.W1 A2 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VSCATTERDPD vm32y, {k1-k7}, zmm1","VSCATTERDPD zmm1, {k1-k7}, vm32y","vscatterdpd zmm1, {k1-k7}, vm32y","EVEX.512.66.0F38.W1 A2 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
> +"VSCATTERDPS vm32x, {k1-k7}, xmm1","VSCATTERDPS xmm1, {k1-k7}, vm32x","vscatterdps xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W0 A2 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VSCATTERDPS vm32y, {k1-k7}, ymm1","VSCATTERDPS ymm1, {k1-k7}, vm32y","vscatterdps ymm1, {k1-k7}, vm32y","EVEX.256.66.0F38.W0 A2 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VSCATTERDPS vm32z, {k1-k7}, zmm1","VSCATTERDPS zmm1, {k1-k7}, vm32z","vscatterdps zmm1, {k1-k7}, vm32z","EVEX.512.66.0F38.W0 A2 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
> +"VSCATTERPF0DPD vm32y, {k1-k7}","VSCATTERPF0DPD {k1-k7}, vm32y","vscatterpf0dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /5","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
> +"VSCATTERPF0DPS vm32z, {k1-k7}","VSCATTERPF0DPS {k1-k7}, vm32z","vscatterpf0dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /5","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
> +"VSCATTERPF0QPD vm64z, {k1-k7}","VSCATTERPF0QPD {k1-k7}, vm64z","vscatterpf0qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /5","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
> +"VSCATTERPF0QPS vm64z, {k1-k7}","VSCATTERPF0QPS {k1-k7}, vm64z","vscatterpf0qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /5","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
> +"VSCATTERPF1DPD vm32y, {k1-k7}","VSCATTERPF1DPD {k1-k7}, vm32y","vscatterpf1dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /6","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
> +"VSCATTERPF1DPS vm32z, {k1-k7}","VSCATTERPF1DPS {k1-k7}, vm32z","vscatterpf1dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /6","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
> +"VSCATTERPF1QPD vm64z, {k1-k7}","VSCATTERPF1QPD {k1-k7}, vm64z","vscatterpf1qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /6","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
> +"VSCATTERPF1QPS vm64z, {k1-k7}","VSCATTERPF1QPS {k1-k7}, vm64z","vscatterpf1qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /6","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
> +"VSCATTERQPD vm64x, {k1-k7}, xmm1","VSCATTERQPD xmm1, {k1-k7}, vm64x","vscatterqpd xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W1 A3 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VSCATTERQPD vm64y, {k1-k7}, ymm1","VSCATTERQPD ymm1, {k1-k7}, vm64y","vscatterqpd ymm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W1 A3 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
> +"VSCATTERQPD vm64z, {k1-k7}, zmm1","VSCATTERQPD zmm1, {k1-k7}, vm64z","vscatterqpd zmm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W1 A3 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
> +"VSCATTERQPS vm64x, {k1-k7}, xmm1","VSCATTERQPS xmm1, {k1-k7}, vm64x","vscatterqps xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W0 A3 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VSCATTERQPS vm64y, {k1-k7}, xmm1","VSCATTERQPS xmm1, {k1-k7}, vm64y","vscatterqps xmm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W0 A3 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
> +"VSCATTERQPS vm64z, {k1-k7}, ymm1","VSCATTERQPS ymm1, {k1-k7}, vm64z","vscatterqps ymm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W0 A3 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
> +"VSHUFF32X4 ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFF32X4 imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshuff32x4 imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 23 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
> +"VSHUFF32X4 zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFF32X4 imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshuff32x4 imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 23 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
> +"VSHUFF64X2 ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFF64X2 imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshuff64x2 imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 23 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
> +"VSHUFF64X2 zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFF64X2 imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshuff64x2 imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 23 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
> +"VSHUFI32X4 ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFI32X4 imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshufi32x4 imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 43 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
> +"VSHUFI32X4 zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFI32X4 imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshufi32x4 imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 43 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
> +"VSHUFI64X2 ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFI64X2 imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshufi64x2 imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 43 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
> +"VSHUFI64X2 zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFI64X2 imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshufi64x2 imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 43 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
> +"VSHUFPD xmm1, xmmV, xmm2/m128, imm8u","VSHUFPD imm8u, xmm2/m128, xmmV, xmm1","vshufpd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C6 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VSHUFPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VSHUFPD imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vshufpd imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 C6 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
> +"VSHUFPD ymm1, ymmV, ymm2/m256, imm8u","VSHUFPD imm8u, ymm2/m256, ymmV, ymm1","vshufpd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG C6 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VSHUFPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFPD imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshufpd imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 C6 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
> +"VSHUFPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFPD imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshufpd imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 C6 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
> +"VSHUFPS xmm1, xmmV, xmm2/m128, imm8u","VSHUFPS imm8u, xmm2/m128, xmmV, xmm1","vshufps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG C6 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VSHUFPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VSHUFPS imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vshufps imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 C6 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
> +"VSHUFPS ymm1, ymmV, ymm2/m256, imm8u","VSHUFPS imm8u, ymm2/m256, ymmV, ymm1","vshufps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG C6 /r ib","V","V","AVX","","w,r,r,r","",""
> +"VSHUFPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFPS imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshufps imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 C6 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
> +"VSHUFPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFPS imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshufps imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 C6 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
> +"VSQRTPD xmm1, xmm2/m128","VSQRTPD xmm2/m128, xmm1","vsqrtpd xmm2/m128, xmm1","VEX.128.66.0F.WIG 51 /r","V","V","AVX","","w,r","",""
> +"VSQRTPD xmm1, {k}{z}, xmm2/m128/m64bcst","VSQRTPD xmm2/m128/m64bcst, {k}{z}, xmm1","vsqrtpd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 51 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
> +"VSQRTPD ymm1, ymm2/m256","VSQRTPD ymm2/m256, ymm1","vsqrtpd ymm2/m256, ymm1","VEX.256.66.0F.WIG 51 /r","V","V","AVX","","w,r","",""
> +"VSQRTPD ymm1, {k}{z}, ymm2/m256/m64bcst","VSQRTPD ymm2/m256/m64bcst, {k}{z}, ymm1","vsqrtpd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 51 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
> +"VSQRTPD zmm1{er}, {k}{z}, zmm2","VSQRTPD zmm2, {k}{z}, zmm1{er}","vsqrtpd zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 51 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VSQRTPD zmm1, {k}{z}, zmm2/m512/m64bcst","VSQRTPD zmm2/m512/m64bcst, {k}{z}, zmm1","vsqrtpd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 51 /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
> +"VSQRTPS xmm1, xmm2/m128","VSQRTPS xmm2/m128, xmm1","vsqrtps xmm2/m128, xmm1","VEX.128.0F.WIG 51 /r","V","V","AVX","","w,r","",""
> +"VSQRTPS xmm1, {k}{z}, xmm2/m128/m32bcst","VSQRTPS xmm2/m128/m32bcst, {k}{z}, xmm1","vsqrtps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 51 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
> +"VSQRTPS ymm1, ymm2/m256","VSQRTPS ymm2/m256, ymm1","vsqrtps ymm2/m256, ymm1","VEX.256.0F.WIG 51 /r","V","V","AVX","","w,r","",""
> +"VSQRTPS ymm1, {k}{z}, ymm2/m256/m32bcst","VSQRTPS ymm2/m256/m32bcst, {k}{z}, ymm1","vsqrtps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 51 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
> +"VSQRTPS zmm1{er}, {k}{z}, zmm2","VSQRTPS zmm2, {k}{z}, zmm1{er}","vsqrtps zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 51 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
> +"VSQRTPS zmm1, {k}{z}, zmm2/m512/m32bcst","VSQRTPS zmm2/m512/m32bcst, {k}{z}, zmm1","vsqrtps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 51 /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
> +"VSQRTSD xmm1{er}, {k}{z}, xmmV, xmm2","VSQRTSD xmm2, xmmV, {k}{z}, xmm1{er}","vsqrtsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 51 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VSQRTSD xmm1, xmmV, xmm2/m64","VSQRTSD xmm2/m64, xmmV, xmm1","vsqrtsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 51 /r","V","V","AVX","","w,r,r","",""
> +"VSQRTSD xmm1, {k}{z}, xmmV, xmm2/m64","VSQRTSD xmm2/m64, xmmV, {k}{z}, xmm1","vsqrtsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 51 /r","V","V","AVX512F","scale8","w,r,r,r","",""
> +"VSQRTSS xmm1{er}, {k}{z}, xmmV, xmm2","VSQRTSS xmm2, xmmV, {k}{z}, xmm1{er}","vsqrtss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 51 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VSQRTSS xmm1, xmmV, xmm2/m32","VSQRTSS xmm2/m32, xmmV, xmm1","vsqrtss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 51 /r","V","V","AVX","","w,r,r","",""
> +"VSQRTSS xmm1, {k}{z}, xmmV, xmm2/m32","VSQRTSS xmm2/m32, xmmV,
> {k}{z}, xmm1","vsqrtss xmm2/m32, xmmV, {k}{z},
> xmm1","EVEX.NDS.LIG.F3.0F.W0 51
> /r","V","V","AVX512F","scale4","w,r,r,r","",""
> +"VSTMXCSR m32","VSTMXCSR m32","vstmxcsr m32","VEX.128.0F.WIG AE /3","V","V","AVX","modrm_memonly","w","",""
> +"VSUBPD xmm1, xmmV, xmm2/m128","VSUBPD xmm2/m128, xmmV, xmm1","vsubpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
> +"VSUBPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VSUBPD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vsubpd xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 5C
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VSUBPD ymm1, ymmV, ymm2/m256","VSUBPD ymm2/m256, ymmV, ymm1","vsubpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
> +"VSUBPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VSUBPD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vsubpd ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 5C
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VSUBPD zmm1{er}, {k}{z}, zmmV, zmm2","VSUBPD zmm2, zmmV, {k}{z},
> zmm1{er}","vsubpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1
> 5C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VSUBPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VSUBPD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vsubpd zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 5C
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VSUBPS xmm1, xmmV, xmm2/m128","VSUBPS xmm2/m128, xmmV, xmm1","vsubps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
> +"VSUBPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VSUBPS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vsubps xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 5C
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VSUBPS ymm1, ymmV, ymm2/m256","VSUBPS ymm2/m256, ymmV, ymm1","vsubps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
> +"VSUBPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VSUBPS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vsubps ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 5C
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VSUBPS zmm1{er}, {k}{z}, zmmV, zmm2","VSUBPS zmm2, zmmV, {k}{z},
> zmm1{er}","vsubps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 5C
> /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VSUBPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VSUBPS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vsubps zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 5C
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VSUBSD xmm1{er}, {k}{z}, xmmV, xmm2","VSUBSD xmm2, xmmV, {k}{z},
> xmm1{er}","vsubsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1
> 5C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VSUBSD xmm1, xmmV, xmm2/m64","VSUBSD xmm2/m64, xmmV, xmm1","vsubsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
> +"VSUBSD xmm1, {k}{z}, xmmV, xmm2/m64","VSUBSD xmm2/m64, xmmV, {k}{z},
> xmm1","vsubsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5C
> /r","V","V","AVX512F","scale8","w,r,r,r","",""
> +"VSUBSS xmm1{er}, {k}{z}, xmmV, xmm2","VSUBSS xmm2, xmmV, {k}{z},
> xmm1{er}","vsubss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0
> 5C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
> +"VSUBSS xmm1, xmmV, xmm2/m32","VSUBSS xmm2/m32, xmmV, xmm1","vsubss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
> +"VSUBSS xmm1, {k}{z}, xmmV, xmm2/m32","VSUBSS xmm2/m32, xmmV, {k}{z},
> xmm1","vsubss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5C
> /r","V","V","AVX512F","scale4","w,r,r,r","",""
> +"VTESTPD xmm1, xmm2/m128","VTESTPD xmm2/m128, xmm1","vtestpd xmm2/m128, xmm1","VEX.128.66.0F38.W0 0F /r","V","V","AVX","","r,r","",""
> +"VTESTPD ymm1, ymm2/m256","VTESTPD ymm2/m256, ymm1","vtestpd ymm2/m256, ymm1","VEX.256.66.0F38.W0 0F /r","V","V","AVX","","r,r","",""
> +"VTESTPS xmm1, xmm2/m128","VTESTPS xmm2/m128, xmm1","vtestps xmm2/m128, xmm1","VEX.128.66.0F38.W0 0E /r","V","V","AVX","","r,r","",""
> +"VTESTPS ymm1, ymm2/m256","VTESTPS ymm2/m256, ymm1","vtestps ymm2/m256, ymm1","VEX.256.66.0F38.W0 0E /r","V","V","AVX","","r,r","",""
> +"VUCOMISD xmm1{sae}, xmm2","VUCOMISD xmm2, xmm1{sae}","vucomisd xmm2, xmm1{sae}","EVEX.128.66.0F.W1 2E /r","V","V","AVX512F","modrm_regonly","r,r","",""
> +"VUCOMISD xmm1, xmm2/m64","VUCOMISD xmm2/m64, xmm1","vucomisd xmm2/m64, xmm1","EVEX.LIG.66.0F.W1 2E /r","V","V","AVX512F","scale8","r,r","",""
> +"VUCOMISD xmm1, xmm2/m64","VUCOMISD xmm2/m64, xmm1","vucomisd xmm2/m64, xmm1","VEX.LIG.66.0F.WIG 2E /r","V","V","AVX","","r,r","",""
> +"VUCOMISS xmm1{sae}, xmm2","VUCOMISS xmm2, xmm1{sae}","vucomiss xmm2, xmm1{sae}","EVEX.128.0F.W0 2E /r","V","V","AVX512F","modrm_regonly","r,r","",""
> +"VUCOMISS xmm1, xmm2/m32","VUCOMISS xmm2/m32, xmm1","vucomiss xmm2/m32, xmm1","EVEX.LIG.0F.W0 2E /r","V","V","AVX512F","scale4","r,r","",""
> +"VUCOMISS xmm1, xmm2/m32","VUCOMISS xmm2/m32, xmm1","vucomiss xmm2/m32, xmm1","VEX.LIG.0F.WIG 2E /r","V","V","AVX","","r,r","",""
> +"VUNPCKHPD xmm1, xmmV, xmm2/m128","VUNPCKHPD xmm2/m128, xmmV, xmm1","vunpckhpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 15 /r","V","V","AVX","","w,r,r","",""
> +"VUNPCKHPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VUNPCKHPD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vunpckhpd xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 15
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VUNPCKHPD ymm1, ymmV, ymm2/m256","VUNPCKHPD ymm2/m256, ymmV, ymm1","vunpckhpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 15 /r","V","V","AVX","","w,r,r","",""
> +"VUNPCKHPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VUNPCKHPD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vunpckhpd ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 15
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VUNPCKHPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VUNPCKHPD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vunpckhpd zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 15
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VUNPCKHPS xmm1, xmmV, xmm2/m128","VUNPCKHPS xmm2/m128, xmmV, xmm1","vunpckhps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 15 /r","V","V","AVX","","w,r,r","",""
> +"VUNPCKHPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VUNPCKHPS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vunpckhps xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 15
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VUNPCKHPS ymm1, ymmV, ymm2/m256","VUNPCKHPS ymm2/m256, ymmV, ymm1","vunpckhps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 15 /r","V","V","AVX","","w,r,r","",""
> +"VUNPCKHPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VUNPCKHPS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vunpckhps ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 15
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VUNPCKHPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VUNPCKHPS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vunpckhps zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 15
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VUNPCKLPD xmm1, xmmV, xmm2/m128","VUNPCKLPD xmm2/m128, xmmV, xmm1","vunpcklpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 14 /r","V","V","AVX","","w,r,r","",""
> +"VUNPCKLPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VUNPCKLPD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vunpcklpd xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 14
> /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VUNPCKLPD ymm1, ymmV, ymm2/m256","VUNPCKLPD ymm2/m256, ymmV, ymm1","vunpcklpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 14 /r","V","V","AVX","","w,r,r","",""
> +"VUNPCKLPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VUNPCKLPD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vunpcklpd ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 14
> /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VUNPCKLPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VUNPCKLPD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vunpcklpd zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 14
> /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
> +"VUNPCKLPS xmm1, xmmV, xmm2/m128","VUNPCKLPS xmm2/m128, xmmV, xmm1","vunpcklps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 14 /r","V","V","AVX","","w,r,r","",""
> +"VUNPCKLPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VUNPCKLPS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vunpcklps xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 14
> /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VUNPCKLPS ymm1, ymmV, ymm2/m256","VUNPCKLPS ymm2/m256, ymmV, ymm1","vunpcklps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 14 /r","V","V","AVX","","w,r,r","",""
> +"VUNPCKLPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VUNPCKLPS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vunpcklps ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 14
> /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VUNPCKLPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VUNPCKLPS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vunpcklps zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 14
> /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
> +"VXORPD xmm1, xmmV, xmm2/m128","VXORPD xmm2/m128, xmmV, xmm1","vxorpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 57 /r","V","V","AVX","","w,r,r","",""
> +"VXORPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VXORPD
> xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vxorpd xmm2/m128/m64bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 57
> /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
> +"VXORPD ymm1, ymmV, ymm2/m256","VXORPD ymm2/m256, ymmV, ymm1","vxorpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 57 /r","V","V","AVX","","w,r,r","",""
> +"VXORPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VXORPD
> ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vxorpd ymm2/m256/m64bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 57
> /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
> +"VXORPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VXORPD
> zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vxorpd zmm2/m512/m64bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 57
> /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
> +"VXORPS xmm1, xmmV, xmm2/m128","VXORPS xmm2/m128, xmmV, xmm1","vxorps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 57 /r","V","V","AVX","","w,r,r","",""
> +"VXORPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VXORPS
> xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vxorps xmm2/m128/m32bcst,
> xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 57
> /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
> +"VXORPS ymm1, ymmV, ymm2/m256","VXORPS ymm2/m256, ymmV, ymm1","vxorps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 57 /r","V","V","AVX","","w,r,r","",""
> +"VXORPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VXORPS
> ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vxorps ymm2/m256/m32bcst,
> ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 57
> /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
> +"VXORPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VXORPS
> zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vxorps zmm2/m512/m32bcst,
> zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 57
> /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
> +"VZEROALL","VZEROALL","vzeroall","VEX.256.0F.WIG 77","V","V","AVX","","","",""
> +"VZEROUPPER","VZEROUPPER","vzeroupper","VEX.128.0F.WIG 77","V","V","AVX","","","",""
> +"WAIT","WAIT","wait","9B","V","V","","pseudo","","",""
> +"WBINVD","WBINVD","wbinvd","0F 09","V","V","486","","","",""
> +"WRFSBASE rmr32","WRFSBASE rmr32","wrfsbase rmr32","F3 0F AE /2","N.S.","V","FSGSBASE","modrm_regonly,operand16,operand32","r","Y","32"
> +"WRFSBASE rmr64","WRFSBASE rmr64","wrfsbase rmr64","F3 REX.W 0F AE /2","N.S.","V","FSGSBASE","modrm_regonly","r","Y","64"
> +"WRGSBASE rmr32","WRGSBASE rmr32","wrgsbase rmr32","F3 0F AE /3","N.S.","V","FSGSBASE","modrm_regonly,operand16,operand32","r","Y","32"
> +"WRGSBASE rmr64","WRGSBASE rmr64","wrgsbase rmr64","F3 REX.W 0F AE /3","N.S.","V","FSGSBASE","modrm_regonly","r","Y","64"
> +"WRMSR","WRMSR","wrmsr","0F 30","V","V","Pentium","","","",""
> +"WRPKRU","WRPKRU","wrpkru","0F 01 EF","V","V","PKU","","","",""
> +"WRSSD m32, r32","WRSSD r32, m32","wrssd r32, m32","0F 38 F6 /r","V","V","CET","modrm_memonly,operand16,operand32","w,r","",""
> +"WRSSQ m64, r64","WRSSQ r64, m64","wrssq r64, m64","REX.W 0F 38 F6 /r","N.S.","V","CET","modrm_memonly","w,r","",""
> +"WRUSSD m32, r32","WRUSSD r32, m32","wrussd r32, m32","66 0F 38 F5 /r","V","V","CET","modrm_memonly,operand16,operand32","w,r","",""
> +"WRUSSQ m64, r64","WRUSSQ r64, m64","wrussq r64, m64","66 REX.W 0F 38 F5 /r","N.S.","V","CET","modrm_memonly","w,r","",""
> +"XABORT imm8u","XABORT imm8u","xabort imm8u","C6 F8 ib","V","V","RTM","modrm_regonly","r","",""
> +"XACQUIRE","XACQUIRE","xacquire","F2","V","V","HLE","pseudo","","",""
> +"XADD r/m8, r8","XADDB r8, r/m8","xaddb r8, r/m8","0F C0 /r","V","V","486","","rw,rw","Y","8"
> +"XADD r/m8, r8","XADDB r8, r/m8","xaddb r8, r/m8","REX 0F C0 /r","N.E.","V","","pseudo64","rw,w","Y","8"
> +"XADD r/m32, r32","XADDL r32, r/m32","xaddl r32, r/m32","0F C1 /r","V","V","486","operand32","rw,rw","Y","32"
> +"XADD r/m64, r64","XADDQ r64, r/m64","xaddq r64, r/m64","REX.W 0F C1 /r","N.S.","V","486","","rw,rw","Y","64"
> +"XADD r/m16, r16","XADDW r16, r/m16","xaddw r16, r/m16","0F C1 /r","V","V","486","operand16","rw,rw","Y","16"
> +"XBEGIN rel16","XBEGIN rel16","xbegin rel16","C7 F8 cw","V","V","RTM","modrm_regonly,operand16","r","",""
> +"XBEGIN rel32","XBEGIN rel32","xbegin rel32","C7 F8 cd","V","V","RTM","modrm_regonly,operand32,operand64","r","",""
> +"XCHG r8, r/m8","XCHGB r/m8, r8","xchgb r/m8, r8","86 /r","V","V","","pseudo","w,r","Y","8"
> +"XCHG r8, r/m8","XCHGB r/m8, r8","xchgb r/m8, r8","REX 86 /r","N.E.","V","","pseudo","w,r","Y","8"
> +"XCHG r/m8, r8","XCHGB r8, r/m8","xchgb r8, r/m8","86 /r","V","V","","","rw,rw","Y","8"
> +"XCHG r/m8, r8","XCHGB r8, r/m8","xchgb r8, r/m8","REX 86 /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"XCHG r32op, EAX","XCHGL EAX, r32op","xchgl EAX, r32op","90+rd","V","V","","operand32","rw,rw","Y","32"
> +"XCHG r32, r/m32","XCHGL r/m32, r32","xchgl r/m32, r32","87 /r","V","V","","operand32,pseudo","w,r","Y","32"
> +"XCHG r/m32, r32","XCHGL r32, r/m32","xchgl r32, r/m32","87 /r","V","V","","operand32","rw,rw","Y","32"
> +"XCHG EAX, r32op","XCHGL r32op, EAX","xchgl r32op, EAX","90+rd","V","V","","operand32,pseudo","rw,rw","Y","32"
> +"XCHG r64op, RAX","XCHGQ RAX, r64op","xchgq RAX, r64op","REX.W 90+ro","N.S.","V","","","rw,rw","Y","64"
> +"XCHG r64, r/m64","XCHGQ r/m64, r64","xchgq r/m64, r64","REX.W 87 /r","N.E.","V","","pseudo","w,r","Y","64"
> +"XCHG r/m64, r64","XCHGQ r64, r/m64","xchgq r64, r/m64","REX.W 87 /r","N.S.","V","","","rw,rw","Y","64"
> +"XCHG RAX, r64op","XCHGQ r64op, RAX","xchgq r64op, RAX","REX.W 90+rd","N.E.","V","","pseudo","rw,rw","Y","64"
> +"XCHG r16op, AX","XCHGW AX, r16op","xchgw AX, r16op","90+rw","V","V","","operand16","rw,rw","Y","16"
> +"XCHG r16, r/m16","XCHGW r/m16, r16","xchgw r/m16, r16","87 /r","V","V","","operand16,pseudo","w,r","Y","16"
> +"XCHG r/m16, r16","XCHGW r16, r/m16","xchgw r16, r/m16","87 /r","V","V","","operand16","rw,rw","Y","16"
> +"XCHG AX, r16op","XCHGW r16op, AX","xchgw r16op, AX","90+rw","V","V","","operand16,pseudo","rw,rw","Y","16"
> +"XEND","XEND","xend","0F 01 D5","V","V","RTM","","","",""
> +"XGETBV","XGETBV","xgetbv","0F 01 D0","V","V","XSAVE","","","",""
> +"XLATB","XLAT","xlat","D7","V","V","","","","",""
> +"XLATB","XLAT","xlat","REX.W D7","N.E.","V","","pseudo","","",""
> +"XOR r/m8, imm8","XORB imm8, r/m8","xorb imm8, r/m8","REX 80 /6 ib","N.E.","V","","pseudo64","rw,r","Y","8"
> +"XOR AL, imm8u","XORB imm8u, AL","xorb imm8u, AL","34 ib","V","V","","","rw,r","Y","8"
> +"XOR r/m8, imm8u","XORB imm8u, r/m8","xorb imm8u, r/m8","80 /6 ib","V","V","","","rw,r","Y","8"
> +"XOR r/m8, imm8u","XORB imm8u, r/m8","xorb imm8u, r/m8","82 /6 ib","V","N.S.","","","rw,r","Y","8"
> +"XOR r8, r/m8","XORB r/m8, r8","xorb r/m8, r8","32 /r","V","V","","","rw,r","Y","8"
> +"XOR r8, r/m8","XORB r/m8, r8","xorb r/m8, r8","REX 32 /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"XOR r/m8, r8","XORB r8, r/m8","xorb r8, r/m8","30 /r","V","V","","","rw,r","Y","8"
> +"XOR r/m8, r8","XORB r8, r/m8","xorb r8, r/m8","REX 30 /r","N.E.","V","","pseudo64","rw,r","Y","8"
> +"XOR EAX, imm32","XORL imm32, EAX","xorl imm32, EAX","35 id","V","V","","operand32","rw,r","Y","32"
> +"XOR r/m32, imm32","XORL imm32, r/m32","xorl imm32, r/m32","81 /6 id","V","V","","operand32","rw,r","Y","32"
> +"XOR r/m32, imm8","XORL imm8, r/m32","xorl imm8, r/m32","83 /6 ib","V","V","","operand32","rw,r","Y","32"
> +"XOR r32, r/m32","XORL r/m32, r32","xorl r/m32, r32","33 /r","V","V","","operand32","rw,r","Y","32"
> +"XOR r/m32, r32","XORL r32, r/m32","xorl r32, r/m32","31 /r","V","V","","operand32","rw,r","Y","32"
> +"XORPD xmm1, xmm2/m128","XORPD xmm2/m128, xmm1","xorpd xmm2/m128, xmm1","66 0F 57 /r","V","V","SSE2","","rw,r","",""
> +"XORPS xmm1, xmm2/m128","XORPS xmm2/m128, xmm1","xorps xmm2/m128, xmm1","0F 57 /r","V","V","SSE","","rw,r","",""
> +"XOR RAX, imm32","XORQ imm32, RAX","xorq imm32, RAX","REX.W 35 id","N.S.","V","","","rw,r","Y","64"
> +"XOR r/m64, imm32","XORQ imm32, r/m64","xorq imm32, r/m64","REX.W 81 /6 id","N.S.","V","","","rw,r","Y","64"
> +"XOR r/m64, imm8","XORQ imm8, r/m64","xorq imm8, r/m64","REX.W 83 /6 ib","N.S.","V","","","rw,r","Y","64"
> +"XOR r64, r/m64","XORQ r/m64, r64","xorq r/m64, r64","REX.W 33 /r","N.S.","V","","","rw,r","Y","64"
> +"XOR r/m64, r64","XORQ r64, r/m64","xorq r64, r/m64","REX.W 31 /r","N.S.","V","","","rw,r","Y","64"
> +"XOR AX, imm16","XORW imm16, AX","xorw imm16, AX","35 iw","V","V","","operand16","rw,r","Y","16"
> +"XOR r/m16, imm16","XORW imm16, r/m16","xorw imm16, r/m16","81 /6 iw","V","V","","operand16","rw,r","Y","16"
> +"XOR r/m16, imm8","XORW imm8, r/m16","xorw imm8, r/m16","83 /6 ib","V","V","","operand16","rw,r","Y","16"
> +"XOR r16, r/m16","XORW r/m16, r16","xorw r/m16, r16","33 /r","V","V","","operand16","rw,r","Y","16"
> +"XOR r/m16, r16","XORW r16, r/m16","xorw r16, r/m16","31 /r","V","V","","operand16","rw,r","Y","16"
> +"XRELEASE","XRELEASE","xrelease","F3","V","V","HLE","pseudo","","",""
> +"XRSTOR mem","XRSTOR mem","xrstor mem","0F AE /5","V","V","XSAVE","modrm_memonly,operand16,operand32","r","",""
> +"XRSTOR64 mem","XRSTOR64 mem","xrstor64 mem","REX.W 0F AE /5","N.S.","V","XSAVE","modrm_memonly","r","",""
> +"XRSTORS mem","XRSTORS mem","xrstors mem","0F C7 /3","V","V","XSAVES","modrm_memonly,operand16,operand32","r","",""
> +"XRSTORS64 mem","XRSTORS64 mem","xrstors64 mem","REX.W 0F C7 /3","N.S.","V","XSAVES","modrm_memonly","r","",""
> +"XSAVE mem","XSAVE mem","xsave mem","0F AE /4","V","V","XSAVE","modrm_memonly,operand16,operand32","w","",""
> +"XSAVE64 mem","XSAVE64 mem","xsave64 mem","REX.W 0F AE /4","N.S.","V","XSAVE","modrm_memonly","w","",""
> +"XSAVEC mem","XSAVEC mem","xsavec mem","0F C7 /4","V","V","XSAVEC","modrm_memonly,operand16,operand32","w","",""
> +"XSAVEC64 mem","XSAVEC64 mem","xsavec64 mem","REX.W 0F C7 /4","N.S.","V","XSAVEC","modrm_memonly","w","",""
> +"XSAVEOPT mem","XSAVEOPT mem","xsaveopt mem","0F AE /6","V","V","XSAVEOPT","modrm_memonly,operand16,operand32","w","",""
> +"XSAVEOPT64 mem","XSAVEOPT64 mem","xsaveopt64 mem","REX.W 0F AE /6","N.S.","V","XSAVEOPT","modrm_memonly","w","",""
> +"XSAVES mem","XSAVES mem","xsaves mem","0F C7 /5","V","V","XSAVES","modrm_memonly,operand16,operand32","w","",""
> +"XSAVES64 mem","XSAVES64 mem","xsaves64 mem","REX.W 0F C7 /5","N.S.","V","XSAVES","modrm_memonly","w","",""
> +"XSETBV","XSETBV","xsetbv","0F 01 D1","V","V","XSAVE","","","",""
> +"XTEST","XTEST","xtest","0F 01 D6","V","V","HLE or RTM","","","",""


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 2/4] TCG support for AVX
  2022-04-18 19:45     ` Paul Brook
  2022-04-18 19:50       ` Peter Maydell
  2022-04-18 23:14       ` Richard Henderson
@ 2022-04-20 14:19       ` Paolo Bonzini
  2022-04-20 18:59         ` Paul Brook
  2 siblings, 1 reply; 67+ messages in thread
From: Paolo Bonzini @ 2022-04-20 14:19 UTC (permalink / raw)
  To: Paul Brook, Peter Maydell; +Cc: Eduardo Habkost, Richard Henderson, qemu-devel

On 4/18/22 21:45, Paul Brook wrote:
>> Massively too large for a single patch, I'm afraid. This needs
>> to be split, probably into at least twenty patches, which each
>> are a reviewable chunk of code that does one coherent thing.
> Hmm, I'll see what I can do.
> 
> Unfortunately the table driven decoding means that going from two to
> three operands tends to be a bit all or nothing just to get the thing
> to compile.

Hi Paul, welcome back and thanks for this huge work.  It should be
possible at least to split the patch as follows (at least that's
what _I_ would do in order to review it):

* mechanical changes to translate.c

-    [0x10] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movups, movupd, movss, movsd */
+    [0x10] = SSE_SPECIAL, /* movups, movupd, movss, movsd */
+    [0x11] = SSE_SPECIAL, /* movups, movupd, movss, movsd */
+    [0x12] = SSE_SPECIAL, /* movlps, movlpd, movsldup, movddup */
+    [0x13] = SSE_SPECIAL, /* movlps, movlpd */

* mechanical introduction of XMM_OFFSET()

* non-AVX/VEX changes (e.g. SSE_OPF_3DNOW)

* decoding fixes for SSE instructions (CHECK_NO_VEX)

* fix zeroing of high bits

* 3-operand decoding and helpers

* AVX/AVX2 support (existing instructions)

* AVX/AVX2 support (new instructions)

I can do some of the work too since I was planning to do this
anyway (but have hardly started yet).

Paolo


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 2/4] TCG support for AVX
  2022-04-20 14:19       ` Paolo Bonzini
@ 2022-04-20 18:59         ` Paul Brook
  0 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-20 18:59 UTC (permalink / raw)
  To: Paolo Bonzini, Peter Maydell
  Cc: Eduardo Habkost, Richard Henderson, qemu-devel

On Wed, 2022-04-20 at 16:19 +0200, Paolo Bonzini wrote:
> On 4/18/22 21:45, Paul Brook wrote:
> > > Massively too large for a single patch, I'm afraid. This needs
> > > to be split, probably into at least twenty patches, which each
> > > are a reviewable chunk of code that does one coherent thing.
> > Hmm, I'll see what I can do.
> > 
> > Unfortunately the table driven decoding means that going from two
> > to
> > three operands tends to be a bit all or nothing just to get the
> > thing
> > to compile.
> 
> Hi Paul, welcome back and thanks for this huge work.  It should be
> possible at least to split the patch as follows (at least that's
> what _I_ would do in order to review it):
> [snip]



Ok, that sounds like a reasonable start.

> I can do some of the work too since I was planning to do this
> anyway (but have hardly started yet).

I'll push my changes to https://github.com/pbrook/qemu . This is a
personal project, so I'll be working on it as and when.

If you have additional comments/suggestions on the approach taken then
I'd be happy to hear them.

Paul


^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v2 01/42] i386: pcmpestr 64-bit sign extension bug
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (3 preceding siblings ...)
  2022-04-18 17:39 ` [PATCH 4/4] AVX tests Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-25 15:50   ` Richard Henderson
  2022-04-27  7:00   ` Paolo Bonzini
  2022-04-24 22:01 ` [PATCH v2 02/42] i386: DPPS rounding fix Paul Brook
                   ` (40 subsequent siblings)
  45 siblings, 2 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

The abs1 function in ops_sse.h only works correctly when the result fits
in a signed int. This is fine most of the time because we're only dealing
with byte sized values.

However, the pcmp_elen helper function uses abs1 to calculate the absolute
value of a cpu register. This incorrectly truncates to 32 bits, and will give
the wrong answer for the most negative value.

Fix by open coding the saturation check before taking the absolute value.

Signed-off-by: Paul Brook <paul@nowt.org>
---
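A minimal standalone sketch of the problem and the fix, for illustration
only. The names abs_int, elen_truncating and elen_saturating are made up
for this note and are not the QEMU helpers; the point is just the order
of the range check and the narrowing conversion.

```c
#include <stdint.h>

/* Hypothetical stand-in for a helper that takes and returns plain int. */
static int abs_int(int a)
{
    return a < 0 ? -a : a;
}

/*
 * Old shape: the 64-bit register value is narrowed to int before the
 * range check. The narrowing is implementation-defined in C and drops
 * the high 32 bits on common ABIs, so a large value can look small.
 * (A value that narrows to INT_MIN would additionally overflow in
 * abs_int, so do not call this with such inputs.)
 */
static int elen_truncating(int64_t reg, int limit)
{
    int val = abs_int((int)reg);
    return val > limit ? limit : val;
}

/* New shape: range-check at full width, then take the absolute value. */
static int elen_saturating(int64_t reg, int limit)
{
    if (reg > limit || reg < -limit) {
        return limit;
    }
    /* Here -limit <= reg <= limit, so the narrowing is exact. */
    return abs_int((int)reg);
}
```

With a limit of 16, elen_saturating(INT64_C(1) << 32, 16) returns 16,
whereas the truncating variant would typically see 0 after narrowing.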
 target/i386/ops_sse.h | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index e4d74b814a..535440f882 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2011,25 +2011,23 @@ SSE_HELPER_Q(helper_pcmpgtq, FCMPGTQ)
 
 static inline int pcmp_elen(CPUX86State *env, int reg, uint32_t ctrl)
 {
-    int val;
+    target_long val, limit;
 
     /* Presence of REX.W is indicated by a bit higher than 7 set */
     if (ctrl >> 8) {
-        val = abs1((int64_t)env->regs[reg]);
+        val = (target_long)env->regs[reg];
     } else {
-        val = abs1((int32_t)env->regs[reg]);
+        val = (int32_t)env->regs[reg];
     }
-
     if (ctrl & 1) {
-        if (val > 8) {
-            return 8;
-        }
+        limit = 8;
     } else {
-        if (val > 16) {
-            return 16;
-        }
+        limit = 16;
     }
-    return val;
+    if ((val > limit) || (val < -limit)) {
+        return limit;
+    }
+    return abs1(val);
 }
 
 static inline int pcmp_ilen(Reg *r, uint8_t ctrl)
-- 
2.36.0



^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v2 02/42] i386: DPPS rounding fix
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (4 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 01/42] i386: pcmpestr 64-bit sign extension bug Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-25 16:09   ` Richard Henderson
  2022-04-24 22:01 ` [PATCH v2 03/42] Add AVX_EN hflag Paul Brook
                   ` (39 subsequent siblings)
  45 siblings, 1 reply; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

The DPPS (Dot Product) instruction is defined to first sum pairs of
intermediate results, then sum those values to get the final result.
i.e. (A+B)+(C+D)

We incrementally sum the results, i.e. ((A+B)+C)+D, which can result
in incorrect rounding.

For consistency, also remove the redundant (but harmless) add operation
from DPPD.

Signed-off-by: Paul Brook <paul@nowt.org>
---
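To see why the grouping matters, here is a small host-side sketch using
plain C floats rather than softfloat. It assumes IEEE single precision
and the default round-to-nearest-even mode; sum_sequential and
sum_pairwise are names invented for this note.

```c
/*
 * volatile forces every intermediate to be stored as a float, so each
 * addition is rounded to single precision even on hosts that would
 * otherwise keep excess precision.
 */
static float sum_sequential(float a, float b, float c, float d)
{
    volatile float t = a + b;   /* (A+B) */
    t = t + c;                  /* ((A+B)+C) */
    t = t + d;                  /* ((A+B)+C)+D */
    return t;
}

static float sum_pairwise(float a, float b, float c, float d)
{
    volatile float lo = a + b;  /* (A+B) */
    volatile float hi = c + d;  /* (C+D) */
    volatile float r = lo + hi; /* (A+B)+(C+D) */
    return r;
}
```

With A = 2^24 (16777216.0f) and B = C = D = 1.0f, each sequential add of
1.0 lands on a tie and rounds back to 2^24, while the pairwise form adds
an exactly representable 2.0 and yields 16777218.0f.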
 target/i386/ops_sse.h | 47 +++++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 22 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 535440f882..a5a48a20f6 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -1934,32 +1934,36 @@ SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP)
 
 void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
 {
-    float32 iresult = float32_zero;
+    float32 prod, iresult, iresult2;
 
+    /*
+     * We must evaluate (A+B)+(C+D), not ((A+B)+C)+D
+     * to correctly round the intermediate results
+     */
     if (mask & (1 << 4)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(0), s->ZMM_S(0),
-                                          &env->sse_status),
-                              &env->sse_status);
+        iresult = float32_mul(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
+    } else {
+        iresult = float32_zero;
     }
     if (mask & (1 << 5)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(1), s->ZMM_S(1),
-                                          &env->sse_status),
-                              &env->sse_status);
+        prod = float32_mul(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
+    } else {
+        prod = float32_zero;
     }
+    iresult = float32_add(iresult, prod, &env->sse_status);
     if (mask & (1 << 6)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(2), s->ZMM_S(2),
-                                          &env->sse_status),
-                              &env->sse_status);
+        iresult2 = float32_mul(d->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
+    } else {
+        iresult2 = float32_zero;
     }
     if (mask & (1 << 7)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(3), s->ZMM_S(3),
-                                          &env->sse_status),
-                              &env->sse_status);
+        prod = float32_mul(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
+    } else {
+        prod = float32_zero;
     }
+    iresult2 = float32_add(iresult2, prod, &env->sse_status);
+    iresult = float32_add(iresult, iresult2, &env->sse_status);
+
     d->ZMM_S(0) = (mask & (1 << 0)) ? iresult : float32_zero;
     d->ZMM_S(1) = (mask & (1 << 1)) ? iresult : float32_zero;
     d->ZMM_S(2) = (mask & (1 << 2)) ? iresult : float32_zero;
@@ -1968,13 +1972,12 @@ void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
 
 void glue(helper_dppd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
 {
-    float64 iresult = float64_zero;
+    float64 iresult;
 
     if (mask & (1 << 4)) {
-        iresult = float64_add(iresult,
-                              float64_mul(d->ZMM_D(0), s->ZMM_D(0),
-                                          &env->sse_status),
-                              &env->sse_status);
+        iresult = float64_mul(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
+    } else {
+        iresult = float64_zero;
     }
     if (mask & (1 << 5)) {
         iresult = float64_add(iresult,
-- 
2.36.0



^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v2 03/42] Add AVX_EN hflag
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (5 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 02/42] i386: DPPS rounding fix Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-25 17:27   ` Richard Henderson
  2022-04-24 22:01 ` [PATCH v2 04/42] i386: Rework sse_op_table1 Paul Brook
                   ` (38 subsequent siblings)
  45 siblings, 1 reply; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

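Add a new hflag bit to determine whether AVX instructions are allowed.
The bit is set when CR4.OSXSAVE is set and XCR0 enables both SSE and YMM
state, and is resynced whenever CR4 or XCR0 is written.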

Signed-off-by: Paul Brook <paul@nowt.org>
---
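The enable condition can be sketched on its own as follows. This is an
illustration only: the SKETCH_* macros and avx_usable are names made up
for this note (the patch uses QEMU's existing CR4_OSXSAVE_MASK and
XSTATE_*_MASK definitions); the bit positions are the architectural
ones (CR4 bit 18, XCR0 bits 1 and 2).

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions; macro names are local to this sketch. */
#define SKETCH_CR4_OSXSAVE  (UINT64_C(1) << 18)
#define SKETCH_XCR0_SSE     (UINT64_C(1) << 1)
#define SKETCH_XCR0_YMM     (UINT64_C(1) << 2)

/*
 * AVX is usable only if the OS has enabled XSAVE (CR4.OSXSAVE) and has
 * enabled both the SSE and the YMM state components in XCR0.
 */
static bool avx_usable(uint64_t cr4, uint64_t xcr0)
{
    const uint64_t need = SKETCH_XCR0_SSE | SKETCH_XCR0_YMM;

    return (cr4 & SKETCH_CR4_OSXSAVE) != 0 && (xcr0 & need) == need;
}
```

Enabling only one of the two XCR0 bits is not enough, which is why the
check compares against the combined mask rather than testing for any
set bit.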
 target/i386/cpu.h            |  3 +++
 target/i386/helper.c         | 12 ++++++++++++
 target/i386/tcg/fpu_helper.c |  1 +
 3 files changed, 16 insertions(+)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 9661f9fbd1..65200a1917 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -169,6 +169,7 @@ typedef enum X86Seg {
 #define HF_MPX_EN_SHIFT     25 /* MPX Enabled (CR4+XCR0+BNDCFGx) */
 #define HF_MPX_IU_SHIFT     26 /* BND registers in-use */
 #define HF_UMIP_SHIFT       27 /* CR4.UMIP */
+#define HF_AVX_EN_SHIFT     28 /* AVX Enabled (CR4+XCR0) */
 
 #define HF_CPL_MASK          (3 << HF_CPL_SHIFT)
 #define HF_INHIBIT_IRQ_MASK  (1 << HF_INHIBIT_IRQ_SHIFT)
@@ -195,6 +196,7 @@ typedef enum X86Seg {
 #define HF_MPX_EN_MASK       (1 << HF_MPX_EN_SHIFT)
 #define HF_MPX_IU_MASK       (1 << HF_MPX_IU_SHIFT)
 #define HF_UMIP_MASK         (1 << HF_UMIP_SHIFT)
+#define HF_AVX_EN_MASK       (1 << HF_AVX_EN_SHIFT)
 
 /* hflags2 */
 
@@ -2035,6 +2037,7 @@ void host_cpuid(uint32_t function, uint32_t count,
 
 /* helper.c */
 void x86_cpu_set_a20(X86CPU *cpu, int a20_state);
+void cpu_sync_avx_hflag(CPUX86State *env);
 
 #ifndef CONFIG_USER_ONLY
 static inline int x86_asidx_from_attrs(CPUState *cs, MemTxAttrs attrs)
diff --git a/target/i386/helper.c b/target/i386/helper.c
index fa409e9c44..30083c9cff 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -29,6 +29,17 @@
 #endif
 #include "qemu/log.h"
 
+void cpu_sync_avx_hflag(CPUX86State *env)
+{
+    if ((env->cr[4] & CR4_OSXSAVE_MASK)
+        && (env->xcr0 & (XSTATE_SSE_MASK | XSTATE_YMM_MASK))
+            == (XSTATE_SSE_MASK | XSTATE_YMM_MASK)) {
+        env->hflags |= HF_AVX_EN_MASK;
+    } else {
+        env->hflags &= ~HF_AVX_EN_MASK;
+    }
+}
+
 void cpu_sync_bndcs_hflags(CPUX86State *env)
 {
     uint32_t hflags = env->hflags;
@@ -209,6 +220,7 @@ void cpu_x86_update_cr4(CPUX86State *env, uint32_t new_cr4)
     env->hflags = hflags;
 
     cpu_sync_bndcs_hflags(env);
+    cpu_sync_avx_hflag(env);
 }
 
 #if !defined(CONFIG_USER_ONLY)
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index ebf5e73df9..b391b69635 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -2943,6 +2943,7 @@ void helper_xsetbv(CPUX86State *env, uint32_t ecx, uint64_t mask)
 
     env->xcr0 = mask;
     cpu_sync_bndcs_hflags(env);
+    cpu_sync_avx_hflag(env);
     return;
 
  do_gpf:
-- 
2.36.0




* [PATCH v2 04/42] i386: Rework sse_op_table1
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (6 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 03/42] Add AVX_EN hflag Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 05/42] i386: Rework sse_op_table6/7 Paul Brook
                   ` (37 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Add a flags field to each row in sse_op_table1.

Initially this is only used as a replacement for the magic
SSE_SPECIAL and SSE_DUMMY pointers.  The other flags will become relevant
as the rest of the AVX implementation is built out.
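
As a rough sketch of the idea (not the code in this patch), a row that
carries a flags word next to its helper pointers can be classified
without comparing against sentinel pointer values.  All toy_/TOY_ names
below are hypothetical stand-ins, and the helpers are empty stubs.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values mirror the bit positions used in the patch. */
#define TOY_OPF_SPECIAL (1 << 3)
#define TOY_OPF_3DNOW   (1 << 4)

typedef void (*ToyHelper)(void);

struct ToyOpRow {
    int flags;
    ToyHelper op[4];    /* indexed by prefix: none, 66, F3, F2 */
};

/* Stub helpers standing in for the generated gen_helper_* functions. */
static void toy_addps(void) {}
static void toy_addpd(void) {}

static const struct ToyOpRow toy_table[256] = {
    [0x0f] = { TOY_OPF_3DNOW },
    [0x10] = { TOY_OPF_SPECIAL },
    [0x58] = { 0, { toy_addps, toy_addpd, NULL, NULL } },
};

enum ToyDecode { TOY_UNKNOWN, TOY_SPECIAL, TOY_3DNOW, TOY_HELPER };

/* Classify opcode byte b with prefix index b1. */
static enum ToyDecode toy_decode(int b, int b1)
{
    assert(b >= 0 && b < 256 && b1 >= 0 && b1 < 4);
    const struct ToyOpRow *row = &toy_table[b];

    if (row->flags & TOY_OPF_SPECIAL) {
        return TOY_SPECIAL;     /* decoded by hand elsewhere */
    }
    if (row->flags & TOY_OPF_3DNOW) {
        return TOY_3DNOW;       /* dispatched via a secondary table */
    }
    /* Ordinary rows are valid only if a helper exists for this prefix. */
    return row->op[b1] ? TOY_HELPER : TOY_UNKNOWN;
}
```

With the flags held separately, a NULL helper pointer only ever means
"no such encoding", and the row has room for further per-instruction
properties later.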

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/tcg/translate.c | 316 +++++++++++++++++++++---------------
 1 file changed, 186 insertions(+), 130 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index b7972f0ff5..7fec582358 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2788,146 +2788,196 @@ typedef void (*SSEFunc_0_ppi)(TCGv_ptr reg_a, TCGv_ptr reg_b, TCGv_i32 val);
 typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
                                TCGv val);
 
-#define SSE_SPECIAL ((void *)1)
-#define SSE_DUMMY ((void *)2)
+#define SSE_OPF_V0        (1 << 0) /* vex.v must be 1111b (only 2 operands) */
+#define SSE_OPF_CMP       (1 << 1) /* does not write to the first operand */
+#define SSE_OPF_BLENDV    (1 << 2) /* blendv* instruction */
+#define SSE_OPF_SPECIAL   (1 << 3) /* magic */
+#define SSE_OPF_3DNOW     (1 << 4) /* 3DNow! instruction */
+#define SSE_OPF_MMX       (1 << 5) /* MMX/integer/AVX2 instruction */
+#define SSE_OPF_SCALAR    (1 << 6) /* Has SSE scalar variants */
+#define SSE_OPF_AVX2      (1 << 7) /* AVX2 instruction */
+#define SSE_OPF_SHUF      (1 << 9) /* pshufx/shufpx */
+
+#define OP(op, flags, a, b, c, d)       \
+    {flags, {a, b, c, d} }
+
+#define MMX_OP(x) OP(op2, SSE_OPF_MMX, \
+        gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL)
+
+#define SSE_FOP(name) OP(op2, SSE_OPF_SCALAR, \
+        gen_helper_##name##ps, gen_helper_##name##pd, \
+        gen_helper_##name##ss, gen_helper_##name##sd)
+#define SSE_OP(sname, dname, op, flags) OP(op, flags, \
+        gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL)
+
+struct SSEOpHelper_table1 {
+    int flags;
+    SSEFunc_0_epp op[4];
+};
 
-#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm }
-#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \
-                     gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, }
+#define SSE_3DNOW { SSE_OPF_3DNOW }
+#define SSE_SPECIAL { SSE_OPF_SPECIAL }
 
-static const SSEFunc_0_epp sse_op_table1[256][4] = {
+static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     /* 3DNow! extensions */
-    [0x0e] = { SSE_DUMMY }, /* femms */
-    [0x0f] = { SSE_DUMMY }, /* pf... */
+    [0x0e] = SSE_SPECIAL, /* femms */
+    [0x0f] = SSE_3DNOW, /* pf... (sse_op_table5) */
     /* pure SSE operations */
-    [0x10] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movups, movupd, movss, movsd */
-    [0x11] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movups, movupd, movss, movsd */
-    [0x12] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movlps, movlpd, movsldup, movddup */
-    [0x13] = { SSE_SPECIAL, SSE_SPECIAL },  /* movlps, movlpd */
-    [0x14] = { gen_helper_punpckldq_xmm, gen_helper_punpcklqdq_xmm },
-    [0x15] = { gen_helper_punpckhdq_xmm, gen_helper_punpckhqdq_xmm },
-    [0x16] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL },  /* movhps, movhpd, movshdup */
-    [0x17] = { SSE_SPECIAL, SSE_SPECIAL },  /* movhps, movhpd */
-
-    [0x28] = { SSE_SPECIAL, SSE_SPECIAL },  /* movaps, movapd */
-    [0x29] = { SSE_SPECIAL, SSE_SPECIAL },  /* movaps, movapd */
-    [0x2a] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* cvtpi2ps, cvtpi2pd, cvtsi2ss, cvtsi2sd */
-    [0x2b] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movntps, movntpd, movntss, movntsd */
-    [0x2c] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* cvttps2pi, cvttpd2pi, cvttsd2si, cvttss2si */
-    [0x2d] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* cvtps2pi, cvtpd2pi, cvtsd2si, cvtss2si */
-    [0x2e] = { gen_helper_ucomiss, gen_helper_ucomisd },
-    [0x2f] = { gen_helper_comiss, gen_helper_comisd },
-    [0x50] = { SSE_SPECIAL, SSE_SPECIAL }, /* movmskps, movmskpd */
-    [0x51] = SSE_FOP(sqrt),
-    [0x52] = { gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL },
-    [0x53] = { gen_helper_rcpps, NULL, gen_helper_rcpss, NULL },
-    [0x54] = { gen_helper_pand_xmm, gen_helper_pand_xmm }, /* andps, andpd */
-    [0x55] = { gen_helper_pandn_xmm, gen_helper_pandn_xmm }, /* andnps, andnpd */
-    [0x56] = { gen_helper_por_xmm, gen_helper_por_xmm }, /* orps, orpd */
-    [0x57] = { gen_helper_pxor_xmm, gen_helper_pxor_xmm }, /* xorps, xorpd */
+    [0x10] = SSE_SPECIAL, /* movups, movupd, movss, movsd */
+    [0x11] = SSE_SPECIAL, /* movups, movupd, movss, movsd */
+    [0x12] = SSE_SPECIAL, /* movlps, movlpd, movsldup, movddup */
+    [0x13] = SSE_SPECIAL, /* movlps, movlpd */
+    [0x14] = SSE_OP(punpckldq, punpcklqdq, op2, 0), /* unpcklps, unpcklpd */
+    [0x15] = SSE_OP(punpckhdq, punpckhqdq, op2, 0), /* unpckhps, unpckhpd */
+    [0x16] = SSE_SPECIAL, /* movhps, movhpd, movshdup */
+    [0x17] = SSE_SPECIAL, /* movhps, movhpd */
+
+    [0x28] = SSE_SPECIAL, /* movaps, movapd */
+    [0x29] = SSE_SPECIAL, /* movaps, movapd */
+    [0x2a] = SSE_SPECIAL, /* cvtpi2ps, cvtpi2pd, cvtsi2ss, cvtsi2sd */
+    [0x2b] = SSE_SPECIAL, /* movntps, movntpd, movntss, movntsd */
+    [0x2c] = SSE_SPECIAL, /* cvttps2pi, cvttpd2pi, cvttsd2si, cvttss2si */
+    [0x2d] = SSE_SPECIAL, /* cvtps2pi, cvtpd2pi, cvtsd2si, cvtss2si */
+    [0x2e] = OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR | SSE_OPF_V0,
+            gen_helper_ucomiss, gen_helper_ucomisd, NULL, NULL),
+    [0x2f] = OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR | SSE_OPF_V0,
+            gen_helper_comiss, gen_helper_comisd, NULL, NULL),
+    [0x50] = SSE_SPECIAL, /* movmskps, movmskpd */
+    [0x51] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
+                gen_helper_sqrtps, gen_helper_sqrtpd,
+                gen_helper_sqrtss, gen_helper_sqrtsd),
+    [0x52] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
+                gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL),
+    [0x53] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
+                gen_helper_rcpps, NULL, gen_helper_rcpss, NULL),
+    [0x54] = SSE_OP(pand, pand, op2, 0), /* andps, andpd */
+    [0x55] = SSE_OP(pandn, pandn, op2, 0), /* andnps, andnpd */
+    [0x56] = SSE_OP(por, por, op2, 0), /* orps, orpd */
+    [0x57] = SSE_OP(pxor, pxor, op2, 0), /* xorps, xorpd */
     [0x58] = SSE_FOP(add),
     [0x59] = SSE_FOP(mul),
-    [0x5a] = { gen_helper_cvtps2pd, gen_helper_cvtpd2ps,
-               gen_helper_cvtss2sd, gen_helper_cvtsd2ss },
-    [0x5b] = { gen_helper_cvtdq2ps, gen_helper_cvtps2dq, gen_helper_cvttps2dq },
+    [0x5a] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
+                gen_helper_cvtps2pd, gen_helper_cvtpd2ps,
+                gen_helper_cvtss2sd, gen_helper_cvtsd2ss),
+    [0x5b] = OP(op1, SSE_OPF_V0,
+                gen_helper_cvtdq2ps, gen_helper_cvtps2dq,
+                gen_helper_cvttps2dq, NULL),
     [0x5c] = SSE_FOP(sub),
     [0x5d] = SSE_FOP(min),
     [0x5e] = SSE_FOP(div),
     [0x5f] = SSE_FOP(max),
 
-    [0xc2] = SSE_FOP(cmpeq),
-    [0xc6] = { (SSEFunc_0_epp)gen_helper_shufps,
-               (SSEFunc_0_epp)gen_helper_shufpd }, /* XXX: casts */
+    [0xc2] = SSE_FOP(cmpeq), /* sse_op_table4 */
+    [0xc6] = OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps,
+                (SSEFunc_0_epp)gen_helper_shufpd, NULL, NULL),
 
     /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX.  */
-    [0x38] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL },
-    [0x3a] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL },
+    [0x38] = SSE_SPECIAL,
+    [0x3a] = SSE_SPECIAL,
 
     /* MMX ops and their SSE extensions */
-    [0x60] = MMX_OP2(punpcklbw),
-    [0x61] = MMX_OP2(punpcklwd),
-    [0x62] = MMX_OP2(punpckldq),
-    [0x63] = MMX_OP2(packsswb),
-    [0x64] = MMX_OP2(pcmpgtb),
-    [0x65] = MMX_OP2(pcmpgtw),
-    [0x66] = MMX_OP2(pcmpgtl),
-    [0x67] = MMX_OP2(packuswb),
-    [0x68] = MMX_OP2(punpckhbw),
-    [0x69] = MMX_OP2(punpckhwd),
-    [0x6a] = MMX_OP2(punpckhdq),
-    [0x6b] = MMX_OP2(packssdw),
-    [0x6c] = { NULL, gen_helper_punpcklqdq_xmm },
-    [0x6d] = { NULL, gen_helper_punpckhqdq_xmm },
-    [0x6e] = { SSE_SPECIAL, SSE_SPECIAL }, /* movd mm, ea */
-    [0x6f] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movq, movdqa, , movqdu */
-    [0x70] = { (SSEFunc_0_epp)gen_helper_pshufw_mmx,
-               (SSEFunc_0_epp)gen_helper_pshufd_xmm,
-               (SSEFunc_0_epp)gen_helper_pshufhw_xmm,
-               (SSEFunc_0_epp)gen_helper_pshuflw_xmm }, /* XXX: casts */
-    [0x71] = { SSE_SPECIAL, SSE_SPECIAL }, /* shiftw */
-    [0x72] = { SSE_SPECIAL, SSE_SPECIAL }, /* shiftd */
-    [0x73] = { SSE_SPECIAL, SSE_SPECIAL }, /* shiftq */
-    [0x74] = MMX_OP2(pcmpeqb),
-    [0x75] = MMX_OP2(pcmpeqw),
-    [0x76] = MMX_OP2(pcmpeql),
-    [0x77] = { SSE_DUMMY }, /* emms */
-    [0x78] = { NULL, SSE_SPECIAL, NULL, SSE_SPECIAL }, /* extrq_i, insertq_i */
-    [0x79] = { NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r },
-    [0x7c] = { NULL, gen_helper_haddpd, NULL, gen_helper_haddps },
-    [0x7d] = { NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps },
-    [0x7e] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movd, movd, , movq */
-    [0x7f] = { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movq, movdqa, movdqu */
-    [0xc4] = { SSE_SPECIAL, SSE_SPECIAL }, /* pinsrw */
-    [0xc5] = { SSE_SPECIAL, SSE_SPECIAL }, /* pextrw */
-    [0xd0] = { NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps },
-    [0xd1] = MMX_OP2(psrlw),
-    [0xd2] = MMX_OP2(psrld),
-    [0xd3] = MMX_OP2(psrlq),
-    [0xd4] = MMX_OP2(paddq),
-    [0xd5] = MMX_OP2(pmullw),
-    [0xd6] = { NULL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL },
-    [0xd7] = { SSE_SPECIAL, SSE_SPECIAL }, /* pmovmskb */
-    [0xd8] = MMX_OP2(psubusb),
-    [0xd9] = MMX_OP2(psubusw),
-    [0xda] = MMX_OP2(pminub),
-    [0xdb] = MMX_OP2(pand),
-    [0xdc] = MMX_OP2(paddusb),
-    [0xdd] = MMX_OP2(paddusw),
-    [0xde] = MMX_OP2(pmaxub),
-    [0xdf] = MMX_OP2(pandn),
-    [0xe0] = MMX_OP2(pavgb),
-    [0xe1] = MMX_OP2(psraw),
-    [0xe2] = MMX_OP2(psrad),
-    [0xe3] = MMX_OP2(pavgw),
-    [0xe4] = MMX_OP2(pmulhuw),
-    [0xe5] = MMX_OP2(pmulhw),
-    [0xe6] = { NULL, gen_helper_cvttpd2dq, gen_helper_cvtdq2pd, gen_helper_cvtpd2dq },
-    [0xe7] = { SSE_SPECIAL , SSE_SPECIAL },  /* movntq, movntq */
-    [0xe8] = MMX_OP2(psubsb),
-    [0xe9] = MMX_OP2(psubsw),
-    [0xea] = MMX_OP2(pminsw),
-    [0xeb] = MMX_OP2(por),
-    [0xec] = MMX_OP2(paddsb),
-    [0xed] = MMX_OP2(paddsw),
-    [0xee] = MMX_OP2(pmaxsw),
-    [0xef] = MMX_OP2(pxor),
-    [0xf0] = { NULL, NULL, NULL, SSE_SPECIAL }, /* lddqu */
-    [0xf1] = MMX_OP2(psllw),
-    [0xf2] = MMX_OP2(pslld),
-    [0xf3] = MMX_OP2(psllq),
-    [0xf4] = MMX_OP2(pmuludq),
-    [0xf5] = MMX_OP2(pmaddwd),
-    [0xf6] = MMX_OP2(psadbw),
-    [0xf7] = { (SSEFunc_0_epp)gen_helper_maskmov_mmx,
-               (SSEFunc_0_epp)gen_helper_maskmov_xmm }, /* XXX: casts */
-    [0xf8] = MMX_OP2(psubb),
-    [0xf9] = MMX_OP2(psubw),
-    [0xfa] = MMX_OP2(psubl),
-    [0xfb] = MMX_OP2(psubq),
-    [0xfc] = MMX_OP2(paddb),
-    [0xfd] = MMX_OP2(paddw),
-    [0xfe] = MMX_OP2(paddl),
+    [0x60] = MMX_OP(punpcklbw),
+    [0x61] = MMX_OP(punpcklwd),
+    [0x62] = MMX_OP(punpckldq),
+    [0x63] = MMX_OP(packsswb),
+    [0x64] = MMX_OP(pcmpgtb),
+    [0x65] = MMX_OP(pcmpgtw),
+    [0x66] = MMX_OP(pcmpgtl),
+    [0x67] = MMX_OP(packuswb),
+    [0x68] = MMX_OP(punpckhbw),
+    [0x69] = MMX_OP(punpckhwd),
+    [0x6a] = MMX_OP(punpckhdq),
+    [0x6b] = MMX_OP(packssdw),
+    [0x6c] = OP(op2, SSE_OPF_MMX,
+                NULL, gen_helper_punpcklqdq_xmm, NULL, NULL),
+    [0x6d] = OP(op2, SSE_OPF_MMX,
+                NULL, gen_helper_punpckhqdq_xmm, NULL, NULL),
+    [0x6e] = SSE_SPECIAL, /* movd mm, ea */
+    [0x6f] = SSE_SPECIAL, /* movq, movdqa, movdqu */
+    [0x70] = OP(op1i, SSE_OPF_SHUF | SSE_OPF_MMX | SSE_OPF_V0,
+            (SSEFunc_0_epp)gen_helper_pshufw_mmx,
+            (SSEFunc_0_epp)gen_helper_pshufd_xmm,
+            (SSEFunc_0_epp)gen_helper_pshufhw_xmm,
+            (SSEFunc_0_epp)gen_helper_pshuflw_xmm),
+    [0x71] = SSE_SPECIAL, /* shiftw */
+    [0x72] = SSE_SPECIAL, /* shiftd */
+    [0x73] = SSE_SPECIAL, /* shiftq */
+    [0x74] = MMX_OP(pcmpeqb),
+    [0x75] = MMX_OP(pcmpeqw),
+    [0x76] = MMX_OP(pcmpeql),
+    [0x77] = SSE_SPECIAL, /* emms */
+    [0x78] = SSE_SPECIAL, /* extrq_i, insertq_i (sse4a) */
+    [0x79] = OP(op1, SSE_OPF_V0,
+            NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r),
+    [0x7c] = OP(op2, 0,
+                NULL, gen_helper_haddpd, NULL, gen_helper_haddps),
+    [0x7d] = OP(op2, 0,
+                NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps),
+    [0x7e] = SSE_SPECIAL, /* movd, movd, movq */
+    [0x7f] = SSE_SPECIAL, /* movq, movdqa, movdqu */
+    [0xc4] = SSE_SPECIAL, /* pinsrw */
+    [0xc5] = SSE_SPECIAL, /* pextrw */
+    [0xd0] = OP(op2, 0,
+                NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps),
+    [0xd1] = MMX_OP(psrlw),
+    [0xd2] = MMX_OP(psrld),
+    [0xd3] = MMX_OP(psrlq),
+    [0xd4] = MMX_OP(paddq),
+    [0xd5] = MMX_OP(pmullw),
+    [0xd6] = SSE_SPECIAL,
+    [0xd7] = SSE_SPECIAL, /* pmovmskb */
+    [0xd8] = MMX_OP(psubusb),
+    [0xd9] = MMX_OP(psubusw),
+    [0xda] = MMX_OP(pminub),
+    [0xdb] = MMX_OP(pand),
+    [0xdc] = MMX_OP(paddusb),
+    [0xdd] = MMX_OP(paddusw),
+    [0xde] = MMX_OP(pmaxub),
+    [0xdf] = MMX_OP(pandn),
+    [0xe0] = MMX_OP(pavgb),
+    [0xe1] = MMX_OP(psraw),
+    [0xe2] = MMX_OP(psrad),
+    [0xe3] = MMX_OP(pavgw),
+    [0xe4] = MMX_OP(pmulhuw),
+    [0xe5] = MMX_OP(pmulhw),
+    [0xe6] = OP(op1, SSE_OPF_V0,
+            NULL, gen_helper_cvttpd2dq,
+            gen_helper_cvtdq2pd, gen_helper_cvtpd2dq),
+    [0xe7] = SSE_SPECIAL,  /* movntq, movntdq */
+    [0xe8] = MMX_OP(psubsb),
+    [0xe9] = MMX_OP(psubsw),
+    [0xea] = MMX_OP(pminsw),
+    [0xeb] = MMX_OP(por),
+    [0xec] = MMX_OP(paddsb),
+    [0xed] = MMX_OP(paddsw),
+    [0xee] = MMX_OP(pmaxsw),
+    [0xef] = MMX_OP(pxor),
+    [0xf0] = SSE_SPECIAL, /* lddqu */
+    [0xf1] = MMX_OP(psllw),
+    [0xf2] = MMX_OP(pslld),
+    [0xf3] = MMX_OP(psllq),
+    [0xf4] = MMX_OP(pmuludq),
+    [0xf5] = MMX_OP(pmaddwd),
+    [0xf6] = MMX_OP(psadbw),
+    [0xf7] = OP(op1t, SSE_OPF_MMX | SSE_OPF_V0,
+                (SSEFunc_0_epp)gen_helper_maskmov_mmx,
+                (SSEFunc_0_epp)gen_helper_maskmov_xmm, NULL, NULL),
+    [0xf8] = MMX_OP(psubb),
+    [0xf9] = MMX_OP(psubw),
+    [0xfa] = MMX_OP(psubl),
+    [0xfb] = MMX_OP(psubq),
+    [0xfc] = MMX_OP(paddb),
+    [0xfd] = MMX_OP(paddw),
+    [0xfe] = MMX_OP(paddl),
 };
+#undef MMX_OP
+#undef OP
+#undef SSE_FOP
+#undef SSE_OP
+#undef SSE_SPECIAL
+
+#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm }
+#define SSE_SPECIAL_FN ((void *)1)
 
 static const SSEFunc_0_epp sse_op_table2[3 * 8][2] = {
     [0 + 2] = MMX_OP2(psrlw),
@@ -2970,6 +3020,8 @@ static const SSEFunc_l_ep sse_op_table3bq[] = {
 };
 #endif
 
+#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \
+                     gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, }
 static const SSEFunc_0_epp sse_op_table4[8][4] = {
     SSE_FOP(cmpeq),
     SSE_FOP(cmplt),
@@ -2980,6 +3032,7 @@ static const SSEFunc_0_epp sse_op_table4[8][4] = {
     SSE_FOP(cmpnle),
     SSE_FOP(cmpord),
 };
+#undef SSE_FOP
 
 static const SSEFunc_0_epp sse_op_table5[256] = {
     [0x0c] = gen_helper_pi2fw,
@@ -3021,7 +3074,7 @@ struct SSEOpHelper_eppi {
 #define SSSE3_OP(x) { MMX_OP2(x), CPUID_EXT_SSSE3 }
 #define SSE41_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE41 }
 #define SSE42_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE42 }
-#define SSE41_SPECIAL { { NULL, SSE_SPECIAL }, CPUID_EXT_SSE41 }
+#define SSE41_SPECIAL { { NULL, SSE_SPECIAL_FN }, CPUID_EXT_SSE41 }
 #define PCLMULQDQ_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, \
         CPUID_EXT_PCLMULQDQ }
 #define AESNI_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_AES }
@@ -3112,6 +3165,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 {
     int b1, op1_offset, op2_offset, is_xmm, val;
     int modrm, mod, rm, reg;
+    struct SSEOpHelper_table1 sse_op;
     SSEFunc_0_epp sse_fn_epp;
     SSEFunc_0_eppi sse_fn_eppi;
     SSEFunc_0_ppi sse_fn_ppi;
@@ -3127,8 +3181,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         b1 = 3;
     else
         b1 = 0;
-    sse_fn_epp = sse_op_table1[b][b1];
-    if (!sse_fn_epp) {
+    sse_op = sse_op_table1[b];
+    sse_fn_epp = sse_op.op[b1];
+    if ((sse_op.flags & (SSE_OPF_SPECIAL | SSE_OPF_3DNOW)) == 0
+            && !sse_fn_epp) {
         goto unknown_op;
     }
     if ((b <= 0x5f && b >= 0x10) || b == 0xc6 || b == 0xc2) {
@@ -3182,7 +3238,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         reg |= REX_R(s);
     }
     mod = (modrm >> 6) & 3;
-    if (sse_fn_epp == SSE_SPECIAL) {
+    if (sse_op.flags & SSE_OPF_SPECIAL) {
         b |= (b1 << 8);
         switch(b) {
         case 0x0e7: /* movntq */
@@ -3823,7 +3879,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     gen_ldq_env_A0(s, op2_offset);
                 }
             }
-            if (sse_fn_epp == SSE_SPECIAL) {
+            if (sse_fn_epp == SSE_SPECIAL_FN) {
                 goto unknown_op;
             }
 
@@ -4209,7 +4265,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 
             s->rip_offset = 1;
 
-            if (sse_fn_eppi == SSE_SPECIAL) {
+            if (sse_fn_eppi == SSE_SPECIAL_FN) {
                 ot = mo_64_32(s->dflag);
                 rm = (modrm & 7) | REX_B(s);
                 if (mod != 3)
-- 
2.36.0




* [PATCH v2 05/42] i386: Rework sse_op_table6/7
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (7 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 04/42] i386: Rework sse_op_table1 Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 06/42] i386: Add CHECK_NO_VEX Paul Brook
                   ` (36 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Add a flags field to each row in sse_op_table6 and sse_op_table7.

Initially this is only used as a replacement for the magic
SSE41_SPECIAL pointer.  The other flags will become relevant
as the rest of the AVX implementation is built out.
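
The order of the validity checks for a table6-style row can be sketched
as below.  This is a simplified model under stated assumptions: the
toy_/TOY_ names and stub helpers are hypothetical, the special-case
rows are omitted, and the feature bits are the CPUID.1:ECX positions
for SSSE3 (bit 9) and SSE4.1 (bit 19).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_OPF_BLENDV (1 << 2)
#define TOY_OPF_MMX    (1 << 5)

#define TOY_EXT_SSSE3  (1u << 9)
#define TOY_EXT_SSE41  (1u << 19)

typedef void (*ToyHelper)(void);

struct ToyRow6 {
    ToyHelper op[2];    /* [0] = MMX form, [1] = XMM form (66 prefix) */
    uint32_t ext_mask;  /* required CPUID feature; 0 marks an empty slot */
    int flags;
};

/* Stub helpers standing in for the generated gen_helper_* functions. */
static void toy_pshufb_mmx(void) {}
static void toy_pshufb_xmm(void) {}
static void toy_blendvps_xmm(void) {}
static void toy_pmuldq_xmm(void) {}

static const struct ToyRow6 toy_table6[256] = {
    [0x00] = { { toy_pshufb_mmx, toy_pshufb_xmm }, TOY_EXT_SSSE3, TOY_OPF_MMX },
    [0x14] = { { NULL, toy_blendvps_xmm }, TOY_EXT_SSE41, TOY_OPF_BLENDV },
    [0x28] = { { NULL, toy_pmuldq_xmm }, TOY_EXT_SSE41, TOY_OPF_MMX },
};

enum ToyResult { TOY_OK, TOY_UNKNOWN_OP, TOY_ILLEGAL_OP };

/* b: opcode byte, b1: 0 = no prefix, 1 = 66 prefix, cpuid_ext: guest features */
static enum ToyResult toy_check6(int b, int b1, uint32_t cpuid_ext)
{
    assert(b >= 0 && b < 256 && b1 >= 0 && b1 < 2);
    const struct ToyRow6 *row = &toy_table6[b];

    if (row->ext_mask == 0) {
        return TOY_UNKNOWN_OP;      /* nothing defined at this opcode */
    }
    if (!(cpuid_ext & row->ext_mask)) {
        return TOY_ILLEGAL_OP;      /* guest CPU lacks the extension */
    }
    if (b1 == 0 && !(row->flags & TOY_OPF_MMX)) {
        return TOY_UNKNOWN_OP;      /* unprefixed form needs the MMX flag */
    }
    if (!row->op[b1]) {
        return TOY_ILLEGAL_OP;      /* no helper for this register width */
    }
    return TOY_OK;
}
```

Keying the empty-slot test on ext_mask rather than on the helper
pointer is what lets a row exist with NULL helpers, which the
special-case entries in the real tables rely on.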

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/tcg/translate.c | 232 ++++++++++++++++++++----------------
 1 file changed, 132 insertions(+), 100 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 7fec582358..5335b86c01 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2977,7 +2977,6 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
 #undef SSE_SPECIAL
 
 #define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm }
-#define SSE_SPECIAL_FN ((void *)1)
 
 static const SSEFunc_0_epp sse_op_table2[3 * 8][2] = {
     [0 + 2] = MMX_OP2(psrlw),
@@ -3061,113 +3060,134 @@ static const SSEFunc_0_epp sse_op_table5[256] = {
     [0xbf] = gen_helper_pavgb_mmx /* pavgusb */
 };
 
-struct SSEOpHelper_epp {
+struct SSEOpHelper_table6 {
     SSEFunc_0_epp op[2];
     uint32_t ext_mask;
+    int flags;
 };
 
-struct SSEOpHelper_eppi {
+struct SSEOpHelper_table7 {
     SSEFunc_0_eppi op[2];
     uint32_t ext_mask;
+    int flags;
 };
 
-#define SSSE3_OP(x) { MMX_OP2(x), CPUID_EXT_SSSE3 }
-#define SSE41_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE41 }
-#define SSE42_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE42 }
-#define SSE41_SPECIAL { { NULL, SSE_SPECIAL_FN }, CPUID_EXT_SSE41 }
-#define PCLMULQDQ_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, \
-        CPUID_EXT_PCLMULQDQ }
-#define AESNI_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_AES }
-
-static const struct SSEOpHelper_epp sse_op_table6[256] = {
-    [0x00] = SSSE3_OP(pshufb),
-    [0x01] = SSSE3_OP(phaddw),
-    [0x02] = SSSE3_OP(phaddd),
-    [0x03] = SSSE3_OP(phaddsw),
-    [0x04] = SSSE3_OP(pmaddubsw),
-    [0x05] = SSSE3_OP(phsubw),
-    [0x06] = SSSE3_OP(phsubd),
-    [0x07] = SSSE3_OP(phsubsw),
-    [0x08] = SSSE3_OP(psignb),
-    [0x09] = SSSE3_OP(psignw),
-    [0x0a] = SSSE3_OP(psignd),
-    [0x0b] = SSSE3_OP(pmulhrsw),
-    [0x10] = SSE41_OP(pblendvb),
-    [0x14] = SSE41_OP(blendvps),
-    [0x15] = SSE41_OP(blendvpd),
-    [0x17] = SSE41_OP(ptest),
-    [0x1c] = SSSE3_OP(pabsb),
-    [0x1d] = SSSE3_OP(pabsw),
-    [0x1e] = SSSE3_OP(pabsd),
-    [0x20] = SSE41_OP(pmovsxbw),
-    [0x21] = SSE41_OP(pmovsxbd),
-    [0x22] = SSE41_OP(pmovsxbq),
-    [0x23] = SSE41_OP(pmovsxwd),
-    [0x24] = SSE41_OP(pmovsxwq),
-    [0x25] = SSE41_OP(pmovsxdq),
-    [0x28] = SSE41_OP(pmuldq),
-    [0x29] = SSE41_OP(pcmpeqq),
-    [0x2a] = SSE41_SPECIAL, /* movntqda */
-    [0x2b] = SSE41_OP(packusdw),
-    [0x30] = SSE41_OP(pmovzxbw),
-    [0x31] = SSE41_OP(pmovzxbd),
-    [0x32] = SSE41_OP(pmovzxbq),
-    [0x33] = SSE41_OP(pmovzxwd),
-    [0x34] = SSE41_OP(pmovzxwq),
-    [0x35] = SSE41_OP(pmovzxdq),
-    [0x37] = SSE42_OP(pcmpgtq),
-    [0x38] = SSE41_OP(pminsb),
-    [0x39] = SSE41_OP(pminsd),
-    [0x3a] = SSE41_OP(pminuw),
-    [0x3b] = SSE41_OP(pminud),
-    [0x3c] = SSE41_OP(pmaxsb),
-    [0x3d] = SSE41_OP(pmaxsd),
-    [0x3e] = SSE41_OP(pmaxuw),
-    [0x3f] = SSE41_OP(pmaxud),
-    [0x40] = SSE41_OP(pmulld),
-    [0x41] = SSE41_OP(phminposuw),
-    [0xdb] = AESNI_OP(aesimc),
-    [0xdc] = AESNI_OP(aesenc),
-    [0xdd] = AESNI_OP(aesenclast),
-    [0xde] = AESNI_OP(aesdec),
-    [0xdf] = AESNI_OP(aesdeclast),
+#define gen_helper_special_xmm NULL
+
+#define OP(name, op, flags, ext, mmx_name) \
+    {{mmx_name, gen_helper_ ## name ## _xmm}, CPUID_EXT_ ## ext, flags}
+#define BINARY_OP_MMX(name, ext) \
+    OP(name, op2, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx)
+#define BINARY_OP(name, ext, flags) \
+    OP(name, op2, flags, ext, NULL)
+#define UNARY_OP_MMX(name, ext) \
+    OP(name, op1, SSE_OPF_V0 | SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx)
+#define UNARY_OP(name, ext, flags) \
+    OP(name, op1, SSE_OPF_V0 | flags, ext, NULL)
+#define BLENDV_OP(name, ext, flags) OP(name, op3, SSE_OPF_BLENDV, ext, NULL)
+#define CMP_OP(name, ext) OP(name, op1, SSE_OPF_CMP | SSE_OPF_V0, ext, NULL)
+#define SPECIAL_OP(ext) OP(special, op1, SSE_OPF_SPECIAL, ext, NULL)
+
+/* prefix [66] 0f 38 */
+static const struct SSEOpHelper_table6 sse_op_table6[256] = {
+    [0x00] = BINARY_OP_MMX(pshufb, SSSE3),
+    [0x01] = BINARY_OP_MMX(phaddw, SSSE3),
+    [0x02] = BINARY_OP_MMX(phaddd, SSSE3),
+    [0x03] = BINARY_OP_MMX(phaddsw, SSSE3),
+    [0x04] = BINARY_OP_MMX(pmaddubsw, SSSE3),
+    [0x05] = BINARY_OP_MMX(phsubw, SSSE3),
+    [0x06] = BINARY_OP_MMX(phsubd, SSSE3),
+    [0x07] = BINARY_OP_MMX(phsubsw, SSSE3),
+    [0x08] = BINARY_OP_MMX(psignb, SSSE3),
+    [0x09] = BINARY_OP_MMX(psignw, SSSE3),
+    [0x0a] = BINARY_OP_MMX(psignd, SSSE3),
+    [0x0b] = BINARY_OP_MMX(pmulhrsw, SSSE3),
+    [0x10] = BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX),
+    [0x14] = BLENDV_OP(blendvps, SSE41, 0),
+    [0x15] = BLENDV_OP(blendvpd, SSE41, 0),
+    [0x17] = CMP_OP(ptest, SSE41),
+    [0x1c] = UNARY_OP_MMX(pabsb, SSSE3),
+    [0x1d] = UNARY_OP_MMX(pabsw, SSSE3),
+    [0x1e] = UNARY_OP_MMX(pabsd, SSSE3),
+    [0x20] = UNARY_OP(pmovsxbw, SSE41, SSE_OPF_MMX),
+    [0x21] = UNARY_OP(pmovsxbd, SSE41, SSE_OPF_MMX),
+    [0x22] = UNARY_OP(pmovsxbq, SSE41, SSE_OPF_MMX),
+    [0x23] = UNARY_OP(pmovsxwd, SSE41, SSE_OPF_MMX),
+    [0x24] = UNARY_OP(pmovsxwq, SSE41, SSE_OPF_MMX),
+    [0x25] = UNARY_OP(pmovsxdq, SSE41, SSE_OPF_MMX),
+    [0x28] = BINARY_OP(pmuldq, SSE41, SSE_OPF_MMX),
+    [0x29] = BINARY_OP(pcmpeqq, SSE41, SSE_OPF_MMX),
+    [0x2a] = SPECIAL_OP(SSE41), /* movntdqa */
+    [0x2b] = BINARY_OP(packusdw, SSE41, SSE_OPF_MMX),
+    [0x30] = UNARY_OP(pmovzxbw, SSE41, SSE_OPF_MMX),
+    [0x31] = UNARY_OP(pmovzxbd, SSE41, SSE_OPF_MMX),
+    [0x32] = UNARY_OP(pmovzxbq, SSE41, SSE_OPF_MMX),
+    [0x33] = UNARY_OP(pmovzxwd, SSE41, SSE_OPF_MMX),
+    [0x34] = UNARY_OP(pmovzxwq, SSE41, SSE_OPF_MMX),
+    [0x35] = UNARY_OP(pmovzxdq, SSE41, SSE_OPF_MMX),
+    [0x37] = BINARY_OP(pcmpgtq, SSE41, SSE_OPF_MMX),
+    [0x38] = BINARY_OP(pminsb, SSE41, SSE_OPF_MMX),
+    [0x39] = BINARY_OP(pminsd, SSE41, SSE_OPF_MMX),
+    [0x3a] = BINARY_OP(pminuw, SSE41, SSE_OPF_MMX),
+    [0x3b] = BINARY_OP(pminud, SSE41, SSE_OPF_MMX),
+    [0x3c] = BINARY_OP(pmaxsb, SSE41, SSE_OPF_MMX),
+    [0x3d] = BINARY_OP(pmaxsd, SSE41, SSE_OPF_MMX),
+    [0x3e] = BINARY_OP(pmaxuw, SSE41, SSE_OPF_MMX),
+    [0x3f] = BINARY_OP(pmaxud, SSE41, SSE_OPF_MMX),
+    [0x40] = BINARY_OP(pmulld, SSE41, SSE_OPF_MMX),
+    [0x41] = UNARY_OP(phminposuw, SSE41, 0),
+    [0xdb] = UNARY_OP(aesimc, AES, 0),
+    [0xdc] = BINARY_OP(aesenc, AES, 0),
+    [0xdd] = BINARY_OP(aesenclast, AES, 0),
+    [0xde] = BINARY_OP(aesdec, AES, 0),
+    [0xdf] = BINARY_OP(aesdeclast, AES, 0),
 };
 
-static const struct SSEOpHelper_eppi sse_op_table7[256] = {
-    [0x08] = SSE41_OP(roundps),
-    [0x09] = SSE41_OP(roundpd),
-    [0x0a] = SSE41_OP(roundss),
-    [0x0b] = SSE41_OP(roundsd),
-    [0x0c] = SSE41_OP(blendps),
-    [0x0d] = SSE41_OP(blendpd),
-    [0x0e] = SSE41_OP(pblendw),
-    [0x0f] = SSSE3_OP(palignr),
-    [0x14] = SSE41_SPECIAL, /* pextrb */
-    [0x15] = SSE41_SPECIAL, /* pextrw */
-    [0x16] = SSE41_SPECIAL, /* pextrd/pextrq */
-    [0x17] = SSE41_SPECIAL, /* extractps */
-    [0x20] = SSE41_SPECIAL, /* pinsrb */
-    [0x21] = SSE41_SPECIAL, /* insertps */
-    [0x22] = SSE41_SPECIAL, /* pinsrd/pinsrq */
-    [0x40] = SSE41_OP(dpps),
-    [0x41] = SSE41_OP(dppd),
-    [0x42] = SSE41_OP(mpsadbw),
-    [0x44] = PCLMULQDQ_OP(pclmulqdq),
-    [0x60] = SSE42_OP(pcmpestrm),
-    [0x61] = SSE42_OP(pcmpestri),
-    [0x62] = SSE42_OP(pcmpistrm),
-    [0x63] = SSE42_OP(pcmpistri),
-    [0xdf] = AESNI_OP(aeskeygenassist),
+/* prefix [66] 0f 3a */
+static const struct SSEOpHelper_table7 sse_op_table7[256] = {
+    [0x08] = UNARY_OP(roundps, SSE41, 0),
+    [0x09] = UNARY_OP(roundpd, SSE41, 0),
+    [0x0a] = UNARY_OP(roundss, SSE41, SSE_OPF_SCALAR),
+    [0x0b] = UNARY_OP(roundsd, SSE41, SSE_OPF_SCALAR),
+    [0x0c] = BINARY_OP(blendps, SSE41, 0),
+    [0x0d] = BINARY_OP(blendpd, SSE41, 0),
+    [0x0e] = BINARY_OP(pblendw, SSE41, SSE_OPF_MMX),
+    [0x0f] = BINARY_OP_MMX(palignr, SSSE3),
+    [0x14] = SPECIAL_OP(SSE41), /* pextrb */
+    [0x15] = SPECIAL_OP(SSE41), /* pextrw */
+    [0x16] = SPECIAL_OP(SSE41), /* pextrd/pextrq */
+    [0x17] = SPECIAL_OP(SSE41), /* extractps */
+    [0x20] = SPECIAL_OP(SSE41), /* pinsrb */
+    [0x21] = SPECIAL_OP(SSE41), /* insertps */
+    [0x22] = SPECIAL_OP(SSE41), /* pinsrd/pinsrq */
+    [0x40] = BINARY_OP(dpps, SSE41, 0),
+    [0x41] = BINARY_OP(dppd, SSE41, 0),
+    [0x42] = BINARY_OP(mpsadbw, SSE41, SSE_OPF_MMX),
+    [0x44] = BINARY_OP(pclmulqdq, PCLMULQDQ, 0),
+    [0x60] = CMP_OP(pcmpestrm, SSE42),
+    [0x61] = CMP_OP(pcmpestri, SSE42),
+    [0x62] = CMP_OP(pcmpistrm, SSE42),
+    [0x63] = CMP_OP(pcmpistri, SSE42),
+    [0xdf] = UNARY_OP(aeskeygenassist, AES, 0),
 };
 
+#undef OP
+#undef BINARY_OP_MMX
+#undef BINARY_OP
+#undef UNARY_OP_MMX
+#undef UNARY_OP
+#undef BLENDV_OP
+#undef SPECIAL_OP
+
 static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     target_ulong pc_start)
 {
     int b1, op1_offset, op2_offset, is_xmm, val;
     int modrm, mod, rm, reg;
     struct SSEOpHelper_table1 sse_op;
+    struct SSEOpHelper_table6 op6;
+    struct SSEOpHelper_table7 op7;
     SSEFunc_0_epp sse_fn_epp;
-    SSEFunc_0_eppi sse_fn_eppi;
     SSEFunc_0_ppi sse_fn_ppi;
     SSEFunc_0_eppt sse_fn_eppt;
     MemOp ot;
@@ -3828,12 +3848,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             mod = (modrm >> 6) & 3;
 
             assert(b1 < 2);
-            sse_fn_epp = sse_op_table6[b].op[b1];
-            if (!sse_fn_epp) {
+            op6 = sse_op_table6[b];
+            if (op6.ext_mask == 0) {
                 goto unknown_op;
             }
-            if (!(s->cpuid_ext_features & sse_op_table6[b].ext_mask))
+            if (!(s->cpuid_ext_features & op6.ext_mask)) {
                 goto illegal_op;
+            }
 
             if (b1) {
                 op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
@@ -3870,6 +3891,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                 }
             } else {
+                if ((op6.flags & SSE_OPF_MMX) == 0) {
+                    goto unknown_op;
+                }
                 op1_offset = offsetof(CPUX86State,fpregs[reg].mmx);
                 if (mod == 3) {
                     op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
@@ -3879,13 +3903,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     gen_ldq_env_A0(s, op2_offset);
                 }
             }
-            if (sse_fn_epp == SSE_SPECIAL_FN) {
-                goto unknown_op;
+            if (!op6.op[b1]) {
+                goto illegal_op;
             }
 
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
+            op6.op[b1](cpu_env, s->ptr0, s->ptr1);
 
             if (b == 0x17) {
                 set_cc_op(s, CC_OP_EFLAGS);
@@ -4256,16 +4280,21 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             mod = (modrm >> 6) & 3;
 
             assert(b1 < 2);
-            sse_fn_eppi = sse_op_table7[b].op[b1];
-            if (!sse_fn_eppi) {
+            op7 = sse_op_table7[b];
+            if (op7.ext_mask == 0) {
                 goto unknown_op;
             }
-            if (!(s->cpuid_ext_features & sse_op_table7[b].ext_mask))
+            if (!(s->cpuid_ext_features & op7.ext_mask)) {
                 goto illegal_op;
+            }
 
             s->rip_offset = 1;
 
-            if (sse_fn_eppi == SSE_SPECIAL_FN) {
+            if (op7.flags & SSE_OPF_SPECIAL) {
+                /* None of the "special" ops are valid on mmx registers */
+                if (b1 == 0) {
+                    goto illegal_op;
+                }
                 ot = mo_64_32(s->dflag);
                 rm = (modrm & 7) | REX_B(s);
                 if (mod != 3)
@@ -4410,6 +4439,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     gen_ldo_env_A0(s, op2_offset);
                 }
             } else {
+                if ((op7.flags & SSE_OPF_MMX) == 0) {
+                    goto illegal_op;
+                }
                 op1_offset = offsetof(CPUX86State,fpregs[reg].mmx);
                 if (mod == 3) {
                     op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
@@ -4432,7 +4464,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            sse_fn_eppi(cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val));
+            op7.op[b1](cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val));
             break;
 
         case 0x33a:
-- 
2.36.0




* [PATCH v2 06/42] i386: Add CHECK_NO_VEX
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (8 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 05/42] i386: Rework sse_op_table6/7 Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-25 20:39   ` Richard Henderson
  2022-04-25 20:41   ` Richard Henderson
  2022-04-24 22:01 ` [PATCH v2 07/42] Enforce VEX encoding restrictions Paul Brook
                   ` (35 subsequent siblings)
  45 siblings, 2 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Reject invalid VEX encodings on MMX instructions.

Signed-off-by: Paul Brook <paul@nowt.org>
---
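For readers without the QEMU tree to hand, here is a minimal standalone
sketch of the guard pattern added below. DemoContext, DEMO_PREFIX_VEX and
demo_decode_mmx() are hypothetical stand-ins rather than QEMU names; only the
shape of the macro (a do/while(0) wrapper that jumps to an illegal_op label in
the enclosing function) follows the patch. The sketch also parenthesises the
macro argument, which the patch itself does not do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's PREFIX_VEX bit; value chosen for the demo. */
#define DEMO_PREFIX_VEX 0x20u

/* Hypothetical stand-in for the part of DisasContext the check reads. */
typedef struct {
    uint32_t prefix;
} DemoContext;

/* VEX prefix not allowed: jump to the caller's illegal_op label. */
#define DEMO_CHECK_NO_VEX(s) do {               \
        if ((s)->prefix & DEMO_PREFIX_VEX) {    \
            goto illegal_op;                    \
        }                                       \
    } while (0)

/* Returns true if the (pretend) MMX instruction is accepted. */
bool demo_decode_mmx(const DemoContext *s)
{
    assert(s);
    DEMO_CHECK_NO_VEX(s);
    /* ... translation of the legacy MMX form would go here ... */
    return true;

illegal_op:
    return false;
}
```

Because the macro expands to a goto, it can only be used inside a function
that defines an illegal_op label, as gen_sse does.
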
 target/i386/tcg/translate.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 5335b86c01..66ba690b7d 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3179,6 +3179,12 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
 #undef BLENDV_OP
 #undef SPECIAL_OP
 
+/* VEX prefix not allowed */
+#define CHECK_NO_VEX(s) do { \
+    if (s->prefix & PREFIX_VEX) \
+        goto illegal_op; \
+    } while (0)
+
 static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     target_ulong pc_start)
 {
@@ -3262,6 +3268,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         b |= (b1 << 8);
         switch(b) {
         case 0x0e7: /* movntq */
+            CHECK_NO_VEX(s);
             if (mod == 3) {
                 goto illegal_op;
             }
@@ -3297,6 +3304,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             break;
         case 0x6e: /* movd mm, ea */
+            CHECK_NO_VEX(s);
 #ifdef TARGET_X86_64
             if (s->dflag == MO_64) {
                 gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0);
@@ -3330,6 +3338,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             break;
         case 0x6f: /* movq mm, ea */
+            CHECK_NO_VEX(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, offsetof(CPUX86State, fpregs[reg].mmx));
@@ -3464,6 +3473,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x178:
         case 0x378:
+            CHECK_NO_VEX(s);
             {
                 int bit_index, field_length;
 
@@ -3484,6 +3494,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             break;
         case 0x7e: /* movd ea, mm */
+            CHECK_NO_VEX(s);
 #ifdef TARGET_X86_64
             if (s->dflag == MO_64) {
                 tcg_gen_ld_i64(s->T0, cpu_env,
@@ -3524,6 +3535,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)));
             break;
         case 0x7f: /* movq ea, mm */
+            CHECK_NO_VEX(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_stq_env_A0(s, offsetof(CPUX86State, fpregs[reg].mmx));
@@ -3607,6 +3619,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                                 offsetof(CPUX86State, xmm_t0.ZMM_L(1)));
                 op1_offset = offsetof(CPUX86State,xmm_t0);
             } else {
+                CHECK_NO_VEX(s);
                 tcg_gen_movi_tl(s->T0, val);
                 tcg_gen_st32_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State, mmx_t0.MMX_L(0)));
@@ -3648,6 +3661,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x02a: /* cvtpi2ps */
         case 0x12a: /* cvtpi2pd */
+            CHECK_NO_VEX(s);
             gen_helper_enter_mmx(cpu_env);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
@@ -3693,6 +3707,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x12c: /* cvttpd2pi */
         case 0x02d: /* cvtps2pi */
         case 0x12d: /* cvtpd2pi */
+            CHECK_NO_VEX(s);
             gen_helper_enter_mmx(cpu_env);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
@@ -3766,6 +3781,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 tcg_gen_st16_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State,xmm_regs[reg].ZMM_W(val)));
             } else {
+                CHECK_NO_VEX(s);
                 val &= 3;
                 tcg_gen_st16_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State,fpregs[reg].mmx.MMX_W(val)));
@@ -3805,6 +3821,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             break;
         case 0x2d6: /* movq2dq */
+            CHECK_NO_VEX(s);
             gen_helper_enter_mmx(cpu_env);
             rm = (modrm & 7);
             gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
@@ -3812,6 +3829,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)));
             break;
         case 0x3d6: /* movdq2q */
+            CHECK_NO_VEX(s);
             gen_helper_enter_mmx(cpu_env);
             rm = (modrm & 7) | REX_B(s);
             gen_op_movq(s, offsetof(CPUX86State, fpregs[reg & 7].mmx),
@@ -3827,6 +3845,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                                  offsetof(CPUX86State, xmm_regs[rm]));
                 gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0);
             } else {
+                CHECK_NO_VEX(s);
                 rm = (modrm & 7);
                 tcg_gen_addi_ptr(s->ptr0, cpu_env,
                                  offsetof(CPUX86State, fpregs[rm].mmx));
@@ -3891,6 +3910,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                 }
             } else {
+                CHECK_NO_VEX(s);
                 if ((op6.flags & SSE_OPF_MMX) == 0) {
                     goto unknown_op;
                 }
@@ -3928,6 +3948,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             case 0x3f0: /* crc32 Gd,Eb */
             case 0x3f1: /* crc32 Gd,Ey */
             do_crc32:
+                CHECK_NO_VEX(s);
                 if (!(s->cpuid_ext_features & CPUID_EXT_SSE42)) {
                     goto illegal_op;
                 }
@@ -3950,6 +3971,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 
             case 0x1f0: /* crc32 or movbe */
             case 0x1f1:
+                CHECK_NO_VEX(s);
                 /* For these insns, the f3 prefix is supposed to have priority
                    over the 66 prefix, but that's not what we implement above
                    setting b1.  */
@@ -3959,6 +3981,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 /* FALLTHRU */
             case 0x0f0: /* movbe Gy,My */
             case 0x0f1: /* movbe My,Gy */
+                CHECK_NO_VEX(s);
                 if (!(s->cpuid_ext_features & CPUID_EXT_MOVBE)) {
                     goto illegal_op;
                 }
@@ -4125,6 +4148,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 
             case 0x1f6: /* adcx Gy, Ey */
             case 0x2f6: /* adox Gy, Ey */
+                CHECK_NO_VEX(s);
                 if (!(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_ADX)) {
                     goto illegal_op;
                 } else {
@@ -4439,6 +4463,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     gen_ldo_env_A0(s, op2_offset);
                 }
             } else {
+                CHECK_NO_VEX(s);
                 if ((op7.flags & SSE_OPF_MMX) == 0) {
                     goto illegal_op;
                 }
@@ -4565,6 +4590,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
             }
         } else {
+            CHECK_NO_VEX(s);
             op1_offset = offsetof(CPUX86State,fpregs[reg].mmx);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-- 
2.36.0




* [PATCH v2 07/42] Enforce VEX encoding restrictions
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (9 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 06/42] i386: Add CHECK_NO_VEX Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-25 20:42   ` Richard Henderson
                     ` (2 more replies)
  2022-04-24 22:01 ` [PATCH v2 08/42] i386: Add ZMM_OFFSET macro Paul Brook
                   ` (34 subsequent siblings)
  45 siblings, 3 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Add CHECK_AVX* macros and use them to validate VEX-encoded AVX instructions.

All AVX instructions require both CPU and OS support; this is encapsulated
by HF_AVX_EN.

Some also require specific values in the VEX.L and VEX.V fields.
Some (mostly integer operations) also require AVX2.

Signed-off-by: Paul Brook <paul@nowt.org>
---
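As a rough illustration of the rules described above, the checks can be
modelled as plain predicates. This is a sketch and not the patch's code: the
patch uses goto-based macros, whereas the functions below return a bool.
DemoCtx and every demo_* name are hypothetical. The sketch assumes, as the
macro comments in the diff imply, that vex_v holds the decoded field, so that
0 corresponds to an encoded V=1111b.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for QEMU's PREFIX_VEX bit. */
#define DEMO_PREFIX_VEX 0x20u

/* Hypothetical model of the decoder state the checks consult. */
typedef struct {
    unsigned prefix;
    unsigned vex_l;   /* 0 = 128-bit operation, 1 = 256-bit operation */
    unsigned vex_v;   /* assumed decoded form: 0 means V=1111b */
    bool avx_en;      /* models HF_AVX_EN_MASK being set in hflags */
    bool has_avx2;    /* models CPUID_7_0_EBX_AVX2 */
} DemoCtx;

static bool demo_is_vex(const DemoCtx *s)
{
    assert(s);
    return (s->prefix & DEMO_PREFIX_VEX) != 0;
}

/* VEX encodings need AVX enabled; legacy SSE encodings always pass. */
bool demo_check_avx(const DemoCtx *s)
{
    return !demo_is_vex(s) || s->avx_en;
}

/* As above, and a VEX encoding must leave the V field unused. */
bool demo_check_avx_v0(const DemoCtx *s)
{
    return demo_check_avx(s) && !(demo_is_vex(s) && s->vex_v != 0);
}

/* As demo_check_avx, and a VEX encoding must have L=0. */
bool demo_check_avx_128(const DemoCtx *s)
{
    return demo_check_avx(s) && !(demo_is_vex(s) && s->vex_l != 0);
}

/* Both restrictions at once. */
bool demo_check_avx_v0_128(const DemoCtx *s)
{
    return demo_check_avx_v0(s) && demo_check_avx_128(s);
}

/* 256-bit variants need AVX2; like the patch, this looks only at vex_l. */
bool demo_check_avx2_256(const DemoCtx *s)
{
    assert(s);
    return !(s->vex_l != 0 && !s->has_avx2);
}

/* Instruction exists only as a VEX encoding and only with AVX2. */
bool demo_check_avx2(const DemoCtx *s)
{
    return demo_is_vex(s) && s->has_avx2;
}
```

In this model a legacy SSE encoding passes demo_check_avx() even with AVX
disabled, which matches the "allow legacy SSE encodings" comment on CHECK_AVX.
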
 target/i386/tcg/translate.c | 159 +++++++++++++++++++++++++++++++++---
 1 file changed, 149 insertions(+), 10 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 66ba690b7d..2f5cc24e0c 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3185,10 +3185,54 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
         goto illegal_op; \
     } while (0)
 
+/*
+ * VEX encodings require AVX
+ * Allow legacy SSE encodings even if AVX not enabled
+ */
+#define CHECK_AVX(s) do { \
+    if ((s->prefix & PREFIX_VEX) \
+        && !(env->hflags & HF_AVX_EN_MASK)) \
+        goto illegal_op; \
+    } while (0)
+
+/* If a VEX prefix is used then it must have V=1111b */
+#define CHECK_AVX_V0(s) do { \
+    CHECK_AVX(s); \
+    if ((s->prefix & PREFIX_VEX) && (s->vex_v != 0)) \
+        goto illegal_op; \
+    } while (0)
+
+/* If a VEX prefix is used then it must have L=0 */
+#define CHECK_AVX_128(s) do { \
+    CHECK_AVX(s); \
+    if ((s->prefix & PREFIX_VEX) && (s->vex_l != 0)) \
+        goto illegal_op; \
+    } while (0)
+
+/* If a VEX prefix is used then it must have V=1111b and L=0 */
+#define CHECK_AVX_V0_128(s) do { \
+    CHECK_AVX(s); \
+    if ((s->prefix & PREFIX_VEX) && (s->vex_v != 0 || s->vex_l != 0)) \
+        goto illegal_op; \
+    } while (0)
+
+/* 256-bit (ymm) variants require AVX2 */
+#define CHECK_AVX2_256(s) do { \
+    if (s->vex_l && !(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2)) \
+        goto illegal_op; \
+    } while (0)
+
+/* Requires AVX2 and VEX encoding */
+#define CHECK_AVX2(s) do { \
+    if ((s->prefix & PREFIX_VEX) == 0 \
+            || !(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2)) \
+        goto illegal_op; \
+    } while (0)
+
 static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     target_ulong pc_start)
 {
-    int b1, op1_offset, op2_offset, is_xmm, val;
+    int b1, op1_offset, op2_offset, is_xmm, val, scalar_op;
     int modrm, mod, rm, reg;
     struct SSEOpHelper_table1 sse_op;
     struct SSEOpHelper_table6 op6;
@@ -3228,15 +3272,18 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
         return;
     }
-    if (s->flags & HF_EM_MASK) {
-    illegal_op:
-        gen_illegal_opcode(s);
-        return;
-    }
-    if (is_xmm
-        && !(s->flags & HF_OSFXSR_MASK)
-        && (b != 0x38 && b != 0x3a)) {
-        goto unknown_op;
+    /* VEX encoded instructions ignore EM bit. See also CHECK_AVX */
+    if (!(s->prefix & PREFIX_VEX)) {
+        if (s->flags & HF_EM_MASK) {
+        illegal_op:
+            gen_illegal_opcode(s);
+            return;
+        }
+        if (is_xmm
+            && !(s->flags & HF_OSFXSR_MASK)
+            && (b != 0x38 && b != 0x3a)) {
+            goto unknown_op;
+        }
     }
     if (b == 0x0e) {
         if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) {
@@ -3278,12 +3325,14 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x1e7: /* movntdq */
         case 0x02b: /* movntps */
         case 0x12b: /* movntps */
+            CHECK_AVX_V0(s);
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
             gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
             break;
         case 0x3f0: /* lddqu */
+            CHECK_AVX_V0(s);
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
@@ -3291,6 +3340,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x22b: /* movntss */
         case 0x32b: /* movntsd */
+            CHECK_AVX_V0_128(s);
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
@@ -3321,6 +3371,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             break;
         case 0x16e: /* movd xmm, ea */
+            CHECK_AVX_V0_128(s);
 #ifdef TARGET_X86_64
             if (s->dflag == MO_64) {
                 gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0);
@@ -3356,6 +3407,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x128: /* movapd */
         case 0x16f: /* movdqa xmm, ea */
         case 0x26f: /* movdqu xmm, ea */
+            CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
@@ -3367,6 +3419,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x210: /* movss xmm, ea */
             if (mod != 3) {
+                CHECK_AVX_V0_128(s);
                 gen_lea_modrm(env, s, modrm);
                 gen_op_ld_v(s, MO_32, s->T0, s->A0);
                 tcg_gen_st32_tl(s->T0, cpu_env,
@@ -3379,6 +3432,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 tcg_gen_st32_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)));
             } else {
+                CHECK_AVX_128(s);
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)),
                             offsetof(CPUX86State,xmm_regs[rm].ZMM_L(0)));
@@ -3386,6 +3440,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x310: /* movsd xmm, ea */
             if (mod != 3) {
+                CHECK_AVX_V0_128(s);
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, offsetof(CPUX86State,
                                            xmm_regs[reg].ZMM_Q(0)));
@@ -3395,6 +3450,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 tcg_gen_st32_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)));
             } else {
+                CHECK_AVX_128(s);
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
                             offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0)));
@@ -3402,6 +3458,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x012: /* movlps */
         case 0x112: /* movlpd */
+            CHECK_AVX_128(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, offsetof(CPUX86State,
@@ -3414,6 +3471,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             break;
         case 0x212: /* movsldup */
+            CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
@@ -3430,6 +3488,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                         offsetof(CPUX86State,xmm_regs[reg].ZMM_L(2)));
             break;
         case 0x312: /* movddup */
+            CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, offsetof(CPUX86State,
@@ -3444,6 +3503,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x016: /* movhps */
         case 0x116: /* movhpd */
+            CHECK_AVX_128(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, offsetof(CPUX86State,
@@ -3456,6 +3516,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             break;
         case 0x216: /* movshdup */
+            CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
@@ -3509,6 +3570,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             break;
         case 0x17e: /* movd ea, xmm */
+            CHECK_AVX_V0_128(s);
 #ifdef TARGET_X86_64
             if (s->dflag == MO_64) {
                 tcg_gen_ld_i64(s->T0, cpu_env,
@@ -3523,6 +3585,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             break;
         case 0x27e: /* movq xmm, ea */
+            CHECK_AVX_V0_128(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, offsetof(CPUX86State,
@@ -3551,6 +3614,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x129: /* movapd */
         case 0x17f: /* movdqa ea, xmm */
         case 0x27f: /* movdqu ea, xmm */
+            CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
@@ -3562,11 +3626,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x211: /* movss ea, xmm */
             if (mod != 3) {
+                CHECK_AVX_V0_128(s);
                 gen_lea_modrm(env, s, modrm);
                 tcg_gen_ld32u_tl(s->T0, cpu_env,
                                  offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)));
                 gen_op_st_v(s, MO_32, s->T0, s->A0);
             } else {
+                CHECK_AVX_128(s);
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)),
                             offsetof(CPUX86State,xmm_regs[reg].ZMM_L(0)));
@@ -3574,10 +3640,12 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x311: /* movsd ea, xmm */
             if (mod != 3) {
+                CHECK_AVX_V0_128(s);
                 gen_lea_modrm(env, s, modrm);
                 gen_stq_env_A0(s, offsetof(CPUX86State,
                                            xmm_regs[reg].ZMM_Q(0)));
             } else {
+                CHECK_AVX_128(s);
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)),
                             offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0)));
@@ -3585,6 +3653,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x013: /* movlps */
         case 0x113: /* movlpd */
+            CHECK_AVX_V0_128(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_stq_env_A0(s, offsetof(CPUX86State,
@@ -3595,6 +3664,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x017: /* movhps */
         case 0x117: /* movhpd */
+            CHECK_AVX_V0_128(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_stq_env_A0(s, offsetof(CPUX86State,
@@ -3611,6 +3681,8 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x173:
             val = x86_ldub_code(env, s);
             if (is_xmm) {
+                CHECK_AVX(s);
+                CHECK_AVX2_256(s);
                 tcg_gen_movi_tl(s->T0, val);
                 tcg_gen_st32_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State, xmm_t0.ZMM_L(0)));
@@ -3646,6 +3718,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
             break;
         case 0x050: /* movmskps */
+            CHECK_AVX_V0(s);
             rm = (modrm & 7) | REX_B(s);
             tcg_gen_addi_ptr(s->ptr0, cpu_env,
                              offsetof(CPUX86State,xmm_regs[rm]));
@@ -3653,6 +3726,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
         case 0x150: /* movmskpd */
+            CHECK_AVX_V0(s);
             rm = (modrm & 7) | REX_B(s);
             tcg_gen_addi_ptr(s->ptr0, cpu_env,
                              offsetof(CPUX86State,xmm_regs[rm]));
@@ -3686,6 +3760,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x22a: /* cvtsi2ss */
         case 0x32a: /* cvtsi2sd */
+            CHECK_AVX(s);
             ot = mo_64_32(s->dflag);
             gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
             op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
@@ -3739,6 +3814,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x32c: /* cvttsd2si */
         case 0x22d: /* cvtss2si */
         case 0x32d: /* cvtsd2si */
+            CHECK_AVX_V0(s);
             ot = mo_64_32(s->dflag);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
@@ -3773,6 +3849,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0xc4: /* pinsrw */
         case 0x1c4:
+            CHECK_AVX_128(s);
             s->rip_offset = 1;
             gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
             val = x86_ldub_code(env, s);
@@ -3789,6 +3866,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0xc5: /* pextrw */
         case 0x1c5:
+            CHECK_AVX_V0_128(s);
             if (mod != 3)
                 goto illegal_op;
             ot = mo_64_32(s->dflag);
@@ -3808,6 +3886,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             gen_op_mov_reg_v(s, ot, reg, s->T0);
             break;
         case 0x1d6: /* movq ea, xmm */
+            CHECK_AVX_V0_128(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
                 gen_stq_env_A0(s, offsetof(CPUX86State,
@@ -3840,6 +3919,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             if (mod != 3)
                 goto illegal_op;
             if (b1) {
+                CHECK_AVX_V0(s);
                 rm = (modrm & 7) | REX_B(s);
                 tcg_gen_addi_ptr(s->ptr0, cpu_env,
                                  offsetof(CPUX86State, xmm_regs[rm]));
@@ -3875,8 +3955,33 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 goto illegal_op;
             }
 
+            if (op6.ext_mask == CPUID_EXT_AVX
+                    && (s->prefix & PREFIX_VEX) == 0) {
+                goto illegal_op;
+            }
+            if (op6.flags & SSE_OPF_AVX2) {
+                CHECK_AVX2(s);
+            }
+
             if (b1) {
+                if (op6.flags & SSE_OPF_V0) {
+                    CHECK_AVX_V0(s);
+                } else {
+                    CHECK_AVX(s);
+                }
                 op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+
+                if (op6.flags & SSE_OPF_MMX) {
+                    CHECK_AVX2_256(s);
+                }
+                if (op6.flags & SSE_OPF_BLENDV) {
+                    /*
+                     * VEX encodings of the blendv opcodes are not valid;
+                     * they use a different opcode with an 0f 3a prefix.
+                     */
+                    CHECK_NO_VEX(s);
+                }
+
                 if (mod == 3) {
                     op2_offset = offsetof(CPUX86State,xmm_regs[rm | REX_B(s)]);
                 } else {
@@ -4327,6 +4432,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 val = x86_ldub_code(env, s);
                 switch (b) {
                 case 0x14: /* pextrb */
+                    CHECK_AVX_V0_128(s);
                     tcg_gen_ld8u_tl(s->T0, cpu_env, offsetof(CPUX86State,
                                             xmm_regs[reg].ZMM_B(val & 15)));
                     if (mod == 3) {
@@ -4337,6 +4443,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                     break;
                 case 0x15: /* pextrw */
+                    CHECK_AVX_V0_128(s);
                     tcg_gen_ld16u_tl(s->T0, cpu_env, offsetof(CPUX86State,
                                             xmm_regs[reg].ZMM_W(val & 7)));
                     if (mod == 3) {
@@ -4347,6 +4454,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                     break;
                 case 0x16:
+                    CHECK_AVX_V0_128(s);
                     if (ot == MO_32) { /* pextrd */
                         tcg_gen_ld_i32(s->tmp2_i32, cpu_env,
                                         offsetof(CPUX86State,
@@ -4374,6 +4482,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                     break;
                 case 0x17: /* extractps */
+                    CHECK_AVX_V0_128(s);
                     tcg_gen_ld32u_tl(s->T0, cpu_env, offsetof(CPUX86State,
                                             xmm_regs[reg].ZMM_L(val & 3)));
                     if (mod == 3) {
@@ -4384,6 +4493,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                     break;
                 case 0x20: /* pinsrb */
+                    CHECK_AVX_128(s);
                     if (mod == 3) {
                         gen_op_mov_v_reg(s, MO_32, s->T0, rm);
                     } else {
@@ -4394,6 +4504,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                                             xmm_regs[reg].ZMM_B(val & 15)));
                     break;
                 case 0x21: /* insertps */
+                    CHECK_AVX_128(s);
                     if (mod == 3) {
                         tcg_gen_ld_i32(s->tmp2_i32, cpu_env,
                                         offsetof(CPUX86State,xmm_regs[rm]
@@ -4423,6 +4534,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                                                 xmm_regs[reg].ZMM_L(3)));
                     break;
                 case 0x22:
+                    CHECK_AVX_128(s);
                     if (ot == MO_32) { /* pinsrd */
                         if (mod == 3) {
                             tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[rm]);
@@ -4453,6 +4565,20 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 return;
             }
 
+            CHECK_AVX(s);
+            scalar_op = (s->prefix & PREFIX_VEX)
+                && (op7.flags & SSE_OPF_SCALAR)
+                && !(op7.flags & SSE_OPF_CMP);
+            if (is_xmm && (op7.flags & SSE_OPF_MMX)) {
+                CHECK_AVX2_256(s);
+            }
+            if (op7.flags & SSE_OPF_AVX2) {
+                CHECK_AVX2(s);
+            }
+            if ((op7.flags & SSE_OPF_V0) && !scalar_op) {
+                CHECK_AVX_V0(s);
+            }
+
             if (b1) {
                 op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
                 if (mod == 3) {
@@ -4540,6 +4666,19 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         }
         if (is_xmm) {
+            scalar_op = (s->prefix & PREFIX_VEX)
+                && (sse_op.flags & SSE_OPF_SCALAR)
+                && !(sse_op.flags & SSE_OPF_CMP)
+                && (b1 == 2 || b1 == 3);
+            /* VEX encoded scalar ops always have 3 operands! */
+            if ((sse_op.flags & SSE_OPF_V0) && !scalar_op) {
+                CHECK_AVX_V0(s);
+            } else {
+                CHECK_AVX(s);
+            }
+            if (sse_op.flags & SSE_OPF_MMX) {
+                CHECK_AVX2_256(s);
+            }
             op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
             if (mod != 3) {
                 int sz = 4;
-- 
2.36.0




* [PATCH v2 08/42] i386: Add ZMM_OFFSET macro
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (10 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 07/42] Enforce VEX encoding restrictions Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-25 21:03   ` Richard Henderson
  2022-04-24 22:01 ` [PATCH v2 09/42] i386: Helper macro for 256 bit AVX helpers Paul Brook
                   ` (33 subsequent siblings)
  45 siblings, 1 reply; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Add a convenience macro to get the offset of an xmm_regs element within
CPUX86State.

This was originally going to be the basis of an implementation that broke
operations into 128-bit chunks. I scrapped that idea, so this is now a purely
cosmetic change, but I think a worthwhile one: it reduces the number of
function calls that need to be split over multiple lines.

No functional changes.
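
For readers unfamiliar with the pattern, here is a minimal standalone
sketch of an offsetof-based macro with the same shape. MockReg, MockState,
MOCK_ZMM_OFFSET and mock_reg_ptr are hypothetical stand-ins invented for
this illustration, not QEMU definitions, and the real CPUX86State has many
more fields. Passing a run-time index to offsetof is assumed to rely on
the GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ZMMReg and CPUX86State. */
typedef struct MockReg {
    uint8_t bytes[64];
} MockReg;

typedef struct MockState {
    uint64_t other_state;   /* placeholder for unrelated fields */
    MockReg xmm_regs[16];
} MockState;

/*
 * Same shape as the macro in the patch.  A non-constant 'reg' is
 * accepted by GCC and Clang as an extension to standard offsetof.
 */
#define MOCK_ZMM_OFFSET(reg) offsetof(MockState, xmm_regs[reg])

/* Turn the byte offset back into a pointer to the register. */
static MockReg *mock_reg_ptr(MockState *env, int reg)
{
    assert(reg >= 0 && reg < 16);
    return (MockReg *)((char *)env + MOCK_ZMM_OFFSET(reg));
}
```

The macro yields a byte offset, not a pointer, which is why the callers in
the diff pass it to functions such as tcg_gen_addi_ptr together with
cpu_env.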

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/tcg/translate.c | 60 +++++++++++++++++--------------------
 1 file changed, 27 insertions(+), 33 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 2f5cc24e0c..e9e6062b7f 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2777,6 +2777,8 @@ static inline void gen_op_movq_env_0(DisasContext *s, int d_offset)
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset);
 }
 
+#define ZMM_OFFSET(reg) offsetof(CPUX86State, xmm_regs[reg])
+
 typedef void (*SSEFunc_i_ep)(TCGv_i32 val, TCGv_ptr env, TCGv_ptr reg);
 typedef void (*SSEFunc_l_ep)(TCGv_i64 val, TCGv_ptr env, TCGv_ptr reg);
 typedef void (*SSEFunc_0_epi)(TCGv_ptr env, TCGv_ptr reg, TCGv_i32 val);
@@ -3329,14 +3331,14 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
-            gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+            gen_sto_env_A0(s, ZMM_OFFSET(reg));
             break;
         case 0x3f0: /* lddqu */
             CHECK_AVX_V0(s);
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
-            gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+            gen_ldo_env_A0(s, ZMM_OFFSET(reg));
             break;
         case 0x22b: /* movntss */
         case 0x32b: /* movntsd */
@@ -3375,15 +3377,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 #ifdef TARGET_X86_64
             if (s->dflag == MO_64) {
                 gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0);
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                                 offsetof(CPUX86State,xmm_regs[reg]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(reg));
                 gen_helper_movq_mm_T0_xmm(s->ptr0, s->T0);
             } else
 #endif
             {
                 gen_ldst_modrm(env, s, modrm, MO_32, OR_TMP0, 0);
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                                 offsetof(CPUX86State,xmm_regs[reg]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(reg));
                 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
                 gen_helper_movl_mm_T0_xmm(s->ptr0, s->tmp2_i32);
             }
@@ -3410,11 +3410,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                gen_ldo_env_A0(s, ZMM_OFFSET(reg));
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                gen_op_movo(s, offsetof(CPUX86State, xmm_regs[reg]),
-                            offsetof(CPUX86State,xmm_regs[rm]));
+                gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(rm));
             }
             break;
         case 0x210: /* movss xmm, ea */
@@ -3474,7 +3473,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                gen_ldo_env_A0(s, ZMM_OFFSET(reg));
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)),
@@ -3519,7 +3518,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                gen_ldo_env_A0(s, ZMM_OFFSET(reg));
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)),
@@ -3542,8 +3541,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     goto illegal_op;
                 field_length = x86_ldub_code(env, s) & 0x3F;
                 bit_index = x86_ldub_code(env, s) & 0x3F;
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                    offsetof(CPUX86State,xmm_regs[reg]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(reg));
                 if (b1 == 1)
                     gen_helper_extrq_i(cpu_env, s->ptr0,
                                        tcg_const_i32(bit_index),
@@ -3617,11 +3615,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                gen_sto_env_A0(s, ZMM_OFFSET(reg));
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                gen_op_movo(s, offsetof(CPUX86State, xmm_regs[rm]),
-                            offsetof(CPUX86State,xmm_regs[reg]));
+                gen_op_movo(s, ZMM_OFFSET(rm), ZMM_OFFSET(reg));
             }
             break;
         case 0x211: /* movss ea, xmm */
@@ -3708,7 +3705,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             if (is_xmm) {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = ZMM_OFFSET(rm);
             } else {
                 rm = (modrm & 7);
                 op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
@@ -3720,16 +3717,14 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x050: /* movmskps */
             CHECK_AVX_V0(s);
             rm = (modrm & 7) | REX_B(s);
-            tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                             offsetof(CPUX86State,xmm_regs[rm]));
+            tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
             gen_helper_movmskps(s->tmp2_i32, cpu_env, s->ptr0);
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
         case 0x150: /* movmskpd */
             CHECK_AVX_V0(s);
             rm = (modrm & 7) | REX_B(s);
-            tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                             offsetof(CPUX86State,xmm_regs[rm]));
+            tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
             gen_helper_movmskpd(s->tmp2_i32, cpu_env, s->ptr0);
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
@@ -3745,7 +3740,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 rm = (modrm & 7);
                 op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
             }
-            op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+            op1_offset = ZMM_OFFSET(reg);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
             switch(b >> 8) {
@@ -3763,7 +3758,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             CHECK_AVX(s);
             ot = mo_64_32(s->dflag);
             gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
-            op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+            op1_offset = ZMM_OFFSET(reg);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             if (ot == MO_32) {
                 SSEFunc_0_epi sse_fn_epi = sse_op_table3ai[(b >> 8) & 1];
@@ -3790,7 +3785,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 gen_ldo_env_A0(s, op2_offset);
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = ZMM_OFFSET(rm);
             }
             op1_offset = offsetof(CPUX86State,fpregs[reg & 7].mmx);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
@@ -3828,7 +3823,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 op2_offset = offsetof(CPUX86State,xmm_t0);
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = ZMM_OFFSET(rm);
             }
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op2_offset);
             if (ot == MO_32) {
@@ -3921,8 +3916,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             if (b1) {
                 CHECK_AVX_V0(s);
                 rm = (modrm & 7) | REX_B(s);
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                                 offsetof(CPUX86State, xmm_regs[rm]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
                 gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0);
             } else {
                 CHECK_NO_VEX(s);
@@ -3969,7 +3963,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 } else {
                     CHECK_AVX(s);
                 }
-                op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+                op1_offset = ZMM_OFFSET(reg);
 
                 if (op6.flags & SSE_OPF_MMX) {
                     CHECK_AVX2_256(s);
@@ -3983,7 +3977,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 }
 
                 if (mod == 3) {
-                    op2_offset = offsetof(CPUX86State,xmm_regs[rm | REX_B(s)]);
+                    op2_offset = ZMM_OFFSET(rm | REX_B(s));
                 } else {
                     op2_offset = offsetof(CPUX86State,xmm_t0);
                     gen_lea_modrm(env, s, modrm);
@@ -4580,9 +4574,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
 
             if (b1) {
-                op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+                op1_offset = ZMM_OFFSET(reg);
                 if (mod == 3) {
-                    op2_offset = offsetof(CPUX86State,xmm_regs[rm | REX_B(s)]);
+                    op2_offset = ZMM_OFFSET(rm | REX_B(s));
                 } else {
                     op2_offset = offsetof(CPUX86State,xmm_t0);
                     gen_lea_modrm(env, s, modrm);
@@ -4679,7 +4673,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             if (sse_op.flags & SSE_OPF_MMX) {
                 CHECK_AVX2_256(s);
             }
-            op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+            op1_offset = ZMM_OFFSET(reg);
             if (mod != 3) {
                 int sz = 4;
 
@@ -4726,7 +4720,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 }
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = ZMM_OFFSET(rm);
             }
         } else {
             CHECK_NO_VEX(s);
-- 
2.36.0




* [PATCH v2 09/42] i386: Helper macro for 256 bit AVX helpers
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (11 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 08/42] i386: Add ZMM_OFFSET macro Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 10/42] i386: Rewrite vector shift helper Paul Brook
                   ` (32 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Once all the code is in place, 256 bit vector helpers will be generated by
including ops_sse.h a third time with SHIFT=2.

The first bit of support for this is to define a YMM_ONLY macro for code that
only applies to 256 bit vectors.  XMM_ONLY code will be executed for both
128 and 256 bit vectors.
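
As a rough illustration of how such width-gated macros behave, here is a
standalone sketch. It assumes SHIFT is fixed once at compile time, whereas
ops_sse.h is included several times with different SHIFT values. SketchReg
and sketch_clear are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption for this sketch: SHIFT is chosen here, not by each
 * inclusion of a header (0 = 64 bit, 1 = 128 bit, 2 = 256 bit).
 */
#ifndef SHIFT
#define SHIFT 2
#endif

#if SHIFT == 0
#define XMM_ONLY(...)
#define YMM_ONLY(...)
#elif SHIFT == 1
#define XMM_ONLY(...) __VA_ARGS__
#define YMM_ONLY(...)
#else
#define XMM_ONLY(...) __VA_ARGS__
#define YMM_ONLY(...) __VA_ARGS__
#endif

/* Hypothetical register type: four 64 bit lanes cover 256 bits. */
typedef struct SketchReg {
    uint64_t q[4];
} SketchReg;

/* Zero only the lanes that exist at the selected vector width. */
static void sketch_clear(SketchReg *d)
{
    assert(d);
    d->q[0] = 0;
    XMM_ONLY(d->q[1] = 0;)
    YMM_ONLY(
        d->q[2] = 0;
        d->q[3] = 0;
    )
}
```

With SHIFT set to 1 the YMM_ONLY body expands to nothing, so lanes 2 and 3
are left untouched.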

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h        | 8 ++++++++
 target/i386/ops_sse_header.h | 4 ++++
 2 files changed, 12 insertions(+)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index a5a48a20f6..23daab6b50 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -24,6 +24,7 @@
 #define Reg MMXReg
 #define SIZE 8
 #define XMM_ONLY(...)
+#define YMM_ONLY(...)
 #define B(n) MMX_B(n)
 #define W(n) MMX_W(n)
 #define L(n) MMX_L(n)
@@ -37,7 +38,13 @@
 #define W(n) ZMM_W(n)
 #define L(n) ZMM_L(n)
 #define Q(n) ZMM_Q(n)
+#if SHIFT == 1
 #define SUFFIX _xmm
+#define YMM_ONLY(...)
+#else
+#define SUFFIX _ymm
+#define YMM_ONLY(...) __VA_ARGS__
+#endif
 #endif
 
 /*
@@ -2337,6 +2344,7 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 
 #undef SHIFT
 #undef XMM_ONLY
+#undef YMM_ONLY
 #undef Reg
 #undef B
 #undef W
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index cef28f2aae..7e7f2cee2a 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -21,7 +21,11 @@
 #define SUFFIX _mmx
 #else
 #define Reg ZMMReg
+#if SHIFT == 1
 #define SUFFIX _xmm
+#else
+#define SUFFIX _ymm
+#endif
 #endif
 
 #define dh_alias_Reg ptr
-- 
2.36.0




* [PATCH v2 10/42] i386: Rewrite vector shift helper
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (12 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 09/42] i386: Helper macro for 256 bit AVX helpers Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-25 21:33   ` Richard Henderson
  2022-04-24 22:01 ` [PATCH v2 11/42] i386: Rewrite simple integer vector helpers Paul Brook
                   ` (31 subsequent siblings)
  45 siblings, 1 reply; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Rewrite the vector shift helpers in preparation for AVX support (3 operand
form and 256 bit vectors).

For now keep the existing two operand interface.

No functional changes to existing helpers.
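
The behaviour the word-shift helpers are meant to preserve can be sketched
in standalone form as below. This is an illustration only: it uses a loop
where the patch unrolls through a macro, and sketch_psrlw, sketch_psraw and
their parameters are hypothetical names. 'count' plays the role of c->Q(0)
and 'n' the role of 4 << SHIFT.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Logical right shift of n 16 bit elements.
 * Counts above 15 clear every element.
 */
static void sketch_psrlw(uint16_t *d, const uint16_t *s,
                         uint64_t count, int n)
{
    assert(n == 4 || n == 8 || n == 16);
    for (int i = 0; i < n; i++) {
        d[i] = (count > 15) ? 0 : (uint16_t)(s[i] >> count);
    }
}

/*
 * Arithmetic right shift of n 16 bit elements.
 * Counts above 15 saturate at 15, which replicates the sign bit.
 * Right-shifting a negative value is implementation-defined in C;
 * this sketch assumes the usual arithmetic behaviour.
 */
static void sketch_psraw(uint16_t *d, const uint16_t *s,
                         uint64_t count, int n)
{
    int shift = (count > 15) ? 15 : (int)count;

    assert(n == 4 || n == 8 || n == 16);
    for (int i = 0; i < n; i++) {
        d[i] = (uint16_t)((int16_t)s[i] >> shift);
    }
}
```

Passing the same array as d and s matches the two operand form kept by
this patch, where the helper sets s = d.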

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h | 250 ++++++++++++++++++++++--------------------
 1 file changed, 133 insertions(+), 117 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 23daab6b50..9297c96d04 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -63,199 +63,215 @@
 #define MOVE(d, r) memcpy(&(d).B(0), &(r).B(0), SIZE)
 #endif
 
-void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+#if SHIFT == 0
+#define SHIFT_HELPER_BODY(n, elem, F) do {      \
+    d->elem(0) = F(s->elem(0), shift);          \
+    if ((n) > 1) {                              \
+        d->elem(1) = F(s->elem(1), shift);      \
+    }                                           \
+    if ((n) > 2) {                              \
+        d->elem(2) = F(s->elem(2), shift);      \
+        d->elem(3) = F(s->elem(3), shift);      \
+    }                                           \
+    if ((n) > 4) {                              \
+        d->elem(4) = F(s->elem(4), shift);      \
+        d->elem(5) = F(s->elem(5), shift);      \
+        d->elem(6) = F(s->elem(6), shift);      \
+        d->elem(7) = F(s->elem(7), shift);      \
+    }                                           \
+    if ((n) > 8) {                              \
+        d->elem(8) = F(s->elem(8), shift);      \
+        d->elem(9) = F(s->elem(9), shift);      \
+        d->elem(10) = F(s->elem(10), shift);    \
+        d->elem(11) = F(s->elem(11), shift);    \
+        d->elem(12) = F(s->elem(12), shift);    \
+        d->elem(13) = F(s->elem(13), shift);    \
+        d->elem(14) = F(s->elem(14), shift);    \
+        d->elem(15) = F(s->elem(15), shift);    \
+    }                                           \
+    } while (0)
+
+#define FPSRL(x, c) ((x) >> shift)
+#define FPSRAW(x, c) ((int16_t)(x) >> shift)
+#define FPSRAL(x, c) ((int32_t)(x) >> shift)
+#define FPSLL(x, c) ((x) << shift)
+#endif
+
+void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 15) {
+    if (c->Q(0) > 15) {
         d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+        XMM_ONLY(d->Q(1) = 0;)
+        YMM_ONLY(
+                d->Q(2) = 0;
+                d->Q(3) = 0;
+                )
     } else {
-        shift = s->B(0);
-        d->W(0) >>= shift;
-        d->W(1) >>= shift;
-        d->W(2) >>= shift;
-        d->W(3) >>= shift;
-#if SHIFT == 1
-        d->W(4) >>= shift;
-        d->W(5) >>= shift;
-        d->W(6) >>= shift;
-        d->W(7) >>= shift;
-#endif
+        shift = c->B(0);
+        SHIFT_HELPER_BODY(4 << SHIFT, W, FPSRL);
     }
 }
 
-void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 15) {
-        shift = 15;
+    if (c->Q(0) > 15) {
+        d->Q(0) = 0;
+        XMM_ONLY(d->Q(1) = 0;)
+        YMM_ONLY(
+                d->Q(2) = 0;
+                d->Q(3) = 0;
+                )
     } else {
-        shift = s->B(0);
+        shift = c->B(0);
+        SHIFT_HELPER_BODY(4 << SHIFT, W, FPSLL);
     }
-    d->W(0) = (int16_t)d->W(0) >> shift;
-    d->W(1) = (int16_t)d->W(1) >> shift;
-    d->W(2) = (int16_t)d->W(2) >> shift;
-    d->W(3) = (int16_t)d->W(3) >> shift;
-#if SHIFT == 1
-    d->W(4) = (int16_t)d->W(4) >> shift;
-    d->W(5) = (int16_t)d->W(5) >> shift;
-    d->W(6) = (int16_t)d->W(6) >> shift;
-    d->W(7) = (int16_t)d->W(7) >> shift;
-#endif
 }
 
-void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 15) {
-        d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+    if (c->Q(0) > 15) {
+        shift = 15;
     } else {
-        shift = s->B(0);
-        d->W(0) <<= shift;
-        d->W(1) <<= shift;
-        d->W(2) <<= shift;
-        d->W(3) <<= shift;
-#if SHIFT == 1
-        d->W(4) <<= shift;
-        d->W(5) <<= shift;
-        d->W(6) <<= shift;
-        d->W(7) <<= shift;
-#endif
+        shift = c->B(0);
     }
+    SHIFT_HELPER_BODY(4 << SHIFT, W, FPSRAW);
 }
 
-void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 31) {
+    if (c->Q(0) > 31) {
         d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+        XMM_ONLY(d->Q(1) = 0;)
+        YMM_ONLY(
+                d->Q(2) = 0;
+                d->Q(3) = 0;
+                )
     } else {
-        shift = s->B(0);
-        d->L(0) >>= shift;
-        d->L(1) >>= shift;
-#if SHIFT == 1
-        d->L(2) >>= shift;
-        d->L(3) >>= shift;
-#endif
+        shift = c->B(0);
+        SHIFT_HELPER_BODY(2 << SHIFT, L, FPSRL);
     }
 }
 
-void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 31) {
-        shift = 31;
+    if (c->Q(0) > 31) {
+        d->Q(0) = 0;
+        XMM_ONLY(d->Q(1) = 0;)
+        YMM_ONLY(
+                d->Q(2) = 0;
+                d->Q(3) = 0;
+                )
     } else {
-        shift = s->B(0);
+        shift = c->B(0);
+        SHIFT_HELPER_BODY(2 << SHIFT, L, FPSLL);
     }
-    d->L(0) = (int32_t)d->L(0) >> shift;
-    d->L(1) = (int32_t)d->L(1) >> shift;
-#if SHIFT == 1
-    d->L(2) = (int32_t)d->L(2) >> shift;
-    d->L(3) = (int32_t)d->L(3) >> shift;
-#endif
 }
 
-void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 31) {
-        d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+    if (c->Q(0) > 31) {
+        shift = 31;
     } else {
-        shift = s->B(0);
-        d->L(0) <<= shift;
-        d->L(1) <<= shift;
-#if SHIFT == 1
-        d->L(2) <<= shift;
-        d->L(3) <<= shift;
-#endif
+        shift = c->B(0);
     }
+    SHIFT_HELPER_BODY(2 << SHIFT, L, FPSRAL);
 }
 
-void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 63) {
+    if (c->Q(0) > 63) {
         d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+        XMM_ONLY(d->Q(1) = 0;)
+        YMM_ONLY(
+                d->Q(2) = 0;
+                d->Q(3) = 0;
+                )
     } else {
-        shift = s->B(0);
-        d->Q(0) >>= shift;
-#if SHIFT == 1
-        d->Q(1) >>= shift;
-#endif
+        shift = c->B(0);
+        SHIFT_HELPER_BODY(1 << SHIFT, Q, FPSRL);
     }
 }
 
-void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 63) {
+    if (c->Q(0) > 63) {
         d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+        XMM_ONLY(d->Q(1) = 0;)
+        YMM_ONLY(
+                d->Q(2) = 0;
+                d->Q(3) = 0;
+                )
     } else {
-        shift = s->B(0);
-        d->Q(0) <<= shift;
-#if SHIFT == 1
-        d->Q(1) <<= shift;
-#endif
+        shift = c->B(0);
+        SHIFT_HELPER_BODY(1 << SHIFT, Q, FPSLL);
     }
 }
 
-#if SHIFT == 1
-void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+#if SHIFT >= 1
+void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift, i;
 
-    shift = s->L(0);
+    shift = c->L(0);
     if (shift > 16) {
         shift = 16;
     }
     for (i = 0; i < 16 - shift; i++) {
-        d->B(i) = d->B(i + shift);
+        d->B(i) = s->B(i + shift);
     }
     for (i = 16 - shift; i < 16; i++) {
         d->B(i) = 0;
     }
+#if SHIFT == 2
+    for (i = 0; i < 16 - shift; i++) {
+        d->B(i + 16) = s->B(i + 16 + shift);
+    }
+    for (i = 16 - shift; i < 16; i++) {
+        d->B(i + 16) = 0;
+    }
+#endif
 }
 
-void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift, i;
 
-    shift = s->L(0);
+    shift = c->L(0);
     if (shift > 16) {
         shift = 16;
     }
     for (i = 15; i >= shift; i--) {
-        d->B(i) = d->B(i - shift);
+        d->B(i) = s->B(i - shift);
     }
     for (i = 0; i < shift; i++) {
         d->B(i) = 0;
     }
+#if SHIFT == 2
+    for (i = 15; i >= shift; i--) {
+        d->B(i + 16) = s->B(i + 16 - shift);
+    }
+    for (i = 0; i < shift; i++) {
+        d->B(i + 16) = 0;
+    }
+#endif
 }
 #endif
 
-- 
2.36.0




* [PATCH v2 11/42] i386: Rewrite simple integer vector helpers
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (13 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 10/42] i386: Rewrite vector shift helper Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 12/42] i386: Misc integer AVX helper prep Paul Brook
                   ` (30 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Rewrite the "simple" vector integer helpers in preparation for AVX support.

While the current code is able to use the same prototype for unary
(a = F(b)) and binary (a = F(b, c)) operations, future changes will cause
them to diverge.

No functional changes to existing helpers.
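
The difference between the two shapes can be sketched in standalone form as
follows. This is an illustration under simplifying assumptions (a single
128 bit register, byte elements, loops instead of unrolled statements).
SketchVec, SKETCH_HELPER_1, SKETCH_HELPER_2, F_ABS and F_ADD are
hypothetical names, not the macros from ops_sse.h.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NB_BYTES 16   /* assumption: one 128 bit register */

typedef struct SketchVec {
    uint8_t b[SKETCH_NB_BYTES];
} SketchVec;

/* Unary form, a = F(b): the destination is written but never read. */
#define SKETCH_HELPER_1(name, F)                                \
    static void name(SketchVec *d, const SketchVec *s)          \
    {                                                           \
        assert(d && s);                                         \
        for (int i = 0; i < SKETCH_NB_BYTES; i++) {             \
            d->b[i] = F(s->b[i]);                               \
        }                                                       \
    }

/* Binary form, a = F(b, c): 'v' aliases 'd' in the 2 operand case. */
#define SKETCH_HELPER_2(name, F)                                \
    static void name(SketchVec *d, const SketchVec *s)          \
    {                                                           \
        const SketchVec *v = d;                                 \
        assert(d && s);                                         \
        for (int i = 0; i < SKETCH_NB_BYTES; i++) {             \
            d->b[i] = F(v->b[i], s->b[i]);                      \
        }                                                       \
    }

/* Per-element operations used to instantiate the two shapes. */
#define F_ABS(x) ((uint8_t)((int8_t)(x) < 0 ? -(int8_t)(x) : (x)))
#define F_ADD(a, b) ((uint8_t)((a) + (b)))

SKETCH_HELPER_1(sketch_pabsb, F_ABS)
SKETCH_HELPER_2(sketch_paddb, F_ADD)
```

Keeping the first source operand behind a separate name such as 'v' means
a later three operand prototype would only need to change where 'v' comes
from, not the body of the helper.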

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h | 180 ++++++++++++++++++++++++++++++++----------
 1 file changed, 137 insertions(+), 43 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 9297c96d04..bb9cbf9ead 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -275,61 +275,148 @@ void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 }
 #endif
 
-#define SSE_HELPER_B(name, F)                                   \
+#define SSE_HELPER_1(name, elem, num, F)                                   \
     void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)   \
     {                                                           \
-        d->B(0) = F(d->B(0), s->B(0));                          \
-        d->B(1) = F(d->B(1), s->B(1));                          \
-        d->B(2) = F(d->B(2), s->B(2));                          \
-        d->B(3) = F(d->B(3), s->B(3));                          \
-        d->B(4) = F(d->B(4), s->B(4));                          \
-        d->B(5) = F(d->B(5), s->B(5));                          \
-        d->B(6) = F(d->B(6), s->B(6));                          \
-        d->B(7) = F(d->B(7), s->B(7));                          \
+        d->elem(0) = F(s->elem(0));                             \
+        d->elem(1) = F(s->elem(1));                             \
+        if ((num << SHIFT) > 2) {                               \
+            d->elem(2) = F(s->elem(2));                         \
+            d->elem(3) = F(s->elem(3));                         \
+        }                                                       \
+        if ((num << SHIFT) > 4) {                               \
+            d->elem(4) = F(s->elem(4));                         \
+            d->elem(5) = F(s->elem(5));                         \
+            d->elem(6) = F(s->elem(6));                         \
+            d->elem(7) = F(s->elem(7));                         \
+        }                                                       \
+        if ((num << SHIFT) > 8) {                               \
+            d->elem(8) = F(s->elem(8));                         \
+            d->elem(9) = F(s->elem(9));                         \
+            d->elem(10) = F(s->elem(10));                       \
+            d->elem(11) = F(s->elem(11));                       \
+            d->elem(12) = F(s->elem(12));                       \
+            d->elem(13) = F(s->elem(13));                       \
+            d->elem(14) = F(s->elem(14));                       \
+            d->elem(15) = F(s->elem(15));                       \
+        }                                                       \
+        if ((num << SHIFT) > 16) {                              \
+            d->elem(16) = F(s->elem(16));                       \
+            d->elem(17) = F(s->elem(17));                       \
+            d->elem(18) = F(s->elem(18));                       \
+            d->elem(19) = F(s->elem(19));                       \
+            d->elem(20) = F(s->elem(20));                       \
+            d->elem(21) = F(s->elem(21));                       \
+            d->elem(22) = F(s->elem(22));                       \
+            d->elem(23) = F(s->elem(23));                       \
+            d->elem(24) = F(s->elem(24));                       \
+            d->elem(25) = F(s->elem(25));                       \
+            d->elem(26) = F(s->elem(26));                       \
+            d->elem(27) = F(s->elem(27));                       \
+            d->elem(28) = F(s->elem(28));                       \
+            d->elem(29) = F(s->elem(29));                       \
+            d->elem(30) = F(s->elem(30));                       \
+            d->elem(31) = F(s->elem(31));                       \
+        }                                                       \
+    }
+
+#define SSE_HELPER_B(name, F)                                   \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+    {                                                           \
+        Reg *v = d;                                             \
+        d->B(0) = F(v->B(0), s->B(0));                          \
+        d->B(1) = F(v->B(1), s->B(1));                          \
+        d->B(2) = F(v->B(2), s->B(2));                          \
+        d->B(3) = F(v->B(3), s->B(3));                          \
+        d->B(4) = F(v->B(4), s->B(4));                          \
+        d->B(5) = F(v->B(5), s->B(5));                          \
+        d->B(6) = F(v->B(6), s->B(6));                          \
+        d->B(7) = F(v->B(7), s->B(7));                          \
         XMM_ONLY(                                               \
-                 d->B(8) = F(d->B(8), s->B(8));                 \
-                 d->B(9) = F(d->B(9), s->B(9));                 \
-                 d->B(10) = F(d->B(10), s->B(10));              \
-                 d->B(11) = F(d->B(11), s->B(11));              \
-                 d->B(12) = F(d->B(12), s->B(12));              \
-                 d->B(13) = F(d->B(13), s->B(13));              \
-                 d->B(14) = F(d->B(14), s->B(14));              \
-                 d->B(15) = F(d->B(15), s->B(15));              \
+                 d->B(8) = F(v->B(8), s->B(8));                 \
+                 d->B(9) = F(v->B(9), s->B(9));                 \
+                 d->B(10) = F(v->B(10), s->B(10));              \
+                 d->B(11) = F(v->B(11), s->B(11));              \
+                 d->B(12) = F(v->B(12), s->B(12));              \
+                 d->B(13) = F(v->B(13), s->B(13));              \
+                 d->B(14) = F(v->B(14), s->B(14));              \
+                 d->B(15) = F(v->B(15), s->B(15));              \
+                                                        )       \
+        YMM_ONLY(                                               \
+                 d->B(16) = F(v->B(16), s->B(16));              \
+                 d->B(17) = F(v->B(17), s->B(17));              \
+                 d->B(18) = F(v->B(18), s->B(18));              \
+                 d->B(19) = F(v->B(19), s->B(19));              \
+                 d->B(20) = F(v->B(20), s->B(20));              \
+                 d->B(21) = F(v->B(21), s->B(21));              \
+                 d->B(22) = F(v->B(22), s->B(22));              \
+                 d->B(23) = F(v->B(23), s->B(23));              \
+                 d->B(24) = F(v->B(24), s->B(24));              \
+                 d->B(25) = F(v->B(25), s->B(25));              \
+                 d->B(26) = F(v->B(26), s->B(26));              \
+                 d->B(27) = F(v->B(27), s->B(27));              \
+                 d->B(28) = F(v->B(28), s->B(28));              \
+                 d->B(29) = F(v->B(29), s->B(29));              \
+                 d->B(30) = F(v->B(30), s->B(30));              \
+                 d->B(31) = F(v->B(31), s->B(31));              \
                                                         )       \
             }
 
 #define SSE_HELPER_W(name, F)                                   \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)   \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
     {                                                           \
-        d->W(0) = F(d->W(0), s->W(0));                          \
-        d->W(1) = F(d->W(1), s->W(1));                          \
-        d->W(2) = F(d->W(2), s->W(2));                          \
-        d->W(3) = F(d->W(3), s->W(3));                          \
+        Reg *v = d;                                             \
+        d->W(0) = F(v->W(0), s->W(0));                          \
+        d->W(1) = F(v->W(1), s->W(1));                          \
+        d->W(2) = F(v->W(2), s->W(2));                          \
+        d->W(3) = F(v->W(3), s->W(3));                          \
         XMM_ONLY(                                               \
-                 d->W(4) = F(d->W(4), s->W(4));                 \
-                 d->W(5) = F(d->W(5), s->W(5));                 \
-                 d->W(6) = F(d->W(6), s->W(6));                 \
-                 d->W(7) = F(d->W(7), s->W(7));                 \
+                 d->W(4) = F(v->W(4), s->W(4));                 \
+                 d->W(5) = F(v->W(5), s->W(5));                 \
+                 d->W(6) = F(v->W(6), s->W(6));                 \
+                 d->W(7) = F(v->W(7), s->W(7));                 \
+                                                        )       \
+        YMM_ONLY(                                               \
+                 d->W(8) = F(v->W(8), s->W(8));                 \
+                 d->W(9) = F(v->W(9), s->W(9));                 \
+                 d->W(10) = F(v->W(10), s->W(10));              \
+                 d->W(11) = F(v->W(11), s->W(11));              \
+                 d->W(12) = F(v->W(12), s->W(12));              \
+                 d->W(13) = F(v->W(13), s->W(13));              \
+                 d->W(14) = F(v->W(14), s->W(14));              \
+                 d->W(15) = F(v->W(15), s->W(15));              \
                                                         )       \
             }
 
 #define SSE_HELPER_L(name, F)                                   \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)   \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
     {                                                           \
-        d->L(0) = F(d->L(0), s->L(0));                          \
-        d->L(1) = F(d->L(1), s->L(1));                          \
+        Reg *v = d;                                             \
+        d->L(0) = F(v->L(0), s->L(0));                          \
+        d->L(1) = F(v->L(1), s->L(1));                          \
         XMM_ONLY(                                               \
-                 d->L(2) = F(d->L(2), s->L(2));                 \
-                 d->L(3) = F(d->L(3), s->L(3));                 \
+                 d->L(2) = F(v->L(2), s->L(2));                 \
+                 d->L(3) = F(v->L(3), s->L(3));                 \
+                                                        )       \
+        YMM_ONLY(                                               \
+                 d->L(4) = F(v->L(4), s->L(4));                 \
+                 d->L(5) = F(v->L(5), s->L(5));                 \
+                 d->L(6) = F(v->L(6), s->L(6));                 \
+                 d->L(7) = F(v->L(7), s->L(7));                 \
                                                         )       \
             }
 
 #define SSE_HELPER_Q(name, F)                                   \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)   \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
     {                                                           \
-        d->Q(0) = F(d->Q(0), s->Q(0));                          \
+        Reg *v = d;                                             \
+        d->Q(0) = F(v->Q(0), s->Q(0));                          \
         XMM_ONLY(                                               \
-                 d->Q(1) = F(d->Q(1), s->Q(1));                 \
+                 d->Q(1) = F(v->Q(1), s->Q(1));                 \
+                                                        )       \
+        YMM_ONLY(                                               \
+                 d->Q(2) = F(v->Q(2), s->Q(2));                 \
+                 d->Q(3) = F(v->Q(3), s->Q(3));                 \
                                                         )       \
             }
 
@@ -452,12 +539,19 @@ SSE_HELPER_W(helper_pcmpeqw, FCMPEQ)
 SSE_HELPER_L(helper_pcmpeql, FCMPEQ)
 
 SSE_HELPER_W(helper_pmullw, FMULLW)
-#if SHIFT == 0
-SSE_HELPER_W(helper_pmulhrw, FMULHRW)
-#endif
 SSE_HELPER_W(helper_pmulhuw, FMULHUW)
 SSE_HELPER_W(helper_pmulhw, FMULHW)
 
+#if SHIFT == 0
+void glue(helper_pmulhrw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    d->W(0) = FMULHRW(d->W(0), s->W(0));
+    d->W(1) = FMULHRW(d->W(1), s->W(1));
+    d->W(2) = FMULHRW(d->W(2), s->W(2));
+    d->W(3) = FMULHRW(d->W(3), s->W(3));
+}
+#endif
+
 SSE_HELPER_B(helper_pavgb, FAVG)
 SSE_HELPER_W(helper_pavgw, FAVG)
 
@@ -1581,12 +1675,12 @@ void glue(helper_phsubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     XMM_ONLY(d->W(7) = satsw((int16_t)s->W(6) - (int16_t)s->W(7)));
 }
 
-#define FABSB(_, x) (x > INT8_MAX  ? -(int8_t)x : x)
-#define FABSW(_, x) (x > INT16_MAX ? -(int16_t)x : x)
-#define FABSL(_, x) (x > INT32_MAX ? -(int32_t)x : x)
-SSE_HELPER_B(helper_pabsb, FABSB)
-SSE_HELPER_W(helper_pabsw, FABSW)
-SSE_HELPER_L(helper_pabsd, FABSL)
+#define FABSB(x) (x > INT8_MAX  ? -(int8_t)x : x)
+#define FABSW(x) (x > INT16_MAX ? -(int16_t)x : x)
+#define FABSL(x) (x > INT32_MAX ? -(int32_t)x : x)
+SSE_HELPER_1(helper_pabsb, B, 8, FABSB)
+SSE_HELPER_1(helper_pabsw, W, 4, FABSW)
+SSE_HELPER_1(helper_pabsd, L, 2, FABSL)
 
 #define FMULHRSW(d, s) (((int16_t) d * (int16_t)s + 0x4000) >> 15)
 SSE_HELPER_W(helper_pmulhrsw, FMULHRSW)
-- 
2.36.0




* [PATCH v2 12/42] i386: Misc integer AVX helper prep
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (14 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 11/42] i386: Rewrite simple integer vector helpers Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 13/42] i386: Destructive vector helpers for AVX Paul Brook
                   ` (29 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

More preparatory work for AVX support in various integer vector helpers.

No functional changes to existing helpers.
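
As an illustration only (not part of the patch), the widening pattern used
here can be restated as a loop: each helper processes one 64-bit group for
SHIFT == 0, two for SHIFT == 1, and four for SHIFT == 2. The sketch below
shows this for a psadbw-style sum of absolute differences. The name
demo_psadbw, the flat array arguments, and the runtime shift parameter are
assumptions made for the example; the real helpers are unrolled and select
the width with the preprocessor.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: sum of absolute byte differences per 64-bit group.
 * shift plays the role of the SHIFT macro: 0 = 64 bit, 1 = 128 bit,
 * 2 = 256 bit. Groups beyond (1 << shift) are left untouched.
 */
static void demo_psadbw(uint64_t out[], const uint8_t v[],
                        const uint8_t s[], int shift)
{
    int groups = 1 << shift;
    int g, i;

    assert(shift >= 0 && shift <= 2);
    for (g = 0; g < groups; g++) {
        unsigned int val = 0;

        for (i = 0; i < 8; i++) {
            int diff = v[g * 8 + i] - s[g * 8 + i];
            val += diff < 0 ? -diff : diff;
        }
        out[g] = val;
    }
}
```

Calling it with shift = 1 writes out[0] and out[1] only, which mirrors how
the existing 128-bit behaviour is unchanged by the new SHIFT == 2 blocks.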

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h | 133 +++++++++++++++++++++++++++++++++---------
 1 file changed, 104 insertions(+), 29 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index bb9cbf9ead..d0424140d9 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -557,19 +557,25 @@ SSE_HELPER_W(helper_pavgw, FAVG)
 
 void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
-    d->Q(0) = (uint64_t)s->L(0) * (uint64_t)d->L(0);
-#if SHIFT == 1
-    d->Q(1) = (uint64_t)s->L(2) * (uint64_t)d->L(2);
+    Reg *v = d;
+    d->Q(0) = (uint64_t)s->L(0) * (uint64_t)v->L(0);
+#if SHIFT >= 1
+    d->Q(1) = (uint64_t)s->L(2) * (uint64_t)v->L(2);
+#if SHIFT == 2
+    d->Q(2) = (uint64_t)s->L(4) * (uint64_t)v->L(4);
+    d->Q(3) = (uint64_t)s->L(6) * (uint64_t)v->L(6);
+#endif
 #endif
 }
 
 void glue(helper_pmaddwd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
+    Reg *v = d;
     int i;
 
     for (i = 0; i < (2 << SHIFT); i++) {
-        d->L(i) = (int16_t)s->W(2 * i) * (int16_t)d->W(2 * i) +
-            (int16_t)s->W(2 * i + 1) * (int16_t)d->W(2 * i + 1);
+        d->L(i) = (int16_t)s->W(2 * i) * (int16_t)v->W(2 * i) +
+            (int16_t)s->W(2 * i + 1) * (int16_t)v->W(2 * i + 1);
     }
 }
 
@@ -583,31 +589,55 @@ static inline int abs1(int a)
     }
 }
 #endif
+
 void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
+    Reg *v = d;
     unsigned int val;
 
     val = 0;
-    val += abs1(d->B(0) - s->B(0));
-    val += abs1(d->B(1) - s->B(1));
-    val += abs1(d->B(2) - s->B(2));
-    val += abs1(d->B(3) - s->B(3));
-    val += abs1(d->B(4) - s->B(4));
-    val += abs1(d->B(5) - s->B(5));
-    val += abs1(d->B(6) - s->B(6));
-    val += abs1(d->B(7) - s->B(7));
+    val += abs1(v->B(0) - s->B(0));
+    val += abs1(v->B(1) - s->B(1));
+    val += abs1(v->B(2) - s->B(2));
+    val += abs1(v->B(3) - s->B(3));
+    val += abs1(v->B(4) - s->B(4));
+    val += abs1(v->B(5) - s->B(5));
+    val += abs1(v->B(6) - s->B(6));
+    val += abs1(v->B(7) - s->B(7));
     d->Q(0) = val;
-#if SHIFT == 1
+#if SHIFT >= 1
     val = 0;
-    val += abs1(d->B(8) - s->B(8));
-    val += abs1(d->B(9) - s->B(9));
-    val += abs1(d->B(10) - s->B(10));
-    val += abs1(d->B(11) - s->B(11));
-    val += abs1(d->B(12) - s->B(12));
-    val += abs1(d->B(13) - s->B(13));
-    val += abs1(d->B(14) - s->B(14));
-    val += abs1(d->B(15) - s->B(15));
+    val += abs1(v->B(8) - s->B(8));
+    val += abs1(v->B(9) - s->B(9));
+    val += abs1(v->B(10) - s->B(10));
+    val += abs1(v->B(11) - s->B(11));
+    val += abs1(v->B(12) - s->B(12));
+    val += abs1(v->B(13) - s->B(13));
+    val += abs1(v->B(14) - s->B(14));
+    val += abs1(v->B(15) - s->B(15));
     d->Q(1) = val;
+#if SHIFT == 2
+    val = 0;
+    val += abs1(v->B(16) - s->B(16));
+    val += abs1(v->B(17) - s->B(17));
+    val += abs1(v->B(18) - s->B(18));
+    val += abs1(v->B(19) - s->B(19));
+    val += abs1(v->B(20) - s->B(20));
+    val += abs1(v->B(21) - s->B(21));
+    val += abs1(v->B(22) - s->B(22));
+    val += abs1(v->B(23) - s->B(23));
+    d->Q(2) = val;
+    val = 0;
+    val += abs1(v->B(24) - s->B(24));
+    val += abs1(v->B(25) - s->B(25));
+    val += abs1(v->B(26) - s->B(26));
+    val += abs1(v->B(27) - s->B(27));
+    val += abs1(v->B(28) - s->B(28));
+    val += abs1(v->B(29) - s->B(29));
+    val += abs1(v->B(30) - s->B(30));
+    val += abs1(v->B(31) - s->B(31));
+    d->Q(3) = val;
+#endif
 #endif
 }
 
@@ -627,8 +657,12 @@ void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t val)
 {
     d->L(0) = val;
     d->L(1) = 0;
-#if SHIFT == 1
+#if SHIFT >= 1
     d->Q(1) = 0;
+#if SHIFT == 2
+    d->Q(2) = 0;
+    d->Q(3) = 0;
+#endif
 #endif
 }
 
@@ -636,8 +670,12 @@ void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t val)
 void glue(helper_movq_mm_T0, SUFFIX)(Reg *d, uint64_t val)
 {
     d->Q(0) = val;
-#if SHIFT == 1
+#if SHIFT >= 1
     d->Q(1) = 0;
+#if SHIFT == 2
+    d->Q(2) = 0;
+    d->Q(3) = 0;
+#endif
 #endif
 }
 #endif
@@ -1251,7 +1289,7 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86State *env, Reg *s)
     val |= (s->B(5) >> 2) & 0x20;
     val |= (s->B(6) >> 1) & 0x40;
     val |= (s->B(7)) & 0x80;
-#if SHIFT == 1
+#if SHIFT >= 1
     val |= (s->B(8) << 1) & 0x0100;
     val |= (s->B(9) << 2) & 0x0200;
     val |= (s->B(10) << 3) & 0x0400;
@@ -1260,6 +1298,24 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86State *env, Reg *s)
     val |= (s->B(13) << 6) & 0x2000;
     val |= (s->B(14) << 7) & 0x4000;
     val |= (s->B(15) << 8) & 0x8000;
+#if SHIFT == 2
+    val |= ((uint32_t)s->B(16) << 9) & 0x00010000;
+    val |= ((uint32_t)s->B(17) << 10) & 0x00020000;
+    val |= ((uint32_t)s->B(18) << 11) & 0x00040000;
+    val |= ((uint32_t)s->B(19) << 12) & 0x00080000;
+    val |= ((uint32_t)s->B(20) << 13) & 0x00100000;
+    val |= ((uint32_t)s->B(21) << 14) & 0x00200000;
+    val |= ((uint32_t)s->B(22) << 15) & 0x00400000;
+    val |= ((uint32_t)s->B(23) << 16) & 0x00800000;
+    val |= ((uint32_t)s->B(24) << 17) & 0x01000000;
+    val |= ((uint32_t)s->B(25) << 18) & 0x02000000;
+    val |= ((uint32_t)s->B(26) << 19) & 0x04000000;
+    val |= ((uint32_t)s->B(27) << 20) & 0x08000000;
+    val |= ((uint32_t)s->B(28) << 21) & 0x10000000;
+    val |= ((uint32_t)s->B(29) << 22) & 0x20000000;
+    val |= ((uint32_t)s->B(30) << 23) & 0x40000000;
+    val |= ((uint32_t)s->B(31) << 24) & 0x80000000;
+#endif
 #endif
     return val;
 }
@@ -1799,14 +1855,28 @@ void glue(helper_ptest, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     uint64_t zf = (s->Q(0) &  d->Q(0)) | (s->Q(1) &  d->Q(1));
     uint64_t cf = (s->Q(0) & ~d->Q(0)) | (s->Q(1) & ~d->Q(1));
 
+#if SHIFT == 2
+    zf |= (s->Q(2) &  d->Q(2)) | (s->Q(3) &  d->Q(3));
+    cf |= (s->Q(2) & ~d->Q(2)) | (s->Q(3) & ~d->Q(3));
+#endif
     CC_SRC = (zf ? 0 : CC_Z) | (cf ? 0 : CC_C);
 }
 
 #define SSE_HELPER_F(name, elem, num, F)        \
     void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)     \
     {                                           \
-        if (num > 2) {                          \
-            if (num > 4) {                      \
+        if (num * SHIFT > 2) {                  \
+            if (num * SHIFT > 8) {              \
+                d->elem(15) = F(15);            \
+                d->elem(14) = F(14);            \
+                d->elem(13) = F(13);            \
+                d->elem(12) = F(12);            \
+                d->elem(11) = F(11);            \
+                d->elem(10) = F(10);            \
+                d->elem(9) = F(9);              \
+                d->elem(8) = F(8);              \
+            }                                   \
+            if (num * SHIFT > 4) {              \
                 d->elem(7) = F(7);              \
                 d->elem(6) = F(6);              \
                 d->elem(5) = F(5);              \
@@ -1834,8 +1904,13 @@ SSE_HELPER_F(helper_pmovzxdq, Q, 2, s->L)
 
 void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
-    d->Q(0) = (int64_t)(int32_t) d->L(0) * (int32_t) s->L(0);
-    d->Q(1) = (int64_t)(int32_t) d->L(2) * (int32_t) s->L(2);
+    Reg *v = d;
+    d->Q(0) = (int64_t)(int32_t) v->L(0) * (int32_t) s->L(0);
+    d->Q(1) = (int64_t)(int32_t) v->L(2) * (int32_t) s->L(2);
+#if SHIFT == 2
+    d->Q(2) = (int64_t)(int32_t) v->L(4) * (int32_t) s->L(4);
+    d->Q(3) = (int64_t)(int32_t) v->L(6) * (int32_t) s->L(6);
+#endif
 }
 
 #define FCMPEQQ(d, s) (d == s ? -1 : 0)
-- 
2.36.0




* [PATCH v2 13/42] i386: Destructive vector helpers for AVX
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (15 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 12/42] i386: Misc integer AVX helper prep Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-27  6:53   ` Paolo Bonzini
  2022-04-24 22:01 ` [PATCH v2 14/42] i386: Add size suffix to vector FP helpers Paul Brook
                   ` (28 subsequent siblings)
  45 siblings, 1 reply; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

These helpers need to take special care to avoid overwriting source values
before the whole result has been calculated.  Currently they use a dummy
Reg-typed variable to store the result, then assign the whole register.
This will cause 128-bit operations to corrupt the upper half of the register,
so replace it with explicit temporaries and element assignments.
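
A minimal sketch of the difference, for illustration only. DemoReg,
shuf_whole_copy and shuf_elementwise are hypothetical names, and the
temporary is zero-initialised here so the example is well defined (the
original helpers left it uninitialised). Both functions compute a
shufps-style result in lanes 0..3 of a 256-bit register whose destination
is also the first source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 256-bit register: eight 32-bit lanes. */
typedef struct {
    uint32_t l[8];
} DemoReg;

/*
 * Old style: build the result in a whole-register temporary and copy it.
 * Only lanes 0..3 are computed, so the copy also replaces lanes 4..7.
 */
static void shuf_whole_copy(DemoReg *d, const DemoReg *s, int order)
{
    DemoReg r = {{0}};

    assert((order & ~0xff) == 0);
    r.l[0] = d->l[order & 3];
    r.l[1] = d->l[(order >> 2) & 3];
    r.l[2] = s->l[(order >> 4) & 3];
    r.l[3] = s->l[(order >> 6) & 3];
    *d = r;
}

/*
 * New style: read every input into scalar temporaries first, then assign
 * only the lanes this operation owns. Lanes 4..7 are never written.
 */
static void shuf_elementwise(DemoReg *d, const DemoReg *s, int order)
{
    uint32_t r0, r1, r2, r3;

    assert((order & ~0xff) == 0);
    r0 = d->l[order & 3];
    r1 = d->l[(order >> 2) & 3];
    r2 = s->l[(order >> 4) & 3];
    r3 = s->l[(order >> 6) & 3];
    d->l[0] = r0;
    d->l[1] = r1;
    d->l[2] = r2;
    d->l[3] = r3;
}
```

With the element-wise version the upper four lanes keep their previous
contents, whereas the whole-register copy overwrites them with whatever the
temporary held.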

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h | 707 ++++++++++++++++++++++++++----------------
 1 file changed, 437 insertions(+), 270 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index d0424140d9..c645d2ddbf 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -680,71 +680,85 @@ void glue(helper_movq_mm_T0, SUFFIX)(Reg *d, uint64_t val)
 }
 #endif
 
+#define SHUFFLE4(F, a, b, offset) do {      \
+    r0 = a->F((order & 3) + offset);        \
+    r1 = a->F(((order >> 2) & 3) + offset); \
+    r2 = b->F(((order >> 4) & 3) + offset); \
+    r3 = b->F(((order >> 6) & 3) + offset); \
+    d->F(offset) = r0;                      \
+    d->F(offset + 1) = r1;                  \
+    d->F(offset + 2) = r2;                  \
+    d->F(offset + 3) = r3;                  \
+    } while (0)
+
 #if SHIFT == 0
 void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int order)
 {
-    Reg r;
+    uint16_t r0, r1, r2, r3;
 
-    r.W(0) = s->W(order & 3);
-    r.W(1) = s->W((order >> 2) & 3);
-    r.W(2) = s->W((order >> 4) & 3);
-    r.W(3) = s->W((order >> 6) & 3);
-    MOVE(*d, r);
+    SHUFFLE4(W, s, s, 0);
 }
 #else
 void helper_shufps(Reg *d, Reg *s, int order)
 {
-    Reg r;
+    Reg *v = d;
+    uint32_t r0, r1, r2, r3;
 
-    r.L(0) = d->L(order & 3);
-    r.L(1) = d->L((order >> 2) & 3);
-    r.L(2) = s->L((order >> 4) & 3);
-    r.L(3) = s->L((order >> 6) & 3);
-    MOVE(*d, r);
+    SHUFFLE4(L, v, s, 0);
+#if SHIFT == 2
+    SHUFFLE4(L, v, s, 4);
+#endif
 }
 
 void helper_shufpd(Reg *d, Reg *s, int order)
 {
-    Reg r;
+    Reg *v = d;
+    uint64_t r0, r1;
 
-    r.Q(0) = d->Q(order & 1);
-    r.Q(1) = s->Q((order >> 1) & 1);
-    MOVE(*d, r);
+    r0 = v->Q(order & 1);
+    r1 = s->Q((order >> 1) & 1);
+    d->Q(0) = r0;
+    d->Q(1) = r1;
+#if SHIFT == 2
+    r0 = v->Q(((order >> 2) & 1) + 2);
+    r1 = s->Q(((order >> 3) & 1) + 2);
+    d->Q(2) = r0;
+    d->Q(3) = r1;
+#endif
 }
 
 void glue(helper_pshufd, SUFFIX)(Reg *d, Reg *s, int order)
 {
-    Reg r;
+    uint32_t r0, r1, r2, r3;
 
-    r.L(0) = s->L(order & 3);
-    r.L(1) = s->L((order >> 2) & 3);
-    r.L(2) = s->L((order >> 4) & 3);
-    r.L(3) = s->L((order >> 6) & 3);
-    MOVE(*d, r);
+    SHUFFLE4(L, s, s, 0);
+#if SHIFT ==  2
+    SHUFFLE4(L, s, s, 4);
+#endif
 }
 
 void glue(helper_pshuflw, SUFFIX)(Reg *d, Reg *s, int order)
 {
-    Reg r;
+    uint16_t r0, r1, r2, r3;
 
-    r.W(0) = s->W(order & 3);
-    r.W(1) = s->W((order >> 2) & 3);
-    r.W(2) = s->W((order >> 4) & 3);
-    r.W(3) = s->W((order >> 6) & 3);
-    r.Q(1) = s->Q(1);
-    MOVE(*d, r);
+    SHUFFLE4(W, s, s, 0);
+    d->Q(1) = s->Q(1);
+#if SHIFT == 2
+    SHUFFLE4(W, s, s, 8);
+    d->Q(3) = s->Q(3);
+#endif
 }
 
 void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int order)
 {
-    Reg r;
+    uint16_t r0, r1, r2, r3;
 
-    r.Q(0) = s->Q(0);
-    r.W(4) = s->W(4 + (order & 3));
-    r.W(5) = s->W(4 + ((order >> 2) & 3));
-    r.W(6) = s->W(4 + ((order >> 4) & 3));
-    r.W(7) = s->W(4 + ((order >> 6) & 3));
-    MOVE(*d, r);
+    d->Q(0) = s->Q(0);
+    SHUFFLE4(W, s, s, 4);
+#if SHIFT == 2
+    d->Q(2) = s->Q(2);
+    SHUFFLE4(W, s, s, 12);
+#endif
 }
 #endif
 
@@ -1320,156 +1334,190 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86State *env, Reg *s)
     return val;
 }
 
-void glue(helper_packsswb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;
-
-    r.B(0) = satsb((int16_t)d->W(0));
-    r.B(1) = satsb((int16_t)d->W(1));
-    r.B(2) = satsb((int16_t)d->W(2));
-    r.B(3) = satsb((int16_t)d->W(3));
-#if SHIFT == 1
-    r.B(4) = satsb((int16_t)d->W(4));
-    r.B(5) = satsb((int16_t)d->W(5));
-    r.B(6) = satsb((int16_t)d->W(6));
-    r.B(7) = satsb((int16_t)d->W(7));
-#endif
-    r.B((4 << SHIFT) + 0) = satsb((int16_t)s->W(0));
-    r.B((4 << SHIFT) + 1) = satsb((int16_t)s->W(1));
-    r.B((4 << SHIFT) + 2) = satsb((int16_t)s->W(2));
-    r.B((4 << SHIFT) + 3) = satsb((int16_t)s->W(3));
-#if SHIFT == 1
-    r.B(12) = satsb((int16_t)s->W(4));
-    r.B(13) = satsb((int16_t)s->W(5));
-    r.B(14) = satsb((int16_t)s->W(6));
-    r.B(15) = satsb((int16_t)s->W(7));
-#endif
-    MOVE(*d, r);
-}
-
-void glue(helper_packuswb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;
-
-    r.B(0) = satub((int16_t)d->W(0));
-    r.B(1) = satub((int16_t)d->W(1));
-    r.B(2) = satub((int16_t)d->W(2));
-    r.B(3) = satub((int16_t)d->W(3));
-#if SHIFT == 1
-    r.B(4) = satub((int16_t)d->W(4));
-    r.B(5) = satub((int16_t)d->W(5));
-    r.B(6) = satub((int16_t)d->W(6));
-    r.B(7) = satub((int16_t)d->W(7));
-#endif
-    r.B((4 << SHIFT) + 0) = satub((int16_t)s->W(0));
-    r.B((4 << SHIFT) + 1) = satub((int16_t)s->W(1));
-    r.B((4 << SHIFT) + 2) = satub((int16_t)s->W(2));
-    r.B((4 << SHIFT) + 3) = satub((int16_t)s->W(3));
-#if SHIFT == 1
-    r.B(12) = satub((int16_t)s->W(4));
-    r.B(13) = satub((int16_t)s->W(5));
-    r.B(14) = satub((int16_t)s->W(6));
-    r.B(15) = satub((int16_t)s->W(7));
+#if SHIFT == 0
+#define PACK_WIDTH 4
+#else
+#define PACK_WIDTH 8
 #endif
-    MOVE(*d, r);
-}
 
 void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
-    Reg r;
+    Reg *v = d;
+    uint16_t r[PACK_WIDTH];
+    int i;
 
-    r.W(0) = satsw(d->L(0));
-    r.W(1) = satsw(d->L(1));
-#if SHIFT == 1
-    r.W(2) = satsw(d->L(2));
-    r.W(3) = satsw(d->L(3));
+    r[0] = satsw(v->L(0));
+    r[1] = satsw(v->L(1));
+    r[PACK_WIDTH / 2 + 0] = satsw(s->L(0));
+    r[PACK_WIDTH / 2 + 1] = satsw(s->L(1));
+#if SHIFT >= 1
+    r[2] = satsw(v->L(2));
+    r[3] = satsw(v->L(3));
+    r[6] = satsw(s->L(2));
+    r[7] = satsw(s->L(3));
 #endif
-    r.W((2 << SHIFT) + 0) = satsw(s->L(0));
-    r.W((2 << SHIFT) + 1) = satsw(s->L(1));
-#if SHIFT == 1
-    r.W(6) = satsw(s->L(2));
-    r.W(7) = satsw(s->L(3));
+    for (i = 0; i < PACK_WIDTH; i++) {
+        d->W(i) = r[i];
+    }
+#if SHIFT == 2
+    r[0] = satsw(v->L(4));
+    r[1] = satsw(v->L(5));
+    r[2] = satsw(v->L(6));
+    r[3] = satsw(v->L(7));
+    r[4] = satsw(s->L(4));
+    r[5] = satsw(s->L(5));
+    r[6] = satsw(s->L(6));
+    r[7] = satsw(s->L(7));
+    for (i = 0; i < 8; i++) {
+        d->W(i + 8) = r[i];
+    }
 #endif
-    MOVE(*d, r);
 }
 
 #define UNPCK_OP(base_name, base)                                       \
                                                                         \
     void glue(helper_punpck ## base_name ## bw, SUFFIX)(CPUX86State *env,\
-                                                        Reg *d, Reg *s) \
+                                                Reg *d, Reg *s) \
     {                                                                   \
-        Reg r;                                                          \
+        Reg *v = d;                                                     \
+        uint8_t r[PACK_WIDTH * 2];                                      \
+        int i;                                                          \
                                                                         \
-        r.B(0) = d->B((base << (SHIFT + 2)) + 0);                       \
-        r.B(1) = s->B((base << (SHIFT + 2)) + 0);                       \
-        r.B(2) = d->B((base << (SHIFT + 2)) + 1);                       \
-        r.B(3) = s->B((base << (SHIFT + 2)) + 1);                       \
-        r.B(4) = d->B((base << (SHIFT + 2)) + 2);                       \
-        r.B(5) = s->B((base << (SHIFT + 2)) + 2);                       \
-        r.B(6) = d->B((base << (SHIFT + 2)) + 3);                       \
-        r.B(7) = s->B((base << (SHIFT + 2)) + 3);                       \
+        r[0] = v->B((base * PACK_WIDTH) + 0);                           \
+        r[1] = s->B((base * PACK_WIDTH) + 0);                           \
+        r[2] = v->B((base * PACK_WIDTH) + 1);                           \
+        r[3] = s->B((base * PACK_WIDTH) + 1);                           \
+        r[4] = v->B((base * PACK_WIDTH) + 2);                           \
+        r[5] = s->B((base * PACK_WIDTH) + 2);                           \
+        r[6] = v->B((base * PACK_WIDTH) + 3);                           \
+        r[7] = s->B((base * PACK_WIDTH) + 3);                           \
         XMM_ONLY(                                                       \
-                 r.B(8) = d->B((base << (SHIFT + 2)) + 4);              \
-                 r.B(9) = s->B((base << (SHIFT + 2)) + 4);              \
-                 r.B(10) = d->B((base << (SHIFT + 2)) + 5);             \
-                 r.B(11) = s->B((base << (SHIFT + 2)) + 5);             \
-                 r.B(12) = d->B((base << (SHIFT + 2)) + 6);             \
-                 r.B(13) = s->B((base << (SHIFT + 2)) + 6);             \
-                 r.B(14) = d->B((base << (SHIFT + 2)) + 7);             \
-                 r.B(15) = s->B((base << (SHIFT + 2)) + 7);             \
+                 r[8] = v->B((base * PACK_WIDTH) + 4);                  \
+                 r[9] = s->B((base * PACK_WIDTH) + 4);                  \
+                 r[10] = v->B((base * PACK_WIDTH) + 5);                 \
+                 r[11] = s->B((base * PACK_WIDTH) + 5);                 \
+                 r[12] = v->B((base * PACK_WIDTH) + 6);                 \
+                 r[13] = s->B((base * PACK_WIDTH) + 6);                 \
+                 r[14] = v->B((base * PACK_WIDTH) + 7);                 \
+                 r[15] = s->B((base * PACK_WIDTH) + 7);                 \
+                                                                      ) \
+        for (i = 0; i < PACK_WIDTH * 2; i++) {                          \
+            d->B(i) = r[i];                                             \
+        }                                                               \
+        YMM_ONLY(                                                       \
+                r[0] = v->B((base * 8) + 16);                           \
+                r[1] = s->B((base * 8) + 16);                           \
+                r[2] = v->B((base * 8) + 17);                           \
+                r[3] = s->B((base * 8) + 17);                           \
+                r[4] = v->B((base * 8) + 18);                           \
+                r[5] = s->B((base * 8) + 18);                           \
+                r[6] = v->B((base * 8) + 19);                           \
+                r[7] = s->B((base * 8) + 19);                           \
+                r[8] = v->B((base * 8) + 20);                           \
+                r[9] = s->B((base * 8) + 20);                           \
+                r[10] = v->B((base * 8) + 21);                          \
+                r[11] = s->B((base * 8) + 21);                          \
+                r[12] = v->B((base * 8) + 22);                          \
+                r[13] = s->B((base * 8) + 22);                          \
+                r[14] = v->B((base * 8) + 23);                          \
+                r[15] = s->B((base * 8) + 23);                          \
+                for (i = 0; i < PACK_WIDTH * 2; i++) {                  \
+                    d->B(16 + i) = r[i];                                \
+                }                                                       \
                                                                       ) \
-        MOVE(*d, r);                                                    \
     }                                                                   \
                                                                         \
     void glue(helper_punpck ## base_name ## wd, SUFFIX)(CPUX86State *env,\
-                                                        Reg *d, Reg *s) \
+                                                Reg *d, Reg *s) \
     {                                                                   \
-        Reg r;                                                          \
+        Reg *v = d;                                                     \
+        uint16_t r[PACK_WIDTH];                                         \
+        int i;                                                          \
                                                                         \
-        r.W(0) = d->W((base << (SHIFT + 1)) + 0);                       \
-        r.W(1) = s->W((base << (SHIFT + 1)) + 0);                       \
-        r.W(2) = d->W((base << (SHIFT + 1)) + 1);                       \
-        r.W(3) = s->W((base << (SHIFT + 1)) + 1);                       \
+        r[0] = v->W((base * (PACK_WIDTH / 2)) + 0);                     \
+        r[1] = s->W((base * (PACK_WIDTH / 2)) + 0);                     \
+        r[2] = v->W((base * (PACK_WIDTH / 2)) + 1);                     \
+        r[3] = s->W((base * (PACK_WIDTH / 2)) + 1);                     \
         XMM_ONLY(                                                       \
-                 r.W(4) = d->W((base << (SHIFT + 1)) + 2);              \
-                 r.W(5) = s->W((base << (SHIFT + 1)) + 2);              \
-                 r.W(6) = d->W((base << (SHIFT + 1)) + 3);              \
-                 r.W(7) = s->W((base << (SHIFT + 1)) + 3);              \
+                 r[4] = v->W((base * 4) + 2);                           \
+                 r[5] = s->W((base * 4) + 2);                           \
+                 r[6] = v->W((base * 4) + 3);                           \
+                 r[7] = s->W((base * 4) + 3);                           \
+                                                                      ) \
+        for (i = 0; i < PACK_WIDTH; i++) {                              \
+            d->W(i) = r[i];                                             \
+        }                                                               \
+        YMM_ONLY(                                                       \
+                r[0] = v->W((base * 4) + 8);                            \
+                r[1] = s->W((base * 4) + 8);                            \
+                r[2] = v->W((base * 4) + 9);                            \
+                r[3] = s->W((base * 4) + 9);                            \
+                r[4] = v->W((base * 4) + 10);                           \
+                r[5] = s->W((base * 4) + 10);                           \
+                r[6] = v->W((base * 4) + 11);                           \
+                r[7] = s->W((base * 4) + 11);                           \
+                for (i = 0; i < PACK_WIDTH; i++) {                      \
+                    d->W(i + 8) = r[i];                                 \
+                }                                                       \
                                                                       ) \
-            MOVE(*d, r);                                                \
     }                                                                   \
                                                                         \
     void glue(helper_punpck ## base_name ## dq, SUFFIX)(CPUX86State *env,\
-                                                        Reg *d, Reg *s) \
+                                                Reg *d, Reg *s) \
     {                                                                   \
-        Reg r;                                                          \
+        Reg *v = d;                                                     \
+        uint32_t r[4];                                                  \
                                                                         \
-        r.L(0) = d->L((base << SHIFT) + 0);                             \
-        r.L(1) = s->L((base << SHIFT) + 0);                             \
+        r[0] = v->L((base * (PACK_WIDTH / 4)) + 0);                     \
+        r[1] = s->L((base * (PACK_WIDTH / 4)) + 0);                     \
         XMM_ONLY(                                                       \
-                 r.L(2) = d->L((base << SHIFT) + 1);                    \
-                 r.L(3) = s->L((base << SHIFT) + 1);                    \
+                 r[2] = v->L((base * 2) + 1);                           \
+                 r[3] = s->L((base * 2) + 1);                           \
+                 d->L(2) = r[2];                                        \
+                 d->L(3) = r[3];                                        \
+                                                                      ) \
+        d->L(0) = r[0];                                                 \
+        d->L(1) = r[1];                                                 \
+        YMM_ONLY(                                                       \
+                 r[0] = v->L((base * 2) + 4);                           \
+                 r[1] = s->L((base * 2) + 4);                           \
+                 r[2] = v->L((base * 2) + 5);                           \
+                 r[3] = s->L((base * 2) + 5);                           \
+                 d->L(4) = r[0];                                        \
+                 d->L(5) = r[1];                                        \
+                 d->L(6) = r[2];                                        \
+                 d->L(7) = r[3];                                        \
                                                                       ) \
-            MOVE(*d, r);                                                \
     }                                                                   \
                                                                         \
     XMM_ONLY(                                                           \
-             void glue(helper_punpck ## base_name ## qdq, SUFFIX)(CPUX86State \
-                                                                  *env, \
-                                                                  Reg *d, \
-                                                                  Reg *s) \
+             void glue(helper_punpck ## base_name ## qdq, SUFFIX)(      \
+                        CPUX86State *env, Reg *d, Reg *s)       \
              {                                                          \
-                 Reg r;                                                 \
+                 Reg *v = d;                                            \
+                 uint64_t r[2];                                         \
                                                                         \
-                 r.Q(0) = d->Q(base);                                   \
-                 r.Q(1) = s->Q(base);                                   \
-                 MOVE(*d, r);                                           \
+                 r[0] = v->Q(base);                                     \
+                 r[1] = s->Q(base);                                     \
+                 d->Q(0) = r[0];                                        \
+                 d->Q(1) = r[1];                                        \
+                 YMM_ONLY(                                              \
+                     r[0] = v->Q(base + 2);                             \
+                     r[1] = s->Q(base + 2);                             \
+                     d->Q(2) = r[0];                                    \
+                     d->Q(3) = r[1];                                    \
+                                                                      ) \
              }                                                          \
                                                                         )
 
 UNPCK_OP(l, 0)
 UNPCK_OP(h, 1)
 
+#undef PACK_WIDTH
+#undef PACK_HELPER_B
+#undef PACK4
+
+
 /* 3DNow! float ops */
 #if SHIFT == 0
 void helper_pi2fd(CPUX86State *env, MMXReg *d, MMXReg *s)
@@ -1622,113 +1670,172 @@ void helper_pswapd(CPUX86State *env, MMXReg *d, MMXReg *s)
 /* SSSE3 op helpers */
 void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
+    Reg *v = d;
     int i;
-    Reg r;
+#if SHIFT == 0
+    uint8_t r[8];
 
-    for (i = 0; i < (8 << SHIFT); i++) {
-        r.B(i) = (s->B(i) & 0x80) ? 0 : (d->B(s->B(i) & ((8 << SHIFT) - 1)));
+    for (i = 0; i < 8; i++) {
+        r[i] = (s->B(i) & 0x80) ? 0 : (v->B(s->B(i) & 7));
+    }
+    for (i = 0; i < 8; i++) {
+        d->B(i) = r[i];
     }
+#else
+    uint8_t r[16];
 
-    MOVE(*d, r);
+    for (i = 0; i < 16; i++) {
+        r[i] = (s->B(i) & 0x80) ? 0 : (v->B(s->B(i) & 0xf));
+    }
+    for (i = 0; i < 16; i++) {
+        d->B(i) = r[i];
+    }
+#if SHIFT == 2
+    for (i = 0; i < 16; i++) {
+        r[i] = (s->B(i + 16) & 0x80) ? 0 : (v->B((s->B(i + 16) & 0xf) + 16));
+    }
+    for (i = 0; i < 16; i++) {
+        d->B(i + 16) = r[i];
+    }
+#endif
+#endif
 }
 
-void glue(helper_phaddw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-
-    Reg r;
-
-    r.W(0) = (int16_t)d->W(0) + (int16_t)d->W(1);
-    r.W(1) = (int16_t)d->W(2) + (int16_t)d->W(3);
-    XMM_ONLY(r.W(2) = (int16_t)d->W(4) + (int16_t)d->W(5));
-    XMM_ONLY(r.W(3) = (int16_t)d->W(6) + (int16_t)d->W(7));
-    r.W((2 << SHIFT) + 0) = (int16_t)s->W(0) + (int16_t)s->W(1);
-    r.W((2 << SHIFT) + 1) = (int16_t)s->W(2) + (int16_t)s->W(3);
-    XMM_ONLY(r.W(6) = (int16_t)s->W(4) + (int16_t)s->W(5));
-    XMM_ONLY(r.W(7) = (int16_t)s->W(6) + (int16_t)s->W(7));
+#if SHIFT == 0
 
-    MOVE(*d, r);
+#define SSE_HELPER_HW(name, F)  \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+{                               \
+    Reg *v = d;                 \
+    uint16_t r[4];              \
+    r[0] = F(v->W(0), v->W(1)); \
+    r[1] = F(v->W(2), v->W(3)); \
+    r[2] = F(s->W(0), s->W(1)); \
+    r[3] = F(s->W(2), s->W(3)); \
+    d->W(0) = r[0];             \
+    d->W(1) = r[1];             \
+    d->W(2) = r[2];             \
+    d->W(3) = r[3];             \
+}
+
+#define SSE_HELPER_HL(name, F)  \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+{                               \
+    Reg *v = d;                 \
+    uint32_t r0, r1;            \
+    r0 = F(v->L(0), v->L(1));   \
+    r1 = F(s->L(0), s->L(1));   \
+    d->L(0) = r0;               \
+    d->L(1) = r1;               \
 }
 
-void glue(helper_phaddd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;
-
-    r.L(0) = (int32_t)d->L(0) + (int32_t)d->L(1);
-    XMM_ONLY(r.L(1) = (int32_t)d->L(2) + (int32_t)d->L(3));
-    r.L((1 << SHIFT) + 0) = (int32_t)s->L(0) + (int32_t)s->L(1);
-    XMM_ONLY(r.L(3) = (int32_t)s->L(2) + (int32_t)s->L(3));
+#else
 
-    MOVE(*d, r);
+#define SSE_HELPER_HW(name, F)  \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+{                                   \
+    Reg *v = d;                     \
+    int32_t r[8];                   \
+    r[0] = F(v->W(0), v->W(1));     \
+    r[1] = F(v->W(2), v->W(3));     \
+    r[2] = F(v->W(4), v->W(5));     \
+    r[3] = F(v->W(6), v->W(7));     \
+    r[4] = F(s->W(0), s->W(1));     \
+    r[5] = F(s->W(2), s->W(3));     \
+    r[6] = F(s->W(4), s->W(5));     \
+    r[7] = F(s->W(6), s->W(7));     \
+    d->W(0) = r[0];                 \
+    d->W(1) = r[1];                 \
+    d->W(2) = r[2];                 \
+    d->W(3) = r[3];                 \
+    d->W(4) = r[4];                 \
+    d->W(5) = r[5];                 \
+    d->W(6) = r[6];                 \
+    d->W(7) = r[7];                 \
+    YMM_ONLY(                       \
+    r[0] = F(v->W(8), v->W(9));     \
+    r[1] = F(v->W(10), v->W(11));   \
+    r[2] = F(v->W(12), v->W(13));   \
+    r[3] = F(v->W(14), v->W(15));   \
+    r[4] = F(s->W(8), s->W(9));     \
+    r[5] = F(s->W(10), s->W(11));   \
+    r[6] = F(s->W(12), s->W(13));   \
+    r[7] = F(s->W(14), s->W(15));   \
+    d->W(8) = r[0];                 \
+    d->W(9) = r[1];                 \
+    d->W(10) = r[2];                \
+    d->W(11) = r[3];                \
+    d->W(12) = r[4];                \
+    d->W(13) = r[5];                \
+    d->W(14) = r[6];                \
+    d->W(15) = r[7];                \
+    )                               \
+}
+
+#define SSE_HELPER_HL(name, F)  \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+{                               \
+    Reg *v = d;                 \
+    int32_t r0, r1, r2, r3;     \
+    r0 = F(v->L(0), v->L(1));   \
+    r1 = F(v->L(2), v->L(3));   \
+    r2 = F(s->L(0), s->L(1));   \
+    r3 = F(s->L(2), s->L(3));   \
+    d->L(0) = r0;               \
+    d->L(1) = r1;               \
+    d->L(2) = r2;               \
+    d->L(3) = r3;               \
+    YMM_ONLY(                   \
+    r0 = F(v->L(4), v->L(5));   \
+    r1 = F(v->L(6), v->L(7));   \
+    r2 = F(s->L(4), s->L(5));   \
+    r3 = F(s->L(6), s->L(7));   \
+    d->L(4) = r0;               \
+    d->L(5) = r1;               \
+    d->L(6) = r2;               \
+    d->L(7) = r3;               \
+    )                           \
 }
+#endif
 
-void glue(helper_phaddsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;
-
-    r.W(0) = satsw((int16_t)d->W(0) + (int16_t)d->W(1));
-    r.W(1) = satsw((int16_t)d->W(2) + (int16_t)d->W(3));
-    XMM_ONLY(r.W(2) = satsw((int16_t)d->W(4) + (int16_t)d->W(5)));
-    XMM_ONLY(r.W(3) = satsw((int16_t)d->W(6) + (int16_t)d->W(7)));
-    r.W((2 << SHIFT) + 0) = satsw((int16_t)s->W(0) + (int16_t)s->W(1));
-    r.W((2 << SHIFT) + 1) = satsw((int16_t)s->W(2) + (int16_t)s->W(3));
-    XMM_ONLY(r.W(6) = satsw((int16_t)s->W(4) + (int16_t)s->W(5)));
-    XMM_ONLY(r.W(7) = satsw((int16_t)s->W(6) + (int16_t)s->W(7)));
+SSE_HELPER_HW(phaddw, FADD)
+SSE_HELPER_HW(phsubw, FSUB)
+SSE_HELPER_HW(phaddsw, FADDSW)
+SSE_HELPER_HW(phsubsw, FSUBSW)
+SSE_HELPER_HL(phaddd, FADD)
+SSE_HELPER_HL(phsubd, FSUB)
 
-    MOVE(*d, r);
-}
+#undef SSE_HELPER_HW
+#undef SSE_HELPER_HL
 
 void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
-    d->W(0) = satsw((int8_t)s->B(0) * (uint8_t)d->B(0) +
-                    (int8_t)s->B(1) * (uint8_t)d->B(1));
-    d->W(1) = satsw((int8_t)s->B(2) * (uint8_t)d->B(2) +
-                    (int8_t)s->B(3) * (uint8_t)d->B(3));
-    d->W(2) = satsw((int8_t)s->B(4) * (uint8_t)d->B(4) +
-                    (int8_t)s->B(5) * (uint8_t)d->B(5));
-    d->W(3) = satsw((int8_t)s->B(6) * (uint8_t)d->B(6) +
-                    (int8_t)s->B(7) * (uint8_t)d->B(7));
-#if SHIFT == 1
-    d->W(4) = satsw((int8_t)s->B(8) * (uint8_t)d->B(8) +
-                    (int8_t)s->B(9) * (uint8_t)d->B(9));
-    d->W(5) = satsw((int8_t)s->B(10) * (uint8_t)d->B(10) +
-                    (int8_t)s->B(11) * (uint8_t)d->B(11));
-    d->W(6) = satsw((int8_t)s->B(12) * (uint8_t)d->B(12) +
-                    (int8_t)s->B(13) * (uint8_t)d->B(13));
-    d->W(7) = satsw((int8_t)s->B(14) * (uint8_t)d->B(14) +
-                    (int8_t)s->B(15) * (uint8_t)d->B(15));
+    Reg *v = d;
+    d->W(0) = satsw((int8_t)s->B(0) * (uint8_t)v->B(0) +
+                    (int8_t)s->B(1) * (uint8_t)v->B(1));
+    d->W(1) = satsw((int8_t)s->B(2) * (uint8_t)v->B(2) +
+                    (int8_t)s->B(3) * (uint8_t)v->B(3));
+    d->W(2) = satsw((int8_t)s->B(4) * (uint8_t)v->B(4) +
+                    (int8_t)s->B(5) * (uint8_t)v->B(5));
+    d->W(3) = satsw((int8_t)s->B(6) * (uint8_t)v->B(6) +
+                    (int8_t)s->B(7) * (uint8_t)v->B(7));
+#if SHIFT >= 1
+    d->W(4) = satsw((int8_t)s->B(8) * (uint8_t)v->B(8) +
+                    (int8_t)s->B(9) * (uint8_t)v->B(9));
+    d->W(5) = satsw((int8_t)s->B(10) * (uint8_t)v->B(10) +
+                    (int8_t)s->B(11) * (uint8_t)v->B(11));
+    d->W(6) = satsw((int8_t)s->B(12) * (uint8_t)v->B(12) +
+                    (int8_t)s->B(13) * (uint8_t)v->B(13));
+    d->W(7) = satsw((int8_t)s->B(14) * (uint8_t)v->B(14) +
+                    (int8_t)s->B(15) * (uint8_t)v->B(15));
+#if SHIFT == 2
+    int i;
+    for (i = 8; i < 16; i++) {
+        d->W(i) = satsw((int8_t)s->B(i * 2) * (uint8_t)v->B(i * 2) +
+                        (int8_t)s->B(i * 2 + 1) * (uint8_t)v->B(i * 2 + 1));
+    }
+#endif
 #endif
-}
-
-void glue(helper_phsubw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    d->W(0) = (int16_t)d->W(0) - (int16_t)d->W(1);
-    d->W(1) = (int16_t)d->W(2) - (int16_t)d->W(3);
-    XMM_ONLY(d->W(2) = (int16_t)d->W(4) - (int16_t)d->W(5));
-    XMM_ONLY(d->W(3) = (int16_t)d->W(6) - (int16_t)d->W(7));
-    d->W((2 << SHIFT) + 0) = (int16_t)s->W(0) - (int16_t)s->W(1);
-    d->W((2 << SHIFT) + 1) = (int16_t)s->W(2) - (int16_t)s->W(3);
-    XMM_ONLY(d->W(6) = (int16_t)s->W(4) - (int16_t)s->W(5));
-    XMM_ONLY(d->W(7) = (int16_t)s->W(6) - (int16_t)s->W(7));
-}
-
-void glue(helper_phsubd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    d->L(0) = (int32_t)d->L(0) - (int32_t)d->L(1);
-    XMM_ONLY(d->L(1) = (int32_t)d->L(2) - (int32_t)d->L(3));
-    d->L((1 << SHIFT) + 0) = (int32_t)s->L(0) - (int32_t)s->L(1);
-    XMM_ONLY(d->L(3) = (int32_t)s->L(2) - (int32_t)s->L(3));
-}
-
-void glue(helper_phsubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    d->W(0) = satsw((int16_t)d->W(0) - (int16_t)d->W(1));
-    d->W(1) = satsw((int16_t)d->W(2) - (int16_t)d->W(3));
-    XMM_ONLY(d->W(2) = satsw((int16_t)d->W(4) - (int16_t)d->W(5)));
-    XMM_ONLY(d->W(3) = satsw((int16_t)d->W(6) - (int16_t)d->W(7)));
-    d->W((2 << SHIFT) + 0) = satsw((int16_t)s->W(0) - (int16_t)s->W(1));
-    d->W((2 << SHIFT) + 1) = satsw((int16_t)s->W(2) - (int16_t)s->W(3));
-    XMM_ONLY(d->W(6) = satsw((int16_t)s->W(4) - (int16_t)s->W(5)));
-    XMM_ONLY(d->W(7) = satsw((int16_t)s->W(6) - (int16_t)s->W(7)));
 }
 
 #define FABSB(x) (x > INT8_MAX  ? -(int8_t)x : x)
@@ -1751,32 +1858,49 @@ SSE_HELPER_L(helper_psignd, FSIGNL)
 void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                   int32_t shift)
 {
-    Reg r;
-
+    Reg *v = d;
     /* XXX could be checked during translation */
-    if (shift >= (16 << SHIFT)) {
-        r.Q(0) = 0;
-        XMM_ONLY(r.Q(1) = 0);
+    if (shift >= (SHIFT ? 32 : 16)) {
+        d->Q(0) = 0;
+        XMM_ONLY(d->Q(1) = 0);
+#if SHIFT == 2
+        d->Q(2) = 0;
+        d->Q(3) = 0;
+#endif
     } else {
         shift <<= 3;
 #define SHR(v, i) (i < 64 && i > -64 ? i > 0 ? v >> (i) : (v << -(i)) : 0)
 #if SHIFT == 0
-        r.Q(0) = SHR(s->Q(0), shift - 0) |
-            SHR(d->Q(0), shift -  64);
+        d->Q(0) = SHR(s->Q(0), shift - 0) |
+            SHR(v->Q(0), shift -  64);
 #else
-        r.Q(0) = SHR(s->Q(0), shift - 0) |
-            SHR(s->Q(1), shift -  64) |
-            SHR(d->Q(0), shift - 128) |
-            SHR(d->Q(1), shift - 192);
-        r.Q(1) = SHR(s->Q(0), shift + 64) |
-            SHR(s->Q(1), shift -   0) |
-            SHR(d->Q(0), shift -  64) |
-            SHR(d->Q(1), shift - 128);
+        uint64_t r0, r1;
+
+        r0 = SHR(s->Q(0), shift - 0) |
+             SHR(s->Q(1), shift -  64) |
+             SHR(v->Q(0), shift - 128) |
+             SHR(v->Q(1), shift - 192);
+        r1 = SHR(s->Q(0), shift + 64) |
+             SHR(s->Q(1), shift -   0) |
+             SHR(v->Q(0), shift -  64) |
+             SHR(v->Q(1), shift - 128);
+        d->Q(0) = r0;
+        d->Q(1) = r1;
+#if SHIFT == 2
+        r0 = SHR(s->Q(2), shift - 0) |
+             SHR(s->Q(3), shift -  64) |
+             SHR(v->Q(2), shift - 128) |
+             SHR(v->Q(3), shift - 192);
+        r1 = SHR(s->Q(2), shift + 64) |
+             SHR(s->Q(3), shift -   0) |
+             SHR(v->Q(2), shift -  64) |
+             SHR(v->Q(3), shift - 128);
+        d->Q(2) = r0;
+        d->Q(3) = r1;
+#endif
 #endif
 #undef SHR
     }
-
-    MOVE(*d, r);
 }
 
 #define XMM0 (env->xmm_regs[0])
@@ -1918,17 +2042,43 @@ SSE_HELPER_Q(helper_pcmpeqq, FCMPEQQ)
 
 void glue(helper_packusdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
-    Reg r;
-
-    r.W(0) = satuw((int32_t) d->L(0));
-    r.W(1) = satuw((int32_t) d->L(1));
-    r.W(2) = satuw((int32_t) d->L(2));
-    r.W(3) = satuw((int32_t) d->L(3));
-    r.W(4) = satuw((int32_t) s->L(0));
-    r.W(5) = satuw((int32_t) s->L(1));
-    r.W(6) = satuw((int32_t) s->L(2));
-    r.W(7) = satuw((int32_t) s->L(3));
-    MOVE(*d, r);
+    Reg *v = d;
+    uint16_t r[8];
+
+    r[0] = satuw((int32_t) v->L(0));
+    r[1] = satuw((int32_t) v->L(1));
+    r[2] = satuw((int32_t) v->L(2));
+    r[3] = satuw((int32_t) v->L(3));
+    r[4] = satuw((int32_t) s->L(0));
+    r[5] = satuw((int32_t) s->L(1));
+    r[6] = satuw((int32_t) s->L(2));
+    r[7] = satuw((int32_t) s->L(3));
+    d->W(0) = r[0];
+    d->W(1) = r[1];
+    d->W(2) = r[2];
+    d->W(3) = r[3];
+    d->W(4) = r[4];
+    d->W(5) = r[5];
+    d->W(6) = r[6];
+    d->W(7) = r[7];
+#if SHIFT == 2
+    r[0] = satuw((int32_t) v->L(4));
+    r[1] = satuw((int32_t) v->L(5));
+    r[2] = satuw((int32_t) v->L(6));
+    r[3] = satuw((int32_t) v->L(7));
+    r[4] = satuw((int32_t) s->L(4));
+    r[5] = satuw((int32_t) s->L(5));
+    r[6] = satuw((int32_t) s->L(6));
+    r[7] = satuw((int32_t) s->L(7));
+    d->W(8) = r[0];
+    d->W(9) = r[1];
+    d->W(10) = r[2];
+    d->W(11) = r[3];
+    d->W(12) = r[4];
+    d->W(13) = r[5];
+    d->W(14) = r[6];
+    d->W(15) = r[7];
+#endif
 }
 
 #define FMINSB(d, s) MIN((int8_t)d, (int8_t)s)
@@ -2184,20 +2334,37 @@ void glue(helper_dppd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
 void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                   uint32_t offset)
 {
+    Reg *v = d;
     int s0 = (offset & 3) << 2;
     int d0 = (offset & 4) << 0;
     int i;
-    Reg r;
+    uint16_t r[8];
 
     for (i = 0; i < 8; i++, d0++) {
-        r.W(i) = 0;
-        r.W(i) += abs1(d->B(d0 + 0) - s->B(s0 + 0));
-        r.W(i) += abs1(d->B(d0 + 1) - s->B(s0 + 1));
-        r.W(i) += abs1(d->B(d0 + 2) - s->B(s0 + 2));
-        r.W(i) += abs1(d->B(d0 + 3) - s->B(s0 + 3));
+        r[i] = 0;
+        r[i] += abs1(v->B(d0 + 0) - s->B(s0 + 0));
+        r[i] += abs1(v->B(d0 + 1) - s->B(s0 + 1));
+        r[i] += abs1(v->B(d0 + 2) - s->B(s0 + 2));
+        r[i] += abs1(v->B(d0 + 3) - s->B(s0 + 3));
     }
+    for (i = 0; i < 8; i++) {
+        d->W(i) = r[i];
+    }
+#if SHIFT == 2
+    s0 = ((offset & 0x18) >> 1) + 16;
+    d0 = ((offset & 0x20) >> 3) + 16;
 
-    MOVE(*d, r);
+    for (i = 0; i < 8; i++, d0++) {
+        r[i] = 0;
+        r[i] += abs1(v->B(d0 + 0) - s->B(s0 + 0));
+        r[i] += abs1(v->B(d0 + 1) - s->B(s0 + 1));
+        r[i] += abs1(v->B(d0 + 2) - s->B(s0 + 2));
+        r[i] += abs1(v->B(d0 + 3) - s->B(s0 + 3));
+    }
+    for (i = 0; i < 8; i++) {
+        d->W(i + 8) = r[i];
+    }
+#endif
 }
 
 /* SSE4.2 op helpers */
-- 
2.36.0




* [PATCH v2 14/42] i386: Add size suffix to vector FP helpers
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (16 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 13/42] i386: Destructive vector helpers for AVX Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 15/42] i386: Floating point arithmetic helper AVX prep Paul Brook
                   ` (27 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

For AVX we're going to need both 128 bit (xmm) and 256 bit (ymm) variants of
floating point helpers. Add the register type suffix to the existing
*PS and *PD helpers (SS and SD variants are only valid on 128 bit vectors).
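
As a rough illustration of the renaming, the sketch below shows how a
glue()/SUFFIX pair can emit one helper per register width from a single
function body. The names demo_addps, DemoReg and LANES are hypothetical
stand-ins and are not part of this patch; the two-level glue() is only
modelled on the macro the helpers already use, and the repeated
#define/#undef blocks stand in for including the template header once
per width.

```c
/* Two-level paste: the outer macro lets SUFFIX expand before ## is applied. */
#define xglue(a, b) a ## b
#define glue(a, b) xglue(a, b)

/* Hypothetical stand-in for a vector register holding up to 8 floats. */
typedef struct {
    float s[8];
} DemoReg;

/* First pass over the "template": 128-bit variant, 4 packed floats. */
#define SUFFIX _xmm
#define LANES 4
static void glue(demo_addps, SUFFIX)(DemoReg *d, const DemoReg *s)
{
    for (int i = 0; i < LANES; i++) {
        d->s[i] += s->s[i];
    }
}
#undef LANES
#undef SUFFIX

/* Second pass: 256-bit variant, 8 packed floats, same source text. */
#define SUFFIX _ymm
#define LANES 8
static void glue(demo_addps, SUFFIX)(DemoReg *d, const DemoReg *s)
{
    for (int i = 0; i < LANES; i++) {
        d->s[i] += s->s[i];
    }
}
#undef LANES
#undef SUFFIX
```

In this sketch demo_addps_xmm() updates only elements 0..3 and leaves the
upper half of the register alone, while demo_addps_ymm() updates all eight.
Without the suffix both passes would define the same symbol, which is why
the packed helpers need it and the scalar SS/SD ones do not.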

No functional changes.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h        | 48 ++++++++++++++++++------------------
 target/i386/ops_sse_header.h | 48 ++++++++++++++++++------------------
 target/i386/tcg/translate.c  | 37 +++++++++++++--------------
 3 files changed, 67 insertions(+), 66 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index c645d2ddbf..fc8fd57aa5 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -699,7 +699,7 @@ void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int order)
     SHUFFLE4(W, s, s, 0);
 }
 #else
-void helper_shufps(Reg *d, Reg *s, int order)
+void glue(helper_shufps, SUFFIX)(Reg *d, Reg *s, int order)
 {
     Reg *v = d;
     uint32_t r0, r1, r2, r3;
@@ -710,7 +710,7 @@ void helper_shufps(Reg *d, Reg *s, int order)
 #endif
 }
 
-void helper_shufpd(Reg *d, Reg *s, int order)
+void glue(helper_shufpd, SUFFIX)(Reg *d, Reg *s, int order)
 {
     Reg *v = d;
     uint64_t r0, r1;
@@ -767,7 +767,7 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int order)
 /* XXX: not accurate */
 
 #define SSE_HELPER_S(name, F)                                           \
-    void helper_ ## name ## ps(CPUX86State *env, Reg *d, Reg *s)        \
+    void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)\
     {                                                                   \
         d->ZMM_S(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));                  \
         d->ZMM_S(1) = F(32, d->ZMM_S(1), s->ZMM_S(1));                  \
@@ -780,7 +780,7 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int order)
         d->ZMM_S(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));                  \
     }                                                                   \
                                                                         \
-    void helper_ ## name ## pd(CPUX86State *env, Reg *d, Reg *s)        \
+    void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)\
     {                                                                   \
         d->ZMM_D(0) = F(64, d->ZMM_D(0), s->ZMM_D(0));                  \
         d->ZMM_D(1) = F(64, d->ZMM_D(1), s->ZMM_D(1));                  \
@@ -816,7 +816,7 @@ SSE_HELPER_S(sqrt, FPU_SQRT)
 
 
 /* float to float conversions */
-void helper_cvtps2pd(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     float32 s0, s1;
 
@@ -826,7 +826,7 @@ void helper_cvtps2pd(CPUX86State *env, Reg *d, Reg *s)
     d->ZMM_D(1) = float32_to_float64(s1, &env->sse_status);
 }
 
-void helper_cvtpd2ps(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_cvtpd2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     d->ZMM_S(0) = float64_to_float32(s->ZMM_D(0), &env->sse_status);
     d->ZMM_S(1) = float64_to_float32(s->ZMM_D(1), &env->sse_status);
@@ -844,7 +844,7 @@ void helper_cvtsd2ss(CPUX86State *env, Reg *d, Reg *s)
 }
 
 /* integer to float */
-void helper_cvtdq2ps(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_cvtdq2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     d->ZMM_S(0) = int32_to_float32(s->ZMM_L(0), &env->sse_status);
     d->ZMM_S(1) = int32_to_float32(s->ZMM_L(1), &env->sse_status);
@@ -852,7 +852,7 @@ void helper_cvtdq2ps(CPUX86State *env, Reg *d, Reg *s)
     d->ZMM_S(3) = int32_to_float32(s->ZMM_L(3), &env->sse_status);
 }
 
-void helper_cvtdq2pd(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_cvtdq2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     int32_t l0, l1;
 
@@ -929,7 +929,7 @@ WRAP_FLOATCONV(int64_t, float32_to_int64_round_to_zero, float32, INT64_MIN)
 WRAP_FLOATCONV(int64_t, float64_to_int64, float64, INT64_MIN)
 WRAP_FLOATCONV(int64_t, float64_to_int64_round_to_zero, float64, INT64_MIN)
 
-void helper_cvtps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_cvtps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     d->ZMM_L(0) = x86_float32_to_int32(s->ZMM_S(0), &env->sse_status);
     d->ZMM_L(1) = x86_float32_to_int32(s->ZMM_S(1), &env->sse_status);
@@ -937,7 +937,7 @@ void helper_cvtps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     d->ZMM_L(3) = x86_float32_to_int32(s->ZMM_S(3), &env->sse_status);
 }
 
-void helper_cvtpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_cvtpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     d->ZMM_L(0) = x86_float64_to_int32(s->ZMM_D(0), &env->sse_status);
     d->ZMM_L(1) = x86_float64_to_int32(s->ZMM_D(1), &env->sse_status);
@@ -979,7 +979,7 @@ int64_t helper_cvtsd2sq(CPUX86State *env, ZMMReg *s)
 #endif
 
 /* float to integer truncated */
-void helper_cvttps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_cvttps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     d->ZMM_L(0) = x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->sse_status);
     d->ZMM_L(1) = x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->sse_status);
@@ -987,7 +987,7 @@ void helper_cvttps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     d->ZMM_L(3) = x86_float32_to_int32_round_to_zero(s->ZMM_S(3), &env->sse_status);
 }
 
-void helper_cvttpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_cvttpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     d->ZMM_L(0) = x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->sse_status);
     d->ZMM_L(1) = x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->sse_status);
@@ -1028,7 +1028,7 @@ int64_t helper_cvttsd2sq(CPUX86State *env, ZMMReg *s)
 }
 #endif
 
-void helper_rsqrtps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     uint8_t old_flags = get_float_exception_flags(&env->sse_status);
     d->ZMM_S(0) = float32_div(float32_one,
@@ -1055,7 +1055,7 @@ void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     set_float_exception_flags(old_flags, &env->sse_status);
 }
 
-void helper_rcpps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     uint8_t old_flags = get_float_exception_flags(&env->sse_status);
     d->ZMM_S(0) = float32_div(float32_one, s->ZMM_S(0), &env->sse_status);
@@ -1116,7 +1116,7 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, int index, int length)
     d->ZMM_Q(0) = helper_insertq(d->ZMM_Q(0), index, length);
 }
 
-void helper_haddps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_haddps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     ZMMReg r;
 
@@ -1127,7 +1127,7 @@ void helper_haddps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     MOVE(*d, r);
 }
 
-void helper_haddpd(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_haddpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     ZMMReg r;
 
@@ -1136,7 +1136,7 @@ void helper_haddpd(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     MOVE(*d, r);
 }
 
-void helper_hsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_hsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     ZMMReg r;
 
@@ -1147,7 +1147,7 @@ void helper_hsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     MOVE(*d, r);
 }
 
-void helper_hsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     ZMMReg r;
 
@@ -1156,7 +1156,7 @@ void helper_hsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     MOVE(*d, r);
 }
 
-void helper_addsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_addsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     d->ZMM_S(0) = float32_sub(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
     d->ZMM_S(1) = float32_add(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
@@ -1164,7 +1164,7 @@ void helper_addsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     d->ZMM_S(3) = float32_add(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
 }
 
-void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     d->ZMM_D(0) = float64_sub(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
     d->ZMM_D(1) = float64_add(d->ZMM_D(1), s->ZMM_D(1), &env->sse_status);
@@ -1172,7 +1172,7 @@ void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 
 /* XXX: unordered */
 #define SSE_HELPER_CMP(name, F)                                         \
-    void helper_ ## name ## ps(CPUX86State *env, Reg *d, Reg *s)        \
+    void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)\
     {                                                                   \
         d->ZMM_L(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));                  \
         d->ZMM_L(1) = F(32, d->ZMM_S(1), s->ZMM_S(1));                  \
@@ -1185,7 +1185,7 @@ void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s)
         d->ZMM_L(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));                  \
     }                                                                   \
                                                                         \
-    void helper_ ## name ## pd(CPUX86State *env, Reg *d, Reg *s)        \
+    void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)\
     {                                                                   \
         d->ZMM_Q(0) = F(64, d->ZMM_D(0), s->ZMM_D(0));                  \
         d->ZMM_Q(1) = F(64, d->ZMM_D(1), s->ZMM_D(1));                  \
@@ -1268,7 +1268,7 @@ void helper_comisd(CPUX86State *env, Reg *d, Reg *s)
     CC_SRC = comis_eflags[ret + 1];
 }
 
-uint32_t helper_movmskps(CPUX86State *env, Reg *s)
+uint32_t glue(helper_movmskps, SUFFIX)(CPUX86State *env, Reg *s)
 {
     int b0, b1, b2, b3;
 
@@ -1279,7 +1279,7 @@ uint32_t helper_movmskps(CPUX86State *env, Reg *s)
     return b0 | (b1 << 1) | (b2 << 2) | (b3 << 3);
 }
 
-uint32_t helper_movmskpd(CPUX86State *env, Reg *s)
+uint32_t glue(helper_movmskpd, SUFFIX)(CPUX86State *env, Reg *s)
 {
     int b0, b1;
 
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index 7e7f2cee2a..b8b0666f61 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -126,8 +126,8 @@ DEF_HELPER_2(glue(movq_mm_T0, SUFFIX), void, Reg, i64)
 #if SHIFT == 0
 DEF_HELPER_3(glue(pshufw, SUFFIX), void, Reg, Reg, int)
 #else
-DEF_HELPER_3(shufps, void, Reg, Reg, int)
-DEF_HELPER_3(shufpd, void, Reg, Reg, int)
+DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int)
+DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int)
 DEF_HELPER_3(glue(pshufd, SUFFIX), void, Reg, Reg, int)
 DEF_HELPER_3(glue(pshuflw, SUFFIX), void, Reg, Reg, int)
 DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int)
@@ -138,9 +138,9 @@ DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int)
 /* XXX: not accurate */
 
 #define SSE_HELPER_S(name, F)                            \
-    DEF_HELPER_3(name ## ps, void, env, Reg, Reg)        \
+    DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg)        \
     DEF_HELPER_3(name ## ss, void, env, Reg, Reg)        \
-    DEF_HELPER_3(name ## pd, void, env, Reg, Reg)        \
+    DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg)        \
     DEF_HELPER_3(name ## sd, void, env, Reg, Reg)
 
 SSE_HELPER_S(add, FPU_ADD)
@@ -152,12 +152,12 @@ SSE_HELPER_S(max, FPU_MAX)
 SSE_HELPER_S(sqrt, FPU_SQRT)
 
 
-DEF_HELPER_3(cvtps2pd, void, env, Reg, Reg)
-DEF_HELPER_3(cvtpd2ps, void, env, Reg, Reg)
+DEF_HELPER_3(glue(cvtps2pd, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(cvtpd2ps, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(cvtss2sd, void, env, Reg, Reg)
 DEF_HELPER_3(cvtsd2ss, void, env, Reg, Reg)
-DEF_HELPER_3(cvtdq2ps, void, env, Reg, Reg)
-DEF_HELPER_3(cvtdq2pd, void, env, Reg, Reg)
+DEF_HELPER_3(glue(cvtdq2ps, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(cvtdq2pd, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(cvtpi2ps, void, env, ZMMReg, MMXReg)
 DEF_HELPER_3(cvtpi2pd, void, env, ZMMReg, MMXReg)
 DEF_HELPER_3(cvtsi2ss, void, env, ZMMReg, i32)
@@ -168,8 +168,8 @@ DEF_HELPER_3(cvtsq2ss, void, env, ZMMReg, i64)
 DEF_HELPER_3(cvtsq2sd, void, env, ZMMReg, i64)
 #endif
 
-DEF_HELPER_3(cvtps2dq, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(cvtpd2dq, void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(cvtps2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(cvtpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(cvtps2pi, void, env, MMXReg, ZMMReg)
 DEF_HELPER_3(cvtpd2pi, void, env, MMXReg, ZMMReg)
 DEF_HELPER_2(cvtss2si, s32, env, ZMMReg)
@@ -179,8 +179,8 @@ DEF_HELPER_2(cvtss2sq, s64, env, ZMMReg)
 DEF_HELPER_2(cvtsd2sq, s64, env, ZMMReg)
 #endif
 
-DEF_HELPER_3(cvttps2dq, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(cvttpd2dq, void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(cvttps2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(cvttpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(cvttps2pi, void, env, MMXReg, ZMMReg)
 DEF_HELPER_3(cvttpd2pi, void, env, MMXReg, ZMMReg)
 DEF_HELPER_2(cvttss2si, s32, env, ZMMReg)
@@ -190,25 +190,25 @@ DEF_HELPER_2(cvttss2sq, s64, env, ZMMReg)
 DEF_HELPER_2(cvttsd2sq, s64, env, ZMMReg)
 #endif
 
-DEF_HELPER_3(rsqrtps, void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(rsqrtps, SUFFIX), void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(rsqrtss, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(rcpps, void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(rcpps, SUFFIX), void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(rcpss, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(extrq_r, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_4(extrq_i, void, env, ZMMReg, int, int)
 DEF_HELPER_3(insertq_r, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_4(insertq_i, void, env, ZMMReg, int, int)
-DEF_HELPER_3(haddps, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(haddpd, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(hsubps, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(hsubpd, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(addsubps, void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(addsubpd, void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(haddps, SUFFIX), void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(haddpd, SUFFIX), void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(hsubps, SUFFIX), void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(hsubpd, SUFFIX), void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(addsubps, SUFFIX), void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(addsubpd, SUFFIX), void, env, ZMMReg, ZMMReg)
 
 #define SSE_HELPER_CMP(name, F)                           \
-    DEF_HELPER_3(name ## ps, void, env, Reg, Reg)         \
+    DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg)         \
     DEF_HELPER_3(name ## ss, void, env, Reg, Reg)         \
-    DEF_HELPER_3(name ## pd, void, env, Reg, Reg)         \
+    DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg)         \
     DEF_HELPER_3(name ## sd, void, env, Reg, Reg)
 
 SSE_HELPER_CMP(cmpeq, FPU_CMPEQ)
@@ -224,8 +224,8 @@ DEF_HELPER_3(ucomiss, void, env, Reg, Reg)
 DEF_HELPER_3(comiss, void, env, Reg, Reg)
 DEF_HELPER_3(ucomisd, void, env, Reg, Reg)
 DEF_HELPER_3(comisd, void, env, Reg, Reg)
-DEF_HELPER_2(movmskps, i32, env, Reg)
-DEF_HELPER_2(movmskpd, i32, env, Reg)
+DEF_HELPER_2(glue(movmskps, SUFFIX), i32, env, Reg)
+DEF_HELPER_2(glue(movmskpd, SUFFIX), i32, env, Reg)
 #endif
 
 DEF_HELPER_2(glue(pmovmskb, SUFFIX), i32, env, Reg)
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index e9e6062b7f..63b32a77e3 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2807,7 +2807,7 @@ typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
         gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL)
 
 #define SSE_FOP(name) OP(op2, SSE_OPF_SCALAR, \
-        gen_helper_##name##ps, gen_helper_##name##pd, \
+        gen_helper_##name##ps##_xmm, gen_helper_##name##pd##_xmm, \
         gen_helper_##name##ss, gen_helper_##name##sd)
 #define SSE_OP(sname, dname, op, flags) OP(op, flags, \
         gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL)
@@ -2846,12 +2846,12 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
             gen_helper_comiss, gen_helper_comisd, NULL, NULL),
     [0x50] = SSE_SPECIAL, /* movmskps, movmskpd */
     [0x51] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
-                gen_helper_sqrtps, gen_helper_sqrtpd,
+                gen_helper_sqrtps_xmm, gen_helper_sqrtpd_xmm,
                 gen_helper_sqrtss, gen_helper_sqrtsd),
     [0x52] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
-                gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL),
+                gen_helper_rsqrtps_xmm, NULL, gen_helper_rsqrtss, NULL),
     [0x53] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
-                gen_helper_rcpps, NULL, gen_helper_rcpss, NULL),
+                gen_helper_rcpps_xmm, NULL, gen_helper_rcpss, NULL),
     [0x54] = SSE_OP(pand, pand, op2, 0), /* andps, andpd */
     [0x55] = SSE_OP(pandn, pandn, op2, 0), /* andnps, andnpd */
     [0x56] = SSE_OP(por, por, op2, 0), /* orps, orpd */
@@ -2859,19 +2859,19 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     [0x58] = SSE_FOP(add),
     [0x59] = SSE_FOP(mul),
     [0x5a] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
-                gen_helper_cvtps2pd, gen_helper_cvtpd2ps,
+                gen_helper_cvtps2pd_xmm, gen_helper_cvtpd2ps_xmm,
                 gen_helper_cvtss2sd, gen_helper_cvtsd2ss),
     [0x5b] = OP(op1, SSE_OPF_V0,
-                gen_helper_cvtdq2ps, gen_helper_cvtps2dq,
-                gen_helper_cvttps2dq, NULL),
+                gen_helper_cvtdq2ps_xmm, gen_helper_cvtps2dq_xmm,
+                gen_helper_cvttps2dq_xmm, NULL),
     [0x5c] = SSE_FOP(sub),
     [0x5d] = SSE_FOP(min),
     [0x5e] = SSE_FOP(div),
     [0x5f] = SSE_FOP(max),
 
     [0xc2] = SSE_FOP(cmpeq), /* sse_op_table4 */
-    [0xc6] = OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps,
-                (SSEFunc_0_epp)gen_helper_shufpd, NULL, NULL),
+    [0xc6] = OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps_xmm,
+                (SSEFunc_0_epp)gen_helper_shufpd_xmm, NULL, NULL),
 
     /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX.  */
     [0x38] = SSE_SPECIAL,
@@ -2912,15 +2912,15 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     [0x79] = OP(op1, SSE_OPF_V0,
             NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r),
     [0x7c] = OP(op2, 0,
-                NULL, gen_helper_haddpd, NULL, gen_helper_haddps),
+                NULL, gen_helper_haddpd_xmm, NULL, gen_helper_haddps_xmm),
     [0x7d] = OP(op2, 0,
-                NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps),
+                NULL, gen_helper_hsubpd_xmm, NULL, gen_helper_hsubps_xmm),
     [0x7e] = SSE_SPECIAL, /* movd, movd, , movq */
     [0x7f] = SSE_SPECIAL, /* movq, movdqa, movdqu */
     [0xc4] = SSE_SPECIAL, /* pinsrw */
     [0xc5] = SSE_SPECIAL, /* pextrw */
     [0xd0] = OP(op2, 0,
-                NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps),
+                NULL, gen_helper_addsubpd_xmm, NULL, gen_helper_addsubps_xmm),
     [0xd1] = MMX_OP(psrlw),
     [0xd2] = MMX_OP(psrld),
     [0xd3] = MMX_OP(psrlq),
@@ -2943,8 +2943,8 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     [0xe4] = MMX_OP(pmulhuw),
     [0xe5] = MMX_OP(pmulhw),
     [0xe6] = OP(op1, SSE_OPF_V0,
-            NULL, gen_helper_cvttpd2dq,
-            gen_helper_cvtdq2pd, gen_helper_cvtpd2dq),
+            NULL, gen_helper_cvttpd2dq_xmm,
+            gen_helper_cvtdq2pd_xmm, gen_helper_cvtpd2dq_xmm),
     [0xe7] = SSE_SPECIAL,  /* movntq, movntq */
     [0xe8] = MMX_OP(psubsb),
     [0xe9] = MMX_OP(psubsw),
@@ -3021,8 +3021,9 @@ static const SSEFunc_l_ep sse_op_table3bq[] = {
 };
 #endif
 
-#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \
-                     gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, }
+#define SSE_FOP(x) { \
+    gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \
+    gen_helper_ ## x ## ss, gen_helper_ ## x ## sd}
 static const SSEFunc_0_epp sse_op_table4[8][4] = {
     SSE_FOP(cmpeq),
     SSE_FOP(cmplt),
@@ -3718,14 +3719,14 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             CHECK_AVX_V0(s);
             rm = (modrm & 7) | REX_B(s);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
-            gen_helper_movmskps(s->tmp2_i32, cpu_env, s->ptr0);
+            gen_helper_movmskps_xmm(s->tmp2_i32, cpu_env, s->ptr0);
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
         case 0x150: /* movmskpd */
             CHECK_AVX_V0(s);
             rm = (modrm & 7) | REX_B(s);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
-            gen_helper_movmskpd(s->tmp2_i32, cpu_env, s->ptr0);
+            gen_helper_movmskpd_xmm(s->tmp2_i32, cpu_env, s->ptr0);
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
         case 0x02a: /* cvtpi2ps */
-- 
2.36.0




* [PATCH v2 15/42] i386: Floating point arithmetic helper AVX prep
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (17 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 14/42] i386: Add size suffix to vector FP helpers Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 16/42] i386: Dot product AVX helper prep Paul Brook
                   ` (26 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Prepare the "easy" floating point vector helpers for AVX.

No functional changes to existing helpers.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h | 144 ++++++++++++++++++++++++++++++++++--------
 1 file changed, 119 insertions(+), 25 deletions(-)
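
The per-width helper generation used above can be illustrated with a
minimal standalone sketch.  It is not the QEMU code: vec256 and
DEFINE_PACKED_OP are hypothetical names, native float arithmetic stands in
for the softfloat calls, and the element count plays the role that SHIFT
plays in ops_sse.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a 256-bit register: 8 single-precision elements. */
typedef struct { float s[8]; } vec256;

/*
 * Generate one helper per width.  "nelem" plays the role of SHIFT:
 * 4 elements for the xmm variant, 8 for the ymm variant.  Elements at
 * index nelem and above are left untouched.
 */
#define DEFINE_PACKED_OP(name, suffix, nelem, expr)                     \
    static void name##ps##suffix(vec256 *d, const vec256 *s)            \
    {                                                                   \
        const vec256 *v = d; /* first source; aliases d for now */      \
        assert(nelem == 4 || nelem == 8);                               \
        for (size_t i = 0; i < nelem; i++) {                            \
            float a = v->s[i], b = s->s[i];                             \
            d->s[i] = (expr);                                           \
        }                                                               \
    }

DEFINE_PACKED_OP(add, _xmm, 4, a + b)
DEFINE_PACKED_OP(add, _ymm, 8, a + b)
```

Here addps_xmm() touches only elements 0..3, while addps_ymm() processes
all eight.  The separate "v" pointer mirrors the "Reg *v = d" lines in the
patch, which presumably leave room for a distinct first source operand
later in the series.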

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index fc8fd57aa5..d308a1ec40 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -762,40 +762,66 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int order)
 }
 #endif
 
-#if SHIFT == 1
+#if SHIFT >= 1
 /* FPU ops */
 /* XXX: not accurate */
 
-#define SSE_HELPER_S(name, F)                                           \
-    void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)\
+#define SSE_HELPER_P(name, F)                                           \
+    void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env,          \
+            Reg *d, Reg *s)                                     \
     {                                                                   \
-        d->ZMM_S(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));                  \
-        d->ZMM_S(1) = F(32, d->ZMM_S(1), s->ZMM_S(1));                  \
-        d->ZMM_S(2) = F(32, d->ZMM_S(2), s->ZMM_S(2));                  \
-        d->ZMM_S(3) = F(32, d->ZMM_S(3), s->ZMM_S(3));                  \
+        Reg *v = d;                                                     \
+        d->ZMM_S(0) = F(32, v->ZMM_S(0), s->ZMM_S(0));                  \
+        d->ZMM_S(1) = F(32, v->ZMM_S(1), s->ZMM_S(1));                  \
+        d->ZMM_S(2) = F(32, v->ZMM_S(2), s->ZMM_S(2));                  \
+        d->ZMM_S(3) = F(32, v->ZMM_S(3), s->ZMM_S(3));                  \
+        YMM_ONLY(                                                       \
+        d->ZMM_S(4) = F(32, v->ZMM_S(4), s->ZMM_S(4));                  \
+        d->ZMM_S(5) = F(32, v->ZMM_S(5), s->ZMM_S(5));                  \
+        d->ZMM_S(6) = F(32, v->ZMM_S(6), s->ZMM_S(6));                  \
+        d->ZMM_S(7) = F(32, v->ZMM_S(7), s->ZMM_S(7));                  \
+        )                                                               \
     }                                                                   \
                                                                         \
-    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s)        \
+    void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env,          \
+            Reg *d, Reg *s)                                     \
     {                                                                   \
-        d->ZMM_S(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));                  \
-    }                                                                   \
+        Reg *v = d;                                                     \
+        d->ZMM_D(0) = F(64, v->ZMM_D(0), s->ZMM_D(0));                  \
+        d->ZMM_D(1) = F(64, v->ZMM_D(1), s->ZMM_D(1));                  \
+        YMM_ONLY(                                                       \
+        d->ZMM_D(2) = F(64, v->ZMM_D(2), s->ZMM_D(2));                  \
+        d->ZMM_D(3) = F(64, v->ZMM_D(3), s->ZMM_D(3));                  \
+        )                                                               \
+    }
+
+#if SHIFT == 1
+
+#define SSE_HELPER_S(name, F)                                           \
+    SSE_HELPER_P(name, F)                                               \
                                                                         \
-    void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)\
+    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s)\
     {                                                                   \
-        d->ZMM_D(0) = F(64, d->ZMM_D(0), s->ZMM_D(0));                  \
-        d->ZMM_D(1) = F(64, d->ZMM_D(1), s->ZMM_D(1));                  \
+        Reg *v = d;                                                     \
+        d->ZMM_S(0) = F(32, v->ZMM_S(0), s->ZMM_S(0));                  \
     }                                                                   \
                                                                         \
-    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s)        \
+    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s)\
     {                                                                   \
-        d->ZMM_D(0) = F(64, d->ZMM_D(0), s->ZMM_D(0));                  \
+        Reg *v = d;                                                     \
+        d->ZMM_D(0) = F(64, v->ZMM_D(0), s->ZMM_D(0));                  \
     }
 
+#else
+
+#define SSE_HELPER_S(name, F) SSE_HELPER_P(name, F)
+
+#endif
+
 #define FPU_ADD(size, a, b) float ## size ## _add(a, b, &env->sse_status)
 #define FPU_SUB(size, a, b) float ## size ## _sub(a, b, &env->sse_status)
 #define FPU_MUL(size, a, b) float ## size ## _mul(a, b, &env->sse_status)
 #define FPU_DIV(size, a, b) float ## size ## _div(a, b, &env->sse_status)
-#define FPU_SQRT(size, a, b) float ## size ## _sqrt(b, &env->sse_status)
 
 /* Note that the choice of comparison op here is important to get the
  * special cases right: for min and max Intel specifies that (-0,0),
@@ -812,8 +838,42 @@ SSE_HELPER_S(mul, FPU_MUL)
 SSE_HELPER_S(div, FPU_DIV)
 SSE_HELPER_S(min, FPU_MIN)
 SSE_HELPER_S(max, FPU_MAX)
-SSE_HELPER_S(sqrt, FPU_SQRT)
 
+void glue(helper_sqrtps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    d->ZMM_S(0) = float32_sqrt(s->ZMM_S(0), &env->sse_status);
+    d->ZMM_S(1) = float32_sqrt(s->ZMM_S(1), &env->sse_status);
+    d->ZMM_S(2) = float32_sqrt(s->ZMM_S(2), &env->sse_status);
+    d->ZMM_S(3) = float32_sqrt(s->ZMM_S(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = float32_sqrt(s->ZMM_S(4), &env->sse_status);
+    d->ZMM_S(5) = float32_sqrt(s->ZMM_S(5), &env->sse_status);
+    d->ZMM_S(6) = float32_sqrt(s->ZMM_S(6), &env->sse_status);
+    d->ZMM_S(7) = float32_sqrt(s->ZMM_S(7), &env->sse_status);
+#endif
+}
+
+void glue(helper_sqrtpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    d->ZMM_D(0) = float64_sqrt(s->ZMM_D(0), &env->sse_status);
+    d->ZMM_D(1) = float64_sqrt(s->ZMM_D(1), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_D(2) = float64_sqrt(s->ZMM_D(2), &env->sse_status);
+    d->ZMM_D(3) = float64_sqrt(s->ZMM_D(3), &env->sse_status);
+#endif
+}
+
+#if SHIFT == 1
+void helper_sqrtss(CPUX86State *env, Reg *d, Reg *s)
+{
+    d->ZMM_S(0) = float32_sqrt(s->ZMM_S(0), &env->sse_status);
+}
+
+void helper_sqrtsd(CPUX86State *env, Reg *d, Reg *s)
+{
+    d->ZMM_D(0) = float64_sqrt(s->ZMM_D(0), &env->sse_status);
+}
+#endif
 
 /* float to float conversions */
 void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
@@ -1043,6 +1103,20 @@ void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     d->ZMM_S(3) = float32_div(float32_one,
                               float32_sqrt(s->ZMM_S(3), &env->sse_status),
                               &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = float32_div(float32_one,
+                              float32_sqrt(s->ZMM_S(4), &env->sse_status),
+                              &env->sse_status);
+    d->ZMM_S(5) = float32_div(float32_one,
+                              float32_sqrt(s->ZMM_S(5), &env->sse_status),
+                              &env->sse_status);
+    d->ZMM_S(6) = float32_div(float32_one,
+                              float32_sqrt(s->ZMM_S(6), &env->sse_status),
+                              &env->sse_status);
+    d->ZMM_S(7) = float32_div(float32_one,
+                              float32_sqrt(s->ZMM_S(7), &env->sse_status),
+                              &env->sse_status);
+#endif
     set_float_exception_flags(old_flags, &env->sse_status);
 }
 
@@ -1062,6 +1136,12 @@ void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     d->ZMM_S(1) = float32_div(float32_one, s->ZMM_S(1), &env->sse_status);
     d->ZMM_S(2) = float32_div(float32_one, s->ZMM_S(2), &env->sse_status);
     d->ZMM_S(3) = float32_div(float32_one, s->ZMM_S(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = float32_div(float32_one, s->ZMM_S(4), &env->sse_status);
+    d->ZMM_S(5) = float32_div(float32_one, s->ZMM_S(5), &env->sse_status);
+    d->ZMM_S(6) = float32_div(float32_one, s->ZMM_S(6), &env->sse_status);
+    d->ZMM_S(7) = float32_div(float32_one, s->ZMM_S(7), &env->sse_status);
+#endif
     set_float_exception_flags(old_flags, &env->sse_status);
 }
 
@@ -1156,18 +1236,30 @@ void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     MOVE(*d, r);
 }
 
-void glue(helper_addsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
-    d->ZMM_S(0) = float32_sub(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
-    d->ZMM_S(1) = float32_add(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
-    d->ZMM_S(2) = float32_sub(d->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
-    d->ZMM_S(3) = float32_add(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
+    Reg *v = d;
+    d->ZMM_S(0) = float32_sub(v->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
+    d->ZMM_S(1) = float32_add(v->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
+    d->ZMM_S(2) = float32_sub(v->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
+    d->ZMM_S(3) = float32_add(v->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = float32_sub(v->ZMM_S(4), s->ZMM_S(4), &env->sse_status);
+    d->ZMM_S(5) = float32_add(v->ZMM_S(5), s->ZMM_S(5), &env->sse_status);
+    d->ZMM_S(6) = float32_sub(v->ZMM_S(6), s->ZMM_S(6), &env->sse_status);
+    d->ZMM_S(7) = float32_add(v->ZMM_S(7), s->ZMM_S(7), &env->sse_status);
+#endif
 }
 
-void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
-    d->ZMM_D(0) = float64_sub(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
-    d->ZMM_D(1) = float64_add(d->ZMM_D(1), s->ZMM_D(1), &env->sse_status);
+    Reg *v = d;
+    d->ZMM_D(0) = float64_sub(v->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
+    d->ZMM_D(1) = float64_add(v->ZMM_D(1), s->ZMM_D(1), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_D(2) = float64_sub(v->ZMM_D(2), s->ZMM_D(2), &env->sse_status);
+    d->ZMM_D(3) = float64_add(v->ZMM_D(3), s->ZMM_D(3), &env->sse_status);
+#endif
 }
 
 /* XXX: unordered */
@@ -2694,6 +2786,8 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 }
 #endif
 
+#undef SSE_HELPER_S
+
 #undef SHIFT
 #undef XMM_ONLY
 #undef YMM_ONLY
-- 
2.36.0




* [PATCH v2 16/42] i386: Dot product AVX helper prep
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (18 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 15/42] i386: Floating point arithmetic helper AVX prep Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 17/42] i386: Destructive FP helpers for AVX Paul Brook
                   ` (25 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Make the dpps and dppd helpers AVX-ready.

I can't see any obvious reason why dppd shouldn't work on 256-bit ymm
registers, but both AMD and Intel agree that it's xmm only.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h | 54 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 46 insertions(+), 8 deletions(-)
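
As a rough illustration of the mask handling, here is a standalone sketch
of a two-lane dpps.  The names dpps_lane and dpps_ymm_sketch are
hypothetical, and native float arithmetic replaces the softfloat calls, so
rounding and exception flags are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Dot product of one 128-bit lane (four floats).  Bits 4..7 of the mask
 * select which products are summed; bits 0..3 select which result
 * elements receive the sum.  d may alias v, so every product is computed
 * before any element of d is written.
 */
static void dpps_lane(float *d, const float *v, const float *s, uint32_t mask)
{
    float p[4], sum;
    int i;

    for (i = 0; i < 4; i++) {
        p[i] = (mask & (1u << (4 + i))) ? v[i] * s[i] : 0.0f;
    }
    /* Same association as the helper: (p0 + p1) + (p2 + p3). */
    sum = (p[0] + p[1]) + (p[2] + p[3]);

    for (i = 0; i < 4; i++) {
        d[i] = (mask & (1u << i)) ? sum : 0.0f;
    }
}

/* The same 8-bit mask is applied to each 128-bit lane independently. */
static void dpps_ymm_sketch(float d[8], const float s[8], uint32_t mask)
{
    assert(mask <= 0xff); /* the mask is an 8-bit immediate */
    dpps_lane(d, d, s, mask);
    dpps_lane(d + 4, d + 4, s + 4, mask);
}
```

With mask 0xF1 every product in a lane is summed and the sum lands only in
the lane's first element; the remaining three elements become zero.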

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index d308a1ec40..4137e6e1fa 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2366,8 +2366,10 @@ SSE_HELPER_I(helper_blendps, L, 4, FBLENDP)
 SSE_HELPER_I(helper_blendpd, Q, 2, FBLENDP)
 SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP)
 
-void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
+void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+                               uint32_t mask)
 {
+    Reg *v = d;
     float32 prod, iresult, iresult2;
 
     /*
@@ -2375,23 +2377,23 @@ void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
      * to correctly round the intermediate results
      */
     if (mask & (1 << 4)) {
-        iresult = float32_mul(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
+        iresult = float32_mul(v->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
     } else {
         iresult = float32_zero;
     }
     if (mask & (1 << 5)) {
-        prod = float32_mul(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
+        prod = float32_mul(v->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
     } else {
         prod = float32_zero;
     }
     iresult = float32_add(iresult, prod, &env->sse_status);
     if (mask & (1 << 6)) {
-        iresult2 = float32_mul(d->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
+        iresult2 = float32_mul(v->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
     } else {
         iresult2 = float32_zero;
     }
     if (mask & (1 << 7)) {
-        prod = float32_mul(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
+        prod = float32_mul(v->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
     } else {
         prod = float32_zero;
     }
@@ -2402,26 +2404,62 @@ void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
     d->ZMM_S(1) = (mask & (1 << 1)) ? iresult : float32_zero;
     d->ZMM_S(2) = (mask & (1 << 2)) ? iresult : float32_zero;
     d->ZMM_S(3) = (mask & (1 << 3)) ? iresult : float32_zero;
+#if SHIFT == 2
+    if (mask & (1 << 4)) {
+        iresult = float32_mul(v->ZMM_S(4), s->ZMM_S(4), &env->sse_status);
+    } else {
+        iresult = float32_zero;
+    }
+    if (mask & (1 << 5)) {
+        prod = float32_mul(v->ZMM_S(5), s->ZMM_S(5), &env->sse_status);
+    } else {
+        prod = float32_zero;
+    }
+    iresult = float32_add(iresult, prod, &env->sse_status);
+    if (mask & (1 << 6)) {
+        iresult2 = float32_mul(v->ZMM_S(6), s->ZMM_S(6), &env->sse_status);
+    } else {
+        iresult2 = float32_zero;
+    }
+    if (mask & (1 << 7)) {
+        prod = float32_mul(v->ZMM_S(7), s->ZMM_S(7), &env->sse_status);
+    } else {
+        prod = float32_zero;
+    }
+    iresult2 = float32_add(iresult2, prod, &env->sse_status);
+    iresult = float32_add(iresult, iresult2, &env->sse_status);
+
+    d->ZMM_S(4) = (mask & (1 << 0)) ? iresult : float32_zero;
+    d->ZMM_S(5) = (mask & (1 << 1)) ? iresult : float32_zero;
+    d->ZMM_S(6) = (mask & (1 << 2)) ? iresult : float32_zero;
+    d->ZMM_S(7) = (mask & (1 << 3)) ? iresult : float32_zero;
+#endif
 }
 
-void glue(helper_dppd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
+#if SHIFT == 1
+/* Oddly, there is no ymm version of dppd */
+void glue(helper_dppd, SUFFIX)(CPUX86State *env,
+                               Reg *d, Reg *s, uint32_t mask)
 {
+    Reg *v = d;
     float64 iresult;
 
     if (mask & (1 << 4)) {
-        iresult = float64_mul(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
+        iresult = float64_mul(v->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
     } else {
         iresult = float64_zero;
     }
+
     if (mask & (1 << 5)) {
         iresult = float64_add(iresult,
-                              float64_mul(d->ZMM_D(1), s->ZMM_D(1),
+                              float64_mul(v->ZMM_D(1), s->ZMM_D(1),
                                           &env->sse_status),
                               &env->sse_status);
     }
     d->ZMM_D(0) = (mask & (1 << 0)) ? iresult : float64_zero;
     d->ZMM_D(1) = (mask & (1 << 1)) ? iresult : float64_zero;
 }
+#endif
 
 void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                   uint32_t offset)
-- 
2.36.0




* [PATCH v2 17/42] i386: Destructive FP helpers for AVX
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (19 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 16/42] i386: Dot product AVX helper prep Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 18/42] i386: Misc AVX helper prep Paul Brook
                   ` (24 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Prepare the horizontal arithmetic vector helpers for AVX.
These currently use a dummy Reg-typed variable to store the result, then
assign the whole register.  This will cause 128-bit operations to corrupt
the upper half of the register, so replace it with explicit temporaries
and element assignments.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h | 96 +++++++++++++++++++++++++++++++------------
 1 file changed, 70 insertions(+), 26 deletions(-)
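
The difference between the two patterns can be seen in a small standalone
sketch.  vec256 and both function names are hypothetical, native float
addition replaces float32_add, and the scratch register is zero-initialised
only to make the effect deterministic (in the original helpers its upper
elements are simply never set).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 256-bit register holding 8 single-precision elements. */
typedef struct { float s[8]; } vec256;

/*
 * Old pattern: build the 128-bit result in a scratch register and copy
 * the whole register back.  Elements 4..7 of the destination are
 * overwritten by whatever the scratch register holds.
 */
static void hadd128_whole_copy(vec256 *d, const vec256 *s)
{
    vec256 r = { { 0 } };

    assert(d != NULL && s != NULL);
    r.s[0] = d->s[0] + d->s[1];
    r.s[1] = d->s[2] + d->s[3];
    r.s[2] = s->s[0] + s->s[1];
    r.s[3] = s->s[2] + s->s[3];
    *d = r;
}

/*
 * New pattern: compute into scalar temporaries, then assign only the
 * elements the operation defines.  Elements 4..7 are preserved, and d
 * may alias s because all reads happen before the first write.
 */
static void hadd128_elementwise(vec256 *d, const vec256 *s)
{
    float r0, r1, r2, r3;

    assert(d != NULL && s != NULL);
    r0 = d->s[0] + d->s[1];
    r1 = d->s[2] + d->s[3];
    r2 = s->s[0] + s->s[1];
    r3 = s->s[2] + s->s[3];
    d->s[0] = r0;
    d->s[1] = r1;
    d->s[2] = r2;
    d->s[3] = r3;
}
```

Both functions produce the same low four elements; only the element-wise
version leaves the upper half of the destination intact.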

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 4137e6e1fa..d128af6cc8 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -1196,44 +1196,88 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, int index, int length)
     d->ZMM_Q(0) = helper_insertq(d->ZMM_Q(0), index, length);
 }
 
-void glue(helper_haddps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_haddps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
-    ZMMReg r;
-
-    r.ZMM_S(0) = float32_add(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(1) = float32_add(d->ZMM_S(2), d->ZMM_S(3), &env->sse_status);
-    r.ZMM_S(2) = float32_add(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(3) = float32_add(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status);
-    MOVE(*d, r);
+    Reg *v = d;
+    float32 r0, r1, r2, r3;
+
+    r0 = float32_add(v->ZMM_S(0), v->ZMM_S(1), &env->sse_status);
+    r1 = float32_add(v->ZMM_S(2), v->ZMM_S(3), &env->sse_status);
+    r2 = float32_add(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status);
+    r3 = float32_add(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status);
+    d->ZMM_S(0) = r0;
+    d->ZMM_S(1) = r1;
+    d->ZMM_S(2) = r2;
+    d->ZMM_S(3) = r3;
+#if SHIFT == 2
+    r0 = float32_add(v->ZMM_S(4), v->ZMM_S(5), &env->sse_status);
+    r1 = float32_add(v->ZMM_S(6), v->ZMM_S(7), &env->sse_status);
+    r2 = float32_add(s->ZMM_S(4), s->ZMM_S(5), &env->sse_status);
+    r3 = float32_add(s->ZMM_S(6), s->ZMM_S(7), &env->sse_status);
+    d->ZMM_S(4) = r0;
+    d->ZMM_S(5) = r1;
+    d->ZMM_S(6) = r2;
+    d->ZMM_S(7) = r3;
+#endif
 }
 
-void glue(helper_haddpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_haddpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
-    ZMMReg r;
+    Reg *v = d;
+    float64 r0, r1;
 
-    r.ZMM_D(0) = float64_add(d->ZMM_D(0), d->ZMM_D(1), &env->sse_status);
-    r.ZMM_D(1) = float64_add(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status);
-    MOVE(*d, r);
+    r0 = float64_add(v->ZMM_D(0), v->ZMM_D(1), &env->sse_status);
+    r1 = float64_add(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status);
+    d->ZMM_D(0) = r0;
+    d->ZMM_D(1) = r1;
+#if SHIFT == 2
+    r0 = float64_add(v->ZMM_D(2), v->ZMM_D(3), &env->sse_status);
+    r1 = float64_add(s->ZMM_D(2), s->ZMM_D(3), &env->sse_status);
+    d->ZMM_D(2) = r0;
+    d->ZMM_D(3) = r1;
+#endif
 }
 
-void glue(helper_hsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_hsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
-    ZMMReg r;
-
-    r.ZMM_S(0) = float32_sub(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(1) = float32_sub(d->ZMM_S(2), d->ZMM_S(3), &env->sse_status);
-    r.ZMM_S(2) = float32_sub(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(3) = float32_sub(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status);
-    MOVE(*d, r);
+    Reg *v = d;
+    float32 r0, r1, r2, r3;
+
+    r0 = float32_sub(v->ZMM_S(0), v->ZMM_S(1), &env->sse_status);
+    r1 = float32_sub(v->ZMM_S(2), v->ZMM_S(3), &env->sse_status);
+    r2 = float32_sub(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status);
+    r3 = float32_sub(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status);
+    d->ZMM_S(0) = r0;
+    d->ZMM_S(1) = r1;
+    d->ZMM_S(2) = r2;
+    d->ZMM_S(3) = r3;
+#if SHIFT == 2
+    r0 = float32_sub(v->ZMM_S(4), v->ZMM_S(5), &env->sse_status);
+    r1 = float32_sub(v->ZMM_S(6), v->ZMM_S(7), &env->sse_status);
+    r2 = float32_sub(s->ZMM_S(4), s->ZMM_S(5), &env->sse_status);
+    r3 = float32_sub(s->ZMM_S(6), s->ZMM_S(7), &env->sse_status);
+    d->ZMM_S(4) = r0;
+    d->ZMM_S(5) = r1;
+    d->ZMM_S(6) = r2;
+    d->ZMM_S(7) = r3;
+#endif
 }
 
-void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
-    ZMMReg r;
+    Reg *v = d;
+    float64 r0, r1;
 
-    r.ZMM_D(0) = float64_sub(d->ZMM_D(0), d->ZMM_D(1), &env->sse_status);
-    r.ZMM_D(1) = float64_sub(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status);
-    MOVE(*d, r);
+    r0 = float64_sub(v->ZMM_D(0), v->ZMM_D(1), &env->sse_status);
+    r1 = float64_sub(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status);
+    d->ZMM_D(0) = r0;
+    d->ZMM_D(1) = r1;
+#if SHIFT == 2
+    r0 = float64_sub(v->ZMM_D(2), v->ZMM_D(3), &env->sse_status);
+    r1 = float64_sub(s->ZMM_D(2), s->ZMM_D(3), &env->sse_status);
+    d->ZMM_D(2) = r0;
+    d->ZMM_D(3) = r1;
+#endif
 }
 
 void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-- 
2.36.0




* [PATCH v2 18/42] i386: Misc AVX helper prep
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (20 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 17/42] i386: Destructive FP helpers for AVX Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 19/42] i386: Rewrite blendv helpers Paul Brook
                   ` (23 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Fix up various vector helpers that either trivially extend to 256 bits
or don't have 256-bit variants.

No functional changes to existing helpers.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h | 159 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 139 insertions(+), 20 deletions(-)
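
One detail worth spelling out for the widening conversions: the
destination may be the same register as the source, and each double
overlaps two source floats, so the inputs have to be read before any
output is written.  A minimal sketch of that ordering, assuming a
hypothetical vec256 union and native C conversions in place of
float32_to_float64:

```c
#include <assert.h>

/* Hypothetical 256-bit register viewed as 8 floats or 4 doubles. */
typedef union {
    float s[8];
    double d[4];
} vec256;

/*
 * Widen the low nelem floats of src into nelem doubles in dst
 * (nelem = 2 for the xmm form, 4 for the ymm form).  All inputs are
 * copied out first so that dst == src still gives the right answer.
 */
static void cvtps2pd_sketch(vec256 *dst, const vec256 *src, int nelem)
{
    float in[4];
    int i;

    assert(nelem == 2 || nelem == 4);
    for (i = 0; i < nelem; i++) {
        in[i] = src->s[i];
    }
    for (i = 0; i < nelem; i++) {
        dst->d[i] = (double)in[i];
    }
}
```

Converting in place without the temporary copy would overwrite s[2] and
s[3] when d[1] is stored, before they have been read.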

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index d128af6cc8..3202c00572 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -641,6 +641,7 @@ void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #endif
 }
 
+#if SHIFT < 2
 void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                   target_ulong a0)
 {
@@ -652,6 +653,7 @@ void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
         }
     }
 }
+#endif
 
 void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t val)
 {
@@ -882,6 +884,13 @@ void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 
     s0 = s->ZMM_S(0);
     s1 = s->ZMM_S(1);
+#if SHIFT == 2
+    float32 s2, s3;
+    s2 = s->ZMM_S(2);
+    s3 = s->ZMM_S(3);
+    d->ZMM_D(2) = float32_to_float64(s2, &env->sse_status);
+    d->ZMM_D(3) = float32_to_float64(s3, &env->sse_status);
+#endif
     d->ZMM_D(0) = float32_to_float64(s0, &env->sse_status);
     d->ZMM_D(1) = float32_to_float64(s1, &env->sse_status);
 }
@@ -890,9 +899,17 @@ void glue(helper_cvtpd2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     d->ZMM_S(0) = float64_to_float32(s->ZMM_D(0), &env->sse_status);
     d->ZMM_S(1) = float64_to_float32(s->ZMM_D(1), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(2) = float64_to_float32(s->ZMM_D(2), &env->sse_status);
+    d->ZMM_S(3) = float64_to_float32(s->ZMM_D(3), &env->sse_status);
+    d->Q(2) = 0;
+    d->Q(3) = 0;
+#else
     d->Q(1) = 0;
+#endif
 }
 
+#if SHIFT == 1
 void helper_cvtss2sd(CPUX86State *env, Reg *d, Reg *s)
 {
     d->ZMM_D(0) = float32_to_float64(s->ZMM_S(0), &env->sse_status);
@@ -902,6 +919,7 @@ void helper_cvtsd2ss(CPUX86State *env, Reg *d, Reg *s)
 {
     d->ZMM_S(0) = float64_to_float32(s->ZMM_D(0), &env->sse_status);
 }
+#endif
 
 /* integer to float */
 void glue(helper_cvtdq2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
@@ -910,6 +928,12 @@ void glue(helper_cvtdq2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     d->ZMM_S(1) = int32_to_float32(s->ZMM_L(1), &env->sse_status);
     d->ZMM_S(2) = int32_to_float32(s->ZMM_L(2), &env->sse_status);
     d->ZMM_S(3) = int32_to_float32(s->ZMM_L(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = int32_to_float32(s->ZMM_L(4), &env->sse_status);
+    d->ZMM_S(5) = int32_to_float32(s->ZMM_L(5), &env->sse_status);
+    d->ZMM_S(6) = int32_to_float32(s->ZMM_L(6), &env->sse_status);
+    d->ZMM_S(7) = int32_to_float32(s->ZMM_L(7), &env->sse_status);
+#endif
 }
 
 void glue(helper_cvtdq2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
@@ -918,10 +942,18 @@ void glue(helper_cvtdq2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 
     l0 = (int32_t)s->ZMM_L(0);
     l1 = (int32_t)s->ZMM_L(1);
+#if SHIFT == 2
+    int32_t l2, l3;
+    l2 = (int32_t)s->ZMM_L(2);
+    l3 = (int32_t)s->ZMM_L(3);
+    d->ZMM_D(2) = int32_to_float64(l2, &env->sse_status);
+    d->ZMM_D(3) = int32_to_float64(l3, &env->sse_status);
+#endif
     d->ZMM_D(0) = int32_to_float64(l0, &env->sse_status);
     d->ZMM_D(1) = int32_to_float64(l1, &env->sse_status);
 }
 
+#if SHIFT == 1
 void helper_cvtpi2ps(CPUX86State *env, ZMMReg *d, MMXReg *s)
 {
     d->ZMM_S(0) = int32_to_float32(s->MMX_L(0), &env->sse_status);
@@ -956,8 +988,11 @@ void helper_cvtsq2sd(CPUX86State *env, ZMMReg *d, uint64_t val)
 }
 #endif
 
+#endif
+
 /* float to integer */
 
+#if SHIFT == 1
 /*
  * x86 mandates that we return the indefinite integer value for the result
  * of any float-to-integer conversion that raises the 'invalid' exception.
@@ -988,6 +1023,7 @@ WRAP_FLOATCONV(int64_t, float32_to_int64, float32, INT64_MIN)
 WRAP_FLOATCONV(int64_t, float32_to_int64_round_to_zero, float32, INT64_MIN)
 WRAP_FLOATCONV(int64_t, float64_to_int64, float64, INT64_MIN)
 WRAP_FLOATCONV(int64_t, float64_to_int64_round_to_zero, float64, INT64_MIN)
+#endif
 
 void glue(helper_cvtps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
@@ -995,15 +1031,29 @@ void glue(helper_cvtps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     d->ZMM_L(1) = x86_float32_to_int32(s->ZMM_S(1), &env->sse_status);
     d->ZMM_L(2) = x86_float32_to_int32(s->ZMM_S(2), &env->sse_status);
     d->ZMM_L(3) = x86_float32_to_int32(s->ZMM_S(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_L(4) = x86_float32_to_int32(s->ZMM_S(4), &env->sse_status);
+    d->ZMM_L(5) = x86_float32_to_int32(s->ZMM_S(5), &env->sse_status);
+    d->ZMM_L(6) = x86_float32_to_int32(s->ZMM_S(6), &env->sse_status);
+    d->ZMM_L(7) = x86_float32_to_int32(s->ZMM_S(7), &env->sse_status);
+#endif
 }
 
 void glue(helper_cvtpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     d->ZMM_L(0) = x86_float64_to_int32(s->ZMM_D(0), &env->sse_status);
     d->ZMM_L(1) = x86_float64_to_int32(s->ZMM_D(1), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_L(2) = x86_float64_to_int32(s->ZMM_D(2), &env->sse_status);
+    d->ZMM_L(3) = x86_float64_to_int32(s->ZMM_D(3), &env->sse_status);
+    d->Q(2) = 0;
+    d->Q(3) = 0;
+#else
     d->ZMM_Q(1) = 0;
+#endif
 }
 
+#if SHIFT == 1
 void helper_cvtps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s)
 {
     d->MMX_L(0) = x86_float32_to_int32(s->ZMM_S(0), &env->sse_status);
@@ -1037,33 +1087,64 @@ int64_t helper_cvtsd2sq(CPUX86State *env, ZMMReg *s)
     return x86_float64_to_int64(s->ZMM_D(0), &env->sse_status);
 }
 #endif
+#endif
 
 /* float to integer truncated */
 void glue(helper_cvttps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
-    d->ZMM_L(0) = x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->sse_status);
-    d->ZMM_L(1) = x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->sse_status);
-    d->ZMM_L(2) = x86_float32_to_int32_round_to_zero(s->ZMM_S(2), &env->sse_status);
-    d->ZMM_L(3) = x86_float32_to_int32_round_to_zero(s->ZMM_S(3), &env->sse_status);
+    d->ZMM_L(0) = x86_float32_to_int32_round_to_zero(s->ZMM_S(0),
+                                                     &env->sse_status);
+    d->ZMM_L(1) = x86_float32_to_int32_round_to_zero(s->ZMM_S(1),
+                                                     &env->sse_status);
+    d->ZMM_L(2) = x86_float32_to_int32_round_to_zero(s->ZMM_S(2),
+                                                     &env->sse_status);
+    d->ZMM_L(3) = x86_float32_to_int32_round_to_zero(s->ZMM_S(3),
+                                                     &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_L(4) = x86_float32_to_int32_round_to_zero(s->ZMM_S(4),
+                                                     &env->sse_status);
+    d->ZMM_L(5) = x86_float32_to_int32_round_to_zero(s->ZMM_S(5),
+                                                     &env->sse_status);
+    d->ZMM_L(6) = x86_float32_to_int32_round_to_zero(s->ZMM_S(6),
+                                                     &env->sse_status);
+    d->ZMM_L(7) = x86_float32_to_int32_round_to_zero(s->ZMM_S(7),
+                                                     &env->sse_status);
+#endif
 }
 
 void glue(helper_cvttpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
-    d->ZMM_L(0) = x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->sse_status);
-    d->ZMM_L(1) = x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->sse_status);
+    d->ZMM_L(0) = x86_float64_to_int32_round_to_zero(s->ZMM_D(0),
+                                                     &env->sse_status);
+    d->ZMM_L(1) = x86_float64_to_int32_round_to_zero(s->ZMM_D(1),
+                                                     &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_L(2) = x86_float64_to_int32_round_to_zero(s->ZMM_D(2),
+                                                     &env->sse_status);
+    d->ZMM_L(3) = x86_float64_to_int32_round_to_zero(s->ZMM_D(3),
+                                                     &env->sse_status);
+    d->ZMM_Q(2) = 0;
+    d->ZMM_Q(3) = 0;
+#else
     d->ZMM_Q(1) = 0;
+#endif
 }
 
+#if SHIFT == 1
 void helper_cvttps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s)
 {
-    d->MMX_L(0) = x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->sse_status);
-    d->MMX_L(1) = x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->sse_status);
+    d->MMX_L(0) = x86_float32_to_int32_round_to_zero(s->ZMM_S(0),
+                                                     &env->sse_status);
+    d->MMX_L(1) = x86_float32_to_int32_round_to_zero(s->ZMM_S(1),
+                                                     &env->sse_status);
 }
 
 void helper_cvttpd2pi(CPUX86State *env, MMXReg *d, ZMMReg *s)
 {
-    d->MMX_L(0) = x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->sse_status);
-    d->MMX_L(1) = x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->sse_status);
+    d->MMX_L(0) = x86_float64_to_int32_round_to_zero(s->ZMM_D(0),
+                                                     &env->sse_status);
+    d->MMX_L(1) = x86_float64_to_int32_round_to_zero(s->ZMM_D(1),
+                                                     &env->sse_status);
 }
 
 int32_t helper_cvttss2si(CPUX86State *env, ZMMReg *s)
@@ -1087,6 +1168,7 @@ int64_t helper_cvttsd2sq(CPUX86State *env, ZMMReg *s)
     return x86_float64_to_int64_round_to_zero(s->ZMM_D(0), &env->sse_status);
 }
 #endif
+#endif
 
 void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
@@ -1120,6 +1202,7 @@ void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     set_float_exception_flags(old_flags, &env->sse_status);
 }
 
+#if SHIFT == 1
 void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     uint8_t old_flags = get_float_exception_flags(&env->sse_status);
@@ -1128,6 +1211,7 @@ void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMReg *s)
                               &env->sse_status);
     set_float_exception_flags(old_flags, &env->sse_status);
 }
+#endif
 
 void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
@@ -1145,13 +1229,16 @@ void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
     set_float_exception_flags(old_flags, &env->sse_status);
 }
 
+#if SHIFT == 1
 void helper_rcpss(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
     uint8_t old_flags = get_float_exception_flags(&env->sse_status);
     d->ZMM_S(0) = float32_div(float32_one, s->ZMM_S(0), &env->sse_status);
     set_float_exception_flags(old_flags, &env->sse_status);
 }
+#endif
 
+#if SHIFT == 1
 static inline uint64_t helper_extrq(uint64_t src, int shift, int len)
 {
     uint64_t mask;
@@ -1195,6 +1282,7 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, int index, int length)
 {
     d->ZMM_Q(0) = helper_insertq(d->ZMM_Q(0), index, length);
 }
+#endif
 
 void glue(helper_haddps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
@@ -1358,6 +1446,7 @@ SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT)
 SSE_HELPER_CMP(cmpnle, FPU_CMPNLE)
 SSE_HELPER_CMP(cmpord, FPU_CMPORD)
 
+#if SHIFT == 1
 static const int comis_eflags[4] = {CC_C, CC_Z, 0, CC_Z | CC_P | CC_C};
 
 void helper_ucomiss(CPUX86State *env, Reg *d, Reg *s)
@@ -1403,25 +1492,38 @@ void helper_comisd(CPUX86State *env, Reg *d, Reg *s)
     ret = float64_compare(d0, d1, &env->sse_status);
     CC_SRC = comis_eflags[ret + 1];
 }
+#endif
 
 uint32_t glue(helper_movmskps, SUFFIX)(CPUX86State *env, Reg *s)
 {
-    int b0, b1, b2, b3;
+    uint32_t mask;
 
-    b0 = s->ZMM_L(0) >> 31;
-    b1 = s->ZMM_L(1) >> 31;
-    b2 = s->ZMM_L(2) >> 31;
-    b3 = s->ZMM_L(3) >> 31;
-    return b0 | (b1 << 1) | (b2 << 2) | (b3 << 3);
+    mask = 0;
+    mask |= (s->ZMM_L(0) >> (31 - 0)) & (1 << 0);
+    mask |= (s->ZMM_L(1) >> (31 - 1)) & (1 << 1);
+    mask |= (s->ZMM_L(2) >> (31 - 2)) & (1 << 2);
+    mask |= (s->ZMM_L(3) >> (31 - 3)) & (1 << 3);
+#if SHIFT == 2
+    mask |= (s->ZMM_L(4) >> (31 - 4)) & (1 << 4);
+    mask |= (s->ZMM_L(5) >> (31 - 5)) & (1 << 5);
+    mask |= (s->ZMM_L(6) >> (31 - 6)) & (1 << 6);
+    mask |= (s->ZMM_L(7) >> (31 - 7)) & (1 << 7);
+#endif
+    return mask;
 }
 
 uint32_t glue(helper_movmskpd, SUFFIX)(CPUX86State *env, Reg *s)
 {
-    int b0, b1;
+    uint32_t mask;
 
-    b0 = s->ZMM_L(1) >> 31;
-    b1 = s->ZMM_L(3) >> 31;
-    return b0 | (b1 << 1);
+    mask = 0;
+    mask |= (s->ZMM_L(1) >> (31 - 0)) & (1 << 0);
+    mask |= (s->ZMM_L(3) >> (31 - 1)) & (1 << 1);
+#if SHIFT == 2
+    mask |= (s->ZMM_L(5) >> (31 - 2)) & (1 << 2);
+    mask |= (s->ZMM_L(7) >> (31 - 3)) & (1 << 3);
+#endif
+    return mask;
 }
 
 #endif
@@ -2233,6 +2335,7 @@ SSE_HELPER_L(helper_pmaxud, MAX)
 #define FMULLD(d, s) ((int32_t)d * (int32_t)s)
 SSE_HELPER_L(helper_pmulld, FMULLD)
 
+#if SHIFT == 1
 void glue(helper_phminposuw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     int idx = 0;
@@ -2264,6 +2367,7 @@ void glue(helper_phminposuw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     d->L(1) = 0;
     d->Q(1) = 0;
 }
+#endif
 
 void glue(helper_roundps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                   uint32_t mode)
@@ -2293,6 +2397,12 @@ void glue(helper_roundps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     d->ZMM_S(1) = float32_round_to_int(s->ZMM_S(1), &env->sse_status);
     d->ZMM_S(2) = float32_round_to_int(s->ZMM_S(2), &env->sse_status);
     d->ZMM_S(3) = float32_round_to_int(s->ZMM_S(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = float32_round_to_int(s->ZMM_S(4), &env->sse_status);
+    d->ZMM_S(5) = float32_round_to_int(s->ZMM_S(5), &env->sse_status);
+    d->ZMM_S(6) = float32_round_to_int(s->ZMM_S(6), &env->sse_status);
+    d->ZMM_S(7) = float32_round_to_int(s->ZMM_S(7), &env->sse_status);
+#endif
 
     if (mode & (1 << 3) && !(old_flags & float_flag_inexact)) {
         set_float_exception_flags(get_float_exception_flags(&env->sse_status) &
@@ -2328,6 +2438,10 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 
     d->ZMM_D(0) = float64_round_to_int(s->ZMM_D(0), &env->sse_status);
     d->ZMM_D(1) = float64_round_to_int(s->ZMM_D(1), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_D(2) = float64_round_to_int(s->ZMM_D(2), &env->sse_status);
+    d->ZMM_D(3) = float64_round_to_int(s->ZMM_D(3), &env->sse_status);
+#endif
 
     if (mode & (1 << 3) && !(old_flags & float_flag_inexact)) {
         set_float_exception_flags(get_float_exception_flags(&env->sse_status) &
@@ -2337,6 +2451,7 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     env->sse_status.float_rounding_mode = prev_rounding_mode;
 }
 
+#if SHIFT == 1
 void glue(helper_roundss, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                   uint32_t mode)
 {
@@ -2404,6 +2519,7 @@ void glue(helper_roundsd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     }
     env->sse_status.float_rounding_mode = prev_rounding_mode;
 }
+#endif
 
 #define FBLENDP(d, s, m) (m ? s : d)
 SSE_HELPER_I(helper_blendps, L, 4, FBLENDP)
@@ -2545,6 +2661,7 @@ void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 #define FCMPGTQ(d, s) ((int64_t)d > (int64_t)s ? -1 : 0)
 SSE_HELPER_Q(helper_pcmpgtq, FCMPGTQ)
 
+#if SHIFT == 1
 static inline int pcmp_elen(CPUX86State *env, int reg, uint32_t ctrl)
 {
     target_long val, limit;
@@ -2765,6 +2882,8 @@ target_ulong helper_crc32(uint32_t crc1, target_ulong msg, uint32_t len)
     return crc;
 }
 
+#endif
+
 void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                     uint32_t ctrl)
 {
-- 
2.36.0




* [PATCH v2 19/42] i386: Rewrite blendv helpers
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (21 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 18/42] i386: Misc AVX helper prep Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 20/42] i386: AVX pclmulqdq Paul Brook
                   ` (22 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Rewrite the blendv helpers so that they can easily be extended to support
the AVX encodings, which make all four arguments explicit.

No functional changes to the existing helpers.
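
To make the four-argument shape concrete, here is a minimal sketch of a
variable blend with destination d, first source v, second source s and
mask m. It assumes plain uint32_t arrays instead of QEMU's Reg type, and
blendv32_sketch is a hypothetical name. In the legacy SSE encoding v
aliases d and m is XMM0, which is how the rewritten SSE_HELPER_V sets
them up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical four-operand blend over n 32-bit elements: take s[i] when
 * the top bit of m[i] is set, otherwise v[i]. d may alias v because each
 * element is read before it is written.
 */
static void blendv32_sketch(uint32_t *d, const uint32_t *v,
                            const uint32_t *s, const uint32_t *m, int n)
{
    assert(n == 4 || n == 8);
    for (int i = 0; i < n; i++) {
        d[i] = (m[i] & 0x80000000u) ? s[i] : v[i];
    }
}
```

Calling it with d == v reproduces the two-operand SSE behaviour; passing
a separate v corresponds to the explicit-operand AVX form.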

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h | 119 +++++++++++++++++++++---------------------
 1 file changed, 60 insertions(+), 59 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 3202c00572..9f388b02b9 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2141,73 +2141,74 @@ void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     }
 }
 
-#define XMM0 (env->xmm_regs[0])
+#if SHIFT >= 1
+
+#define BLEND_V128(elem, num, F, b) do {                                    \
+    d->elem(b + 0) = F(v->elem(b + 0), s->elem(b + 0), m->elem(b + 0));     \
+    d->elem(b + 1) = F(v->elem(b + 1), s->elem(b + 1), m->elem(b + 1));     \
+    if (num > 2) {                                                          \
+        d->elem(b + 2) = F(v->elem(b + 2), s->elem(b + 2), m->elem(b + 2)); \
+        d->elem(b + 3) = F(v->elem(b + 3), s->elem(b + 3), m->elem(b + 3)); \
+    }                                                                       \
+    if (num > 4) {                                                          \
+        d->elem(b + 4) = F(v->elem(b + 4), s->elem(b + 4), m->elem(b + 4)); \
+        d->elem(b + 5) = F(v->elem(b + 5), s->elem(b + 5), m->elem(b + 5)); \
+        d->elem(b + 6) = F(v->elem(b + 6), s->elem(b + 6), m->elem(b + 6)); \
+        d->elem(b + 7) = F(v->elem(b + 7), s->elem(b + 7), m->elem(b + 7)); \
+    }                                                                       \
+    if (num > 8) {                                                          \
+        d->elem(b + 8) = F(v->elem(b + 8), s->elem(b + 8), m->elem(b + 8)); \
+        d->elem(b + 9) = F(v->elem(b + 9), s->elem(b + 9), m->elem(b + 9)); \
+        d->elem(b + 10) = F(v->elem(b + 10), s->elem(b + 10), m->elem(b + 10));\
+        d->elem(b + 11) = F(v->elem(b + 11), s->elem(b + 11), m->elem(b + 11));\
+        d->elem(b + 12) = F(v->elem(b + 12), s->elem(b + 12), m->elem(b + 12));\
+        d->elem(b + 13) = F(v->elem(b + 13), s->elem(b + 13), m->elem(b + 13));\
+        d->elem(b + 14) = F(v->elem(b + 14), s->elem(b + 14), m->elem(b + 14));\
+        d->elem(b + 15) = F(v->elem(b + 15), s->elem(b + 15), m->elem(b + 15));\
+    }                                                                   \
+    } while (0)
 
-#if SHIFT == 1
 #define SSE_HELPER_V(name, elem, num, F)                                \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)           \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)   \
     {                                                                   \
-        d->elem(0) = F(d->elem(0), s->elem(0), XMM0.elem(0));           \
-        d->elem(1) = F(d->elem(1), s->elem(1), XMM0.elem(1));           \
-        if (num > 2) {                                                  \
-            d->elem(2) = F(d->elem(2), s->elem(2), XMM0.elem(2));       \
-            d->elem(3) = F(d->elem(3), s->elem(3), XMM0.elem(3));       \
-            if (num > 4) {                                              \
-                d->elem(4) = F(d->elem(4), s->elem(4), XMM0.elem(4));   \
-                d->elem(5) = F(d->elem(5), s->elem(5), XMM0.elem(5));   \
-                d->elem(6) = F(d->elem(6), s->elem(6), XMM0.elem(6));   \
-                d->elem(7) = F(d->elem(7), s->elem(7), XMM0.elem(7));   \
-                if (num > 8) {                                          \
-                    d->elem(8) = F(d->elem(8), s->elem(8), XMM0.elem(8)); \
-                    d->elem(9) = F(d->elem(9), s->elem(9), XMM0.elem(9)); \
-                    d->elem(10) = F(d->elem(10), s->elem(10), XMM0.elem(10)); \
-                    d->elem(11) = F(d->elem(11), s->elem(11), XMM0.elem(11)); \
-                    d->elem(12) = F(d->elem(12), s->elem(12), XMM0.elem(12)); \
-                    d->elem(13) = F(d->elem(13), s->elem(13), XMM0.elem(13)); \
-                    d->elem(14) = F(d->elem(14), s->elem(14), XMM0.elem(14)); \
-                    d->elem(15) = F(d->elem(15), s->elem(15), XMM0.elem(15)); \
-                }                                                       \
-            }                                                           \
-        }                                                               \
-    }
+        Reg *v = d;                                                     \
+        Reg *m = &env->xmm_regs[0];                                     \
+        BLEND_V128(elem, num, F, 0);                                    \
+        YMM_ONLY(BLEND_V128(elem, num, F, num);)                        \
+    }
+
+#define BLEND_I128(elem, num, F, b) do {                                    \
+    d->elem(b + 0) = F(v->elem(b + 0), s->elem(b + 0), ((imm >> 0) & 1));   \
+    d->elem(b + 1) = F(v->elem(b + 1), s->elem(b + 1), ((imm >> 1) & 1));   \
+    if (num > 2) {                                                          \
+        d->elem(b + 2) = F(v->elem(b + 2), s->elem(b + 2), ((imm >> 2) & 1)); \
+        d->elem(b + 3) = F(v->elem(b + 3), s->elem(b + 3), ((imm >> 3) & 1)); \
+    }                                                                       \
+    if (num > 4) {                                                          \
+        d->elem(b + 4) = F(v->elem(b + 4), s->elem(b + 4), ((imm >> 4) & 1)); \
+        d->elem(b + 5) = F(v->elem(b + 5), s->elem(b + 5), ((imm >> 5) & 1)); \
+        d->elem(b + 6) = F(v->elem(b + 6), s->elem(b + 6), ((imm >> 6) & 1)); \
+        d->elem(b + 7) = F(v->elem(b + 7), s->elem(b + 7), ((imm >> 7) & 1)); \
+    }                                                                       \
+    } while (0)
 
 #define SSE_HELPER_I(name, elem, num, F)                                \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t imm) \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,   \
+                            uint32_t imm)                               \
     {                                                                   \
-        d->elem(0) = F(d->elem(0), s->elem(0), ((imm >> 0) & 1));       \
-        d->elem(1) = F(d->elem(1), s->elem(1), ((imm >> 1) & 1));       \
-        if (num > 2) {                                                  \
-            d->elem(2) = F(d->elem(2), s->elem(2), ((imm >> 2) & 1));   \
-            d->elem(3) = F(d->elem(3), s->elem(3), ((imm >> 3) & 1));   \
-            if (num > 4) {                                              \
-                d->elem(4) = F(d->elem(4), s->elem(4), ((imm >> 4) & 1)); \
-                d->elem(5) = F(d->elem(5), s->elem(5), ((imm >> 5) & 1)); \
-                d->elem(6) = F(d->elem(6), s->elem(6), ((imm >> 6) & 1)); \
-                d->elem(7) = F(d->elem(7), s->elem(7), ((imm >> 7) & 1)); \
-                if (num > 8) {                                          \
-                    d->elem(8) = F(d->elem(8), s->elem(8), ((imm >> 8) & 1)); \
-                    d->elem(9) = F(d->elem(9), s->elem(9), ((imm >> 9) & 1)); \
-                    d->elem(10) = F(d->elem(10), s->elem(10),           \
-                                    ((imm >> 10) & 1));                 \
-                    d->elem(11) = F(d->elem(11), s->elem(11),           \
-                                    ((imm >> 11) & 1));                 \
-                    d->elem(12) = F(d->elem(12), s->elem(12),           \
-                                    ((imm >> 12) & 1));                 \
-                    d->elem(13) = F(d->elem(13), s->elem(13),           \
-                                    ((imm >> 13) & 1));                 \
-                    d->elem(14) = F(d->elem(14), s->elem(14),           \
-                                    ((imm >> 14) & 1));                 \
-                    d->elem(15) = F(d->elem(15), s->elem(15),           \
-                                    ((imm >> 15) & 1));                 \
-                }                                                       \
-            }                                                           \
-        }                                                               \
+        Reg *v = d;                                                     \
+        BLEND_I128(elem, num, F, 0);                                    \
+        YMM_ONLY(                                                       \
+        if (num < 8)                                                    \
+            imm >>= num;                                                \
+        BLEND_I128(elem, num, F, num);                                  \
+        )                                                               \
     }
 
 /* SSE4.1 op helpers */
-#define FBLENDVB(d, s, m) ((m & 0x80) ? s : d)
-#define FBLENDVPS(d, s, m) ((m & 0x80000000) ? s : d)
-#define FBLENDVPD(d, s, m) ((m & 0x8000000000000000LL) ? s : d)
+#define FBLENDVB(v, s, m) ((m & 0x80) ? s : v)
+#define FBLENDVPS(v, s, m) ((m & 0x80000000) ? s : v)
+#define FBLENDVPD(v, s, m) ((m & 0x8000000000000000LL) ? s : v)
 SSE_HELPER_V(helper_pblendvb, B, 16, FBLENDVB)
 SSE_HELPER_V(helper_blendvps, L, 4, FBLENDVPS)
 SSE_HELPER_V(helper_blendvpd, Q, 2, FBLENDVPD)
@@ -2521,7 +2522,7 @@ void glue(helper_roundsd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 }
 #endif
 
-#define FBLENDP(d, s, m) (m ? s : d)
+#define FBLENDP(v, s, m) (m ? s : v)
 SSE_HELPER_I(helper_blendps, L, 4, FBLENDP)
 SSE_HELPER_I(helper_blendpd, Q, 2, FBLENDP)
 SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP)
-- 
2.36.0




* [PATCH v2 20/42] i386: AVX pclmulqdq
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (22 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 19/42] i386: Rewrite blendv helpers Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 21/42] i386: AVX+AES helpers Paul Brook
                   ` (21 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

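Make the pclmulqdq helper AVX-ready by moving the 64x64-bit carry-less
multiply into a separate clmulq() function that can be applied to each
128-bit lane.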
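
For readers unfamiliar with the operation, a carry-less multiply is a
shift-and-XOR multiplication of two 64-bit values giving a 128-bit
result. The sketch below is an assumption about the general technique,
not a copy of the helper: the loop body is elided from the hunk in this
patch, and clmul64_sketch is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical 64x64 -> 128-bit carry-less multiply. For every set bit k
 * of b, XOR (a << k) into the 128-bit accumulator held as resh:resl.
 */
static void clmul64_sketch(uint64_t *lo, uint64_t *hi,
                           uint64_t a, uint64_t b)
{
    uint64_t al = a, ah = 0;      /* a shifted left, as a 128-bit pair */
    uint64_t resl = 0, resh = 0;  /* running XOR of partial products */

    assert(lo && hi);
    while (b) {
        if (b & 1) {
            resl ^= al;
            resh ^= ah;
        }
        /* Shift the 128-bit value ah:al left by one bit. */
        ah = (ah << 1) | (al >> 63);
        al <<= 1;
        b >>= 1;
    }
    *lo = resl;
    *hi = resh;
}
```

With a function of this shape, the 256-bit form only needs a second
call on the quadwords of the upper lane, which is what the SHIFT == 2
branch in the patch does.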

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 9f388b02b9..b7100fdce1 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2885,14 +2885,14 @@ target_ulong helper_crc32(uint32_t crc1, target_ulong msg, uint32_t len)
 
 #endif
 
-void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
-                                    uint32_t ctrl)
+#if SHIFT == 1
+static void clmulq(uint64_t *dest_l, uint64_t *dest_h,
+                          uint64_t a, uint64_t b)
 {
-    uint64_t ah, al, b, resh, resl;
+    uint64_t al, ah, resh, resl;
 
     ah = 0;
-    al = d->Q((ctrl & 1) != 0);
-    b = s->Q((ctrl & 16) != 0);
+    al = a;
     resh = resl = 0;
 
     while (b) {
@@ -2905,8 +2905,25 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
         b >>= 1;
     }
 
-    d->Q(0) = resl;
-    d->Q(1) = resh;
+    *dest_l = resl;
+    *dest_h = resh;
+}
+#endif
+
+void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+                                    uint32_t ctrl)
+{
+    Reg *v = d;
+    uint64_t a, b;
+
+    a = v->Q((ctrl & 1) != 0);
+    b = s->Q((ctrl & 16) != 0);
+    clmulq(&d->Q(0), &d->Q(1), a, b);
+#if SHIFT == 2
+    a = v->Q(((ctrl & 1) != 0) + 2);
+    b = s->Q(((ctrl & 16) != 0) + 2);
+    clmulq(&d->Q(2), &d->Q(3), a, b);
+#endif
 }
 
 void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-- 
2.36.0




* [PATCH v2 21/42] i386: AVX+AES helpers
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (23 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 20/42] i386: AVX pclmulqdq Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 22/42] i386: Update ops_sse_helper.h ready for 256 bit AVX Paul Brook
                   ` (20 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Make the AES vector helpers AVX-ready.

No functional changes to existing helpers.
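
The 256-bit branches in this patch reuse the 128-bit byte index tables
and add 16 to each index to reach the upper lane. A minimal sketch of
that indexing pattern follows, assuming plain byte arrays and a
caller-supplied table; permute_lanes_sketch and perm are hypothetical
names and the table is not the real AES_shifts contents.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical per-lane byte permutation: apply the same 16-entry index
 * table to each 128-bit lane, offsetting by 16 bytes per lane. d and st
 * must not overlap, which is why the helpers copy *d into st first.
 */
static void permute_lanes_sketch(uint8_t *d, const uint8_t *st,
                                 const uint8_t perm[16], int lanes)
{
    assert(d != st);
    assert(lanes == 1 || lanes == 2);
    for (int lane = 0; lane < lanes; lane++) {
        int base = 16 * lane;
        for (int i = 0; i < 16; i++) {
            assert(perm[i] < 16);
            d[base + i] = st[base + perm[i]];
        }
    }
}
```

The real helpers additionally pass each selected byte through the AES
lookup tables and XOR in the round key; only the lane offset is shown
here.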

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h        | 63 ++++++++++++++++++++++++++----------
 target/i386/ops_sse_header.h | 55 ++++++++++++++++++++++---------
 2 files changed, 85 insertions(+), 33 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index b7100fdce1..48cec40074 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2929,64 +2929,92 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     int i;
-    Reg st = *d;
+    Reg st = *d; // v
     Reg rk = *s;
 
     for (i = 0 ; i < 4 ; i++) {
-        d->L(i) = rk.L(i) ^ bswap32(AES_Td0[st.B(AES_ishifts[4*i+0])] ^
-                                    AES_Td1[st.B(AES_ishifts[4*i+1])] ^
-                                    AES_Td2[st.B(AES_ishifts[4*i+2])] ^
-                                    AES_Td3[st.B(AES_ishifts[4*i+3])]);
+        d->L(i) = rk.L(i) ^ bswap32(AES_Td0[st.B(AES_ishifts[4 * i + 0])] ^
+                                    AES_Td1[st.B(AES_ishifts[4 * i + 1])] ^
+                                    AES_Td2[st.B(AES_ishifts[4 * i + 2])] ^
+                                    AES_Td3[st.B(AES_ishifts[4 * i + 3])]);
     }
+#if SHIFT == 2
+    for (i = 0 ; i < 4 ; i++) {
+        d->L(i + 4) = rk.L(i + 4) ^ bswap32(
+                AES_Td0[st.B(AES_ishifts[4 * i + 0] + 16)] ^
+                AES_Td1[st.B(AES_ishifts[4 * i + 1] + 16)] ^
+                AES_Td2[st.B(AES_ishifts[4 * i + 2] + 16)] ^
+                AES_Td3[st.B(AES_ishifts[4 * i + 3] + 16)]);
+    }
+#endif
 }
 
 void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     int i;
-    Reg st = *d;
+    Reg st = *d; // v
     Reg rk = *s;
 
     for (i = 0; i < 16; i++) {
         d->B(i) = rk.B(i) ^ (AES_isbox[st.B(AES_ishifts[i])]);
     }
+#if SHIFT == 2
+    for (i = 0; i < 16; i++) {
+        d->B(i + 16) = rk.B(i + 16) ^ (AES_isbox[st.B(AES_ishifts[i] + 16)]);
+    }
+#endif
 }
 
 void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     int i;
-    Reg st = *d;
+    Reg st = *d; // v
     Reg rk = *s;
 
     for (i = 0 ; i < 4 ; i++) {
-        d->L(i) = rk.L(i) ^ bswap32(AES_Te0[st.B(AES_shifts[4*i+0])] ^
-                                    AES_Te1[st.B(AES_shifts[4*i+1])] ^
-                                    AES_Te2[st.B(AES_shifts[4*i+2])] ^
-                                    AES_Te3[st.B(AES_shifts[4*i+3])]);
+        d->L(i) = rk.L(i) ^ bswap32(AES_Te0[st.B(AES_shifts[4 * i + 0])] ^
+                                    AES_Te1[st.B(AES_shifts[4 * i + 1])] ^
+                                    AES_Te2[st.B(AES_shifts[4 * i + 2])] ^
+                                    AES_Te3[st.B(AES_shifts[4 * i + 3])]);
     }
+#if SHIFT == 2
+    for (i = 0 ; i < 4 ; i++) {
+        d->L(i + 4) = rk.L(i + 4) ^ bswap32(
+                AES_Te0[st.B(AES_shifts[4 * i + 0] + 16)] ^
+                AES_Te1[st.B(AES_shifts[4 * i + 1] + 16)] ^
+                AES_Te2[st.B(AES_shifts[4 * i + 2] + 16)] ^
+                AES_Te3[st.B(AES_shifts[4 * i + 3] + 16)]);
+    }
+#endif
 }
 
 void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     int i;
-    Reg st = *d;
+    Reg st = *d; // v
     Reg rk = *s;
 
     for (i = 0; i < 16; i++) {
         d->B(i) = rk.B(i) ^ (AES_sbox[st.B(AES_shifts[i])]);
     }
-
+#if SHIFT == 2
+    for (i = 0; i < 16; i++) {
+        d->B(i + 16) = rk.B(i + 16) ^ (AES_sbox[st.B(AES_shifts[i] + 16)]);
+    }
+#endif
 }
 
+#if SHIFT == 1
 void glue(helper_aesimc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     int i;
     Reg tmp = *s;
 
     for (i = 0 ; i < 4 ; i++) {
-        d->L(i) = bswap32(AES_imc[tmp.B(4*i+0)][0] ^
-                          AES_imc[tmp.B(4*i+1)][1] ^
-                          AES_imc[tmp.B(4*i+2)][2] ^
-                          AES_imc[tmp.B(4*i+3)][3]);
+        d->L(i) = bswap32(AES_imc[tmp.B(4 * i + 0)][0] ^
+                          AES_imc[tmp.B(4 * i + 1)][1] ^
+                          AES_imc[tmp.B(4 * i + 2)][2] ^
+                          AES_imc[tmp.B(4 * i + 3)][3]);
     }
 }
 
@@ -3004,6 +3032,7 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     d->L(3) = (d->L(2) << 24 | d->L(2) >> 8) ^ ctrl;
 }
 #endif
+#endif
 
 #undef SSE_HELPER_S
 
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index b8b0666f61..203afbb5a1 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -47,7 +47,7 @@ DEF_HELPER_3(glue(pslld, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(psrlq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(psllq, SUFFIX), void, env, Reg, Reg)
 
-#if SHIFT == 1
+#if SHIFT >= 1
 DEF_HELPER_3(glue(psrldq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pslldq, SUFFIX), void, env, Reg, Reg)
 #endif
@@ -105,7 +105,7 @@ SSE_HELPER_L(pcmpeql, FCMPEQ)
 
 SSE_HELPER_W(pmullw, FMULLW)
 #if SHIFT == 0
-SSE_HELPER_W(pmulhrw, FMULHRW)
+DEF_HELPER_3(glue(pmulhrw, SUFFIX), FMULHRW)
 #endif
 SSE_HELPER_W(pmulhuw, FMULHUW)
 SSE_HELPER_W(pmulhw, FMULHW)
@@ -117,7 +117,9 @@ DEF_HELPER_3(glue(pmuludq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmaddwd, SUFFIX), void, env, Reg, Reg)
 
 DEF_HELPER_3(glue(psadbw, SUFFIX), void, env, Reg, Reg)
+#if SHIFT < 2
 DEF_HELPER_4(glue(maskmov, SUFFIX), void, env, Reg, Reg, tl)
+#endif
 DEF_HELPER_2(glue(movl_mm_T0, SUFFIX), void, Reg, i32)
 #ifdef TARGET_X86_64
 DEF_HELPER_2(glue(movq_mm_T0, SUFFIX), void, Reg, i64)
@@ -126,17 +128,18 @@ DEF_HELPER_2(glue(movq_mm_T0, SUFFIX), void, Reg, i64)
 #if SHIFT == 0
 DEF_HELPER_3(glue(pshufw, SUFFIX), void, Reg, Reg, int)
 #else
-DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int)
-DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int)
 DEF_HELPER_3(glue(pshufd, SUFFIX), void, Reg, Reg, int)
 DEF_HELPER_3(glue(pshuflw, SUFFIX), void, Reg, Reg, int)
 DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int)
 #endif
 
-#if SHIFT == 1
+#if SHIFT >= 1
 /* FPU ops */
 /* XXX: not accurate */
 
+DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int)
+DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int)
+
 #define SSE_HELPER_S(name, F)                            \
     DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg)        \
     DEF_HELPER_3(name ## ss, void, env, Reg, Reg)        \
@@ -154,10 +157,18 @@ SSE_HELPER_S(sqrt, FPU_SQRT)
 
 DEF_HELPER_3(glue(cvtps2pd, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(cvtpd2ps, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(cvtss2sd, void, env, Reg, Reg)
-DEF_HELPER_3(cvtsd2ss, void, env, Reg, Reg)
 DEF_HELPER_3(glue(cvtdq2ps, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(cvtdq2pd, SUFFIX), void, env, Reg, Reg)
+
+DEF_HELPER_3(glue(cvtps2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(cvtpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
+
+DEF_HELPER_3(glue(cvttps2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
+DEF_HELPER_3(glue(cvttpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
+
+#if SHIFT == 1
+DEF_HELPER_3(cvtss2sd, void, env, Reg, Reg)
+DEF_HELPER_3(cvtsd2ss, void, env, Reg, Reg)
 DEF_HELPER_3(cvtpi2ps, void, env, ZMMReg, MMXReg)
 DEF_HELPER_3(cvtpi2pd, void, env, ZMMReg, MMXReg)
 DEF_HELPER_3(cvtsi2ss, void, env, ZMMReg, i32)
@@ -168,8 +179,6 @@ DEF_HELPER_3(cvtsq2ss, void, env, ZMMReg, i64)
 DEF_HELPER_3(cvtsq2sd, void, env, ZMMReg, i64)
 #endif
 
-DEF_HELPER_3(glue(cvtps2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(glue(cvtpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(cvtps2pi, void, env, MMXReg, ZMMReg)
 DEF_HELPER_3(cvtpd2pi, void, env, MMXReg, ZMMReg)
 DEF_HELPER_2(cvtss2si, s32, env, ZMMReg)
@@ -179,8 +188,6 @@ DEF_HELPER_2(cvtss2sq, s64, env, ZMMReg)
 DEF_HELPER_2(cvtsd2sq, s64, env, ZMMReg)
 #endif
 
-DEF_HELPER_3(glue(cvttps2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(glue(cvttpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(cvttps2pi, void, env, MMXReg, ZMMReg)
 DEF_HELPER_3(cvttpd2pi, void, env, MMXReg, ZMMReg)
 DEF_HELPER_2(cvttss2si, s32, env, ZMMReg)
@@ -189,15 +196,18 @@ DEF_HELPER_2(cvttsd2si, s32, env, ZMMReg)
 DEF_HELPER_2(cvttss2sq, s64, env, ZMMReg)
 DEF_HELPER_2(cvttsd2sq, s64, env, ZMMReg)
 #endif
+#endif
 
 DEF_HELPER_3(glue(rsqrtps, SUFFIX), void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(rsqrtss, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(glue(rcpps, SUFFIX), void, env, ZMMReg, ZMMReg)
+#if SHIFT == 1
+DEF_HELPER_3(rsqrtss, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(rcpss, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(extrq_r, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_4(extrq_i, void, env, ZMMReg, int, int)
 DEF_HELPER_3(insertq_r, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_4(insertq_i, void, env, ZMMReg, int, int)
+#endif
 DEF_HELPER_3(glue(haddps, SUFFIX), void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(glue(haddpd, SUFFIX), void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(glue(hsubps, SUFFIX), void, env, ZMMReg, ZMMReg)
@@ -220,10 +230,13 @@ SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT)
 SSE_HELPER_CMP(cmpnle, FPU_CMPNLE)
 SSE_HELPER_CMP(cmpord, FPU_CMPORD)
 
+#if SHIFT == 1
 DEF_HELPER_3(ucomiss, void, env, Reg, Reg)
 DEF_HELPER_3(comiss, void, env, Reg, Reg)
 DEF_HELPER_3(ucomisd, void, env, Reg, Reg)
 DEF_HELPER_3(comisd, void, env, Reg, Reg)
+#endif
+
 DEF_HELPER_2(glue(movmskps, SUFFIX), i32, env, Reg)
 DEF_HELPER_2(glue(movmskpd, SUFFIX), i32, env, Reg)
 #endif
@@ -240,7 +253,7 @@ DEF_HELPER_3(glue(packssdw, SUFFIX), void, env, Reg, Reg)
 UNPCK_OP(l, 0)
 UNPCK_OP(h, 1)
 
-#if SHIFT == 1
+#if SHIFT >= 1
 DEF_HELPER_3(glue(punpcklqdq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(punpckhqdq, SUFFIX), void, env, Reg, Reg)
 #endif
@@ -287,7 +300,7 @@ DEF_HELPER_3(glue(psignd, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_4(glue(palignr, SUFFIX), void, env, Reg, Reg, s32)
 
 /* SSE4.1 op helpers */
-#if SHIFT == 1
+#if SHIFT >= 1
 DEF_HELPER_3(glue(pblendvb, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(blendvps, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(blendvpd, SUFFIX), void, env, Reg, Reg)
@@ -316,22 +329,30 @@ DEF_HELPER_3(glue(pmaxsd, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmaxuw, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmaxud, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmulld, SUFFIX), void, env, Reg, Reg)
+#if SHIFT == 1
 DEF_HELPER_3(glue(phminposuw, SUFFIX), void, env, Reg, Reg)
+#endif
 DEF_HELPER_4(glue(roundps, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_4(glue(roundpd, SUFFIX), void, env, Reg, Reg, i32)
+#if SHIFT == 1
 DEF_HELPER_4(glue(roundss, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_4(glue(roundsd, SUFFIX), void, env, Reg, Reg, i32)
+#endif
 DEF_HELPER_4(glue(blendps, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_4(glue(blendpd, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_4(glue(pblendw, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_4(glue(dpps, SUFFIX), void, env, Reg, Reg, i32)
+#if SHIFT == 1
 DEF_HELPER_4(glue(dppd, SUFFIX), void, env, Reg, Reg, i32)
+#endif
 DEF_HELPER_4(glue(mpsadbw, SUFFIX), void, env, Reg, Reg, i32)
 #endif
 
 /* SSE4.2 op helpers */
-#if SHIFT == 1
+#if SHIFT >= 1
 DEF_HELPER_3(glue(pcmpgtq, SUFFIX), void, env, Reg, Reg)
+#endif
+#if SHIFT == 1
 DEF_HELPER_4(glue(pcmpestri, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_4(glue(pcmpestrm, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_4(glue(pcmpistri, SUFFIX), void, env, Reg, Reg, i32)
@@ -340,13 +361,15 @@ DEF_HELPER_3(crc32, tl, i32, tl, i32)
 #endif
 
 /* AES-NI op helpers */
-#if SHIFT == 1
+#if SHIFT >= 1
 DEF_HELPER_3(glue(aesdec, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(aesdeclast, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(aesenc, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(aesenclast, SUFFIX), void, env, Reg, Reg)
+#if SHIFT == 1
 DEF_HELPER_3(glue(aesimc, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_4(glue(aeskeygenassist, SUFFIX), void, env, Reg, Reg, i32)
+#endif
 DEF_HELPER_4(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, i32)
 #endif
 
-- 
2.36.0




* [PATCH v2 22/42] i386: Update ops_sse_helper.h ready for 256 bit AVX
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (24 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 21/42] i386: AVX+AES helpers Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 23/42] i386: AVX comparison helpers Paul Brook
                   ` (19 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Update ops_sse_header.h ready for 256-bit AVX helpers.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse_header.h | 67 +++++++++++++++++++++---------------
 1 file changed, 40 insertions(+), 27 deletions(-)

diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index 203afbb5a1..63b63eb532 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -105,7 +105,7 @@ SSE_HELPER_L(pcmpeql, FCMPEQ)
 
 SSE_HELPER_W(pmullw, FMULLW)
 #if SHIFT == 0
-DEF_HELPER_3(glue(pmulhrw, SUFFIX), FMULHRW)
+DEF_HELPER_3(glue(pmulhrw, SUFFIX), void, env, Reg, Reg)
 #endif
 SSE_HELPER_W(pmulhuw, FMULHUW)
 SSE_HELPER_W(pmulhw, FMULHW)
@@ -137,23 +137,39 @@ DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int)
 /* FPU ops */
 /* XXX: not accurate */
 
-DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int)
-DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int)
+#define SSE_HELPER_P4(name)                                             \
+    DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg)         \
+    DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg)
+
+#define SSE_HELPER_P3(name, ...)                                        \
+    DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg)         \
+    DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg)
 
-#define SSE_HELPER_S(name, F)                            \
-    DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg)        \
-    DEF_HELPER_3(name ## ss, void, env, Reg, Reg)        \
-    DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg)        \
+#if SHIFT == 1
+#define SSE_HELPER_S4(name)                                             \
+    SSE_HELPER_P4(name)                                                 \
+    DEF_HELPER_3(name ## ss, void, env, Reg, Reg)                       \
     DEF_HELPER_3(name ## sd, void, env, Reg, Reg)
+#define SSE_HELPER_S3(name)                                             \
+    SSE_HELPER_P3(name)                                                 \
+    DEF_HELPER_3(name ## ss, void, env, Reg, Reg)                       \
+    DEF_HELPER_3(name ## sd, void, env, Reg, Reg)
+#else
+#define SSE_HELPER_S4(name, ...) SSE_HELPER_P4(name)
+#define SSE_HELPER_S3(name, ...) SSE_HELPER_P3(name)
+#endif
+
+DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int)
+DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int)
 
-SSE_HELPER_S(add, FPU_ADD)
-SSE_HELPER_S(sub, FPU_SUB)
-SSE_HELPER_S(mul, FPU_MUL)
-SSE_HELPER_S(div, FPU_DIV)
-SSE_HELPER_S(min, FPU_MIN)
-SSE_HELPER_S(max, FPU_MAX)
-SSE_HELPER_S(sqrt, FPU_SQRT)
+SSE_HELPER_S4(add)
+SSE_HELPER_S4(sub)
+SSE_HELPER_S4(mul)
+SSE_HELPER_S4(div)
+SSE_HELPER_S4(min)
+SSE_HELPER_S4(max)
 
+SSE_HELPER_S3(sqrt)
 
 DEF_HELPER_3(glue(cvtps2pd, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(cvtpd2ps, SUFFIX), void, env, Reg, Reg)
@@ -208,18 +224,12 @@ DEF_HELPER_4(extrq_i, void, env, ZMMReg, int, int)
 DEF_HELPER_3(insertq_r, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_4(insertq_i, void, env, ZMMReg, int, int)
 #endif
-DEF_HELPER_3(glue(haddps, SUFFIX), void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(glue(haddpd, SUFFIX), void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(glue(hsubps, SUFFIX), void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(glue(hsubpd, SUFFIX), void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(glue(addsubps, SUFFIX), void, env, ZMMReg, ZMMReg)
-DEF_HELPER_3(glue(addsubpd, SUFFIX), void, env, ZMMReg, ZMMReg)
-
-#define SSE_HELPER_CMP(name, F)                           \
-    DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg)         \
-    DEF_HELPER_3(name ## ss, void, env, Reg, Reg)         \
-    DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg)         \
-    DEF_HELPER_3(name ## sd, void, env, Reg, Reg)
+
+SSE_HELPER_P4(hadd)
+SSE_HELPER_P4(hsub)
+SSE_HELPER_P4(addsub)
+
+#define SSE_HELPER_CMP(name, F) SSE_HELPER_S4(name)
 
 SSE_HELPER_CMP(cmpeq, FPU_CMPEQ)
 SSE_HELPER_CMP(cmplt, FPU_CMPLT)
@@ -381,6 +391,9 @@ DEF_HELPER_4(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, i32)
 #undef SSE_HELPER_W
 #undef SSE_HELPER_L
 #undef SSE_HELPER_Q
-#undef SSE_HELPER_S
+#undef SSE_HELPER_S3
+#undef SSE_HELPER_S4
+#undef SSE_HELPER_P3
+#undef SSE_HELPER_P4
 #undef SSE_HELPER_CMP
 #undef UNPCK_OP
-- 
2.36.0




* [PATCH v2 23/42] i386: AVX comparison helpers
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (25 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 22/42] i386: Update ops_sse_helper.h ready for 256 bit AVX Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 24/42] i386: Move 3DNOW decoder Paul Brook
                   ` (18 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

AVX includes a more extensive set of comparison predicates, some of which
our softfloat implementation does not expose directly. Rewrite the helpers
in terms of floatN_compare.
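
As a rough illustration of the idea (not the code in this patch), the sketch
below classifies a pair of operands once and then derives each predicate from
the resulting relation. The Relation enum and every function name here are
hypothetical stand-ins; the enum values are assumed to mirror softfloat's
FloatRelation ordering, and the quiet/signalling exception behaviour is not
modelled at all.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for softfloat's FloatRelation (values assumed). */
typedef enum {
    REL_LESS = -1,
    REL_EQUAL = 0,
    REL_GREATER = 1,
    REL_UNORDERED = 2,
} Relation;

/* Classify the operands once; any NaN operand gives REL_UNORDERED. */
static Relation compare_f32(float a, float b)
{
    if (isnan(a) || isnan(b)) {
        return REL_UNORDERED;
    }
    if (a < b) {
        return REL_LESS;
    }
    if (a > b) {
        return REL_GREATER;
    }
    return REL_EQUAL;
}

/* Each predicate is a small test on the relation. */
static bool pred_eq(Relation r)  { return r == REL_EQUAL; }
static bool pred_le(Relation r)  { return r <= REL_EQUAL; }
static bool pred_equ(Relation r) { return r == REL_EQUAL || r == REL_UNORDERED; }
static bool pred_nge(Relation r) { return !(r == REL_EQUAL || r == REL_GREATER); }

/* Comparison results are all-ones or all-zeros element masks. */
static uint32_t mask32(bool cond)
{
    return cond ? UINT32_MAX : 0;
}
```

The point of the design is that predicates such as "equal or unordered" or
"not greater-or-equal" need no dedicated comparison routine: they fall out of
a single three-way-plus-unordered compare.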

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h        | 149 ++++++++++++++++++++++++-----------
 target/i386/ops_sse_header.h |  47 ++++++++---
 target/i386/tcg/translate.c  |  49 +++++++++---
 3 files changed, 177 insertions(+), 68 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 48cec40074..e48dfc2fc5 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -1394,57 +1394,112 @@ void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #endif
 }
 
-/* XXX: unordered */
-#define SSE_HELPER_CMP(name, F)                                         \
-    void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)\
-    {                                                                   \
-        d->ZMM_L(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));                  \
-        d->ZMM_L(1) = F(32, d->ZMM_S(1), s->ZMM_S(1));                  \
-        d->ZMM_L(2) = F(32, d->ZMM_S(2), s->ZMM_S(2));                  \
-        d->ZMM_L(3) = F(32, d->ZMM_S(3), s->ZMM_S(3));                  \
-    }                                                                   \
-                                                                        \
-    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s)        \
-    {                                                                   \
-        d->ZMM_L(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));                  \
-    }                                                                   \
-                                                                        \
-    void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)\
+#define SSE_HELPER_CMP_P(name, F, C)                                    \
+    void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env,          \
+                                             Reg *d, Reg *s)    \
     {                                                                   \
-        d->ZMM_Q(0) = F(64, d->ZMM_D(0), s->ZMM_D(0));                  \
-        d->ZMM_Q(1) = F(64, d->ZMM_D(1), s->ZMM_D(1));                  \
+        Reg *v = d;                                                     \
+        d->ZMM_L(0) = F(32, C, v->ZMM_S(0), s->ZMM_S(0));               \
+        d->ZMM_L(1) = F(32, C, v->ZMM_S(1), s->ZMM_S(1));               \
+        d->ZMM_L(2) = F(32, C, v->ZMM_S(2), s->ZMM_S(2));               \
+        d->ZMM_L(3) = F(32, C, v->ZMM_S(3), s->ZMM_S(3));               \
+        YMM_ONLY(                                                       \
+        d->ZMM_L(4) = F(32, C, v->ZMM_S(4), s->ZMM_S(4));               \
+        d->ZMM_L(5) = F(32, C, v->ZMM_S(5), s->ZMM_S(5));               \
+        d->ZMM_L(6) = F(32, C, v->ZMM_S(6), s->ZMM_S(6));               \
+        d->ZMM_L(7) = F(32, C, v->ZMM_S(7), s->ZMM_S(7));               \
+        )                                                               \
     }                                                                   \
                                                                         \
-    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s)        \
+    void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env,          \
+                                             Reg *d, Reg *s)    \
     {                                                                   \
-        d->ZMM_Q(0) = F(64, d->ZMM_D(0), s->ZMM_D(0));                  \
-    }
-
-#define FPU_CMPEQ(size, a, b)                                           \
-    (float ## size ## _eq_quiet(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPLT(size, a, b)                                           \
-    (float ## size ## _lt(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPLE(size, a, b)                                           \
-    (float ## size ## _le(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPUNORD(size, a, b)                                        \
-    (float ## size ## _unordered_quiet(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPNEQ(size, a, b)                                          \
-    (float ## size ## _eq_quiet(a, b, &env->sse_status) ? 0 : -1)
-#define FPU_CMPNLT(size, a, b)                                          \
-    (float ## size ## _lt(a, b, &env->sse_status) ? 0 : -1)
-#define FPU_CMPNLE(size, a, b)                                          \
-    (float ## size ## _le(a, b, &env->sse_status) ? 0 : -1)
-#define FPU_CMPORD(size, a, b)                                          \
-    (float ## size ## _unordered_quiet(a, b, &env->sse_status) ? 0 : -1)
-
-SSE_HELPER_CMP(cmpeq, FPU_CMPEQ)
-SSE_HELPER_CMP(cmplt, FPU_CMPLT)
-SSE_HELPER_CMP(cmple, FPU_CMPLE)
-SSE_HELPER_CMP(cmpunord, FPU_CMPUNORD)
-SSE_HELPER_CMP(cmpneq, FPU_CMPNEQ)
-SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT)
-SSE_HELPER_CMP(cmpnle, FPU_CMPNLE)
-SSE_HELPER_CMP(cmpord, FPU_CMPORD)
+        Reg *v = d;                                                     \
+        d->ZMM_Q(0) = F(64, C, v->ZMM_D(0), s->ZMM_D(0));               \
+        d->ZMM_Q(1) = F(64, C, v->ZMM_D(1), s->ZMM_D(1));               \
+        YMM_ONLY(                                                       \
+        d->ZMM_Q(2) = F(64, C, v->ZMM_D(2), s->ZMM_D(2));               \
+        d->ZMM_Q(3) = F(64, C, v->ZMM_D(3), s->ZMM_D(3));               \
+        )                                                               \
+    }
+
+#if SHIFT == 1
+#define SSE_HELPER_CMP(name, F, C)                                          \
+    SSE_HELPER_CMP_P(name, F, C)                                            \
+    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s)    \
+    {                                                                       \
+        Reg *v = d;                                                         \
+        d->ZMM_L(0) = F(32, C, v->ZMM_S(0), s->ZMM_S(0));                   \
+    }                                                                       \
+                                                                            \
+    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s)    \
+    {                                                                       \
+        Reg *v = d;                                                         \
+        d->ZMM_Q(0) = F(64, C, v->ZMM_D(0), s->ZMM_D(0));                   \
+    }
+
+static inline bool FPU_EQU(FloatRelation x)
+{
+    return (x == float_relation_equal || x == float_relation_unordered);
+}
+static inline bool FPU_GE(FloatRelation x)
+{
+    return (x == float_relation_equal || x == float_relation_greater);
+}
+#define FPU_EQ(x) (x == float_relation_equal)
+#define FPU_LT(x) (x == float_relation_less)
+#define FPU_LE(x) (x <= float_relation_equal)
+#define FPU_GT(x) (x == float_relation_greater)
+#define FPU_UNORD(x) (x == float_relation_unordered)
+// We must make sure we evaluate the argument in case it is a signalling NAN
+#define FPU_FALSE(x) (x == float_relation_equal && 0)
+
+#define FPU_CMPQ(size, COND, a, b) \
+    (COND(float ## size ## _compare_quiet(a, b, &env->sse_status)) ? -1 : 0)
+#define FPU_CMPS(size, COND, a, b) \
+    (COND(float ## size ## _compare(a, b, &env->sse_status)) ? -1 : 0)
+
+#else
+#define SSE_HELPER_CMP(name, F, C) SSE_HELPER_CMP_P(name, F, C)
+#endif
+
+SSE_HELPER_CMP(cmpeq, FPU_CMPQ, FPU_EQ)
+SSE_HELPER_CMP(cmplt, FPU_CMPS, FPU_LT)
+SSE_HELPER_CMP(cmple, FPU_CMPS, FPU_LE)
+SSE_HELPER_CMP(cmpunord, FPU_CMPQ,  FPU_UNORD)
+SSE_HELPER_CMP(cmpneq, FPU_CMPQ, !FPU_EQ)
+SSE_HELPER_CMP(cmpnlt, FPU_CMPS, !FPU_LT)
+SSE_HELPER_CMP(cmpnle, FPU_CMPS, !FPU_LE)
+SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD)
+
+SSE_HELPER_CMP(cmpequ, FPU_CMPQ, FPU_EQU)
+SSE_HELPER_CMP(cmpnge, FPU_CMPS, !FPU_GE)
+SSE_HELPER_CMP(cmpngt, FPU_CMPS, !FPU_GT)
+SSE_HELPER_CMP(cmpfalse, FPU_CMPQ,  FPU_FALSE)
+SSE_HELPER_CMP(cmpnequ, FPU_CMPQ, !FPU_EQU)
+SSE_HELPER_CMP(cmpge, FPU_CMPS, FPU_GE)
+SSE_HELPER_CMP(cmpgt, FPU_CMPS, FPU_GT)
+SSE_HELPER_CMP(cmptrue, FPU_CMPQ,  !FPU_FALSE)
+
+SSE_HELPER_CMP(cmpeqs, FPU_CMPS, FPU_EQ)
+SSE_HELPER_CMP(cmpltq, FPU_CMPQ, FPU_LT)
+SSE_HELPER_CMP(cmpleq, FPU_CMPQ, FPU_LE)
+SSE_HELPER_CMP(cmpunords, FPU_CMPS,  FPU_UNORD)
+SSE_HELPER_CMP(cmpneqq, FPU_CMPS, !FPU_EQ)
+SSE_HELPER_CMP(cmpnltq, FPU_CMPQ, !FPU_LT)
+SSE_HELPER_CMP(cmpnleq, FPU_CMPQ, !FPU_LE)
+SSE_HELPER_CMP(cmpords, FPU_CMPS, !FPU_UNORD)
+
+SSE_HELPER_CMP(cmpequs, FPU_CMPS, FPU_EQU)
+SSE_HELPER_CMP(cmpngeq, FPU_CMPQ, !FPU_GE)
+SSE_HELPER_CMP(cmpngtq, FPU_CMPQ, !FPU_GT)
+SSE_HELPER_CMP(cmpfalses, FPU_CMPS,  FPU_FALSE)
+SSE_HELPER_CMP(cmpnequs, FPU_CMPS, !FPU_EQU)
+SSE_HELPER_CMP(cmpgeq, FPU_CMPQ, FPU_GE)
+SSE_HELPER_CMP(cmpgtq, FPU_CMPQ, FPU_GT)
+SSE_HELPER_CMP(cmptrues, FPU_CMPS,  !FPU_FALSE)
+
+#undef SSE_HELPER_CMP
 
 #if SHIFT == 1
 static const int comis_eflags[4] = {CC_C, CC_Z, 0, CC_Z | CC_P | CC_C};
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index 63b63eb532..793e581224 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -229,16 +229,43 @@ SSE_HELPER_P4(hadd)
 SSE_HELPER_P4(hsub)
 SSE_HELPER_P4(addsub)
 
-#define SSE_HELPER_CMP(name, F) SSE_HELPER_S4(name)
-
-SSE_HELPER_CMP(cmpeq, FPU_CMPEQ)
-SSE_HELPER_CMP(cmplt, FPU_CMPLT)
-SSE_HELPER_CMP(cmple, FPU_CMPLE)
-SSE_HELPER_CMP(cmpunord, FPU_CMPUNORD)
-SSE_HELPER_CMP(cmpneq, FPU_CMPNEQ)
-SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT)
-SSE_HELPER_CMP(cmpnle, FPU_CMPNLE)
-SSE_HELPER_CMP(cmpord, FPU_CMPORD)
+#define SSE_HELPER_CMP(name, F, C) SSE_HELPER_S4(name)
+
+SSE_HELPER_CMP(cmpeq, FPU_CMPQ, FPU_EQ)
+SSE_HELPER_CMP(cmplt, FPU_CMPS, FPU_LT)
+SSE_HELPER_CMP(cmple, FPU_CMPS, FPU_LE)
+SSE_HELPER_CMP(cmpunord, FPU_CMPQ,  FPU_UNORD)
+SSE_HELPER_CMP(cmpneq, FPU_CMPQ, !FPU_EQ)
+SSE_HELPER_CMP(cmpnlt, FPU_CMPS, !FPU_LT)
+SSE_HELPER_CMP(cmpnle, FPU_CMPS, !FPU_LE)
+SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD)
+
+SSE_HELPER_CMP(cmpequ, FPU_CMPQ, FPU_EQU)
+SSE_HELPER_CMP(cmpnge, FPU_CMPS, !FPU_GE)
+SSE_HELPER_CMP(cmpngt, FPU_CMPS, !FPU_GT)
+SSE_HELPER_CMP(cmpfalse, FPU_CMPQ,  FPU_FALSE)
+SSE_HELPER_CMP(cmpnequ, FPU_CMPQ, !FPU_EQU)
+SSE_HELPER_CMP(cmpge, FPU_CMPS, FPU_GE)
+SSE_HELPER_CMP(cmpgt, FPU_CMPS, FPU_GT)
+SSE_HELPER_CMP(cmptrue, FPU_CMPQ,  !FPU_FALSE)
+
+SSE_HELPER_CMP(cmpeqs, FPU_CMPS, FPU_EQ)
+SSE_HELPER_CMP(cmpltq, FPU_CMPQ, FPU_LT)
+SSE_HELPER_CMP(cmpleq, FPU_CMPQ, FPU_LE)
+SSE_HELPER_CMP(cmpunords, FPU_CMPS,  FPU_UNORD)
+SSE_HELPER_CMP(cmpneqq, FPU_CMPS, !FPU_EQ)
+SSE_HELPER_CMP(cmpnltq, FPU_CMPQ, !FPU_LT)
+SSE_HELPER_CMP(cmpnleq, FPU_CMPQ, !FPU_LE)
+SSE_HELPER_CMP(cmpords, FPU_CMPS, !FPU_UNORD)
+
+SSE_HELPER_CMP(cmpequs, FPU_CMPS, FPU_EQU)
+SSE_HELPER_CMP(cmpngeq, FPU_CMPQ, !FPU_GE)
+SSE_HELPER_CMP(cmpngtq, FPU_CMPQ, !FPU_GT)
+SSE_HELPER_CMP(cmpfalses, FPU_CMPS,  FPU_FALSE)
+SSE_HELPER_CMP(cmpnequs, FPU_CMPS, !FPU_EQU)
+SSE_HELPER_CMP(cmpgeq, FPU_CMPQ, FPU_GE)
+SSE_HELPER_CMP(cmpgtq, FPU_CMPQ, FPU_GT)
+SSE_HELPER_CMP(cmptrues, FPU_CMPS,  !FPU_FALSE)
 
 #if SHIFT == 1
 DEF_HELPER_3(ucomiss, void, env, Reg, Reg)
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 63b32a77e3..64f026c0af 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3021,20 +3021,47 @@ static const SSEFunc_l_ep sse_op_table3bq[] = {
 };
 #endif
 
-#define SSE_FOP(x) { \
+#define SSE_CMP(x) { \
     gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \
     gen_helper_ ## x ## ss, gen_helper_ ## x ## sd}
-static const SSEFunc_0_epp sse_op_table4[8][4] = {
-    SSE_FOP(cmpeq),
-    SSE_FOP(cmplt),
-    SSE_FOP(cmple),
-    SSE_FOP(cmpunord),
-    SSE_FOP(cmpneq),
-    SSE_FOP(cmpnlt),
-    SSE_FOP(cmpnle),
-    SSE_FOP(cmpord),
+static const SSEFunc_0_epp sse_op_table4[32][4] = {
+    SSE_CMP(cmpeq),
+    SSE_CMP(cmplt),
+    SSE_CMP(cmple),
+    SSE_CMP(cmpunord),
+    SSE_CMP(cmpneq),
+    SSE_CMP(cmpnlt),
+    SSE_CMP(cmpnle),
+    SSE_CMP(cmpord),
+
+    SSE_CMP(cmpequ),
+    SSE_CMP(cmpnge),
+    SSE_CMP(cmpngt),
+    SSE_CMP(cmpfalse),
+    SSE_CMP(cmpnequ),
+    SSE_CMP(cmpge),
+    SSE_CMP(cmpgt),
+    SSE_CMP(cmptrue),
+
+    SSE_CMP(cmpeqs),
+    SSE_CMP(cmpltq),
+    SSE_CMP(cmpleq),
+    SSE_CMP(cmpunords),
+    SSE_CMP(cmpneqq),
+    SSE_CMP(cmpnltq),
+    SSE_CMP(cmpnleq),
+    SSE_CMP(cmpords),
+
+    SSE_CMP(cmpequs),
+    SSE_CMP(cmpngeq),
+    SSE_CMP(cmpngtq),
+    SSE_CMP(cmpfalses),
+    SSE_CMP(cmpnequs),
+    SSE_CMP(cmpgeq),
+    SSE_CMP(cmpgtq),
+    SSE_CMP(cmptrues),
 };
-#undef SSE_FOP
+#undef SSE_CMP
 
 static const SSEFunc_0_epp sse_op_table5[256] = {
     [0x0c] = gen_helper_pi2fw,
-- 
2.36.0




* [PATCH v2 24/42] i386: Move 3DNOW decoder
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (26 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 23/42] i386: AVX comparison helpers Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 25/42] i386: VEX.V encodings (3 operand) Paul Brook
                   ` (17 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Handle 3DNOW instructions early to avoid complicating the AVX logic.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/tcg/translate.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 64f026c0af..6c40df61d4 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3297,6 +3297,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             is_xmm = 1;
         }
     }
+    if (sse_op.flags & SSE_OPF_3DNOW) {
+        if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) {
+            goto illegal_op;
+        }
+    }
     /* simple MMX/SSE operation */
     if (s->flags & HF_TS_MASK) {
         gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
@@ -4761,21 +4766,20 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 rm = (modrm & 7);
                 op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
             }
+            if (sse_op.flags & SSE_OPF_3DNOW) {
+                /* 3DNow! data insns */
+                val = x86_ldub_code(env, s);
+                SSEFunc_0_epp op_3dnow = sse_op_table5[val];
+                if (!op_3dnow) {
+                    goto unknown_op;
+                }
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+                op_3dnow(cpu_env, s->ptr0, s->ptr1);
+                return;
+            }
         }
         switch(b) {
-        case 0x0f: /* 3DNow! data insns */
-            val = x86_ldub_code(env, s);
-            sse_fn_epp = sse_op_table5[val];
-            if (!sse_fn_epp) {
-                goto unknown_op;
-            }
-            if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) {
-                goto illegal_op;
-            }
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
-            break;
         case 0x70: /* pshufx insn */
         case 0xc6: /* pshufx insn */
             val = x86_ldub_code(env, s);
-- 
2.36.0




* [PATCH v2 25/42] i386: VEX.V encodings (3 operand)
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (27 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 24/42] i386: Move 3DNOW decoder Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 26/42] i386: Utility function for 128 bit AVX Paul Brook
                   ` (16 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Enable translation of VEX encoded AVX instructions.

The big change is the addition of an extra register operand in the VEX.V
field.  This is usually (but not always!) used to explicitly encode the
first source operand.
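
For readers unfamiliar with the encoding, here is a minimal sketch of pulling
that field out of the last VEX payload byte, where it sits in bits 6:3 in
inverted form, next to the L (vector length) bit. The function names are
hypothetical and this is not the decoder in translate.c.

```c
#include <stdint.h>

/*
 * The final VEX payload byte is laid out as [R-or-W | vvvv | L | pp].
 * vvvv is stored inverted, so flip the four bits after extracting them.
 */
static unsigned vex_vvvv(uint8_t last_vex_byte)
{
    return ((last_vex_byte >> 3) & 0xf) ^ 0xf;
}

/* L selects the vector length: 0 for 128-bit, 1 for 256-bit. */
static unsigned vex_l(uint8_t last_vex_byte)
{
    return (last_vex_byte >> 2) & 1;
}
```

For example, vaddps xmm0, xmm1, xmm2 assembles to c5 f0 58 c2; the payload
byte 0xf0 yields vvvv = 1 (xmm1) and L = 0.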

The changes to ops_sse.h and ops_sse_header.h are purely mechanical, with
previous changes ensuring that the relevant helper functions are ready to
handle the non-destructive source operand.

We now have a greater variety of operand patterns for the vector helper
functions. The SSE_OPF_* flags we added to the opcode lookup tables are used
to select between these. This includes e.g. pshufX and cmpX instructions
which were previously overridden by opcode.

One gotcha is the "scalar" vector instructions. The SSE encodings write a
single element to the destination and leave the remainder of the register
unchanged.  The VEX encodings instead copy the remainder of the destination
from the first source operand. If the operation only has a single source
value, then VEX.V encodes an additional operand which is copied to the
remainder of the destination.
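
The difference can be sketched on a toy 128-bit register model. Vec128 and
the three functions below are hypothetical and only illustrate the merge
rule; they are not the helpers in ops_sse.h, and the unary case takes a
caller-supplied operation purely for demonstration.

```c
/* Hypothetical model of a 128-bit register as four floats. */
typedef struct {
    float s[4];
} Vec128;

/* SSE form (addss d, s): element 0 is written, the rest of d is kept. */
static void sse_addss(Vec128 *d, const Vec128 *s)
{
    d->s[0] = d->s[0] + s->s[0];
}

/* VEX form (vaddss d, v, s): the rest of d is copied from v. */
static void vex_addss(Vec128 *d, const Vec128 *v, const Vec128 *s)
{
    float r = v->s[0] + s->s[0];

    *d = *v;        /* whole-register copy; safe when d aliases v */
    d->s[0] = r;
}

/* Single-source scalar op: v contributes only the upper elements. */
static void vex_unary_ss(Vec128 *d, const Vec128 *v, const Vec128 *s,
                         float (*op)(float))
{
    float r = op(s->s[0]);

    *d = *v;
    d->s[0] = r;
}

/* Demonstration operation for the unary case. */
static float halve(float x)
{
    return x * 0.5f;
}
```

Computing the result before the copy keeps the sketch correct when the
destination aliases either source, which is the situation the legacy
two-operand encodings always create.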

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h        | 214 +++++++++----------
 target/i386/ops_sse_header.h | 149 ++++++-------
 target/i386/tcg/translate.c  | 399 +++++++++++++++++++++++++----------
 3 files changed, 463 insertions(+), 299 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index e48dfc2fc5..ad3312d353 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -97,9 +97,8 @@
 #define FPSLL(x, c) ((x) << shift)
 #endif
 
-void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-    Reg *s = d;
     int shift;
     if (c->Q(0) > 15) {
         d->Q(0) = 0;
@@ -114,9 +113,8 @@ void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
     }
 }
 
-void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-    Reg *s = d;
     int shift;
     if (c->Q(0) > 15) {
         d->Q(0) = 0;
@@ -131,9 +129,8 @@ void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
     }
 }
 
-void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-    Reg *s = d;
     int shift;
     if (c->Q(0) > 15) {
         shift = 15;
@@ -143,9 +140,8 @@ void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
     SHIFT_HELPER_BODY(4 << SHIFT, W, FPSRAW);
 }
 
-void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-    Reg *s = d;
     int shift;
     if (c->Q(0) > 31) {
         d->Q(0) = 0;
@@ -160,9 +156,8 @@ void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
     }
 }
 
-void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-    Reg *s = d;
     int shift;
     if (c->Q(0) > 31) {
         d->Q(0) = 0;
@@ -177,9 +172,8 @@ void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
     }
 }
 
-void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-    Reg *s = d;
     int shift;
     if (c->Q(0) > 31) {
         shift = 31;
@@ -189,9 +183,8 @@ void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
     SHIFT_HELPER_BODY(2 << SHIFT, L, FPSRAL);
 }
 
-void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-    Reg *s = d;
     int shift;
     if (c->Q(0) > 63) {
         d->Q(0) = 0;
@@ -206,9 +199,8 @@ void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
     }
 }
 
-void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-    Reg *s = d;
     int shift;
     if (c->Q(0) > 63) {
         d->Q(0) = 0;
@@ -224,9 +216,8 @@ void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 }
 
 #if SHIFT >= 1
-void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-    Reg *s = d;
     int shift, i;
 
     shift = c->L(0);
@@ -249,9 +240,8 @@ void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 #endif
 }
 
-void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-    Reg *s = d;
     int shift, i;
 
     shift = c->L(0);
@@ -321,9 +311,8 @@ void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
     }
 
 #define SSE_HELPER_B(name, F)                                   \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
     {                                                           \
-        Reg *v = d;                                             \
         d->B(0) = F(v->B(0), s->B(0));                          \
         d->B(1) = F(v->B(1), s->B(1));                          \
         d->B(2) = F(v->B(2), s->B(2));                          \
@@ -363,9 +352,8 @@ void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
             }
 
 #define SSE_HELPER_W(name, F)                                   \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
     {                                                           \
-        Reg *v = d;                                             \
         d->W(0) = F(v->W(0), s->W(0));                          \
         d->W(1) = F(v->W(1), s->W(1));                          \
         d->W(2) = F(v->W(2), s->W(2));                          \
@@ -389,9 +377,8 @@ void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
             }
 
 #define SSE_HELPER_L(name, F)                                   \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
     {                                                           \
-        Reg *v = d;                                             \
         d->L(0) = F(v->L(0), s->L(0));                          \
         d->L(1) = F(v->L(1), s->L(1));                          \
         XMM_ONLY(                                               \
@@ -407,9 +394,8 @@ void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
             }
 
 #define SSE_HELPER_Q(name, F)                                   \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
     {                                                           \
-        Reg *v = d;                                             \
         d->Q(0) = F(v->Q(0), s->Q(0));                          \
         XMM_ONLY(                                               \
                  d->Q(1) = F(v->Q(1), s->Q(1));                 \
@@ -555,9 +541,8 @@ void glue(helper_pmulhrw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 SSE_HELPER_B(helper_pavgb, FAVG)
 SSE_HELPER_W(helper_pavgw, FAVG)
 
-void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     d->Q(0) = (uint64_t)s->L(0) * (uint64_t)v->L(0);
 #if SHIFT >= 1
     d->Q(1) = (uint64_t)s->L(2) * (uint64_t)v->L(2);
@@ -568,9 +553,8 @@ void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #endif
 }
 
-void glue(helper_pmaddwd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pmaddwd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     int i;
 
     for (i = 0; i < (2 << SHIFT); i++) {
@@ -589,10 +573,8 @@ static inline int abs1(int a)
     }
 }
 #endif
-
-void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     unsigned int val;
 
     val = 0;
@@ -701,9 +683,8 @@ void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int order)
     SHUFFLE4(W, s, s, 0);
 }
 #else
-void glue(helper_shufps, SUFFIX)(Reg *d, Reg *s, int order)
+void glue(helper_shufps, SUFFIX)(Reg *d, Reg *v, Reg *s, int order)
 {
-    Reg *v = d;
     uint32_t r0, r1, r2, r3;
 
     SHUFFLE4(L, v, s, 0);
@@ -712,9 +693,8 @@ void glue(helper_shufps, SUFFIX)(Reg *d, Reg *s, int order)
 #endif
 }
 
-void glue(helper_shufpd, SUFFIX)(Reg *d, Reg *s, int order)
+void glue(helper_shufpd, SUFFIX)(Reg *d, Reg *v, Reg *s, int order)
 {
-    Reg *v = d;
     uint64_t r0, r1;
 
     r0 = v->Q(order & 1);
@@ -770,9 +750,8 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int order)
 
 #define SSE_HELPER_P(name, F)                                           \
     void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env,          \
-            Reg *d, Reg *s)                                     \
+            Reg *d, Reg *v, Reg *s)                                     \
     {                                                                   \
-        Reg *v = d;                                                     \
         d->ZMM_S(0) = F(32, v->ZMM_S(0), s->ZMM_S(0));                  \
         d->ZMM_S(1) = F(32, v->ZMM_S(1), s->ZMM_S(1));                  \
         d->ZMM_S(2) = F(32, v->ZMM_S(2), s->ZMM_S(2));                  \
@@ -786,9 +765,8 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int order)
     }                                                                   \
                                                                         \
     void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env,          \
-            Reg *d, Reg *s)                                     \
+            Reg *d, Reg *v, Reg *s)                                     \
     {                                                                   \
-        Reg *v = d;                                                     \
         d->ZMM_D(0) = F(64, v->ZMM_D(0), s->ZMM_D(0));                  \
         d->ZMM_D(1) = F(64, v->ZMM_D(1), s->ZMM_D(1));                  \
         YMM_ONLY(                                                       \
@@ -802,15 +780,13 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int order)
 #define SSE_HELPER_S(name, F)                                           \
     SSE_HELPER_P(name, F)                                               \
                                                                         \
-    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s)\
+    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *v, Reg *s)\
     {                                                                   \
-        Reg *v = d;                                                     \
         d->ZMM_S(0) = F(32, v->ZMM_S(0), s->ZMM_S(0));                  \
     }                                                                   \
                                                                         \
-    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s)\
+    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *v, Reg *s)\
     {                                                                   \
-        Reg *v = d;                                                     \
         d->ZMM_D(0) = F(64, v->ZMM_D(0), s->ZMM_D(0));                  \
     }
 
@@ -1284,9 +1260,8 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, int index, int length)
 }
 #endif
 
-void glue(helper_haddps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_haddps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     float32 r0, r1, r2, r3;
 
     r0 = float32_add(v->ZMM_S(0), v->ZMM_S(1), &env->sse_status);
@@ -1309,9 +1284,8 @@ void glue(helper_haddps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #endif
 }
 
-void glue(helper_haddpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_haddpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     float64 r0, r1;
 
     r0 = float64_add(v->ZMM_D(0), v->ZMM_D(1), &env->sse_status);
@@ -1326,9 +1300,8 @@ void glue(helper_haddpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #endif
 }
 
-void glue(helper_hsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_hsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     float32 r0, r1, r2, r3;
 
     r0 = float32_sub(v->ZMM_S(0), v->ZMM_S(1), &env->sse_status);
@@ -1351,9 +1324,8 @@ void glue(helper_hsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #endif
 }
 
-void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     float64 r0, r1;
 
     r0 = float64_sub(v->ZMM_D(0), v->ZMM_D(1), &env->sse_status);
@@ -1368,9 +1340,8 @@ void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #endif
 }
 
-void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     d->ZMM_S(0) = float32_sub(v->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
     d->ZMM_S(1) = float32_add(v->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
     d->ZMM_S(2) = float32_sub(v->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
@@ -1383,9 +1354,8 @@ void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #endif
 }
 
-void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     d->ZMM_D(0) = float64_sub(v->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
     d->ZMM_D(1) = float64_add(v->ZMM_D(1), s->ZMM_D(1), &env->sse_status);
 #if SHIFT == 2
@@ -1396,9 +1366,8 @@ void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 
 #define SSE_HELPER_CMP_P(name, F, C)                                    \
     void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env,          \
-                                             Reg *d, Reg *s)    \
+                                             Reg *d, Reg *v, Reg *s)    \
     {                                                                   \
-        Reg *v = d;                                                     \
         d->ZMM_L(0) = F(32, C, v->ZMM_S(0), s->ZMM_S(0));               \
         d->ZMM_L(1) = F(32, C, v->ZMM_S(1), s->ZMM_S(1));               \
         d->ZMM_L(2) = F(32, C, v->ZMM_S(2), s->ZMM_S(2));               \
@@ -1412,9 +1381,8 @@ void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     }                                                                   \
                                                                         \
     void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env,          \
-                                             Reg *d, Reg *s)    \
+                                             Reg *d, Reg *v, Reg *s)    \
     {                                                                   \
-        Reg *v = d;                                                     \
         d->ZMM_Q(0) = F(64, C, v->ZMM_D(0), s->ZMM_D(0));               \
         d->ZMM_Q(1) = F(64, C, v->ZMM_D(1), s->ZMM_D(1));               \
         YMM_ONLY(                                                       \
@@ -1426,15 +1394,13 @@ void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #if SHIFT == 1
 #define SSE_HELPER_CMP(name, F, C)                                          \
     SSE_HELPER_CMP_P(name, F, C)                                            \
-    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s)    \
+    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *v, Reg *s)    \
     {                                                                       \
-        Reg *v = d;                                                         \
         d->ZMM_L(0) = F(32, C, v->ZMM_S(0), s->ZMM_S(0));                   \
     }                                                                       \
                                                                             \
-    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s)    \
+    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *v, Reg *s)    \
     {                                                                       \
-        Reg *v = d;                                                         \
         d->ZMM_Q(0) = F(64, C, v->ZMM_D(0), s->ZMM_D(0));                   \
     }
 
@@ -1633,9 +1599,44 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86State *env, Reg *s)
 #define PACK_WIDTH 8
 #endif
 
-void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+#define PACK4(F, to, reg, from) do {        \
+    r[to + 0] = F((int16_t)reg->W(from + 0));   \
+    r[to + 1] = F((int16_t)reg->W(from + 1));   \
+    r[to + 2] = F((int16_t)reg->W(from + 2));   \
+    r[to + 3] = F((int16_t)reg->W(from + 3));   \
+    } while (0)
+
+#define PACK_HELPER_B(name, F) \
+void glue(helper_pack ## name, SUFFIX)(CPUX86State *env, \
+        Reg *d, Reg *v, Reg *s)                 \
+{                                               \
+    uint8_t r[PACK_WIDTH * 2];                  \
+    int i;                                      \
+    PACK4(F, 0, v, 0);                          \
+    PACK4(F, PACK_WIDTH, s, 0);                 \
+    XMM_ONLY(                                   \
+        PACK4(F, 4, v, 4);                      \
+        PACK4(F, 12, s, 4);                     \
+        )                                       \
+    for (i = 0; i < PACK_WIDTH * 2; i++) {      \
+        d->B(i) = r[i];                         \
+    }                                           \
+    YMM_ONLY(                                   \
+        PACK4(F, 0, v, 8);                      \
+        PACK4(F, 4, v, 12);                     \
+        PACK4(F, 8, s, 8);                      \
+        PACK4(F, 12, s, 12);                    \
+        for (i = 0; i < 16; i++) {              \
+            d->B(i + 16) = r[i];                \
+        }                                       \
+        )                                       \
+}
+
+PACK_HELPER_B(sswb, satsb)
+PACK_HELPER_B(uswb, satub)
+
+void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     uint16_t r[PACK_WIDTH];
     int i;
 
@@ -1670,9 +1671,8 @@ void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #define UNPCK_OP(base_name, base)                                       \
                                                                         \
     void glue(helper_punpck ## base_name ## bw, SUFFIX)(CPUX86State *env,\
-                                                Reg *d, Reg *s) \
+                                                Reg *d, Reg *v, Reg *s) \
     {                                                                   \
-        Reg *v = d;                                                     \
         uint8_t r[PACK_WIDTH * 2];                                      \
         int i;                                                          \
                                                                         \
@@ -1721,9 +1721,8 @@ void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     }                                                                   \
                                                                         \
     void glue(helper_punpck ## base_name ## wd, SUFFIX)(CPUX86State *env,\
-                                                Reg *d, Reg *s) \
+                                                Reg *d, Reg *v, Reg *s) \
     {                                                                   \
-        Reg *v = d;                                                     \
         uint16_t r[PACK_WIDTH];                                         \
         int i;                                                          \
                                                                         \
@@ -1756,9 +1755,8 @@ void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     }                                                                   \
                                                                         \
     void glue(helper_punpck ## base_name ## dq, SUFFIX)(CPUX86State *env,\
-                                                Reg *d, Reg *s) \
+                                                Reg *d, Reg *v, Reg *s) \
     {                                                                   \
-        Reg *v = d;                                                     \
         uint32_t r[4];                                                  \
                                                                         \
         r[0] = v->L((base * (PACK_WIDTH / 4)) + 0);                     \
@@ -1785,9 +1783,8 @@ void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
                                                                         \
     XMM_ONLY(                                                           \
              void glue(helper_punpck ## base_name ## qdq, SUFFIX)(      \
-                        CPUX86State *env, Reg *d, Reg *s)       \
+                        CPUX86State *env, Reg *d, Reg *v, Reg *s)       \
              {                                                          \
-                 Reg *v = d;                                            \
                  uint64_t r[2];                                         \
                                                                         \
                  r[0] = v->Q(base);                                     \
@@ -1961,9 +1958,8 @@ void helper_pswapd(CPUX86State *env, MMXReg *d, MMXReg *s)
 #endif
 
 /* SSSE3 op helpers */
-void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     int i;
 #if SHIFT == 0
     uint8_t r[8];
@@ -1997,9 +1993,8 @@ void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #if SHIFT == 0
 
 #define SSE_HELPER_HW(name, F)  \
-void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
 {                               \
-    Reg *v = d;                 \
     uint16_t r[4];              \
     r[0] = F(v->W(0), v->W(1)); \
     r[1] = F(v->W(2), v->W(3)); \
@@ -2012,9 +2007,8 @@ void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
 }
 
 #define SSE_HELPER_HL(name, F)  \
-void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
 {                               \
-    Reg *v = d;                 \
     uint32_t r0, r1;            \
     r0 = F(v->L(0), v->L(1));   \
     r1 = F(s->L(0), s->L(1));   \
@@ -2025,9 +2019,8 @@ void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
 #else
 
 #define SSE_HELPER_HW(name, F)  \
-void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
 {                                   \
-    Reg *v = d;                     \
     int32_t r[8];                   \
     r[0] = F(v->W(0), v->W(1));     \
     r[1] = F(v->W(2), v->W(3));     \
@@ -2066,9 +2059,8 @@ void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
 }
 
 #define SSE_HELPER_HL(name, F)  \
-void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
 {                               \
-    Reg *v = d;                 \
     int32_t r0, r1, r2, r3;     \
     r0 = F(v->L(0), v->L(1));   \
     r1 = F(v->L(2), v->L(3));   \
@@ -2101,9 +2093,8 @@ SSE_HELPER_HL(phsubd, FSUB)
 #undef SSE_HELPER_HW
 #undef SSE_HELPER_HL
 
-void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     d->W(0) = satsw((int8_t)s->B(0) * (uint8_t)v->B(0) +
                     (int8_t)s->B(1) * (uint8_t)v->B(1));
     d->W(1) = satsw((int8_t)s->B(2) * (uint8_t)v->B(2) +
@@ -2148,10 +2139,9 @@ SSE_HELPER_B(helper_psignb, FSIGNB)
 SSE_HELPER_W(helper_psignw, FSIGNW)
 SSE_HELPER_L(helper_psignd, FSIGNL)
 
-void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,
                                   int32_t shift)
 {
-    Reg *v = d;
     /* XXX could be checked during translation */
     if (shift >= (SHIFT ? 32 : 16)) {
         d->Q(0) = 0;
@@ -2224,10 +2214,9 @@ void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     } while (0)
 
 #define SSE_HELPER_V(name, elem, num, F)                                \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)   \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,   \
+                            Reg *m)                                     \
     {                                                                   \
-        Reg *v = d;                                                     \
-        Reg *m = &env->xmm_regs[0];                                     \
         BLEND_V128(elem, num, F, 0);                                    \
         YMM_ONLY(BLEND_V128(elem, num, F, num);)                        \
     }
@@ -2248,10 +2237,9 @@ void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     } while (0)
 
 #define SSE_HELPER_I(name, elem, num, F)                                \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,   \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,   \
                             uint32_t imm)                               \
     {                                                                   \
-        Reg *v = d;                                                     \
         BLEND_I128(elem, num, F, 0);                                    \
         YMM_ONLY(                                                       \
         if (num < 8)                                                    \
@@ -2320,9 +2308,8 @@ SSE_HELPER_F(helper_pmovzxwd, L, 4, s->W)
 SSE_HELPER_F(helper_pmovzxwq, Q, 2, s->W)
 SSE_HELPER_F(helper_pmovzxdq, Q, 2, s->L)
 
-void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     d->Q(0) = (int64_t)(int32_t) v->L(0) * (int32_t) s->L(0);
     d->Q(1) = (int64_t)(int32_t) v->L(2) * (int32_t) s->L(2);
 #if SHIFT == 2
@@ -2334,9 +2321,8 @@ void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #define FCMPEQQ(d, s) (d == s ? -1 : 0)
 SSE_HELPER_Q(helper_pcmpeqq, FCMPEQQ)
 
-void glue(helper_packusdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_packusdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg *v = d;
     uint16_t r[8];
 
     r[0] = satuw((int32_t) v->L(0));
@@ -2582,10 +2568,9 @@ SSE_HELPER_I(helper_blendps, L, 4, FBLENDP)
 SSE_HELPER_I(helper_blendpd, Q, 2, FBLENDP)
 SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP)
 
-void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,
                                uint32_t mask)
 {
-    Reg *v = d;
     float32 prod, iresult, iresult2;
 
     /*
@@ -2655,9 +2640,8 @@ void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 #if SHIFT == 1
 /* Oddly, there is no ymm version of dppd */
 void glue(helper_dppd, SUFFIX)(CPUX86State *env,
-                               Reg *d, Reg *s, uint32_t mask)
+                               Reg *d, Reg *v, Reg *s, uint32_t mask)
 {
-    Reg *v = d;
     float64 iresult;
 
     if (mask & (1 << 4)) {
@@ -2677,10 +2661,9 @@ void glue(helper_dppd, SUFFIX)(CPUX86State *env,
 }
 #endif
 
-void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,
                                   uint32_t offset)
 {
-    Reg *v = d;
     int s0 = (offset & 3) << 2;
     int d0 = (offset & 4) << 0;
     int i;
@@ -2965,10 +2948,9 @@ static void clmulq(uint64_t *dest_l, uint64_t *dest_h,
 }
 #endif
 
-void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,
                                     uint32_t ctrl)
 {
-    Reg *v = d;
     uint64_t a, b;
 
     a = v->Q((ctrl & 1) != 0);
@@ -2981,10 +2963,10 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 #endif
 }
 
-void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg st = *d; // v
+    Reg st = *v;
     Reg rk = *s;
 
     for (i = 0 ; i < 4 ; i++) {
@@ -3004,10 +2986,10 @@ void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #endif
 }
 
-void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg st = *d; // v
+    Reg st = *v;
     Reg rk = *s;
 
     for (i = 0; i < 16; i++) {
@@ -3020,10 +3002,10 @@ void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #endif
 }
 
-void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg st = *d; // v
+    Reg st = *v;
     Reg rk = *s;
 
     for (i = 0 ; i < 4 ; i++) {
@@ -3043,10 +3025,10 @@ void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #endif
 }
 
-void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg st = *d; // v
+    Reg st = *v;
     Reg rk = *s;
 
     for (i = 0; i < 16; i++) {
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index 793e581224..cfcfba154b 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -38,31 +38,31 @@
 #define dh_typecode_ZMMReg dh_typecode_ptr
 #define dh_typecode_MMXReg dh_typecode_ptr
 
-DEF_HELPER_3(glue(psrlw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psraw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psllw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psrld, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psrad, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pslld, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psrlq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psllq, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(psrlw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psraw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psllw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psrld, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psrad, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pslld, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psrlq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psllq, SUFFIX), void, env, Reg, Reg, Reg)
 
 #if SHIFT >= 1
-DEF_HELPER_3(glue(psrldq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pslldq, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(psrldq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pslldq, SUFFIX), void, env, Reg, Reg, Reg)
 #endif
 
 #define SSE_HELPER_B(name, F)\
-    DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg)
+    DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg)
 
 #define SSE_HELPER_W(name, F)\
-    DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg)
+    DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg)
 
 #define SSE_HELPER_L(name, F)\
-    DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg)
+    DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg)
 
 #define SSE_HELPER_Q(name, F)\
-    DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg)
+    DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg)
 
 SSE_HELPER_B(paddb, FADD)
 SSE_HELPER_W(paddw, FADD)
@@ -113,10 +113,10 @@ SSE_HELPER_W(pmulhw, FMULHW)
 SSE_HELPER_B(pavgb, FAVG)
 SSE_HELPER_W(pavgw, FAVG)
 
-DEF_HELPER_3(glue(pmuludq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmaddwd, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(pmuludq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmaddwd, SUFFIX), void, env, Reg, Reg, Reg)
 
-DEF_HELPER_3(glue(psadbw, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(psadbw, SUFFIX), void, env, Reg, Reg, Reg)
 #if SHIFT < 2
 DEF_HELPER_4(glue(maskmov, SUFFIX), void, env, Reg, Reg, tl)
 #endif
@@ -138,8 +138,8 @@ DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int)
 /* XXX: not accurate */
 
 #define SSE_HELPER_P4(name)                                             \
-    DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg)         \
-    DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg)
+    DEF_HELPER_4(glue(name ## ps, SUFFIX), void, env, Reg, Reg, Reg)    \
+    DEF_HELPER_4(glue(name ## pd, SUFFIX), void, env, Reg, Reg, Reg)
 
 #define SSE_HELPER_P3(name, ...)                                        \
     DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg)         \
@@ -148,8 +148,8 @@ DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int)
 #if SHIFT == 1
 #define SSE_HELPER_S4(name)                                             \
     SSE_HELPER_P4(name)                                                 \
-    DEF_HELPER_3(name ## ss, void, env, Reg, Reg)                       \
-    DEF_HELPER_3(name ## sd, void, env, Reg, Reg)
+    DEF_HELPER_4(name ## ss, void, env, Reg, Reg, Reg)                  \
+    DEF_HELPER_4(name ## sd, void, env, Reg, Reg, Reg)
 #define SSE_HELPER_S3(name)                                             \
     SSE_HELPER_P3(name)                                                 \
     DEF_HELPER_3(name ## ss, void, env, Reg, Reg)                       \
@@ -159,8 +159,8 @@ DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int)
 #define SSE_HELPER_S3(name, ...) SSE_HELPER_P3(name)
 #endif
 
-DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int)
-DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int)
+DEF_HELPER_4(glue(shufps, SUFFIX), void, Reg, Reg, Reg, int)
+DEF_HELPER_4(glue(shufpd, SUFFIX), void, Reg, Reg, Reg, int)
 
 SSE_HELPER_S4(add)
 SSE_HELPER_S4(sub)
@@ -216,6 +216,7 @@ DEF_HELPER_2(cvttsd2sq, s64, env, ZMMReg)
 
 DEF_HELPER_3(glue(rsqrtps, SUFFIX), void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(glue(rcpps, SUFFIX), void, env, ZMMReg, ZMMReg)
+
 #if SHIFT == 1
 DEF_HELPER_3(rsqrtss, void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(rcpss, void, env, ZMMReg, ZMMReg)
@@ -279,20 +280,20 @@ DEF_HELPER_2(glue(movmskpd, SUFFIX), i32, env, Reg)
 #endif
 
 DEF_HELPER_2(glue(pmovmskb, SUFFIX), i32, env, Reg)
-DEF_HELPER_3(glue(packsswb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(packuswb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(packssdw, SUFFIX), void, env, Reg, Reg)
-#define UNPCK_OP(base_name, base)                                       \
-    DEF_HELPER_3(glue(punpck ## base_name ## bw, SUFFIX), void, env, Reg, Reg) \
-    DEF_HELPER_3(glue(punpck ## base_name ## wd, SUFFIX), void, env, Reg, Reg) \
-    DEF_HELPER_3(glue(punpck ## base_name ## dq, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(packsswb, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(packuswb, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(packssdw, SUFFIX), void, env, Reg, Reg, Reg)
+#define UNPCK_OP(name, base)                                       \
+    DEF_HELPER_4(glue(punpck ## name ## bw, SUFFIX), void, env, Reg, Reg, Reg) \
+    DEF_HELPER_4(glue(punpck ## name ## wd, SUFFIX), void, env, Reg, Reg, Reg) \
+    DEF_HELPER_4(glue(punpck ## name ## dq, SUFFIX), void, env, Reg, Reg, Reg)
 
 UNPCK_OP(l, 0)
 UNPCK_OP(h, 1)
 
 #if SHIFT >= 1
-DEF_HELPER_3(glue(punpcklqdq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(punpckhqdq, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(punpcklqdq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(punpckhqdq, SUFFIX), void, env, Reg, Reg, Reg)
 #endif
 
 /* 3DNow! float ops */
@@ -319,28 +320,28 @@ DEF_HELPER_3(pswapd, void, env, MMXReg, MMXReg)
 #endif
 
 /* SSSE3 op helpers */
-DEF_HELPER_3(glue(phaddw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(phaddd, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(phaddsw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(phsubw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(phsubd, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(phsubsw, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(phaddw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(phaddd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(phaddsw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(phsubw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(phsubd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(phsubsw, SUFFIX), void, env, Reg, Reg, Reg)
 DEF_HELPER_3(glue(pabsb, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pabsw, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pabsd, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmaddubsw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmulhrsw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pshufb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psignb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psignw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(psignd, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_4(glue(palignr, SUFFIX), void, env, Reg, Reg, s32)
+DEF_HELPER_4(glue(pmaddubsw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmulhrsw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pshufb, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psignb, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psignw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(psignd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_5(glue(palignr, SUFFIX), void, env, Reg, Reg, Reg, s32)
 
 /* SSE4.1 op helpers */
 #if SHIFT >= 1
-DEF_HELPER_3(glue(pblendvb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(blendvps, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(blendvpd, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_5(glue(pblendvb, SUFFIX), void, env, Reg, Reg, Reg, Reg)
+DEF_HELPER_5(glue(blendvps, SUFFIX), void, env, Reg, Reg, Reg, Reg)
+DEF_HELPER_5(glue(blendvpd, SUFFIX), void, env, Reg, Reg, Reg, Reg)
 DEF_HELPER_3(glue(ptest, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmovsxbw, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmovsxbd, SUFFIX), void, env, Reg, Reg)
@@ -354,40 +355,40 @@ DEF_HELPER_3(glue(pmovzxbq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmovzxwd, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmovzxwq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmovzxdq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmuldq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pcmpeqq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(packusdw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pminsb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pminsd, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pminuw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pminud, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmaxsb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmaxsd, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmaxuw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmaxud, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmulld, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(pmuldq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pcmpeqq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(packusdw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pminsb, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pminsd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pminuw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pminud, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmaxsb, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmaxsd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmaxuw, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmaxud, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(pmulld, SUFFIX), void, env, Reg, Reg, Reg)
 #if SHIFT == 1
 DEF_HELPER_3(glue(phminposuw, SUFFIX), void, env, Reg, Reg)
 #endif
 DEF_HELPER_4(glue(roundps, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_4(glue(roundpd, SUFFIX), void, env, Reg, Reg, i32)
 #if SHIFT == 1
-DEF_HELPER_4(glue(roundss, SUFFIX), void, env, Reg, Reg, i32)
-DEF_HELPER_4(glue(roundsd, SUFFIX), void, env, Reg, Reg, i32)
+DEF_HELPER_4(roundss_xmm, void, env, Reg, Reg, i32)
+DEF_HELPER_4(roundsd_xmm, void, env, Reg, Reg, i32)
 #endif
-DEF_HELPER_4(glue(blendps, SUFFIX), void, env, Reg, Reg, i32)
-DEF_HELPER_4(glue(blendpd, SUFFIX), void, env, Reg, Reg, i32)
-DEF_HELPER_4(glue(pblendw, SUFFIX), void, env, Reg, Reg, i32)
-DEF_HELPER_4(glue(dpps, SUFFIX), void, env, Reg, Reg, i32)
+DEF_HELPER_5(glue(blendps, SUFFIX), void, env, Reg, Reg, Reg, i32)
+DEF_HELPER_5(glue(blendpd, SUFFIX), void, env, Reg, Reg, Reg, i32)
+DEF_HELPER_5(glue(pblendw, SUFFIX), void, env, Reg, Reg, Reg, i32)
+DEF_HELPER_5(glue(dpps, SUFFIX), void, env, Reg, Reg, Reg, i32)
 #if SHIFT == 1
-DEF_HELPER_4(glue(dppd, SUFFIX), void, env, Reg, Reg, i32)
+DEF_HELPER_5(glue(dppd, SUFFIX), void, env, Reg, Reg, Reg, i32)
 #endif
-DEF_HELPER_4(glue(mpsadbw, SUFFIX), void, env, Reg, Reg, i32)
+DEF_HELPER_5(glue(mpsadbw, SUFFIX), void, env, Reg, Reg, Reg, i32)
 #endif
 
 /* SSE4.2 op helpers */
 #if SHIFT >= 1
-DEF_HELPER_3(glue(pcmpgtq, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(pcmpgtq, SUFFIX), void, env, Reg, Reg, Reg)
 #endif
 #if SHIFT == 1
 DEF_HELPER_4(glue(pcmpestri, SUFFIX), void, env, Reg, Reg, i32)
@@ -399,15 +400,15 @@ DEF_HELPER_3(crc32, tl, i32, tl, i32)
 
 /* AES-NI op helpers */
 #if SHIFT >= 1
-DEF_HELPER_3(glue(aesdec, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(aesdeclast, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(aesenc, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(aesenclast, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(aesdec, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(aesdeclast, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(aesenc, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(aesenclast, SUFFIX), void, env, Reg, Reg, Reg)
 #if SHIFT == 1
 DEF_HELPER_3(glue(aesimc, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_4(glue(aeskeygenassist, SUFFIX), void, env, Reg, Reg, i32)
 #endif
-DEF_HELPER_4(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, i32)
+DEF_HELPER_5(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, Reg, i32)
 #endif
 
 #undef SHIFT
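
Note on the header changes above: most helpers gain one extra Reg argument, so the destination is no longer implicitly the first source. Below is a minimal standalone sketch of that calling convention. It is not QEMU code: DemoReg and the demo_* functions are hypothetical names invented for illustration, and the lane-wise add stands in for any of the converted binary helpers. The legacy SSE path passes the destination twice, as the translate.c changes do with s->ptr0, while the VEX path takes the first source from vex.v.

```c
#include <stdint.h>

/* Hypothetical 128-bit register with four 32-bit lanes (stand-in for Reg). */
typedef struct {
    uint32_t l[4];
} DemoReg;

/* Three-operand helper: d = v + s, lane by lane. d may alias v or s. */
static void demo_paddl(DemoReg *d, const DemoReg *v, const DemoReg *s)
{
    for (int i = 0; i < 4; i++) {
        d->l[i] = v->l[i] + s->l[i];
    }
}

/* Legacy SSE encoding: the destination doubles as the first source. */
static void demo_sse_paddl(DemoReg *regs, int reg, int rm)
{
    demo_paddl(&regs[reg], &regs[reg], &regs[rm]);
}

/* VEX encoding: the first source is selected separately by vex.v. */
static void demo_vex_paddl(DemoReg *regs, int reg, int reg_v, int rm)
{
    demo_paddl(&regs[reg], &regs[reg_v], &regs[rm]);
}
```

With this shape a single helper body can serve both encodings. The translator only has to choose which register offset to pass as the middle operand, which is what the reg_v = reg fallback for non-VEX instructions achieves.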
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 6c40df61d4..d148a2319d 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -125,6 +125,7 @@ typedef struct DisasContext {
     TCGv tmp4;
     TCGv_ptr ptr0;
     TCGv_ptr ptr1;
+    TCGv_ptr ptr2;
     TCGv_i32 tmp2_i32;
     TCGv_i32 tmp3_i32;
     TCGv_i64 tmp1_i64;
@@ -2784,11 +2785,21 @@ typedef void (*SSEFunc_l_ep)(TCGv_i64 val, TCGv_ptr env, TCGv_ptr reg);
 typedef void (*SSEFunc_0_epi)(TCGv_ptr env, TCGv_ptr reg, TCGv_i32 val);
 typedef void (*SSEFunc_0_epl)(TCGv_ptr env, TCGv_ptr reg, TCGv_i64 val);
 typedef void (*SSEFunc_0_epp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b);
+typedef void (*SSEFunc_0_eppp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
+                               TCGv_ptr reg_c);
+typedef void (*SSEFunc_0_epppp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
+                                TCGv_ptr reg_c, TCGv_ptr reg_d);
 typedef void (*SSEFunc_0_eppi)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
                                TCGv_i32 val);
+typedef void (*SSEFunc_0_epppi)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
+                                TCGv_ptr reg_c, TCGv_i32 val);
 typedef void (*SSEFunc_0_ppi)(TCGv_ptr reg_a, TCGv_ptr reg_b, TCGv_i32 val);
+typedef void (*SSEFunc_0_pppi)(TCGv_ptr reg_a, TCGv_ptr reg_b, TCGv_ptr reg_c,
+                               TCGv_i32 val);
 typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
                                TCGv val);
+typedef void (*SSEFunc_0_epppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
+                                TCGv_ptr reg_c, TCGv val);
 
 #define SSE_OPF_V0        (1 << 0) /* vex.v must be 1111b (only 2 operands) */
 #define SSE_OPF_CMP       (1 << 1) /* does not write for first operand */
@@ -2801,7 +2812,7 @@ typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
 #define SSE_OPF_SHUF      (1 << 9) /* pshufx/shufpx */
 
 #define OP(op, flags, a, b, c, d)       \
-    {flags, {a, b, c, d} }
+    {flags, {{.op = a}, {.op = b}, {.op = c}, {.op = d} } }
 
 #define MMX_OP(x) OP(op2, SSE_OPF_MMX, \
         gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL)
@@ -2814,7 +2825,13 @@ typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
 
 struct SSEOpHelper_table1 {
     int flags;
-    SSEFunc_0_epp op[4];
+    union {
+        SSEFunc_0_epp op1;
+        SSEFunc_0_ppi op1i;
+        SSEFunc_0_eppt op1t;
+        SSEFunc_0_eppp op2;
+        SSEFunc_0_pppi op2i;
+    } fn[4];
 };
 
 #define SSE_3DNOW { SSE_OPF_3DNOW }
@@ -2870,8 +2887,7 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     [0x5f] = SSE_FOP(max),
 
     [0xc2] = SSE_FOP(cmpeq), /* sse_op_table4 */
-    [0xc6] = OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps_xmm,
-                (SSEFunc_0_epp)gen_helper_shufpd_xmm, NULL, NULL),
+    [0xc6] = SSE_OP(shufps, shufpd, op2i, SSE_OPF_SHUF),
 
     /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX.  */
     [0x38] = SSE_SPECIAL,
@@ -2897,10 +2913,8 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     [0x6e] = SSE_SPECIAL, /* movd mm, ea */
     [0x6f] = SSE_SPECIAL, /* movq, movdqa, , movqdu */
     [0x70] = OP(op1i, SSE_OPF_SHUF | SSE_OPF_MMX | SSE_OPF_V0,
-            (SSEFunc_0_epp)gen_helper_pshufw_mmx,
-            (SSEFunc_0_epp)gen_helper_pshufd_xmm,
-            (SSEFunc_0_epp)gen_helper_pshufhw_xmm,
-            (SSEFunc_0_epp)gen_helper_pshuflw_xmm),
+            gen_helper_pshufw_mmx, gen_helper_pshufd_xmm,
+            gen_helper_pshufhw_xmm, gen_helper_pshuflw_xmm),
     [0x71] = SSE_SPECIAL, /* shiftw */
     [0x72] = SSE_SPECIAL, /* shiftd */
     [0x73] = SSE_SPECIAL, /* shiftq */
@@ -2962,8 +2976,7 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     [0xf5] = MMX_OP(pmaddwd),
     [0xf6] = MMX_OP(psadbw),
     [0xf7] = OP(op1t, SSE_OPF_MMX | SSE_OPF_V0,
-                (SSEFunc_0_epp)gen_helper_maskmov_mmx,
-                (SSEFunc_0_epp)gen_helper_maskmov_xmm, NULL, NULL),
+                gen_helper_maskmov_mmx, gen_helper_maskmov_xmm, NULL, NULL),
     [0xf8] = MMX_OP(psubb),
     [0xf9] = MMX_OP(psubw),
     [0xfa] = MMX_OP(psubl),
@@ -2980,7 +2993,7 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
 
 #define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm }
 
-static const SSEFunc_0_epp sse_op_table2[3 * 8][2] = {
+static const SSEFunc_0_eppp sse_op_table2[3 * 8][2] = {
     [0 + 2] = MMX_OP2(psrlw),
     [0 + 4] = MMX_OP2(psraw),
     [0 + 6] = MMX_OP2(psllw),
@@ -2992,6 +3005,7 @@ static const SSEFunc_0_epp sse_op_table2[3 * 8][2] = {
     [16 + 6] = MMX_OP2(psllq),
     [16 + 7] = { NULL, gen_helper_pslldq_xmm },
 };
+#undef MMX_OP2
 
 static const SSEFunc_0_epi sse_op_table3ai[] = {
     gen_helper_cvtsi2ss,
@@ -3024,7 +3038,7 @@ static const SSEFunc_l_ep sse_op_table3bq[] = {
 #define SSE_CMP(x) { \
     gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \
     gen_helper_ ## x ## ss, gen_helper_ ## x ## sd}
-static const SSEFunc_0_epp sse_op_table4[32][4] = {
+static const SSEFunc_0_eppp sse_op_table4[32][4] = {
     SSE_CMP(cmpeq),
     SSE_CMP(cmplt),
     SSE_CMP(cmple),
@@ -3063,6 +3077,11 @@ static const SSEFunc_0_epp sse_op_table4[32][4] = {
 };
 #undef SSE_CMP
 
+static void gen_helper_pavgusb(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b)
+{
+    gen_helper_pavgb_mmx(env, reg_a, reg_a, reg_b);
+}
+
 static const SSEFunc_0_epp sse_op_table5[256] = {
     [0x0c] = gen_helper_pi2fw,
     [0x0d] = gen_helper_pi2fd,
@@ -3087,17 +3106,25 @@ static const SSEFunc_0_epp sse_op_table5[256] = {
     [0xb6] = gen_helper_movq, /* pfrcpit2 */
     [0xb7] = gen_helper_pmulhrw_mmx,
     [0xbb] = gen_helper_pswapd,
-    [0xbf] = gen_helper_pavgb_mmx /* pavgusb */
+    [0xbf] = gen_helper_pavgusb,
 };
 
 struct SSEOpHelper_table6 {
-    SSEFunc_0_epp op[2];
+    union {
+        SSEFunc_0_epp op1;
+        SSEFunc_0_eppp op2;
+        SSEFunc_0_epppp op3;
+    } fn[2];
     uint32_t ext_mask;
     int flags;
 };
 
 struct SSEOpHelper_table7 {
-    SSEFunc_0_eppi op[2];
+    union {
+        SSEFunc_0_eppi op1;
+        SSEFunc_0_epppi op2;
+        SSEFunc_0_epppp op3;
+    } fn[2];
     uint32_t ext_mask;
     int flags;
 };
@@ -3105,7 +3132,8 @@ struct SSEOpHelper_table7 {
 #define gen_helper_special_xmm NULL
 
 #define OP(name, op, flags, ext, mmx_name) \
-    {{mmx_name, gen_helper_ ## name ## _xmm}, CPUID_EXT_ ## ext, flags}
+    {{{.op = mmx_name}, {.op = gen_helper_ ## name ## _xmm} }, \
+        CPUID_EXT_ ## ext, flags}
 #define BINARY_OP_MMX(name, ext) \
     OP(name, op2, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx)
 #define BINARY_OP(name, ext, flags) \
@@ -3262,14 +3290,11 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
 static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     target_ulong pc_start)
 {
-    int b1, op1_offset, op2_offset, is_xmm, val, scalar_op;
-    int modrm, mod, rm, reg;
+    int b1, op1_offset, op2_offset, v_offset, is_xmm, val, scalar_op;
+    int modrm, mod, rm, reg, reg_v;
     struct SSEOpHelper_table1 sse_op;
     struct SSEOpHelper_table6 op6;
     struct SSEOpHelper_table7 op7;
-    SSEFunc_0_epp sse_fn_epp;
-    SSEFunc_0_ppi sse_fn_ppi;
-    SSEFunc_0_eppt sse_fn_eppt;
     MemOp ot;
 
     b &= 0xff;
@@ -3282,9 +3307,8 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
     else
         b1 = 0;
     sse_op = sse_op_table1[b];
-    sse_fn_epp = sse_op.op[b1];
     if ((sse_op.flags & (SSE_OPF_SPECIAL | SSE_OPF_3DNOW)) == 0
-            && !sse_fn_epp) {
+            && !sse_op.fn[b1].op1) {
         goto unknown_op;
     }
     if ((b <= 0x5f && b >= 0x10) || b == 0xc6 || b == 0xc2) {
@@ -3345,6 +3369,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
     if (is_xmm) {
         reg |= REX_R(s);
     }
+    if (s->prefix & PREFIX_VEX) {
+        reg_v = s->vex_v;
+    } else {
+        reg_v = reg;
+    }
     mod = (modrm >> 6) & 3;
     if (sse_op.flags & SSE_OPF_SPECIAL) {
         b |= (b1 << 8);
@@ -3466,8 +3495,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             } else {
                 CHECK_AVX_128(s);
                 rm = (modrm & 7) | REX_B(s);
-                gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_L(0)));
+                tcg_gen_ld_i32(s->tmp2_i32, cpu_env,
+                               offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)));
+                if (reg != reg_v) {
+                    gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(reg_v));
+                }
+                tcg_gen_st_i32(s->tmp2_i32, cpu_env,
+                               offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)));
             }
             break;
         case 0x310: /* movsd xmm, ea */
@@ -3484,8 +3518,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             } else {
                 CHECK_AVX_128(s);
                 rm = (modrm & 7) | REX_B(s);
+                if (reg != reg_v) {
+                    gen_op_movq(s,
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)));
+                }
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0)));
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)));
             }
             break;
         case 0x012: /* movlps */
@@ -3501,6 +3540,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
                             offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(1)));
             }
+            if (reg != reg_v) {
+                gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)));
+            }
             break;
         case 0x212: /* movsldup */
             CHECK_AVX_V0(s);
@@ -3546,6 +3589,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
                             offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0)));
             }
+            if (reg != reg_v) {
+                gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(0)));
+            }
             break;
         case 0x216: /* movshdup */
             CHECK_AVX_V0(s);
@@ -3664,6 +3711,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             } else {
                 CHECK_AVX_128(s);
                 rm = (modrm & 7) | REX_B(s);
+                if (rm != reg_v) {
+                    gen_op_movo(s, ZMM_OFFSET(rm), ZMM_OFFSET(reg_v));
+                }
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)),
                             offsetof(CPUX86State,xmm_regs[reg].ZMM_L(0)));
             }
@@ -3677,6 +3727,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             } else {
                 CHECK_AVX_128(s);
                 rm = (modrm & 7) | REX_B(s);
+                if (rm != reg_v) {
+                    gen_op_movq(s,
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)));
+                }
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)),
                             offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0)));
             }
@@ -3731,21 +3786,28 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 op1_offset = offsetof(CPUX86State,mmx_t0);
             }
             assert(b1 < 2);
-            sse_fn_epp = sse_op_table2[((b - 1) & 3) * 8 +
+            SSEFunc_0_eppp fn = sse_op_table2[((b - 1) & 3) * 8 +
                                        (((modrm >> 3)) & 7)][b1];
-            if (!sse_fn_epp) {
+            if (!fn) {
                 goto unknown_op;
             }
             if (is_xmm) {
                 rm = (modrm & 7) | REX_B(s);
                 op2_offset = ZMM_OFFSET(rm);
+                if (s->prefix & PREFIX_VEX) {
+                    v_offset = ZMM_OFFSET(reg_v);
+                } else {
+                    v_offset = op2_offset;
+                }
             } else {
                 rm = (modrm & 7);
                 op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
+                v_offset = op2_offset;
             }
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op2_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op1_offset);
-            sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
+            tcg_gen_addi_ptr(s->ptr0, cpu_env, v_offset);
+            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+            tcg_gen_addi_ptr(s->ptr2, cpu_env, op1_offset);
+            fn(cpu_env, s->ptr0, s->ptr1, s->ptr2);
             break;
         case 0x050: /* movmskps */
             CHECK_AVX_V0(s);
@@ -3792,6 +3854,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             ot = mo_64_32(s->dflag);
             gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
             op1_offset = ZMM_OFFSET(reg);
+            v_offset = ZMM_OFFSET(reg_v);
+            if (op1_offset != v_offset) {
+                gen_op_movo(s, op1_offset, v_offset);
+            }
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             if (ot == MO_32) {
                 SSEFunc_0_epi sse_fn_epi = sse_op_table3ai[(b >> 8) & 1];
@@ -3881,6 +3947,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             s->rip_offset = 1;
             gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
             val = x86_ldub_code(env, s);
+            if (reg != reg_v) {
+                gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(reg_v));
+            }
             if (b1) {
                 val &= 7;
                 tcg_gen_st16_tl(s->T0, cpu_env,
@@ -3972,6 +4041,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             rm = modrm & 7;
             reg = ((modrm >> 3) & 7) | REX_R(s);
             mod = (modrm >> 6) & 3;
+            if (s->prefix & PREFIX_VEX) {
+                reg_v = s->vex_v;
+            } else {
+                reg_v = reg;
+            }
 
             assert(b1 < 2);
             op6 = sse_op_table6[b];
@@ -4041,6 +4115,27 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                         gen_ldo_env_A0(s, op2_offset);
                     }
                 }
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+                if (!op6.fn[b1].op1) {
+                    goto illegal_op;
+                }
+                if (op6.flags & SSE_OPF_V0) {
+                    op6.fn[b1].op1(cpu_env, s->ptr0, s->ptr1);
+                } else {
+                    v_offset = ZMM_OFFSET(reg_v);
+                    tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset);
+                    if (op6.flags & SSE_OPF_BLENDV) {
+                        TCGv_ptr mask = tcg_temp_new_ptr();
+                        tcg_gen_addi_ptr(mask, cpu_env, ZMM_OFFSET(0));
+                        op6.fn[b1].op3(cpu_env, s->ptr0, s->ptr2, s->ptr1,
+                                       mask);
+                        tcg_temp_free_ptr(mask);
+                    } else {
+                        SSEFunc_0_eppp fn = op6.fn[b1].op2;
+                        fn(cpu_env, s->ptr0, s->ptr2, s->ptr1);
+                    }
+                }
             } else {
                 CHECK_NO_VEX(s);
                 if ((op6.flags & SSE_OPF_MMX) == 0) {
@@ -4054,16 +4149,16 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     gen_lea_modrm(env, s, modrm);
                     gen_ldq_env_A0(s, op2_offset);
                 }
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+                if (op6.flags & SSE_OPF_V0) {
+                    op6.fn[0].op1(cpu_env, s->ptr0, s->ptr1);
+                } else {
+                    op6.fn[0].op2(cpu_env, s->ptr0, s->ptr0, s->ptr1);
+                }
             }
-            if (!op6.op[b1]) {
-                goto illegal_op;
-            }
-
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            op6.op[b1](cpu_env, s->ptr0, s->ptr1);
 
-            if (b == 0x17) {
+            if (op6.flags & SSE_OPF_CMP) {
                 set_cc_op(s, CC_OP_EFLAGS);
             }
             break;
@@ -4434,6 +4529,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             rm = modrm & 7;
             reg = ((modrm >> 3) & 7) | REX_R(s);
             mod = (modrm >> 6) & 3;
+            if (s->prefix & PREFIX_VEX) {
+                reg_v = s->vex_v;
+            } else {
+                reg_v = reg;
+            }
 
             assert(b1 < 2);
             op7 = sse_op_table7[b];
@@ -4521,6 +4621,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     break;
                 case 0x20: /* pinsrb */
                     CHECK_AVX_128(s);
+                    if (reg != reg_v) {
+                        gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(reg_v));
+                    }
                     if (mod == 3) {
                         gen_op_mov_v_reg(s, MO_32, s->T0, rm);
                     } else {
@@ -4540,6 +4643,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                         tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
                                             s->mem_index, MO_LEUL);
                     }
+                    if (reg != reg_v) {
+                        gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(reg_v));
+                    }
                     tcg_gen_st_i32(s->tmp2_i32, cpu_env,
                                     offsetof(CPUX86State,xmm_regs[reg]
                                             .ZMM_L((val >> 4) & 3)));
@@ -4562,6 +4668,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     break;
                 case 0x22:
                     CHECK_AVX_128(s);
+                    if (reg != reg_v) {
+                        gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(reg_v));
+                    }
                     if (ot == MO_32) { /* pinsrd */
                         if (mod == 3) {
                             tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[rm]);
@@ -4606,17 +4715,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 CHECK_AVX_V0(s);
             }
 
-            if (b1) {
-                op1_offset = ZMM_OFFSET(reg);
-                if (mod == 3) {
-                    op2_offset = ZMM_OFFSET(rm | REX_B(s));
-                } else {
-                    op2_offset = offsetof(CPUX86State,xmm_t0);
-                    gen_lea_modrm(env, s, modrm);
-                    gen_ldo_env_A0(s, op2_offset);
-                }
-            } else {
+            if (b1 == 0) {
                 CHECK_NO_VEX(s);
+                /* MMX */
                 if ((op7.flags & SSE_OPF_MMX) == 0) {
                     goto illegal_op;
                 }
@@ -4628,9 +4729,29 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     gen_lea_modrm(env, s, modrm);
                     gen_ldq_env_A0(s, op2_offset);
                 }
+                val = x86_ldub_code(env, s);
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+
+                /* We only actually have one MMX instruction (palignr) */
+                assert(b == 0x0f);
+
+                op7.fn[0].op2(cpu_env, s->ptr0, s->ptr0, s->ptr1,
+                              tcg_const_i32(val));
+                break;
+            }
+
+            /* SSE */
+            op1_offset = ZMM_OFFSET(reg);
+            if (mod == 3) {
+                op2_offset = ZMM_OFFSET(rm | REX_B(s));
+            } else {
+                op2_offset = offsetof(CPUX86State, xmm_t0);
+                gen_lea_modrm(env, s, modrm);
+                gen_ldo_env_A0(s, op2_offset);
             }
-            val = x86_ldub_code(env, s);
 
+            val = x86_ldub_code(env, s);
             if ((b & 0xfc) == 0x60) { /* pcmpXstrX */
                 set_cc_op(s, CC_OP_EFLAGS);
 
@@ -4640,9 +4761,32 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 }
             }
 
+            v_offset = ZMM_OFFSET(reg_v);
+            /*
+             * Populate the top part of the destination register for VEX
+             * encoded scalar operations
+             */
+            if (scalar_op && op1_offset != v_offset) {
+                if (b == 0x0a) { /* roundss */
+                    gen_op_movl(s,
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_L(1)));
+                }
+                gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)));
+            }
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            op7.op[b1](cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val));
+            if (op7.flags & SSE_OPF_V0) {
+                op7.fn[b1].op1(cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val));
+            } else {
+                tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset);
+                op7.fn[b1].op2(cpu_env, s->ptr0, s->ptr2, s->ptr1,
+                               tcg_const_i32(val));
+            }
+            if (op7.flags & SSE_OPF_CMP) {
+                set_cc_op(s, CC_OP_EFLAGS);
+            }
             break;
 
         case 0x33a:
@@ -4711,28 +4855,24 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 int sz = 4;
 
                 gen_lea_modrm(env, s, modrm);
-                op2_offset = offsetof(CPUX86State,xmm_t0);
+                op2_offset = offsetof(CPUX86State, xmm_t0);
 
-                switch (b) {
-                case 0x50 ... 0x5a:
-                case 0x5c ... 0x5f:
-                case 0xc2:
-                    /* Most sse scalar operations.  */
-                    if (b1 == 2) {
-                        sz = 2;
-                    } else if (b1 == 3) {
-                        sz = 3;
-                    }
-                    break;
-
-                case 0x2e:  /* ucomis[sd] */
-                case 0x2f:  /* comis[sd] */
-                    if (b1 == 0) {
-                        sz = 2;
+                if (sse_op.flags & SSE_OPF_SCALAR) {
+                    if (sse_op.flags & SSE_OPF_CMP) {
+                        /* ucomis[sd], comis[sd] */
+                        if (b1 == 0) {
+                            sz = 2;
+                        } else {
+                            sz = 3;
+                        }
                     } else {
-                        sz = 3;
+                        /* Most sse scalar operations.  */
+                        if (b1 == 2) {
+                            sz = 2;
+                        } else if (b1 == 3) {
+                            sz = 3;
+                        }
                     }
-                    break;
                 }
 
                 switch (sz) {
@@ -4740,13 +4880,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     /* 32 bit access */
                     gen_op_ld_v(s, MO_32, s->T0, s->A0);
                     tcg_gen_st32_tl(s->T0, cpu_env,
-                                    offsetof(CPUX86State,xmm_t0.ZMM_L(0)));
+                                    offsetof(CPUX86State, xmm_t0.ZMM_L(0)));
                     break;
                 case 3:
                     /* 64 bit access */
                     gen_ldq_env_A0(s, offsetof(CPUX86State, xmm_t0.ZMM_D(0)));
                     break;
-                default:
+                case 4:
                     /* 128 bit access */
                     gen_ldo_env_A0(s, op2_offset);
                     break;
@@ -4755,8 +4895,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 rm = (modrm & 7) | REX_B(s);
                 op2_offset = ZMM_OFFSET(rm);
             }
+            v_offset = ZMM_OFFSET(reg_v);
         } else {
             CHECK_NO_VEX(s);
+            scalar_op = 0;
             op1_offset = offsetof(CPUX86State,fpregs[reg].mmx);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
@@ -4778,47 +4920,85 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 op_3dnow(cpu_env, s->ptr0, s->ptr1);
                 return;
             }
+            v_offset = op1_offset;
         }
-        switch(b) {
-        case 0x70: /* pshufx insn */
-        case 0xc6: /* pshufx insn */
-            val = x86_ldub_code(env, s);
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            /* XXX: introduce a new table? */
-            sse_fn_ppi = (SSEFunc_0_ppi)sse_fn_epp;
-            sse_fn_ppi(s->ptr0, s->ptr1, tcg_const_i32(val));
-            break;
-        case 0xc2:
-            /* compare insns, bits 7:3 (7:5 for AVX) are ignored */
-            val = x86_ldub_code(env, s) & 7;
-            sse_fn_epp = sse_op_table4[val][b1];
 
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
-            break;
-        case 0xf7:
-            /* maskmov : we must prepare A0 */
-            if (mod != 3)
-                goto illegal_op;
-            tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]);
-            gen_extu(s->aflag, s->A0);
-            gen_add_A0_ds_seg(s);
+        /*
+         * Populate the top part of the destination register for VEX
+         * encoded scalar operations
+         */
+        if (scalar_op && op1_offset != v_offset) {
+            if (b == 0x5a) {
+                /*
+                 * Scalar conversions are tricky because the src and dest
+                 * may be different sizes
+                 */
+                if (op1_offset == op2_offset) {
+                    /*
+                     * The second source operand overlaps the
+                     * destination, so we need to copy the value
+                     */
+                    op2_offset = offsetof(CPUX86State, xmm_t0);
+                    gen_op_movq(s, op2_offset, op1_offset);
+                }
+                gen_op_movo(s, op1_offset, v_offset);
+            } else {
+                if (b1 == 2) { /* ss */
+                    gen_op_movl(s,
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_L(1)));
+                }
+                gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)));
+            }
+        }
 
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            /* XXX: introduce a new table? */
-            sse_fn_eppt = (SSEFunc_0_eppt)sse_fn_epp;
-            sse_fn_eppt(cpu_env, s->ptr0, s->ptr1, s->A0);
-            break;
-        default:
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
-            break;
+        tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+        tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+        if (sse_op.flags & SSE_OPF_V0) {
+            if (sse_op.flags & SSE_OPF_SHUF) {
+                val = x86_ldub_code(env, s);
+                sse_op.fn[b1].op1i(s->ptr0, s->ptr1, tcg_const_i32(val));
+            } else if (b == 0xf7) {
+                /* maskmov : we must prepare A0 */
+                if (mod != 3) {
+                    goto illegal_op;
+                }
+                tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]);
+                gen_extu(s->aflag, s->A0);
+                gen_add_A0_ds_seg(s);
+
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+                sse_op.fn[b1].op1t(cpu_env, s->ptr0, s->ptr1, s->A0);
+                /* Does not write to the first operand */
+                return;
+            } else {
+                sse_op.fn[b1].op1(cpu_env, s->ptr0, s->ptr1);
+            }
+        } else {
+            tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset);
+            if (sse_op.flags & SSE_OPF_SHUF) {
+                val = x86_ldub_code(env, s);
+                sse_op.fn[b1].op2i(s->ptr0, s->ptr2, s->ptr1,
+                                   tcg_const_i32(val));
+            } else {
+                SSEFunc_0_eppp fn = sse_op.fn[b1].op2;
+                if (b == 0xc2) {
+                    /* compare insns */
+                    val = x86_ldub_code(env, s);
+                    if (s->prefix & PREFIX_VEX) {
+                        val &= 0x1f;
+                    } else {
+                        val &= 7;
+                    }
+                    fn = sse_op_table4[val][b1];
+                }
+                fn(cpu_env, s->ptr0, s->ptr2, s->ptr1);
+            }
         }
-        if (b == 0x2e || b == 0x2f) {
+
+        if (sse_op.flags & SSE_OPF_CMP) {
             set_cc_op(s, CC_OP_EFLAGS);
         }
     }
@@ -8900,6 +9080,7 @@ static void i386_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cpu)
     dc->tmp4 = tcg_temp_new();
     dc->ptr0 = tcg_temp_new_ptr();
     dc->ptr1 = tcg_temp_new_ptr();
+    dc->ptr2 = tcg_temp_new_ptr();
     dc->cc_srcT = tcg_temp_local_new();
 }
 
-- 
2.36.0




* [PATCH v2 26/42] i386: Utility function for 128 bit AVX
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (28 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 25/42] i386: VEX.V encodings (3 operand) Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 27/42] i386: Translate 256 bit AVX instructions Paul Brook
                   ` (15 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

VEX encoded instructions that write to a (128 bit) xmm register clear the
rest (upper half) of the corresponding (256 bit) ymm register.
When legacy SSE encodings are used the rest of the ymm register is left
unchanged.

Add a utility function so that we don't have to keep duplicating this logic.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/tcg/translate.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
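
Note (illustration only, not part of the patch): the upper-half rule from the
commit message can be modelled in plain C. This is a minimal sketch under
stated assumptions: `YmmModel` and `model_write_xmm` are hypothetical names
that do not exist in QEMU, and the register is treated as four 64-bit quads in
the same order as ZMM_Q(0) to ZMM_Q(3).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one 256-bit ymm register: four 64-bit quads. */
typedef struct {
    uint64_t q[4];
} YmmModel;

/*
 * Model a 128-bit (xmm) register write.
 * A VEX encoded write also zeroes quads 2 and 3 (the upper half);
 * a legacy SSE encoded write leaves them untouched.
 */
static void model_write_xmm(YmmModel *r, uint64_t lo, uint64_t hi, bool vex)
{
    r->q[0] = lo;
    r->q[1] = hi;
    if (vex) {
        r->q[2] = 0;
        r->q[3] = 0;
    }
}
```

Starting from a register holding {1, 2, 3, 4}, a legacy write of (10, 20)
leaves {10, 20, 3, 4}, while the same write with vex set leaves {10, 20, 0, 0}.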

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index d148a2319d..278ed8ed1c 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2780,6 +2780,18 @@ static inline void gen_op_movq_env_0(DisasContext *s, int d_offset)
 
 #define ZMM_OFFSET(reg) offsetof(CPUX86State, xmm_regs[reg])
 
+/*
+ * Clear the top half of the ymm register after a VEX.128 instruction
+ * This could be optimized by tracking this in env->hflags
+ */
+static void gen_clear_ymmh(DisasContext *s, int reg)
+{
+    if (s->prefix & PREFIX_VEX) {
+        gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(2)));
+        gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(3)));
+    }
+}
+
 typedef void (*SSEFunc_i_ep)(TCGv_i32 val, TCGv_ptr env, TCGv_ptr reg);
 typedef void (*SSEFunc_l_ep)(TCGv_i64 val, TCGv_ptr env, TCGv_ptr reg);
 typedef void (*SSEFunc_0_epi)(TCGv_ptr env, TCGv_ptr reg, TCGv_i32 val);
-- 
2.36.0




* [PATCH v2 27/42] i386: Translate 256 bit AVX instructions
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (29 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 26/42] i386: Utility function for 128 bit AVX Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 28/42] i386: Implement VZEROALL and VZEROUPPER Paul Brook
                   ` (14 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

All the work for the helper functions is already done; we just need to build
them and add a few macro tweaks to populate the lookup tables.

For sse_op_table6 and sse_op_table7 we use #defines to fill in the entries
where an opcode only supports one vector size, rather than complicating the
main table.

Several of the open-coded mov type instructions need special handling, but most
of the rest falls out from the infrastructure we already added.

Also clear the top half of the register after 128 bit VEX register writes.
In the current code this correlates with VEX.L == 0, but there are exceptions
later.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/helper.h         |   2 +
 target/i386/tcg/fpu_helper.c |   3 +
 target/i386/tcg/translate.c  | 370 +++++++++++++++++++++++++++++------
 3 files changed, 319 insertions(+), 56 deletions(-)
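
Note (illustration only, not part of the patch): the #define approach used for
sse_op_table6 and sse_op_table7 relies on the preprocessor rescanning a
token-pasted name, so a helper name that has no 256-bit form can be aliased to
NULL without touching the table macro. A minimal sketch follows; every
identifier in it (`demo_fn`, `demo_helper_*`, `DEMO_OP`, `demo_table`,
`demo_lookup`) is hypothetical, and treating a NULL slot as "no such encoding"
is an assumption of the sketch rather than something shown in this hunk.

```c
#include <stddef.h>

typedef int (*demo_fn)(int);

/* "foo" exists in both widths. */
static int demo_helper_foo_xmm(int x) { return x + 128; }
static int demo_helper_foo_ymm(int x) { return x + 256; }

/* "bar" only has a 128-bit form; alias the missing name to NULL. */
static int demo_helper_bar_xmm(int x) { return x - 128; }
#define demo_helper_bar_ymm NULL

/* One macro fills both slots; pasted names are rescanned for macros. */
#define DEMO_OP(name) { { demo_helper_##name##_xmm, demo_helper_##name##_ymm } }

struct demo_entry {
    demo_fn fn[2]; /* [0] = xmm, [1] = ymm */
};

static const struct demo_entry demo_table[] = {
    DEMO_OP(foo),
    DEMO_OP(bar),
};

/* Return the helper for an entry and width, or NULL if none exists. */
static demo_fn demo_lookup(size_t idx, int is_256)
{
    return demo_table[idx].fn[is_256 ? 1 : 0];
}
```

With this layout a caller can check the result of demo_lookup() against NULL
before calling it, which keeps the per-opcode special cases out of the table
definition itself.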

diff --git a/target/i386/helper.h b/target/i386/helper.h
index ac3b4d1ee3..3da5df98b9 100644
--- a/target/i386/helper.h
+++ b/target/i386/helper.h
@@ -218,6 +218,8 @@ DEF_HELPER_3(movq, void, env, ptr, ptr)
 #include "ops_sse_header.h"
 #define SHIFT 1
 #include "ops_sse_header.h"
+#define SHIFT 2
+#include "ops_sse_header.h"
 
 DEF_HELPER_3(rclb, tl, env, tl, tl)
 DEF_HELPER_3(rclw, tl, env, tl, tl)
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index b391b69635..74cf86c986 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -3053,3 +3053,6 @@ void helper_movq(CPUX86State *env, void *d, void *s)
 
 #define SHIFT 1
 #include "ops_sse.h"
+
+#define SHIFT 2
+#include "ops_sse.h"
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 278ed8ed1c..bcd6d47fd0 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2742,6 +2742,29 @@ static inline void gen_ldo_env_A0(DisasContext *s, int offset)
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(1)));
 }
 
+static inline void gen_ldo_env_A0_ymmh(DisasContext *s, int offset)
+{
+    int mem_index = s->mem_index;
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, mem_index, MO_LEUQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_addi_tl(s->tmp0, s->A0, 8);
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3)));
+}
+
+/* Load 256-bit ymm register value */
+static inline void gen_ldy_env_A0(DisasContext *s, int offset)
+{
+    int mem_index = s->mem_index;
+    gen_ldo_env_A0(s, offset);
+    tcg_gen_addi_tl(s->tmp0, s->A0, 16);
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_addi_tl(s->tmp0, s->A0, 24);
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3)));
+}
+
 static inline void gen_sto_env_A0(DisasContext *s, int offset)
 {
     int mem_index = s->mem_index;
@@ -2752,6 +2775,29 @@ static inline void gen_sto_env_A0(DisasContext *s, int offset)
     tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
 }
 
+static inline void gen_sto_env_A0_ymmh(DisasContext *s, int offset)
+{
+    int mem_index = s->mem_index;
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0, mem_index, MO_LEUQ);
+    tcg_gen_addi_tl(s->tmp0, s->A0, 8);
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3)));
+    tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+}
+
+/* Store 256-bit ymm register value */
+static inline void gen_sty_env_A0(DisasContext *s, int offset)
+{
+    int mem_index = s->mem_index;
+    gen_sto_env_A0(s, offset);
+    tcg_gen_addi_tl(s->tmp0, s->A0, 16);
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+    tcg_gen_addi_tl(s->tmp0, s->A0, 24);
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3)));
+    tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+}
+
 static inline void gen_op_movo(DisasContext *s, int d_offset, int s_offset)
 {
     tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(0)));
@@ -2760,6 +2806,14 @@ static inline void gen_op_movo(DisasContext *s, int d_offset, int s_offset)
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(1)));
 }
 
+static inline void gen_op_movo_ymmh(DisasContext *s, int d_offset, int s_offset)
+{
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(3)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(3)));
+}
+
 static inline void gen_op_movq(DisasContext *s, int d_offset, int s_offset)
 {
     tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset);
@@ -2823,17 +2877,21 @@ typedef void (*SSEFunc_0_epppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
 #define SSE_OPF_AVX2      (1 << 7) /* AVX2 instruction */
 #define SSE_OPF_SHUF      (1 << 9) /* pshufx/shufpx */
 
-#define OP(op, flags, a, b, c, d)       \
-    {flags, {{.op = a}, {.op = b}, {.op = c}, {.op = d} } }
+#define OP(op, flags, a, b, c, d, e, f, g, h)       \
+    {flags, {{.op = a}, {.op = b}, {.op = c}, {.op = d},    \
+             {.op = e}, {.op = f}, {.op = g}, {.op = h} } }
 
 #define MMX_OP(x) OP(op2, SSE_OPF_MMX, \
-        gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL)
+        gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL, \
+        NULL, gen_helper_ ## x ## _ymm, NULL, NULL)
 
 #define SSE_FOP(name) OP(op2, SSE_OPF_SCALAR, \
-        gen_helper_##name##ps##_xmm, gen_helper_##name##pd##_xmm, \
-        gen_helper_##name##ss, gen_helper_##name##sd)
+        gen_helper_##name##ps_xmm, gen_helper_##name##pd_xmm, \
+        gen_helper_##name##ss, gen_helper_##name##sd, \
+        gen_helper_##name##ps_ymm, gen_helper_##name##pd_ymm, NULL, NULL)
 #define SSE_OP(sname, dname, op, flags) OP(op, flags, \
-        gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL)
+        gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL, \
+        gen_helper_##sname##_ymm, gen_helper_##dname##_ymm, NULL, NULL)
 
 struct SSEOpHelper_table1 {
     int flags;
@@ -2843,7 +2901,7 @@ struct SSEOpHelper_table1 {
         SSEFunc_0_eppt op1t;
         SSEFunc_0_eppp op2;
         SSEFunc_0_pppi op2i;
-    } fn[4];
+    } fn[8];
 };
 
 #define SSE_3DNOW { SSE_OPF_3DNOW }
@@ -2870,17 +2928,22 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     [0x2c] = SSE_SPECIAL, /* cvttps2pi, cvttpd2pi, cvttsd2si, cvttss2si */
     [0x2d] = SSE_SPECIAL, /* cvtps2pi, cvtpd2pi, cvtsd2si, cvtss2si */
     [0x2e] = OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR | SSE_OPF_V0,
-            gen_helper_ucomiss, gen_helper_ucomisd, NULL, NULL),
+            gen_helper_ucomiss, gen_helper_ucomisd, NULL, NULL,
+            NULL, NULL, NULL, NULL),
     [0x2f] = OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR | SSE_OPF_V0,
-            gen_helper_comiss, gen_helper_comisd, NULL, NULL),
+            gen_helper_comiss, gen_helper_comisd, NULL, NULL,
+            NULL, NULL, NULL, NULL),
     [0x50] = SSE_SPECIAL, /* movmskps, movmskpd */
     [0x51] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
                 gen_helper_sqrtps_xmm, gen_helper_sqrtpd_xmm,
-                gen_helper_sqrtss, gen_helper_sqrtsd),
+                gen_helper_sqrtss, gen_helper_sqrtsd,
+                gen_helper_sqrtps_ymm, gen_helper_sqrtpd_ymm, NULL, NULL),
     [0x52] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
-                gen_helper_rsqrtps_xmm, NULL, gen_helper_rsqrtss, NULL),
+                gen_helper_rsqrtps_xmm, NULL, gen_helper_rsqrtss, NULL,
+                gen_helper_rsqrtps_ymm, NULL, NULL, NULL),
     [0x53] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
-                gen_helper_rcpps_xmm, NULL, gen_helper_rcpss, NULL),
+                gen_helper_rcpps_xmm, NULL, gen_helper_rcpss, NULL,
+                gen_helper_rcpps_ymm, NULL, NULL, NULL),
     [0x54] = SSE_OP(pand, pand, op2, 0), /* andps, andpd */
     [0x55] = SSE_OP(pandn, pandn, op2, 0), /* andnps, andnpd */
     [0x56] = SSE_OP(por, por, op2, 0), /* orps, orpd */
@@ -2889,10 +2952,13 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     [0x59] = SSE_FOP(mul),
     [0x5a] = OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0,
                 gen_helper_cvtps2pd_xmm, gen_helper_cvtpd2ps_xmm,
-                gen_helper_cvtss2sd, gen_helper_cvtsd2ss),
+                gen_helper_cvtss2sd, gen_helper_cvtsd2ss,
+                gen_helper_cvtps2pd_ymm, gen_helper_cvtpd2ps_ymm, NULL, NULL),
     [0x5b] = OP(op1, SSE_OPF_V0,
                 gen_helper_cvtdq2ps_xmm, gen_helper_cvtps2dq_xmm,
-                gen_helper_cvttps2dq_xmm, NULL),
+                gen_helper_cvttps2dq_xmm, NULL,
+                gen_helper_cvtdq2ps_ymm, gen_helper_cvtps2dq_ymm,
+                gen_helper_cvttps2dq_ymm, NULL),
     [0x5c] = SSE_FOP(sub),
     [0x5d] = SSE_FOP(min),
     [0x5e] = SSE_FOP(div),
@@ -2919,14 +2985,18 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     [0x6a] = MMX_OP(punpckhdq),
     [0x6b] = MMX_OP(packssdw),
     [0x6c] = OP(op2, SSE_OPF_MMX,
-                NULL, gen_helper_punpcklqdq_xmm, NULL, NULL),
+                NULL, gen_helper_punpcklqdq_xmm, NULL, NULL,
+                NULL, gen_helper_punpcklqdq_ymm, NULL, NULL),
     [0x6d] = OP(op2, SSE_OPF_MMX,
-                NULL, gen_helper_punpckhqdq_xmm, NULL, NULL),
+                NULL, gen_helper_punpckhqdq_xmm, NULL, NULL,
+                NULL, gen_helper_punpckhqdq_ymm, NULL, NULL),
     [0x6e] = SSE_SPECIAL, /* movd mm, ea */
     [0x6f] = SSE_SPECIAL, /* movq, movdqa, , movqdu */
     [0x70] = OP(op1i, SSE_OPF_SHUF | SSE_OPF_MMX | SSE_OPF_V0,
             gen_helper_pshufw_mmx, gen_helper_pshufd_xmm,
-            gen_helper_pshufhw_xmm, gen_helper_pshuflw_xmm),
+            gen_helper_pshufhw_xmm, gen_helper_pshuflw_xmm,
+            NULL, gen_helper_pshufd_ymm,
+            gen_helper_pshufhw_ymm, gen_helper_pshuflw_ymm),
     [0x71] = SSE_SPECIAL, /* shiftw */
     [0x72] = SSE_SPECIAL, /* shiftd */
     [0x73] = SSE_SPECIAL, /* shiftq */
@@ -2936,17 +3006,21 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     [0x77] = SSE_SPECIAL, /* emms */
     [0x78] = SSE_SPECIAL, /* extrq_i, insertq_i (sse4a) */
     [0x79] = OP(op1, SSE_OPF_V0,
-            NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r),
+            NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r,
+            NULL, NULL, NULL, NULL),
     [0x7c] = OP(op2, 0,
-                NULL, gen_helper_haddpd_xmm, NULL, gen_helper_haddps_xmm),
+                NULL, gen_helper_haddpd_xmm, NULL, gen_helper_haddps_xmm,
+                NULL, gen_helper_haddpd_ymm, NULL, gen_helper_haddps_ymm),
     [0x7d] = OP(op2, 0,
-                NULL, gen_helper_hsubpd_xmm, NULL, gen_helper_hsubps_xmm),
+                NULL, gen_helper_hsubpd_xmm, NULL, gen_helper_hsubps_xmm,
+                NULL, gen_helper_hsubpd_ymm, NULL, gen_helper_hsubps_ymm),
     [0x7e] = SSE_SPECIAL, /* movd, movd, , movq */
     [0x7f] = SSE_SPECIAL, /* movq, movdqa, movdqu */
     [0xc4] = SSE_SPECIAL, /* pinsrw */
     [0xc5] = SSE_SPECIAL, /* pextrw */
     [0xd0] = OP(op2, 0,
-                NULL, gen_helper_addsubpd_xmm, NULL, gen_helper_addsubps_xmm),
+                NULL, gen_helper_addsubpd_xmm, NULL, gen_helper_addsubps_xmm,
+                NULL, gen_helper_addsubpd_ymm, NULL, gen_helper_addsubps_ymm),
     [0xd1] = MMX_OP(psrlw),
     [0xd2] = MMX_OP(psrld),
     [0xd3] = MMX_OP(psrlq),
@@ -2970,7 +3044,9 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     [0xe5] = MMX_OP(pmulhw),
     [0xe6] = OP(op1, SSE_OPF_V0,
             NULL, gen_helper_cvttpd2dq_xmm,
-            gen_helper_cvtdq2pd_xmm, gen_helper_cvtpd2dq_xmm),
+            gen_helper_cvtdq2pd_xmm, gen_helper_cvtpd2dq_xmm,
+            NULL, gen_helper_cvttpd2dq_ymm,
+            gen_helper_cvtdq2pd_ymm, gen_helper_cvtpd2dq_ymm),
     [0xe7] = SSE_SPECIAL,  /* movntq, movntq */
     [0xe8] = MMX_OP(psubsb),
     [0xe9] = MMX_OP(psubsw),
@@ -2988,7 +3064,8 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
     [0xf5] = MMX_OP(pmaddwd),
     [0xf6] = MMX_OP(psadbw),
     [0xf7] = OP(op1t, SSE_OPF_MMX | SSE_OPF_V0,
-                gen_helper_maskmov_mmx, gen_helper_maskmov_xmm, NULL, NULL),
+                gen_helper_maskmov_mmx, gen_helper_maskmov_xmm, NULL, NULL,
+                NULL, NULL, NULL, NULL),
     [0xf8] = MMX_OP(psubb),
     [0xf9] = MMX_OP(psubw),
     [0xfa] = MMX_OP(psubl),
@@ -3003,9 +3080,9 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
 #undef SSE_OP
 #undef SSE_SPECIAL
 
-#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm }
-
-static const SSEFunc_0_eppp sse_op_table2[3 * 8][2] = {
+#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, \
+                    gen_helper_ ## x ## _ymm}
+static const SSEFunc_0_eppp sse_op_table2[3 * 8][3] = {
     [0 + 2] = MMX_OP2(psrlw),
     [0 + 4] = MMX_OP2(psraw),
     [0 + 6] = MMX_OP2(psllw),
@@ -3013,9 +3090,9 @@ static const SSEFunc_0_eppp sse_op_table2[3 * 8][2] = {
     [8 + 4] = MMX_OP2(psrad),
     [8 + 6] = MMX_OP2(pslld),
     [16 + 2] = MMX_OP2(psrlq),
-    [16 + 3] = { NULL, gen_helper_psrldq_xmm },
+    [16 + 3] = { NULL, gen_helper_psrldq_xmm, gen_helper_psrldq_ymm},
     [16 + 6] = MMX_OP2(psllq),
-    [16 + 7] = { NULL, gen_helper_pslldq_xmm },
+    [16 + 7] = { NULL, gen_helper_pslldq_xmm, gen_helper_pslldq_ymm},
 };
 #undef MMX_OP2
 
@@ -3049,8 +3126,9 @@ static const SSEFunc_l_ep sse_op_table3bq[] = {
 
 #define SSE_CMP(x) { \
     gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \
-    gen_helper_ ## x ## ss, gen_helper_ ## x ## sd}
-static const SSEFunc_0_eppp sse_op_table4[32][4] = {
+    gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, \
+    gen_helper_ ## x ## ps ## _ymm, gen_helper_ ## x ## pd ## _ymm}
+static const SSEFunc_0_eppp sse_op_table4[32][6] = {
     SSE_CMP(cmpeq),
     SSE_CMP(cmplt),
     SSE_CMP(cmple),
@@ -3126,7 +3204,7 @@ struct SSEOpHelper_table6 {
         SSEFunc_0_epp op1;
         SSEFunc_0_eppp op2;
         SSEFunc_0_epppp op3;
-    } fn[2];
+    } fn[3]; /* [0] = mmx, [1] = xmm, fn[2] = ymm */
     uint32_t ext_mask;
     int flags;
 };
@@ -3136,16 +3214,17 @@ struct SSEOpHelper_table7 {
         SSEFunc_0_eppi op1;
         SSEFunc_0_epppi op2;
         SSEFunc_0_epppp op3;
-    } fn[2];
+    } fn[3]; /* [0] = mmx, [1] = xmm, fn[2] = ymm */
     uint32_t ext_mask;
     int flags;
 };
 
 #define gen_helper_special_xmm NULL
+#define gen_helper_special_ymm NULL
 
 #define OP(name, op, flags, ext, mmx_name) \
-    {{{.op = mmx_name}, {.op = gen_helper_ ## name ## _xmm} }, \
-        CPUID_EXT_ ## ext, flags}
+    {{{.op = mmx_name}, {.op = gen_helper_ ## name ## _xmm}, \
+      {.op = gen_helper_ ## name ## _ymm} }, CPUID_EXT_ ## ext, flags}
 #define BINARY_OP_MMX(name, ext) \
     OP(name, op2, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx)
 #define BINARY_OP(name, ext, flags) \
@@ -3205,7 +3284,9 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
     [0x3e] = BINARY_OP(pmaxuw, SSE41, SSE_OPF_MMX),
     [0x3f] = BINARY_OP(pmaxud, SSE41, SSE_OPF_MMX),
     [0x40] = BINARY_OP(pmulld, SSE41, SSE_OPF_MMX),
+#define gen_helper_phminposuw_ymm NULL
     [0x41] = UNARY_OP(phminposuw, SSE41, 0),
+#define gen_helper_aesimc_ymm NULL
     [0xdb] = UNARY_OP(aesimc, AES, 0),
     [0xdc] = BINARY_OP(aesenc, AES, 0),
     [0xdd] = BINARY_OP(aesenclast, AES, 0),
@@ -3217,7 +3298,9 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
 static const struct SSEOpHelper_table7 sse_op_table7[256] = {
     [0x08] = UNARY_OP(roundps, SSE41, 0),
     [0x09] = UNARY_OP(roundpd, SSE41, 0),
+#define gen_helper_roundss_ymm NULL
     [0x0a] = UNARY_OP(roundss, SSE41, SSE_OPF_SCALAR),
+#define gen_helper_roundsd_ymm NULL
     [0x0b] = UNARY_OP(roundsd, SSE41, SSE_OPF_SCALAR),
     [0x0c] = BINARY_OP(blendps, SSE41, 0),
     [0x0d] = BINARY_OP(blendpd, SSE41, 0),
@@ -3231,13 +3314,19 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
     [0x21] = SPECIAL_OP(SSE41), /* insertps */
     [0x22] = SPECIAL_OP(SSE41), /* pinsrd/pinsrq */
     [0x40] = BINARY_OP(dpps, SSE41, 0),
+#define gen_helper_dppd_ymm NULL
     [0x41] = BINARY_OP(dppd, SSE41, 0),
     [0x42] = BINARY_OP(mpsadbw, SSE41, SSE_OPF_MMX),
     [0x44] = BINARY_OP(pclmulqdq, PCLMULQDQ, 0),
+#define gen_helper_pcmpestrm_ymm NULL
     [0x60] = CMP_OP(pcmpestrm, SSE42),
+#define gen_helper_pcmpestri_ymm NULL
     [0x61] = CMP_OP(pcmpestri, SSE42),
+#define gen_helper_pcmpistrm_ymm NULL
     [0x62] = CMP_OP(pcmpistrm, SSE42),
+#define gen_helper_pcmpistri_ymm NULL
     [0x63] = CMP_OP(pcmpistri, SSE42),
+#define gen_helper_aeskeygenassist_ymm NULL
     [0xdf] = UNARY_OP(aeskeygenassist, AES, 0),
 };
 
@@ -3405,14 +3494,23 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
-            gen_sto_env_A0(s, ZMM_OFFSET(reg));
+            if (s->vex_l) {
+                gen_sty_env_A0(s, ZMM_OFFSET(reg));
+            } else {
+                gen_sto_env_A0(s, ZMM_OFFSET(reg));
+            }
             break;
         case 0x3f0: /* lddqu */
             CHECK_AVX_V0(s);
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
-            gen_ldo_env_A0(s, ZMM_OFFSET(reg));
+            if (s->vex_l) {
+                gen_ldy_env_A0(s, ZMM_OFFSET(reg));
+            } else {
+                gen_ldo_env_A0(s, ZMM_OFFSET(reg));
+                gen_clear_ymmh(s, reg);
+            }
             break;
         case 0x22b: /* movntss */
         case 0x32b: /* movntsd */
@@ -3461,6 +3559,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
                 gen_helper_movl_mm_T0_xmm(s->ptr0, s->tmp2_i32);
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0x6f: /* movq mm, ea */
             CHECK_NO_VEX(s);
@@ -3484,10 +3583,20 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, ZMM_OFFSET(reg));
+                if (s->vex_l) {
+                    gen_ldy_env_A0(s, ZMM_OFFSET(reg));
+                } else {
+                    gen_ldo_env_A0(s, ZMM_OFFSET(reg));
+                }
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(rm));
+                if (s->vex_l) {
+                    gen_op_movo_ymmh(s, ZMM_OFFSET(reg), ZMM_OFFSET(rm));
+                }
+            }
+            if (!s->vex_l) {
+                gen_clear_ymmh(s, reg);
             }
             break;
         case 0x210: /* movss xmm, ea */
@@ -3515,6 +3624,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 tcg_gen_st_i32(s->tmp2_i32, cpu_env,
                                offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)));
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0x310: /* movsd xmm, ea */
             if (mod != 3) {
@@ -3538,6 +3648,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
                             offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)));
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0x012: /* movlps */
         case 0x112: /* movlpd */
@@ -3556,23 +3667,44 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
                             offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)));
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0x212: /* movsldup */
             CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, ZMM_OFFSET(reg));
+                if (s->vex_l) {
+                    gen_ldy_env_A0(s, ZMM_OFFSET(reg));
+                } else {
+                    gen_ldo_env_A0(s, ZMM_OFFSET(reg));
+                }
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)),
                             offsetof(CPUX86State,xmm_regs[rm].ZMM_L(0)));
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(2)),
                             offsetof(CPUX86State,xmm_regs[rm].ZMM_L(2)));
+                if (s->vex_l) {
+                    gen_op_movl(s,
+                                offsetof(CPUX86State, xmm_regs[reg].ZMM_L(4)),
+                                offsetof(CPUX86State, xmm_regs[rm].ZMM_L(4)));
+                    gen_op_movl(s,
+                                offsetof(CPUX86State, xmm_regs[reg].ZMM_L(6)),
+                                offsetof(CPUX86State, xmm_regs[rm].ZMM_L(6)));
+                }
             }
             gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)),
                         offsetof(CPUX86State,xmm_regs[reg].ZMM_L(0)));
             gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)),
                         offsetof(CPUX86State,xmm_regs[reg].ZMM_L(2)));
+            if (s->vex_l) {
+                gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(5)),
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_L(4)));
+                gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(7)),
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_L(6)));
+            } else {
+                gen_clear_ymmh(s, reg);
+            }
             break;
         case 0x312: /* movddup */
             CHECK_AVX_V0(s);
@@ -3580,13 +3712,29 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, offsetof(CPUX86State,
                                            xmm_regs[reg].ZMM_Q(0)));
+                if (s->vex_l) {
+                    tcg_gen_addi_tl(s->A0, s->A0, 16);
+                    gen_ldq_env_A0(s, offsetof(CPUX86State,
+                                               xmm_regs[reg].ZMM_Q(2)));
+                }
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
                             offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0)));
+                if (s->vex_l) {
+                    gen_op_movq(s,
+                                offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(2)),
+                                offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(2)));
+                }
             }
             gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
                         offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0)));
+            if (s->vex_l) {
+                gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(3)),
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(2)));
+            } else {
+                gen_clear_ymmh(s, reg);
+            }
             break;
         case 0x016: /* movhps */
         case 0x116: /* movhpd */
@@ -3605,23 +3753,44 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
                             offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(0)));
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0x216: /* movshdup */
             CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, ZMM_OFFSET(reg));
+                if (s->vex_l) {
+                    gen_ldy_env_A0(s, ZMM_OFFSET(reg));
+                } else {
+                    gen_ldo_env_A0(s, ZMM_OFFSET(reg));
+                }
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)),
                             offsetof(CPUX86State,xmm_regs[rm].ZMM_L(1)));
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)),
                             offsetof(CPUX86State,xmm_regs[rm].ZMM_L(3)));
+                if (s->vex_l) {
+                    gen_op_movl(s,
+                                offsetof(CPUX86State, xmm_regs[reg].ZMM_L(5)),
+                                offsetof(CPUX86State, xmm_regs[rm].ZMM_L(5)));
+                    gen_op_movl(s,
+                                offsetof(CPUX86State, xmm_regs[reg].ZMM_L(7)),
+                                offsetof(CPUX86State, xmm_regs[rm].ZMM_L(7)));
+                }
             }
             gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)),
                         offsetof(CPUX86State,xmm_regs[reg].ZMM_L(1)));
             gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(2)),
                         offsetof(CPUX86State,xmm_regs[reg].ZMM_L(3)));
+            if (s->vex_l) {
+                gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(4)),
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_L(5)));
+                gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(6)),
+                            offsetof(CPUX86State, xmm_regs[reg].ZMM_L(7)));
+            } else {
+                gen_clear_ymmh(s, reg);
+            }
             break;
         case 0x178:
         case 0x378:
@@ -3686,6 +3855,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                             offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0)));
             }
             gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)));
+            gen_clear_ymmh(s, reg);
             break;
         case 0x7f: /* movq ea, mm */
             CHECK_NO_VEX(s);
@@ -3707,10 +3877,19 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             CHECK_AVX_V0(s);
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_sto_env_A0(s, ZMM_OFFSET(reg));
+                if (s->vex_l) {
+                    gen_sty_env_A0(s, ZMM_OFFSET(reg));
+                } else {
+                    gen_sto_env_A0(s, ZMM_OFFSET(reg));
+                }
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movo(s, ZMM_OFFSET(rm), ZMM_OFFSET(reg));
+                if (s->vex_l) {
+                    gen_op_movo_ymmh(s, ZMM_OFFSET(rm), ZMM_OFFSET(reg));
+                } else {
+                    gen_clear_ymmh(s, rm);
+                }
             }
             break;
         case 0x211: /* movss ea, xmm */
@@ -3728,6 +3907,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 }
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)),
                             offsetof(CPUX86State,xmm_regs[reg].ZMM_L(0)));
+                gen_clear_ymmh(s, rm);
             }
             break;
         case 0x311: /* movsd ea, xmm */
@@ -3746,6 +3926,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 }
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)),
                             offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0)));
+                gen_clear_ymmh(s, rm);
             }
             break;
         case 0x013: /* movlps */
@@ -3798,6 +3979,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 op1_offset = offsetof(CPUX86State,mmx_t0);
             }
             assert(b1 < 2);
+            if (s->vex_l) {
+                b1 = 2;
+            }
             SSEFunc_0_eppp fn = sse_op_table2[((b - 1) & 3) * 8 +
                                        (((modrm >> 3)) & 7)][b1];
             if (!fn) {
@@ -3820,19 +4004,30 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
             tcg_gen_addi_ptr(s->ptr2, cpu_env, op1_offset);
             fn(cpu_env, s->ptr0, s->ptr1, s->ptr2);
+            if (!s->vex_l) {
+                gen_clear_ymmh(s, reg_v);
+            }
             break;
         case 0x050: /* movmskps */
             CHECK_AVX_V0(s);
             rm = (modrm & 7) | REX_B(s);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
-            gen_helper_movmskps_xmm(s->tmp2_i32, cpu_env, s->ptr0);
+            if (s->vex_l) {
+                gen_helper_movmskps_ymm(s->tmp2_i32, cpu_env, s->ptr0);
+            } else {
+                gen_helper_movmskps_xmm(s->tmp2_i32, cpu_env, s->ptr0);
+            }
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
         case 0x150: /* movmskpd */
             CHECK_AVX_V0(s);
             rm = (modrm & 7) | REX_B(s);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
-            gen_helper_movmskpd_xmm(s->tmp2_i32, cpu_env, s->ptr0);
+            if (s->vex_l) {
+                gen_helper_movmskpd_ymm(s->tmp2_i32, cpu_env, s->ptr0);
+            } else {
+                gen_helper_movmskpd_xmm(s->tmp2_i32, cpu_env, s->ptr0);
+            }
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
         case 0x02a: /* cvtpi2ps */
@@ -3883,6 +4078,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 goto illegal_op;
 #endif
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0x02c: /* cvttps2pi */
         case 0x12c: /* cvttpd2pi */
@@ -3972,6 +4168,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 tcg_gen_st16_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State,fpregs[reg].mmx.MMX_W(val)));
             }
+            gen_clear_ymmh(s, reg);
             break;
         case 0xc5: /* pextrw */
         case 0x1c5:
@@ -4031,7 +4228,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 CHECK_AVX_V0(s);
                 rm = (modrm & 7) | REX_B(s);
                 tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
-                gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0);
+                if (s->vex_l) {
+                    gen_helper_pmovmskb_ymm(s->tmp2_i32, cpu_env, s->ptr0);
+                } else {
+                    gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0);
+                }
             } else {
                 CHECK_NO_VEX(s);
                 rm = (modrm & 7);
@@ -4098,37 +4299,66 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 if (mod == 3) {
                     op2_offset = ZMM_OFFSET(rm | REX_B(s));
                 } else {
-                    op2_offset = offsetof(CPUX86State,xmm_t0);
+                    int size;
+                    op2_offset = offsetof(CPUX86State, xmm_t0);
                     gen_lea_modrm(env, s, modrm);
                     switch (b) {
                     case 0x20: case 0x30: /* pmovsxbw, pmovzxbw */
                     case 0x23: case 0x33: /* pmovsxwd, pmovzxwd */
                     case 0x25: case 0x35: /* pmovsxdq, pmovzxdq */
-                        gen_ldq_env_A0(s, op2_offset +
-                                        offsetof(ZMMReg, ZMM_Q(0)));
+                        size = 64;
                         break;
                     case 0x21: case 0x31: /* pmovsxbd, pmovzxbd */
                     case 0x24: case 0x34: /* pmovsxwq, pmovzxwq */
-                        tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
-                                            s->mem_index, MO_LEUL);
-                        tcg_gen_st_i32(s->tmp2_i32, cpu_env, op2_offset +
-                                        offsetof(ZMMReg, ZMM_L(0)));
+                        size = 32;
                         break;
                     case 0x22: case 0x32: /* pmovsxbq, pmovzxbq */
+                        size = 16;
+                        break;
+                    case 0x2a:            /* movntdqa */
+                        if (s->vex_l) {
+                            gen_ldy_env_A0(s, op1_offset);
+                        } else {
+                            gen_ldo_env_A0(s, op1_offset);
+                            gen_clear_ymmh(s, reg);
+                        }
+                        return;
+                    default:
+                        size = 128;
+                    }
+                    if (s->vex_l) {
+                        size *= 2;
+                    }
+                    switch (size) {
+                    case 16:
                         tcg_gen_qemu_ld_tl(s->tmp0, s->A0,
                                            s->mem_index, MO_LEUW);
                         tcg_gen_st16_tl(s->tmp0, cpu_env, op2_offset +
                                         offsetof(ZMMReg, ZMM_W(0)));
                         break;
-                    case 0x2a:            /* movntqda */
-                        gen_ldo_env_A0(s, op1_offset);
-                        return;
-                    default:
+                    case 32:
+                        tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
+                                            s->mem_index, MO_LEUL);
+                        tcg_gen_st_i32(s->tmp2_i32, cpu_env, op2_offset +
+                                        offsetof(ZMMReg, ZMM_L(0)));
+                        break;
+                    case 64:
+                        gen_ldq_env_A0(s, op2_offset +
+                                        offsetof(ZMMReg, ZMM_Q(0)));
+                        break;
+                    case 128:
                         gen_ldo_env_A0(s, op2_offset);
+                        break;
+                    case 256:
+                        gen_ldy_env_A0(s, op2_offset);
+                        break;
                     }
                 }
                 tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
                 tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+                if (s->vex_l) {
+                    b1 = 2;
+                }
                 if (!op6.fn[b1].op1) {
                     goto illegal_op;
                 }
@@ -4148,6 +4378,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                         fn(cpu_env, s->ptr0, s->ptr2, s->ptr1);
                     }
                 }
+                if ((op6.flags & SSE_OPF_CMP) == 0 && s->vex_l == 0) {
+                    gen_clear_ymmh(s, reg);
+                }
             } else {
                 CHECK_NO_VEX(s);
                 if ((op6.flags & SSE_OPF_MMX) == 0) {
@@ -4644,6 +4877,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                     tcg_gen_st8_tl(s->T0, cpu_env, offsetof(CPUX86State,
                                             xmm_regs[reg].ZMM_B(val & 15)));
+                    gen_clear_ymmh(s, reg);
                     break;
                 case 0x21: /* insertps */
                     CHECK_AVX_128(s);
@@ -4677,6 +4911,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                         tcg_gen_st_i32(tcg_const_i32(0 /*float32_zero*/),
                                         cpu_env, offsetof(CPUX86State,
                                                 xmm_regs[reg].ZMM_L(3)));
+                    gen_clear_ymmh(s, reg);
                     break;
                 case 0x22:
                     CHECK_AVX_128(s);
@@ -4708,6 +4943,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                         goto illegal_op;
 #endif
                     }
+                    gen_clear_ymmh(s, reg);
                     break;
                 }
                 return;
@@ -4760,7 +4996,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             } else {
                 op2_offset = offsetof(CPUX86State, xmm_t0);
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, op2_offset);
+                if (s->vex_l) {
+                    gen_ldy_env_A0(s, op2_offset);
+                } else {
+                    gen_ldo_env_A0(s, op2_offset);
+                }
             }
 
             val = x86_ldub_code(env, s);
@@ -4771,8 +5011,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     /* The helper must use entire 64-bit gp registers */
                     val |= 1 << 8;
                 }
+                if ((b & 1) == 0) /* pcmpXstrm */
+                    gen_clear_ymmh(s, 0);
             }
 
+            if (s->vex_l) {
+                b1 = 2;
+            }
             v_offset = ZMM_OFFSET(reg_v);
             /*
              * Populate the top part of the destination register for VEX
@@ -4796,6 +5041,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 op7.fn[b1].op2(cpu_env, s->ptr0, s->ptr2, s->ptr1,
                                tcg_const_i32(val));
             }
+            if ((op7.flags & SSE_OPF_CMP) == 0 && s->vex_l == 0) {
+                gen_clear_ymmh(s, reg);
+            }
             if (op7.flags & SSE_OPF_CMP) {
                 set_cc_op(s, CC_OP_EFLAGS);
             }
@@ -4848,6 +5096,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         default:
             break;
         }
+        if (s->vex_l) {
+            b1 += 4;
+        }
         if (is_xmm) {
             scalar_op = (s->prefix & PREFIX_VEX)
                 && (sse_op.flags & SSE_OPF_SCALAR)
@@ -4864,7 +5115,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             op1_offset = ZMM_OFFSET(reg);
             if (mod != 3) {
-                int sz = 4;
+                int sz = s->vex_l ? 5 : 4;
 
                 gen_lea_modrm(env, s, modrm);
                 op2_offset = offsetof(CPUX86State, xmm_t0);
@@ -4902,6 +5153,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     /* 128 bit access */
                     gen_ldo_env_A0(s, op2_offset);
                     break;
+                case 5:
+                    /* 256 bit access */
+                    gen_ldy_env_A0(s, op2_offset);
+                    break;
                 }
             } else {
                 rm = (modrm & 7) | REX_B(s);
@@ -5010,6 +5265,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
         }
 
+        if (s->vex_l == 0 && (sse_op.flags & SSE_OPF_CMP) == 0) {
+            gen_clear_ymmh(s, reg);
+        }
         if (sse_op.flags & SSE_OPF_CMP) {
             set_cc_op(s, CC_OP_EFLAGS);
         }
-- 
2.36.0




* [PATCH v2 28/42] i386: Implement VZEROALL and VZEROUPPER
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (30 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 27/42] i386: Translate 256 bit AVX instructions Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 29/42] i386: Implement VBROADCAST Paul Brook
                   ` (13 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

They use the same opcode as EMMS, which I guess makes some sort of sense.
Fairly straightforward other than that.

If we were wanting to optimize out gen_clear_ymmh then this would be one of
the starting points.
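
As a rough illustration of the semantics described above, here is a
standalone sketch. It is not the QEMU code: ModelYmm, ModelState and the
model_* functions are hypothetical names, and the register file is modelled
as sixteen registers of four 64-bit words. VEX.L selects between the two
operations, and only 64-bit code touches registers 8 to 15.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 256-bit register: q[0..1] low, q[2..3] high. */
typedef struct {
    uint64_t q[4];
} ModelYmm;

/* Hypothetical model of the guest vector register file. */
typedef struct {
    ModelYmm regs[16];
} ModelState;

/* VZEROUPPER: clear bits 255:128 of the first nregs registers. */
static void model_vzeroupper(ModelState *st, int nregs)
{
    for (int i = 0; i < nregs; i++) {
        st->regs[i].q[2] = 0;
        st->regs[i].q[3] = 0;
    }
}

/* VZEROALL: clear all 256 bits of the first nregs registers. */
static void model_vzeroall(ModelState *st, int nregs)
{
    for (int i = 0; i < nregs; i++) {
        memset(st->regs[i].q, 0, sizeof(st->regs[i].q));
    }
}

/*
 * Opcode 0x77 with a VEX prefix: VEX.L=1 is VZEROALL, VEX.L=0 is
 * VZEROUPPER.  Outside 64-bit mode only registers 0..7 are affected.
 */
static void model_op_0x77(ModelState *st, int vex_l, int code64)
{
    int nregs = code64 ? 16 : 8;

    if (vex_l) {
        model_vzeroall(st, nregs);
    } else {
        model_vzeroupper(st, nregs);
    }
}
```

The patch splits the low and high eight registers into separate helpers
rather than passing a count; the sketch folds them together only to keep
the example short.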

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h        | 48 ++++++++++++++++++++++++++++++++++++
 target/i386/ops_sse_header.h |  9 +++++++
 target/i386/tcg/translate.c  | 26 ++++++++++++++++---
 3 files changed, 80 insertions(+), 3 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index ad3312d353..a1f50f0c8b 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -3071,6 +3071,54 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 #endif
 #endif
 
+#if SHIFT == 2
+void helper_vzeroall(CPUX86State *env)
+{
+    int i;
+
+    for (i = 0; i < 8; i++) {
+        env->xmm_regs[i].ZMM_Q(0) = 0;
+        env->xmm_regs[i].ZMM_Q(1) = 0;
+        env->xmm_regs[i].ZMM_Q(2) = 0;
+        env->xmm_regs[i].ZMM_Q(3) = 0;
+    }
+}
+
+void helper_vzeroupper(CPUX86State *env)
+{
+    int i;
+
+    for (i = 0; i < 8; i++) {
+        env->xmm_regs[i].ZMM_Q(2) = 0;
+        env->xmm_regs[i].ZMM_Q(3) = 0;
+    }
+}
+
+#ifdef TARGET_X86_64
+void helper_vzeroall_hi8(CPUX86State *env)
+{
+    int i;
+
+    for (i = 8; i < 16; i++) {
+        env->xmm_regs[i].ZMM_Q(0) = 0;
+        env->xmm_regs[i].ZMM_Q(1) = 0;
+        env->xmm_regs[i].ZMM_Q(2) = 0;
+        env->xmm_regs[i].ZMM_Q(3) = 0;
+    }
+}
+
+void helper_vzeroupper_hi8(CPUX86State *env)
+{
+    int i;
+
+    for (i = 8; i < 16; i++) {
+        env->xmm_regs[i].ZMM_Q(2) = 0;
+        env->xmm_regs[i].ZMM_Q(3) = 0;
+    }
+}
+#endif
+#endif
+
 #undef SSE_HELPER_S
 
 #undef SHIFT
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index cfcfba154b..48f0945917 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -411,6 +411,15 @@ DEF_HELPER_4(glue(aeskeygenassist, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_5(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, Reg, i32)
 #endif
 
+#if SHIFT == 2
+DEF_HELPER_1(vzeroall, void, env)
+DEF_HELPER_1(vzeroupper, void, env)
+#ifdef TARGET_X86_64
+DEF_HELPER_1(vzeroall_hi8, void, env)
+DEF_HELPER_1(vzeroupper_hi8, void, env)
+#endif
+#endif
+
 #undef SHIFT
 #undef Reg
 #undef SUFFIX
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index bcd6d47fd0..ba70aeb039 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3455,9 +3455,29 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         return;
     }
     if (b == 0x77) {
-        /* emms */
-        gen_helper_emms(cpu_env);
-        return;
+        if (s->prefix & PREFIX_VEX) {
+            CHECK_AVX(s);
+            if (s->vex_l) {
+                gen_helper_vzeroall(cpu_env);
+#ifdef TARGET_X86_64
+                if (CODE64(s)) {
+                    gen_helper_vzeroall_hi8(cpu_env);
+                }
+#endif
+            } else {
+                gen_helper_vzeroupper(cpu_env);
+#ifdef TARGET_X86_64
+                if (CODE64(s)) {
+                    gen_helper_vzeroupper_hi8(cpu_env);
+                }
+#endif
+            }
+            return;
+        } else {
+            /* emms */
+            gen_helper_emms(cpu_env);
+            return;
+        }
     }
     /* prepare MMX state (XXX: optimize by storing fptt and fptags in
        the static cpu state) */
-- 
2.36.0




* [PATCH v2 29/42] i386: Implement VBROADCAST
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (31 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 28/42] i386: Implement VZEROALL and VZEROUPPER Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 30/42] i386: Implement VPERMIL Paul Brook
                   ` (12 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

The catch here is that these are whole vector operations (not independent 128
bit lanes). We abuse the SSE_OPF_SCALAR flag to select the memory operand
width appropriately.
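
To make the "whole vector" point concrete, a minimal standalone sketch
follows. ModelVec, model_broadcast and model_mem_bits are hypothetical
names used only for illustration and are not part of the patch. The sketch
assumes the element size divides the vector size and is at most 16 bytes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 256-bit vector as raw bytes. */
typedef struct {
    uint8_t b[32];
} ModelVec;

/*
 * Replicate the low elem_bytes of src across the first vec_bytes of dst.
 * Every destination element comes from element 0 of the source, so the
 * upper lane is not computed from the upper lane of the source.
 */
static void model_broadcast(ModelVec *dst, const ModelVec *src,
                            size_t elem_bytes, size_t vec_bytes)
{
    uint8_t elem[16];

    memcpy(elem, src->b, elem_bytes); /* copy first: dst may alias src */
    for (size_t off = 0; off < vec_bytes; off += elem_bytes) {
        memcpy(dst->b + off, elem, elem_bytes);
    }
}

/*
 * Width of the memory operand in bits.  Ordinary operations double the
 * base width when VEX.L is set; broadcasts always load a single element.
 */
static unsigned model_mem_bits(unsigned base_bits, int is_broadcast,
                               int vex_l)
{
    if (!is_broadcast && vex_l) {
        return base_bits * 2;
    }
    return base_bits;
}
```

With these definitions a 256-bit vbroadcastss still reads 32 bits from
memory, while an ordinary 256-bit packed operation reads 256.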

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h        | 51 ++++++++++++++++++++++++++++++++++++
 target/i386/ops_sse_header.h |  8 ++++++
 target/i386/tcg/translate.c  | 42 ++++++++++++++++++++++++++++-
 3 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index a1f50f0c8b..4115c9a257 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -3071,7 +3071,57 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 #endif
 #endif
 
+#if SHIFT >= 1
+void glue(helper_vbroadcastb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint8_t val = s->B(0);
+    int i;
+
+    for (i = 0; i < 16 * SHIFT; i++) {
+        d->B(i) = val;
+    }
+}
+
+void glue(helper_vbroadcastw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint16_t val = s->W(0);
+    int i;
+
+    for (i = 0; i < 8 * SHIFT; i++) {
+        d->W(i) = val;
+    }
+}
+
+void glue(helper_vbroadcastl, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint32_t val = s->L(0);
+    int i;
+
+    for (i = 0; i < 4 * SHIFT; i++) {
+        d->L(i) = val;
+    }
+}
+
+void glue(helper_vbroadcastq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint64_t val = s->Q(0);
+    d->Q(0) = val;
+    d->Q(1) = val;
 #if SHIFT == 2
+    d->Q(2) = val;
+    d->Q(3) = val;
+#endif
+}
+
+#if SHIFT == 2
+void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    d->Q(0) = s->Q(0);
+    d->Q(1) = s->Q(1);
+    d->Q(2) = s->Q(0);
+    d->Q(3) = s->Q(1);
+}
+
 void helper_vzeroall(CPUX86State *env)
 {
     int i;
@@ -3118,6 +3168,7 @@ void helper_vzeroupper_hi8(CPUX86State *env)
 }
 #endif
 #endif
+#endif
 
 #undef SSE_HELPER_S
 
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index 48f0945917..51e02cd4fa 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -411,7 +411,14 @@ DEF_HELPER_4(glue(aeskeygenassist, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_5(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, Reg, i32)
 #endif
 
+/* AVX helpers */
+#if SHIFT >= 1
+DEF_HELPER_3(glue(vbroadcastb, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(vbroadcastw, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(vbroadcastl, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(vbroadcastq, SUFFIX), void, env, Reg, Reg)
 #if SHIFT == 2
+DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_1(vzeroall, void, env)
 DEF_HELPER_1(vzeroupper, void, env)
 #ifdef TARGET_X86_64
@@ -419,6 +426,7 @@ DEF_HELPER_1(vzeroall_hi8, void, env)
 DEF_HELPER_1(vzeroupper_hi8, void, env)
 #endif
 #endif
+#endif
 
 #undef SHIFT
 #undef Reg
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index ba70aeb039..59ab1dc562 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3255,6 +3255,11 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
     [0x14] = BLENDV_OP(blendvps, SSE41, 0),
     [0x15] = BLENDV_OP(blendvpd, SSE41, 0),
     [0x17] = CMP_OP(ptest, SSE41),
+    /* TODO: Some vbroadcast variants require AVX2 */
+    [0x18] = UNARY_OP(vbroadcastl, AVX, SSE_OPF_SCALAR), /* vbroadcastss */
+    [0x19] = UNARY_OP(vbroadcastq, AVX, SSE_OPF_SCALAR), /* vbroadcastsd */
+#define gen_helper_vbroadcastdq_xmm NULL
+    [0x1a] = UNARY_OP(vbroadcastdq, AVX, SSE_OPF_SCALAR), /* vbroadcastf128 */
     [0x1c] = UNARY_OP_MMX(pabsb, SSSE3),
     [0x1d] = UNARY_OP_MMX(pabsw, SSSE3),
     [0x1e] = UNARY_OP_MMX(pabsd, SSSE3),
@@ -3286,6 +3291,16 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
     [0x40] = BINARY_OP(pmulld, SSE41, SSE_OPF_MMX),
 #define gen_helper_phminposuw_ymm NULL
     [0x41] = UNARY_OP(phminposuw, SSE41, 0),
+    /* vpbroadcastd */
+    [0x58] = UNARY_OP(vbroadcastl, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX),
+    /* vpbroadcastq */
+    [0x59] = UNARY_OP(vbroadcastq, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX),
+    /* vbroadcasti128 */
+    [0x5a] = UNARY_OP(vbroadcastdq, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX),
+    /* vpbroadcastb */
+    [0x78] = UNARY_OP(vbroadcastb, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX),
+    /* vpbroadcastw */
+    [0x79] = UNARY_OP(vbroadcastw, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX),
 #define gen_helper_aesimc_ymm NULL
     [0xdb] = UNARY_OP(aesimc, AES, 0),
     [0xdc] = BINARY_OP(aesenc, AES, 0),
@@ -4323,6 +4338,24 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     op2_offset = offsetof(CPUX86State, xmm_t0);
                     gen_lea_modrm(env, s, modrm);
                     switch (b) {
+                    case 0x78: /* vpbroadcastb */
+                        size = 8;
+                        break;
+                    case 0x79: /* vpbroadcastw */
+                        size = 16;
+                        break;
+                    case 0x18: /* vbroadcastss */
+                    case 0x58: /* vpbroadcastd */
+                        size = 32;
+                        break;
+                    case 0x19: /* vbroadcastsd */
+                    case 0x59: /* vpbroadcastq */
+                        size = 64;
+                        break;
+                    case 0x1a: /* vbroadcastf128 */
+                    case 0x5a: /* vbroadcasti128 */
+                        size = 128;
+                        break;
                     case 0x20: case 0x30: /* pmovsxbw, pmovzxbw */
                     case 0x23: case 0x33: /* pmovsxwd, pmovzxwd */
                     case 0x25: case 0x35: /* pmovsxdq, pmovzxdq */
@@ -4346,10 +4379,17 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     default:
                         size = 128;
                     }
-                    if (s->vex_l) {
+                    /* 256-bit vbroadcast only loads a single element.  */
+                    if ((op6.flags & SSE_OPF_SCALAR) == 0 && s->vex_l) {
                         size *= 2;
                     }
                     switch (size) {
+                    case 8:
+                        tcg_gen_qemu_ld_tl(s->tmp0, s->A0,
+                                           s->mem_index, MO_UB);
+                        tcg_gen_st8_tl(s->tmp0, cpu_env, op2_offset +
+                                       offsetof(ZMMReg, ZMM_B(0)));
+                        break;
                     case 16:
                         tcg_gen_qemu_ld_tl(s->tmp0, s->A0,
                                            s->mem_index, MO_LEUW);
-- 
2.36.0




* [PATCH v2 30/42] i386: Implement VPERMIL
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (32 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 29/42] i386: Implement VBROADCAST Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 31/42] i386: Implement AVX variable shifts Paul Brook
                   ` (11 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Some potentially surprising details when comparing vpermilpd vs. vpermilps
(the variable form of vpermilpd selects with bit 1 of each control element,
not bit 0), but overall pretty straightforward.
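
The difference in selector bits between the two variable forms can be
sketched for a single 128-bit lane as below. This is a standalone
illustration, not the helper code: the model_* names are hypothetical and
plain arrays stand in for the Reg type.

```c
#include <stdint.h>

/*
 * vpermilpd, variable form, one 128-bit lane.  The selector is bit 1 of
 * each 64-bit control element; bit 0 is ignored.
 */
static void model_vpermilpd_128(uint64_t d[2], const uint64_t v[2],
                                const uint64_t ctl[2])
{
    /* Read both results before writing: d may alias v or ctl. */
    uint64_t r0 = v[(ctl[0] >> 1) & 1];
    uint64_t r1 = v[(ctl[1] >> 1) & 1];

    d[0] = r0;
    d[1] = r1;
}

/*
 * vpermilps, variable form, one 128-bit lane.  The selector is bits 1:0
 * of each 32-bit control element.
 */
static void model_vpermilps_128(uint32_t d[4], const uint32_t v[4],
                                const uint32_t ctl[4])
{
    uint32_t r[4];

    for (int i = 0; i < 4; i++) {
        r[i] = v[ctl[i] & 3];
    }
    for (int i = 0; i < 4; i++) {
        d[i] = r[i];
    }
}
```

A control value of 1 therefore selects element 1 for vpermilps but element
0 for vpermilpd, which is the kind of detail that is easy to get wrong when
writing one helper from the other.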

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h        | 82 ++++++++++++++++++++++++++++++++++++
 target/i386/ops_sse_header.h |  4 ++
 target/i386/tcg/translate.c  |  4 ++
 3 files changed, 90 insertions(+)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 4115c9a257..9b92b9790a 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -3113,6 +3113,88 @@ void glue(helper_vbroadcastq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #endif
 }
 
+void glue(helper_vpermilpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    uint64_t r0, r1;
+
+    r0 = v->Q((s->Q(0) >> 1) & 1);
+    r1 = v->Q((s->Q(1) >> 1) & 1);
+    d->Q(0) = r0;
+    d->Q(1) = r1;
+#if SHIFT == 2
+    r0 = v->Q(((s->Q(2) >> 1) & 1) + 2);
+    r1 = v->Q(((s->Q(3) >> 1) & 1) + 2);
+    d->Q(2) = r0;
+    d->Q(3) = r1;
+#endif
+}
+
+void glue(helper_vpermilps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    uint32_t r0, r1, r2, r3;
+
+    r0 = v->L(s->L(0) & 3);
+    r1 = v->L(s->L(1) & 3);
+    r2 = v->L(s->L(2) & 3);
+    r3 = v->L(s->L(3) & 3);
+    d->L(0) = r0;
+    d->L(1) = r1;
+    d->L(2) = r2;
+    d->L(3) = r3;
+#if SHIFT == 2
+    r0 = v->L((s->L(4) & 3) + 4);
+    r1 = v->L((s->L(5) & 3) + 4);
+    r2 = v->L((s->L(6) & 3) + 4);
+    r3 = v->L((s->L(7) & 3) + 4);
+    d->L(4) = r0;
+    d->L(5) = r1;
+    d->L(6) = r2;
+    d->L(7) = r3;
+#endif
+}
+
+void glue(helper_vpermilpd_imm, SUFFIX)(CPUX86State *env,
+                                        Reg *d, Reg *s, uint32_t order)
+{
+    uint64_t r0, r1;
+
+    r0 = s->Q((order >> 0) & 1);
+    r1 = s->Q((order >> 1) & 1);
+    d->Q(0) = r0;
+    d->Q(1) = r1;
+#if SHIFT == 2
+    r0 = s->Q(((order >> 2) & 1) + 2);
+    r1 = s->Q(((order >> 3) & 1) + 2);
+    d->Q(2) = r0;
+    d->Q(3) = r1;
+#endif
+}
+
+void glue(helper_vpermilps_imm, SUFFIX)(CPUX86State *env,
+                                        Reg *d, Reg *s, uint32_t order)
+{
+    uint32_t r0, r1, r2, r3;
+
+    r0 = s->L((order >> 0) & 3);
+    r1 = s->L((order >> 2) & 3);
+    r2 = s->L((order >> 4) & 3);
+    r3 = s->L((order >> 6) & 3);
+    d->L(0) = r0;
+    d->L(1) = r1;
+    d->L(2) = r2;
+    d->L(3) = r3;
+#if SHIFT == 2
+    r0 = s->L(((order >> 0) & 3) + 4);
+    r1 = s->L(((order >> 2) & 3) + 4);
+    r2 = s->L(((order >> 4) & 3) + 4);
+    r3 = s->L(((order >> 6) & 3) + 4);
+    d->L(4) = r0;
+    d->L(5) = r1;
+    d->L(6) = r2;
+    d->L(7) = r3;
+#endif
+}
+
 #if SHIFT == 2
 void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index 51e02cd4fa..c52169a030 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -417,6 +417,10 @@ DEF_HELPER_3(glue(vbroadcastb, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(vbroadcastw, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(vbroadcastl, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(vbroadcastq, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(vpermilpd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpermilps, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpermilpd_imm, SUFFIX), void, env, Reg, Reg, i32)
+DEF_HELPER_4(glue(vpermilps_imm, SUFFIX), void, env, Reg, Reg, i32)
 #if SHIFT == 2
 DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_1(vzeroall, void, env)
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 59ab1dc562..358c3ecb0b 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3251,6 +3251,8 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
     [0x09] = BINARY_OP_MMX(psignw, SSSE3),
     [0x0a] = BINARY_OP_MMX(psignd, SSSE3),
     [0x0b] = BINARY_OP_MMX(pmulhrsw, SSSE3),
+    [0x0c] = BINARY_OP(vpermilps, AVX, 0),
+    [0x0d] = BINARY_OP(vpermilpd, AVX, 0),
     [0x10] = BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX),
     [0x14] = BLENDV_OP(blendvps, SSE41, 0),
     [0x15] = BLENDV_OP(blendvpd, SSE41, 0),
@@ -3311,6 +3313,8 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
 
 /* prefix [66] 0f 3a */
 static const struct SSEOpHelper_table7 sse_op_table7[256] = {
+    [0x04] = UNARY_OP(vpermilps_imm, AVX, 0),
+    [0x05] = UNARY_OP(vpermilpd_imm, AVX, 0),
     [0x08] = UNARY_OP(roundps, SSE41, 0),
     [0x09] = UNARY_OP(roundpd, SSE41, 0),
 #define gen_helper_roundss_ymm NULL
-- 
2.36.0




* [PATCH v2 31/42] i386: Implement AVX variable shifts
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (33 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 30/42] i386: Implement VPERMIL Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 32/42] i386: Implement VTEST Paul Brook
                   ` (10 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

These use the W bit to encode the operand width, but are otherwise fairly
straightforward.
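
For reference, the per-element behaviour for out-of-range counts can be
sketched as below for the 32-bit (W=0) case; the 64-bit (W=1) variants are the
same with 64 and 63 as the limits. The function names are hypothetical, and
the arithmetic shift assumes the usual GCC/Clang behaviour for signed values.

```c
#include <stdint.h>

/* Hypothetical names: per-element semantics of the 32-bit variable shifts. */

/* Logical right shift: counts of 32 or more give zero. */
static uint32_t srlv32(uint32_t x, uint32_t count)
{
    return count < 32 ? x >> count : 0;
}

/* Left shift: counts of 32 or more give zero. */
static uint32_t sllv32(uint32_t x, uint32_t count)
{
    return count < 32 ? x << count : 0;
}

/*
 * Arithmetic right shift: counts of 32 or more fill the element with the
 * sign bit.  Assumes two's complement conversion and that >> on a negative
 * int32_t is an arithmetic shift, as with GCC and Clang.
 */
static uint32_t srav32(uint32_t x, uint32_t count)
{
    return (uint32_t)((int32_t)x >> (count < 32 ? count : 31));
}
```

Clamping the count before shifting also avoids the undefined behaviour of a
C shift by the full width of the type or more.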

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h        | 17 +++++++++++++++++
 target/i386/ops_sse_header.h |  6 ++++++
 target/i386/tcg/translate.c  | 17 +++++++++++++++++
 3 files changed, 40 insertions(+)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 9b92b9790a..8f2bd48394 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -3195,6 +3195,23 @@ void glue(helper_vpermilps_imm, SUFFIX)(CPUX86State *env,
 #endif
 }
 
+#if SHIFT == 1
+#define FPSRLVD(x, c) (c < 32 ? ((x) >> c) : 0)
+#define FPSRLVQ(x, c) (c < 64 ? ((x) >> c) : 0)
+#define FPSRAVD(x, c) ((int32_t)(x) >> (c < 32 ? c : 31))
+#define FPSRAVQ(x, c) ((int64_t)(x) >> (c < 64 ? c : 63))
+#define FPSLLVD(x, c) (c < 32 ? ((x) << c) : 0)
+#define FPSLLVQ(x, c) (c < 64 ? ((x) << c) : 0)
+#endif
+
+SSE_HELPER_L(helper_vpsrlvd, FPSRLVD)
+SSE_HELPER_L(helper_vpsravd, FPSRAVD)
+SSE_HELPER_L(helper_vpsllvd, FPSLLVD)
+
+SSE_HELPER_Q(helper_vpsrlvq, FPSRLVQ)
+SSE_HELPER_Q(helper_vpsravq, FPSRAVQ)
+SSE_HELPER_Q(helper_vpsllvq, FPSLLVQ)
+
 #if SHIFT == 2
 void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index c52169a030..20db6c4240 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -421,6 +421,12 @@ DEF_HELPER_4(glue(vpermilpd, SUFFIX), void, env, Reg, Reg, Reg)
 DEF_HELPER_4(glue(vpermilps, SUFFIX), void, env, Reg, Reg, Reg)
 DEF_HELPER_4(glue(vpermilpd_imm, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_4(glue(vpermilps_imm, SUFFIX), void, env, Reg, Reg, i32)
+DEF_HELPER_4(glue(vpsrlvd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpsravd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpsllvd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpsrlvq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpsravq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpsllvq, SUFFIX), void, env, Reg, Reg, Reg)
 #if SHIFT == 2
 DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_1(vzeroall, void, env)
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 358c3ecb0b..4990470083 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3293,6 +3293,9 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
     [0x40] = BINARY_OP(pmulld, SSE41, SSE_OPF_MMX),
 #define gen_helper_phminposuw_ymm NULL
     [0x41] = UNARY_OP(phminposuw, SSE41, 0),
+    [0x45] = BINARY_OP(vpsrlvd, AVX, SSE_OPF_AVX2),
+    [0x46] = BINARY_OP(vpsravd, AVX, SSE_OPF_AVX2),
+    [0x47] = BINARY_OP(vpsllvd, AVX, SSE_OPF_AVX2),
     /* vpbroadcastd */
     [0x58] = UNARY_OP(vbroadcastl, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX),
     /* vpbroadcastq */
@@ -3357,6 +3360,15 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
 #undef BLENDV_OP
 #undef SPECIAL_OP
 
+#define SSE_OP(name) \
+    {gen_helper_ ## name ##_xmm, gen_helper_ ## name ##_ymm}
+static const SSEFunc_0_eppp sse_op_table8[3][2] = {
+    SSE_OP(vpsrlvq),
+    SSE_OP(vpsravq),
+    SSE_OP(vpsllvq),
+};
+#undef SSE_OP
+
 /* VEX prefix not allowed */
 #define CHECK_NO_VEX(s) do { \
     if (s->prefix & PREFIX_VEX) \
@@ -4439,6 +4451,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                         tcg_temp_free_ptr(mask);
                     } else {
                         SSEFunc_0_eppp fn = op6.fn[b1].op2;
+                        if (REX_W(s)) {
+                            if (b >= 0x45 && b <= 0x47) {
+                                fn = sse_op_table8[b - 0x45][b1 - 1];
+                            }
+                        }
                         fn(cpu_env, s->ptr0, s->ptr2, s->ptr1);
                     }
                 }
-- 
2.36.0




* [PATCH v2 32/42] i386: Implement VTEST
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (34 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 31/42] i386: Implement AVX variable shifts Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 33/42] i386: Implement VMASKMOV Paul Brook
                   ` (9 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Nothing special here.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h        | 28 ++++++++++++++++++++++++++++
 target/i386/ops_sse_header.h |  2 ++
 target/i386/tcg/translate.c  |  2 ++
 3 files changed, 32 insertions(+)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 8f2bd48394..edf14a25d7 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -3212,6 +3212,34 @@ SSE_HELPER_Q(helper_vpsrlvq, FPSRLVQ)
 SSE_HELPER_Q(helper_vpsravq, FPSRAVQ)
 SSE_HELPER_Q(helper_vpsllvq, FPSLLVQ)
 
+void glue(helper_vtestps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint32_t zf = (s->L(0) &  d->L(0)) | (s->L(1) &  d->L(1));
+    uint32_t cf = (s->L(0) & ~d->L(0)) | (s->L(1) & ~d->L(1));
+
+    zf |= (s->L(2) &  d->L(2)) | (s->L(3) &  d->L(3));
+    cf |= (s->L(2) & ~d->L(2)) | (s->L(3) & ~d->L(3));
+#if SHIFT == 2
+    zf |= (s->L(4) &  d->L(4)) | (s->L(5) &  d->L(5));
+    cf |= (s->L(4) & ~d->L(4)) | (s->L(5) & ~d->L(5));
+    zf |= (s->L(6) &  d->L(6)) | (s->L(7) &  d->L(7));
+    cf |= (s->L(6) & ~d->L(6)) | (s->L(7) & ~d->L(7));
+#endif
+    CC_SRC = ((zf >> 31) ? 0 : CC_Z) | ((cf >> 31) ? 0 : CC_C);
+}
+
+void glue(helper_vtestpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint64_t zf = (s->Q(0) &  d->Q(0)) | (s->Q(1) &  d->Q(1));
+    uint64_t cf = (s->Q(0) & ~d->Q(0)) | (s->Q(1) & ~d->Q(1));
+
+#if SHIFT == 2
+    zf |= (s->Q(2) &  d->Q(2)) | (s->Q(3) &  d->Q(3));
+    cf |= (s->Q(2) & ~d->Q(2)) | (s->Q(3) & ~d->Q(3));
+#endif
+    CC_SRC = ((zf >> 63) ? 0 : CC_Z) | ((cf >> 63) ? 0 : CC_C);
+}
+
 #if SHIFT == 2
 void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index 20db6c4240..8b93b8e6d6 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -427,6 +427,8 @@ DEF_HELPER_4(glue(vpsllvd, SUFFIX), void, env, Reg, Reg, Reg)
 DEF_HELPER_4(glue(vpsrlvq, SUFFIX), void, env, Reg, Reg, Reg)
 DEF_HELPER_4(glue(vpsravq, SUFFIX), void, env, Reg, Reg, Reg)
 DEF_HELPER_4(glue(vpsllvq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_3(glue(vtestps, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(vtestpd, SUFFIX), void, env, Reg, Reg)
 #if SHIFT == 2
 DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_1(vzeroall, void, env)
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 4990470083..2fbb7bfcad 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3253,6 +3253,8 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
     [0x0b] = BINARY_OP_MMX(pmulhrsw, SSSE3),
     [0x0c] = BINARY_OP(vpermilps, AVX, 0),
     [0x0d] = BINARY_OP(vpermilpd, AVX, 0),
+    [0x0e] = CMP_OP(vtestps, AVX),
+    [0x0f] = CMP_OP(vtestpd, AVX),
     [0x10] = BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX),
     [0x14] = BLENDV_OP(blendvps, SSE41, 0),
     [0x15] = BLENDV_OP(blendvpd, SSE41, 0),
-- 
2.36.0




* [PATCH v2 33/42] i386: Implement VMASKMOV
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (35 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 32/42] i386: Implement VTEST Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 34/42] i386: Implement VGATHER Paul Brook
                   ` (8 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Decoding these is a bit messy, but at least the integer and float variants
have the same semantics once decoded.

We don't try to be clever with the load forms; instead we load the whole
vector and then mask out the elements we don't want.
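
A minimal sketch of the two directions, using plain arrays in place of guest
memory and registers (the function names and the element-count parameter are
hypothetical, not the helpers in this patch):

```c
#include <stdint.h>

/*
 * Load form: the whole vector has already been read into 'loaded';
 * elements whose mask sign bit is clear are then zeroed.
 */
static void maskmov_load32(uint32_t *dst, const uint32_t *mask,
                           const uint32_t *loaded, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        dst[i] = (mask[i] >> 31) ? loaded[i] : 0;
    }
}

/*
 * Store form: only elements whose mask sign bit is set are written;
 * the other memory locations are left untouched.
 */
static void maskmov_store32(uint32_t *mem, const uint32_t *mask,
                            const uint32_t *src, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        if (mask[i] >> 31) {
            mem[i] = src[i];
        }
    }
}
```

One consequence of loading the whole vector first is that masked-out elements
are still accessed, so a fault on such an element would be visible where real
hardware would suppress it.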

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h        | 48 ++++++++++++++++++++++++++++++++++++
 target/i386/ops_sse_header.h |  4 +++
 target/i386/tcg/translate.c  | 34 +++++++++++++++++++++++++
 3 files changed, 86 insertions(+)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index edf14a25d7..ffcba3d02c 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -3240,6 +3240,54 @@ void glue(helper_vtestpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     CC_SRC = ((zf >> 63) ? 0 : CC_Z) | ((cf >> 63) ? 0 : CC_C);
 }
 
+void glue(helper_vpmaskmovd_st, SUFFIX)(CPUX86State *env,
+                                        Reg *s, Reg *v, target_ulong a0)
+{
+    int i;
+
+    for (i = 0; i < (2 << SHIFT); i++) {
+        if (v->L(i) >> 31) {
+            cpu_stl_data_ra(env, a0 + i * 4, s->L(i), GETPC());
+        }
+    }
+}
+
+void glue(helper_vpmaskmovq_st, SUFFIX)(CPUX86State *env,
+                                        Reg *s, Reg *v, target_ulong a0)
+{
+    int i;
+
+    for (i = 0; i < (1 << SHIFT); i++) {
+        if (v->Q(i) >> 63) {
+            cpu_stq_data_ra(env, a0 + i * 8, s->Q(i), GETPC());
+        }
+    }
+}
+
+void glue(helper_vpmaskmovd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    d->L(0) = (v->L(0) >> 31) ? s->L(0) : 0;
+    d->L(1) = (v->L(1) >> 31) ? s->L(1) : 0;
+    d->L(2) = (v->L(2) >> 31) ? s->L(2) : 0;
+    d->L(3) = (v->L(3) >> 31) ? s->L(3) : 0;
+#if SHIFT == 2
+    d->L(4) = (v->L(4) >> 31) ? s->L(4) : 0;
+    d->L(5) = (v->L(5) >> 31) ? s->L(5) : 0;
+    d->L(6) = (v->L(6) >> 31) ? s->L(6) : 0;
+    d->L(7) = (v->L(7) >> 31) ? s->L(7) : 0;
+#endif
+}
+
+void glue(helper_vpmaskmovq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    d->Q(0) = (v->Q(0) >> 63) ? s->Q(0) : 0;
+    d->Q(1) = (v->Q(1) >> 63) ? s->Q(1) : 0;
+#if SHIFT == 2
+    d->Q(2) = (v->Q(2) >> 63) ? s->Q(2) : 0;
+    d->Q(3) = (v->Q(3) >> 63) ? s->Q(3) : 0;
+#endif
+}
+
 #if SHIFT == 2
 void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index 8b93b8e6d6..a7a6bf6b10 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -429,6 +429,10 @@ DEF_HELPER_4(glue(vpsravq, SUFFIX), void, env, Reg, Reg, Reg)
 DEF_HELPER_4(glue(vpsllvq, SUFFIX), void, env, Reg, Reg, Reg)
 DEF_HELPER_3(glue(vtestps, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(vtestpd, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(vpmaskmovd_st, SUFFIX), void, env, Reg, Reg, tl)
+DEF_HELPER_4(glue(vpmaskmovq_st, SUFFIX), void, env, Reg, Reg, tl)
+DEF_HELPER_4(glue(vpmaskmovd, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_4(glue(vpmaskmovq, SUFFIX), void, env, Reg, Reg, Reg)
 #if SHIFT == 2
 DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_1(vzeroall, void, env)
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 2fbb7bfcad..e00195d301 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3277,6 +3277,10 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
     [0x29] = BINARY_OP(pcmpeqq, SSE41, SSE_OPF_MMX),
     [0x2a] = SPECIAL_OP(SSE41), /* movntqda */
     [0x2b] = BINARY_OP(packusdw, SSE41, SSE_OPF_MMX),
+    [0x2c] = BINARY_OP(vpmaskmovd, AVX, 0), /* vmaskmovps */
+    [0x2d] = BINARY_OP(vpmaskmovq, AVX, 0), /* vmaskmovpd */
+    [0x2e] = SPECIAL_OP(AVX), /* vmaskmovps */
+    [0x2f] = SPECIAL_OP(AVX), /* vmaskmovpd */
     [0x30] = UNARY_OP(pmovzxbw, SSE41, SSE_OPF_MMX),
     [0x31] = UNARY_OP(pmovzxbd, SSE41, SSE_OPF_MMX),
     [0x32] = UNARY_OP(pmovzxbq, SSE41, SSE_OPF_MMX),
@@ -3308,6 +3312,9 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
     [0x78] = UNARY_OP(vbroadcastb, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX),
     /* vpbroadcastw */
     [0x79] = UNARY_OP(vbroadcastw, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX),
+    /* vpmaskmovd, vpmaskmovq */
+    [0x8c] = BINARY_OP(vpmaskmovd, AVX, SSE_OPF_AVX2),
+    [0x8e] = SPECIAL_OP(AVX), /* vpmaskmovd, vpmaskmovq */
 #define gen_helper_aesimc_ymm NULL
     [0xdb] = UNARY_OP(aesimc, AES, 0),
     [0xdc] = BINARY_OP(aesenc, AES, 0),
@@ -3369,6 +3376,11 @@ static const SSEFunc_0_eppp sse_op_table8[3][2] = {
     SSE_OP(vpsravq),
     SSE_OP(vpsllvq),
 };
+
+static const SSEFunc_0_eppt sse_op_table9[2][2] = {
+    SSE_OP(vpmaskmovd_st),
+    SSE_OP(vpmaskmovq_st),
+};
 #undef SSE_OP
 
 /* VEX prefix not allowed */
@@ -4394,6 +4406,22 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                             gen_clear_ymmh(s, reg);
                         }
                         return;
+                    case 0x2e: /* vmaskmovps */
+                        b1 = 0;
+                        goto vpmaskmov;
+                    case 0x2f: /* vmaskmovpd */
+                        b1 = 1;
+                        goto vpmaskmov;
+                    case 0x8e: /* vpmaskmovd, vpmaskmovq */
+                        CHECK_AVX2(s);
+                        b1 = REX_W(s);
+                    vpmaskmov:
+                        tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                        v_offset = ZMM_OFFSET(reg_v);
+                        tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset);
+                        sse_op_table9[b1][s->vex_l](cpu_env,
+                                s->ptr0, s->ptr2, s->A0);
+                        return;
                     default:
                         size = 128;
                     }
@@ -4456,6 +4484,12 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                         if (REX_W(s)) {
                             if (b >= 0x45 && b <= 0x47) {
                                 fn = sse_op_table8[b - 0x45][b1 - 1];
+                            } else if (b == 0x8c) {
+                                if (s->vex_l) {
+                                    fn = gen_helper_vpmaskmovq_ymm;
+                                } else {
+                                    fn = gen_helper_vpmaskmovq_xmm;
+                                }
                             }
                         }
                         fn(cpu_env, s->ptr0, s->ptr2, s->ptr1);
-- 
2.36.0




* [PATCH v2 34/42] i386: Implement VGATHER
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (36 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 33/42] i386: Implement VMASKMOV Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 35/42] i386: Implement VPERM Paul Brook
                   ` (7 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

These are gather instructions (loads from scattered addresses) that introduce
a new "Vector SIB" encoding.  There is also a bit of hair to handle different
index sizes and scaling factors, but overall the combinatorial explosion
doesn't end up too bad.

The other thing of note is probably that these also modify the mask operand.
Thankfully the operands may not overlap, and we do not have to make the whole
thing appear atomic.
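
The per-element behaviour, including the mask update, can be sketched roughly
as follows for the dword-index/dword-data case. This uses a host byte buffer
in place of guest memory; the function name and parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Gather n dwords: for each element whose mask sign bit is set, load from
 * base + (sign-extended index << scale).  Unselected destination elements
 * keep their old value, and every mask element is cleared as it is handled.
 */
static void gather_dd(uint32_t *dst, uint32_t *mask, const int32_t *index,
                      const uint8_t *base, int scale, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        if (mask[i] >> 31) {
            ptrdiff_t off = (ptrdiff_t)index[i] * ((ptrdiff_t)1 << scale);

            /* memcpy avoids alignment assumptions on the source address. */
            memcpy(&dst[i], base + off, sizeof(dst[i]));
        }
        mask[i] = 0;
    }
}
```

Clearing the mask one element at a time is what lets a faulting gather be
restarted: elements that were already loaded are no longer selected.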

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h        | 65 +++++++++++++++++++++++++++++++
 target/i386/ops_sse_header.h | 16 ++++++++
 target/i386/tcg/translate.c  | 74 ++++++++++++++++++++++++++++++++++++
 3 files changed, 155 insertions(+)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index ffcba3d02c..14a2d1bf78 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -3288,6 +3288,71 @@ void glue(helper_vpmaskmovq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 #endif
 }
 
+#define VGATHER_HELPER(scale)                                       \
+void glue(helper_vpgatherdd ## scale, SUFFIX)(CPUX86State *env,     \
+        Reg *d, Reg *v, Reg *s, target_ulong a0)                    \
+{                                                                   \
+    int i;                                                          \
+    for (i = 0; i < (2 << SHIFT); i++) {                            \
+        if (v->L(i) >> 31) {                                        \
+            target_ulong addr = a0                                  \
+                + ((target_ulong)(int32_t)s->L(i) << scale);        \
+            d->L(i) = cpu_ldl_data_ra(env, addr, GETPC());          \
+        }                                                           \
+        v->L(i) = 0;                                                \
+    }                                                               \
+}                                                                   \
+void glue(helper_vpgatherdq ## scale, SUFFIX)(CPUX86State *env,     \
+        Reg *d, Reg *v, Reg *s, target_ulong a0)                    \
+{                                                                   \
+    int i;                                                          \
+    for (i = 0; i < (1 << SHIFT); i++) {                            \
+        if (v->Q(i) >> 63) {                                        \
+            target_ulong addr = a0                                  \
+                + ((target_ulong)(int32_t)s->L(i) << scale);        \
+            d->Q(i) = cpu_ldq_data_ra(env, addr, GETPC());          \
+        }                                                           \
+        v->Q(i) = 0;                                                \
+    }                                                               \
+}                                                                   \
+void glue(helper_vpgatherqd ## scale, SUFFIX)(CPUX86State *env,     \
+        Reg *d, Reg *v, Reg *s, target_ulong a0)                    \
+{                                                                   \
+    int i;                                                          \
+    for (i = 0; i < (1 << SHIFT); i++) {                            \
+        if (v->L(i) >> 31) {                                        \
+            target_ulong addr = a0                                  \
+                + ((target_ulong)(int64_t)s->Q(i) << scale);        \
+            d->L(i) = cpu_ldl_data_ra(env, addr, GETPC());          \
+        }                                                           \
+        v->L(i) = 0;                                                \
+    }                                                               \
+    d->Q(SHIFT) = 0;                                                \
+    v->Q(SHIFT) = 0;                                                \
+    YMM_ONLY(                                                       \
+    d->Q(3) = 0;                                                    \
+    v->Q(3) = 0;                                                    \
+    )                                                               \
+}                                                                   \
+void glue(helper_vpgatherqq ## scale, SUFFIX)(CPUX86State *env,     \
+        Reg *d, Reg *v, Reg *s, target_ulong a0)                    \
+{                                                                   \
+    int i;                                                          \
+    for (i = 0; i < (1 << SHIFT); i++) {                            \
+        if (v->Q(i) >> 63) {                                        \
+            target_ulong addr = a0                                  \
+                + ((target_ulong)(int64_t)s->Q(i) << scale);        \
+            d->Q(i) = cpu_ldq_data_ra(env, addr, GETPC());          \
+        }                                                           \
+        v->Q(i) = 0;                                                \
+    }                                                               \
+}
+
+VGATHER_HELPER(0)
+VGATHER_HELPER(1)
+VGATHER_HELPER(2)
+VGATHER_HELPER(3)
+
 #if SHIFT == 2
 void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index a7a6bf6b10..e5d8ea9bb7 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -433,6 +433,22 @@ DEF_HELPER_4(glue(vpmaskmovd_st, SUFFIX), void, env, Reg, Reg, tl)
 DEF_HELPER_4(glue(vpmaskmovq_st, SUFFIX), void, env, Reg, Reg, tl)
 DEF_HELPER_4(glue(vpmaskmovd, SUFFIX), void, env, Reg, Reg, Reg)
 DEF_HELPER_4(glue(vpmaskmovq, SUFFIX), void, env, Reg, Reg, Reg)
+DEF_HELPER_5(glue(vpgatherdd0, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdq0, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqd0, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqq0, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdd1, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdq1, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqd1, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqq1, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdd2, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdq2, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqd2, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqq2, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdd3, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherdq3, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqd3, SUFFIX), void, env, Reg, Reg, Reg, tl)
+DEF_HELPER_5(glue(vpgatherqq3, SUFFIX), void, env, Reg, Reg, Reg, tl)
 #if SHIFT == 2
 DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_1(vzeroall, void, env)
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index e00195d301..fe1ab58d07 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3315,6 +3315,10 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
     /* vpmaskmovd, vpmaskmovq */
     [0x8c] = BINARY_OP(vpmaskmovd, AVX, SSE_OPF_AVX2),
     [0x8e] = SPECIAL_OP(AVX), /* vpmaskmovd, vpmaskmovq */
+    [0x90] = SPECIAL_OP(AVX), /* vpgatherdd, vpgatherdq */
+    [0x91] = SPECIAL_OP(AVX), /* vpgatherqd, vpgatherqq */
+    [0x92] = SPECIAL_OP(AVX), /* vgatherdpd, vgatherdps */
+    [0x93] = SPECIAL_OP(AVX), /* vgatherqpd, vgatherqps */
 #define gen_helper_aesimc_ymm NULL
     [0xdb] = UNARY_OP(aesimc, AES, 0),
     [0xdc] = BINARY_OP(aesenc, AES, 0),
@@ -3381,6 +3385,25 @@ static const SSEFunc_0_eppt sse_op_table9[2][2] = {
     SSE_OP(vpmaskmovd_st),
     SSE_OP(vpmaskmovq_st),
 };
+
+static const SSEFunc_0_epppt sse_op_table10[16][2] = {
+    SSE_OP(vpgatherdd0),
+    SSE_OP(vpgatherdq0),
+    SSE_OP(vpgatherqd0),
+    SSE_OP(vpgatherqq0),
+    SSE_OP(vpgatherdd1),
+    SSE_OP(vpgatherdq1),
+    SSE_OP(vpgatherqd1),
+    SSE_OP(vpgatherqq1),
+    SSE_OP(vpgatherdd2),
+    SSE_OP(vpgatherdq2),
+    SSE_OP(vpgatherqd2),
+    SSE_OP(vpgatherqq2),
+    SSE_OP(vpgatherdd3),
+    SSE_OP(vpgatherdq3),
+    SSE_OP(vpgatherqd3),
+    SSE_OP(vpgatherqq3),
+};
 #undef SSE_OP
 
 /* VEX prefix not allowed */
@@ -4350,6 +4373,57 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 }
                 op1_offset = ZMM_OFFSET(reg);
 
+                if ((b & 0xfc) == 0x90) { /* vgather */
+                    int scale, index, base;
+                    target_long disp = 0;
+                    CHECK_AVX2(s);
+                    if (mod == 3 || rm != 4) {
+                        goto illegal_op;
+                    }
+
+                    /* Vector SIB */
+                    val = x86_ldub_code(env, s);
+                    scale = (val >> 6) & 3;
+                    index = ((val >> 3) & 7) | REX_X(s);
+                    base = (val & 7) | REX_B(s);
+                    switch (mod) {
+                    case 0:
+                        if (base == 5) {
+                            base = -1;
+                            disp = (int32_t)x86_ldl_code(env, s);
+                        }
+                        break;
+                    case 1:
+                        disp = (int8_t)x86_ldub_code(env, s);
+                        break;
+                    default:
+                    case 2:
+                        disp = (int32_t)x86_ldl_code(env, s);
+                        break;
+                    }
+
+                    /* destination, index and mask registers must not overlap */
+                    if (reg == index || reg == reg_v) {
+                        goto illegal_op;
+                    }
+
+                    tcg_gen_addi_tl(s->A0, cpu_regs[base], disp);
+                    gen_add_A0_ds_seg(s);
+                    op2_offset = ZMM_OFFSET(index);
+                    v_offset = ZMM_OFFSET(reg_v);
+                    tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                    tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+                    tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset);
+                    b1 = REX_W(s) | ((b & 1) << 1) | (scale << 2);
+                    sse_op_table10[b1][s->vex_l](cpu_env,
+                            s->ptr0, s->ptr2, s->ptr1, s->A0);
+                    if (!s->vex_l) {
+                        gen_clear_ymmh(s, reg);
+                        gen_clear_ymmh(s, reg_v);
+                    }
+                    return;
+                }
+
                 if (op6.flags & SSE_OPF_MMX) {
                     CHECK_AVX2_256(s);
                 }
-- 
2.36.0




* [PATCH v2 35/42] i386: Implement VPERM
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (37 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 34/42] i386: Implement VGATHER Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 36/42] i386: Implement VINSERT128/VEXTRACT128 Paul Brook
                   ` (6 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

A set of shuffle operations that operate on complete 256 bit registers.
The integer and floating point variants have identical semantics.
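
As an illustration of the cross-lane behaviour, the immediate-controlled
qword shuffle can be sketched like this (the function name is hypothetical
and a plain array stands in for the 256-bit register):

```c
#include <stdint.h>

/*
 * Each 2-bit field of 'order' picks any of the four source qwords, so
 * elements can cross the 128-bit lane boundary.
 */
static void permq256(uint64_t dst[4], const uint64_t src[4], uint8_t order)
{
    uint64_t r[4];
    int i;

    for (i = 0; i < 4; i++) {
        r[i] = src[(order >> (2 * i)) & 3];
    }
    /* Copy through a temporary so that dst may alias src. */
    for (i = 0; i < 4; i++) {
        dst[i] = r[i];
    }
}
```

With an order byte of 0x1b the four qwords come out in reverse order, which
the per-lane shuffles cannot produce in a single instruction.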

Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h        | 73 ++++++++++++++++++++++++++++++++++++
 target/i386/ops_sse_header.h |  3 ++
 target/i386/tcg/translate.c  |  9 +++++
 3 files changed, 85 insertions(+)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 14a2d1bf78..04d2006cd8 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -3407,6 +3407,79 @@ void helper_vzeroupper_hi8(CPUX86State *env)
     }
 }
 #endif
+
+void helper_vpermdq_ymm(CPUX86State *env,
+                        Reg *d, Reg *v, Reg *s, uint32_t order)
+{
+    uint64_t r0, r1, r2, r3;
+
+    switch (order & 3) {
+    case 0:
+        r0 = v->Q(0);
+        r1 = v->Q(1);
+        break;
+    case 1:
+        r0 = v->Q(2);
+        r1 = v->Q(3);
+        break;
+    case 2:
+        r0 = s->Q(0);
+        r1 = s->Q(1);
+        break;
+    case 3:
+        r0 = s->Q(2);
+        r1 = s->Q(3);
+        break;
+    }
+    switch ((order >> 4) & 3) {
+    case 0:
+        r2 = v->Q(0);
+        r3 = v->Q(1);
+        break;
+    case 1:
+        r2 = v->Q(2);
+        r3 = v->Q(3);
+        break;
+    case 2:
+        r2 = s->Q(0);
+        r3 = s->Q(1);
+        break;
+    case 3:
+        r2 = s->Q(2);
+        r3 = s->Q(3);
+        break;
+    }
+    d->Q(0) = r0;
+    d->Q(1) = r1;
+    d->Q(2) = r2;
+    d->Q(3) = r3;
+}
+
+void helper_vpermq_ymm(CPUX86State *env, Reg *d, Reg *s, uint32_t order)
+{
+    uint64_t r0, r1, r2, r3;
+    r0 = s->Q(order & 3);
+    r1 = s->Q((order >> 2) & 3);
+    r2 = s->Q((order >> 4) & 3);
+    r3 = s->Q((order >> 6) & 3);
+    d->Q(0) = r0;
+    d->Q(1) = r1;
+    d->Q(2) = r2;
+    d->Q(3) = r3;
+}
+
+void helper_vpermd_ymm(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    uint32_t r[8];
+    int i;
+
+    for (i = 0; i < 8; i++) {
+        r[i] = s->L(v->L(i) & 7);
+    }
+    for (i = 0; i < 8; i++) {
+        d->L(i) = r[i];
+    }
+}
 #endif
 #endif
 
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index e5d8ea9bb7..099e6e8ffc 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -457,6 +457,9 @@ DEF_HELPER_1(vzeroupper, void, env)
 DEF_HELPER_1(vzeroall_hi8, void, env)
 DEF_HELPER_1(vzeroupper_hi8, void, env)
 #endif
+DEF_HELPER_5(vpermdq_ymm, void, env, Reg, Reg, Reg, i32)
+DEF_HELPER_4(vpermq_ymm, void, env, Reg, Reg, i32)
+DEF_HELPER_4(vpermd_ymm, void, env, Reg, Reg, Reg)
 #endif
 #endif
 
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index fe1ab58d07..5a11d3c083 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3258,6 +3258,8 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
     [0x10] = BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX),
     [0x14] = BLENDV_OP(blendvps, SSE41, 0),
     [0x15] = BLENDV_OP(blendvpd, SSE41, 0),
+#define gen_helper_vpermd_xmm NULL
+    [0x16] = BINARY_OP(vpermd, AVX, SSE_OPF_AVX2), /* vpermps */
     [0x17] = CMP_OP(ptest, SSE41),
     /* TODO:Some vbroadcast variants require AVX2 */
     [0x18] = UNARY_OP(vbroadcastl, AVX, SSE_OPF_SCALAR), /* vbroadcastss */
@@ -3287,6 +3289,7 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
     [0x33] = UNARY_OP(pmovzxwd, SSE41, SSE_OPF_MMX),
     [0x34] = UNARY_OP(pmovzxwq, SSE41, SSE_OPF_MMX),
     [0x35] = UNARY_OP(pmovzxdq, SSE41, SSE_OPF_MMX),
+    [0x36] = BINARY_OP(vpermd, AVX, SSE_OPF_AVX2), /* vpermd */
     [0x37] = BINARY_OP(pcmpgtq, SSE41, SSE_OPF_MMX),
     [0x38] = BINARY_OP(pminsb, SSE41, SSE_OPF_MMX),
     [0x39] = BINARY_OP(pminsd, SSE41, SSE_OPF_MMX),
@@ -3329,8 +3332,13 @@ static const struct SSEOpHelper_table6 sse_op_table6[256] = {
 
 /* prefix [66] 0f 3a */
 static const struct SSEOpHelper_table7 sse_op_table7[256] = {
+#define gen_helper_vpermq_xmm NULL
+    [0x00] = UNARY_OP(vpermq, AVX, SSE_OPF_AVX2),
+    [0x01] = UNARY_OP(vpermq, AVX, SSE_OPF_AVX2), /* vpermpd */
     [0x04] = UNARY_OP(vpermilps_imm, AVX, 0),
     [0x05] = UNARY_OP(vpermilpd_imm, AVX, 0),
+#define gen_helper_vpermdq_xmm NULL
+    [0x06] = BINARY_OP(vpermdq, AVX, 0), /* vperm2f128 */
     [0x08] = UNARY_OP(roundps, SSE41, 0),
     [0x09] = UNARY_OP(roundpd, SSE41, 0),
 #define gen_helper_roundss_ymm NULL
@@ -3353,6 +3361,7 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
     [0x41] = BINARY_OP(dppd, SSE41, 0),
     [0x42] = BINARY_OP(mpsadbw, SSE41, SSE_OPF_MMX),
     [0x44] = BINARY_OP(pclmulqdq, PCLMULQDQ, 0),
+    [0x46] = BINARY_OP(vpermdq, AVX, SSE_OPF_AVX2), /* vperm2i128 */
 #define gen_helper_pcmpestrm_ymm NULL
     [0x60] = CMP_OP(pcmpestrm, SSE42),
 #define gen_helper_pcmpestri_ymm NULL
-- 
2.36.0



^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v2 36/42] i386: Implement VINSERT128/VEXTRACT128
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (38 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 35/42] i386: Implement VPERM Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:01 ` [PATCH v2 37/42] i386: Implement VBLENDV Paul Brook
                   ` (5 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

128-bit vinsert/vextract instructions. The integer and floating point variants
have the same semantics.

This is where we encounter an instruction encoded with VEX.L == 1 and
a 128 bit (xmm) destination operand.

Signed-off-by: Paul Brook <paul@nowt.org>
---
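Not part of the patch: a Python sketch of the data movement the translation
below is meant to produce, assuming a YMM register is a list of four qwords
and an XMM source is a list of two. The names vinsert128/vextract128 and
the to_register flag are illustrative only.

```python
def vinsert128(src1, src2_xmm, imm8):
    """Copy src1, replacing the lane selected by imm8 bit 0 with src2_xmm."""
    if imm8 & 1:
        return src1[0:2] + src2_xmm[0:2]   # new high lane
    return src2_xmm[0:2] + src1[2:4]       # new low lane


def vextract128(src, imm8, to_register=True):
    """Return the lane of src selected by imm8 bit 0."""
    lane = src[2:4] if imm8 & 1 else src[0:2]
    # With a register destination the result is an xmm register, and the
    # patch clears the upper half of the containing ymm (gen_clear_ymmh).
    return (lane + [0, 0]) if to_register else lane
```

Only bit 0 of the immediate is consulted, which matches the `val & 1` tests
in the hunk and the 0x01 immediate mask used by the test generator.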
 target/i386/tcg/translate.c | 78 +++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 5a11d3c083..4072fa28d3 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2814,6 +2814,24 @@ static inline void gen_op_movo_ymmh(DisasContext *s, int d_offset, int s_offset)
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(3)));
 }
 
+static inline void gen_op_movo_ymm_l2h(DisasContext *s,
+                                       int d_offset, int s_offset)
+{
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(0)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(1)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(3)));
+}
+
+static inline void gen_op_movo_ymm_h2l(DisasContext *s,
+                                       int d_offset, int s_offset)
+{
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(0)));
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(3)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(1)));
+}
+
 static inline void gen_op_movq(DisasContext *s, int d_offset, int s_offset)
 {
     tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset);
@@ -3353,9 +3371,13 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
     [0x15] = SPECIAL_OP(SSE41), /* pextrw */
     [0x16] = SPECIAL_OP(SSE41), /* pextrd/pextrq */
     [0x17] = SPECIAL_OP(SSE41), /* extractps */
+    [0x18] = SPECIAL_OP(AVX), /* vinsertf128 */
+    [0x19] = SPECIAL_OP(AVX), /* vextractf128 */
     [0x20] = SPECIAL_OP(SSE41), /* pinsrb */
     [0x21] = SPECIAL_OP(SSE41), /* insertps */
     [0x22] = SPECIAL_OP(SSE41), /* pinsrd/pinsrq */
+    [0x38] = SPECIAL_OP(AVX), /* vinserti128 */
+    [0x39] = SPECIAL_OP(AVX), /* vextracti128 */
     [0x40] = BINARY_OP(dpps, SSE41, 0),
 #define gen_helper_dppd_ymm NULL
     [0x41] = BINARY_OP(dppd, SSE41, 0),
@@ -5145,6 +5167,62 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                     gen_clear_ymmh(s, reg);
                     break;
+                case 0x38: /* vinserti128 */
+                    CHECK_AVX2_256(s);
+                    /* fall through */
+                case 0x18: /* vinsertf128 */
+                    CHECK_AVX(s);
+                    if ((s->prefix & PREFIX_VEX) == 0 || s->vex_l == 0) {
+                        goto illegal_op;
+                    }
+                    if (mod == 3) {
+                        if (val & 1) {
+                            gen_op_movo_ymm_l2h(s, ZMM_OFFSET(reg),
+                                                ZMM_OFFSET(rm));
+                        } else {
+                            gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(rm));
+                        }
+                    } else {
+                        if (val & 1) {
+                            gen_ldo_env_A0_ymmh(s, ZMM_OFFSET(reg));
+                        } else {
+                            gen_ldo_env_A0(s, ZMM_OFFSET(reg));
+                        }
+                    }
+                    if (reg != reg_v) {
+                        if (val & 1) {
+                            gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(reg_v));
+                        } else {
+                            gen_op_movo_ymmh(s, ZMM_OFFSET(reg),
+                                             ZMM_OFFSET(reg_v));
+                        }
+                    }
+                    break;
+                case 0x39: /* vextracti128 */
+                    CHECK_AVX2_256(s);
+                    /* fall through */
+                case 0x19: /* vextractf128 */
+                    CHECK_AVX_V0(s);
+                    if ((s->prefix & PREFIX_VEX) == 0 || s->vex_l == 0) {
+                        goto illegal_op;
+                    }
+                    if (mod == 3) {
+                        op1_offset = ZMM_OFFSET(rm);
+                        if (val & 1) {
+                            gen_op_movo_ymm_h2l(s, ZMM_OFFSET(rm),
+                                                ZMM_OFFSET(reg));
+                        } else {
+                            gen_op_movo(s, ZMM_OFFSET(rm), ZMM_OFFSET(reg));
+                        }
+                        gen_clear_ymmh(s, rm);
+                    } else {
+                        if (val & 1) {
+                            gen_sto_env_A0_ymmh(s, ZMM_OFFSET(reg));
+                        } else {
+                            gen_sto_env_A0(s, ZMM_OFFSET(reg));
+                        }
+                    }
+                    break;
                 }
                 return;
             }
-- 
2.36.0



^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v2 37/42] i386: Implement VBLENDV
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (39 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 36/42] i386: Implement VINSERT128/VEXTRACT128 Paul Brook
@ 2022-04-24 22:01 ` Paul Brook
  2022-04-24 22:02 ` [PATCH v2 38/42] i386: Implement VPBLENDD Paul Brook
                   ` (4 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:01 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

The AVX variants of the BLENDV instructions use a different opcode prefix
to support the additional operands. We already modified the helper functions
in anticipation of this.

Signed-off-by: Paul Brook <paul@nowt.org>
---
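Not part of the patch: a Python sketch of how the fourth operand is found
and used, assuming registers are lists of equal-width integer elements.
decode_is4 and blendv are hypothetical names. The translation below takes
the mask register number from the top four bits of the trailing immediate
byte (`val >> 4`), then blends per element on the mask's sign bit.

```python
def decode_is4(imm8):
    """Register number of the mask operand, as in ZMM_OFFSET(val >> 4)."""
    # The & 0xf is only defensive; an 8-bit immediate shifted by 4 fits anyway.
    return (imm8 >> 4) & 0xf


def blendv(src1, src2, mask, width):
    """Per element: take src2 when the mask element's top bit is set."""
    top = 1 << (width - 1)
    return [b if (m & top) else a for a, b, m in zip(src1, src2, mask)]
```

width is 32 for blendvps, 64 for blendvpd and 8 for pblendvb; only the most
significant bit of each mask element matters.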
 target/i386/tcg/translate.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 4072fa28d3..95ecdea8fe 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3384,6 +3384,9 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
     [0x42] = BINARY_OP(mpsadbw, SSE41, SSE_OPF_MMX),
     [0x44] = BINARY_OP(pclmulqdq, PCLMULQDQ, 0),
     [0x46] = BINARY_OP(vpermdq, AVX, SSE_OPF_AVX2), /* vperm2i128 */
+    [0x4a] = BLENDV_OP(blendvps, AVX, 0),
+    [0x4b] = BLENDV_OP(blendvpd, AVX, 0),
+    [0x4c] = BLENDV_OP(pblendvb, AVX, SSE_OPF_MMX),
 #define gen_helper_pcmpestrm_ymm NULL
     [0x60] = CMP_OP(pcmpestrm, SSE42),
 #define gen_helper_pcmpestri_ymm NULL
@@ -5268,6 +5271,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
 
             /* SSE */
+            if (op7.flags & SSE_OPF_BLENDV && !(s->prefix & PREFIX_VEX)) {
+                /* Only VEX encodings are valid for these blendv opcodes */
+                goto illegal_op;
+            }
             op1_offset = ZMM_OFFSET(reg);
             if (mod == 3) {
                 op2_offset = ZMM_OFFSET(rm | REX_B(s));
@@ -5316,8 +5323,15 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 op7.fn[b1].op1(cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val));
             } else {
                 tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset);
-                op7.fn[b1].op2(cpu_env, s->ptr0, s->ptr2, s->ptr1,
-                               tcg_const_i32(val));
+                if (op7.flags & SSE_OPF_BLENDV) {
+                    TCGv_ptr mask = tcg_temp_new_ptr();
+                    tcg_gen_addi_ptr(mask, cpu_env, ZMM_OFFSET(val >> 4));
+                    op7.fn[b1].op3(cpu_env, s->ptr0, s->ptr2, s->ptr1, mask);
+                    tcg_temp_free_ptr(mask);
+                } else {
+                    op7.fn[b1].op2(cpu_env, s->ptr0, s->ptr2, s->ptr1,
+                                   tcg_const_i32(val));
+                }
             }
             if ((op7.flags & SSE_OPF_CMP) == 0 && s->vex_l == 0) {
                 gen_clear_ymmh(s, reg);
-- 
2.36.0



^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v2 38/42] i386: Implement VPBLENDD
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (40 preceding siblings ...)
  2022-04-24 22:01 ` [PATCH v2 37/42] i386: Implement VBLENDV Paul Brook
@ 2022-04-24 22:02 ` Paul Brook
  2022-04-24 22:02 ` [PATCH v2 39/42] i386: Enable AVX cpuid bits when using TCG Paul Brook
                   ` (3 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:02 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

This is semantically equivalent to VBLENDPS.

Signed-off-by: Paul Brook <paul@nowt.org>
---
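Not part of the patch: a short Python sketch of the shared semantics,
assuming registers are lists of dwords. The name pblendd is illustrative.
Bit i of the immediate selects dword i from the second source, which is why
the existing blendps helper can be reused for the integer opcode.

```python
def pblendd(src1, src2, imm8):
    """Bit i of imm8 set: take dword i from src2, otherwise from src1."""
    return [b if (imm8 >> i) & 1 else a
            for i, (a, b) in enumerate(zip(src1, src2))]
```

With four-element lists this models the 128-bit form (imm8 bits 3:0), with
eight-element lists the 256-bit form.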
 target/i386/tcg/translate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 95ecdea8fe..73f3842c36 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3353,6 +3353,7 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
 #define gen_helper_vpermq_xmm NULL
     [0x00] = UNARY_OP(vpermq, AVX, SSE_OPF_AVX2),
     [0x01] = UNARY_OP(vpermq, AVX, SSE_OPF_AVX2), /* vpermpd */
+    [0x02] = BINARY_OP(blendps, AVX, SSE_OPF_AVX2), /* vpblendd */
     [0x04] = UNARY_OP(vpermilps_imm, AVX, 0),
     [0x05] = UNARY_OP(vpermilpd_imm, AVX, 0),
 #define gen_helper_vpermdq_xmm NULL
-- 
2.36.0



^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v2 39/42] i386: Enable AVX cpuid bits when using TCG
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (41 preceding siblings ...)
  2022-04-24 22:02 ` [PATCH v2 38/42] i386: Implement VPBLENDD Paul Brook
@ 2022-04-24 22:02 ` Paul Brook
  2022-04-24 22:02 ` [PATCH v2 40/42] Enable all x86-64 cpu features in user mode Paul Brook
                   ` (2 subsequent siblings)
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:02 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Include AVX and AVX2 in the guest cpuid features supported by TCG.

Signed-off-by: Paul Brook <paul@nowt.org>
---
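Not part of the patch: a Python sketch of which CPUID bits the two macro
changes below expose to the guest. The constant names follow the ones used
in the hunk, with the architectural bit positions; avx_support is a
made-up helper. A real guest must additionally check OSXSAVE and XCR0
before using AVX, which this sketch leaves out.

```python
CPUID_EXT_AVX = 1 << 28       # CPUID.01H:ECX bit 28
CPUID_7_0_EBX_AVX2 = 1 << 5   # CPUID.(EAX=07H,ECX=0):EBX bit 5


def avx_support(leaf1_ecx, leaf7_ebx):
    """Return (avx, avx2) as advertised by the given CPUID register values."""
    avx = bool(leaf1_ecx & CPUID_EXT_AVX)
    # AVX2 is only meaningful when AVX itself is advertised.
    avx2 = avx and bool(leaf7_ebx & CPUID_7_0_EBX_AVX2)
    return avx, avx2
```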
 target/i386/cpu.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 99343be926..bd35233d5b 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -625,12 +625,12 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
           CPUID_EXT_SSE41 | CPUID_EXT_SSE42 | CPUID_EXT_POPCNT | \
           CPUID_EXT_XSAVE | /* CPUID_EXT_OSXSAVE is dynamic */   \
           CPUID_EXT_MOVBE | CPUID_EXT_AES | CPUID_EXT_HYPERVISOR | \
-          CPUID_EXT_RDRAND)
+          CPUID_EXT_RDRAND | CPUID_EXT_AVX)
           /* missing:
           CPUID_EXT_DTES64, CPUID_EXT_DSCPL, CPUID_EXT_VMX, CPUID_EXT_SMX,
           CPUID_EXT_EST, CPUID_EXT_TM2, CPUID_EXT_CID, CPUID_EXT_FMA,
           CPUID_EXT_XTPR, CPUID_EXT_PDCM, CPUID_EXT_PCID, CPUID_EXT_DCA,
-          CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER, CPUID_EXT_AVX,
+          CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER,
           CPUID_EXT_F16C */
 
 #ifdef TARGET_X86_64
@@ -653,9 +653,9 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
           CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ADX | \
           CPUID_7_0_EBX_PCOMMIT | CPUID_7_0_EBX_CLFLUSHOPT |            \
           CPUID_7_0_EBX_CLWB | CPUID_7_0_EBX_MPX | CPUID_7_0_EBX_FSGSBASE | \
-          CPUID_7_0_EBX_ERMS)
+          CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_AVX2)
           /* missing:
-          CPUID_7_0_EBX_HLE, CPUID_7_0_EBX_AVX2,
+          CPUID_7_0_EBX_HLE,
           CPUID_7_0_EBX_INVPCID, CPUID_7_0_EBX_RTM,
           CPUID_7_0_EBX_RDSEED */
 #define TCG_7_0_ECX_FEATURES (CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU | \
-- 
2.36.0



^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v2 40/42] Enable all x86-64 cpu features in user mode
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (42 preceding siblings ...)
  2022-04-24 22:02 ` [PATCH v2 39/42] i386: Enable AVX cpuid bits when using TCG Paul Brook
@ 2022-04-24 22:02 ` Paul Brook
  2022-04-24 22:02 ` [PATCH v2 41/42] AVX tests Paul Brook
  2022-04-24 22:02 ` [PATCH v2 42/42] i386: Add sha512-avx test Paul Brook
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:02 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Laurent Vivier, Paul Brook

We don't have any migration concerns for usermode emulation, so we may
as well enable all available CPU features by default.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 linux-user/x86_64/target_elf.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/x86_64/target_elf.h b/linux-user/x86_64/target_elf.h
index 7b76a90de8..3f628f8d66 100644
--- a/linux-user/x86_64/target_elf.h
+++ b/linux-user/x86_64/target_elf.h
@@ -9,6 +9,6 @@
 #define X86_64_TARGET_ELF_H
 static inline const char *cpu_get_model(uint32_t eflags)
 {
-    return "qemu64";
+    return "max";
 }
 #endif
-- 
2.36.0



^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v2 41/42] AVX tests
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (43 preceding siblings ...)
  2022-04-24 22:02 ` [PATCH v2 40/42] Enable all x86-64 cpu features in user mode Paul Brook
@ 2022-04-24 22:02 ` Paul Brook
  2022-04-24 22:02 ` [PATCH v2 42/42] i386: Add sha512-avx test Paul Brook
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:02 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Tests for correct operation of most x86-64 SSE and AVX instructions.
It should cover all combinations of overlapping register and memory
operands on a set of random-ish data.

Results are bit-identical to an Intel i5-8500, with the exception of
the RCPSS and RSQRT approximations where the real CPU gives less accurate
results (the Intel spec allows relative errors up to 1.5 * 2^-12).

Signed-off-by: Paul Brook <paul@nowt.org>
---
 tests/tcg/i386/Makefile.target |   10 +-
 tests/tcg/i386/README          |    9 +
 tests/tcg/i386/test-avx.c      |  347 +++
 tests/tcg/i386/test-avx.py     |  352 +++
 tests/tcg/i386/x86.csv         | 4658 ++++++++++++++++++++++++++++++++
 5 files changed, 5374 insertions(+), 2 deletions(-)
 create mode 100644 tests/tcg/i386/test-avx.c
 create mode 100755 tests/tcg/i386/test-avx.py
 create mode 100644 tests/tcg/i386/x86.csv

diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
index bd73c96d0d..eb06f7eb89 100644
--- a/tests/tcg/i386/Makefile.target
+++ b/tests/tcg/i386/Makefile.target
@@ -7,8 +7,8 @@ VPATH 		+= $(I386_SRC)
 
 I386_SRCS=$(notdir $(wildcard $(I386_SRC)/*.c))
 ALL_X86_TESTS=$(I386_SRCS:.c=)
-SKIP_I386_TESTS=test-i386-ssse3
-X86_64_TESTS:=$(filter test-i386-ssse3, $(ALL_X86_TESTS))
+SKIP_I386_TESTS=test-i386-ssse3 test-avx
+X86_64_TESTS:=$(filter test-i386-ssse3 test-avx, $(ALL_X86_TESTS))
 
 test-i386-sse-exceptions: CFLAGS += -msse4.1 -mfpmath=sse
 run-test-i386-sse-exceptions: QEMU_OPTS += -cpu max
@@ -80,3 +80,9 @@ run-sha512-sse: QEMU_OPTS+=-cpu max
 run-plugin-sha512-sse-with-%: QEMU_OPTS+=-cpu max
 
 TESTS+=sha512-sse
+
+test-avx.h: test-avx.py x86.csv
+	$(PYTHON) $(I386_SRC)/test-avx.py $(I386_SRC)/x86.csv $@
+
+test-avx: CFLAGS += -mavx -masm=intel -O -I.
+test-avx: test-avx.h
diff --git a/tests/tcg/i386/README b/tests/tcg/i386/README
index 09e88f30dc..403d10dad8 100644
--- a/tests/tcg/i386/README
+++ b/tests/tcg/i386/README
@@ -15,6 +15,15 @@ The Linux system call vm86() is used to test vm86 emulation.
 Various exceptions are raised to test most of the x86 user space
 exception reporting.
 
+test-avx
+--------
+
+This program executes most SSE/AVX instructions and generates a text output,
+for comparison with the output obtained with a real CPU or another emulator.
+
+test-avx.h is generated from x86.csv by test-avx.py
+x86.csv comes from https://github.com/quasilyte/avx512test
+
 linux-test
 ----------
 
diff --git a/tests/tcg/i386/test-avx.c b/tests/tcg/i386/test-avx.c
new file mode 100644
index 0000000000..953e2906fe
--- /dev/null
+++ b/tests/tcg/i386/test-avx.c
@@ -0,0 +1,347 @@
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+typedef void (*testfn)(void);
+
+typedef struct {
+    uint64_t q0, q1, q2, q3;
+} __attribute__((aligned(32))) v4di;
+
+typedef struct {
+    uint64_t mm[8];
+    v4di ymm[16];
+    uint64_t r[16];
+    uint64_t flags;
+    uint32_t ff;
+    uint64_t pad;
+    v4di mem[4];
+    v4di mem0[4];
+} reg_state;
+
+typedef struct {
+    int n;
+    testfn fn;
+    const char *s;
+    reg_state *init;
+} TestDef;
+
+reg_state initI;
+reg_state initF32;
+reg_state initF64;
+
+static void dump_ymm(const char *name, int n, const v4di *r, int ff)
+{
+    printf("%s%d = %016lx %016lx %016lx %016lx\n",
+           name, n, r->q3, r->q2, r->q1, r->q0);
+    if (ff == 64) {
+        double v[4];
+        memcpy(v, r, sizeof(v));
+        printf("        %16g %16g %16g %16g\n",
+                v[3], v[2], v[1], v[0]);
+    } else if (ff == 32) {
+        float v[8];
+        memcpy(v, r, sizeof(v));
+        printf(" %8g %8g %8g %8g %8g %8g %8g %8g\n",
+                v[7], v[6], v[5], v[4], v[3], v[2], v[1], v[0]);
+    }
+}
+
+static void dump_regs(reg_state *s)
+{
+    int i;
+
+    for (i = 0; i < 16; i++) {
+        dump_ymm("ymm", i, &s->ymm[i], 0);
+    }
+    for (i = 0; i < 4; i++) {
+        dump_ymm("mem", i, &s->mem0[i], 0);
+    }
+}
+
+static void compare_state(const reg_state *a, const reg_state *b)
+{
+    int i;
+    for (i = 0; i < 8; i++) {
+        if (a->mm[i] != b->mm[i]) {
+            printf("MM%d = %016lx\n", i, b->mm[i]);
+        }
+    }
+    for (i = 0; i < 16; i++) {
+        if (a->r[i] != b->r[i]) {
+            printf("r%d = %016lx\n", i, b->r[i]);
+        }
+    }
+    for (i = 0; i < 16; i++) {
+        if (memcmp(&a->ymm[i], &b->ymm[i], 32)) {
+            dump_ymm("ymm", i, &b->ymm[i], a->ff);
+        }
+    }
+    for (i = 0; i < 4; i++) {
+        if (memcmp(&a->mem0[i], &a->mem[i], 32)) {
+            dump_ymm("mem", i, &a->mem[i], a->ff);
+        }
+    }
+    if (a->flags != b->flags) {
+        printf("FLAGS = %016lx\n", b->flags);
+    }
+}
+
+#define LOADMM(r, o) "movq " #r ", " #o "[%0]\n\t"
+#define LOADYMM(r, o) "vmovdqa " #r ", " #o "[%0]\n\t"
+#define STOREMM(r, o) "movq " #o "[%1], " #r "\n\t"
+#define STOREYMM(r, o) "vmovdqa " #o "[%1], " #r "\n\t"
+#define MMREG(F) \
+    F(mm0, 0x00) \
+    F(mm1, 0x08) \
+    F(mm2, 0x10) \
+    F(mm3, 0x18) \
+    F(mm4, 0x20) \
+    F(mm5, 0x28) \
+    F(mm6, 0x30) \
+    F(mm7, 0x38)
+#define YMMREG(F) \
+    F(ymm0, 0x040) \
+    F(ymm1, 0x060) \
+    F(ymm2, 0x080) \
+    F(ymm3, 0x0a0) \
+    F(ymm4, 0x0c0) \
+    F(ymm5, 0x0e0) \
+    F(ymm6, 0x100) \
+    F(ymm7, 0x120) \
+    F(ymm8, 0x140) \
+    F(ymm9, 0x160) \
+    F(ymm10, 0x180) \
+    F(ymm11, 0x1a0) \
+    F(ymm12, 0x1c0) \
+    F(ymm13, 0x1e0) \
+    F(ymm14, 0x200) \
+    F(ymm15, 0x220)
+#define LOADREG(r, o) "mov " #r ", " #o "[rax]\n\t"
+#define STOREREG(r, o) "mov " #o "[rax], " #r "\n\t"
+#define REG(F) \
+    F(rbx, 0x248) \
+    F(rcx, 0x250) \
+    F(rdx, 0x258) \
+    F(rsi, 0x260) \
+    F(rdi, 0x268) \
+    F(r8, 0x280) \
+    F(r9, 0x288) \
+    F(r10, 0x290) \
+    F(r11, 0x298) \
+    F(r12, 0x2a0) \
+    F(r13, 0x2a8) \
+    F(r14, 0x2b0) \
+    F(r15, 0x2b8) \
+
+static void run_test(const TestDef *t)
+{
+    reg_state result;
+    reg_state *init = t->init;
+    memcpy(init->mem, init->mem0, sizeof(init->mem));
+    printf("%5d %s\n", t->n, t->s);
+    asm volatile(
+            MMREG(LOADMM)
+            YMMREG(LOADYMM)
+            "sub rsp, 128\n\t"
+            "push rax\n\t"
+            "push rbx\n\t"
+            "push rcx\n\t"
+            "push rdx\n\t"
+            "push %1\n\t"
+            "push %2\n\t"
+            "mov rax, %0\n\t"
+            "pushf\n\t"
+            "pop rbx\n\t"
+            "shr rbx, 8\n\t"
+            "shl rbx, 8\n\t"
+            "mov rcx, 0x2c0[rax]\n\t"
+            "and rcx, 0xff\n\t"
+            "or rbx, rcx\n\t"
+            "push rbx\n\t"
+            "popf\n\t"
+            REG(LOADREG)
+            "mov rax, 0x240[rax]\n\t"
+            "call [rsp]\n\t"
+            "mov [rsp], rax\n\t"
+            "mov rax, 8[rsp]\n\t"
+            REG(STOREREG)
+            "mov rbx, [rsp]\n\t"
+            "mov 0x240[rax], rbx\n\t"
+            "mov rbx, 0\n\t"
+            "mov 0x270[rax], rbx\n\t"
+            "mov 0x278[rax], rbx\n\t"
+            "pushf\n\t"
+            "pop rbx\n\t"
+            "and rbx, 0xff\n\t"
+            "mov 0x2c0[rax], rbx\n\t"
+            "add rsp, 16\n\t"
+            "pop rdx\n\t"
+            "pop rcx\n\t"
+            "pop rbx\n\t"
+            "pop rax\n\t"
+            "add rsp, 128\n\t"
+            MMREG(STOREMM)
+            YMMREG(STOREYMM)
+            : : "r"(init), "r"(&result), "r"(t->fn)
+            : "memory", "cc",
+            "rsi", "rdi",
+            "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
+            "mm0", "mm1", "mm2", "mm3", "mm4", "mm5", "mm6", "mm7",
+            "ymm0", "ymm1", "ymm2", "ymm3", "ymm4", "ymm5",
+            "ymm6", "ymm7", "ymm8", "ymm9", "ymm10", "ymm11",
+            "ymm12", "ymm13", "ymm14", "ymm15"
+            );
+    compare_state(init, &result);
+}
+
+#define TEST(n, cmd, type) \
+static void __attribute__((naked)) test_##n(void) \
+{ \
+    asm volatile(cmd); \
+    asm volatile("ret"); \
+}
+#include "test-avx.h"
+
+
+static const TestDef test_table[] = {
+#define TEST(n, cmd, type) {n, test_##n, cmd, &init##type},
+#include "test-avx.h"
+    {-1, NULL, "", NULL}
+};
+
+static void run_all(void)
+{
+    const TestDef *t;
+    for (t = test_table; t->fn; t++) {
+        run_test(t);
+    }
+}
+
+#define ARRAY_LEN(x) (sizeof(x) / sizeof(x[0]))
+
+float val_f32[] = {2.0, -1.0, 4.8, 0.8, 3, -42.0, 5e6, 7.5, 8.3};
+double val_f64[] = {2.0, -1.0, 4.8, 0.8, 3, -42.0, 5e6, 7.5};
+v4di val_i64[] = {
+    {0x3d6b3b6a9e4118f2lu, 0x355ae76d2774d78clu,
+     0xac3ff76c4daa4b28lu, 0xe7fabd204cb54083lu},
+    {0xd851c54a56bf1f29lu, 0x4a84d1d50bf4c4fflu,
+     0x56621e553d52b56clu, 0xd0069553da8f584alu},
+    {0x5826475e2c5fd799lu, 0xfd32edc01243f5e9lu,
+     0x738ba2c66d3fe126lu, 0x5707219c6e6c26b4lu},
+};
+
+v4di deadbeef = {0xa5a5a5a5deadbeefull, 0xa5a5a5a5deadbeefull,
+                 0xa5a5a5a5deadbeefull, 0xa5a5a5a5deadbeefull};
+v4di indexq = {0x000000000000001full, 0x000000000000008full,
+               0xffffffffffffffffull, 0xffffffffffffff5full};
+v4di indexd = {0x00000002000000efull, 0xfffffff500000010ull,
+               0x0000000afffffff0ull, 0x000000000000000eull};
+
+v4di gather_mem[0x20];
+
+void init_f32reg(v4di *r)
+{
+    static int n;
+    float v[8];
+    int i;
+    for (i = 0; i < 8; i++) {
+        v[i] = val_f32[n++];
+        if (n == ARRAY_LEN(val_f32)) {
+            n = 0;
+        }
+    }
+    memcpy(r, v, sizeof(*r));
+}
+
+void init_f64reg(v4di *r)
+{
+    static int n;
+    double v[4];
+    int i;
+    for (i = 0; i < 4; i++) {
+        v[i] = val_f64[n++];
+        if (n == ARRAY_LEN(val_f64)) {
+            n = 0;
+        }
+    }
+    memcpy(r, v, sizeof(*r));
+}
+
+void init_intreg(v4di *r)
+{
+    static uint64_t mask;
+    static int n;
+
+    r->q0 = val_i64[n].q0 ^ mask;
+    r->q1 = val_i64[n].q1 ^ mask;
+    r->q2 = val_i64[n].q2 ^ mask;
+    r->q3 = val_i64[n].q3 ^ mask;
+    n++;
+    if (n == ARRAY_LEN(val_i64)) {
+        n = 0;
+        mask *= 0x104C11DB7;
+    }
+}
+
+static void init_all(reg_state *s)
+{
+    int i;
+
+    s->r[3] = (uint64_t)&s->mem[0]; /* rdx */
+    s->r[4] = (uint64_t)&gather_mem[ARRAY_LEN(gather_mem) / 2]; /* rsi */
+    s->r[5] = (uint64_t)&s->mem[2]; /* rdi */
+    s->flags = 2;
+    for (i = 0; i < 16; i++) {
+        s->ymm[i] = deadbeef;
+    }
+    s->ymm[13] = indexd;
+    s->ymm[14] = indexq;
+    for (i = 0; i < 4; i++) {
+        s->mem0[i] = deadbeef;
+    }
+}
+
+int main(int argc, char *argv[])
+{
+    int i;
+
+    init_all(&initI);
+    init_intreg(&initI.ymm[10]);
+    init_intreg(&initI.ymm[11]);
+    init_intreg(&initI.ymm[12]);
+    init_intreg(&initI.mem0[1]);
+    printf("Int:\n");
+    dump_regs(&initI);
+
+    init_all(&initF32);
+    init_f32reg(&initF32.ymm[10]);
+    init_f32reg(&initF32.ymm[11]);
+    init_f32reg(&initF32.ymm[12]);
+    init_f32reg(&initF32.mem0[1]);
+    initF32.ff = 32;
+    printf("F32:\n");
+    dump_regs(&initF32);
+
+    init_all(&initF64);
+    init_f64reg(&initF64.ymm[10]);
+    init_f64reg(&initF64.ymm[11]);
+    init_f64reg(&initF64.ymm[12]);
+    init_f64reg(&initF64.mem0[1]);
+    initF64.ff = 64;
+    printf("F64:\n");
+    dump_regs(&initF64);
+
+    for (i = 0; i < ARRAY_LEN(gather_mem); i++) {
+        init_intreg(&gather_mem[i]);
+    }
+
+    if (argc > 1) {
+        int n = atoi(argv[1]);
+        run_test(&test_table[n]);
+    } else {
+        run_all();
+    }
+    return 0;
+}
diff --git a/tests/tcg/i386/test-avx.py b/tests/tcg/i386/test-avx.py
new file mode 100755
index 0000000000..0b2d799c5c
--- /dev/null
+++ b/tests/tcg/i386/test-avx.py
@@ -0,0 +1,352 @@
+#! /usr/bin/env python3
+
+# Generate test-avx.h from x86.csv
+
+import csv
+import sys
+from fnmatch import fnmatch
+
+archs = [
+    # TODO: MMX?
+    "SSE", "SSE2", "SSE3", "SSSE3", "SSE4_1", "SSE4_2",
+    "AVX", "AVX2", "AES+AVX", # "VAES+AVX",
+]
+
+ignore = set(["FISTTP",
+    "LDMXCSR", "VLDMXCSR", "STMXCSR", "VSTMXCSR"])
+
+imask = {
+    'vBLENDPD': 0xff,
+    'vBLENDPS': 0x0f,
+    'CMP[PS][SD]': 0x07,
+    'VCMP[PS][SD]': 0x1f,
+    'vDPPD': 0x33,
+    'vDPPS': 0xff,
+    'vEXTRACTPS': 0x03,
+    'vINSERTPS': 0xff,
+    'MPSADBW': 0x7,
+    'VMPSADBW': 0x3f,
+    'vPALIGNR': 0x3f,
+    'vPBLENDW': 0xff,
+    'vPCMP[EI]STR*': 0x0f,
+    'vPEXTRB': 0x0f,
+    'vPEXTRW': 0x07,
+    'vPEXTRD': 0x03,
+    'vPEXTRQ': 0x01,
+    'vPINSRB': 0x0f,
+    'vPINSRW': 0x07,
+    'vPINSRD': 0x03,
+    'vPINSRQ': 0x01,
+    'vPSHUF[DW]': 0xff,
+    'vPSHUF[LH]W': 0xff,
+    'vPS[LR][AL][WDQ]': 0x3f,
+    'vPS[RL]LDQ': 0x1f,
+    'vROUND[PS][SD]': 0x7,
+    'vSHUFPD': 0x0f,
+    'vSHUFPS': 0xff,
+    'vAESKEYGENASSIST': 0,
+    'VEXTRACT[FI]128': 0x01,
+    'VINSERT[FI]128': 0x01,
+    'VPBLENDD': 0xff,
+    'VPERM2[FI]128': 0x33,
+    'VPERMPD': 0xff,
+    'VPERMQ': 0xff,
+    'VPERMILPS': 0xff,
+    'VPERMILPD': 0x0f,
+    }
+
+def strip_comments(x):
+    for l in x:
+        if l != '' and l[0] != '#':
+            yield l
+
+def reg_w(w):
+    if w == 8:
+        return 'al'
+    elif w == 16:
+        return 'ax'
+    elif w == 32:
+        return 'eax'
+    elif w == 64:
+        return 'rax'
+    raise Exception("bad reg_w %d" % w)
+
+def mem_w(w):
+    if w == 8:
+        t = "BYTE"
+    elif w == 16:
+        t = "WORD"
+    elif w == 32:
+        t = "DWORD"
+    elif w == 64:
+        t = "QWORD"
+    elif w == 128:
+        t = "XMMWORD"
+    elif w == 256:
+        t = "YMMWORD"
+    else:
+        raise Exception()
+
+    return t + " PTR 32[rdx]"
+
+class XMMArg():
+    isxmm = True
+    def __init__(self, reg, mw):
+        if mw not in [0, 8, 16, 32, 64, 128, 256]:
+            raise Exception("Bad /m width: %s" % mw)
+        self.reg = reg
+        self.mw = mw
+        self.ismem = mw != 0
+    def regstr(self, n):
+        if n < 0:
+            return mem_w(self.mw)
+        else:
+            return "%smm%d" % (self.reg, n)
+
+class MMArg():
+    isxmm = True
+    ismem = False # TODO
+    def regstr(self, n):
+        return "mm%d" % (n & 7)
+
+def match(op, pattern):
+    if pattern[0] == 'v':
+        return fnmatch(op, pattern[1:]) or fnmatch(op, 'V'+pattern[1:])
+    return fnmatch(op, pattern)
+
+class ArgVSIB():
+    isxmm = True
+    ismem = False
+    def __init__(self, reg, w):
+        if w not in [32, 64]:
+            raise Exception("Bad vsib width: %s" % w)
+        self.w = w
+        self.reg = reg
+    def regstr(self, n):
+        reg = "%smm%d" % (self.reg, n >> 2)
+        return "[rsi + %s * %d]" % (reg, 1 << (n & 3))
+
+class ArgImm8u():
+    isxmm = False
+    ismem = False
+    def __init__(self, op):
+        for k, v in imask.items():
+            if match(op, k):
+                self.mask = v
+                return
+        raise Exception("Unknown immediate")
+    def vals(self):
+        mask = self.mask
+        yield 0
+        n = 0
+        while n != mask:
+            n += 1
+            while (n & ~mask) != 0:
+                n += (n & ~mask)
+            yield n
+
+class ArgRM():
+    isxmm = False
+    def __init__(self, rw, mw):
+        if rw not in [8, 16, 32, 64]:
+            raise Exception("Bad reg width: %s" % rw)
+        if mw not in [0, 8, 16, 32, 64]:
+            raise Exception("Bad mem width: %s" % mw)
+        self.rw = rw
+        self.mw = mw
+        self.ismem = mw != 0
+    def regstr(self, n):
+        if n < 0:
+            return mem_w(self.mw)
+        else:
+            return reg_w(self.rw)
+
+class ArgMem():
+    isxmm = False
+    ismem = True
+    def __init__(self, w):
+        if w not in [8, 16, 32, 64, 128, 256]:
+            raise Exception("Bad mem width: %s" % w)
+        self.w = w
+    def regstr(self, n):
+        return mem_w(self.w)
+
+def ArgGenerator(arg, op):
+    if arg[:3] == 'xmm' or arg[:3] == "ymm":
+        if "/" in arg:
+            r, m = arg.split('/')
+            if (m[0] != 'm'):
+                raise Exception("Expected /m: %s" % arg)
+            return XMMArg(arg[0], int(m[1:]))
+        else:
+            return XMMArg(arg[0], 0)
+    elif arg[:2] == 'mm':
+        return MMArg()
+    elif arg[:4] == 'imm8':
+        return ArgImm8u(op)
+    elif arg == '<XMM0>':
+        return None
+    elif arg[0] == 'r':
+        if '/m' in arg:
+            r, m = arg.split('/')
+            if (m[0] != 'm'):
+                raise Exception("Expected /m: %s" % arg)
+            mw = int(m[1:])
+            if r == 'r':
+                rw = mw
+            else:
+                rw = int(r[1:])
+            return ArgRM(rw, mw)
+
+        return ArgRM(int(arg[1:]), 0)
+    elif arg[0] == 'm':
+        return ArgMem(int(arg[1:]))
+    elif arg[:2] == 'vm':
+        return ArgVSIB(arg[-1], int(arg[2:-1]))
+    else:
+        raise Exception("Unrecognised arg: %s" % arg)
+
+class InsnGenerator:
+    def __init__(self, op, args):
+        self.op = op
+        if op[-2:] in ["PS", "PD", "SS", "SD"]:
+            if op[-1] == 'S':
+                self.optype = 'F32'
+            else:
+                self.optype = 'F64'
+        else:
+            self.optype = 'I'
+
+        try:
+            self.args = list(ArgGenerator(a, op) for a in args)
+            if len(self.args) > 0 and self.args[-1] is None:
+                self.args = self.args[:-1]
+        except Exception as e:
+            raise Exception("Bad arg %s: %s" % (op, e))
+
+    def gen(self):
+        regs = (10, 11, 12)
+        dest = 9
+
+        nreg = len(self.args)
+        if nreg == 0:
+            yield self.op
+            return
+        if isinstance(self.args[-1], ArgImm8u):
+            nreg -= 1
+            immarg = self.args[-1]
+        else:
+            immarg = None
+        memarg = -1
+        for n, arg in enumerate(self.args):
+            if arg.ismem:
+                memarg = n
+
+        if (self.op.startswith("VGATHER") or self.op.startswith("VPGATHER")):
+            if "GATHERD" in self.op:
+                ireg = 13 << 2
+            else:
+                ireg = 14 << 2
+            regset = [
+                (dest, ireg | 0, regs[0]),
+                (dest, ireg | 1, regs[0]),
+                (dest, ireg | 2, regs[0]),
+                (dest, ireg | 3, regs[0]),
+                ]
+            if memarg >= 0:
+                raise Exception("vsib with memory: %s" % self.op)
+        elif nreg == 1:
+            regset = [(regs[0],)]
+            if memarg == 0:
+                regset += [(-1,)]
+        elif nreg == 2:
+            regset = [
+                (regs[0], regs[1]),
+                (regs[0], regs[0]),
+                ]
+            if memarg == 0:
+                regset += [(-1, regs[0])]
+            elif memarg == 1:
+                regset += [(dest, -1)]
+        elif nreg == 3:
+            regset = [
+                (dest, regs[0], regs[1]),
+                (dest, regs[0], regs[0]),
+                (regs[0], regs[0], regs[1]),
+                (regs[0], regs[1], regs[0]),
+                (regs[0], regs[0], regs[0]),
+                ]
+            if memarg == 2:
+                regset += [
+                    (dest, regs[0], -1),
+                    (regs[0], regs[0], -1),
+                    ]
+            elif memarg > 0:
+                raise Exception("Memarg %d" % memarg)
+        elif nreg == 4:
+            regset = [
+                (dest, regs[0], regs[1], regs[2]),
+                (dest, regs[0], regs[0], regs[1]),
+                (dest, regs[0], regs[1], regs[0]),
+                (dest, regs[1], regs[0], regs[0]),
+                (dest, regs[0], regs[0], regs[0]),
+                (regs[0], regs[0], regs[1], regs[2]),
+                (regs[0], regs[1], regs[0], regs[2]),
+                (regs[0], regs[1], regs[2], regs[0]),
+                (regs[0], regs[0], regs[0], regs[1]),
+                (regs[0], regs[0], regs[1], regs[0]),
+                (regs[0], regs[1], regs[0], regs[0]),
+                (regs[0], regs[0], regs[0], regs[0]),
+                ]
+            if memarg == 2:
+                regset += [
+                    (dest, regs[0], -1, regs[1]),
+                    (dest, regs[0], -1, regs[0]),
+                    (regs[0], regs[0], -1, regs[1]),
+                    (regs[0], regs[1], -1, regs[0]),
+                    (regs[0], regs[0], -1, regs[0]),
+                    ]
+            elif memarg > 0:
+                raise Exception("Memarg4 %d" % memarg)
+        else:
+            raise Exception("Too many regs: %s(%d)" % (self.op, nreg))
+
+        for regv in regset:
+            argstr = []
+            for i in range(nreg):
+                arg = self.args[i]
+                argstr.append(arg.regstr(regv[i]))
+            if immarg is None:
+                yield self.op + ' ' + ','.join(argstr)
+            else:
+                for immval in immarg.vals():
+                    yield self.op + ' ' + ','.join(argstr) + ',' + str(immval)
+
+def split0(s):
+    if s == '':
+        return []
+    return s.split(',')
+
+def main():
+    n = 0
+    if len(sys.argv) != 3:
+        print("Usage: test-avx.py x86.csv test-avx.h")
+        exit(1)
+    csvfile = open(sys.argv[1], 'r', newline='')
+    with open(sys.argv[2], "w") as outf:
+        outf.write("// Generated by test-avx.py. Do not edit.\n")
+        for row in csv.reader(strip_comments(csvfile)):
+            insn = row[0].replace(',', '').split()
+            if insn[0] in ignore:
+                continue
+            cpuid = row[6]
+            if cpuid in archs:
+                g = InsnGenerator(insn[0], insn[1:])
+                for insn in g.gen():
+                    outf.write('TEST(%d, "%s", %s)\n' % (n, insn, g.optype))
+                    n += 1
+        outf.write("#undef TEST\n")
+        csvfile.close()
+
+if __name__ == "__main__":
+    main()
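
Note on ArgImm8u.vals() in the script above: the nested loop walks every
immediate whose set bits all fall inside the instruction's mask, in
ascending order. A mask of 0xff therefore yields 256 immediates per register
combination, while 0x01 yields two. The following is a minimal standalone
sketch of the same loop; the name masked_values is hypothetical and is not
part of the patch.

```python
def masked_values(mask):
    """Yield every n with (n & ~mask) == 0, in ascending order."""
    yield 0
    n = 0
    while n != mask:
        n += 1
        # Any bit outside the mask is pushed upward by adding it to n,
        # which carries into the next higher position. Repeat until
        # only bits permitted by the mask remain set.
        while (n & ~mask) != 0:
            n += (n & ~mask)
        yield n


if __name__ == "__main__":
    # Mask 0x05 permits bits 0 and 2 only.
    print(list(masked_values(0x05)))
    # Mask 0x33 (as used for VPERM2[FI]128) permits four bits.
    print(len(list(masked_values(0x33))))
```

The number of values produced is 2 to the power of the number of set bits in
the mask, which is why the wide 0xff masks dominate the size of the
generated test list.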
diff --git a/tests/tcg/i386/x86.csv b/tests/tcg/i386/x86.csv
new file mode 100644
index 0000000000..d5d0c17f1b
--- /dev/null
+++ b/tests/tcg/i386/x86.csv
@@ -0,0 +1,4658 @@
+# x86 instruction set description version 0.2x, 2018-05-08
+#
+# https://golang.org/x/arch/x86
+#
+# The latest version of the CSV file is
+# available online at https://golang.org/s/x86.csv.
+#
+# This file contains a block of comment lines, each beginning with #,
+# followed by entries in CSV format. All the # comments are at the top
+# of the file, so a reader can skip past the comments and hand the
+# rest of the file to a standard CSV reader.
+# Each CSV line contains these fields:
+#
+# 1. The Intel manual instruction mnemonic. For example, "SHR r/m32, imm8".
+#
+# 2. The Go assembler instruction mnemonic. For example, "SHRL imm8, r/m32".
+#
+# 3. The GNU binutils instruction mnemonic. For example, "shrl imm8, r/m32".
+#
+# 4. The instruction encoding. For example, "C1 /4 ib".
+#
+# 5. The validity of the instruction in 32-bit (aka compatibility, legacy) mode.
+#
+# 6. The validity of the instruction in 64-bit mode.
+#
+# 7. The CPUID feature flags that signal support for the instruction.
+#
+# 8. Additional comma-separated tags containing hints about the instruction.
+#
+# 9. The read/write actions of the instruction on the arguments used in
+# the Intel mnemonic. For example, "rw,r" to denote that "SHR r/m32, imm8"
+# reads and writes its first argument but only reads its second argument.
+#
+# 10. Whether the opcode used in the Intel mnemonic has encoding forms
+# distinguished only by operand size, like most arithmetic instructions.
+# The string "Y" indicates yes, the string "" indicates no.
+#
+# 11. The data size of the operation in bits. In general this is the size corresponding
+# to the Go and GNU assembler opcode suffix.
+# Mnemonics (the opcode string)
+#
+# The instruction mnemonics are as used in the Intel manual, with a few exceptions.
+#
+# Mnemonics claiming general memory forms but that really require fixed addressing modes
+# are omitted in favor of their equivalents with implicit arguments.
+# For example, "CMPS m16, m16" (really CMPS [SI], [DI]) is omitted in favor of "CMPSW".
+#
+# Instruction forms with an explicit REP, REPE, or REPNE prefix are also omitted.
+# Encoders and decoders are expected to handle those prefixes separately.
+#
+# Perhaps most significantly, the argument syntaxes used in the mnemonic indicate
+# exactly how to derive the argument from the instruction encoding, or vice versa.
+#
+# Immediate values: imm8, imm8u, imm16, imm16u, imm32, imm64.
+# Immediates are signed by default; the u suffix indicates an unsigned value.
+# Immediates may have a bitfield-like modifier that specifies how many bits
+# are used. For example, imm8u:4 is encoded like an 8-bit immediate,
+# but only 4 bits are meaningful while the others are ignored or must be 0.
+#
+# Memory operands. The forms m, m128, m14/28byte, m16, m16&16, m16&32, m16&64, m16:16, m16:32,
+# m16:64, m16int, m256, m2byte, m32, m32&32, m32fp, m32int, m512byte, m64, m64fp, m64int,
+# m8, m80bcd, m80dec, m80fp, m94/108byte. These operands always correspond to the
+# memory address specified by the r/m half of the modrm encoding.
+#
+# Integer registers.
+# The forms r8, r16, r32, r64 indicate a register selected by the modrm reg encoding.
+# The forms rmr16, rmr32, rmr64 indicate a register (never memory) selected by the modrm r/m encoding.
+# The forms r/m8, r/m16, r/m32, and r/m64 indicate a register or memory selected by the modrm r/m encoding.
+# Forms with two sizes, like r32/m16, also indicate a register or memory selected by the modrm r/m encoding,
+# but the size for a register argument differs from the size of a memory argument.
+# The forms r8V, r16V, r32V, r64V indicate a register selected by the VEX.vvvv bits.
+#
+# Multimedia registers.
+# The forms mm1, xmm1, and ymm1 indicate a multimedia register selected by the
+# modrm reg encoding.
+# The forms mm2, xmm2, and ymm2 indicate a register (never memory) selected by
+# the modrm r/m encoding.
+# The forms mm2/m64, xmm2/m128, and so on indicate a register or memory
+# selected by the modrm r/m encoding.
+# The forms xmmV and ymmV indicate a register selected by the VEX.vvvv bits.
+# The forms xmmI and ymmI indicate a register selected by the top four bits of an /is4 immediate byte.
+#
+# Bound registers.
+# The form bnd1 indicates a bound register selected by the modrm reg encoding.
+# The form bnd2 indicates a bound register (never memory) selected by the modrm r/m encoding.
+# The forms bnd2/m64 and bnd2/m128 indicate a register or memory selected by the modrm r/m encoding.
+# TODO: Describe mib.
+#
+# One-of-a-kind operands: rel8, rel16, rel32, ptr16:16, ptr16:32,
+# moffs8, moffs16, moffs32, moffs64, vm32x, vm32y, vm64x, and vm64y
+# are all as in the Intel manual.
+#
+# Encodings
+#
+# The encodings are also as used in the Intel manual, with automated corrections.
+# For example, the Intel manual sometimes omits the modrm /r indicator or other trailing bytes,
+# and it also contains typographical errors.
+# These problems are corrected so that the CSV data may be used to generate
+# tools for processing x86 machine code.
+# See https://golang.org/x/arch/x86/x86map for one such generator.
+#
+# Valid32 and Valid64
+#
+# These columns hold validity abbreviations as defined in the Intel manual:
+# V, I, N.E., N.P., N.S., or N.I.
+# Tools processing the data are typically only concerned with whether the
+# column is "V" (valid) or not.
+# This data is also corrected compared to the manual.
+# For example, the manual lists many instruction forms using REX bytes
+# with an incorrect "V" in the Valid32 column.
+#
+# CPUID Feature Flags
+#
+# This column specifies CPUID feature flags that must be present in order
+# to use the instruction. If multiple flags are required,
+# they are listed separated by plus signs, as in PCLMULQDQ+AVX.
+# The column can also list one of the values 486, Pentium, PentiumII, and P6,
+# indicating that the instruction was introduced on that architecture version.
+#
+# Tags
+#
+# The tag column does not correspond to a traditional column in the Intel manual tables.
+# Instead, it is itself a comma-separated list of tags or hints derived by analysis
+# of the instruction set or the instruction encodings.
+#
+# The tags address16, address32, and address64 indicate that the instruction form
+# applies when using the specified addressing size. It may therefore be necessary to use an
+# address size prefix byte to access the instruction.
+# If two address tags are listed, the instruction can be used with either of those
+# address sizes. An instruction will never list all three address sizes.
+# (In fact, today, no instruction lists two address sizes, but that may change.)
+#
+# The tags operand16, operand32, and operand64 indicate that the instruction form
+# applies when using the specified operand size. It may therefore be necessary to use an
+# operand size prefix byte to access the instruction.
+# If two operand tags are listed, the instruction can be used with either of those
+# operand sizes. An instruction will never list all three operand sizes.
+# For some instructions, default64 is used instead of operand64,
+# which specifies data promotion to 64-bit.
+# For instructions with different possible data sizes,
+# it also indicates that the default data size is 64-bit instead of 32-bit.
+# Using a refining prefix such as 0x66 will lead to a 32-bit operation (if supported).
+#
+# The tags modrm_regonly or modrm_memonly indicate that the modrm byte's
+# r/m encoding must specify a register or memory, respectively.
+# Especially in newer instructions, the modrm constraint may be the only way
+# to distinguish two instruction forms. For example the MOVHLPS and MOVLPS
+# instructions share the same encoding, except that the former requires the
+# modrm byte's r/m to indicate a register, while the latter requires it to indicate memory.
+#
+# The tags pseudo and pseudo64 indicate that this instruction form is redundant
+# with others listed in the table and should be ignored when generating disassembly
+# or instruction scanning programs. The pseudo64 tag is reserved for the case where
+# the manual lists an instruction twice, once with the optional 64-bit mode REX byte.
+# Since most decoders will handle the REX byte separately, the form with the
+# unnecessary REX is tagged pseudo64.
+#
+# The amd tag marks AMD-specific instructions.
+# As an example, all SSE4a instructions have this tag.
+#
+# The AVX512-specific tags: scaleX and bscaleX.
+# scale1, scale2, scale4, scale8, scale16, scale32, scale64 specify
+# the compressed displacement multiplier (scaling).
+# For example, if the displacement is 128 and scale32 is set,
+# the disp8 value should be calculated as 128/32.
+# bscale4 and bscale8 have the same meaning, but are used
+# when the instruction uses the embedded broadcast feature.
+# If an instruction does not have a bscaleX tag, it does not support EVEX broadcasting.
+#
+# Related packages (can be a good source of additional documentation):
+#	x86csv - read and manipulate x86.csv
+#	x86spec - x86.csv generator
+#	x86map - x86asm table generator based on x86.csv
+#	x86avxgen - cmd/internal/obj/x86 optab generator based on x86.csv
+# All listed packages are located at golang.org/x/arch/x86/.
+"PUSH imm32","-/PUSHL/PUSHQ imm32","-/pushl/pushq imm32","68 id","V","N.S.","","operand32","r","Y",""
+"PUSH imm32","-/PUSHL/PUSHQ imm32","-/pushl/pushq imm32","68 id","N.S.","V","","default64","r","Y",""
+"AAA","AAA","aaa","37","V","N.S.","","","","",""
+"AAD","AAD","aad","D5 0A","V","I","","pseudo","","",""
+"AAD imm8u","AAD imm8u","aad imm8u","D5 ib","V","N.S.","","","r","",""
+"AAM","AAM","aam","D4 0A","V","I","","pseudo","","",""
+"AAM imm8u","AAM imm8u","aam imm8u","D4 ib","V","N.S.","","","r","",""
+"AAS","AAS","aas","3F","V","N.S.","","","","",""
+"ADC AL, imm8","ADCB imm8, AL","adcb imm8, AL","14 ib","V","V","","","rw,r","Y","8"
+"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","80 /2 ib","V","V","","","rw,r","Y","8"
+"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","82 /2 ib","V","N.S.","","","rw,r","Y","8"
+"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","REX 80 /2 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADC r8, r/m8","ADCB r/m8, r8","adcb r/m8, r8","12 /r","V","V","","","rw,r","Y","8"
+"ADC r8, r/m8","ADCB r/m8, r8","adcb r/m8, r8","REX 12 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADC r/m8, r8","ADCB r8, r/m8","adcb r8, r/m8","10 /r","V","V","","","rw,r","Y","8"
+"ADC r/m8, r8","ADCB r8, r/m8","adcb r8, r/m8","REX 10 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADC EAX, imm32","ADCL imm32, EAX","adcl imm32, EAX","15 id","V","V","","operand32","rw,r","Y","32"
+"ADC r/m32, imm32","ADCL imm32, r/m32","adcl imm32, r/m32","81 /2 id","V","V","","operand32","rw,r","Y","32"
+"ADC r/m32, imm8","ADCL imm8, r/m32","adcl imm8, r/m32","83 /2 ib","V","V","","operand32","rw,r","Y","32"
+"ADC r32, r/m32","ADCL r/m32, r32","adcl r/m32, r32","13 /r","V","V","","operand32","rw,r","Y","32"
+"ADC r/m32, r32","ADCL r32, r/m32","adcl r32, r/m32","11 /r","V","V","","operand32","rw,r","Y","32"
+"ADC RAX, imm32","ADCQ imm32, RAX","adcq imm32, RAX","REX.W 15 id","N.S.","V","","","rw,r","Y","64"
+"ADC r/m64, imm32","ADCQ imm32, r/m64","adcq imm32, r/m64","REX.W 81 /2 id","N.S.","V","","","rw,r","Y","64"
+"ADC r/m64, imm8","ADCQ imm8, r/m64","adcq imm8, r/m64","REX.W 83 /2 ib","N.S.","V","","","rw,r","Y","64"
+"ADC r64, r/m64","ADCQ r/m64, r64","adcq r/m64, r64","REX.W 13 /r","N.S.","V","","","rw,r","Y","64"
+"ADC r/m64, r64","ADCQ r64, r/m64","adcq r64, r/m64","REX.W 11 /r","N.S.","V","","","rw,r","Y","64"
+"ADC AX, imm16","ADCW imm16, AX","adcw imm16, AX","15 iw","V","V","","operand16","rw,r","Y","16"
+"ADC r/m16, imm16","ADCW imm16, r/m16","adcw imm16, r/m16","81 /2 iw","V","V","","operand16","rw,r","Y","16"
+"ADC r/m16, imm8","ADCW imm8, r/m16","adcw imm8, r/m16","83 /2 ib","V","V","","operand16","rw,r","Y","16"
+"ADC r16, r/m16","ADCW r/m16, r16","adcw r/m16, r16","13 /r","V","V","","operand16","rw,r","Y","16"
+"ADC r/m16, r16","ADCW r16, r/m16","adcw r16, r/m16","11 /r","V","V","","operand16","rw,r","Y","16"
+"ADCX r32, r/m32","ADCXL r/m32, r32","adcxl r/m32, r32","66 0F 38 F6 /r","V","V","ADX","operand16,operand32","rw,r","Y","32"
+"ADCX r64, r/m64","ADCXQ r/m64, r64","adcxq r/m64, r64","66 REX.W 0F 38 F6 /r","N.S.","V","ADX","","rw,r","Y","64"
+"ADD AL, imm8","ADDB imm8, AL","addb imm8, AL","04 ib","V","V","","","rw,r","Y","8"
+"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","80 /0 ib","V","V","","","rw,r","Y","8"
+"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","82 /0 ib","V","N.S.","","","rw,r","Y","8"
+"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","REX 80 /0 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADD r8, r/m8","ADDB r/m8, r8","addb r/m8, r8","02 /r","V","V","","","rw,r","Y","8"
+"ADD r8, r/m8","ADDB r/m8, r8","addb r/m8, r8","REX 02 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADD r/m8, r8","ADDB r8, r/m8","addb r8, r/m8","00 /r","V","V","","","rw,r","Y","8"
+"ADD r/m8, r8","ADDB r8, r/m8","addb r8, r/m8","REX 00 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADD EAX, imm32","ADDL imm32, EAX","addl imm32, EAX","05 id","V","V","","operand32","rw,r","Y","32"
+"ADD r/m32, imm32","ADDL imm32, r/m32","addl imm32, r/m32","81 /0 id","V","V","","operand32","rw,r","Y","32"
+"ADD r/m32, imm8","ADDL imm8, r/m32","addl imm8, r/m32","83 /0 ib","V","V","","operand32","rw,r","Y","32"
+"ADD r32, r/m32","ADDL r/m32, r32","addl r/m32, r32","03 /r","V","V","","operand32","rw,r","Y","32"
+"ADD r/m32, r32","ADDL r32, r/m32","addl r32, r/m32","01 /r","V","V","","operand32","rw,r","Y","32"
+"ADDPD xmm1, xmm2/m128","ADDPD xmm2/m128, xmm1","addpd xmm2/m128, xmm1","66 0F 58 /r","V","V","SSE2","","rw,r","",""
+"ADDPS xmm1, xmm2/m128","ADDPS xmm2/m128, xmm1","addps xmm2/m128, xmm1","0F 58 /r","V","V","SSE","","rw,r","",""
+"ADD RAX, imm32","ADDQ imm32, RAX","addq imm32, RAX","REX.W 05 id","N.S.","V","","","rw,r","Y","64"
+"ADD r/m64, imm32","ADDQ imm32, r/m64","addq imm32, r/m64","REX.W 81 /0 id","N.S.","V","","","rw,r","Y","64"
+"ADD r/m64, imm8","ADDQ imm8, r/m64","addq imm8, r/m64","REX.W 83 /0 ib","N.S.","V","","","rw,r","Y","64"
+"ADD r64, r/m64","ADDQ r/m64, r64","addq r/m64, r64","REX.W 03 /r","N.S.","V","","","rw,r","Y","64"
+"ADD r/m64, r64","ADDQ r64, r/m64","addq r64, r/m64","REX.W 01 /r","N.S.","V","","","rw,r","Y","64"
+"ADDSD xmm1, xmm2/m64","ADDSD xmm2/m64, xmm1","addsd xmm2/m64, xmm1","F2 0F 58 /r","V","V","SSE2","","rw,r","",""
+"ADDSS xmm1, xmm2/m32","ADDSS xmm2/m32, xmm1","addss xmm2/m32, xmm1","F3 0F 58 /r","V","V","SSE","","rw,r","",""
+"ADDSUBPD xmm1, xmm2/m128","ADDSUBPD xmm2/m128, xmm1","addsubpd xmm2/m128, xmm1","66 0F D0 /r","V","V","SSE3","","rw,r","",""
+"ADDSUBPS xmm1, xmm2/m128","ADDSUBPS xmm2/m128, xmm1","addsubps xmm2/m128, xmm1","F2 0F D0 /r","V","V","SSE3","","rw,r","",""
+"ADD AX, imm16","ADDW imm16, AX","addw imm16, AX","05 iw","V","V","","operand16","rw,r","Y","16"
+"ADD r/m16, imm16","ADDW imm16, r/m16","addw imm16, r/m16","81 /0 iw","V","V","","operand16","rw,r","Y","16"
+"ADD r/m16, imm8","ADDW imm8, r/m16","addw imm8, r/m16","83 /0 ib","V","V","","operand16","rw,r","Y","16"
+"ADD r16, r/m16","ADDW r/m16, r16","addw r/m16, r16","03 /r","V","V","","operand16","rw,r","Y","16"
+"ADD r/m16, r16","ADDW r16, r/m16","addw r16, r/m16","01 /r","V","V","","operand16","rw,r","Y","16"
+"ADOX r32, r/m32","ADOXL r/m32, r32","adoxl r/m32, r32","F3 0F 38 F6 /r","V","V","ADX","operand16,operand32","rw,r","Y","32"
+"ADOX r64, r/m64","ADOXQ r/m64, r64","adoxq r/m64, r64","F3 REX.W 0F 38 F6 /r","N.S.","V","ADX","","rw,r","Y","64"
+"AESDEC xmm1, xmm2/m128","AESDEC xmm2/m128, xmm1","aesdec xmm2/m128, xmm1","66 0F 38 DE /r","V","V","AES","","rw,r","",""
+"AESDECLAST xmm1, xmm2/m128","AESDECLAST xmm2/m128, xmm1","aesdeclast xmm2/m128, xmm1","66 0F 38 DF /r","V","V","AES","","rw,r","",""
+"AESENC xmm1, xmm2/m128","AESENC xmm2/m128, xmm1","aesenc xmm2/m128, xmm1","66 0F 38 DC /r","V","V","AES","","rw,r","",""
+"AESENCLAST xmm1, xmm2/m128","AESENCLAST xmm2/m128, xmm1","aesenclast xmm2/m128, xmm1","66 0F 38 DD /r","V","V","AES","","rw,r","",""
+"AESIMC xmm1, xmm2/m128","AESIMC xmm2/m128, xmm1","aesimc xmm2/m128, xmm1","66 0F 38 DB /r","V","V","AES","","w,r","",""
+"AESKEYGENASSIST xmm1, xmm2/m128, imm8u","AESKEYGENASSIST imm8u, xmm2/m128, xmm1","aeskeygenassist imm8u, xmm2/m128, xmm1","66 0F 3A DF /r ib","V","V","AES","","w,r,r","",""
+"AND AL, imm8","ANDB imm8, AL","andb imm8, AL","24 ib","V","V","","","rw,r","Y","8"
+"AND r/m8, imm8","ANDB imm8, r/m8","andb imm8, r/m8","REX 80 /4 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"AND r/m8, imm8u","ANDB imm8u, r/m8","andb imm8u, r/m8","80 /4 ib","V","V","","","rw,r","Y","8"
+"AND r/m8, imm8u","ANDB imm8u, r/m8","andb imm8u, r/m8","82 /4 ib","V","N.S.","","","rw,r","Y","8"
+"AND r8, r/m8","ANDB r/m8, r8","andb r/m8, r8","22 /r","V","V","","","rw,r","Y","8"
+"AND r8, r/m8","ANDB r/m8, r8","andb r/m8, r8","REX 22 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"AND r/m8, r8","ANDB r8, r/m8","andb r8, r/m8","20 /r","V","V","","","rw,r","Y","8"
+"AND r/m8, r8","ANDB r8, r/m8","andb r8, r/m8","REX 20 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"AND EAX, imm32","ANDL imm32, EAX","andl imm32, EAX","25 id","V","V","","operand32","rw,r","Y","32"
+"AND r/m32, imm32","ANDL imm32, r/m32","andl imm32, r/m32","81 /4 id","V","V","","operand32","rw,r","Y","32"
+"AND r/m32, imm8","ANDL imm8, r/m32","andl imm8, r/m32","83 /4 ib","V","V","","operand32","rw,r","Y","32"
+"AND r32, r/m32","ANDL r/m32, r32","andl r/m32, r32","23 /r","V","V","","operand32","rw,r","Y","32"
+"AND r/m32, r32","ANDL r32, r/m32","andl r32, r/m32","21 /r","V","V","","operand32","rw,r","Y","32"
+"ANDN r32, r32V, r/m32","ANDNL r/m32, r32V, r32","andnl r/m32, r32V, r32","VEX.DDS.128.0F38.W0 F2 /r","V","V","BMI1","","rw,r,r","Y","32"
+"ANDNPD xmm1, xmm2/m128","ANDNPD xmm2/m128, xmm1","andnpd xmm2/m128, xmm1","66 0F 55 /r","V","V","SSE2","","rw,r","",""
+"ANDNPS xmm1, xmm2/m128","ANDNPS xmm2/m128, xmm1","andnps xmm2/m128, xmm1","0F 55 /r","V","V","SSE","","rw,r","",""
+"ANDN r64, r64V, r/m64","ANDNQ r/m64, r64V, r64","andnq r/m64, r64V, r64","VEX.DDS.128.0F38.W1 F2 /r","N.S.","V","BMI1","","rw,r,r","Y","64"
+"ANDPD xmm1, xmm2/m128","ANDPD xmm2/m128, xmm1","andpd xmm2/m128, xmm1","66 0F 54 /r","V","V","SSE2","","rw,r","",""
+"ANDPS xmm1, xmm2/m128","ANDPS xmm2/m128, xmm1","andps xmm2/m128, xmm1","0F 54 /r","V","V","SSE","","rw,r","",""
+"AND RAX, imm32","ANDQ imm32, RAX","andq imm32, RAX","REX.W 25 id","N.S.","V","","","rw,r","Y","64"
+"AND r/m64, imm32","ANDQ imm32, r/m64","andq imm32, r/m64","REX.W 81 /4 id","N.S.","V","","","rw,r","Y","64"
+"AND r/m64, imm8","ANDQ imm8, r/m64","andq imm8, r/m64","REX.W 83 /4 ib","N.S.","V","","","rw,r","Y","64"
+"AND r64, r/m64","ANDQ r/m64, r64","andq r/m64, r64","REX.W 23 /r","N.S.","V","","","rw,r","Y","64"
+"AND r/m64, r64","ANDQ r64, r/m64","andq r64, r/m64","REX.W 21 /r","N.S.","V","","","rw,r","Y","64"
+"AND AX, imm16","ANDW imm16, AX","andw imm16, AX","25 iw","V","V","","operand16","rw,r","Y","16"
+"AND r/m16, imm16","ANDW imm16, r/m16","andw imm16, r/m16","81 /4 iw","V","V","","operand16","rw,r","Y","16"
+"AND r/m16, imm8","ANDW imm8, r/m16","andw imm8, r/m16","83 /4 ib","V","V","","operand16","rw,r","Y","16"
+"AND r16, r/m16","ANDW r/m16, r16","andw r/m16, r16","23 /r","V","V","","operand16","rw,r","Y","16"
+"AND r/m16, r16","ANDW r16, r/m16","andw r16, r/m16","21 /r","V","V","","operand16","rw,r","Y","16"
+"ARPL r/m16, r16","ARPL r16, r/m16","arpl r16, r/m16","63 /r","V","N.S.","","","rw,r","",""
+"BEXTR r32, r/m32, r32V","BEXTRL r32V, r/m32, r32","bextrl r32V, r/m32, r32","VEX.NDS.128.0F38.W0 F7 /r","V","V","BMI1","","w,r,r","Y","32"
+"BEXTR r64, r/m64, r64V","BEXTRQ r64V, r/m64, r64","bextrq r64V, r/m64, r64","VEX.NDS.128.0F38.W1 F7 /r","N.S.","V","BMI1","","w,r,r","Y","64"
+"BEXTR_XOP r32, r/m32, imm32u","BEXTR_XOPL imm32u, r/m32, r32","bextr_xopl imm32u, r/m32, r32","XOP.128.0A.WIG 10 /r","V","V","TBM","amd,operand16,operand32","w,r,r","Y","32"
+"BEXTR_XOP r64, r/m64, imm32u","BEXTR_XOPQ imm32u, r/m64, r64","bextr_xopq imm32u, r/m64, r64","XOP.128.0A.WIG 10 /r","N.S.","V","TBM","amd,operand64","w,r,r","Y","64"
+"BLCFILL r32V, r/m32","BLCFILLL r/m32, r32V","blcfill r/m32, r32V","XOP.NDD.128.09.WIG 01 /1","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLCFILL r64V, r/m64","BLCFILLQ r/m64, r64V","blcfill r/m64, r64V","XOP.NDD.128.09.W1 01 /1","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLCIC r32V, r/m32","BLCICL r/m32, r32V","blcicl r/m32, r32V","XOP.NDD.128.09.WIG 01 /5","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLCIC r64V, r/m64","BLCICQ r/m64, r64V","blcicq r/m64, r64V","XOP.NDD.128.09.WIG 01 /5","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLCI r32V, r/m32","BLCIL r/m32, r32V","blcil r/m32, r32V","XOP.NDD.128.09.WIG 02 /6","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLCI r64V, r/m64","BLCIQ r/m64, r64V","blciq r/m64, r64V","XOP.NDD.128.09.WIG 02 /6","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLCMSK r32V, r/m32","BLCMSKL r/m32, r32V","blcmskl r/m32, r32V","XOP.NDD.128.09.WIG 02 /1","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLCMSK r64V, r/m64","BLCMSKQ r/m64, r64V","blcmskq r/m64, r64V","XOP.NDD.128.09.WIG 02 /1","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLCS r32V, r/m32","BLCSL r/m32, r32V","blcsl r/m32, r32V","XOP.NDD.128.09.WIG 01 /3","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLCS r64V, r/m64","BLCSQ r/m64, r64V","blcsq r/m64, r64V","XOP.NDD.128.09.WIG 01 /3","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLENDPD xmm1, xmm2/m128, imm8u","BLENDPD imm8u, xmm2/m128, xmm1","blendpd imm8u, xmm2/m128, xmm1","66 0F 3A 0D /r ib","V","V","SSE4_1","","rw,r,r","",""
+"BLENDPS xmm1, xmm2/m128, imm8u","BLENDPS imm8u, xmm2/m128, xmm1","blendps imm8u, xmm2/m128, xmm1","66 0F 3A 0C /r ib","V","V","SSE4_1","","rw,r,r","",""
+"BLENDVPD xmm1, xmm2/m128, <XMM0>","BLENDVPD <XMM0>, xmm2/m128, xmm1","blendvpd <XMM0>, xmm2/m128, xmm1","66 0F 38 15 /r","V","V","SSE4_1","","rw,r,r","",""
+"BLENDVPS xmm1, xmm2/m128, <XMM0>","BLENDVPS <XMM0>, xmm2/m128, xmm1","blendvps <XMM0>, xmm2/m128, xmm1","66 0F 38 14 /r","V","V","SSE4_1","","rw,r,r","",""
+"BLSFILL r32V, r/m32","BLSFILLL r/m32, r32V","blsfill r/m32, r32V","XOP.NDD.128.09.WIG 01 /2","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLSFILL r64V, r/m64","BLSFILLQ r/m64, r64V","blsfill r/m64, r64V","XOP.NDD.128.09.W1 01 /2","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLSIC r32V, r/m32","BLSICL r/m32, r32V","blsicl r/m32, r32V","XOP.NDD.128.09.WIG 01 /6","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLSIC r64V, r/m64","BLSICQ r/m64, r64V","blsicq r/m64, r64V","XOP.NDD.128.09.WIG 01 /6","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLSI r32V, r/m32","BLSIL r/m32, r32V","blsil r/m32, r32V","VEX.NDD.128.0F38.W0 F3 /3","V","V","BMI1","","w,r","Y","32"
+"BLSI r64V, r/m64","BLSIQ r/m64, r64V","blsiq r/m64, r64V","VEX.NDD.128.0F38.W1 F3 /3","N.S.","V","BMI1","","w,r","Y","64"
+"BLSMSK r32V, r/m32","BLSMSKL r/m32, r32V","blsmskl r/m32, r32V","VEX.NDD.128.0F38.W0 F3 /2","V","V","BMI1","","w,r","Y","32"
+"BLSMSK r64V, r/m64","BLSMSKQ r/m64, r64V","blsmskq r/m64, r64V","VEX.NDD.128.0F38.W1 F3 /2","N.S.","V","BMI1","","w,r","Y","64"
+"BLSR r32V, r/m32","BLSRL r/m32, r32V","blsrl r/m32, r32V","VEX.NDD.128.0F38.W0 F3 /1","V","V","BMI1","","w,r","Y","32"
+"BLSR r64V, r/m64","BLSRQ r/m64, r64V","blsrq r/m64, r64V","VEX.NDD.128.0F38.W1 F3 /1","N.S.","V","BMI1","","w,r","Y","64"
+"BNDCL bnd1, r/m32","BNDCL r/m32, bnd1","bndcl r/m32, bnd1","F3 0F 1A /r","V","N.S.","MPX","","r,r","",""
+"BNDCL bnd1, r/m64","BNDCL r/m64, bnd1","bndcl r/m64, bnd1","F3 0F 1A /r","N.S.","V","MPX","","r,r","",""
+"BNDCN bnd1, r/m32","BNDCN r/m32, bnd1","bndcn r/m32, bnd1","F2 0F 1B /r","V","N.S.","MPX","","r,r","",""
+"BNDCN bnd1, r/m64","BNDCN r/m64, bnd1","bndcn r/m64, bnd1","F2 0F 1B /r","N.S.","V","MPX","","r,r","",""
+"BNDCU bnd1, r/m32","BNDCU r/m32, bnd1","bndcu r/m32, bnd1","F2 0F 1A /r","V","N.S.","MPX","","r,r","",""
+"BNDCU bnd1, r/m64","BNDCU r/m64, bnd1","bndcu r/m64, bnd1","F2 0F 1A /r","N.S.","V","MPX","","r,r","",""
+"BNDLDX bnd1, mib","BNDLDX mib, bnd1","bndldx mib, bnd1","0F 1A /r","V","V","MPX","modrm_memonly","w,r","",""
+"BNDMK bnd1, m32","BNDMK m32, bnd1","bndmk m32, bnd1","F3 0F 1B /r","V","N.S.","MPX","modrm_memonly","w,r","",""
+"BNDMK bnd1, m64","BNDMK m64, bnd1","bndmk m64, bnd1","F3 0F 1B /r","N.S.","V","MPX","modrm_memonly","w,r","",""
+"BNDMOV bnd2/m128, bnd1","BNDMOV bnd1, bnd2/m128","bndmov bnd1, bnd2/m128","66 0F 1B /r","N.S.","V","MPX","","w,r","",""
+"BNDMOV bnd2/m64, bnd1","BNDMOV bnd1, bnd2/m64","bndmov bnd1, bnd2/m64","66 0F 1B /r","V","N.S.","MPX","","w,r","",""
+"BNDMOV bnd1, bnd2/m128","BNDMOV bnd2/m128, bnd1","bndmov bnd2/m128, bnd1","66 0F 1A /r","N.S.","V","MPX","","w,r","",""
+"BNDMOV bnd1, bnd2/m64","BNDMOV bnd2/m64, bnd1","bndmov bnd2/m64, bnd1","66 0F 1A /r","V","N.S.","MPX","","w,r","",""
+"BNDSTX mib, bnd1","BNDSTX bnd1, mib","bndstx bnd1, mib","0F 1B /r","V","V","MPX","modrm_memonly","w,r","",""
+"BOUND r32, m32&32","BOUNDL m32&32, r32","boundl r32, m32&32","62 /r","V","N.S.","","modrm_memonly,operand32","r,r","Y","32"
+"BOUND r16, m16&16","BOUNDW m16&16, r16","boundw r16, m16&16","62 /r","V","N.S.","","modrm_memonly,operand16","r,r","Y","16"
+"BSF r32, r/m32","BSFL r/m32, r32","bsfl r/m32, r32","0F BC /r","V","V","","operand32","rw,r","Y","32"
+"BSF r32, r/m32","BSFL r/m32, r32","bsfl r/m32, r32","F3 0F BC /r","V","V","","operand32","rw,r","Y","32"
+"BSF r64, r/m64","BSFQ r/m64, r64","bsfq r/m64, r64","F3 REX.W 0F BC /r","N.S.","V","","","rw,r","Y","64"
+"BSF r64, r/m64","BSFQ r/m64, r64","bsfq r/m64, r64","REX.W 0F BC /r","N.S.","V","","","rw,r","Y","64"
+"BSF r16, r/m16","BSFW r/m16, r16","bsfw r/m16, r16","0F BC /r","V","V","","operand16","rw,r","Y","16"
+"BSF r16, r/m16","BSFW r/m16, r16","bsfw r/m16, r16","F3 0F BC /r","V","V","","operand16","rw,r","Y","16"
+"BSR r32, r/m32","BSRL r/m32, r32","bsrl r/m32, r32","0F BD /r","V","V","","operand32","rw,r","Y","32"
+"BSR r32, r/m32","BSRL r/m32, r32","bsrl r/m32, r32","F3 0F BD /r","V","V","","operand32","rw,r","Y","32"
+"BSR r64, r/m64","BSRQ r/m64, r64","bsrq r/m64, r64","F3 REX.W 0F BD /r","N.S.","V","","","rw,r","Y","64"
+"BSR r64, r/m64","BSRQ r/m64, r64","bsrq r/m64, r64","REX.W 0F BD /r","N.S.","V","","","rw,r","Y","64"
+"BSR r16, r/m16","BSRW r/m16, r16","bsrw r/m16, r16","0F BD /r","V","V","","operand16","rw,r","Y","16"
+"BSR r16, r/m16","BSRW r/m16, r16","bsrw r/m16, r16","F3 0F BD /r","V","V","","operand16","rw,r","Y","16"
+"BSWAP r32op","BSWAPL r32op","bswap r32op","0F C8+rd","V","V","486","operand32","rw","Y","32"
+"BSWAP r64op","BSWAPQ r64op","bswap r64op","REX.W 0F C8+ro","N.S.","V","486","","rw","Y","64"
+"BSWAP r16op","BSWAPW r16op","bswap r16op","0F C8+rw","V","V","486","operand16","rw","Y","16"
+"BTC r/m32, imm8u","BTCL imm8u, r/m32","btcl imm8u, r/m32","0F BA /7 ib","V","V","","operand32","rw,r","Y","32"
+"BTC r/m32, r32","BTCL r32, r/m32","btcl r32, r/m32","0F BB /r","V","V","","operand32","rw,r","Y","32"
+"BTC r/m64, imm8u","BTCQ imm8u, r/m64","btcq imm8u, r/m64","REX.W 0F BA /7 ib","N.S.","V","","","rw,r","Y","64"
+"BTC r/m64, r64","BTCQ r64, r/m64","btcq r64, r/m64","REX.W 0F BB /r","N.S.","V","","","rw,r","Y","64"
+"BTC r/m16, imm8u","BTCW imm8u, r/m16","btcw imm8u, r/m16","0F BA /7 ib","V","V","","operand16","rw,r","Y","16"
+"BTC r/m16, r16","BTCW r16, r/m16","btcw r16, r/m16","0F BB /r","V","V","","operand16","rw,r","Y","16"
+"BT r/m32, imm8u","BTL imm8u, r/m32","btl imm8u, r/m32","0F BA /4 ib","V","V","","operand32","r,r","Y","32"
+"BT r/m32, r32","BTL r32, r/m32","btl r32, r/m32","0F A3 /r","V","V","","operand32","r,r","Y","32"
+"BT r/m64, imm8u","BTQ imm8u, r/m64","btq imm8u, r/m64","REX.W 0F BA /4 ib","N.S.","V","","","r,r","Y","64"
+"BT r/m64, r64","BTQ r64, r/m64","btq r64, r/m64","REX.W 0F A3 /r","N.S.","V","","","r,r","Y","64"
+"BTR r/m32, imm8u","BTRL imm8u, r/m32","btrl imm8u, r/m32","0F BA /6 ib","V","V","","operand32","rw,r","Y","32"
+"BTR r/m32, r32","BTRL r32, r/m32","btrl r32, r/m32","0F B3 /r","V","V","","operand32","rw,r","Y","32"
+"BTR r/m64, imm8u","BTRQ imm8u, r/m64","btrq imm8u, r/m64","REX.W 0F BA /6 ib","N.S.","V","","","rw,r","Y","64"
+"BTR r/m64, r64","BTRQ r64, r/m64","btrq r64, r/m64","REX.W 0F B3 /r","N.S.","V","","","rw,r","Y","64"
+"BTR r/m16, imm8u","BTRW imm8u, r/m16","btrw imm8u, r/m16","0F BA /6 ib","V","V","","operand16","rw,r","Y","16"
+"BTR r/m16, r16","BTRW r16, r/m16","btrw r16, r/m16","0F B3 /r","V","V","","operand16","rw,r","Y","16"
+"BTS r/m32, imm8u","BTSL imm8u, r/m32","btsl imm8u, r/m32","0F BA /5 ib","V","V","","operand32","rw,r","Y","32"
+"BTS r/m32, r32","BTSL r32, r/m32","btsl r32, r/m32","0F AB /r","V","V","","operand32","rw,r","Y","32"
+"BTS r/m64, imm8u","BTSQ imm8u, r/m64","btsq imm8u, r/m64","REX.W 0F BA /5 ib","N.S.","V","","","rw,r","Y","64"
+"BTS r/m64, r64","BTSQ r64, r/m64","btsq r64, r/m64","REX.W 0F AB /r","N.S.","V","","","rw,r","Y","64"
+"BTS r/m16, imm8u","BTSW imm8u, r/m16","btsw imm8u, r/m16","0F BA /5 ib","V","V","","operand16","rw,r","Y","16"
+"BTS r/m16, r16","BTSW r16, r/m16","btsw r16, r/m16","0F AB /r","V","V","","operand16","rw,r","Y","16"
+"BT r/m16, imm8u","BTW imm8u, r/m16","btw imm8u, r/m16","0F BA /4 ib","V","V","","operand16","r,r","Y","16"
+"BT r/m16, r16","BTW r16, r/m16","btw r16, r/m16","0F A3 /r","V","V","","operand16","r,r","Y","16"
+"BZHI r32, r/m32, r32V","BZHIL r32V, r/m32, r32","bzhil r32V, r/m32, r32","VEX.NDS.128.0F38.W0 F5 /r","V","V","BMI2","","w,r,r","Y","32"
+"BZHI r64, r/m64, r64V","BZHIQ r64V, r/m64, r64","bzhiq r64V, r/m64, r64","VEX.NDS.128.0F38.W1 F5 /r","N.S.","V","BMI2","","w,r,r","Y","64"
+"CALL rel16","CALL rel16","call rel16","E8 cw","V","N.S.","","operand16","r","Y",""
+"CALL rel32","CALL rel32","call rel32","E8 cd","V","N.S.","","operand32","r","Y",""
+"CALL rel32","CALL rel32","call rel32","E8 cd","N.S.","V","","default64","r","Y",""
+"CALL r/m32","CALLL* r/m32","calll* r/m32","FF /2","V","N.S.","","operand32","r","Y","32"
+"CALL r/m64","CALLQ* r/m64","callq* r/m64","FF /2","N.S.","V","","default64","r","Y","64"
+"CALL r/m16","CALLW* r/m16","callw* r/m16","FF /2","V","N.S.","","operand16","r","Y","16"
+"CBW","CBW","cbtw","98","V","V","","operand16","","",""
+"CDQ","CDQ","cltd","99","V","V","","operand32","","",""
+"CDQE","CDQE","cltq","REX.W 98","N.S.","V","","","","",""
+"CLAC","CLAC","clac","0F 01 CA","V","V","","","","",""
+"CLC","CLC","clc","F8","V","V","","","","",""
+"CLD","CLD","cld","FC","V","V","","","","",""
+"CLFLUSH m8","CLFLUSH m8","clflush m8","0F AE /7","V","V","","modrm_memonly","r","",""
+"CLFLUSHOPT m8","CLFLUSHOPT m8","clflushopt m8","66 0F AE /7","V","V","","modrm_memonly","r","",""
+"CLGI","CLGI","clgi","0F 01 DD","V","V","SVM","amd","","",""
+"CLI","CLI","cli","FA","V","V","","","","",""
+"CLRSSBSY m64","CLRSSBSY m64","clrssbsy m64","F3 0F AE /6","V","V","CET","modrm_memonly","w","",""
+"CLTS","CLTS","clts","0F 06","V","V","","","","",""
+"CLWB m8","CLWB m8","clwb m8","66 0F AE /6","V","V","CLWB","modrm_memonly","r","",""
+"CLZERO EAX","CLZEROL EAX","clzerol EAX","0F 01 FC","V","V","CLZERO","amd,modrm_regonly,operand32","r","Y","32"
+"CLZERO RAX","CLZEROQ RAX","clzeroq RAX","REX.W 0F 01 FC","N.S.","V","CLZERO","amd,modrm_regonly","r","Y","64"
+"CLZERO AX","CLZEROW AX","clzerow AX","0F 01 FC","V","V","CLZERO","amd,modrm_regonly,operand16","r","Y","16"
+"CMC","CMC","cmc","F5","V","V","","","","",""
+"CMOVC r16, r/m16","CMOVC r/m16, r16","cmovc r/m16, r16","0F 42 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVC r32, r/m32","CMOVC r/m32, r32","cmovc r/m32, r32","0F 42 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVC r64, r/m64","CMOVC r/m64, r64","cmovc r/m64, r64","REX.W 0F 42 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVAE r32, r/m32","CMOVLCC r/m32, r32","cmovael r/m32, r32","0F 43 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVB r32, r/m32","CMOVLCS r/m32, r32","cmovbl r/m32, r32","0F 42 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVE r32, r/m32","CMOVLEQ r/m32, r32","cmovel r/m32, r32","0F 44 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVGE r32, r/m32","CMOVLGE r/m32, r32","cmovgel r/m32, r32","0F 4D /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVG r32, r/m32","CMOVLGT r/m32, r32","cmovgl r/m32, r32","0F 4F /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVA r32, r/m32","CMOVLHI r/m32, r32","cmoval r/m32, r32","0F 47 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVLE r32, r/m32","CMOVLLE r/m32, r32","cmovlel r/m32, r32","0F 4E /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVBE r32, r/m32","CMOVLLS r/m32, r32","cmovbel r/m32, r32","0F 46 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVL r32, r/m32","CMOVLLT r/m32, r32","cmovll r/m32, r32","0F 4C /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVS r32, r/m32","CMOVLMI r/m32, r32","cmovsl r/m32, r32","0F 48 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVNE r32, r/m32","CMOVLNE r/m32, r32","cmovnel r/m32, r32","0F 45 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVNO r32, r/m32","CMOVLOC r/m32, r32","cmovnol r/m32, r32","0F 41 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVO r32, r/m32","CMOVLOS r/m32, r32","cmovol r/m32, r32","0F 40 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVNP r32, r/m32","CMOVLPC r/m32, r32","cmovnpl r/m32, r32","0F 4B /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVNS r32, r/m32","CMOVLPL r/m32, r32","cmovnsl r/m32, r32","0F 49 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVP r32, r/m32","CMOVLPS r/m32, r32","cmovpl r/m32, r32","0F 4A /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVNA r16, r/m16","CMOVNA r/m16, r16","cmovna r/m16, r16","0F 46 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNA r32, r/m32","CMOVNA r/m32, r32","cmovna r/m32, r32","0F 46 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNA r64, r/m64","CMOVNA r/m64, r64","cmovna r/m64, r64","REX.W 0F 46 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNAE r16, r/m16","CMOVNAE r/m16, r16","cmovnae r/m16, r16","0F 42 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNAE r32, r/m32","CMOVNAE r/m32, r32","cmovnae r/m32, r32","0F 42 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNAE r64, r/m64","CMOVNAE r/m64, r64","cmovnae r/m64, r64","REX.W 0F 42 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNB r16, r/m16","CMOVNB r/m16, r16","cmovnb r/m16, r16","0F 43 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNB r32, r/m32","CMOVNB r/m32, r32","cmovnb r/m32, r32","0F 43 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNB r64, r/m64","CMOVNB r/m64, r64","cmovnb r/m64, r64","REX.W 0F 43 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNBE r16, r/m16","CMOVNBE r/m16, r16","cmovnbe r/m16, r16","0F 47 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNBE r32, r/m32","CMOVNBE r/m32, r32","cmovnbe r/m32, r32","0F 47 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNBE r64, r/m64","CMOVNBE r/m64, r64","cmovnbe r/m64, r64","REX.W 0F 47 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNC r16, r/m16","CMOVNC r/m16, r16","cmovnc r/m16, r16","0F 43 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNC r32, r/m32","CMOVNC r/m32, r32","cmovnc r/m32, r32","0F 43 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNC r64, r/m64","CMOVNC r/m64, r64","cmovnc r/m64, r64","REX.W 0F 43 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNG r16, r/m16","CMOVNG r/m16, r16","cmovng r/m16, r16","0F 4E /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNG r32, r/m32","CMOVNG r/m32, r32","cmovng r/m32, r32","0F 4E /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNG r64, r/m64","CMOVNG r/m64, r64","cmovng r/m64, r64","REX.W 0F 4E /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNGE r16, r/m16","CMOVNGE r/m16, r16","cmovnge r/m16, r16","0F 4C /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNGE r32, r/m32","CMOVNGE r/m32, r32","cmovnge r/m32, r32","0F 4C /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNGE r64, r/m64","CMOVNGE r/m64, r64","cmovnge r/m64, r64","REX.W 0F 4C /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNL r16, r/m16","CMOVNL r/m16, r16","cmovnl r/m16, r16","0F 4D /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNL r32, r/m32","CMOVNL r/m32, r32","cmovnl r/m32, r32","0F 4D /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNL r64, r/m64","CMOVNL r/m64, r64","cmovnl r/m64, r64","REX.W 0F 4D /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNLE r16, r/m16","CMOVNLE r/m16, r16","cmovnle r/m16, r16","0F 4F /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNLE r32, r/m32","CMOVNLE r/m32, r32","cmovnle r/m32, r32","0F 4F /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNLE r64, r/m64","CMOVNLE r/m64, r64","cmovnle r/m64, r64","REX.W 0F 4F /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVNZ r16, r/m16","CMOVNZ r/m16, r16","cmovnz r/m16, r16","0F 45 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVNZ r32, r/m32","CMOVNZ r/m32, r32","cmovnz r/m32, r32","0F 45 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVNZ r64, r/m64","CMOVNZ r/m64, r64","cmovnz r/m64, r64","REX.W 0F 45 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVPE r16, r/m16","CMOVPE r/m16, r16","cmovpe r/m16, r16","0F 4A /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVPE r32, r/m32","CMOVPE r/m32, r32","cmovpe r/m32, r32","0F 4A /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVPE r64, r/m64","CMOVPE r/m64, r64","cmovpe r/m64, r64","REX.W 0F 4A /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVPO r16, r/m16","CMOVPO r/m16, r16","cmovpo r/m16, r16","0F 4B /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVPO r32, r/m32","CMOVPO r/m32, r32","cmovpo r/m32, r32","0F 4B /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVPO r64, r/m64","CMOVPO r/m64, r64","cmovpo r/m64, r64","REX.W 0F 4B /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVAE r64, r/m64","CMOVQCC r/m64, r64","cmovaeq r/m64, r64","REX.W 0F 43 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVB r64, r/m64","CMOVQCS r/m64, r64","cmovbq r/m64, r64","REX.W 0F 42 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVE r64, r/m64","CMOVQEQ r/m64, r64","cmoveq r/m64, r64","REX.W 0F 44 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVGE r64, r/m64","CMOVQGE r/m64, r64","cmovgeq r/m64, r64","REX.W 0F 4D /r","N.S.","V","","","rw,r","Y","64"
+"CMOVG r64, r/m64","CMOVQGT r/m64, r64","cmovgq r/m64, r64","REX.W 0F 4F /r","N.S.","V","","","rw,r","Y","64"
+"CMOVA r64, r/m64","CMOVQHI r/m64, r64","cmovaq r/m64, r64","REX.W 0F 47 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVLE r64, r/m64","CMOVQLE r/m64, r64","cmovleq r/m64, r64","REX.W 0F 4E /r","N.S.","V","","","rw,r","Y","64"
+"CMOVBE r64, r/m64","CMOVQLS r/m64, r64","cmovbeq r/m64, r64","REX.W 0F 46 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVL r64, r/m64","CMOVQLT r/m64, r64","cmovlq r/m64, r64","REX.W 0F 4C /r","N.S.","V","","","rw,r","Y","64"
+"CMOVS r64, r/m64","CMOVQMI r/m64, r64","cmovsq r/m64, r64","REX.W 0F 48 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVNE r64, r/m64","CMOVQNE r/m64, r64","cmovneq r/m64, r64","REX.W 0F 45 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVNO r64, r/m64","CMOVQOC r/m64, r64","cmovnoq r/m64, r64","REX.W 0F 41 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVO r64, r/m64","CMOVQOS r/m64, r64","cmovoq r/m64, r64","REX.W 0F 40 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVNP r64, r/m64","CMOVQPC r/m64, r64","cmovnpq r/m64, r64","REX.W 0F 4B /r","N.S.","V","","","rw,r","Y","64"
+"CMOVNS r64, r/m64","CMOVQPL r/m64, r64","cmovnsq r/m64, r64","REX.W 0F 49 /r","N.S.","V","","","rw,r","Y","64"
+"CMOVP r64, r/m64","CMOVQPS r/m64, r64","cmovpq r/m64, r64","REX.W 0F 4A /r","N.S.","V","","","rw,r","Y","64"
+"CMOVAE r16, r/m16","CMOVWCC r/m16, r16","cmovaew r/m16, r16","0F 43 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVB r16, r/m16","CMOVWCS r/m16, r16","cmovbw r/m16, r16","0F 42 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVE r16, r/m16","CMOVWEQ r/m16, r16","cmovew r/m16, r16","0F 44 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVGE r16, r/m16","CMOVWGE r/m16, r16","cmovgew r/m16, r16","0F 4D /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVG r16, r/m16","CMOVWGT r/m16, r16","cmovgw r/m16, r16","0F 4F /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVA r16, r/m16","CMOVWHI r/m16, r16","cmovaw r/m16, r16","0F 47 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVLE r16, r/m16","CMOVWLE r/m16, r16","cmovlew r/m16, r16","0F 4E /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVBE r16, r/m16","CMOVWLS r/m16, r16","cmovbew r/m16, r16","0F 46 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVL r16, r/m16","CMOVWLT r/m16, r16","cmovlw r/m16, r16","0F 4C /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVS r16, r/m16","CMOVWMI r/m16, r16","cmovsw r/m16, r16","0F 48 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVNE r16, r/m16","CMOVWNE r/m16, r16","cmovnew r/m16, r16","0F 45 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVNO r16, r/m16","CMOVWOC r/m16, r16","cmovnow r/m16, r16","0F 41 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVO r16, r/m16","CMOVWOS r/m16, r16","cmovow r/m16, r16","0F 40 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVNP r16, r/m16","CMOVWPC r/m16, r16","cmovnpw r/m16, r16","0F 4B /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVNS r16, r/m16","CMOVWPL r/m16, r16","cmovnsw r/m16, r16","0F 49 /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVP r16, r/m16","CMOVWPS r/m16, r16","cmovpw r/m16, r16","0F 4A /r","V","V","","P6,operand16","rw,r","Y","16"
+"CMOVZ r16, r/m16","CMOVZ r/m16, r16","cmovz r/m16, r16","0F 44 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVZ r32, r/m32","CMOVZ r/m32, r32","cmovz r/m32, r32","0F 44 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVZ r64, r/m64","CMOVZ r/m64, r64","cmovz r/m64, r64","REX.W 0F 44 /r","N.E.","V","","pseudo","rw,r","",""
+"CMP AL, imm8","CMPB AL, imm8","cmpb imm8, AL","3C ib","V","V","","","r,r","Y","8"
+"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","80 /7 ib","V","V","","","r,r","Y","8"
+"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","82 /7 ib","V","N.S.","","","r,r","Y","8"
+"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","REX 80 /7 ib","N.E.","V","","pseudo64","r,r","Y","8"
+"CMP r/m8, r8","CMPB r/m8, r8","cmpb r8, r/m8","38 /r","V","V","","","r,r","Y","8"
+"CMP r/m8, r8","CMPB r/m8, r8","cmpb r8, r/m8","REX 38 /r","N.E.","V","","pseudo64","r,r","Y","8"
+"CMP r8, r/m8","CMPB r8, r/m8","cmpb r/m8, r8","3A /r","V","V","","","r,r","Y","8"
+"CMP r8, r/m8","CMPB r8, r/m8","cmpb r/m8, r8","REX 3A /r","N.E.","V","","pseudo64","r,r","Y","8"
+"CMP EAX, imm32","CMPL EAX, imm32","cmpl imm32, EAX","3D id","V","V","","operand32","r,r","Y","32"
+"CMP r/m32, imm32","CMPL r/m32, imm32","cmpl imm32, r/m32","81 /7 id","V","V","","operand32","r,r","Y","32"
+"CMP r/m32, imm8","CMPL r/m32, imm8","cmpl imm8, r/m32","83 /7 ib","V","V","","operand32","r,r","Y","32"
+"CMP r/m32, r32","CMPL r/m32, r32","cmpl r32, r/m32","39 /r","V","V","","operand32","r,r","Y","32"
+"CMP r32, r/m32","CMPL r32, r/m32","cmpl r/m32, r32","3B /r","V","V","","operand32","r,r","Y","32"
+"CMPPD xmm1, xmm2/m128, imm8u","CMPPD imm8u, xmm1, xmm2/m128","cmppd imm8u, xmm2/m128, xmm1","66 0F C2 /r ib","V","V","SSE2","","rw,r,r","",""
+"CMPPS xmm1, xmm2/m128, imm8u","CMPPS imm8u, xmm1, xmm2/m128","cmpps imm8u, xmm2/m128, xmm1","0F C2 /r ib","V","V","SSE","","rw,r,r","",""
+"CMP RAX, imm32","CMPQ RAX, imm32","cmpq imm32, RAX","REX.W 3D id","N.S.","V","","","r,r","Y","64"
+"CMP r/m64, imm32","CMPQ r/m64, imm32","cmpq imm32, r/m64","REX.W 81 /7 id","N.S.","V","","","r,r","Y","64"
+"CMP r/m64, imm8","CMPQ r/m64, imm8","cmpq imm8, r/m64","REX.W 83 /7 ib","N.S.","V","","","r,r","Y","64"
+"CMP r/m64, r64","CMPQ r/m64, r64","cmpq r64, r/m64","REX.W 39 /r","N.S.","V","","","r,r","Y","64"
+"CMP r64, r/m64","CMPQ r64, r/m64","cmpq r/m64, r64","REX.W 3B /r","N.S.","V","","","r,r","Y","64"
+"CMPSB","CMPSB","cmpsb","A6","V","V","","","","",""
+"CMPSD xmm1, xmm2/m64, imm8u","CMPSD imm8u, xmm1, xmm2/m64","cmpsd imm8u, xmm2/m64, xmm1","F2 0F C2 /r ib","V","V","SSE2","","rw,r,r","",""
+"CMPSD","CMPSL","cmpsl","A7","V","V","","operand32","","",""
+"CMPSQ","CMPSQ","cmpsq","REX.W A7","N.S.","V","","","","",""
+"CMPSS xmm1, xmm2/m32, imm8u","CMPSS imm8u, xmm1, xmm2/m32","cmpss imm8u, xmm2/m32, xmm1","F3 0F C2 /r ib","V","V","SSE","","rw,r,r","",""
+"CMPSW","CMPSW","cmpsw","A7","V","V","","operand16","","",""
+"CMP AX, imm16","CMPW AX, imm16","cmpw imm16, AX","3D iw","V","V","","operand16","r,r","Y","16"
+"CMP r/m16, imm16","CMPW r/m16, imm16","cmpw imm16, r/m16","81 /7 iw","V","V","","operand16","r,r","Y","16"
+"CMP r/m16, imm8","CMPW r/m16, imm8","cmpw imm8, r/m16","83 /7 ib","V","V","","operand16","r,r","Y","16"
+"CMP r/m16, r16","CMPW r/m16, r16","cmpw r16, r/m16","39 /r","V","V","","operand16","r,r","Y","16"
+"CMP r16, r/m16","CMPW r16, r/m16","cmpw r/m16, r16","3B /r","V","V","","operand16","r,r","Y","16"
+"CMPXCHG16B m128","CMPXCHG16B m128","cmpxchg16b m128","REX.W 0F C7 /1","N.S.","V","","modrm_memonly","rw","",""
+"CMPXCHG8B m64","CMPXCHG8B m64","cmpxchg8b m64","0F C7 /1","V","V","Pentium","modrm_memonly,operand16,operand32","rw","",""
+"CMPXCHG r/m8, r8","CMPXCHGB r8, r/m8","cmpxchgb r8, r/m8","0F B0 /r","V","V","486","","rw,r","Y","8"
+"CMPXCHG r/m8, r8","CMPXCHGB r8, r/m8","cmpxchgb r8, r/m8","REX 0F B0 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"CMPXCHG r/m32, r32","CMPXCHGL r32, r/m32","cmpxchgl r32, r/m32","0F B1 /r","V","V","486","operand32","rw,r","Y","32"
+"CMPXCHG r/m64, r64","CMPXCHGQ r64, r/m64","cmpxchgq r64, r/m64","REX.W 0F B1 /r","N.S.","V","486","","rw,r","Y","64"
+"CMPXCHG r/m16, r16","CMPXCHGW r16, r/m16","cmpxchgw r16, r/m16","0F B1 /r","V","V","486","operand16","rw,r","Y","16"
+"COMISD xmm1, xmm2/m64","COMISD xmm2/m64, xmm1","comisd xmm2/m64, xmm1","66 0F 2F /r","V","V","SSE2","","r,r","",""
+"COMISS xmm1, xmm2/m32","COMISS xmm2/m32, xmm1","comiss xmm2/m32, xmm1","0F 2F /r","V","V","SSE","","r,r","",""
+"CPUID","CPUID","cpuid","0F A2","V","V","486","","","",""
+"CQO","CQO","cqto","REX.W 99","N.S.","V","","","","",""
+"CRC32 r32, r/m8","CRC32B r/m8, r32","crc32b r/m8, r32","F2 0F 38 F0 /r","V","V","SSE4_2","operand16,operand32","rw,r","Y","8"
+"CRC32 r32, r/m8","CRC32B r/m8, r32","crc32b r/m8, r32","F2 REX 0F 38 F0 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"CRC32 r64, r/m8","CRC32B r/m8, r64","crc32b r/m8, r64","F2 REX.W 0F 38 F0 /r","N.S.","V","SSE4_2","","rw,r","Y","8"
+"CRC32 r32, r/m32","CRC32L r/m32, r32","crc32l r/m32, r32","F2 0F 38 F1 /r","V","V","SSE4_2","operand32","rw,r","Y","32"
+"CRC32 r64, r/m64","CRC32Q r/m64, r64","crc32q r/m64, r64","F2 REX.W 0F 38 F1 /r","N.S.","V","SSE4_2","","rw,r","Y","64"
+"CRC32 r32, r/m16","CRC32W r/m16, r32","crc32w r/m16, r32","F2 0F 38 F1 /r","V","V","SSE4_2","operand16","rw,r","Y","16"
+"CVTPD2PI mm1, xmm2/m128","CVTPD2PI xmm2/m128, mm1","cvtpd2pi xmm2/m128, mm1","66 0F 2D /r","V","V","SSE2","","w,r","",""
+"CVTPD2DQ xmm1, xmm2/m128","CVTPD2PL xmm2/m128, xmm1","cvtpd2dq xmm2/m128, xmm1","F2 0F E6 /r","V","V","SSE2","","w,r","",""
+"CVTPD2PS xmm1, xmm2/m128","CVTPD2PS xmm2/m128, xmm1","cvtpd2ps xmm2/m128, xmm1","66 0F 5A /r","V","V","SSE2","","w,r","",""
+"CVTPI2PD xmm1, mm2/m64","CVTPI2PD mm2/m64, xmm1","cvtpi2pd mm2/m64, xmm1","66 0F 2A /r","V","V","SSE2","","w,r","",""
+"CVTPI2PS xmm1, mm2/m64","CVTPI2PS mm2/m64, xmm1","cvtpi2ps mm2/m64, xmm1","0F 2A /r","V","V","SSE","","w,r","",""
+"CVTDQ2PD xmm1, xmm2/m64","CVTPL2PD xmm2/m64, xmm1","cvtdq2pd xmm2/m64, xmm1","F3 0F E6 /r","V","V","SSE2","","w,r","",""
+"CVTDQ2PS xmm1, xmm2/m128","CVTPL2PS xmm2/m128, xmm1","cvtdq2ps xmm2/m128, xmm1","0F 5B /r","V","V","SSE2","","w,r","",""
+"CVTPS2PD xmm1, xmm2/m64","CVTPS2PD xmm2/m64, xmm1","cvtps2pd xmm2/m64, xmm1","0F 5A /r","V","V","SSE2","","w,r","",""
+"CVTPS2PI mm1, xmm2/m64","CVTPS2PI xmm2/m64, mm1","cvtps2pi xmm2/m64, mm1","0F 2D /r","V","V","SSE","","w,r","",""
+"CVTPS2DQ xmm1, xmm2/m128","CVTPS2PL xmm2/m128, xmm1","cvtps2dq xmm2/m128, xmm1","66 0F 5B /r","V","V","SSE2","","w,r","",""
+"CVTSD2SI r32, xmm2/m64","CVTSD2SL xmm2/m64, r32","cvtsd2si xmm2/m64, r32","F2 0F 2D /r","V","V","SSE2","operand16,operand32","w,r","Y","32"
+"CVTSD2SI r64, xmm2/m64","CVTSD2SL xmm2/m64, r64","cvtsd2siq xmm2/m64, r64","F2 REX.W 0F 2D /r","N.S.","V","SSE2","","w,r","Y","64"
+"CVTSD2SS xmm1, xmm2/m64","CVTSD2SS xmm2/m64, xmm1","cvtsd2ss xmm2/m64, xmm1","F2 0F 5A /r","V","V","SSE2","","w,r","",""
+"CVTSI2SD xmm1, r/m32","CVTSL2SD r/m32, xmm1","cvtsi2sdl r/m32, xmm1","F2 0F 2A /r","V","V","SSE2","operand16,operand32","w,r","Y","32"
+"CVTSI2SS xmm1, r/m32","CVTSL2SS r/m32, xmm1","cvtsi2ssl r/m32, xmm1","F3 0F 2A /r","V","V","SSE","operand16,operand32","w,r","Y","32"
+"CVTSI2SD xmm1, r/m64","CVTSQ2SD r/m64, xmm1","cvtsi2sdq r/m64, xmm1","F2 REX.W 0F 2A /r","N.S.","V","SSE2","","w,r","Y","64"
+"CVTSI2SS xmm1, r/m64","CVTSQ2SS r/m64, xmm1","cvtsi2ssq r/m64, xmm1","F3 REX.W 0F 2A /r","N.S.","V","SSE","","w,r","Y","64"
+"CVTSS2SD xmm1, xmm2/m32","CVTSS2SD xmm2/m32, xmm1","cvtss2sd xmm2/m32, xmm1","F3 0F 5A /r","V","V","SSE2","","w,r","",""
+"CVTSS2SI r32, xmm2/m32","CVTSS2SL xmm2/m32, r32","cvtss2si xmm2/m32, r32","F3 0F 2D /r","V","V","SSE","operand16,operand32","w,r","Y","32"
+"CVTSS2SI r64, xmm2/m32","CVTSS2SL xmm2/m32, r64","cvtss2siq xmm2/m32, r64","F3 REX.W 0F 2D /r","N.S.","V","SSE","","w,r","Y","64"
+"CVTTPD2PI mm1, xmm2/m128","CVTTPD2PI xmm2/m128, mm1","cvttpd2pi xmm2/m128, mm1","66 0F 2C /r","V","V","SSE2","","w,r","",""
+"CVTTPD2DQ xmm1, xmm2/m128","CVTTPD2PL xmm2/m128, xmm1","cvttpd2dq xmm2/m128, xmm1","66 0F E6 /r","V","V","SSE2","","w,r","",""
+"CVTTPS2PI mm1, xmm2/m64","CVTTPS2PI xmm2/m64, mm1","cvttps2pi xmm2/m64, mm1","0F 2C /r","V","V","SSE","","w,r","",""
+"CVTTPS2DQ xmm1, xmm2/m128","CVTTPS2PL xmm2/m128, xmm1","cvttps2dq xmm2/m128, xmm1","F3 0F 5B /r","V","V","SSE2","","w,r","",""
+"CVTTSD2SI r32, xmm2/m64","CVTTSD2SL xmm2/m64, r32","cvttsd2si xmm2/m64, r32","F2 0F 2C /r","V","V","SSE2","operand16,operand32","w,r","Y","32"
+"CVTTSD2SI r64, xmm2/m64","CVTTSD2SL xmm2/m64, r64","cvttsd2siq xmm2/m64, r64","F2 REX.W 0F 2C /r","N.S.","V","SSE2","","w,r","Y","64"
+"CVTTSS2SI r32, xmm2/m32","CVTTSS2SL xmm2/m32, r32","cvttss2si xmm2/m32, r32","F3 0F 2C /r","V","V","SSE","operand16,operand32","w,r","Y","32"
+"CVTTSS2SI r64, xmm2/m32","CVTTSS2SL xmm2/m32, r64","cvttss2siq xmm2/m32, r64","F3 REX.W 0F 2C /r","N.S.","V","SSE","","w,r","Y","64"
+"CWD","CWD","cwtd","99","V","V","","operand16","","",""
+"CWDE","CWDE","cwtl","98","V","V","","operand32","","",""
+"DAA","DAA","daa","27","V","N.S.","","","","",""
+"DAS","DAS","das","2F","V","N.S.","","","","",""
+"DEC r/m8","DECB r/m8","decb r/m8","FE /1","V","V","","","rw","Y","8"
+"DEC r/m8","DECB r/m8","decb r/m8","REX FE /1","N.E.","V","","pseudo64","rw","Y","8"
+"DEC r/m32","DECL r/m32","decl r/m32","FF /1","V","V","","operand32","rw","Y","32"
+"DEC r32op","DECL r32op","decl r32op","48+rd","V","N.S.","","operand32","rw","Y","32"
+"DEC r/m64","DECQ r/m64","decq r/m64","REX.W FF /1","N.S.","V","","","rw","Y","64"
+"DEC r/m16","DECW r/m16","decw r/m16","FF /1","V","V","","operand16","rw","Y","16"
+"DEC r16op","DECW r16op","decw r16op","48+rw","V","N.S.","","operand16","rw","Y","16"
+"DIV r/m8","DIVB r/m8","divb r/m8","F6 /6","V","V","","","r","Y","8"
+"DIV r/m8","DIVB r/m8","divb r/m8","REX F6 /6","N.E.","V","","pseudo64","r","Y","8"
+"DIV r/m32","DIVL r/m32","divl r/m32","F7 /6","V","V","","operand32","r","Y","32"
+"DIVPD xmm1, xmm2/m128","DIVPD xmm2/m128, xmm1","divpd xmm2/m128, xmm1","66 0F 5E /r","V","V","SSE2","","rw,r","",""
+"DIVPS xmm1, xmm2/m128","DIVPS xmm2/m128, xmm1","divps xmm2/m128, xmm1","0F 5E /r","V","V","SSE","","rw,r","",""
+"DIV r/m64","DIVQ r/m64","divq r/m64","REX.W F7 /6","N.S.","V","","","r","Y","64"
+"DIVSD xmm1, xmm2/m64","DIVSD xmm2/m64, xmm1","divsd xmm2/m64, xmm1","F2 0F 5E /r","V","V","SSE2","","rw,r","",""
+"DIVSS xmm1, xmm2/m32","DIVSS xmm2/m32, xmm1","divss xmm2/m32, xmm1","F3 0F 5E /r","V","V","SSE","","rw,r","",""
+"DIV r/m16","DIVW r/m16","divw r/m16","F7 /6","V","V","","operand16","r","Y","16"
+"DPPD xmm1, xmm2/m128, imm8u","DPPD imm8u, xmm2/m128, xmm1","dppd imm8u, xmm2/m128, xmm1","66 0F 3A 41 /r ib","V","V","SSE4_1","","rw,r,r","",""
+"DPPS xmm1, xmm2/m128, imm8u","DPPS imm8u, xmm2/m128, xmm1","dpps imm8u, xmm2/m128, xmm1","66 0F 3A 40 /r ib","V","V","SSE4_1","","rw,r,r","",""
+"EMMS","EMMS","emms","0F 77","V","V","MMX","","","",""
+"ENCLS","ENCLS","encls","0F 01 CF","V","V","","","","",""
+"ENCLU","ENCLU","enclu","0F 01 D7","V","V","","","","",""
+"ENDBR32","ENDBR32","endbr32","F3 0F 1E FB","V","V","CET","","","",""
+"ENDBR64","ENDBR64","endbr64","F3 0F 1E FA","V","V","CET","","","Y",""
+"ENTER imm16, 0","ENTER 0, imm16","enter imm16, 0","C8 iw 00","V","V","","pseudo","r,r","",""
+"ENTER imm16, 1","ENTER 1, imm16","enter imm16, 1","C8 iw 01","V","V","","pseudo","r,r","",""
+"ENTER imm16, imm8b","ENTERW/ENTERL/ENTERQ imm8b, imm16","enterw/enterl/enterq imm16, imm8b","C8 iw ib","V","V","","","r,r","",""
+"EXTRACTPS r/m32, xmm1, imm8u:2","EXTRACTPS imm8u:2, xmm1, r/m32","extractps imm8u:2, xmm1, r/m32","66 0F 3A 17 /r ib","V","V","SSE4_1","","w,r,r","",""
+"EXTRQ xmm1, imm8u, imm8u","EXTRQ imm8u, imm8u, xmm1","extrq imm8u, imm8u, xmm1","66 0F 78 /0 ib ib","V","V","SSE4a","amd,modrm_regonly","w,r,r","",""
+"EXTRQ xmm1, xmm2","EXTRQ xmm2, xmm1","extrq xmm2, xmm1","66 0F 79 /r","V","V","SSE4a","amd,modrm_regonly","w,r","",""
+"F2XM1","F2XM1","f2xm1","D9 F0","V","V","","","","",""
+"FABS","FABS","fabs","D9 E1","V","V","","","","",""
+"FADD ST(i), ST(0)","FADDD ST(0), ST(i)","fadd ST(0), ST(i)","DC C0+i","V","V","","","rw,r","Y",""
+"FADD ST(0), ST(i)","FADDD ST(i), ST(0)","fadd ST(i), ST(0)","D8 C0+i","V","V","","","rw,r","Y",""
+"FADD ST(0), m32fp","FADDD m32fp, ST(0)","fadds m32fp, ST(0)","D8 /0","V","V","","","rw,r","Y","32"
+"FADD ST(0), m64fp","FADDD m64fp, ST(0)","faddl m64fp, ST(0)","DC /0","V","V","","","rw,r","Y","64"
+"FADDP","FADDDP","faddp","DE C1","V","V","","pseudo","","",""
+"FADDP ST(i), ST(0)","FADDDP ST(0), ST(i)","faddp ST(0), ST(i)","DE C0+i","V","V","","","rw,r","",""
+"FBLD ST(0), m80dec","FBLD m80dec, ST(0)","fbld m80dec, ST(0)","DF /4","V","V","","","w,r","",""
+"FBSTP m80dec, ST(0)","FBSTP ST(0), m80dec","fbstp ST(0), m80dec","DF /6","V","V","","","w,r","",""
+"FCHS","FCHS","fchs","D9 E0","V","V","","","","",""
+"FCLEX","FCLEX","fclex","9B DB E2","V","V","","pseudo","","",""
+"FCMOVB ST(0), ST(i)","FCMOVB ST(i), ST(0)","fcmovb ST(i), ST(0)","DA C0+i","V","V","","P6","rw,r","",""
+"FCMOVBE ST(0), ST(i)","FCMOVBE ST(i), ST(0)","fcmovbe ST(i), ST(0)","DA D0+i","V","V","","P6","rw,r","",""
+"FCMOVE ST(0), ST(i)","FCMOVE ST(i), ST(0)","fcmove ST(i), ST(0)","DA C8+i","V","V","","P6","rw,r","",""
+"FCMOVNB ST(0), ST(i)","FCMOVNB ST(i), ST(0)","fcmovnb ST(i), ST(0)","DB C0+i","V","V","","P6","rw,r","",""
+"FCMOVNBE ST(0), ST(i)","FCMOVNBE ST(i), ST(0)","fcmovnbe ST(i), ST(0)","DB D0+i","V","V","","P6","rw,r","",""
+"FCMOVNE ST(0), ST(i)","FCMOVNE ST(i), ST(0)","fcmovne ST(i), ST(0)","DB C8+i","V","V","","P6","rw,r","",""
+"FCMOVNU ST(0), ST(i)","FCMOVNU ST(i), ST(0)","fcmovnu ST(i), ST(0)","DB D8+i","V","V","","P6","rw,r","",""
+"FCMOVU ST(0), ST(i)","FCMOVU ST(i), ST(0)","fcmovu ST(i), ST(0)","DA D8+i","V","V","","P6","rw,r","",""
+"FCOM","FCOMD","fcom","D8 D1","V","V","","pseudo","","Y",""
+"FCOM ST(0), ST(i)","FCOMD ST(i), ST(0)","fcom ST(i), ST(0)","D8 D0+i","V","V","","","r,r","Y",""
+"FCOM ST(0), ST(i)","FCOMD ST(i), ST(0)","fcom ST(i), ST(0)","DC D0+i","V","V","","","r,r","Y",""
+"FCOM ST(0), m32fp","FCOMD m32fp, ST(0)","fcoms m32fp, ST(0)","D8 /2","V","V","","","r,r","Y","32"
+"FCOM ST(0), m64fp","FCOMD m64fp, ST(0)","fcoml m64fp, ST(0)","DC /2","V","V","","","r,r","Y","64"
+"FCOMP ST(0), m32fp","FCOMFP m32fp, ST(0)","fcomps m32fp, ST(0)","D8 /3","V","V","","","r,r","Y","32"
+"FCOMI ST(0), ST(i)","FCOMI ST(i), ST(0)","fcomi ST(i), ST(0)","DB F0+i","V","V","PPRO","P6","r,r","",""
+"FCOMIP ST(0), ST(i)","FCOMIP ST(i), ST(0)","fcomip ST(i), ST(0)","DF F0+i","V","V","PPRO","P6","r,r","",""
+"FCOMP","FCOMP","fcomp","D8 D9","V","V","","pseudo","","Y",""
+"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","D8 D8+i","V","V","","","r,r","Y",""
+"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","DC D8+i","V","V","","","r,r","Y",""
+"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","DE D0+i","V","V","","","r,r","Y",""
+"FCOMP ST(0), m64fp","FCOMPL m64fp, ST(0)","fcompl m64fp, ST(0)","DC /3","V","V","","","r,r","Y","64"
+"FCOMPP","FCOMPP","fcompp","DE D9","V","V","","","","",""
+"FCOS","FCOS","fcos","D9 FF","V","V","","","","",""
+"FDECSTP","FDECSTP","fdecstp","D9 F6","V","V","","","","",""
+"FDISI8087_NOP","FDISI8087_NOP","fdisi8087_nop","DB E1","V","V","","","","",""
+"FDIVR ST(i), ST(0)","FDIVD ST(0), ST(i)","fdiv ST(0), ST(i)","DC F0+i","V","V","","","rw,r","Y",""
+"FDIV ST(i), ST(0)","FDIVD ST(0), ST(i)","fdivr ST(0), ST(i)","DC F8+i","V","V","","","rw,r","Y",""
+"FDIV ST(0), ST(i)","FDIVD ST(i), ST(0)","fdiv ST(i), ST(0)","D8 F0+i","V","V","","","rw,r","Y",""
+"FDIV ST(0), m32fp","FDIVD m32fp, ST(0)","fdivs m32fp, ST(0)","D8 /6","V","V","","","rw,r","Y","32"
+"FDIV ST(0), m64fp","FDIVD m64fp, ST(0)","fdivl m64fp, ST(0)","DC /6","V","V","","","rw,r","Y","64"
+"FDIVR ST(0), m32fp","FDIVFR m32fp, ST(0)","fdivrs m32fp, ST(0)","D8 /7","V","V","","","rw,r","Y","32"
+"FDIVP","FDIVP","fdivp","DE F9","V","V","","pseudo","","",""
+"FDIVRP ST(i), ST(0)","FDIVP ST(0), ST(i)","fdivp ST(0), ST(i)","DE F0+i","V","V","","","rw,r","",""
+"FDIVR ST(0), ST(i)","FDIVR ST(i), ST(0)","fdivr ST(i), ST(0)","D8 F8+i","V","V","","","rw,r","Y",""
+"FDIVR ST(0), m64fp","FDIVRL m64fp, ST(0)","fdivrl m64fp, ST(0)","DC /7","V","V","","","rw,r","Y","64"
+"FDIVRP","FDIVRP","fdivrp","DE F1","V","V","","pseudo","","",""
+"FDIVP ST(i), ST(0)","FDIVRP ST(0), ST(i)","fdivrp ST(0), ST(i)","DE F8+i","V","V","","","rw,r","",""
+"FEMMS","FEMMS","femms","0F 0E","V","V","3DNOW","amd","","",""
+"FENI8087_NOP","FENI8087_NOP","feni8087_nop","DB E0","V","V","","","","",""
+"FFREE ST(i)","FFREE ST(i)","ffree ST(i)","DD C0+i","V","V","","","r","",""
+"FFREEP ST(i)","FFREEP ST(i)","ffreep ST(i)","DF C0+i","V","V","","","r","",""
+"FIADD ST(0), m16int","FIADD m16int, ST(0)","fiadd m16int, ST(0)","DE /0","V","V","","","rw,r","Y",""
+"FIADD ST(0), m32int","FIADDL m32int, ST(0)","fiaddl m32int, ST(0)","DA /0","V","V","","","rw,r","Y","32"
+"FICOM ST(0), m16int","FICOM m16int, ST(0)","ficom m16int, ST(0)","DE /2","V","V","","","r,r","Y",""
+"FICOM ST(0), m32int","FICOML m32int, ST(0)","ficoml m32int, ST(0)","DA /2","V","V","","","r,r","Y","32"
+"FICOMP ST(0), m16int","FICOMP m16int, ST(0)","ficomp m16int, ST(0)","DE /3","V","V","","","r,r","Y",""
+"FICOMP ST(0), m32int","FICOMPL m32int, ST(0)","ficompl m32int, ST(0)","DA /3","V","V","","","r,r","Y","32"
+"FIDIV ST(0), m16int","FIDIV m16int, ST(0)","fidiv m16int, ST(0)","DE /6","V","V","","","rw,r","Y",""
+"FIDIV ST(0), m32int","FIDIVL m32int, ST(0)","fidivl m32int, ST(0)","DA /6","V","V","","","rw,r","Y","32"
+"FIDIVR ST(0), m16int","FIDIVR m16int, ST(0)","fidivr m16int, ST(0)","DE /7","V","V","","","rw,r","Y",""
+"FIDIVR ST(0), m32int","FIDIVRL m32int, ST(0)","fidivrl m32int, ST(0)","DA /7","V","V","","","rw,r","Y","32"
+"FILD ST(0), m16int","FILD m16int, ST(0)","fild m16int, ST(0)","DF /0","V","V","","","w,r","Y",""
+"FILD ST(0), m32int","FILDL m32int, ST(0)","fildl m32int, ST(0)","DB /0","V","V","","","w,r","Y","32"
+"FILD ST(0), m64int","FILDLL m64int, ST(0)","fildll m64int, ST(0)","DF /5","V","V","","","w,r","Y","64"
+"FIMUL ST(0), m16int","FIMUL m16int, ST(0)","fimul m16int, ST(0)","DE /1","V","V","","","rw,r","Y",""
+"FIMUL ST(0), m32int","FIMULL m32int, ST(0)","fimull m32int, ST(0)","DA /1","V","V","","","rw,r","Y","32"
+"FINCSTP","FINCSTP","fincstp","D9 F7","V","V","","","","",""
+"FINIT","FINIT","finit","9B DB E3","V","V","","pseudo","","",""
+"FIST m16int, ST(0)","FIST ST(0), m16int","fist ST(0), m16int","DF /2","V","V","","","w,r","Y",""
+"FIST m32int, ST(0)","FISTL ST(0), m32int","fistl ST(0), m32int","DB /2","V","V","","","w,r","Y","32"
+"FISTP m16int, ST(0)","FISTP ST(0), m16int","fistp ST(0), m16int","DF /3","V","V","","","w,r","Y",""
+"FISTP m32int, ST(0)","FISTPL ST(0), m32int","fistpl ST(0), m32int","DB /3","V","V","","","w,r","Y","32"
+"FISTP m64int, ST(0)","FISTPLL ST(0), m64int","fistpll ST(0), m64int","DF /7","V","V","","","w,r","Y","64"
+"FISTTP m16int, ST(0)","FISTTP ST(0), m16int","fisttp ST(0), m16int","DF /1","V","V","SSE3","modrm_memonly","w,r","Y",""
+"FISTTP m32int, ST(0)","FISTTPL ST(0), m32int","fisttpl ST(0), m32int","DB /1","V","V","SSE3","modrm_memonly","w,r","Y","32"
+"FISTTP m64int, ST(0)","FISTTPLL ST(0), m64int","fisttpll ST(0), m64int","DD /1","V","V","SSE3","modrm_memonly","w,r","Y","64"
+"FISUB ST(0), m16int","FISUB m16int, ST(0)","fisub m16int, ST(0)","DE /4","V","V","","","rw,r","Y",""
+"FISUB ST(0), m32int","FISUBL m32int, ST(0)","fisubl m32int, ST(0)","DA /4","V","V","","","rw,r","Y","32"
+"FISUBR ST(0), m16int","FISUBR m16int, ST(0)","fisubr m16int, ST(0)","DE /5","V","V","","","rw,r","Y",""
+"FISUBR ST(0), m32int","FISUBRL m32int, ST(0)","fisubrl m32int, ST(0)","DA /5","V","V","","","rw,r","Y","32"
+"FLD ST(0), ST(i)","FLD ST(i), ST(0)","fld ST(i), ST(0)","D9 C0+i","V","V","","","w,r","Y",""
+"FLD1","FLD1","fld1","D9 E8","V","V","","","","",""
+"FLDCW m2byte","FLDCW m2byte","fldcw m2byte","D9 /5","V","V","","","r","",""
+"FLDENV m28byte","FLDENV m28byte","fldenv m28byte","D9 /4","V","V","","operand32,operand64","r","",""
+"FLDENV m14byte","FLDENVS m14byte","fldenv m14byte","D9 /4","V","V","","operand16","r","",""
+"FLD ST(0), m64fp","FLDL m64fp, ST(0)","fldl m64fp, ST(0)","DD /0","V","V","","","w,r","Y","64"
+"FLDL2E","FLDL2E","fldl2e","D9 EA","V","V","","","","",""
+"FLDL2T","FLDL2T","fldl2t","D9 E9","V","V","","","","",""
+"FLDLG2","FLDLG2","fldlg2","D9 EC","V","V","","","","",""
+"FLDLN2","FLDLN2","fldln2","D9 ED","V","V","","","","",""
+"FLDPI","FLDPI","fldpi","D9 EB","V","V","","","","",""
+"FLD ST(0), m32fp","FLDS m32fp, ST(0)","flds m32fp, ST(0)","D9 /0","V","V","","","w,r","Y","32"
+"FLD ST(0), m80fp","FLDT m80fp, ST(0)","fldt m80fp, ST(0)","DB /5","V","V","","","w,r","Y","80"
+"FLDZ","FLDZ","fldz","D9 EE","V","V","","","","",""
+"FMUL ST(i), ST(0)","FMUL ST(0), ST(i)","fmul ST(0), ST(i)","DC C8+i","V","V","","","rw,r","Y",""
+"FMUL ST(0), ST(i)","FMUL ST(i), ST(0)","fmul ST(i), ST(0)","D8 C8+i","V","V","","","rw,r","Y",""
+"FMUL ST(0), m64fp","FMULL m64fp, ST(0)","fmull m64fp, ST(0)","DC /1","V","V","","","rw,r","Y","64"
+"FMULP","FMULP","fmulp","DE C9","V","V","","pseudo","","",""
+"FMULP ST(i), ST(0)","FMULP ST(0), ST(i)","fmulp ST(0), ST(i)","DE C8+i","V","V","","","rw,r","",""
+"FMUL ST(0), m32fp","FMULS m32fp, ST(0)","fmuls m32fp, ST(0)","D8 /1","V","V","","","rw,r","Y","32"
+"FNCLEX","FNCLEX","fnclex","DB E2","V","V","","","","",""
+"FNINIT","FNINIT","fninit","DB E3","V","V","","","","",""
+"FNOP","FNOP","fnop","D9 D0","V","V","","","","",""
+"FNSAVE m108byte","FNSAVE m108byte","fnsave m108byte","DD /6","V","V","","operand32,operand64","w","",""
+"FNSAVE m94byte","FNSAVES m94byte","fnsave m94byte","DD /6","V","V","","operand16","w","",""
+"FNSTCW m2byte","FNSTCW m2byte","fnstcw m2byte","D9 /7","V","V","","","w","",""
+"FNSTENV m28byte","FNSTENV m28byte","fnstenv m28byte","D9 /6","V","V","","operand32,operand64","w","",""
+"FNSTENV m14byte","FNSTENVS m14byte","fnstenv m14byte","D9 /6","V","V","","operand16","w","",""
+"FNSTSW AX","FNSTSW AX","fnstsw AX","DF E0","V","V","","","w","",""
+"FNSTSW m2byte","FNSTSW m2byte","fnstsw m2byte","DD /7","V","V","","","w","",""
+"FPATAN","FPATAN","fpatan","D9 F3","V","V","","","","",""
+"FPREM","FPREM","fprem","D9 F8","V","V","","","","",""
+"FPREM1","FPREM1","fprem1","D9 F5","V","V","","","","",""
+"FPTAN","FPTAN","fptan","D9 F2","V","V","","","","",""
+"FRNDINT","FRNDINT","frndint","D9 FC","V","V","","","","",""
+"FRSTOR m108byte","FRSTOR m108byte","frstor m108byte","DD /4","V","V","","operand32,operand64","r","",""
+"FRSTOR m94byte","FRSTORS m94byte","frstor m94byte","DD /4","V","V","","operand16","r","",""
+"FSAVE m94/108byte","FSAVE m94/108byte","fsave m94/108byte","9B DD /6","V","V","","pseudo","w","",""
+"FSCALE","FSCALE","fscale","D9 FD","V","V","","","","",""
+"FSETPM287_NOP","FSETPM287_NOP","fsetpm287_nop","DB E4","V","V","","","","",""
+"FSIN","FSIN","fsin","D9 FE","V","V","","","","",""
+"FSINCOS","FSINCOS","fsincos","D9 FB","V","V","","","","",""
+"FSQRT","FSQRT","fsqrt","D9 FA","V","V","","","","",""
+"FST ST(i), ST(0)","FST ST(0), ST(i)","fst ST(0), ST(i)","DD D0+i","V","V","","","w,r","Y",""
+"FSTCW m2byte","FSTCW m2byte","fstcw m2byte","9B D9 /7","V","V","","pseudo","w","",""
+"FSTENV m14/28byte","FSTENV m14/28byte","fstenv m14/28byte","9B D9 /6","V","V","","pseudo","w","",""
+"FST m64fp, ST(0)","FSTL ST(0), m64fp","fstl ST(0), m64fp","DD /2","V","V","","","w,r","Y","64"
+"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DD D8+i","V","V","","","w,r","Y",""
+"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DF D0+i","V","V","","","w,r","Y",""
+"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DF D8+i","V","V","","","w,r","Y",""
+"FSTP m64fp, ST(0)","FSTPL ST(0), m64fp","fstpl ST(0), m64fp","DD /3","V","V","","","w,r","Y","64"
+"FSTPNCE ST(i), ST(0)","FSTPNCE ST(0), ST(i)","fstpnce ST(0), ST(i)","D9 D8+i","V","V","","","w,r","",""
+"FSTP m32fp, ST(0)","FSTPS ST(0), m32fp","fstps ST(0), m32fp","D9 /3","V","V","","","w,r","Y","32"
+"FSTP m80fp, ST(0)","FSTPT ST(0), m80fp","fstpt ST(0), m80fp","DB /7","V","V","","","w,r","Y","80"
+"FST m32fp, ST(0)","FSTS ST(0), m32fp","fsts ST(0), m32fp","D9 /2","V","V","","","w,r","Y","32"
+"FSTSW AX","FSTSW AX","fstsw AX","9B DF E0","V","V","","pseudo","w","",""
+"FSTSW m2byte","FSTSW m2byte","fstsw m2byte","9B DD /7","V","V","","pseudo","w","",""
+"FSUBR ST(i), ST(0)","FSUB ST(0), ST(i)","fsub ST(0), ST(i)","DC E0+i","V","V","","","rw,r","Y",""
+"FSUB ST(0), ST(i)","FSUB ST(i), ST(0)","fsub ST(i), ST(0)","D8 E0+i","V","V","","","rw,r","Y",""
+"FSUB ST(0), m64fp","FSUBL m64fp, ST(0)","fsubl m64fp, ST(0)","DC /4","V","V","","","rw,r","Y","64"
+"FSUBP","FSUBP","fsubp","DE E9","V","V","","pseudo","","",""
+"FSUBRP ST(i), ST(0)","FSUBP ST(0), ST(i)","fsubp ST(0), ST(i)","DE E0+i","V","V","","","rw,r","",""
+"FSUB ST(i), ST(0)","FSUBR ST(0), ST(i)","fsubr ST(0), ST(i)","DC E8+i","V","V","","","rw,r","Y",""
+"FSUBR ST(0), ST(i)","FSUBR ST(i), ST(0)","fsubr ST(i), ST(0)","D8 E8+i","V","V","","","rw,r","Y",""
+"FSUBR ST(0), m64fp","FSUBRL m64fp, ST(0)","fsubrl m64fp, ST(0)","DC /5","V","V","","","rw,r","Y","64"
+"FSUBRP","FSUBRP","fsubrp","DE E1","V","V","","pseudo","","",""
+"FSUBP ST(i), ST(0)","FSUBRP ST(0), ST(i)","fsubrp ST(0), ST(i)","DE E8+i","V","V","","","rw,r","",""
+"FSUBR ST(0), m32fp","FSUBRS m32fp, ST(0)","fsubrs m32fp, ST(0)","D8 /5","V","V","","","rw,r","Y","32"
+"FSUB ST(0), m32fp","FSUBS m32fp, ST(0)","fsubs m32fp, ST(0)","D8 /4","V","V","","","rw,r","Y","32"
+"FTST","FTST","ftst","D9 E4","V","V","","","","",""
+"FUCOM","FUCOM","fucom","DD E1","V","V","","pseudo","","",""
+"FUCOM ST(0), ST(i)","FUCOM ST(i), ST(0)","fucom ST(i), ST(0)","DD E0+i","V","V","","","r,r","",""
+"FUCOMI ST(0), ST(i)","FUCOMI ST(i), ST(0)","fucomi ST(i), ST(0)","DB E8+i","V","V","PPRO","P6","r,r","",""
+"FUCOMIP ST(0), ST(i)","FUCOMIP ST(i), ST(0)","fucomip ST(i), ST(0)","DF E8+i","V","V","PPRO","P6","r,r","",""
+"FUCOMP","FUCOMP","fucomp","DD E9","V","V","","pseudo","","",""
+"FUCOMP ST(0), ST(i)","FUCOMP ST(i), ST(0)","fucomp ST(i), ST(0)","DD E8+i","V","V","","","r,r","",""
+"FUCOMPP","FUCOMPP","fucompp","DA E9","V","V","","","","",""
+"FWAIT","FWAIT","fwait","9B","V","V","","","","",""
+"FXAM","FXAM","fxam","D9 E5","V","V","","","","",""
+"FXCH","FXCH","fxch","D9 C9","V","V","","pseudo","","",""
+"FXCH ST(0), ST(i)","FXCH ST(i), ST(0)","fxch ST(i), ST(0)","D9 C8+i","V","V","","","rw,rw","",""
+"FXCH_ALIAS1 ST(0), ST(i)","FXCH_ALIAS1 ST(i), ST(0)","fxch_alias1 ST(i), ST(0)","DD C8+i","V","V","","","rw,rw","",""
+"FXCH_ALIAS2 ST(0), ST(i)","FXCH_ALIAS2 ST(i), ST(0)","fxch_alias2 ST(i), ST(0)","DF C8+i","V","V","","","rw,rw","",""
+"FXRSTOR m512byte","FXRSTOR m512byte","fxrstor m512byte","0F AE /1","V","V","","modrm_memonly,operand16,operand32","r","",""
+"FXRSTOR64 m512byte","FXRSTOR64 m512byte","fxrstor64 m512byte","REX.W 0F AE /1","N.S.","V","","modrm_memonly","r","",""
+"FXSAVE m512byte","FXSAVE m512byte","fxsave m512byte","0F AE /0","V","V","","modrm_memonly,operand16,operand32","w","",""
+"FXSAVE64 m512byte","FXSAVE64 m512byte","fxsave64 m512byte","REX.W 0F AE /0","N.S.","V","","modrm_memonly","w","",""
+"FXTRACT","FXTRACT","fxtract","D9 F4","V","V","","","","",""
+"FYL2X","FYL2X","fyl2x","D9 F1","V","V","","","","",""
+"FYL2XP1","FYL2XP1","fyl2xp1","D9 F9","V","V","","","","",""
+"GETSEC","GETSEC","getsec","0F 37","V","V","SMX","","","",""
+"GF2P8AFFINEINVQB xmm1, xmm2/m128, imm8u","GF2P8AFFINEINVQB imm8u, xmm2/m128, xmm1","gf2p8affineinvqb imm8u, xmm2/m128, xmm1","66 0F 3A CF /r ib","V","V","GFNI","","rw,r,r","",""
+"GF2P8AFFINEQB xmm1, xmm2/m128, imm8u","GF2P8AFFINEQB imm8u, xmm2/m128, xmm1","gf2p8affineqb imm8u, xmm2/m128, xmm1","66 0F 3A CE /r ib","V","V","GFNI","","rw,r,r","",""
+"GF2P8MULB xmm1, xmm2/m128","GF2P8MULB xmm2/m128, xmm1","gf2p8mulb xmm2/m128, xmm1","66 0F 38 CF /r","V","V","GFNI","","rw,r","",""
+"HADDPD xmm1, xmm2/m128","HADDPD xmm2/m128, xmm1","haddpd xmm2/m128, xmm1","66 0F 7C /r","V","V","SSE3","","rw,r","",""
+"HADDPS xmm1, xmm2/m128","HADDPS xmm2/m128, xmm1","haddps xmm2/m128, xmm1","F2 0F 7C /r","V","V","SSE3","","rw,r","",""
+"HLT","HLT","hlt","F4","V","V","","","","",""
+"HSUBPD xmm1, xmm2/m128","HSUBPD xmm2/m128, xmm1","hsubpd xmm2/m128, xmm1","66 0F 7D /r","V","V","SSE3","","rw,r","",""
+"HSUBPS xmm1, xmm2/m128","HSUBPS xmm2/m128, xmm1","hsubps xmm2/m128, xmm1","F2 0F 7D /r","V","V","SSE3","","rw,r","",""
+"ICEBP","ICEBP","icebp","F1","V","V","","","","",""
+"IDIV r/m8","IDIVB r/m8","idivb r/m8","F6 /7","V","V","","","r","Y","8"
+"IDIV r/m8","IDIVB r/m8","idivb r/m8","REX F6 /7","N.E.","V","","pseudo64","r","Y","8"
+"IDIV r/m32","IDIVL r/m32","idivl r/m32","F7 /7","V","V","","operand32","r","Y","32"
+"IDIV r/m64","IDIVQ r/m64","idivq r/m64","REX.W F7 /7","N.S.","V","","","r","Y","64"
+"IDIV r/m16","IDIVW r/m16","idivw r/m16","F7 /7","V","V","","operand16","r","Y","16"
+"IMUL r32, r/m32, imm32","IMUL3 imm32, r/m32, r32","imull imm32, r/m32, r32","69 /r id","V","V","","operand32","w,r,r","Y","32"
+"IMUL r64, r/m64, imm32","IMUL3 imm32, r/m64, r64","imulq imm32, r/m64, r64","REX.W 69 /r id","N.S.","V","","","w,r,r","Y","64"
+"IMUL r16, r/m16, imm8","IMUL3 imm8, r/m16, r16","imulw imm8, r/m16, r16","6B /r ib","V","V","","operand16","w,r,r","Y","16"
+"IMUL r32, r/m32, imm8","IMUL3 imm8, r/m32, r32","imull imm8, r/m32, r32","6B /r ib","V","V","","operand32","w,r,r","Y","32"
+"IMUL r64, r/m64, imm8","IMUL3 imm8, r/m64, r64","imulq imm8, r/m64, r64","REX.W 6B /r ib","N.S.","V","","","w,r,r","Y","64"
+"IMUL r/m8","IMULB r/m8","imulb r/m8","F6 /5","V","V","","","r","Y","8"
+"IMUL r/m32","IMULL r/m32","imull r/m32","F7 /5","V","V","","operand32","r","Y","32"
+"IMUL r32, r/m32","IMULL r/m32, r32","imull r/m32, r32","0F AF /r","V","V","","operand32","rw,r","Y","32"
+"IMUL r/m64","IMULQ r/m64","imulq r/m64","REX.W F7 /5","N.S.","V","","","r","Y","64"
+"IMUL r64, r/m64","IMULQ r/m64, r64","imulq r/m64, r64","REX.W 0F AF /r","N.S.","V","","","rw,r","Y","64"
+"IMUL r16, r/m16, imm16","IMULW imm16, r/m16, r16","imulw imm16, r/m16, r16","69 /r iw","V","V","","operand16","w,r,r","Y","16"
+"IMUL r/m16","IMULW r/m16","imulw r/m16","F7 /5","V","V","","operand16","r","Y","16"
+"IMUL r16, r/m16","IMULW r/m16, r16","imulw r/m16, r16","0F AF /r","V","V","","operand16","rw,r","Y","16"
+"IN AL, DX","INB DX, AL","inb DX, AL","EC","V","V","","","w,r","Y","8"
+"IN AL, imm8u","INB imm8u, AL","inb imm8u, AL","E4 ib","V","V","","","w,r","Y","8"
+"INC r/m8","INCB r/m8","incb r/m8","FE /0","V","V","","","rw","Y","8"
+"INC r/m8","INCB r/m8","incb r/m8","REX FE /0","N.E.","V","","pseudo64","rw","Y","8"
+"INC r/m32","INCL r/m32","incl r/m32","FF /0","V","V","","operand32","rw","Y","32"
+"INC r32op","INCL r32op","incl r32op","40+rd","V","N.S.","","operand32","rw","Y","32"
+"INC r/m64","INCQ r/m64","incq r/m64","REX.W FF /0","N.S.","V","","","rw","Y","64"
+"INCSSPD rmr32","INCSSPD rmr32","incsspd rmr32","F3 0F AE /5","V","V","CET","modrm_regonly,operand16,operand32","r","",""
+"INCSSPQ rmr64","INCSSPQ rmr64","incsspq rmr64","F3 REX.W 0F AE /5","N.S.","V","CET","modrm_regonly","r","",""
+"INC r/m16","INCW r/m16","incw r/m16","FF /0","V","V","","operand16","rw","Y","16"
+"INC r16op","INCW r16op","incw r16op","40+rw","V","N.S.","","operand16","rw","Y","16"
+"IN EAX, DX","INL DX, EAX","inl DX, EAX","ED","V","V","","operand32,operand64","w,r","Y","32"
+"IN EAX, imm8u","INL imm8u, EAX","inl imm8u, EAX","E5 ib","V","V","","operand32,operand64","w,r","Y","32"
+"INSB","INSB","insb","6C","V","V","","","","",""
+"INSERTPS xmm1, xmm2/m32, imm8u","INSERTPS imm8u, xmm2/m32, xmm1","insertps imm8u, xmm2/m32, xmm1","66 0F 3A 21 /r ib","V","V","SSE4_1","","rw,r,r","",""
+"INSERTQ xmm1, xmm2, imm8u, imm8u","INSERTQ imm8u, imm8u, xmm2, xmm1","insertq imm8u, imm8u, xmm2, xmm1","F2 0F 78 /r ib ib","V","V","SSE4a","amd,modrm_regonly","w,r,r,r","",""
+"INSERTQ xmm1, xmm2","INSERTQ xmm2, xmm1","insertq xmm2, xmm1","F2 0F 79 /r","V","V","SSE4a","amd,modrm_regonly","w,r","",""
+"INSD","INSL","insl","6D","V","V","","operand32,operand64","","",""
+"INSW","INSW","insw","6D","V","V","","operand16","","",""
+"INT 3","INT 3","int 3","CC","V","V","","","r","",""
+"INT imm8u","INT imm8u","int imm8u","CD ib","V","V","","","r","",""
+"INTO","INTO","into","CE","V","N.S.","","","","",""
+"INVD","INVD","invd","0F 08","V","V","486","","","",""
+"INVEPT r32, m128","INVEPT m128, r32","invept m128, r32","66 0F 38 80 /r","V","N.S.","VTX","modrm_memonly","r,r","",""
+"INVEPT r64, m128","INVEPT m128, r64","invept m128, r64","66 0F 38 80 /r","N.S.","V","VTX","default64,modrm_memonly","r,r","",""
+"INVLPG m","INVLPG m","invlpg m","0F 01 /7","V","V","486","modrm_memonly","r","",""
+"INVLPGA EAX, ECX","INVLPGAL ECX, EAX","invlpgal ECX, EAX","0F 01 DF","V","V","SVM","amd,modrm_regonly,operand32","r,r","Y","32"
+"INVLPGA RAX, ECX","INVLPGAQ ECX, RAX","invlpgaq ECX, RAX","REX.W 0F 01 DF","N.S.","V","SVM","amd,modrm_regonly","r,r","Y","64"
+"INVLPGA AX, ECX","INVLPGAW ECX, AX","invlpgaw ECX, AX","0F 01 DF","V","V","SVM","amd,modrm_regonly,operand16","r,r","Y","16"
+"INVPCID r32, m128","INVPCID m128, r32","invpcid m128, r32","66 0F 38 82 /r","V","N.S.","INVPCID","modrm_memonly","r,r","",""
+"INVPCID r64, m128","INVPCID m128, r64","invpcid m128, r64","66 0F 38 82 /r","N.S.","V","INVPCID","default64,modrm_memonly","r,r","",""
+"INVVPID r32, m128","INVVPID m128, r32","invvpid m128, r32","66 0F 38 81 /r","V","N.S.","VTX","modrm_memonly","r,r","",""
+"INVVPID r64, m128","INVVPID m128, r64","invvpid m128, r64","66 0F 38 81 /r","N.S.","V","VTX","default64,modrm_memonly","r,r","",""
+"IN AX, DX","INW DX, AX","inw DX, AX","ED","V","V","","operand16","w,r","Y","16"
+"IN AX, imm8u","INW imm8u, AX","inw imm8u, AX","E5 ib","V","V","","operand16","w,r","Y","16"
+"IRETD","IRETL","iretl","CF","V","V","","operand32","","",""
+"IRETQ","IRETQ","iretq","REX.W CF","N.S.","V","","","","",""
+"IRET","IRETW","iretw","CF","V","V","","operand16","","",""
+"JA rel16","JA rel16","ja rel16","0F 87 cw","V","N.S.","","operand16","r","",""
+"JA rel32","JA rel32","ja rel32","0F 87 cd","V","N.S.","","operand32","r","",""
+"JA rel32","JA rel32","ja rel32","0F 87 cd","N.S.","V","","default64","r","",""
+"JA rel8","JA rel8","ja rel8","77 cb","N.S.","V","","default64","r","",""
+"JA rel8","JA rel8","ja rel8","77 cb","V","N.S.","","","r","",""
+"JAE rel16","JAE rel16","jae rel16","0F 83 cw","V","N.S.","","operand16","r","",""
+"JAE rel32","JAE rel32","jae rel32","0F 83 cd","N.S.","V","","default64","r","",""
+"JAE rel32","JAE rel32","jae rel32","0F 83 cd","V","N.S.","","operand32","r","",""
+"JAE rel8","JAE rel8","jae rel8","73 cb","V","N.S.","","","r","",""
+"JAE rel8","JAE rel8","jae rel8","73 cb","N.S.","V","","default64","r","",""
+"JB rel16","JB rel16","jb rel16","0F 82 cw","V","N.S.","","operand16","r","",""
+"JB rel32","JB rel32","jb rel32","0F 82 cd","V","N.S.","","operand32","r","",""
+"JB rel32","JB rel32","jb rel32","0F 82 cd","N.S.","V","","default64","r","",""
+"JB rel8","JB rel8","jb rel8","72 cb","N.S.","V","","default64","r","",""
+"JB rel8","JB rel8","jb rel8","72 cb","V","N.S.","","","r","",""
+"JBE rel16","JBE rel16","jbe rel16","0F 86 cw","V","N.S.","","operand16","r","",""
+"JBE rel32","JBE rel32","jbe rel32","0F 86 cd","V","N.S.","","operand32","r","",""
+"JBE rel32","JBE rel32","jbe rel32","0F 86 cd","N.S.","V","","default64","r","",""
+"JBE rel8","JBE rel8","jbe rel8","76 cb","V","N.S.","","","r","",""
+"JBE rel8","JBE rel8","jbe rel8","76 cb","N.S.","V","","default64","r","",""
+"JC rel16","JC rel16","jc rel16","0F 82 cw","V","N.S.","","pseudo","r","",""
+"JC rel32","JC rel32","jc rel32","0F 82 cd","V","V","","pseudo","r","",""
+"JC rel8","JC rel8","jc rel8","72 cb","V","V","","pseudo","r","",""
+"JCXZ rel8","JCXZ rel8","jcxz rel8","E3 cb","V","N.S.","","address16","r","",""
+"JE rel16","JE rel16","je rel16","0F 84 cw","V","N.S.","","operand16","r","",""
+"JE rel32","JE rel32","je rel32","0F 84 cd","V","N.S.","","operand32","r","",""
+"JE rel32","JE rel32","je rel32","0F 84 cd","N.S.","V","","default64","r","",""
+"JE rel8","JE rel8","je rel8","74 cb","N.S.","V","","default64","r","",""
+"JE rel8","JE rel8","je rel8","74 cb","V","N.S.","","","r","",""
+"JECXZ rel8","JECXZ rel8","jecxz rel8","E3 cb","V","V","","address32","r","",""
+"JG rel16","JG rel16","jg rel16","0F 8F cw","V","N.S.","","operand16","r","",""
+"JG rel32","JG rel32","jg rel32","0F 8F cd","N.S.","V","","default64","r","",""
+"JG rel32","JG rel32","jg rel32","0F 8F cd","V","N.S.","","operand32","r","",""
+"JG rel8","JG rel8","jg rel8","7F cb","V","N.S.","","","r","",""
+"JG rel8","JG rel8","jg rel8","7F cb","N.S.","V","","default64","r","",""
+"JGE rel16","JGE rel16","jge rel16","0F 8D cw","V","N.S.","","operand16","r","",""
+"JGE rel32","JGE rel32","jge rel32","0F 8D cd","V","N.S.","","operand32","r","",""
+"JGE rel32","JGE rel32","jge rel32","0F 8D cd","N.S.","V","","default64","r","",""
+"JGE rel8","JGE rel8","jge rel8","7D cb","N.S.","V","","default64","r","",""
+"JGE rel8","JGE rel8","jge rel8","7D cb","V","N.S.","","","r","",""
+"JL rel16","JL rel16","jl rel16","0F 8C cw","V","N.S.","","operand16","r","",""
+"JL rel32","JL rel32","jl rel32","0F 8C cd","V","N.S.","","operand32","r","",""
+"JL rel32","JL rel32","jl rel32","0F 8C cd","N.S.","V","","default64","r","",""
+"JL rel8","JL rel8","jl rel8","7C cb","V","N.S.","","","r","",""
+"JL rel8","JL rel8","jl rel8","7C cb","N.S.","V","","default64","r","",""
+"JLE rel16","JLE rel16","jle rel16","0F 8E cw","V","N.S.","","operand16","r","",""
+"JLE rel32","JLE rel32","jle rel32","0F 8E cd","V","N.S.","","operand32","r","",""
+"JLE rel32","JLE rel32","jle rel32","0F 8E cd","N.S.","V","","default64","r","",""
+"JLE rel8","JLE rel8","jle rel8","7E cb","N.S.","V","","default64","r","",""
+"JLE rel8","JLE rel8","jle rel8","7E cb","V","N.S.","","","r","",""
+"JMP rel16","JMP rel16","jmp rel16","E9 cw","V","N.S.","","operand16","r","Y",""
+"JMP rel32","JMP rel32","jmp rel32","E9 cd","N.S.","V","","default64","r","Y",""
+"JMP rel32","JMP rel32","jmp rel32","E9 cd","V","N.S.","","operand32","r","Y",""
+"JMP rel8","JMP rel8","jmp rel8","EB cb","N.S.","V","","default64","r","Y",""
+"JMP rel8","JMP rel8","jmp rel8","EB cb","V","N.S.","","","r","Y",""
+"JMP r/m32","JMPL* r/m32","jmpl* r/m32","FF /4","V","N.S.","","operand32","r","Y","32"
+"JMP r/m64","JMPQ* r/m64","jmpq* r/m64","FF /4","N.S.","V","","","r","Y","64"
+"JMP r/m16","JMPW* r/m16","jmpw* r/m16","FF /4","V","N.S.","","operand16","r","Y","16"
+"JNA rel16","JNA rel16","jna rel16","0F 86 cw","V","N.S.","","pseudo","r","",""
+"JNA rel32","JNA rel32","jna rel32","0F 86 cd","V","V","","pseudo","r","",""
+"JNA rel8","JNA rel8","jna rel8","76 cb","V","V","","pseudo","r","",""
+"JNAE rel16","JNAE rel16","jnae rel16","0F 82 cw","V","N.S.","","pseudo","r","",""
+"JNAE rel32","JNAE rel32","jnae rel32","0F 82 cd","V","V","","pseudo","r","",""
+"JNAE rel8","JNAE rel8","jnae rel8","72 cb","V","V","","pseudo","r","",""
+"JNB rel16","JNB rel16","jnb rel16","0F 83 cw","V","N.S.","","pseudo","r","",""
+"JNB rel32","JNB rel32","jnb rel32","0F 83 cd","V","V","","pseudo","r","",""
+"JNB rel8","JNB rel8","jnb rel8","73 cb","V","V","","pseudo","r","",""
+"JNBE rel16","JNBE rel16","jnbe rel16","0F 87 cw","V","N.S.","","pseudo","r","",""
+"JNBE rel32","JNBE rel32","jnbe rel32","0F 87 cd","V","V","","pseudo","r","",""
+"JNBE rel8","JNBE rel8","jnbe rel8","77 cb","V","V","","pseudo","r","",""
+"JNC rel16","JNC rel16","jnc rel16","0F 83 cw","V","N.S.","","pseudo","r","",""
+"JNC rel32","JNC rel32","jnc rel32","0F 83 cd","V","V","","pseudo","r","",""
+"JNC rel8","JNC rel8","jnc rel8","73 cb","V","V","","pseudo","r","",""
+"JNE rel16","JNE rel16","jne rel16","0F 85 cw","V","N.S.","","operand16","r","",""
+"JNE rel32","JNE rel32","jne rel32","0F 85 cd","N.S.","V","","default64","r","",""
+"JNE rel32","JNE rel32","jne rel32","0F 85 cd","V","N.S.","","operand32","r","",""
+"JNE rel8","JNE rel8","jne rel8","75 cb","V","N.S.","","","r","",""
+"JNE rel8","JNE rel8","jne rel8","75 cb","N.S.","V","","default64","r","",""
+"JNG rel16","JNG rel16","jng rel16","0F 8E cw","V","N.S.","","pseudo","r","",""
+"JNG rel32","JNG rel32","jng rel32","0F 8E cd","V","V","","pseudo","r","",""
+"JNG rel8","JNG rel8","jng rel8","7E cb","V","V","","pseudo","r","",""
+"JNGE rel16","JNGE rel16","jnge rel16","0F 8C cw","V","N.S.","","pseudo","r","",""
+"JNGE rel32","JNGE rel32","jnge rel32","0F 8C cd","V","V","","pseudo","r","",""
+"JNGE rel8","JNGE rel8","jnge rel8","7C cb","V","V","","pseudo","r","",""
+"JNL rel16","JNL rel16","jnl rel16","0F 8D cw","V","N.S.","","pseudo","r","",""
+"JNL rel32","JNL rel32","jnl rel32","0F 8D cd","V","V","","pseudo","r","",""
+"JNL rel8","JNL rel8","jnl rel8","7D cb","V","V","","pseudo","r","",""
+"JNLE rel16","JNLE rel16","jnle rel16","0F 8F cw","V","N.S.","","pseudo","r","",""
+"JNLE rel32","JNLE rel32","jnle rel32","0F 8F cd","V","V","","pseudo","r","",""
+"JNLE rel8","JNLE rel8","jnle rel8","7F cb","V","V","","pseudo","r","",""
+"JNO rel16","JNO rel16","jno rel16","0F 81 cw","V","N.S.","","operand16","r","",""
+"JNO rel32","JNO rel32","jno rel32","0F 81 cd","V","N.S.","","operand32","r","",""
+"JNO rel32","JNO rel32","jno rel32","0F 81 cd","N.S.","V","","default64","r","",""
+"JNO rel8","JNO rel8","jno rel8","71 cb","V","N.S.","","","r","",""
+"JNO rel8","JNO rel8","jno rel8","71 cb","N.S.","V","","default64","r","",""
+"JNP rel16","JNP rel16","jnp rel16","0F 8B cw","V","N.S.","","operand16","r","",""
+"JNP rel32","JNP rel32","jnp rel32","0F 8B cd","V","N.S.","","operand32","r","",""
+"JNP rel32","JNP rel32","jnp rel32","0F 8B cd","N.S.","V","","default64","r","",""
+"JNP rel8","JNP rel8","jnp rel8","7B cb","N.S.","V","","default64","r","",""
+"JNP rel8","JNP rel8","jnp rel8","7B cb","V","N.S.","","","r","",""
+"JNS rel16","JNS rel16","jns rel16","0F 89 cw","V","N.S.","","operand16","r","",""
+"JNS rel32","JNS rel32","jns rel32","0F 89 cd","N.S.","V","","default64","r","",""
+"JNS rel32","JNS rel32","jns rel32","0F 89 cd","V","N.S.","","operand32","r","",""
+"JNS rel8","JNS rel8","jns rel8","79 cb","V","N.S.","","","r","",""
+"JNS rel8","JNS rel8","jns rel8","79 cb","N.S.","V","","default64","r","",""
+"JNZ rel16","JNZ rel16","jnz rel16","0F 85 cw","V","N.S.","","pseudo","r","",""
+"JNZ rel32","JNZ rel32","jnz rel32","0F 85 cd","V","V","","pseudo","r","",""
+"JNZ rel8","JNZ rel8","jnz rel8","75 cb","V","V","","pseudo","r","",""
+"JO rel16","JO rel16","jo rel16","0F 80 cw","V","N.S.","","operand16","r","",""
+"JO rel32","JO rel32","jo rel32","0F 80 cd","V","N.S.","","operand32","r","",""
+"JO rel32","JO rel32","jo rel32","0F 80 cd","N.S.","V","","default64","r","",""
+"JO rel8","JO rel8","jo rel8","70 cb","V","N.S.","","","r","",""
+"JO rel8","JO rel8","jo rel8","70 cb","N.S.","V","","default64","r","",""
+"JP rel16","JP rel16","jp rel16","0F 8A cw","V","N.S.","","operand16","r","",""
+"JP rel32","JP rel32","jp rel32","0F 8A cd","N.S.","V","","default64","r","",""
+"JP rel32","JP rel32","jp rel32","0F 8A cd","V","N.S.","","operand32","r","",""
+"JP rel8","JP rel8","jp rel8","7A cb","N.S.","V","","default64","r","",""
+"JP rel8","JP rel8","jp rel8","7A cb","V","N.S.","","","r","",""
+"JPE rel16","JPE rel16","jpe rel16","0F 8A cw","V","N.S.","","pseudo","r","",""
+"JPE rel32","JPE rel32","jpe rel32","0F 8A cd","V","V","","pseudo","r","",""
+"JPE rel8","JPE rel8","jpe rel8","7A cb","V","V","","pseudo","r","",""
+"JPO rel16","JPO rel16","jpo rel16","0F 8B cw","V","N.S.","","pseudo","r","",""
+"JPO rel32","JPO rel32","jpo rel32","0F 8B cd","V","V","","pseudo","r","",""
+"JPO rel8","JPO rel8","jpo rel8","7B cb","V","V","","pseudo","r","",""
+"JRCXZ rel8","JRCXZ rel8","jrcxz rel8","E3 cb","N.S.","V","","address64","r","",""
+"JS rel16","JS rel16","js rel16","0F 88 cw","V","N.S.","","operand16","r","",""
+"JS rel32","JS rel32","js rel32","0F 88 cd","V","N.S.","","operand32","r","",""
+"JS rel32","JS rel32","js rel32","0F 88 cd","N.S.","V","","default64","r","",""
+"JS rel8","JS rel8","js rel8","78 cb","V","N.S.","","","r","",""
+"JS rel8","JS rel8","js rel8","78 cb","N.S.","V","","default64","r","",""
+"JZ rel16","JZ rel16","jz rel16","0F 84 cw","V","N.S.","","operand16,pseudo","r","",""
+"JZ rel32","JZ rel32","jz rel32","0F 84 cd","V","V","","operand32,pseudo","r","",""
+"JZ rel8","JZ rel8","jz rel8","74 cb","V","V","","pseudo","r","",""
+"KADDB k1, kV, k2","KADDB k2, kV, k1","kaddb k2, kV, k1","VEX.NDS.256.66.0F.W0 4A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KADDD k1, kV, k2","KADDD k2, kV, k1","kaddd k2, kV, k1","VEX.NDS.256.66.0F.W1 4A /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KADDQ k1, kV, k2","KADDQ k2, kV, k1","kaddq k2, kV, k1","VEX.NDS.256.0F.W1 4A /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KADDW k1, kV, k2","KADDW k2, kV, k1","kaddw k2, kV, k1","VEX.NDS.256.0F.W0 4A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KANDB k1, kV, k2","KANDB k2, kV, k1","kandb k2, kV, k1","VEX.NDS.256.66.0F.W0 41 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KANDD k1, kV, k2","KANDD k2, kV, k1","kandd k2, kV, k1","VEX.NDS.256.66.0F.W1 41 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KANDNB k1, kV, k2","KANDNB k2, kV, k1","kandnb k2, kV, k1","VEX.NDS.256.66.0F.W0 42 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KANDND k1, kV, k2","KANDND k2, kV, k1","kandnd k2, kV, k1","VEX.NDS.256.66.0F.W1 42 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KANDNQ k1, kV, k2","KANDNQ k2, kV, k1","kandnq k2, kV, k1","VEX.NDS.256.0F.W1 42 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KANDNW k1, kV, k2","KANDNW k2, kV, k1","kandnw k2, kV, k1","VEX.NDS.256.0F.W0 42 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KANDQ k1, kV, k2","KANDQ k2, kV, k1","kandq k2, kV, k1","VEX.NDS.256.0F.W1 41 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KANDW k1, kV, k2","KANDW k2, kV, k1","kandw k2, kV, k1","VEX.NDS.256.0F.W0 41 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KMOVB m8, k1","KMOVB k1, m8","kmovb k1, m8","VEX.128.66.0F.W0 91 /r","V","V","AVX512DQ","modrm_memonly","w,r","",""
+"KMOVB r32, k2","KMOVB k2, r32","kmovb k2, r32","VEX.128.66.0F.W0 93 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"KMOVB k1, k2/m8","KMOVB k2/m8, k1","kmovb k2/m8, k1","VEX.128.66.0F.W0 90 /r","V","V","AVX512DQ","","w,r","",""
+"KMOVB k1, rmr32","KMOVB rmr32, k1","kmovb rmr32, k1","VEX.128.66.0F.W0 92 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"KMOVD m32, k1","KMOVD k1, m32","kmovd k1, m32","VEX.128.66.0F.W1 91 /r","V","V","AVX512BW","modrm_memonly","w,r","",""
+"KMOVD r32, k2","KMOVD k2, r32","kmovd k2, r32","VEX.128.F2.0F.W0 93 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"KMOVD k1, k2/m32","KMOVD k2/m32, k1","kmovd k2/m32, k1","VEX.128.66.0F.W1 90 /r","V","V","AVX512BW","","w,r","",""
+"KMOVD k1, rmr32","KMOVD rmr32, k1","kmovd rmr32, k1","VEX.128.F2.0F.W0 92 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"KMOVQ m64, k1","KMOVQ k1, m64","kmovq k1, m64","VEX.128.0F.W1 91 /r","V","V","AVX512BW","modrm_memonly","w,r","",""
+"KMOVQ r64, k2","KMOVQ k2, r64","kmovq k2, r64","VEX.128.F2.0F.W1 93 /r","N.S.","V","AVX512BW","modrm_regonly","w,r","",""
+"KMOVQ k1, k2/m64","KMOVQ k2/m64, k1","kmovq k2/m64, k1","VEX.128.0F.W1 90 /r","V","V","AVX512BW","","w,r","",""
+"KMOVQ k1, rmr64","KMOVQ rmr64, k1","kmovq rmr64, k1","VEX.128.F2.0F.W1 92 /r","N.S.","V","AVX512BW","modrm_regonly","w,r","",""
+"KMOVW m16, k1","KMOVW k1, m16","kmovw k1, m16","VEX.128.0F.W0 91 /r","V","V","AVX512F","modrm_memonly","w,r","",""
+"KMOVW r32, k2","KMOVW k2, r32","kmovw k2, r32","VEX.128.0F.W0 93 /r","V","V","AVX512F","modrm_regonly","w,r","",""
+"KMOVW k1, k2/m16","KMOVW k2/m16, k1","kmovw k2/m16, k1","VEX.128.0F.W0 90 /r","V","V","AVX512F","","w,r","",""
+"KMOVW k1, rmr32","KMOVW rmr32, k1","kmovw rmr32, k1","VEX.128.0F.W0 92 /r","V","V","AVX512F","modrm_regonly","w,r","",""
+"KNOTB k1, k2","KNOTB k2, k1","knotb k2, k1","VEX.128.66.0F.W0 44 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"KNOTD k1, k2","KNOTD k2, k1","knotd k2, k1","VEX.128.66.0F.W1 44 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"KNOTQ k1, k2","KNOTQ k2, k1","knotq k2, k1","VEX.128.0F.W1 44 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"KNOTW k1, k2","KNOTW k2, k1","knotw k2, k1","VEX.128.0F.W0 44 /r","V","V","AVX512F","modrm_regonly","w,r","",""
+"KORB k1, kV, k2","KORB k2, kV, k1","korb k2, kV, k1","VEX.NDS.256.66.0F.W0 45 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KORD k1, kV, k2","KORD k2, kV, k1","kord k2, kV, k1","VEX.NDS.256.66.0F.W1 45 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KORQ k1, kV, k2","KORQ k2, kV, k1","korq k2, kV, k1","VEX.NDS.256.0F.W1 45 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KORTESTB k1, k2","KORTESTB k2, k1","kortestb k2, k1","VEX.128.66.0F.W0 98 /r","V","V","AVX512DQ","modrm_regonly","r,r","",""
+"KORTESTD k1, k2","KORTESTD k2, k1","kortestd k2, k1","VEX.128.66.0F.W1 98 /r","V","V","AVX512BW","modrm_regonly","r,r","",""
+"KORTESTQ k1, k2","KORTESTQ k2, k1","kortestq k2, k1","VEX.128.0F.W1 98 /r","V","V","AVX512BW","modrm_regonly","r,r","",""
+"KORTESTW k1, k2","KORTESTW k2, k1","kortestw k2, k1","VEX.128.0F.W0 98 /r","V","V","AVX512F","modrm_regonly","r,r","",""
+"KORW k1, kV, k2","KORW k2, kV, k1","korw k2, kV, k1","VEX.NDS.256.0F.W0 45 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KSHIFTLB k1, k2, imm8u","KSHIFTLB imm8u, k2, k1","kshiftlb imm8u, k2, k1","VEX.128.66.0F3A.W0 32 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KSHIFTLD k1, k2, imm8u","KSHIFTLD imm8u, k2, k1","kshiftld imm8u, k2, k1","VEX.128.66.0F3A.W0 33 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KSHIFTLQ k1, k2, imm8u","KSHIFTLQ imm8u, k2, k1","kshiftlq imm8u, k2, k1","VEX.128.66.0F3A.W1 33 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KSHIFTLW k1, k2, imm8u","KSHIFTLW imm8u, k2, k1","kshiftlw imm8u, k2, k1","VEX.128.66.0F3A.W1 32 /r ib","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KSHIFTRB k1, k2, imm8u","KSHIFTRB imm8u, k2, k1","kshiftrb imm8u, k2, k1","VEX.128.66.0F3A.W0 30 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KSHIFTRD k1, k2, imm8u","KSHIFTRD imm8u, k2, k1","kshiftrd imm8u, k2, k1","VEX.128.66.0F3A.W0 31 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KSHIFTRQ k1, k2, imm8u","KSHIFTRQ imm8u, k2, k1","kshiftrq imm8u, k2, k1","VEX.128.66.0F3A.W1 31 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KSHIFTRW k1, k2, imm8u","KSHIFTRW imm8u, k2, k1","kshiftrw imm8u, k2, k1","VEX.128.66.0F3A.W1 30 /r ib","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KTESTB k1, k2","KTESTB k2, k1","ktestb k2, k1","VEX.128.66.0F.W0 99 /r","V","V","AVX512DQ","modrm_regonly","r,r","",""
+"KTESTD k1, k2","KTESTD k2, k1","ktestd k2, k1","VEX.128.66.0F.W1 99 /r","V","V","AVX512BW","modrm_regonly","r,r","",""
+"KTESTQ k1, k2","KTESTQ k2, k1","ktestq k2, k1","VEX.128.0F.W1 99 /r","V","V","AVX512BW","modrm_regonly","r,r","",""
+"KTESTW k1, k2","KTESTW k2, k1","ktestw k2, k1","VEX.128.0F.W0 99 /r","V","V","AVX512DQ","modrm_regonly","r,r","",""
+"KUNPCKBW k1, kV, k2","KUNPCKBW k2, kV, k1","kunpckbw k2, kV, k1","VEX.NDS.256.66.0F.W0 4B /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KUNPCKDQ k1, kV, k2","KUNPCKDQ k2, kV, k1","kunpckdq k2, kV, k1","VEX.NDS.256.0F.W1 4B /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KUNPCKWD k1, kV, k2","KUNPCKWD k2, kV, k1","kunpckwd k2, kV, k1","VEX.NDS.256.0F.W0 4B /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KXNORB k1, kV, k2","KXNORB k2, kV, k1","kxnorb k2, kV, k1","VEX.NDS.256.66.0F.W0 46 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KXNORD k1, kV, k2","KXNORD k2, kV, k1","kxnord k2, kV, k1","VEX.NDS.256.66.0F.W1 46 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KXNORQ k1, kV, k2","KXNORQ k2, kV, k1","kxnorq k2, kV, k1","VEX.NDS.256.0F.W1 46 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KXNORW k1, kV, k2","KXNORW k2, kV, k1","kxnorw k2, kV, k1","VEX.NDS.256.0F.W0 46 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"KXORB k1, kV, k2","KXORB k2, kV, k1","kxorb k2, kV, k1","VEX.NDS.256.66.0F.W0 47 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"KXORD k1, kV, k2","KXORD k2, kV, k1","kxord k2, kV, k1","VEX.NDS.256.66.0F.W1 47 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KXORQ k1, kV, k2","KXORQ k2, kV, k1","kxorq k2, kV, k1","VEX.NDS.256.0F.W1 47 /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"KXORW k1, kV, k2","KXORW k2, kV, k1","kxorw k2, kV, k1","VEX.NDS.256.0F.W0 47 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"LAHF","LAHF","lahf","9F","V","V","LAHFSAHF","","","",""
+"LAR r32, r32/m16","LARL r32/m16, r32","larl r32/m16, r32","0F 02 /r","V","V","","operand32","rw,r","Y","32"
+"LAR r64, r64/m16","LARQ r64/m16, r64","larq r64/m16, r64","REX.W 0F 02 /r","N.S.","V","","","rw,r","Y","64"
+"LAR r16, r/m16","LARW r/m16, r16","larw r/m16, r16","0F 02 /r","V","V","","operand16","rw,r","Y","16"
+"CALL_FAR ptr16:32","LCALLL ptr16:32","lcalll ptr16:32","9A cd iw","V","N.S.","","operand32","r","Y",""
+"CALL_FAR m16:32","LCALLL* m16:32","lcalll* m16:32","FF /3","V","V","","modrm_memonly,operand32","r","Y",""
+"CALL_FAR m16:64","LCALLQ* m16:64","lcallq* m16:64","REX.W FF /3","N.S.","V","","modrm_memonly","r","Y",""
+"CALL_FAR ptr16:16","LCALLW ptr16:16","lcallw ptr16:16","9A cw iw","V","N.S.","","operand16","r","Y",""
+"CALL_FAR m16:16","LCALLW* m16:16","lcallw* m16:16","FF /3","V","V","","modrm_memonly,operand16","r","Y",""
+"LDDQU xmm1, m128","LDDQU m128, xmm1","lddqu m128, xmm1","F2 0F F0 /r","V","V","SSE3","modrm_memonly","w,r","",""
+"LDMXCSR m32","LDMXCSR m32","ldmxcsr m32","0F AE /2","V","V","SSE","modrm_memonly","r","",""
+"LDS r32, m16:32","LDSL m16:32, r32","ldsl m16:32, r32","C5 /r","V","N.S.","","modrm_memonly,operand32","w,r","Y","32"
+"LDS r16, m16:16","LDSW m16:16, r16","ldsw m16:16, r16","C5 /r","V","N.S.","","modrm_memonly,operand16","w,r","Y","16"
+"LEA r32, m","LEAL m, r32","leal m, r32","8D /r","V","V","","modrm_memonly,operand32","w,r","Y","32"
+"LEA r64, m","LEAQ m, r64","leaq m, r64","REX.W 8D /r","N.S.","V","","modrm_memonly","w,r","Y","64"
+"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","N.S.","V","","default64","","Y",""
+"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","V","N.S.","","operand32","","Y",""
+"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","V","V","","operand16","","Y",""
+"LEA r16, m","LEAW m, r16","leaw m, r16","8D /r","V","V","","modrm_memonly,operand16","w,r","Y","16"
+"LES r32, m16:32","LESL m16:32, r32","lesl m16:32, r32","C4 /r","V","N.S.","","modrm_memonly,operand32","w,r","Y","32"
+"LES r16, m16:16","LESW m16:16, r16","lesw m16:16, r16","C4 /r","V","N.S.","","modrm_memonly,operand16","w,r","Y","16"
+"LFENCE","LFENCE","lfence","0F AE /5","V","V","SSE2","","","",""
+"LFS r32, m16:32","LFSL m16:32, r32","lfsl m16:32, r32","0F B4 /r","V","V","","modrm_memonly,operand32","w,r","Y","32"
+"LFS r64, m16:64","LFSQ m16:64, r64","lfsq m16:64, r64","REX.W 0F B4 /r","N.S.","V","","modrm_memonly","w,r","Y","64"
+"LFS r16, m16:16","LFSW m16:16, r16","lfsw m16:16, r16","0F B4 /r","V","V","","modrm_memonly,operand16","w,r","Y","16"
+"LGDT m16&64","LGDT m16&64","lgdt m16&64","0F 01 /2","N.S.","V","","default64,modrm_memonly","r","",""
+"LGDT m16&32","LGDTW/LGDTL m16&32","lgdtw/lgdtl m16&32","0F 01 /2","V","N.S.","","modrm_memonly","r","",""
+"LGS r32, m16:32","LGSL m16:32, r32","lgsl m16:32, r32","0F B5 /r","V","V","","modrm_memonly,operand32","w,r","Y","32"
+"LGS r64, m16:64","LGSQ m16:64, r64","lgsq m16:64, r64","REX.W 0F B5 /r","N.S.","V","","modrm_memonly","w,r","Y","64"
+"LGS r16, m16:16","LGSW m16:16, r16","lgsw m16:16, r16","0F B5 /r","V","V","","modrm_memonly,operand16","w,r","Y","16"
+"LIDT m16&64","LIDT m16&64","lidt m16&64","0F 01 /3","N.S.","V","","default64,modrm_memonly","r","",""
+"LIDT m16&32","LIDTW/LIDTL m16&32","lidtw/lidtl m16&32","0F 01 /3","V","N.S.","","modrm_memonly","r","",""
+"JMP_FAR ptr16:32","LJMPL ptr16:32","ljmpl ptr16:32","EA cd iw","V","N.S.","","operand32","r","Y",""
+"JMP_FAR m16:32","LJMPL* m16:32","ljmpl* m16:32","FF /5","V","V","","modrm_memonly,operand32","r","Y",""
+"JMP_FAR m16:64","LJMPQ* m16:64","ljmpq* m16:64","REX.W FF /5","N.S.","V","","modrm_memonly","r","Y",""
+"JMP_FAR ptr16:16","LJMPW ptr16:16","ljmpw ptr16:16","EA cw iw","V","N.S.","","operand16","r","Y",""
+"JMP_FAR m16:16","LJMPW* m16:16","ljmpw* m16:16","FF /5","V","V","","modrm_memonly,operand16","r","Y",""
+"LLDT r/m16","LLDT r/m16","lldt r/m16","0F 00 /2","V","V","","","r","",""
+"LLWPCB rmr32","LLWPCBL rmr32","llwpcbl rmr32","XOP.128.09.W0 12 /0","V","V","XOP","amd,modrm_regonly,operand16,operand32","w","Y","32"
+"LLWPCB rmr64","LLWPCBQ rmr64","llwpcbq rmr64","XOP.128.09.W0 12 /0","N.S.","V","XOP","amd,modrm_regonly,operand64","w","Y","64"
+"LMSW r/m16","LMSW r/m16","lmsw r/m16","0F 01 /6","V","V","","","r","",""
+"LOCK","LOCK","lock","F0","V","V","","pseudo","","",""
+"LODSB","LODSB","lodsb","AC","V","V","","","","",""
+"LODSD","LODSL","lodsl","AD","V","V","","operand32","","",""
+"LODSQ","LODSQ","lodsq","REX.W AD","N.S.","V","","","","",""
+"LODSW","LODSW","lodsw","AD","V","V","","operand16","","",""
+"LOOP rel8","LOOP rel8","loop rel8","E2 cb","V","V","","","r","",""
+"LOOPE rel8","LOOPEQ rel8","loope rel8","E1 cb","V","V","","","r","",""
+"LOOPNE rel8","LOOPNE rel8","loopne rel8","E0 cb","V","V","","","r","",""
+"LSL r32, r32/m16","LSLL r32/m16, r32","lsll r32/m16, r32","0F 03 /r","V","V","","operand32","rw,r","Y","32"
+"LSL r64, r32/m16","LSLQ r32/m16, r64","lslq r32/m16, r64","REX.W 0F 03 /r","N.S.","V","","","rw,r","Y","64"
+"LSL r16, r/m16","LSLW r/m16, r16","lslw r/m16, r16","0F 03 /r","V","V","","operand16","rw,r","Y","16"
+"LSS r32, m16:32","LSSL m16:32, r32","lssl m16:32, r32","0F B2 /r","V","V","","modrm_memonly,operand32","w,r","Y","32"
+"LSS r64, m16:64","LSSQ m16:64, r64","lssq m16:64, r64","REX.W 0F B2 /r","N.S.","V","","modrm_memonly","w,r","Y","64"
+"LSS r16, m16:16","LSSW m16:16, r16","lssw m16:16, r16","0F B2 /r","V","V","","modrm_memonly,operand16","w,r","Y","16"
+"LTR r/m16","LTR r/m16","ltr r/m16","0F 00 /3","V","V","","","r","",""
+"LWPINS r32V, r/m32, imm32u","LWPINS imm32u, r/m32, r32V","lwpins imm32u, r/m32, r32V","XOP.NDD.128.0A.W0 12 /0","V","V","XOP","amd,operand16,operand32","w,r,r","",""
+"LWPINS r64V, r64/m32, imm32u","LWPINS imm32u, r64/m32, r64V","lwpins imm32u, r64/m32, r64V","XOP.NDD.128.0A.W0 12 /0","N.S.","V","XOP","amd,operand64","w,r,r","",""
+"LWPVAL r32V, r/m32, imm32u","LWPVAL imm32u, r/m32, r32V","lwpval imm32u, r/m32, r32V","XOP.NDD.128.0A.W0 12 /1","V","V","XOP","amd,operand16,operand32","w,r,r","",""
+"LWPVAL r64V, r64/m32, imm32u","LWPVAL imm32u, r64/m32, r64V","lwpval imm32u, r64/m32, r64V","XOP.NDD.128.0A.W0 12 /1","N.S.","V","XOP","amd,operand64","w,r,r","",""
+"LZCNT r32, r/m32","LZCNTL r/m32, r32","lzcntl r/m32, r32","F3 0F BD /r","V","V","LZCNT","operand32","w,r","Y","32"
+"LZCNT r32, r/m32","LZCNTL r/m32, r32","lzcntl r/m32, r32","F3 0F BD /r","V","V","AMD","amd,operand32","w,r","Y","32"
+"LZCNT r64, r/m64","LZCNTQ r/m64, r64","lzcntq r/m64, r64","F3 REX.W 0F BD /r","N.S.","V","AMD","amd","w,r","Y","64"
+"LZCNT r64, r/m64","LZCNTQ r/m64, r64","lzcntq r/m64, r64","F3 REX.W 0F BD /r","N.S.","V","LZCNT","","w,r","Y","64"
+"LZCNT r16, r/m16","LZCNTW r/m16, r16","lzcntw r/m16, r16","F3 0F BD /r","V","V","AMD","amd,operand16","w,r","Y","16"
+"LZCNT r16, r/m16","LZCNTW r/m16, r16","lzcntw r/m16, r16","F3 0F BD /r","V","V","LZCNT","operand16","w,r","Y","16"
+"MASKMOVDQU xmm1, xmm2","MASKMOVOU xmm2, xmm1","maskmovdqu xmm2, xmm1","66 0F F7 /r","V","V","SSE2","modrm_regonly","r,r","",""
+"MASKMOVQ mm1, mm2","MASKMOVQ mm2, mm1","maskmovq mm2, mm1","0F F7 /r","V","V","MMX","modrm_regonly","r,r","",""
+"MAXPD xmm1, xmm2/m128","MAXPD xmm2/m128, xmm1","maxpd xmm2/m128, xmm1","66 0F 5F /r","V","V","SSE2","","rw,r","",""
+"MAXPS xmm1, xmm2/m128","MAXPS xmm2/m128, xmm1","maxps xmm2/m128, xmm1","0F 5F /r","V","V","SSE","","rw,r","",""
+"MAXSD xmm1, xmm2/m64","MAXSD xmm2/m64, xmm1","maxsd xmm2/m64, xmm1","F2 0F 5F /r","V","V","SSE2","","rw,r","",""
+"MAXSS xmm1, xmm2/m32","MAXSS xmm2/m32, xmm1","maxss xmm2/m32, xmm1","F3 0F 5F /r","V","V","SSE","","rw,r","",""
+"MFENCE","MFENCE","mfence","0F AE /6","V","V","SSE2","","","",""
+"MINPD xmm1, xmm2/m128","MINPD xmm2/m128, xmm1","minpd xmm2/m128, xmm1","66 0F 5D /r","V","V","SSE2","","rw,r","",""
+"MINPS xmm1, xmm2/m128","MINPS xmm2/m128, xmm1","minps xmm2/m128, xmm1","0F 5D /r","V","V","SSE","","rw,r","",""
+"MINSD xmm1, xmm2/m64","MINSD xmm2/m64, xmm1","minsd xmm2/m64, xmm1","F2 0F 5D /r","V","V","SSE2","","rw,r","",""
+"MINSS xmm1, xmm2/m32","MINSS xmm2/m32, xmm1","minss xmm2/m32, xmm1","F3 0F 5D /r","V","V","SSE","","rw,r","",""
+"MONITOR","MONITOR","monitor","0F 01 C8","V","V","MONITOR","","","",""
+"MOVAPD xmm2/m128, xmm1","MOVAPD xmm1, xmm2/m128","movapd xmm1, xmm2/m128","66 0F 29 /r","V","V","SSE2","","w,r","",""
+"MOVAPD xmm1, xmm2/m128","MOVAPD xmm2/m128, xmm1","movapd xmm2/m128, xmm1","66 0F 28 /r","V","V","SSE2","","w,r","",""
+"MOVAPS xmm2/m128, xmm1","MOVAPS xmm1, xmm2/m128","movaps xmm1, xmm2/m128","0F 29 /r","V","V","SSE","","w,r","",""
+"MOVAPS xmm1, xmm2/m128","MOVAPS xmm2/m128, xmm1","movaps xmm2/m128, xmm1","0F 28 /r","V","V","SSE","","w,r","",""
+"MOV r/m8, imm8u","MOVB imm8u, r/m8","movb imm8u, r/m8","C6 /0 ib","V","V","","","w,r","Y","8"
+"MOV r/m8, imm8u","MOVB imm8u, r/m8","movb imm8u, r/m8","REX C6 /0 ib","N.E.","V","","pseudo64","w,r","Y","8"
+"MOV r8op, imm8u","MOVB imm8u, r8op","movb imm8u, r8op","B0+rb ib","V","V","","","w,r","Y","8"
+"MOV r8op, imm8u","MOVB imm8u, r8op","movb imm8u, r8op","REX B0+rb ib","N.E.","V","","pseudo64","w,r","Y","8"
+"MOV r8, r/m8","MOVB r/m8, r8","movb r/m8, r8","8A /r","V","V","","","w,r","Y","8"
+"MOV r8, r/m8","MOVB r/m8, r8","movb r/m8, r8","REX 8A /r","N.E.","V","","pseudo64","w,r","Y","8"
+"MOV r/m8, r8","MOVB r8, r/m8","movb r8, r/m8","88 /r","V","V","","","w,r","Y","8"
+"MOV r/m8, r8","MOVB r8, r/m8","movb r8, r/m8","REX 88 /r","N.E.","V","","pseudo64","w,r","Y","8"
+"MOV moffs8, AL","MOVB/MOVB/MOVABSB AL, moffs8","movb/movb/movabsb AL, moffs8","A2 cm","V","V","","","w,r","Y","8"
+"MOV moffs8, AL","MOVB/MOVB/MOVABSB AL, moffs8","movb/movb/movabsb AL, moffs8","REX.W A2 cm","N.E.","V","","pseudo","w,r","Y","8"
+"MOV AL, moffs8","MOVB/MOVB/MOVABSB moffs8, AL","movb/movb/movabsb moffs8, AL","A0 cm","V","V","","","w,r","Y","8"
+"MOV AL, moffs8","MOVB/MOVB/MOVABSB moffs8, AL","movb/movb/movabsb moffs8, AL","REX.W A0 cm","N.E.","V","","pseudo","w,r","Y","8"
+"MOVBE r32, m32","MOVBELL m32, r32","movbell m32, r32","0F 38 F0 /r","V","V","MOVBE","modrm_memonly,operand32","w,r","Y","32"
+"MOVBE m32, r32","MOVBELL r32, m32","movbell r32, m32","0F 38 F1 /r","V","V","MOVBE","modrm_memonly,operand32","w,r","Y","32"
+"MOVBE r64, m64","MOVBEQQ m64, r64","movbeqq m64, r64","REX.W 0F 38 F0 /r","N.S.","V","MOVBE","modrm_memonly","w,r","Y","64"
+"MOVBE m64, r64","MOVBEQQ r64, m64","movbeqq r64, m64","REX.W 0F 38 F1 /r","N.S.","V","MOVBE","modrm_memonly","w,r","Y","64"
+"MOVBE r16, m16","MOVBEWW m16, r16","movbeww m16, r16","0F 38 F0 /r","V","V","MOVBE","modrm_memonly,operand16","w,r","Y","16"
+"MOVBE m16, r16","MOVBEWW r16, m16","movbeww r16, m16","0F 38 F1 /r","V","V","MOVBE","modrm_memonly,operand16","w,r","Y","16"
+"MOVSX r32, r/m8","MOVBLSX r/m8, r32","movsbl r/m8, r32","0F BE /r","V","V","","operand32","w,r","Y","32"
+"MOVZX r32, r/m8","MOVBLZX r/m8, r32","movzbl r/m8, r32","0F B6 /r","V","V","","operand32","w,r","Y","32"
+"MOVSX r64, r/m8","MOVBQSX r/m8, r64","movsbq r/m8, r64","REX.W 0F BE /r","N.S.","V","","","w,r","Y","64"
+"MOVZX r64, r/m8","MOVBQZX r/m8, r64","movzbq r/m8, r64","REX.W 0F B6 /r","N.S.","V","","","w,r","Y","64"
+"MOVSX r16, r/m8","MOVBWSX r/m8, r16","movsbw r/m8, r16","0F BE /r","V","V","","operand16","w,r","Y","16"
+"MOVZX r16, r/m8","MOVBWZX r/m8, r16","movzbw r/m8, r16","0F B6 /r","V","V","","operand16","w,r","Y","16"
+"MOVD r/m32, mm1","MOVD mm1, r/m32","movd mm1, r/m32","0F 7E /r","V","V","MMX","operand16,operand32","w,r","",""
+"MOVD mm1, r/m32","MOVD r/m32, mm1","movd r/m32, mm1","0F 6E /r","V","V","MMX","operand16,operand32","w,r","",""
+"MOVD xmm1, r/m32","MOVD r/m32, xmm1","movd r/m32, xmm1","66 0F 6E /r","V","V","SSE2","operand16,operand32","w,r","",""
+"MOVD r/m32, xmm1","MOVD xmm1, r/m32","movd xmm1, r/m32","66 0F 7E /r","V","V","SSE2","operand16,operand32","w,r","",""
+"MOVDDUP xmm1, xmm2/m64","MOVDDUP xmm2/m64, xmm1","movddup xmm2/m64, xmm1","F2 0F 12 /r","V","V","SSE3","","w,r","",""
+"MOVHLPS xmm1, xmm2","MOVHLPS xmm2, xmm1","movhlps xmm2, xmm1","0F 12 /r","V","V","SSE","modrm_regonly","w,r","",""
+"MOVHPD xmm1, m64","MOVHPD m64, xmm1","movhpd m64, xmm1","66 0F 16 /r","V","V","SSE2","modrm_memonly","w,r","",""
+"MOVHPD m64, xmm1","MOVHPD xmm1, m64","movhpd xmm1, m64","66 0F 17 /r","V","V","SSE2","modrm_memonly","w,r","",""
+"MOVHPS xmm1, m64","MOVHPS m64, xmm1","movhps m64, xmm1","0F 16 /r","V","V","SSE","modrm_memonly","w,r","",""
+"MOVHPS m64, xmm1","MOVHPS xmm1, m64","movhps xmm1, m64","0F 17 /r","V","V","SSE","modrm_memonly","w,r","",""
+"MOV rmr32, CR0-CR7","MOVL CR0-CR7, rmr32","movl CR0-CR7, rmr32","0F 20 /r","V","N.S.","","","w,r","Y","32"
+"MOV rmr32, DR0-DR7","MOVL DR0-DR7, rmr32","movl DR0-DR7, rmr32","0F 21 /r","V","N.S.","","","w,r","Y","32"
+"MOV moffs32, EAX","MOVL EAX, moffs32","movl EAX, moffs32","A3 cm","V","V","","operand32","w,r","Y","32"
+"MOV r/m32, imm32","MOVL imm32, r/m32","movl imm32, r/m32","C7 /0 id","V","V","","operand32","w,r","Y","32"
+"MOV r32op, imm32u","MOVL imm32u, r32op","movl imm32u, r32op","B8+rd id","V","V","","operand32","w,r","Y","32"
+"MOV EAX, moffs32","MOVL moffs32, EAX","movl moffs32, EAX","A1 cm","V","V","","operand32","w,r","Y","32"
+"MOV r32, r/m32","MOVL r/m32, r32","movl r/m32, r32","8B /r","V","V","","operand32","w,r","Y","32"
+"MOV r/m32, r32","MOVL r32, r/m32","movl r32, r/m32","89 /r","V","V","","operand32","w,r","Y","32"
+"MOV CR0-CR7, rmr32","MOVL rmr32, CR0-CR7","movl rmr32, CR0-CR7","0F 22 /r","V","N.S.","","","w,r","Y","32"
+"MOV DR0-DR7, rmr32","MOVL rmr32, DR0-DR7","movl rmr32, DR0-DR7","0F 23 /r","V","N.S.","","","w,r","Y","32"
+"MOVLHPS xmm1, xmm2","MOVLHPS xmm2, xmm1","movlhps xmm2, xmm1","0F 16 /r","V","V","SSE","modrm_regonly","w,r","",""
+"MOVLPD xmm1, m64","MOVLPD m64, xmm1","movlpd m64, xmm1","66 0F 12 /r","V","V","SSE2","modrm_memonly","w,r","",""
+"MOVLPD m64, xmm1","MOVLPD xmm1, m64","movlpd xmm1, m64","66 0F 13 /r","V","V","SSE2","modrm_memonly","w,r","",""
+"MOVLPS xmm1, m64","MOVLPS m64, xmm1","movlps m64, xmm1","0F 12 /r","V","V","SSE","modrm_memonly","w,r","",""
+"MOVLPS m64, xmm1","MOVLPS xmm1, m64","movlps xmm1, m64","0F 13 /r","V","V","SSE","modrm_memonly","w,r","",""
+"MOVSXD r32, r/m32","MOVLQSX r/m32, r32","movsxdl r/m32, r32","63 /r","N.S.","V","","operand32","w,r","Y","32"
+"MOVSXD r64, r/m32","MOVLQSX r/m32, r64","movslq r/m32, r64","REX.W 63 /r","N.S.","V","","","w,r","Y","64"
+"MOVMSKPD r32, xmm2","MOVMSKPD xmm2, r32","movmskpd xmm2, r32","66 0F 50 /r","V","V","SSE2","modrm_regonly","w,r","",""
+"MOVMSKPS r32, xmm2","MOVMSKPS xmm2, r32","movmskps xmm2, r32","0F 50 /r","V","V","SSE","modrm_regonly","w,r","",""
+"MOVNTDQA xmm1, m128","MOVNTDQA m128, xmm1","movntdqa m128, xmm1","66 0F 38 2A /r","V","V","SSE4_1","modrm_memonly","w,r","",""
+"MOVNTI m32, r32","MOVNTIL r32, m32","movntil r32, m32","0F C3 /r","V","V","SSE2","modrm_memonly,operand16,operand32","w,r","Y","32"
+"MOVNTI m64, r64","MOVNTIQ r64, m64","movntiq r64, m64","REX.W 0F C3 /r","N.S.","V","SSE2","modrm_memonly","w,r","Y","64"
+"MOVNTDQ m128, xmm1","MOVNTO xmm1, m128","movntdq xmm1, m128","66 0F E7 /r","V","V","SSE2","modrm_memonly","w,r","",""
+"MOVNTPD m128, xmm1","MOVNTPD xmm1, m128","movntpd xmm1, m128","66 0F 2B /r","V","V","SSE2","modrm_memonly","w,r","",""
+"MOVNTPS m128, xmm1","MOVNTPS xmm1, m128","movntps xmm1, m128","0F 2B /r","V","V","SSE","modrm_memonly","w,r","",""
+"MOVNTQ m64, mm1","MOVNTQ mm1, m64","movntq mm1, m64","0F E7 /r","V","V","MMX","modrm_memonly","w,r","",""
+"MOVNTSD m64, xmm1","MOVNTSD xmm1, m64","movntsd xmm1, m64","F2 0F 2B /r","V","V","SSE4a","amd,modrm_memonly","w,r","",""
+"MOVNTSS m32, xmm1","MOVNTSS xmm1, m32","movntss xmm1, m32","F3 0F 2B /r","V","V","SSE4a","amd,modrm_memonly","w,r","",""
+"MOVDQA xmm2/m128, xmm1","MOVO xmm1, xmm2/m128","movdqa xmm1, xmm2/m128","66 0F 7F /r","V","V","SSE2","","w,r","",""
+"MOVDQA xmm1, xmm2/m128","MOVO xmm2/m128, xmm1","movdqa xmm2/m128, xmm1","66 0F 6F /r","V","V","SSE2","","w,r","",""
+"MOVDQU xmm2/m128, xmm1","MOVOU xmm1, xmm2/m128","movdqu xmm1, xmm2/m128","F3 0F 7F /r","V","V","SSE2","","w,r","",""
+"MOVDQU xmm1, xmm2/m128","MOVOU xmm2/m128, xmm1","movdqu xmm2/m128, xmm1","F3 0F 6F /r","V","V","SSE2","","w,r","",""
+"MOV rmr64, CR0-CR7","MOVQ CR0-CR7, rmr64","movq CR0-CR7, rmr64","0F 20 /r","N.S.","V","","default64","w,r","Y","64"
+"MOV rmr64, CR8","MOVQ CR8, rmr64","movq CR8, rmr64","REX.R + 0F 20 /0","N.E.","V","","modrm_regonly,pseudo","w,r","Y","64"
+"MOV rmr64, DR0-DR7","MOVQ DR0-DR7, rmr64","movq DR0-DR7, rmr64","0F 21 /r","N.S.","V","","default64","w,r","Y","64"
+"MOV moffs64, RAX","MOVQ RAX, moffs64","movabsq RAX, moffs64","REX.W A3 cm","N.S.","V","","","w,r","Y","64"
+"MOV r/m64, imm32","MOVQ imm32, r/m64","movq imm32, r/m64","REX.W C7 /0 id","N.S.","V","","","w,r","Y","64"
+"MOV r64op, imm64u","MOVQ imm64u, r64op","movq imm64u, r64op","REX.W B8+ro io","N.S.","V","","","w,r","Y","64"
+"MOVQ mm2/m64, mm1","MOVQ mm1, mm2/m64","movq mm1, mm2/m64","0F 7F /r","V","V","MMX","","w,r","",""
+"MOVQ r/m64, mm1","MOVQ mm1, r/m64","movq mm1, r/m64","REX.W 0F 7E /r","N.S.","V","MMX","","w,r","",""
+"MOVQ mm1, mm2/m64","MOVQ mm2/m64, mm1","movq mm2/m64, mm1","0F 6F /r","V","V","MMX","","w,r","",""
+"MOV RAX, moffs64","MOVQ moffs64, RAX","movabsq moffs64, RAX","REX.W A1 cm","N.S.","V","","","w,r","Y","64"
+"MOVQ mm1, r/m64","MOVQ r/m64, mm1","movq r/m64, mm1","REX.W 0F 6E /r","N.S.","V","MMX","","w,r","",""
+"MOV r64, r/m64","MOVQ r/m64, r64","movq r/m64, r64","REX.W 8B /r","N.S.","V","","","w,r","Y","64"
+"MOVQ xmm1, r/m64","MOVQ r/m64, xmm1","movq r/m64, xmm1","66 REX.W 0F 6E /r","N.S.","V","SSE2","","w,r","",""
+"MOV r/m64, r64","MOVQ r64, r/m64","movq r64, r/m64","REX.W 89 /r","N.S.","V","","","w,r","Y","64"
+"MOV CR0-CR7, rmr64","MOVQ rmr64, CR0-CR7","movq rmr64, CR0-CR7","0F 22 /r","N.S.","V","","default64","w,r","Y","64"
+"MOV CR8, rmr64","MOVQ rmr64, CR8","movq rmr64, CR8","REX.R + 0F 22 /0","N.E.","V","","modrm_regonly,pseudo","w,r","Y","64"
+"MOV DR0-DR7, rmr64","MOVQ rmr64, DR0-DR7","movq rmr64, DR0-DR7","0F 23 /r","N.S.","V","","default64","w,r","Y","64"
+"MOVQ r/m64, xmm1","MOVQ xmm1, r/m64","movq xmm1, r/m64","66 REX.W 0F 7E /r","N.S.","V","SSE2","","w,r","",""
+"MOVQ xmm2/m64, xmm1","MOVQ xmm1, xmm2/m64","movq xmm1, xmm2/m64","66 0F D6 /r","V","V","SSE2","","w,r","",""
+"MOVDQ2Q mm1, xmm2","MOVQ xmm2, mm1","movdq2q xmm2, mm1","F2 0F D6 /r","V","V","SSE2","modrm_regonly","w,r","",""
+"MOVQ xmm1, xmm2/m64","MOVQ xmm2/m64, xmm1","movq xmm2/m64, xmm1","F3 0F 7E /r","V","V","SSE2","","w,r","",""
+"MOVQ2DQ xmm1, mm2","MOVQOZX mm2, xmm1","movq2dq mm2, xmm1","F3 0F D6 /r","V","V","SSE2","modrm_regonly","w,r","",""
+"MOVSB","MOVSB","movsb","A4","V","V","","","","",""
+"MOVSD xmm2/m64, xmm1","MOVSD xmm1, xmm2/m64","movsd xmm1, xmm2/m64","F2 0F 11 /r","V","V","SSE2","","w,r","",""
+"MOVSD xmm1, xmm2/m64","MOVSD xmm2/m64, xmm1","movsd xmm2/m64, xmm1","F2 0F 10 /r","V","V","SSE2","","w,r","",""
+"MOVSHDUP xmm1, xmm2/m128","MOVSHDUP xmm2/m128, xmm1","movshdup xmm2/m128, xmm1","F3 0F 16 /r","V","V","SSE3","","w,r","",""
+"MOVSD","MOVSL","movsl","A5","V","V","","operand32","","",""
+"MOVSLDUP xmm1, xmm2/m128","MOVSLDUP xmm2/m128, xmm1","movsldup xmm2/m128, xmm1","F3 0F 12 /r","V","V","SSE3","","w,r","",""
+"MOVSQ","MOVSQ","movsq","REX.W A5","N.S.","V","","","","",""
+"MOVSS xmm2/m32, xmm1","MOVSS xmm1, xmm2/m32","movss xmm1, xmm2/m32","F3 0F 11 /r","V","V","SSE","","w,r","",""
+"MOVSS xmm1, xmm2/m32","MOVSS xmm2/m32, xmm1","movss xmm2/m32, xmm1","F3 0F 10 /r","V","V","SSE","","w,r","",""
+"MOVSW","MOVSW","movsw","A5","V","V","","operand16","","",""
+"MOVSX r16, r/m16","MOVSWW r/m16, r16","movsww r/m16, r16","0F BF /r","V","V","","operand16","w,r","Y","16"
+"MOVUPD xmm2/m128, xmm1","MOVUPD xmm1, xmm2/m128","movupd xmm1, xmm2/m128","66 0F 11 /r","V","V","SSE2","","w,r","",""
+"MOVUPD xmm1, xmm2/m128","MOVUPD xmm2/m128, xmm1","movupd xmm2/m128, xmm1","66 0F 10 /r","V","V","SSE2","","w,r","",""
+"MOVUPS xmm2/m128, xmm1","MOVUPS xmm1, xmm2/m128","movups xmm1, xmm2/m128","0F 11 /r","V","V","SSE","","w,r","",""
+"MOVUPS xmm1, xmm2/m128","MOVUPS xmm2/m128, xmm1","movups xmm2/m128, xmm1","0F 10 /r","V","V","SSE","","w,r","",""
+"MOV moffs16, AX","MOVW AX, moffs16","movw AX, moffs16","A3 cm","V","V","","operand16","w,r","Y","16"
+"MOV r/m16, Sreg","MOVW Sreg, r/m16","movw Sreg, r/m16","8C /r","V","V","","operand16","w,r","Y","16"
+"MOV r/m16, imm16","MOVW imm16, r/m16","movw imm16, r/m16","C7 /0 iw","V","V","","operand16","w,r","Y","16"
+"MOV r16op, imm16u","MOVW imm16u, r16op","movw imm16u, r16op","B8+rw iw","V","V","","operand16","w,r","Y","16"
+"MOV AX, moffs16","MOVW moffs16, AX","movw moffs16, AX","A1 cm","V","V","","operand16","w,r","Y","16"
+"MOV Sreg, r/m16","MOVW r/m16, Sreg","movw r/m16, Sreg","8E /r","V","V","","","w,r","Y","16"
+"MOV r16, r/m16","MOVW r/m16, r16","movw r/m16, r16","8B /r","V","V","","operand16","w,r","Y","16"
+"MOV r/m16, r16","MOVW r16, r/m16","movw r16, r/m16","89 /r","V","V","","operand16","w,r","Y","16"
+"MOVSX r32, r/m16","MOVWLSX r/m16, r32","movswl r/m16, r32","0F BF /r","V","V","","operand32","w,r","Y","32"
+"MOVZX r32, r/m16","MOVWLZX r/m16, r32","movzwl r/m16, r32","0F B7 /r","V","V","","operand32","w,r","Y","32"
+"MOVSX r64, r/m16","MOVWQSX r/m16, r64","movswq r/m16, r64","REX.W 0F BF /r","N.S.","V","","","w,r","Y","64"
+"MOVSXD r16, r/m32","MOVWQSX r/m32, r16","movsxdw r/m32, r16","63 /r","N.S.","V","","operand16","w,r","Y","16"
+"MOVZX r64, r/m16","MOVWQZX r/m16, r64","movzwq r/m16, r64","REX.W 0F B7 /r","N.S.","V","","","w,r","Y","64"
+"MOVZX r16, r/m16","MOVZWW r/m16, r16","movzww r/m16, r16","0F B7 /r","V","V","","operand16","w,r","Y","16"
+"MOV r32/m16, Sreg","MOV{L/W} Sreg, r32/m16","mov{l/w} Sreg, r32/m16","8C /r","V","V","","operand32","w,r","Y",""
+"MOV r64/m16, Sreg","MOV{Q/W} Sreg, r64/m16","mov{q/w} Sreg, r64/m16","REX.W 8C /r","N.S.","V","","","w,r","Y",""
+"MPSADBW xmm1, xmm2/m128, imm8u","MPSADBW imm8u, xmm2/m128, xmm1","mpsadbw imm8u, xmm2/m128, xmm1","66 0F 3A 42 /r ib","V","V","SSE4_1","","rw,r,r","",""
+"MUL r/m8","MULB r/m8","mulb r/m8","F6 /4","V","V","","","r","Y","8"
+"MUL r/m8","MULB r/m8","mulb r/m8","REX F6 /4","N.E.","V","","pseudo64","r","Y","8"
+"MUL r/m32","MULL r/m32","mull r/m32","F7 /4","V","V","","operand32","r","Y","32"
+"MULPD xmm1, xmm2/m128","MULPD xmm2/m128, xmm1","mulpd xmm2/m128, xmm1","66 0F 59 /r","V","V","SSE2","","rw,r","",""
+"MULPS xmm1, xmm2/m128","MULPS xmm2/m128, xmm1","mulps xmm2/m128, xmm1","0F 59 /r","V","V","SSE","","rw,r","",""
+"MUL r/m64","MULQ r/m64","mulq r/m64","REX.W F7 /4","N.S.","V","","","r","Y","64"
+"MULSD xmm1, xmm2/m64","MULSD xmm2/m64, xmm1","mulsd xmm2/m64, xmm1","F2 0F 59 /r","V","V","SSE2","","rw,r","",""
+"MULSS xmm1, xmm2/m32","MULSS xmm2/m32, xmm1","mulss xmm2/m32, xmm1","F3 0F 59 /r","V","V","SSE","","rw,r","",""
+"MUL r/m16","MULW r/m16","mulw r/m16","F7 /4","V","V","","operand16","r","Y","16"
+"MULX r32, r32V, r/m32","MULXL r/m32, r32V, r32","mulxl r/m32, r32V, r32","VEX.NDD.128.F2.0F38.W0 F6 /r","V","V","BMI2","","w,w,r","Y","32"
+"MULX r64, r64V, r/m64","MULXQ r/m64, r64V, r64","mulxq r/m64, r64V, r64","VEX.NDD.128.F2.0F38.W1 F6 /r","N.S.","V","BMI2","","w,w,r","Y","64"
+"MWAIT","MWAIT","mwait","0F 01 C9","V","V","MONITOR","","","",""
+"NEG r/m8","NEGB r/m8","negb r/m8","F6 /3","V","V","","","rw","Y","8"
+"NEG r/m8","NEGB r/m8","negb r/m8","REX F6 /3","N.E.","V","","pseudo64","rw","Y","8"
+"NEG r/m32","NEGL r/m32","negl r/m32","F7 /3","V","V","","operand32","rw","Y","32"
+"NEG r/m64","NEGQ r/m64","negq r/m64","REX.W F7 /3","N.S.","V","","","rw","Y","64"
+"NEG r/m16","NEGW r/m16","negw r/m16","F7 /3","V","V","","operand16","rw","Y","16"
+"NOP","NOP","nop","90","V","V","","pseudo","","Y",""
+"NOP","NOP","nop","90+rd","V","V","","operand32,operand64","","Y",""
+"NOP","NOP","nop","90+rw","V","V","","operand16,operand64","","Y",""
+"NOP","NOP","nop","F3 90+rd","V","V","","operand32","","Y",""
+"NOP","NOP","nop","F3 90+rw","V","V","","operand16","","Y",""
+"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /4","V","V","","operand32","r","Y","32"
+"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /5","V","V","","operand32","r","Y","32"
+"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /6","V","V","","operand32","r","Y","32"
+"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /7","V","V","","operand32","r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 19 /r","V","V","","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1A /r","V","V","","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1B /r","V","V","","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1C /r","V","V","","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1D /r","V","V","","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1E /r","V","V","PPRO","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1E /r","V","V","","operand32","r,r","Y","32"
+"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1F /r","V","V","","operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 0D /r","V","V","PRFCHW","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 1A /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 1B /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","66 0F 1E /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F2 0F 1E /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1B /r","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /0","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /1","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /2","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /3","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /4","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /5","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /6","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E F8","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E F9","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FA","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FB","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FC","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FD","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FE","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FF","V","V","PPRO","modrm_regonly,operand32","r,r","Y","32"
+"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /0","V","V","","modrm_regonly,operand32","r","Y","32"
+"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /1","V","V","","modrm_regonly,operand32","r","Y","32"
+"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /2","V","V","","modrm_regonly,operand32","r","Y","32"
+"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /3","V","V","","modrm_regonly,operand32","r","Y","32"
+"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /4","N.S.","V","","","r","Y","64"
+"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /5","N.S.","V","","","r","Y","64"
+"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /6","N.S.","V","","","r","Y","64"
+"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /7","N.S.","V","","","r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 19 /r","N.S.","V","","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1A /r","N.S.","V","","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1B /r","N.S.","V","","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1C /r","N.S.","V","","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1D /r","N.S.","V","","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1E /r","N.S.","V","","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1E /r","N.S.","V","PPRO","","r,r","Y","64"
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1F /r","N.S.","V","","","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","66 REX.W 0F 1E /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F2 REX.W 0F 1E /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1B /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /0","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /1","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /2","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /3","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /4","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /5","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /6","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E F8","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E F9","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FA","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FB","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FC","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FD","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FE","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FF","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 0D /r","N.S.","V","PRFCHW","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 1A /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 1B /r","N.S.","V","PPRO","modrm_regonly","r,r","Y","64"
+"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /0","N.S.","V","","modrm_regonly","r","Y","64"
+"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /1","N.S.","V","","modrm_regonly","r","Y","64"
+"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /2","N.S.","V","","modrm_regonly","r","Y","64"
+"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /3","N.S.","V","","modrm_regonly","r","Y","64"
+"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /4","V","V","","operand16","r","Y","16"
+"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /5","V","V","","operand16","r","Y","16"
+"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /6","V","V","","operand16","r","Y","16"
+"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /7","V","V","","operand16","r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 19 /r","V","V","","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1A /r","V","V","","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1B /r","V","V","","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1C /r","V","V","","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1D /r","V","V","","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1E /r","V","V","","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1E /r","V","V","PPRO","operand16","r,r","Y","16"
+"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1F /r","V","V","","operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 0D /r","V","V","PRFCHW","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 1A /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 1B /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","66 0F 1E /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F2 0F 1E /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1B /r","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /0","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /1","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /2","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /3","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /4","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /5","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /6","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E F8","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E F9","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FA","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FB","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FC","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FD","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FE","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FF","V","V","PPRO","modrm_regonly,operand16","r,r","Y","16"
+"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /0","V","V","","modrm_regonly,operand16","r","Y","16"
+"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /1","V","V","","modrm_regonly,operand16","r","Y","16"
+"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /2","V","V","","modrm_regonly,operand16","r","Y","16"
+"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /3","V","V","","modrm_regonly,operand16","r","Y","16"
+"NOT r/m8","NOTB r/m8","notb r/m8","F6 /2","V","V","","","rw","Y","8"
+"NOT r/m8","NOTB r/m8","notb r/m8","REX F6 /2","N.E.","V","","pseudo64","rw","Y","8"
+"NOT r/m32","NOTL r/m32","notl r/m32","F7 /2","V","V","","operand32","rw","Y","32"
+"NOT r/m64","NOTQ r/m64","notq r/m64","REX.W F7 /2","N.S.","V","","","rw","Y","64"
+"NOT r/m16","NOTW r/m16","notw r/m16","F7 /2","V","V","","operand16","rw","Y","16"
+"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","80 /1 ib","V","V","","","rw,r","Y","8"
+"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","82 /1 ib","V","N.S.","","","rw,r","Y","8"
+"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","REX 80 /1 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"OR AL, imm8u","ORB imm8u, AL","orb imm8u, AL","0C ib","V","V","","","rw,r","Y","8"
+"OR r8, r/m8","ORB r/m8, r8","orb r/m8, r8","0A /r","V","V","","","rw,r","Y","8"
+"OR r8, r/m8","ORB r/m8, r8","orb r/m8, r8","REX 0A /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"OR r/m8, r8","ORB r8, r/m8","orb r8, r/m8","08 /r","V","V","","","rw,r","Y","8"
+"OR r/m8, r8","ORB r8, r/m8","orb r8, r/m8","REX 08 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"OR EAX, imm32","ORL imm32, EAX","orl imm32, EAX","0D id","V","V","","operand32","rw,r","Y","32"
+"OR r/m32, imm32","ORL imm32, r/m32","orl imm32, r/m32","81 /1 id","V","V","","operand32","rw,r","Y","32"
+"OR r/m32, imm8","ORL imm8, r/m32","orl imm8, r/m32","83 /1 ib","V","V","","operand32","rw,r","Y","32"
+"OR r32, r/m32","ORL r/m32, r32","orl r/m32, r32","0B /r","V","V","","operand32","rw,r","Y","32"
+"OR r/m32, r32","ORL r32, r/m32","orl r32, r/m32","09 /r","V","V","","operand32","rw,r","Y","32"
+"ORPD xmm1, xmm2/m128","ORPD xmm2/m128, xmm1","orpd xmm2/m128, xmm1","66 0F 56 /r","V","V","SSE2","","rw,r","",""
+"ORPS xmm1, xmm2/m128","ORPS xmm2/m128, xmm1","orps xmm2/m128, xmm1","0F 56 /r","V","V","SSE","","rw,r","",""
+"OR RAX, imm32","ORQ imm32, RAX","orq imm32, RAX","REX.W 0D id","N.S.","V","","","rw,r","Y","64"
+"OR r/m64, imm32","ORQ imm32, r/m64","orq imm32, r/m64","REX.W 81 /1 id","N.S.","V","","","rw,r","Y","64"
+"OR r/m64, imm8","ORQ imm8, r/m64","orq imm8, r/m64","REX.W 83 /1 ib","N.S.","V","","","rw,r","Y","64"
+"OR r64, r/m64","ORQ r/m64, r64","orq r/m64, r64","REX.W 0B /r","N.S.","V","","","rw,r","Y","64"
+"OR r/m64, r64","ORQ r64, r/m64","orq r64, r/m64","REX.W 09 /r","N.S.","V","","","rw,r","Y","64"
+"OR AX, imm16","ORW imm16, AX","orw imm16, AX","0D iw","V","V","","operand16","rw,r","Y","16"
+"OR r/m16, imm16","ORW imm16, r/m16","orw imm16, r/m16","81 /1 iw","V","V","","operand16","rw,r","Y","16"
+"OR r/m16, imm8","ORW imm8, r/m16","orw imm8, r/m16","83 /1 ib","V","V","","operand16","rw,r","Y","16"
+"OR r16, r/m16","ORW r/m16, r16","orw r/m16, r16","0B /r","V","V","","operand16","rw,r","Y","16"
+"OR r/m16, r16","ORW r16, r/m16","orw r16, r/m16","09 /r","V","V","","operand16","rw,r","Y","16"
+"OUT DX, AL","OUTB AL, DX","outb AL, DX","EE","V","V","","","r,r","Y","8"
+"OUT imm8u, AL","OUTB AL, imm8u","outb AL, imm8u","E6 ib","V","V","","","r,r","Y","8"
+"OUT DX, EAX","OUTL EAX, DX","outl EAX, DX","EF","V","V","","operand32,operand64","r,r","Y","32"
+"OUT imm8u, EAX","OUTL EAX, imm8u","outl EAX, imm8u","E7 ib","V","V","","operand32,operand64","r,r","Y","32"
+"OUTSB","OUTSB","outsb","6E","V","V","","","","",""
+"OUTSD","OUTSL","outsl","6F","V","V","","operand32,operand64","","",""
+"OUTSW","OUTSW","outsw","6F","V","V","","operand16","","",""
+"OUT DX, AX","OUTW AX, DX","outw AX, DX","EF","V","V","","operand16","r,r","Y","16"
+"OUT imm8u, AX","OUTW AX, imm8u","outw AX, imm8u","E7 ib","V","V","","operand16","r,r","Y","16"
+"PABSB mm1, mm2/m64","PABSB mm2/m64, mm1","pabsb mm2/m64, mm1","0F 38 1C /r","V","V","SSSE3","","w,r","",""
+"PABSB xmm1, xmm2/m128","PABSB xmm2/m128, xmm1","pabsb xmm2/m128, xmm1","66 0F 38 1C /r","V","V","SSSE3","","w,r","",""
+"PABSD mm1, mm2/m64","PABSD mm2/m64, mm1","pabsd mm2/m64, mm1","0F 38 1E /r","V","V","SSSE3","","w,r","",""
+"PABSD xmm1, xmm2/m128","PABSD xmm2/m128, xmm1","pabsd xmm2/m128, xmm1","66 0F 38 1E /r","V","V","SSSE3","","w,r","",""
+"PABSW mm1, mm2/m64","PABSW mm2/m64, mm1","pabsw mm2/m64, mm1","0F 38 1D /r","V","V","SSSE3","","w,r","",""
+"PABSW xmm1, xmm2/m128","PABSW xmm2/m128, xmm1","pabsw xmm2/m128, xmm1","66 0F 38 1D /r","V","V","SSSE3","","w,r","",""
+"PACKSSDW mm1, mm2/m64","PACKSSLW mm2/m64, mm1","packssdw mm2/m64, mm1","0F 6B /r","V","V","MMX","","rw,r","",""
+"PACKSSDW xmm1, xmm2/m128","PACKSSLW xmm2/m128, xmm1","packssdw xmm2/m128, xmm1","66 0F 6B /r","V","V","SSE2","","rw,r","",""
+"PACKSSWB mm1, mm2/m64","PACKSSWB mm2/m64, mm1","packsswb mm2/m64, mm1","0F 63 /r","V","V","MMX","","rw,r","",""
+"PACKSSWB xmm1, xmm2/m128","PACKSSWB xmm2/m128, xmm1","packsswb xmm2/m128, xmm1","66 0F 63 /r","V","V","SSE2","","rw,r","",""
+"PACKUSDW xmm1, xmm2/m128","PACKUSDW xmm2/m128, xmm1","packusdw xmm2/m128, xmm1","66 0F 38 2B /r","V","V","SSE4_1","","rw,r","",""
+"PACKUSWB mm1, mm2/m64","PACKUSWB mm2/m64, mm1","packuswb mm2/m64, mm1","0F 67 /r","V","V","MMX","","rw,r","",""
+"PACKUSWB xmm1, xmm2/m128","PACKUSWB xmm2/m128, xmm1","packuswb xmm2/m128, xmm1","66 0F 67 /r","V","V","SSE2","","rw,r","",""
+"PADDB mm1, mm2/m64","PADDB mm2/m64, mm1","paddb mm2/m64, mm1","0F FC /r","V","V","MMX","","rw,r","",""
+"PADDB xmm1, xmm2/m128","PADDB xmm2/m128, xmm1","paddb xmm2/m128, xmm1","66 0F FC /r","V","V","SSE2","","rw,r","",""
+"PADDD mm1, mm2/m64","PADDL mm2/m64, mm1","paddd mm2/m64, mm1","0F FE /r","V","V","MMX","","rw,r","",""
+"PADDD xmm1, xmm2/m128","PADDL xmm2/m128, xmm1","paddd xmm2/m128, xmm1","66 0F FE /r","V","V","SSE2","","rw,r","",""
+"PADDQ mm1, mm2/m64","PADDQ mm2/m64, mm1","paddq mm2/m64, mm1","0F D4 /r","V","V","SSE2","","rw,r","",""
+"PADDQ xmm1, xmm2/m128","PADDQ xmm2/m128, xmm1","paddq xmm2/m128, xmm1","66 0F D4 /r","V","V","SSE2","","rw,r","",""
+"PADDSB mm1, mm2/m64","PADDSB mm2/m64, mm1","paddsb mm2/m64, mm1","0F EC /r","V","V","MMX","","rw,r","",""
+"PADDSB xmm1, xmm2/m128","PADDSB xmm2/m128, xmm1","paddsb xmm2/m128, xmm1","66 0F EC /r","V","V","SSE2","","rw,r","",""
+"PADDSW mm1, mm2/m64","PADDSW mm2/m64, mm1","paddsw mm2/m64, mm1","0F ED /r","V","V","MMX","","rw,r","",""
+"PADDSW xmm1, xmm2/m128","PADDSW xmm2/m128, xmm1","paddsw xmm2/m128, xmm1","66 0F ED /r","V","V","SSE2","","rw,r","",""
+"PADDUSB mm1, mm2/m64","PADDUSB mm2/m64, mm1","paddusb mm2/m64, mm1","0F DC /r","V","V","MMX","","rw,r","",""
+"PADDUSB xmm1, xmm2/m128","PADDUSB xmm2/m128, xmm1","paddusb xmm2/m128, xmm1","66 0F DC /r","V","V","SSE2","","rw,r","",""
+"PADDUSW mm1, mm2/m64","PADDUSW mm2/m64, mm1","paddusw mm2/m64, mm1","0F DD /r","V","V","MMX","","rw,r","",""
+"PADDUSW xmm1, xmm2/m128","PADDUSW xmm2/m128, xmm1","paddusw xmm2/m128, xmm1","66 0F DD /r","V","V","SSE2","","rw,r","",""
+"PADDW mm1, mm2/m64","PADDW mm2/m64, mm1","paddw mm2/m64, mm1","0F FD /r","V","V","MMX","","rw,r","",""
+"PADDW xmm1, xmm2/m128","PADDW xmm2/m128, xmm1","paddw xmm2/m128, xmm1","66 0F FD /r","V","V","SSE2","","rw,r","",""
+"PALIGNR mm1, mm2/m64, imm8u","PALIGNR imm8u, mm2/m64, mm1","palignr imm8u, mm2/m64, mm1","0F 3A 0F /r ib","V","V","SSSE3","","rw,r,r","",""
+"PALIGNR xmm1, xmm2/m128, imm8u","PALIGNR imm8u, xmm2/m128, xmm1","palignr imm8u, xmm2/m128, xmm1","66 0F 3A 0F /r ib","V","V","SSSE3","","rw,r,r","",""
+"PAND mm1, mm2/m64","PAND mm2/m64, mm1","pand mm2/m64, mm1","0F DB /r","V","V","MMX","","rw,r","",""
+"PAND xmm1, xmm2/m128","PAND xmm2/m128, xmm1","pand xmm2/m128, xmm1","66 0F DB /r","V","V","SSE2","","rw,r","",""
+"PANDN mm1, mm2/m64","PANDN mm2/m64, mm1","pandn mm2/m64, mm1","0F DF /r","V","V","MMX","","rw,r","",""
+"PANDN xmm1, xmm2/m128","PANDN xmm2/m128, xmm1","pandn xmm2/m128, xmm1","66 0F DF /r","V","V","SSE2","","rw,r","",""
+"PAUSE","PAUSE","pause","F3 90","V","V","","pseudo","","",""
+"PAUSE","PAUSE","pause","F3 90+rd","V","V","","operand32","","Y",""
+"PAUSE","PAUSE","pause","F3 90+rw","V","V","","operand16,operand64","","Y",""
+"PAVGB mm1, mm2/m64","PAVGB mm2/m64, mm1","pavgb mm2/m64, mm1","0F E0 /r","V","V","MMX","","rw,r","",""
+"PAVGB xmm1, xmm2/m128","PAVGB xmm2/m128, xmm1","pavgb xmm2/m128, xmm1","66 0F E0 /r","V","V","SSE2","","rw,r","",""
+"PAVGUSB mm1, mm2/m64","PAVGUSB mm2/m64, mm1","pavgusb mm2/m64, mm1","0F 0F BF /r","V","V","3DNOW","amd","rw,r","",""
+"PAVGW mm1, mm2/m64","PAVGW mm2/m64, mm1","pavgw mm2/m64, mm1","0F E3 /r","V","V","MMX","","rw,r","",""
+"PAVGW xmm1, xmm2/m128","PAVGW xmm2/m128, xmm1","pavgw xmm2/m128, xmm1","66 0F E3 /r","V","V","SSE2","","rw,r","",""
+"PBLENDVB xmm1, xmm2/m128, <XMM0>","PBLENDVB <XMM0>, xmm2/m128, xmm1","pblendvb <XMM0>, xmm2/m128, xmm1","66 0F 38 10 /r","V","V","SSE4_1","","rw,r,r","",""
+"PBLENDW xmm1, xmm2/m128, imm8u","PBLENDW imm8u, xmm2/m128, xmm1","pblendw imm8u, xmm2/m128, xmm1","66 0F 3A 0E /r ib","V","V","SSE4_1","","rw,r,r","",""
+"PCLMULQDQ xmm1, xmm2/m128, imm8u","PCLMULQDQ imm8u, xmm2/m128, xmm1","pclmulqdq imm8u, xmm2/m128, xmm1","66 0F 3A 44 /r ib","V","V","PCLMULQDQ","","rw,r,r","",""
+"PCMPEQB mm1, mm2/m64","PCMPEQB mm2/m64, mm1","pcmpeqb mm2/m64, mm1","0F 74 /r","V","V","MMX","","rw,r","",""
+"PCMPEQB xmm1, xmm2/m128","PCMPEQB xmm2/m128, xmm1","pcmpeqb xmm2/m128, xmm1","66 0F 74 /r","V","V","SSE2","","rw,r","",""
+"PCMPEQD mm1, mm2/m64","PCMPEQL mm2/m64, mm1","pcmpeqd mm2/m64, mm1","0F 76 /r","V","V","MMX","","rw,r","",""
+"PCMPEQD xmm1, xmm2/m128","PCMPEQL xmm2/m128, xmm1","pcmpeqd xmm2/m128, xmm1","66 0F 76 /r","V","V","SSE2","","rw,r","",""
+"PCMPEQQ xmm1, xmm2/m128","PCMPEQQ xmm2/m128, xmm1","pcmpeqq xmm2/m128, xmm1","66 0F 38 29 /r","V","V","SSE4_1","","rw,r","",""
+"PCMPEQW mm1, mm2/m64","PCMPEQW mm2/m64, mm1","pcmpeqw mm2/m64, mm1","0F 75 /r","V","V","MMX","","rw,r","",""
+"PCMPEQW xmm1, xmm2/m128","PCMPEQW xmm2/m128, xmm1","pcmpeqw xmm2/m128, xmm1","66 0F 75 /r","V","V","SSE2","","rw,r","",""
+"PCMPESTRI xmm1, xmm2/m128, imm8u","PCMPESTRI imm8u, xmm2/m128, xmm1","pcmpestri imm8u, xmm2/m128, xmm1","66 0F 3A 61 /r ib","V","V","SSE4_2","","r,r,r","",""
+"PCMPESTRM xmm1, xmm2/m128, imm8u","PCMPESTRM imm8u, xmm2/m128, xmm1","pcmpestrm imm8u, xmm2/m128, xmm1","66 0F 3A 60 /r ib","V","V","SSE4_2","","r,r,r","",""
+"PCMPGTB mm1, mm2/m64","PCMPGTB mm2/m64, mm1","pcmpgtb mm2/m64, mm1","0F 64 /r","V","V","MMX","","rw,r","",""
+"PCMPGTB xmm1, xmm2/m128","PCMPGTB xmm2/m128, xmm1","pcmpgtb xmm2/m128, xmm1","66 0F 64 /r","V","V","SSE2","","rw,r","",""
+"PCMPGTD mm1, mm2/m64","PCMPGTL mm2/m64, mm1","pcmpgtd mm2/m64, mm1","0F 66 /r","V","V","MMX","","rw,r","",""
+"PCMPGTD xmm1, xmm2/m128","PCMPGTL xmm2/m128, xmm1","pcmpgtd xmm2/m128, xmm1","66 0F 66 /r","V","V","SSE2","","rw,r","",""
+"PCMPGTQ xmm1, xmm2/m128","PCMPGTQ xmm2/m128, xmm1","pcmpgtq xmm2/m128, xmm1","66 0F 38 37 /r","V","V","SSE4_2","","rw,r","",""
+"PCMPGTW mm1, mm2/m64","PCMPGTW mm2/m64, mm1","pcmpgtw mm2/m64, mm1","0F 65 /r","V","V","MMX","","rw,r","",""
+"PCMPGTW xmm1, xmm2/m128","PCMPGTW xmm2/m128, xmm1","pcmpgtw xmm2/m128, xmm1","66 0F 65 /r","V","V","SSE2","","rw,r","",""
+"PCMPISTRI xmm1, xmm2/m128, imm8u","PCMPISTRI imm8u, xmm2/m128, xmm1","pcmpistri imm8u, xmm2/m128, xmm1","66 0F 3A 63 /r ib","V","V","SSE4_2","","r,r,r","",""
+"PCMPISTRM xmm1, xmm2/m128, imm8u","PCMPISTRM imm8u, xmm2/m128, xmm1","pcmpistrm imm8u, xmm2/m128, xmm1","66 0F 3A 62 /r ib","V","V","SSE4_2","","r,r,r","",""
+"PDEP r32, r32V, r/m32","PDEPL r/m32, r32V, r32","pdepl r/m32, r32V, r32","VEX.DDS.128.F2.0F38.W0 F5 /r","V","V","BMI2","","rw,r,r","Y","32"
+"PDEP r64, r64V, r/m64","PDEPQ r/m64, r64V, r64","pdepq r/m64, r64V, r64","VEX.DDS.128.F2.0F38.W1 F5 /r","N.S.","V","BMI2","","rw,r,r","Y","64"
+"PEXT r32, r32V, r/m32","PEXTL r/m32, r32V, r32","pextl r/m32, r32V, r32","VEX.DDS.128.F3.0F38.W0 F5 /r","V","V","BMI2","","rw,r,r","Y","32"
+"PEXT r64, r64V, r/m64","PEXTQ r/m64, r64V, r64","pextq r/m64, r64V, r64","VEX.DDS.128.F3.0F38.W1 F5 /r","N.S.","V","BMI2","","rw,r,r","Y","64"
+"PEXTRB r32/m8, xmm1, imm8u","PEXTRB imm8u, xmm1, r32/m8","pextrb imm8u, xmm1, r32/m8","66 0F 3A 14 /r ib","V","V","SSE4_1","","w,r,r","",""
+"PEXTRD r/m32, xmm1, imm8u","PEXTRD imm8u, xmm1, r/m32","pextrd imm8u, xmm1, r/m32","66 0F 3A 16 /r ib","V","V","SSE4_1","operand16,operand32","w,r,r","",""
+"PEXTRQ r/m64, xmm1, imm8u","PEXTRQ imm8u, xmm1, r/m64","pextrq imm8u, xmm1, r/m64","66 REX.W 0F 3A 16 /r ib","N.S.","V","SSE4_1","","w,r,r","",""
+"PEXTRW r32, mm2, imm8u","PEXTRW imm8u, mm2, r32","pextrw imm8u, mm2, r32","0F C5 /r ib","V","V","MMX","modrm_regonly","w,r,r","",""
+"PEXTRW r32/m16, xmm1, imm8u","PEXTRW imm8u, xmm1, r32/m16","pextrw imm8u, xmm1, r32/m16","66 0F 3A 15 /r ib","V","V","SSE4_1","","w,r,r","",""
+"PEXTRW r32, xmm2, imm8u","PEXTRW imm8u, xmm2, r32","pextrw imm8u, xmm2, r32","66 0F C5 /r ib","V","V","SSE2","modrm_regonly","w,r,r","",""
+"PF2ID mm1, mm2/m64","PF2ID mm2/m64, mm1","pf2id mm2/m64, mm1","0F 0F 1D /r","V","V","3DNOW","amd","rw,r","",""
+"PF2IW mm1, mm2/m64","PF2IW mm2/m64, mm1","pf2iw mm2/m64, mm1","0F 0F 1C /r","V","V","3DNOW","amd","rw,r","",""
+"PFACC mm1, mm2/m64","PFACC mm2/m64, mm1","pfacc mm2/m64, mm1","0F 0F AE /r","V","V","3DNOW","amd","rw,r","",""
+"PFADD mm1, mm2/m64","PFADD mm2/m64, mm1","pfadd mm2/m64, mm1","0F 0F 9E /r","V","V","3DNOW","amd","rw,r","",""
+"PFCMPEQ mm1, mm2/m64","PFCMPEQ mm2/m64, mm1","pfcmpeq mm2/m64, mm1","0F 0F B0 /r","V","V","3DNOW","amd","rw,r","",""
+"PFCMPGE mm1, mm2/m64","PFCMPGE mm2/m64, mm1","pfcmpge mm2/m64, mm1","0F 0F 90 /r","V","V","3DNOW","amd","rw,r","",""
+"PFCMPGT mm1, mm2/m64","PFCMPGT mm2/m64, mm1","pfcmpgt mm2/m64, mm1","0F 0F A0 /r","V","V","3DNOW","amd","rw,r","",""
+"PFRCPIT1 mm1, mm2/m64","PFRCPIT1 mm2/m64, mm1","pfrcpit1 mm2/m64, mm1","0F 0F A6 /r","V","V","3DNOW","amd","rw,r","",""
+"PFMAX mm1, mm2/m64","PFMAX mm2/m64, mm1","pfmax mm2/m64, mm1","0F 0F A4 /r","V","V","3DNOW","amd","rw,r","",""
+"PFMIN mm1, mm2/m64","PFMIN mm2/m64, mm1","pfmin mm2/m64, mm1","0F 0F 94 /r","V","V","3DNOW","amd","rw,r","",""
+"PFMUL mm1, mm2/m64","PFMUL mm2/m64, mm1","pfmul mm2/m64, mm1","0F 0F B4 /r","V","V","3DNOW","amd","rw,r","",""
+"PFNACC mm1, mm2/m64","PFNACC mm2/m64, mm1","pfnacc mm2/m64, mm1","0F 0F 8A /r","V","V","3DNOW","amd","rw,r","",""
+"PFPNACC mm1, mm2/m64","PFPNACC mm2/m64, mm1","pfpnacc mm2/m64, mm1","0F 0F 8E /r","V","V","3DNOW","amd","rw,r","",""
+"PFRCP mm1, mm2/m64","PFRCP mm2/m64, mm1","pfrcp mm2/m64, mm1","0F 0F 96 /r","V","V","3DNOW","amd","rw,r","",""
+"PFRCPIT2 mm1, mm2/m64","PFRCPIT2 mm2/m64, mm1","pfrcpit2 mm2/m64, mm1","0F 0F B6 /r","V","V","3DNOW","amd","rw,r","",""
+"PFRSQIT1 mm1, mm2/m64","PFRSQIT1 mm2/m64, mm1","pfrsqit1 mm2/m64, mm1","0F 0F A7 /r","V","V","3DNOW","amd","rw,r","",""
+"PFSQRT mm1, mm2/m64","PFSQRT mm2/m64, mm1","pfsqrt mm2/m64, mm1","0F 0F 97 /r","V","V","3DNOW","amd","rw,r","",""
+"PFSUB mm1, mm2/m64","PFSUB mm2/m64, mm1","pfsub mm2/m64, mm1","0F 0F 9A /r","V","V","3DNOW","amd","rw,r","",""
+"PFSUBR mm1, mm2/m64","PFSUBR mm2/m64, mm1","pfsubr mm2/m64, mm1","0F 0F AA /r","V","V","3DNOW","amd","rw,r","",""
+"PHADDD mm1, mm2/m64","PHADDD mm2/m64, mm1","phaddd mm2/m64, mm1","0F 38 02 /r","V","V","SSSE3","","rw,r","",""
+"PHADDD xmm1, xmm2/m128","PHADDD xmm2/m128, xmm1","phaddd xmm2/m128, xmm1","66 0F 38 02 /r","V","V","SSSE3","","rw,r","",""
+"PHADDSW mm1, mm2/m64","PHADDSW mm2/m64, mm1","phaddsw mm2/m64, mm1","0F 38 03 /r","V","V","SSSE3","","rw,r","",""
+"PHADDSW xmm1, xmm2/m128","PHADDSW xmm2/m128, xmm1","phaddsw xmm2/m128, xmm1","66 0F 38 03 /r","V","V","SSSE3","","rw,r","",""
+"PHADDW mm1, mm2/m64","PHADDW mm2/m64, mm1","phaddw mm2/m64, mm1","0F 38 01 /r","V","V","SSSE3","","rw,r","",""
+"PHADDW xmm1, xmm2/m128","PHADDW xmm2/m128, xmm1","phaddw xmm2/m128, xmm1","66 0F 38 01 /r","V","V","SSSE3","","rw,r","",""
+"PHMINPOSUW xmm1, xmm2/m128","PHMINPOSUW xmm2/m128, xmm1","phminposuw xmm2/m128, xmm1","66 0F 38 41 /r","V","V","SSE4_1","","w,r","",""
+"PHSUBD mm1, mm2/m64","PHSUBD mm2/m64, mm1","phsubd mm2/m64, mm1","0F 38 06 /r","V","V","SSSE3","","rw,r","",""
+"PHSUBD xmm1, xmm2/m128","PHSUBD xmm2/m128, xmm1","phsubd xmm2/m128, xmm1","66 0F 38 06 /r","V","V","SSSE3","","rw,r","",""
+"PHSUBSW mm1, mm2/m64","PHSUBSW mm2/m64, mm1","phsubsw mm2/m64, mm1","0F 38 07 /r","V","V","SSSE3","","rw,r","",""
+"PHSUBSW xmm1, xmm2/m128","PHSUBSW xmm2/m128, xmm1","phsubsw xmm2/m128, xmm1","66 0F 38 07 /r","V","V","SSSE3","","rw,r","",""
+"PHSUBW mm1, mm2/m64","PHSUBW mm2/m64, mm1","phsubw mm2/m64, mm1","0F 38 05 /r","V","V","SSSE3","","rw,r","",""
+"PHSUBW xmm1, xmm2/m128","PHSUBW xmm2/m128, xmm1","phsubw xmm2/m128, xmm1","66 0F 38 05 /r","V","V","SSSE3","","rw,r","",""
+"PI2FD mm1, mm2/m64","PI2FD mm2/m64, mm1","pi2fd mm2/m64, mm1","0F 0F 0D /r","V","V","3DNOW","amd","rw,r","",""
+"PI2FW mm1, mm2/m64","PI2FW mm2/m64, mm1","pi2fw mm2/m64, mm1","0F 0F 0C /r","V","V","3DNOW","amd","rw,r","",""
+"PINSRB xmm1, r32/m8, imm8u","PINSRB imm8u, r32/m8, xmm1","pinsrb imm8u, r32/m8, xmm1","66 0F 3A 20 /r ib","V","V","SSE4_1","","rw,r,r","",""
+"PINSRD xmm1, r/m32, imm8u","PINSRD imm8u, r/m32, xmm1","pinsrd imm8u, r/m32, xmm1","66 0F 3A 22 /r ib","V","V","SSE4_1","operand16,operand32","rw,r,r","",""
+"PINSRQ xmm1, r/m64, imm8u","PINSRQ imm8u, r/m64, xmm1","pinsrq imm8u, r/m64, xmm1","66 REX.W 0F 3A 22 /r ib","N.S.","V","SSE4_1","","rw,r,r","",""
+"PINSRW mm1, r32/m16, imm8u","PINSRW imm8u, r32/m16, mm1","pinsrw imm8u, r32/m16, mm1","0F C4 /r ib","V","V","MMX","","rw,r,r","",""
+"PINSRW xmm1, r32/m16, imm8u","PINSRW imm8u, r32/m16, xmm1","pinsrw imm8u, r32/m16, xmm1","66 0F C4 /r ib","V","V","SSE2","","rw,r,r","",""
+"PMADDUBSW mm1, mm2/m64","PMADDUBSW mm2/m64, mm1","pmaddubsw mm2/m64, mm1","0F 38 04 /r","V","V","SSSE3","","rw,r","",""
+"PMADDUBSW xmm1, xmm2/m128","PMADDUBSW xmm2/m128, xmm1","pmaddubsw xmm2/m128, xmm1","66 0F 38 04 /r","V","V","SSSE3","","rw,r","",""
+"PMADDWD mm1, mm2/m64","PMADDWL mm2/m64, mm1","pmaddwd mm2/m64, mm1","0F F5 /r","V","V","MMX","","rw,r","",""
+"PMADDWD xmm1, xmm2/m128","PMADDWL xmm2/m128, xmm1","pmaddwd xmm2/m128, xmm1","66 0F F5 /r","V","V","SSE2","","rw,r","",""
+"PMAXSB xmm1, xmm2/m128","PMAXSB xmm2/m128, xmm1","pmaxsb xmm2/m128, xmm1","66 0F 38 3C /r","V","V","SSE4_1","","rw,r","",""
+"PMAXSD xmm1, xmm2/m128","PMAXSD xmm2/m128, xmm1","pmaxsd xmm2/m128, xmm1","66 0F 38 3D /r","V","V","SSE4_1","","rw,r","",""
+"PMAXSW mm1, mm2/m64","PMAXSW mm2/m64, mm1","pmaxsw mm2/m64, mm1","0F EE /r","V","V","MMX","","rw,r","",""
+"PMAXSW xmm1, xmm2/m128","PMAXSW xmm2/m128, xmm1","pmaxsw xmm2/m128, xmm1","66 0F EE /r","V","V","SSE2","","rw,r","",""
+"PMAXUB mm1, mm2/m64","PMAXUB mm2/m64, mm1","pmaxub mm2/m64, mm1","0F DE /r","V","V","MMX","","rw,r","",""
+"PMAXUB xmm1, xmm2/m128","PMAXUB xmm2/m128, xmm1","pmaxub xmm2/m128, xmm1","66 0F DE /r","V","V","SSE2","","rw,r","",""
+"PMAXUD xmm1, xmm2/m128","PMAXUD xmm2/m128, xmm1","pmaxud xmm2/m128, xmm1","66 0F 38 3F /r","V","V","SSE4_1","","rw,r","",""
+"PMAXUW xmm1, xmm2/m128","PMAXUW xmm2/m128, xmm1","pmaxuw xmm2/m128, xmm1","66 0F 38 3E /r","V","V","SSE4_1","","rw,r","",""
+"PMINSB xmm1, xmm2/m128","PMINSB xmm2/m128, xmm1","pminsb xmm2/m128, xmm1","66 0F 38 38 /r","V","V","SSE4_1","","rw,r","",""
+"PMINSD xmm1, xmm2/m128","PMINSD xmm2/m128, xmm1","pminsd xmm2/m128, xmm1","66 0F 38 39 /r","V","V","SSE4_1","","rw,r","",""
+"PMINSW mm1, mm2/m64","PMINSW mm2/m64, mm1","pminsw mm2/m64, mm1","0F EA /r","V","V","MMX","","rw,r","",""
+"PMINSW xmm1, xmm2/m128","PMINSW xmm2/m128, xmm1","pminsw xmm2/m128, xmm1","66 0F EA /r","V","V","SSE2","","rw,r","",""
+"PMINUB mm1, mm2/m64","PMINUB mm2/m64, mm1","pminub mm2/m64, mm1","0F DA /r","V","V","MMX","","rw,r","",""
+"PMINUB xmm1, xmm2/m128","PMINUB xmm2/m128, xmm1","pminub xmm2/m128, xmm1","66 0F DA /r","V","V","SSE2","","rw,r","",""
+"PMINUD xmm1, xmm2/m128","PMINUD xmm2/m128, xmm1","pminud xmm2/m128, xmm1","66 0F 38 3B /r","V","V","SSE4_1","","rw,r","",""
+"PMINUW xmm1, xmm2/m128","PMINUW xmm2/m128, xmm1","pminuw xmm2/m128, xmm1","66 0F 38 3A /r","V","V","SSE4_1","","rw,r","",""
+"PMOVMSKB r32, mm2","PMOVMSKB mm2, r32","pmovmskb mm2, r32","0F D7 /r","V","V","SSE","modrm_regonly","w,r","",""
+"PMOVMSKB r32, xmm2","PMOVMSKB xmm2, r32","pmovmskb xmm2, r32","66 0F D7 /r","V","V","SSE2","modrm_regonly","w,r","",""
+"PMOVSXBD xmm1, xmm2/m32","PMOVSXBD xmm2/m32, xmm1","pmovsxbd xmm2/m32, xmm1","66 0F 38 21 /r","V","V","SSE4_1","","w,r","",""
+"PMOVSXBQ xmm1, xmm2/m16","PMOVSXBQ xmm2/m16, xmm1","pmovsxbq xmm2/m16, xmm1","66 0F 38 22 /r","V","V","SSE4_1","","w,r","",""
+"PMOVSXBW xmm1, xmm2/m64","PMOVSXBW xmm2/m64, xmm1","pmovsxbw xmm2/m64, xmm1","66 0F 38 20 /r","V","V","SSE4_1","","w,r","",""
+"PMOVSXDQ xmm1, xmm2/m64","PMOVSXDQ xmm2/m64, xmm1","pmovsxdq xmm2/m64, xmm1","66 0F 38 25 /r","V","V","SSE4_1","","w,r","",""
+"PMOVSXWD xmm1, xmm2/m64","PMOVSXWD xmm2/m64, xmm1","pmovsxwd xmm2/m64, xmm1","66 0F 38 23 /r","V","V","SSE4_1","","w,r","",""
+"PMOVSXWQ xmm1, xmm2/m32","PMOVSXWQ xmm2/m32, xmm1","pmovsxwq xmm2/m32, xmm1","66 0F 38 24 /r","V","V","SSE4_1","","w,r","",""
+"PMOVZXBD xmm1, xmm2/m32","PMOVZXBD xmm2/m32, xmm1","pmovzxbd xmm2/m32, xmm1","66 0F 38 31 /r","V","V","SSE4_1","","w,r","",""
+"PMOVZXBQ xmm1, xmm2/m16","PMOVZXBQ xmm2/m16, xmm1","pmovzxbq xmm2/m16, xmm1","66 0F 38 32 /r","V","V","SSE4_1","","w,r","",""
+"PMOVZXBW xmm1, xmm2/m64","PMOVZXBW xmm2/m64, xmm1","pmovzxbw xmm2/m64, xmm1","66 0F 38 30 /r","V","V","SSE4_1","","w,r","",""
+"PMOVZXDQ xmm1, xmm2/m64","PMOVZXDQ xmm2/m64, xmm1","pmovzxdq xmm2/m64, xmm1","66 0F 38 35 /r","V","V","SSE4_1","","w,r","",""
+"PMOVZXWD xmm1, xmm2/m64","PMOVZXWD xmm2/m64, xmm1","pmovzxwd xmm2/m64, xmm1","66 0F 38 33 /r","V","V","SSE4_1","","w,r","",""
+"PMOVZXWQ xmm1, xmm2/m32","PMOVZXWQ xmm2/m32, xmm1","pmovzxwq xmm2/m32, xmm1","66 0F 38 34 /r","V","V","SSE4_1","","w,r","",""
+"PMULDQ xmm1, xmm2/m128","PMULDQ xmm2/m128, xmm1","pmuldq xmm2/m128, xmm1","66 0F 38 28 /r","V","V","SSE4_1","","rw,r","",""
+"PMULHRSW mm1, mm2/m64","PMULHRSW mm2/m64, mm1","pmulhrsw mm2/m64, mm1","0F 38 0B /r","V","V","SSSE3","","rw,r","",""
+"PMULHRSW xmm1, xmm2/m128","PMULHRSW xmm2/m128, xmm1","pmulhrsw xmm2/m128, xmm1","66 0F 38 0B /r","V","V","SSSE3","","rw,r","",""
+"PMULHRW mm1, mm2/m64","PMULHRW mm2/m64, mm1","pmulhrw mm2/m64, mm1","0F 0F B7 /r","V","V","3DNOW","amd","rw,r","",""
+"PMULHUW mm1, mm2/m64","PMULHUW mm2/m64, mm1","pmulhuw mm2/m64, mm1","0F E4 /r","V","V","MMX","","rw,r","",""
+"PMULHUW xmm1, xmm2/m128","PMULHUW xmm2/m128, xmm1","pmulhuw xmm2/m128, xmm1","66 0F E4 /r","V","V","SSE2","","rw,r","",""
+"PMULHW mm1, mm2/m64","PMULHW mm2/m64, mm1","pmulhw mm2/m64, mm1","0F E5 /r","V","V","MMX","","rw,r","",""
+"PMULHW xmm1, xmm2/m128","PMULHW xmm2/m128, xmm1","pmulhw xmm2/m128, xmm1","66 0F E5 /r","V","V","SSE2","","rw,r","",""
+"PMULLD xmm1, xmm2/m128","PMULLD xmm2/m128, xmm1","pmulld xmm2/m128, xmm1","66 0F 38 40 /r","V","V","SSE4_1","","rw,r","",""
+"PMULLW mm1, mm2/m64","PMULLW mm2/m64, mm1","pmullw mm2/m64, mm1","0F D5 /r","V","V","MMX","","rw,r","",""
+"PMULLW xmm1, xmm2/m128","PMULLW xmm2/m128, xmm1","pmullw xmm2/m128, xmm1","66 0F D5 /r","V","V","SSE2","","rw,r","",""
+"PMULUDQ mm1, mm2/m64","PMULULQ mm2/m64, mm1","pmuludq mm2/m64, mm1","0F F4 /r","V","V","SSE2","","rw,r","",""
+"PMULUDQ xmm1, xmm2/m128","PMULULQ xmm2/m128, xmm1","pmuludq xmm2/m128, xmm1","66 0F F4 /r","V","V","SSE2","","rw,r","",""
+"POPAD","POPAL","popal","61","V","N.S.","","operand32","","",""
+"POPA","POPAW","popaw","61","V","N.S.","","operand16","","",""
+"POPCNT r32, r/m32","POPCNTL r/m32, r32","popcntl r/m32, r32","F3 0F B8 /r","V","V","POPCNT","operand32","w,r","Y","32"
+"POPCNT r64, r/m64","POPCNTQ r/m64, r64","popcntq r/m64, r64","F3 REX.W 0F B8 /r","N.S.","V","POPCNT","","w,r","Y","64"
+"POPCNT r16, r/m16","POPCNTW r/m16, r16","popcntw r/m16, r16","F3 0F B8 /r","V","V","POPCNT","operand16","w,r","Y","16"
+"POPFD","POPFL","popfl","9D","V","N.S.","","operand32","","",""
+"POPFQ","POPFQ","popfq","9D","N.S.","V","","default64","","",""
+"POPF","POPFW","popfw","9D","V","V","","operand16","","",""
+"POP r/m32","POPL r/m32","popl r/m32","8F /0","V","N.S.","","operand32","w","Y","32"
+"POP r32op","POPL r32op","popl r32op","58+rd","V","N.S.","","operand32","w","Y","32"
+"POP r/m64","POPQ r/m64","popq r/m64","8F /0","N.S.","V","","default64","w","Y","64"
+"POP r64op","POPQ r64op","popq r64op","58+ro","N.S.","V","","default64","w","Y","64"
+"POP r/m16","POPW r/m16","popw r/m16","8F /0","V","V","","operand16","w","Y","16"
+"POP r16op","POPW r16op","popw r16op","58+rw","V","V","","operand16","w","Y","16"
+"POP DS","POPW/POPL/POPQ DS","popw/popl/popq DS","1F","V","N.S.","","","w","Y",""
+"POP ES","POPW/POPL/POPQ ES","popw/popl/popq ES","07","V","N.S.","","","w","Y",""
+"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","N.S.","V","","default64","w","Y",""
+"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","V","N.S.","","operand32","w","Y",""
+"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","V","V","","operand16","w","Y",""
+"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","N.S.","V","","default64","w","Y",""
+"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","V","V","","operand16","w","Y",""
+"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","V","N.S.","","operand32","w","Y",""
+"POP SS","POPW/POPL/POPQ SS","popw/popl/popq SS","17","V","N.S.","","","w","Y",""
+"POR mm1, mm2/m64","POR mm2/m64, mm1","por mm2/m64, mm1","0F EB /r","V","V","MMX","","rw,r","",""
+"POR xmm1, xmm2/m128","POR xmm2/m128, xmm1","por xmm2/m128, xmm1","66 0F EB /r","V","V","SSE2","","rw,r","",""
+"PREFETCHNTA m8","PREFETCHNTA m8","prefetchnta m8","0F 18 /0","V","V","","modrm_memonly","r","",""
+"PREFETCHT0 m8","PREFETCHT0 m8","prefetcht0 m8","0F 18 /1","V","V","","modrm_memonly","r","",""
+"PREFETCHT1 m8","PREFETCHT1 m8","prefetcht1 m8","0F 18 /2","V","V","","modrm_memonly","r","",""
+"PREFETCHT2 m8","PREFETCHT2 m8","prefetcht2 m8","0F 18 /3","V","V","","modrm_memonly","r","",""
+"PREFETCHW m8","PREFETCHW m8","prefetchw m8","0F 0D /1","V","V","PRFCHW","modrm_memonly","r","",""
+"PREFETCHWT1 m8","PREFETCHWT1 m8","prefetchwt1 m8","0F 0D /2","V","V","PREFETCHWT1","modrm_memonly","r","",""
+"PREFETCHW_ALIAS m8","PREFETCHW_ALIAS m8","prefetchw_alias m8","0F 0D /3","V","V","PRFCHW","modrm_memonly","r","",""
+"PREFETCH_EXCLUSIVE m8","PREFETCH_EXCLUSIVE m8","prefetch_exclusive m8","0F 0D /0","V","V","PRFCHW","modrm_memonly","r","",""
+"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /2","V","V","PRFCHW","modrm_memonly","r","Y",""
+"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /4","V","V","PRFCHW","modrm_memonly","r","Y",""
+"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /5","V","V","PRFCHW","modrm_memonly","r","Y",""
+"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /6","V","V","PRFCHW","modrm_memonly","r","Y",""
+"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0D /7","V","V","PRFCHW","modrm_memonly","r","Y",""
+"PSADBW mm1, mm2/m64","PSADBW mm2/m64, mm1","psadbw mm2/m64, mm1","0F F6 /r","V","V","MMX","","rw,r","",""
+"PSADBW xmm1, xmm2/m128","PSADBW xmm2/m128, xmm1","psadbw xmm2/m128, xmm1","66 0F F6 /r","V","V","SSE2","","rw,r","",""
+"PSHUFB mm1, mm2/m64","PSHUFB mm2/m64, mm1","pshufb mm2/m64, mm1","0F 38 00 /r","V","V","SSSE3","","rw,r","",""
+"PSHUFB xmm1, xmm2/m128","PSHUFB xmm2/m128, xmm1","pshufb xmm2/m128, xmm1","66 0F 38 00 /r","V","V","SSSE3","","rw,r","",""
+"PSHUFD xmm1, xmm2/m128, imm8u","PSHUFD imm8u, xmm2/m128, xmm1","pshufd imm8u, xmm2/m128, xmm1","66 0F 70 /r ib","V","V","SSE2","","w,r,r","",""
+"PSHUFHW xmm1, xmm2/m128, imm8u","PSHUFHW imm8u, xmm2/m128, xmm1","pshufhw imm8u, xmm2/m128, xmm1","F3 0F 70 /r ib","V","V","SSE2","","w,r,r","",""
+"PSHUFLW xmm1, xmm2/m128, imm8u","PSHUFLW imm8u, xmm2/m128, xmm1","pshuflw imm8u, xmm2/m128, xmm1","F2 0F 70 /r ib","V","V","SSE2","","w,r,r","",""
+"PSHUFW mm1, mm2/m64, imm8u","PSHUFW imm8u, mm2/m64, mm1","pshufw imm8u, mm2/m64, mm1","0F 70 /r ib","V","V","MMX","","w,r,r","",""
+"PSIGNB mm1, mm2/m64","PSIGNB mm2/m64, mm1","psignb mm2/m64, mm1","0F 38 08 /r","V","V","SSSE3","","rw,r","",""
+"PSIGNB xmm1, xmm2/m128","PSIGNB xmm2/m128, xmm1","psignb xmm2/m128, xmm1","66 0F 38 08 /r","V","V","SSSE3","","rw,r","",""
+"PSIGND mm1, mm2/m64","PSIGND mm2/m64, mm1","psignd mm2/m64, mm1","0F 38 0A /r","V","V","SSSE3","","rw,r","",""
+"PSIGND xmm1, xmm2/m128","PSIGND xmm2/m128, xmm1","psignd xmm2/m128, xmm1","66 0F 38 0A /r","V","V","SSSE3","","rw,r","",""
+"PSIGNW mm1, mm2/m64","PSIGNW mm2/m64, mm1","psignw mm2/m64, mm1","0F 38 09 /r","V","V","SSSE3","","rw,r","",""
+"PSIGNW xmm1, xmm2/m128","PSIGNW xmm2/m128, xmm1","psignw xmm2/m128, xmm1","66 0F 38 09 /r","V","V","SSSE3","","rw,r","",""
+"PSLLD mm2, imm8u","PSLLL imm8u, mm2","pslld imm8u, mm2","0F 72 /6 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSLLD xmm2, imm8u","PSLLL imm8u, xmm2","pslld imm8u, xmm2","66 0F 72 /6 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSLLD mm1, mm2/m64","PSLLL mm2/m64, mm1","pslld mm2/m64, mm1","0F F2 /r","V","V","MMX","","rw,r","",""
+"PSLLD xmm1, xmm2/m128","PSLLL xmm2/m128, xmm1","pslld xmm2/m128, xmm1","66 0F F2 /r","V","V","SSE2","","rw,r","",""
+"PSLLDQ xmm2, imm8u","PSLLO imm8u, xmm2","pslldq imm8u, xmm2","66 0F 73 /7 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSLLQ mm2, imm8u","PSLLQ imm8u, mm2","psllq imm8u, mm2","0F 73 /6 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSLLQ xmm2, imm8u","PSLLQ imm8u, xmm2","psllq imm8u, xmm2","66 0F 73 /6 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSLLQ mm1, mm2/m64","PSLLQ mm2/m64, mm1","psllq mm2/m64, mm1","0F F3 /r","V","V","MMX","","rw,r","",""
+"PSLLQ xmm1, xmm2/m128","PSLLQ xmm2/m128, xmm1","psllq xmm2/m128, xmm1","66 0F F3 /r","V","V","SSE2","","rw,r","",""
+"PSLLW mm2, imm8u","PSLLW imm8u, mm2","psllw imm8u, mm2","0F 71 /6 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSLLW xmm2, imm8u","PSLLW imm8u, xmm2","psllw imm8u, xmm2","66 0F 71 /6 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSLLW mm1, mm2/m64","PSLLW mm2/m64, mm1","psllw mm2/m64, mm1","0F F1 /r","V","V","MMX","","rw,r","",""
+"PSLLW xmm1, xmm2/m128","PSLLW xmm2/m128, xmm1","psllw xmm2/m128, xmm1","66 0F F1 /r","V","V","SSE2","","rw,r","",""
+"PSRAD mm2, imm8u","PSRAL imm8u, mm2","psrad imm8u, mm2","0F 72 /4 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSRAD xmm2, imm8u","PSRAL imm8u, xmm2","psrad imm8u, xmm2","66 0F 72 /4 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSRAD mm1, mm2/m64","PSRAL mm2/m64, mm1","psrad mm2/m64, mm1","0F E2 /r","V","V","MMX","","rw,r","",""
+"PSRAD xmm1, xmm2/m128","PSRAL xmm2/m128, xmm1","psrad xmm2/m128, xmm1","66 0F E2 /r","V","V","SSE2","","rw,r","",""
+"PSRAW mm2, imm8u","PSRAW imm8u, mm2","psraw imm8u, mm2","0F 71 /4 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSRAW xmm2, imm8u","PSRAW imm8u, xmm2","psraw imm8u, xmm2","66 0F 71 /4 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSRAW mm1, mm2/m64","PSRAW mm2/m64, mm1","psraw mm2/m64, mm1","0F E1 /r","V","V","MMX","","rw,r","",""
+"PSRAW xmm1, xmm2/m128","PSRAW xmm2/m128, xmm1","psraw xmm2/m128, xmm1","66 0F E1 /r","V","V","SSE2","","rw,r","",""
+"PSRLD mm2, imm8u","PSRLL imm8u, mm2","psrld imm8u, mm2","0F 72 /2 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSRLD xmm2, imm8u","PSRLL imm8u, xmm2","psrld imm8u, xmm2","66 0F 72 /2 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSRLD mm1, mm2/m64","PSRLL mm2/m64, mm1","psrld mm2/m64, mm1","0F D2 /r","V","V","MMX","","rw,r","",""
+"PSRLD xmm1, xmm2/m128","PSRLL xmm2/m128, xmm1","psrld xmm2/m128, xmm1","66 0F D2 /r","V","V","SSE2","","rw,r","",""
+"PSRLDQ xmm2, imm8u","PSRLO imm8u, xmm2","psrldq imm8u, xmm2","66 0F 73 /3 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSRLQ mm2, imm8u","PSRLQ imm8u, mm2","psrlq imm8u, mm2","0F 73 /2 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSRLQ xmm2, imm8u","PSRLQ imm8u, xmm2","psrlq imm8u, xmm2","66 0F 73 /2 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSRLQ mm1, mm2/m64","PSRLQ mm2/m64, mm1","psrlq mm2/m64, mm1","0F D3 /r","V","V","MMX","","rw,r","",""
+"PSRLQ xmm1, xmm2/m128","PSRLQ xmm2/m128, xmm1","psrlq xmm2/m128, xmm1","66 0F D3 /r","V","V","SSE2","","rw,r","",""
+"PSRLW mm2, imm8u","PSRLW imm8u, mm2","psrlw imm8u, mm2","0F 71 /2 ib","V","V","MMX","modrm_regonly","rw,r","",""
+"PSRLW xmm2, imm8u","PSRLW imm8u, xmm2","psrlw imm8u, xmm2","66 0F 71 /2 ib","V","V","SSE2","modrm_regonly","rw,r","",""
+"PSRLW mm1, mm2/m64","PSRLW mm2/m64, mm1","psrlw mm2/m64, mm1","0F D1 /r","V","V","MMX","","rw,r","",""
+"PSRLW xmm1, xmm2/m128","PSRLW xmm2/m128, xmm1","psrlw xmm2/m128, xmm1","66 0F D1 /r","V","V","SSE2","","rw,r","",""
+"PSUBB mm1, mm2/m64","PSUBB mm2/m64, mm1","psubb mm2/m64, mm1","0F F8 /r","V","V","MMX","","rw,r","",""
+"PSUBB xmm1, xmm2/m128","PSUBB xmm2/m128, xmm1","psubb xmm2/m128, xmm1","66 0F F8 /r","V","V","SSE2","","rw,r","",""
+"PSUBD mm1, mm2/m64","PSUBL mm2/m64, mm1","psubd mm2/m64, mm1","0F FA /r","V","V","MMX","","rw,r","",""
+"PSUBD xmm1, xmm2/m128","PSUBL xmm2/m128, xmm1","psubd xmm2/m128, xmm1","66 0F FA /r","V","V","SSE2","","rw,r","",""
+"PSUBQ mm1, mm2/m64","PSUBQ mm2/m64, mm1","psubq mm2/m64, mm1","0F FB /r","V","V","SSE2","","rw,r","",""
+"PSUBQ xmm1, xmm2/m128","PSUBQ xmm2/m128, xmm1","psubq xmm2/m128, xmm1","66 0F FB /r","V","V","SSE2","","rw,r","",""
+"PSUBSB mm1, mm2/m64","PSUBSB mm2/m64, mm1","psubsb mm2/m64, mm1","0F E8 /r","V","V","MMX","","rw,r","",""
+"PSUBSB xmm1, xmm2/m128","PSUBSB xmm2/m128, xmm1","psubsb xmm2/m128, xmm1","66 0F E8 /r","V","V","SSE2","","rw,r","",""
+"PSUBSW mm1, mm2/m64","PSUBSW mm2/m64, mm1","psubsw mm2/m64, mm1","0F E9 /r","V","V","MMX","","rw,r","",""
+"PSUBSW xmm1, xmm2/m128","PSUBSW xmm2/m128, xmm1","psubsw xmm2/m128, xmm1","66 0F E9 /r","V","V","SSE2","","rw,r","",""
+"PSUBUSB mm1, mm2/m64","PSUBUSB mm2/m64, mm1","psubusb mm2/m64, mm1","0F D8 /r","V","V","MMX","","rw,r","",""
+"PSUBUSB xmm1, xmm2/m128","PSUBUSB xmm2/m128, xmm1","psubusb xmm2/m128, xmm1","66 0F D8 /r","V","V","SSE2","","rw,r","",""
+"PSUBUSW mm1, mm2/m64","PSUBUSW mm2/m64, mm1","psubusw mm2/m64, mm1","0F D9 /r","V","V","MMX","","rw,r","",""
+"PSUBUSW xmm1, xmm2/m128","PSUBUSW xmm2/m128, xmm1","psubusw xmm2/m128, xmm1","66 0F D9 /r","V","V","SSE2","","rw,r","",""
+"PSUBW mm1, mm2/m64","PSUBW mm2/m64, mm1","psubw mm2/m64, mm1","0F F9 /r","V","V","MMX","","rw,r","",""
+"PSUBW xmm1, xmm2/m128","PSUBW xmm2/m128, xmm1","psubw xmm2/m128, xmm1","66 0F F9 /r","V","V","SSE2","","rw,r","",""
+"PSWAPD mm1, mm2/m64","PSWAPD mm2/m64, mm1","pswapd mm2/m64, mm1","0F 0F BB /r","V","V","3DNOW","amd","rw,r","",""
+"PTEST xmm1, xmm2/m128","PTEST xmm2/m128, xmm1","ptest xmm2/m128, xmm1","66 0F 38 17 /r","V","V","SSE4_1","","r,r","",""
+"PTWRITE r/m32","PTWRITEL r/m32","ptwritel r/m32","F3 0F AE /4","V","V","","operand16,operand32","r","Y","32"
+"PTWRITE r/m64","PTWRITEQ r/m64","ptwriteq r/m64","F3 REX.W 0F AE /4","N.S.","V","","","r","Y","64"
+"PUNPCKHBW mm1, mm2/m64","PUNPCKHBW mm2/m64, mm1","punpckhbw mm2/m64, mm1","0F 68 /r","V","V","MMX","","rw,r","",""
+"PUNPCKHBW xmm1, xmm2/m128","PUNPCKHBW xmm2/m128, xmm1","punpckhbw xmm2/m128, xmm1","66 0F 68 /r","V","V","SSE2","","rw,r","",""
+"PUNPCKHDQ mm1, mm2/m64","PUNPCKHLQ mm2/m64, mm1","punpckhdq mm2/m64, mm1","0F 6A /r","V","V","MMX","","rw,r","",""
+"PUNPCKHDQ xmm1, xmm2/m128","PUNPCKHLQ xmm2/m128, xmm1","punpckhdq xmm2/m128, xmm1","66 0F 6A /r","V","V","SSE2","","rw,r","",""
+"PUNPCKHQDQ xmm1, xmm2/m128","PUNPCKHQDQ xmm2/m128, xmm1","punpckhqdq xmm2/m128, xmm1","66 0F 6D /r","V","V","SSE2","","rw,r","",""
+"PUNPCKHWD mm1, mm2/m64","PUNPCKHWL mm2/m64, mm1","punpckhwd mm2/m64, mm1","0F 69 /r","V","V","MMX","","rw,r","",""
+"PUNPCKHWD xmm1, xmm2/m128","PUNPCKHWL xmm2/m128, xmm1","punpckhwd xmm2/m128, xmm1","66 0F 69 /r","V","V","SSE2","","rw,r","",""
+"PUNPCKLBW mm1, mm2/m32","PUNPCKLBW mm2/m32, mm1","punpcklbw mm2/m32, mm1","0F 60 /r","V","V","MMX","","rw,r","",""
+"PUNPCKLBW xmm1, xmm2/m128","PUNPCKLBW xmm2/m128, xmm1","punpcklbw xmm2/m128, xmm1","66 0F 60 /r","V","V","SSE2","","rw,r","",""
+"PUNPCKLDQ mm1, mm2/m32","PUNPCKLLQ mm2/m32, mm1","punpckldq mm2/m32, mm1","0F 62 /r","V","V","MMX","","rw,r","",""
+"PUNPCKLDQ xmm1, xmm2/m128","PUNPCKLLQ xmm2/m128, xmm1","punpckldq xmm2/m128, xmm1","66 0F 62 /r","V","V","SSE2","","rw,r","",""
+"PUNPCKLQDQ xmm1, xmm2/m128","PUNPCKLQDQ xmm2/m128, xmm1","punpcklqdq xmm2/m128, xmm1","66 0F 6C /r","V","V","SSE2","","rw,r","",""
+"PUNPCKLWD mm1, mm2/m32","PUNPCKLWL mm2/m32, mm1","punpcklwd mm2/m32, mm1","0F 61 /r","V","V","MMX","","rw,r","",""
+"PUNPCKLWD xmm1, xmm2/m128","PUNPCKLWL xmm2/m128, xmm1","punpcklwd xmm2/m128, xmm1","66 0F 61 /r","V","V","SSE2","","rw,r","",""
+"PUSHAD","PUSHAL","pushal","60","V","N.S.","","operand32","","",""
+"PUSHA","PUSHAW","pushaw","60","V","N.S.","","operand16","","",""
+"PUSHFD","PUSHFL","pushfl","9C","V","N.S.","","operand32","","",""
+"PUSHFQ","PUSHFQ","pushfq","9C","N.S.","V","","default64","","",""
+"PUSHF","PUSHFW","pushfw","9C","V","V","","operand16","","",""
+"PUSH r/m32","PUSHL r/m32","pushl r/m32","FF /6","V","N.S.","","operand32","r","Y","32"
+"PUSH r32op","PUSHL r32op","pushl r32op","50+rd","V","N.S.","","operand32","r","Y","32"
+"PUSH r/m64","PUSHQ r/m64","pushq r/m64","FF /6","N.S.","V","","default64","r","Y","64"
+"PUSH r64op","PUSHQ r64op","pushq r64op","50+ro","N.S.","V","","default64","r","Y","64"
+"PUSH imm16","PUSHW imm16","pushw imm16","68 iw","V","V","","operand16","r","Y",""
+"PUSH r/m16","PUSHW r/m16","pushw r/m16","FF /6","V","V","","operand16","r","Y","16"
+"PUSH r16op","PUSHW r16op","pushw r16op","50+rw","V","V","","operand16","r","Y","16"
+"PUSH CS","PUSHW/PUSHL/PUSHQ CS","pushw/pushl/pushq CS","0E","V","N.S.","","","r","Y",""
+"PUSH DS","PUSHW/PUSHL/PUSHQ DS","pushw/pushl/pushq DS","1E","V","N.S.","","","r","Y",""
+"PUSH ES","PUSHW/PUSHL/PUSHQ ES","pushw/pushl/pushq ES","06","V","N.S.","","","r","Y",""
+"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","V","V","","operand16","r","Y",""
+"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","N.S.","V","","default64","r","Y",""
+"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","V","N.S.","","operand32","r","Y",""
+"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","N.S.","V","","default64","r","Y",""
+"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","V","N.S.","","operand32","r","Y",""
+"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","V","V","","operand16","r","Y",""
+"PUSH SS","PUSHW/PUSHL/PUSHQ SS","pushw/pushl/pushq SS","16","V","N.S.","","","r","Y",""
+"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","V","N.S.","","operand32","r","Y",""
+"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","N.S.","V","","default64","r","Y",""
+"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","V","V","","operand16","r","Y",""
+"PXOR mm1, mm2/m64","PXOR mm2/m64, mm1","pxor mm2/m64, mm1","0F EF /r","V","V","MMX","","rw,r","",""
+"PXOR xmm1, xmm2/m128","PXOR xmm2/m128, xmm1","pxor xmm2/m128, xmm1","66 0F EF /r","V","V","SSE2","","rw,r","",""
+"RCL r/m8, 1","RCLB 1, r/m8","rclb 1, r/m8","D0 /2","V","V","","","rw,r","Y","8"
+"RCL r/m8, 1","RCLB 1, r/m8","rclb 1, r/m8","REX D0 /2","N.E.","V","","pseudo64","w,r","Y","8"
+"RCL r/m8, CL","RCLB CL, r/m8","rclb CL, r/m8","D2 /2","V","V","","","rw,r","Y","8"
+"RCL r/m8, CL","RCLB CL, r/m8","rclb CL, r/m8","REX D2 /2","N.E.","V","","pseudo64","w,r","Y","8"
+"RCL r/m8, imm8","RCLB imm8, r/m8","rclb imm8, r/m8","REX C0 /2 ib","N.E.","V","","pseudo64","w,r","Y","8"
+"RCL r/m8, imm8u","RCLB imm8u, r/m8","rclb imm8u, r/m8","C0 /2 ib","V","V","","","rw,r","Y","8"
+"RCL r/m32, 1","RCLL 1, r/m32","rcll 1, r/m32","D1 /2","V","V","","operand32","rw,r","Y","32"
+"RCL r/m32, CL","RCLL CL, r/m32","rcll CL, r/m32","D3 /2","V","V","","operand32","rw,r","Y","32"
+"RCL r/m32, imm8u","RCLL imm8u, r/m32","rcll imm8u, r/m32","C1 /2 ib","V","V","","operand32","rw,r","Y","32"
+"RCL r/m64, 1","RCLQ 1, r/m64","rclq 1, r/m64","REX.W D1 /2","N.S.","V","","","rw,r","Y","64"
+"RCL r/m64, CL","RCLQ CL, r/m64","rclq CL, r/m64","REX.W D3 /2","N.S.","V","","","rw,r","Y","64"
+"RCL r/m64, imm8u","RCLQ imm8u, r/m64","rclq imm8u, r/m64","REX.W C1 /2 ib","N.S.","V","","","rw,r","Y","64"
+"RCL r/m16, 1","RCLW 1, r/m16","rclw 1, r/m16","D1 /2","V","V","","operand16","rw,r","Y","16"
+"RCL r/m16, CL","RCLW CL, r/m16","rclw CL, r/m16","D3 /2","V","V","","operand16","rw,r","Y","16"
+"RCL r/m16, imm8u","RCLW imm8u, r/m16","rclw imm8u, r/m16","C1 /2 ib","V","V","","operand16","rw,r","Y","16"
+"RCPPS xmm1, xmm2/m128","RCPPS xmm2/m128, xmm1","rcpps xmm2/m128, xmm1","0F 53 /r","V","V","SSE","","w,r","",""
+"RCPSS xmm1, xmm2/m32","RCPSS xmm2/m32, xmm1","rcpss xmm2/m32, xmm1","F3 0F 53 /r","V","V","SSE","","w,r","",""
+"RCR r/m8, 1","RCRB 1, r/m8","rcrb 1, r/m8","D0 /3","V","V","","","rw,r","Y","8"
+"RCR r/m8, 1","RCRB 1, r/m8","rcrb 1, r/m8","REX D0 /3","N.E.","V","","pseudo64","w,r","Y","8"
+"RCR r/m8, CL","RCRB CL, r/m8","rcrb CL, r/m8","D2 /3","V","V","","","rw,r","Y","8"
+"RCR r/m8, CL","RCRB CL, r/m8","rcrb CL, r/m8","REX D2 /3","N.E.","V","","pseudo64","w,r","Y","8"
+"RCR r/m8, imm8","RCRB imm8, r/m8","rcrb imm8, r/m8","REX C0 /3 ib","N.E.","V","","pseudo64","w,r","Y","8"
+"RCR r/m8, imm8u","RCRB imm8u, r/m8","rcrb imm8u, r/m8","C0 /3 ib","V","V","","","rw,r","Y","8"
+"RCR r/m32, 1","RCRL 1, r/m32","rcrl 1, r/m32","D1 /3","V","V","","operand32","rw,r","Y","32"
+"RCR r/m32, CL","RCRL CL, r/m32","rcrl CL, r/m32","D3 /3","V","V","","operand32","rw,r","Y","32"
+"RCR r/m32, imm8u","RCRL imm8u, r/m32","rcrl imm8u, r/m32","C1 /3 ib","V","V","","operand32","rw,r","Y","32"
+"RCR r/m64, 1","RCRQ 1, r/m64","rcrq 1, r/m64","REX.W D1 /3","N.S.","V","","","rw,r","Y","64"
+"RCR r/m64, CL","RCRQ CL, r/m64","rcrq CL, r/m64","REX.W D3 /3","N.S.","V","","","rw,r","Y","64"
+"RCR r/m64, imm8u","RCRQ imm8u, r/m64","rcrq imm8u, r/m64","REX.W C1 /3 ib","N.S.","V","","","rw,r","Y","64"
+"RCR r/m16, 1","RCRW 1, r/m16","rcrw 1, r/m16","D1 /3","V","V","","operand16","rw,r","Y","16"
+"RCR r/m16, CL","RCRW CL, r/m16","rcrw CL, r/m16","D3 /3","V","V","","operand16","rw,r","Y","16"
+"RCR r/m16, imm8u","RCRW imm8u, r/m16","rcrw imm8u, r/m16","C1 /3 ib","V","V","","operand16","rw,r","Y","16"
+"RDFSBASE rmr32","RDFSBASEL rmr32","rdfsbase rmr32","F3 0F AE /0","N.S.","V","FSGSBASE","modrm_regonly,operand16,operand32","w","Y","32"
+"RDFSBASE rmr64","RDFSBASEQ rmr64","rdfsbase rmr64","F3 REX.W 0F AE /0","N.S.","V","FSGSBASE","modrm_regonly","w","Y","64"
+"RDGSBASE rmr32","RDGSBASEL rmr32","rdgsbase rmr32","F3 0F AE /1","N.S.","V","FSGSBASE","modrm_regonly,operand16,operand32","w","Y","32"
+"RDGSBASE rmr64","RDGSBASEQ rmr64","rdgsbase rmr64","F3 REX.W 0F AE /1","N.S.","V","FSGSBASE","modrm_regonly","w","Y","64"
+"RDMSR","RDMSR","rdmsr","0F 32","V","V","Pentium","","","",""
+"RDPKRU","RDPKRU","rdpkru","0F 01 EE","V","V","PKU","","","",""
+"RDPMC","RDPMC","rdpmc","0F 33","V","V","","","","",""
+"RDRAND rmr32","RDRANDL rmr32","rdrand rmr32","0F C7 /6","V","V","RDRAND","modrm_regonly,operand32","w","Y","32"
+"RDRAND rmr64","RDRANDQ rmr64","rdrand rmr64","REX.W 0F C7 /6","N.S.","V","RDRAND","modrm_regonly","w","Y","64"
+"RDRAND rmr16","RDRANDW rmr16","rdrand rmr16","0F C7 /6","V","V","RDRAND","modrm_regonly,operand16","w","Y","16"
+"RDSEED rmr32","RDSEEDL rmr32","rdseed rmr32","0F C7 /7","V","V","RDSEED","modrm_regonly,operand32","w","Y","32"
+"RDSEED rmr64","RDSEEDQ rmr64","rdseed rmr64","REX.W 0F C7 /7","N.S.","V","RDSEED","modrm_regonly","w","Y","64"
+"RDSEED rmr16","RDSEEDW rmr16","rdseed rmr16","0F C7 /7","V","V","RDSEED","modrm_regonly,operand16","w","Y","16"
+"RDSSPD rmr32","RDSSPD rmr32","rdsspd rmr32","F3 0F 1E /1","V","V","CET","modrm_regonly,operand16,operand32","w","",""
+"RDSSPQ rmr64","RDSSPQ rmr64","rdsspq rmr64","F3 REX.W 0F 1E /1","N.S.","V","CET","modrm_regonly","w","",""
+"RDTSC","RDTSC","rdtsc","0F 31","V","V","Pentium","","","",""
+"RDTSCP","RDTSCP","rdtscp","0F 01 F9","V","V","RDTSCP","","","",""
+"RET_FAR","RETFW/RETFL/RETFQ","lretw/lretl/lretq","CB","V","V","","","","",""
+"RET_FAR imm16u","RETFW/RETFL/RETFQ imm16u","lretw/lretl/lretq imm16u","CA iw","V","V","","","r","",""
+"RET","RETW/RETL/RETQ","retw/retl/retq","C3","N.S.","V","","default64","","",""
+"RET","RETW/RETL/RETQ","retw/retl/retq","C3","V","N.S.","","","","",""
+"RET imm16u","RETW/RETL/RETQ imm16u","retw/retl/retq imm16u","C2 iw","N.S.","V","","default64","r","",""
+"RET imm16u","RETW/RETL/RETQ imm16u","retw/retl/retq imm16u","C2 iw","V","N.S.","","","r","",""
+"ROL r/m8, 1","ROLB 1, r/m8","rolb 1, r/m8","D0 /0","V","V","","","rw,r","Y","8"
+"ROL r/m8, 1","ROLB 1, r/m8","rolb 1, r/m8","REX D0 /0","N.E.","V","","pseudo64","w,r","Y","8"
+"ROL r/m8, CL","ROLB CL, r/m8","rolb CL, r/m8","D2 /0","V","V","","","rw,r","Y","8"
+"ROL r/m8, CL","ROLB CL, r/m8","rolb CL, r/m8","REX D2 /0","N.E.","V","","pseudo64","w,r","Y","8"
+"ROL r/m8, imm8","ROLB imm8, r/m8","rolb imm8, r/m8","REX C0 /0 ib","N.E.","V","","pseudo64","w,r","Y","8"
+"ROL r/m8, imm8u","ROLB imm8u, r/m8","rolb imm8u, r/m8","C0 /0 ib","V","V","","","rw,r","Y","8"
+"ROL r/m32, 1","ROLL 1, r/m32","roll 1, r/m32","D1 /0","V","V","","operand32","rw,r","Y","32"
+"ROL r/m32, CL","ROLL CL, r/m32","roll CL, r/m32","D3 /0","V","V","","operand32","rw,r","Y","32"
+"ROL r/m32, imm8u","ROLL imm8u, r/m32","roll imm8u, r/m32","C1 /0 ib","V","V","","operand32","rw,r","Y","32"
+"ROL r/m64, 1","ROLQ 1, r/m64","rolq 1, r/m64","REX.W D1 /0","N.S.","V","","","rw,r","Y","64"
+"ROL r/m64, CL","ROLQ CL, r/m64","rolq CL, r/m64","REX.W D3 /0","N.S.","V","","","rw,r","Y","64"
+"ROL r/m64, imm8u","ROLQ imm8u, r/m64","rolq imm8u, r/m64","REX.W C1 /0 ib","N.S.","V","","","rw,r","Y","64"
+"ROL r/m16, 1","ROLW 1, r/m16","rolw 1, r/m16","D1 /0","V","V","","operand16","rw,r","Y","16"
+"ROL r/m16, CL","ROLW CL, r/m16","rolw CL, r/m16","D3 /0","V","V","","operand16","rw,r","Y","16"
+"ROL r/m16, imm8u","ROLW imm8u, r/m16","rolw imm8u, r/m16","C1 /0 ib","V","V","","operand16","rw,r","Y","16"
+"ROR r/m8, 1","RORB 1, r/m8","rorb 1, r/m8","D0 /1","V","V","","","rw,r","Y","8"
+"ROR r/m8, 1","RORB 1, r/m8","rorb 1, r/m8","REX D0 /1","N.E.","V","","pseudo64","w,r","Y","8"
+"ROR r/m8, CL","RORB CL, r/m8","rorb CL, r/m8","D2 /1","V","V","","","rw,r","Y","8"
+"ROR r/m8, CL","RORB CL, r/m8","rorb CL, r/m8","REX D2 /1","N.E.","V","","pseudo64","w,r","Y","8"
+"ROR r/m8, imm8","RORB imm8, r/m8","rorb imm8, r/m8","REX C0 /1 ib","N.E.","V","","pseudo64","w,r","Y","8"
+"ROR r/m8, imm8u","RORB imm8u, r/m8","rorb imm8u, r/m8","C0 /1 ib","V","V","","","rw,r","Y","8"
+"ROR r/m32, 1","RORL 1, r/m32","rorl 1, r/m32","D1 /1","V","V","","operand32","rw,r","Y","32"
+"ROR r/m32, CL","RORL CL, r/m32","rorl CL, r/m32","D3 /1","V","V","","operand32","rw,r","Y","32"
+"ROR r/m32, imm8u","RORL imm8u, r/m32","rorl imm8u, r/m32","C1 /1 ib","V","V","","operand32","rw,r","Y","32"
+"ROR r/m64, 1","RORQ 1, r/m64","rorq 1, r/m64","REX.W D1 /1","N.S.","V","","","rw,r","Y","64"
+"ROR r/m64, CL","RORQ CL, r/m64","rorq CL, r/m64","REX.W D3 /1","N.S.","V","","","rw,r","Y","64"
+"ROR r/m64, imm8u","RORQ imm8u, r/m64","rorq imm8u, r/m64","REX.W C1 /1 ib","N.S.","V","","","rw,r","Y","64"
+"ROR r/m16, 1","RORW 1, r/m16","rorw 1, r/m16","D1 /1","V","V","","operand16","rw,r","Y","16"
+"ROR r/m16, CL","RORW CL, r/m16","rorw CL, r/m16","D3 /1","V","V","","operand16","rw,r","Y","16"
+"ROR r/m16, imm8u","RORW imm8u, r/m16","rorw imm8u, r/m16","C1 /1 ib","V","V","","operand16","rw,r","Y","16"
+"RORX r32, r/m32, imm8u","RORXL imm8u, r/m32, r32","rorxl imm8u, r/m32, r32","VEX.128.F2.0F3A.W0 F0 /r ib","V","V","BMI2","","w,r,r","Y","32"
+"RORX r64, r/m64, imm8u","RORXQ imm8u, r/m64, r64","rorxq imm8u, r/m64, r64","VEX.128.F2.0F3A.W1 F0 /r ib","N.S.","V","BMI2","","w,r,r","Y","64"
+"ROUNDPD xmm1, xmm2/m128, imm8u","ROUNDPD imm8u, xmm2/m128, xmm1","roundpd imm8u, xmm2/m128, xmm1","66 0F 3A 09 /r ib","V","V","SSE4_1","","w,r,r","",""
+"ROUNDPS xmm1, xmm2/m128, imm8u","ROUNDPS imm8u, xmm2/m128, xmm1","roundps imm8u, xmm2/m128, xmm1","66 0F 3A 08 /r ib","V","V","SSE4_1","","w,r,r","",""
+"ROUNDSD xmm1, xmm2/m64, imm8u","ROUNDSD imm8u, xmm2/m64, xmm1","roundsd imm8u, xmm2/m64, xmm1","66 0F 3A 0B /r ib","V","V","SSE4_1","","w,r,r","",""
+"ROUNDSS xmm1, xmm2/m32, imm8u","ROUNDSS imm8u, xmm2/m32, xmm1","roundss imm8u, xmm2/m32, xmm1","66 0F 3A 0A /r ib","V","V","SSE4_1","","w,r,r","",""
+"RSM","RSM","rsm","0F AA","V","V","","","","",""
+"RSQRTPS xmm1, xmm2/m128","RSQRTPS xmm2/m128, xmm1","rsqrtps xmm2/m128, xmm1","0F 52 /r","V","V","SSE","","w,r","",""
+"RSQRTSS xmm1, xmm2/m32","RSQRTSS xmm2/m32, xmm1","rsqrtss xmm2/m32, xmm1","F3 0F 52 /r","V","V","SSE","","w,r","",""
+"RSTORSSP m64","RSTORSSP m64","rstorssp m64","F3 0F 01 /5","V","V","CET","modrm_memonly","rw","",""
+"SAHF","SAHF","sahf","9E","V","V","LAHFSAHF","","","",""
+"SAL r/m8, 1","SALB 1, r/m8","salb 1, r/m8","D0 /4","V","V","","pseudo","rw,r","Y","8"
+"SAL r/m8, 1","SALB 1, r/m8","salb 1, r/m8","REX D0 /4","N.E.","V","","pseudo","rw,r","Y","8"
+"SAL r/m8, CL","SALB CL, r/m8","salb CL, r/m8","D2 /4","V","V","","pseudo","rw,r","Y","8"
+"SAL r/m8, CL","SALB CL, r/m8","salb CL, r/m8","REX D2 /4","N.E.","V","","pseudo","rw,r","Y","8"
+"SAL r/m8, imm8","SALB imm8, r/m8","salb imm8, r/m8","C0 /4 ib","V","V","","pseudo","rw,r","Y","8"
+"SAL r/m8, imm8","SALB imm8, r/m8","salb imm8, r/m8","REX C0 /4 ib","N.E.","V","","pseudo","rw,r","Y","8"
+"SALC","SALC","salc","D6","V","N.S.","","","","",""
+"SAL r/m32, 1","SALL 1, r/m32","sall 1, r/m32","D1 /4","V","V","","operand32,pseudo","rw,r","Y","32"
+"SAL r/m32, CL","SALL CL, r/m32","sall CL, r/m32","D3 /4","V","V","","operand32,pseudo","rw,r","Y","32"
+"SAL r/m32, imm8","SALL imm8, r/m32","sall imm8, r/m32","C1 /4 ib","V","V","","operand32,pseudo","rw,r","Y","32"
+"SAL r/m64, 1","SALQ 1, r/m64","salq 1, r/m64","REX.W D1 /4","N.E.","V","","pseudo","rw,r","Y","64"
+"SAL r/m64, CL","SALQ CL, r/m64","salq CL, r/m64","REX.W D3 /4","N.E.","V","","pseudo","rw,r","Y","64"
+"SAL r/m64, imm8","SALQ imm8, r/m64","salq imm8, r/m64","REX.W C1 /4 ib","N.E.","V","","pseudo","rw,r","Y","64"
+"SAL r/m16, 1","SALW 1, r/m16","salw 1, r/m16","D1 /4","V","V","","operand16,pseudo","rw,r","Y","16"
+"SAL r/m16, CL","SALW CL, r/m16","salw CL, r/m16","D3 /4","V","V","","operand16,pseudo","rw,r","Y","16"
+"SAL r/m16, imm8","SALW imm8, r/m16","salw imm8, r/m16","C1 /4 ib","V","V","","operand16,pseudo","rw,r","Y","16"
+"SAR r/m8, 1","SARB 1, r/m8","sarb 1, r/m8","D0 /7","V","V","","","rw,r","Y","8"
+"SAR r/m8, 1","SARB 1, r/m8","sarb 1, r/m8","REX D0 /7","N.E.","V","","pseudo64","rw,r","Y","8"
+"SAR r/m8, CL","SARB CL, r/m8","sarb CL, r/m8","D2 /7","V","V","","","rw,r","Y","8"
+"SAR r/m8, CL","SARB CL, r/m8","sarb CL, r/m8","REX D2 /7","N.E.","V","","pseudo64","rw,r","Y","8"
+"SAR r/m8, imm8","SARB imm8, r/m8","sarb imm8, r/m8","REX C0 /7 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"SAR r/m8, imm8u","SARB imm8u, r/m8","sarb imm8u, r/m8","C0 /7 ib","V","V","","","rw,r","Y","8"
+"SAR r/m32, 1","SARL 1, r/m32","sarl 1, r/m32","D1 /7","V","V","","operand32","rw,r","Y","32"
+"SAR r/m32, CL","SARL CL, r/m32","sarl CL, r/m32","D3 /7","V","V","","operand32","rw,r","Y","32"
+"SAR r/m32, imm8u","SARL imm8u, r/m32","sarl imm8u, r/m32","C1 /7 ib","V","V","","operand32","rw,r","Y","32"
+"SAR r/m64, 1","SARQ 1, r/m64","sarq 1, r/m64","REX.W D1 /7","N.S.","V","","","rw,r","Y","64"
+"SAR r/m64, CL","SARQ CL, r/m64","sarq CL, r/m64","REX.W D3 /7","N.S.","V","","","rw,r","Y","64"
+"SAR r/m64, imm8u","SARQ imm8u, r/m64","sarq imm8u, r/m64","REX.W C1 /7 ib","N.S.","V","","","rw,r","Y","64"
+"SAR r/m16, 1","SARW 1, r/m16","sarw 1, r/m16","D1 /7","V","V","","operand16","rw,r","Y","16"
+"SAR r/m16, CL","SARW CL, r/m16","sarw CL, r/m16","D3 /7","V","V","","operand16","rw,r","Y","16"
+"SAR r/m16, imm8u","SARW imm8u, r/m16","sarw imm8u, r/m16","C1 /7 ib","V","V","","operand16","rw,r","Y","16"
+"SARX r32, r/m32, r32V","SARXL r32V, r/m32, r32","sarxl r32V, r/m32, r32","VEX.NDS.128.F3.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32"
+"SARX r64, r/m64, r64V","SARXQ r64V, r/m64, r64","sarxq r64V, r/m64, r64","VEX.NDS.128.F3.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64"
+"SAVESSP","SAVESSP","savessp","F3 0F 01 EA","V","V","CET","","","",""
+"SBB AL, imm8","SBBB imm8, AL","sbbb imm8, AL","1C ib","V","V","","","rw,r","Y","8"
+"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","80 /3 ib","V","V","","","rw,r","Y","8"
+"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","82 /3 ib","V","N.S.","","","rw,r","Y","8"
+"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","REX 80 /3 ib","N.E.","V","","pseudo64","w,r","Y","8"
+"SBB r8, r/m8","SBBB r/m8, r8","sbbb r/m8, r8","1A /r","V","V","","","rw,r","Y","8"
+"SBB r8, r/m8","SBBB r/m8, r8","sbbb r/m8, r8","REX 1A /r","N.E.","V","","pseudo64","w,r","Y","8"
+"SBB r/m8, r8","SBBB r8, r/m8","sbbb r8, r/m8","18 /r","V","V","","","rw,r","Y","8"
+"SBB r/m8, r8","SBBB r8, r/m8","sbbb r8, r/m8","REX 18 /r","N.E.","V","","pseudo64","w,r","Y","8"
+"SBB EAX, imm32","SBBL imm32, EAX","sbbl imm32, EAX","1D id","V","V","","operand32","rw,r","Y","32"
+"SBB r/m32, imm32","SBBL imm32, r/m32","sbbl imm32, r/m32","81 /3 id","V","V","","operand32","rw,r","Y","32"
+"SBB r/m32, imm8","SBBL imm8, r/m32","sbbl imm8, r/m32","83 /3 ib","V","V","","operand32","rw,r","Y","32"
+"SBB r32, r/m32","SBBL r/m32, r32","sbbl r/m32, r32","1B /r","V","V","","operand32","rw,r","Y","32"
+"SBB r/m32, r32","SBBL r32, r/m32","sbbl r32, r/m32","19 /r","V","V","","operand32","rw,r","Y","32"
+"SBB RAX, imm32","SBBQ imm32, RAX","sbbq imm32, RAX","REX.W 1D id","N.S.","V","","","rw,r","Y","64"
+"SBB r/m64, imm32","SBBQ imm32, r/m64","sbbq imm32, r/m64","REX.W 81 /3 id","N.S.","V","","","rw,r","Y","64"
+"SBB r/m64, imm8","SBBQ imm8, r/m64","sbbq imm8, r/m64","REX.W 83 /3 ib","N.S.","V","","","rw,r","Y","64"
+"SBB r64, r/m64","SBBQ r/m64, r64","sbbq r/m64, r64","REX.W 1B /r","N.S.","V","","","rw,r","Y","64"
+"SBB r/m64, r64","SBBQ r64, r/m64","sbbq r64, r/m64","REX.W 19 /r","N.S.","V","","","rw,r","Y","64"
+"SBB AX, imm16","SBBW imm16, AX","sbbw imm16, AX","1D iw","V","V","","operand16","rw,r","Y","16"
+"SBB r/m16, imm16","SBBW imm16, r/m16","sbbw imm16, r/m16","81 /3 iw","V","V","","operand16","rw,r","Y","16"
+"SBB r/m16, imm8","SBBW imm8, r/m16","sbbw imm8, r/m16","83 /3 ib","V","V","","operand16","rw,r","Y","16"
+"SBB r16, r/m16","SBBW r/m16, r16","sbbw r/m16, r16","1B /r","V","V","","operand16","rw,r","Y","16"
+"SBB r/m16, r16","SBBW r16, r/m16","sbbw r16, r/m16","19 /r","V","V","","operand16","rw,r","Y","16"
+"SCASB","SCASB","scasb","AE","V","V","","","","",""
+"SCASD","SCASL","scasl","AF","V","V","","operand32","","",""
+"SCASQ","SCASQ","scasq","REX.W AF","N.S.","V","","","","",""
+"SCASW","SCASW","scasw","AF","V","V","","operand16","","",""
+"SETAE r/m8","SETCC r/m8","setae r/m8","0F 93 /r","V","V","","","w","",""
+"SETNB r/m8","SETCC r/m8","setnb r/m8","0F 93 /r","V","V","","pseudo","r","",""
+"SETNC r/m8","SETCC r/m8","setnc r/m8","0F 93 /r","V","V","","pseudo","r","",""
+"SETAE r/m8","SETCC r/m8","setae r/m8","REX 0F 93 /r","N.E.","V","","pseudo64","r","",""
+"SETNB r/m8","SETCC r/m8","setnb r/m8","REX 0F 93 /r","N.E.","V","","pseudo","r","",""
+"SETNC r/m8","SETCC r/m8","setnc r/m8","REX 0F 93 /r","N.E.","V","","pseudo","r","",""
+"SETB r/m8","SETCS r/m8","setb r/m8","0F 92 /r","V","V","","","w","",""
+"SETC r/m8","SETCS r/m8","setc r/m8","0F 92 /r","V","V","","pseudo","r","",""
+"SETNAE r/m8","SETCS r/m8","setnae r/m8","0F 92 /r","V","V","","pseudo","r","",""
+"SETB r/m8","SETCS r/m8","setb r/m8","REX 0F 92 /r","N.E.","V","","pseudo64","r","",""
+"SETC r/m8","SETCS r/m8","setc r/m8","REX 0F 92 /r","N.E.","V","","pseudo","r","",""
+"SETNAE r/m8","SETCS r/m8","setnae r/m8","REX 0F 92 /r","N.E.","V","","pseudo","r","",""
+"SETE r/m8","SETEQ r/m8","sete r/m8","0F 94 /r","V","V","","","w","",""
+"SETZ r/m8","SETEQ r/m8","setz r/m8","0F 94 /r","V","V","","pseudo","r","",""
+"SETE r/m8","SETEQ r/m8","sete r/m8","REX 0F 94 /r","N.E.","V","","pseudo64","r","",""
+"SETZ r/m8","SETEQ r/m8","setz r/m8","REX 0F 94 /r","N.E.","V","","pseudo","r","",""
+"SETGE r/m8","SETGE r/m8","setge r/m8","0F 9D /r","V","V","","","w","",""
+"SETNL r/m8","SETGE r/m8","setnl r/m8","0F 9D /r","V","V","","pseudo","r","",""
+"SETGE r/m8","SETGE r/m8","setge r/m8","REX 0F 9D /r","N.E.","V","","pseudo64","r","",""
+"SETNL r/m8","SETGE r/m8","setnl r/m8","REX 0F 9D /r","N.E.","V","","pseudo","r","",""
+"SETG r/m8","SETGT r/m8","setg r/m8","0F 9F /r","V","V","","","w","",""
+"SETNLE r/m8","SETGT r/m8","setnle r/m8","0F 9F /r","V","V","","pseudo","r","",""
+"SETG r/m8","SETGT r/m8","setg r/m8","REX 0F 9F /r","N.E.","V","","pseudo64","r","",""
+"SETNLE r/m8","SETGT r/m8","setnle r/m8","REX 0F 9F /r","N.E.","V","","pseudo","r","",""
+"SETA r/m8","SETHI r/m8","seta r/m8","0F 97 /r","V","V","","","w","",""
+"SETNBE r/m8","SETHI r/m8","setnbe r/m8","0F 97 /r","V","V","","pseudo","r","",""
+"SETA r/m8","SETHI r/m8","seta r/m8","REX 0F 97 /r","N.E.","V","","pseudo64","r","",""
+"SETNBE r/m8","SETHI r/m8","setnbe r/m8","REX 0F 97 /r","N.E.","V","","pseudo","r","",""
+"SETLE r/m8","SETLE r/m8","setle r/m8","0F 9E /r","V","V","","","w","",""
+"SETNG r/m8","SETLE r/m8","setng r/m8","0F 9E /r","V","V","","pseudo","r","",""
+"SETLE r/m8","SETLE r/m8","setle r/m8","REX 0F 9E /r","N.E.","V","","pseudo64","r","",""
+"SETNG r/m8","SETLE r/m8","setng r/m8","REX 0F 9E /r","N.E.","V","","pseudo","r","",""
+"SETBE r/m8","SETLS r/m8","setbe r/m8","0F 96 /r","V","V","","","w","",""
+"SETNA r/m8","SETLS r/m8","setna r/m8","0F 96 /r","V","V","","pseudo","r","",""
+"SETBE r/m8","SETLS r/m8","setbe r/m8","REX 0F 96 /r","N.E.","V","","pseudo64","r","",""
+"SETNA r/m8","SETLS r/m8","setna r/m8","REX 0F 96 /r","N.E.","V","","pseudo","r","",""
+"SETL r/m8","SETLT r/m8","setl r/m8","0F 9C /r","V","V","","","w","",""
+"SETNGE r/m8","SETLT r/m8","setnge r/m8","0F 9C /r","V","V","","pseudo","r","",""
+"SETL r/m8","SETLT r/m8","setl r/m8","REX 0F 9C /r","N.E.","V","","pseudo64","r","",""
+"SETNGE r/m8","SETLT r/m8","setnge r/m8","REX 0F 9C /r","N.E.","V","","pseudo","r","",""
+"SETS r/m8","SETMI r/m8","sets r/m8","0F 98 /r","V","V","","","w","",""
+"SETS r/m8","SETMI r/m8","sets r/m8","REX 0F 98 /r","N.E.","V","","pseudo64","r","",""
+"SETNE r/m8","SETNE r/m8","setne r/m8","0F 95 /r","V","V","","","w","",""
+"SETNZ r/m8","SETNE r/m8","setnz r/m8","0F 95 /r","V","V","","pseudo","r","",""
+"SETNE r/m8","SETNE r/m8","setne r/m8","REX 0F 95 /r","N.E.","V","","pseudo64","r","",""
+"SETNZ r/m8","SETNE r/m8","setnz r/m8","REX 0F 95 /r","N.E.","V","","pseudo","r","",""
+"SETNO r/m8","SETOC r/m8","setno r/m8","0F 91 /r","V","V","","","w","",""
+"SETNO r/m8","SETOC r/m8","setno r/m8","REX 0F 91 /r","N.E.","V","","pseudo64","r","",""
+"SETO r/m8","SETOS r/m8","seto r/m8","0F 90 /r","V","V","","","w","",""
+"SETO r/m8","SETOS r/m8","seto r/m8","REX 0F 90 /r","N.E.","V","","pseudo64","r","",""
+"SETNP r/m8","SETPC r/m8","setnp r/m8","0F 9B /r","V","V","","","w","",""
+"SETPO r/m8","SETPC r/m8","setpo r/m8","0F 9B /r","V","V","","pseudo","r","",""
+"SETNP r/m8","SETPC r/m8","setnp r/m8","REX 0F 9B /r","N.E.","V","","pseudo64","r","",""
+"SETPO r/m8","SETPC r/m8","setpo r/m8","REX 0F 9B /r","N.E.","V","","pseudo","r","",""
+"SETNS r/m8","SETPL r/m8","setns r/m8","0F 99 /r","V","V","","","w","",""
+"SETNS r/m8","SETPL r/m8","setns r/m8","REX 0F 99 /r","N.E.","V","","pseudo64","r","",""
+"SETP r/m8","SETPS r/m8","setp r/m8","0F 9A /r","V","V","","","w","",""
+"SETPE r/m8","SETPS r/m8","setpe r/m8","0F 9A /r","V","V","","pseudo","r","",""
+"SETP r/m8","SETPS r/m8","setp r/m8","REX 0F 9A /r","N.E.","V","","pseudo64","r","",""
+"SETPE r/m8","SETPS r/m8","setpe r/m8","REX 0F 9A /r","N.E.","V","","pseudo","r","",""
+"SETSSBSY","SETSSBSY","setssbsy","F3 0F 01 E8","V","V","CET","","","",""
+"SFENCE","SFENCE","sfence","0F AE /7","V","V","SSE","","","",""
+"SGDT m16&32","SGDT m16&32","sgdt m16&32","0F 01 /0","V","N.S.","","modrm_memonly","w","",""
+"SGDT m16&64","SGDT m16&64","sgdt m16&64","0F 01 /0","N.S.","V","","default64,modrm_memonly","w","",""
+"SHA1MSG1 xmm1, xmm2/m128","SHA1MSG1 xmm2/m128, xmm1","sha1msg1 xmm2/m128, xmm1","0F 38 C9 /r","V","V","SHA","","rw,r","",""
+"SHA1MSG2 xmm1, xmm2/m128","SHA1MSG2 xmm2/m128, xmm1","sha1msg2 xmm2/m128, xmm1","0F 38 CA /r","V","V","SHA","","rw,r","",""
+"SHA1NEXTE xmm1, xmm2/m128","SHA1NEXTE xmm2/m128, xmm1","sha1nexte xmm2/m128, xmm1","0F 38 C8 /r","V","V","SHA","","rw,r","",""
+"SHA1RNDS4 xmm1, xmm2/m128, imm8u:2","SHA1RNDS4 imm8u:2, xmm2/m128, xmm1","sha1rnds4 imm8u:2, xmm2/m128, xmm1","0F 3A CC /r ib","V","V","SHA","","rw,r,r","",""
+"SHA256MSG1 xmm1, xmm2/m128","SHA256MSG1 xmm2/m128, xmm1","sha256msg1 xmm2/m128, xmm1","0F 38 CC /r","V","V","SHA","","rw,r","",""
+"SHA256MSG2 xmm1, xmm2/m128","SHA256MSG2 xmm2/m128, xmm1","sha256msg2 xmm2/m128, xmm1","0F 38 CD /r","V","V","SHA","","rw,r","",""
+"SHA256RNDS2 xmm1, xmm2/m128, <XMM0>","SHA256RNDS2 <XMM0>, xmm2/m128, xmm1","sha256rnds2 <XMM0>, xmm2/m128, xmm1","0F 38 CB /r","V","V","SHA","","rw,r,r","",""
+"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","D0 /4","V","V","","","rw,r","Y","8"
+"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","D0 /6","V","V","","","rw,r","Y","8"
+"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","REX D0 /4","N.E.","V","","pseudo64","rw,r","Y","8"
+"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","D2 /4","V","V","","","rw,r","Y","8"
+"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","D2 /6","V","V","","","rw,r","Y","8"
+"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","REX D2 /4","N.E.","V","","pseudo64","rw,r","Y","8"
+"SHL r/m8, imm8","SHLB imm8, r/m8","shlb imm8, r/m8","REX C0 /4 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"SHL r/m8, imm8u","SHLB imm8u, r/m8","shlb imm8u, r/m8","C0 /4 ib","V","V","","","rw,r","Y","8"
+"SHL r/m8, imm8u","SHLB imm8u, r/m8","shlb imm8u, r/m8","C0 /6 ib","V","V","","","rw,r","Y","8"
+"SHL r/m32, 1","SHLL 1, r/m32","shll 1, r/m32","D1 /4","V","V","","operand32","rw,r","Y","32"
+"SHL r/m32, 1","SHLL 1, r/m32","shll 1, r/m32","D1 /6","V","V","","operand32","rw,r","Y","32"
+"SHL r/m32, CL","SHLL CL, r/m32","shll CL, r/m32","D3 /4","V","V","","operand32","rw,r","Y","32"
+"SHL r/m32, CL","SHLL CL, r/m32","shll CL, r/m32","D3 /6","V","V","","operand32","rw,r","Y","32"
+"SHLD r/m32, r32, CL","SHLL CL, r32, r/m32","shldl CL, r32, r/m32","0F A5 /r","V","V","","operand32","rw,r,r","Y","32"
+"SHL r/m32, imm8u","SHLL imm8u, r/m32","shll imm8u, r/m32","C1 /4 ib","V","V","","operand32","rw,r","Y","32"
+"SHL r/m32, imm8u","SHLL imm8u, r/m32","shll imm8u, r/m32","C1 /6 ib","V","V","","operand32","rw,r","Y","32"
+"SHLD r/m32, r32, imm8u","SHLL imm8u, r32, r/m32","shldl imm8u, r32, r/m32","0F A4 /r ib","V","V","","operand32","rw,r,r","Y","32"
+"SHL r/m64, 1","SHLQ 1, r/m64","shlq 1, r/m64","REX.W D1 /4","N.S.","V","","","rw,r","Y","64"
+"SHL r/m64, 1","SHLQ 1, r/m64","shlq 1, r/m64","REX.W D1 /6","N.S.","V","","","rw,r","Y","64"
+"SHL r/m64, CL","SHLQ CL, r/m64","shlq CL, r/m64","REX.W D3 /4","N.S.","V","","","rw,r","Y","64"
+"SHL r/m64, CL","SHLQ CL, r/m64","shlq CL, r/m64","REX.W D3 /6","N.S.","V","","","rw,r","Y","64"
+"SHLD r/m64, r64, CL","SHLQ CL, r64, r/m64","shldq CL, r64, r/m64","REX.W 0F A5 /r","N.S.","V","","","rw,r,r","Y","64"
+"SHL r/m64, imm8u","SHLQ imm8u, r/m64","shlq imm8u, r/m64","REX.W C1 /4 ib","N.S.","V","","","rw,r","Y","64"
+"SHL r/m64, imm8u","SHLQ imm8u, r/m64","shlq imm8u, r/m64","REX.W C1 /6 ib","N.S.","V","","","rw,r","Y","64"
+"SHLD r/m64, r64, imm8u","SHLQ imm8u, r64, r/m64","shldq imm8u, r64, r/m64","REX.W 0F A4 /r ib","N.S.","V","","","rw,r,r","Y","64"
+"SHL r/m16, 1","SHLW 1, r/m16","shlw 1, r/m16","D1 /4","V","V","","operand16","rw,r","Y","16"
+"SHL r/m16, 1","SHLW 1, r/m16","shlw 1, r/m16","D1 /6","V","V","","operand16","rw,r","Y","16"
+"SHL r/m16, CL","SHLW CL, r/m16","shlw CL, r/m16","D3 /4","V","V","","operand16","rw,r","Y","16"
+"SHL r/m16, CL","SHLW CL, r/m16","shlw CL, r/m16","D3 /6","V","V","","operand16","rw,r","Y","16"
+"SHLD r/m16, r16, CL","SHLW CL, r16, r/m16","shldw CL, r16, r/m16","0F A5 /r","V","V","","operand16","rw,r,r","Y","16"
+"SHL r/m16, imm8u","SHLW imm8u, r/m16","shlw imm8u, r/m16","C1 /4 ib","V","V","","operand16","rw,r","Y","16"
+"SHL r/m16, imm8u","SHLW imm8u, r/m16","shlw imm8u, r/m16","C1 /6 ib","V","V","","operand16","rw,r","Y","16"
+"SHLD r/m16, r16, imm8u","SHLW imm8u, r16, r/m16","shldw imm8u, r16, r/m16","0F A4 /r ib","V","V","","operand16","rw,r,r","Y","16"
+"SHLX r32, r/m32, r32V","SHLXL r32V, r/m32, r32","shlxl r32V, r/m32, r32","VEX.NDS.128.66.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32"
+"SHLX r64, r/m64, r64V","SHLXQ r64V, r/m64, r64","shlxq r64V, r/m64, r64","VEX.NDS.128.66.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64"
+"SHR r/m8, 1","SHRB 1, r/m8","shrb 1, r/m8","D0 /5","V","V","","","rw,r","Y","8"
+"SHR r/m8, 1","SHRB 1, r/m8","shrb 1, r/m8","REX D0 /5","N.E.","V","","pseudo64","rw,r","Y","8"
+"SHR r/m8, CL","SHRB CL, r/m8","shrb CL, r/m8","D2 /5","V","V","","","rw,r","Y","8"
+"SHR r/m8, CL","SHRB CL, r/m8","shrb CL, r/m8","REX D2 /5","N.E.","V","","pseudo64","rw,r","Y","8"
+"SHR r/m8, imm8","SHRB imm8, r/m8","shrb imm8, r/m8","REX C0 /5 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"SHR r/m8, imm8u","SHRB imm8u, r/m8","shrb imm8u, r/m8","C0 /5 ib","V","V","","","rw,r","Y","8"
+"SHR r/m32, 1","SHRL 1, r/m32","shrl 1, r/m32","D1 /5","V","V","","operand32","rw,r","Y","32"
+"SHR r/m32, CL","SHRL CL, r/m32","shrl CL, r/m32","D3 /5","V","V","","operand32","rw,r","Y","32"
+"SHRD r/m32, r32, CL","SHRL CL, r32, r/m32","shrdl CL, r32, r/m32","0F AD /r","V","V","","operand32","rw,r,r","Y","32"
+"SHR r/m32, imm8u","SHRL imm8u, r/m32","shrl imm8u, r/m32","C1 /5 ib","V","V","","operand32","rw,r","Y","32"
+"SHRD r/m32, r32, imm8u","SHRL imm8u, r32, r/m32","shrdl imm8u, r32, r/m32","0F AC /r ib","V","V","","operand32","rw,r,r","Y","32"
+"SHR r/m64, 1","SHRQ 1, r/m64","shrq 1, r/m64","REX.W D1 /5","N.S.","V","","","rw,r","Y","64"
+"SHR r/m64, CL","SHRQ CL, r/m64","shrq CL, r/m64","REX.W D3 /5","N.S.","V","","","rw,r","Y","64"
+"SHRD r/m64, r64, CL","SHRQ CL, r64, r/m64","shrdq CL, r64, r/m64","REX.W 0F AD /r","N.S.","V","","","rw,r,r","Y","64"
+"SHR r/m64, imm8u","SHRQ imm8u, r/m64","shrq imm8u, r/m64","REX.W C1 /5 ib","N.S.","V","","","rw,r","Y","64"
+"SHRD r/m64, r64, imm8u","SHRQ imm8u, r64, r/m64","shrdq imm8u, r64, r/m64","REX.W 0F AC /r ib","N.S.","V","","","rw,r,r","Y","64"
+"SHR r/m16, 1","SHRW 1, r/m16","shrw 1, r/m16","D1 /5","V","V","","operand16","rw,r","Y","16"
+"SHR r/m16, CL","SHRW CL, r/m16","shrw CL, r/m16","D3 /5","V","V","","operand16","rw,r","Y","16"
+"SHRD r/m16, r16, CL","SHRW CL, r16, r/m16","shrdw CL, r16, r/m16","0F AD /r","V","V","","operand16","rw,r,r","Y","16"
+"SHR r/m16, imm8u","SHRW imm8u, r/m16","shrw imm8u, r/m16","C1 /5 ib","V","V","","operand16","rw,r","Y","16"
+"SHRD r/m16, r16, imm8u","SHRW imm8u, r16, r/m16","shrdw imm8u, r16, r/m16","0F AC /r ib","V","V","","operand16","rw,r,r","Y","16"
+"SHRX r32, r/m32, r32V","SHRXL r32V, r/m32, r32","shrxl r32V, r/m32, r32","VEX.NDS.128.F2.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32"
+"SHRX r64, r/m64, r64V","SHRXQ r64V, r/m64, r64","shrxq r64V, r/m64, r64","VEX.NDS.128.F2.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64"
+"SHUFPD xmm1, xmm2/m128, imm8u","SHUFPD imm8u, xmm2/m128, xmm1","shufpd imm8u, xmm2/m128, xmm1","66 0F C6 /r ib","V","V","SSE2","","rw,r,r","",""
+"SHUFPS xmm1, xmm2/m128, imm8u","SHUFPS imm8u, xmm2/m128, xmm1","shufps imm8u, xmm2/m128, xmm1","0F C6 /r ib","V","V","SSE","","rw,r,r","",""
+"SIDT m16&32","SIDT m16&32","sidt m16&32","0F 01 /1","V","N.S.","","modrm_memonly","w","",""
+"SIDT m16&64","SIDT m16&64","sidt m16&64","0F 01 /1","N.S.","V","","default64,modrm_memonly","w","",""
+"SKINIT EAX","SKINIT EAX","skinit EAX","0F 01 DE","V","V","SVM","amd,modrm_regonly","r","",""
+"SLDT r/m16","SLDTW r/m16","sldtw r/m16","0F 00 /0","V","V","","operand16","w","Y","16"
+"SLDT r32/m16","SLDT{L/W} r32/m16","sldt{l/w} r32/m16","0F 00 /0","V","V","","operand32","w","Y",""
+"SLDT r64/m16","SLDT{Q/W} r64/m16","sldt{q/w} r64/m16","REX.W 0F 00 /0","N.S.","V","","","w","Y",""
+"SLWPCB rmr32","SLWPCBL rmr32","slwpcbl rmr32","XOP.128.09.W0 12 /1","V","V","XOP","amd,modrm_regonly,operand16,operand32","w","Y","32"
+"SLWPCB rmr64","SLWPCBQ rmr64","slwpcbq rmr64","XOP.128.09.W0 12 /1","N.S.","V","XOP","amd,modrm_regonly,operand64","w","Y","64"
+"SMSW r/m16","SMSWW r/m16","smsww r/m16","0F 01 /4","V","V","","operand16","w","Y","16"
+"SMSW r32/m16","SMSW{L/W} r32/m16","smsw{l/w} r32/m16","0F 01 /4","V","V","","operand32","w","Y",""
+"SMSW r64/m16","SMSW{Q/W} r64/m16","smsw{q/w} r64/m16","REX.W 0F 01 /4","N.S.","V","","","w","Y",""
+"SQRTPD xmm1, xmm2/m128","SQRTPD xmm2/m128, xmm1","sqrtpd xmm2/m128, xmm1","66 0F 51 /r","V","V","SSE2","","w,r","",""
+"SQRTPS xmm1, xmm2/m128","SQRTPS xmm2/m128, xmm1","sqrtps xmm2/m128, xmm1","0F 51 /r","V","V","SSE","","w,r","",""
+"SQRTSD xmm1, xmm2/m64","SQRTSD xmm2/m64, xmm1","sqrtsd xmm2/m64, xmm1","F2 0F 51 /r","V","V","SSE2","","w,r","",""
+"SQRTSS xmm1, xmm2/m32","SQRTSS xmm2/m32, xmm1","sqrtss xmm2/m32, xmm1","F3 0F 51 /r","V","V","SSE","","w,r","",""
+"STAC","STAC","stac","0F 01 CB","V","V","","","","",""
+"STC","STC","stc","F9","V","V","","","","",""
+"STD","STD","std","FD","V","V","","","","",""
+"STGI","STGI","stgi","0F 01 DC","V","V","SVM","amd","","",""
+"STI","STI","sti","FB","V","V","","","","",""
+"STMXCSR m32","STMXCSR m32","stmxcsr m32","0F AE /3","V","V","SSE","modrm_memonly","w","",""
+"STOSB","STOSB","stosb","AA","V","V","","","","",""
+"STOSD","STOSL","stosl","AB","V","V","","operand32","","",""
+"STOSQ","STOSQ","stosq","REX.W AB","N.S.","V","","","","",""
+"STOSW","STOSW","stosw","AB","V","V","","operand16","","",""
+"STR r/m16","STRW r/m16","strw r/m16","0F 00 /1","V","V","","operand16","w","Y","16"
+"STR r32/m16","STR{L/W} r32/m16","str{l/w} r32/m16","0F 00 /1","V","V","","operand32","w","Y",""
+"STR r64/m16","STR{Q/W} r64/m16","str{q/w} r64/m16","REX.W 0F 00 /1","N.S.","V","","","w","Y",""
+"SUB AL, imm8","SUBB imm8, AL","subb imm8, AL","2C ib","V","V","","","rw,r","Y","8"
+"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","80 /5 ib","V","V","","","rw,r","Y","8"
+"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","82 /5 ib","V","N.S.","","","rw,r","Y","8"
+"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","REX 80 /5 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"SUB r8, r/m8","SUBB r/m8, r8","subb r/m8, r8","2A /r","V","V","","","rw,r","Y","8"
+"SUB r8, r/m8","SUBB r/m8, r8","subb r/m8, r8","REX 2A /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"SUB r/m8, r8","SUBB r8, r/m8","subb r8, r/m8","28 /r","V","V","","","rw,r","Y","8"
+"SUB r/m8, r8","SUBB r8, r/m8","subb r8, r/m8","REX 28 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"SUB EAX, imm32","SUBL imm32, EAX","subl imm32, EAX","2D id","V","V","","operand32","rw,r","Y","32"
+"SUB r/m32, imm32","SUBL imm32, r/m32","subl imm32, r/m32","81 /5 id","V","V","","operand32","rw,r","Y","32"
+"SUB r/m32, imm8","SUBL imm8, r/m32","subl imm8, r/m32","83 /5 ib","V","V","","operand32","rw,r","Y","32"
+"SUB r32, r/m32","SUBL r/m32, r32","subl r/m32, r32","2B /r","V","V","","operand32","rw,r","Y","32"
+"SUB r/m32, r32","SUBL r32, r/m32","subl r32, r/m32","29 /r","V","V","","operand32","rw,r","Y","32"
+"SUBPD xmm1, xmm2/m128","SUBPD xmm2/m128, xmm1","subpd xmm2/m128, xmm1","66 0F 5C /r","V","V","SSE2","","rw,r","",""
+"SUBPS xmm1, xmm2/m128","SUBPS xmm2/m128, xmm1","subps xmm2/m128, xmm1","0F 5C /r","V","V","SSE","","rw,r","",""
+"SUB RAX, imm32","SUBQ imm32, RAX","subq imm32, RAX","REX.W 2D id","N.S.","V","","","rw,r","Y","64"
+"SUB r/m64, imm32","SUBQ imm32, r/m64","subq imm32, r/m64","REX.W 81 /5 id","N.S.","V","","","rw,r","Y","64"
+"SUB r/m64, imm8","SUBQ imm8, r/m64","subq imm8, r/m64","REX.W 83 /5 ib","N.S.","V","","","rw,r","Y","64"
+"SUB r64, r/m64","SUBQ r/m64, r64","subq r/m64, r64","REX.W 2B /r","N.S.","V","","","rw,r","Y","64"
+"SUB r/m64, r64","SUBQ r64, r/m64","subq r64, r/m64","REX.W 29 /r","N.S.","V","","","rw,r","Y","64"
+"SUBSD xmm1, xmm2/m64","SUBSD xmm2/m64, xmm1","subsd xmm2/m64, xmm1","F2 0F 5C /r","V","V","SSE2","","rw,r","",""
+"SUBSS xmm1, xmm2/m32","SUBSS xmm2/m32, xmm1","subss xmm2/m32, xmm1","F3 0F 5C /r","V","V","SSE","","rw,r","",""
+"SUB AX, imm16","SUBW imm16, AX","subw imm16, AX","2D iw","V","V","","operand16","rw,r","Y","16"
+"SUB r/m16, imm16","SUBW imm16, r/m16","subw imm16, r/m16","81 /5 iw","V","V","","operand16","rw,r","Y","16"
+"SUB r/m16, imm8","SUBW imm8, r/m16","subw imm8, r/m16","83 /5 ib","V","V","","operand16","rw,r","Y","16"
+"SUB r16, r/m16","SUBW r/m16, r16","subw r/m16, r16","2B /r","V","V","","operand16","rw,r","Y","16"
+"SUB r/m16, r16","SUBW r16, r/m16","subw r16, r/m16","29 /r","V","V","","operand16","rw,r","Y","16"
+"SWAPGS","SWAPGS","swapgs","0F 01 F8","N.S.","V","","","","",""
+"SYSCALL","SYSCALL","syscall","0F 05","N.S.","V","","default64","","",""
+"SYSCALL","SYSCALL","syscall","0F 05","V","N.S.","AMD","amd","","",""
+"SYSENTER","SYSENTER","sysenter","0F 34","V","V","PPRO","","","",""
+"SYSEXIT","SYSEXIT","sysexit","0F 35","V","V","PPRO","","","",""
+"SYSEXIT","SYSEXIT","sysexit","REX.W 0F 35","N.E.","V","","pseudo","","",""
+"SYSRET","SYSRET","sysretw/sysretl/sysretl","0F 07","V","N.S.","AMD","amd","","",""
+"SYSRET","SYSRET","sysretw/sysretl/sysretl","0F 07","N.S.","V","","operand32,operand64","","",""
+"SYSRET","SYSRET","sysretw/sysretl/sysretl","REX.W 0F 07","I","V","","pseudo","","",""
+"T1MSKC r32V, r/m32","T1MSKCL r/m32, r32V","t1mskcl r/m32, r32V","XOP.NDD.128.09.WIG 01 /7","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"T1MSKC r64V, r/m64","T1MSKCQ r/m64, r64V","t1mskcq r/m64, r64V","XOP.NDD.128.09.WIG 01 /7","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"TEST AL, imm8","TESTB imm8, AL","testb imm8, AL","A8 ib","V","V","","","r,r","Y","8"
+"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","F6 /0 ib","V","V","","","r,r","Y","8"
+"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","F6 /1 ib","V","V","","","r,r","Y","8"
+"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","REX F6 /0 ib","N.E.","V","","pseudo64","r,r","Y","8"
+"TEST r/m8, r8","TESTB r8, r/m8","testb r8, r/m8","84 /r","V","V","","","r,r","Y","8"
+"TEST r/m8, r8","TESTB r8, r/m8","testb r8, r/m8","REX 84 /r","N.E.","V","","pseudo64","r,r","Y","8"
+"TEST EAX, imm32","TESTL imm32, EAX","testl imm32, EAX","A9 id","V","V","","operand32","r,r","Y","32"
+"TEST r/m32, imm32","TESTL imm32, r/m32","testl imm32, r/m32","F7 /0 id","V","V","","operand32","r,r","Y","32"
+"TEST r/m32, imm32","TESTL imm32, r/m32","testl imm32, r/m32","F7 /1 id","V","V","","operand32","r,r","Y","32"
+"TEST r/m32, r32","TESTL r32, r/m32","testl r32, r/m32","85 /r","V","V","","operand32","r,r","Y","32"
+"TEST RAX, imm32","TESTQ imm32, RAX","testq imm32, RAX","REX.W A9 id","N.S.","V","","","r,r","Y","64"
+"TEST r/m64, imm32","TESTQ imm32, r/m64","testq imm32, r/m64","REX.W F7 /0 id","N.S.","V","","","r,r","Y","64"
+"TEST r/m64, imm32","TESTQ imm32, r/m64","testq imm32, r/m64","REX.W F7 /1 id","N.S.","V","","","r,r","Y","64"
+"TEST r/m64, r64","TESTQ r64, r/m64","testq r64, r/m64","REX.W 85 /r","N.S.","V","","","r,r","Y","64"
+"TEST AX, imm16","TESTW imm16, AX","testw imm16, AX","A9 iw","V","V","","operand16","r,r","Y","16"
+"TEST r/m16, imm16","TESTW imm16, r/m16","testw imm16, r/m16","F7 /0 iw","V","V","","operand16","r,r","Y","16"
+"TEST r/m16, imm16","TESTW imm16, r/m16","testw imm16, r/m16","F7 /1 iw","V","V","","operand16","r,r","Y","16"
+"TEST r/m16, r16","TESTW r16, r/m16","testw r16, r/m16","85 /r","V","V","","operand16","r,r","Y","16"
+"TZCNT r32, r/m32","TZCNTL r/m32, r32","tzcntl r/m32, r32","F3 0F BC /r","V","V","BMI1","operand32","w,r","Y","32"
+"TZCNT r64, r/m64","TZCNTQ r/m64, r64","tzcntq r/m64, r64","F3 REX.W 0F BC /r","N.S.","V","BMI1","","w,r","Y","64"
+"TZCNT r16, r/m16","TZCNTW r/m16, r16","tzcntw r/m16, r16","F3 0F BC /r","V","V","BMI1","operand16","w,r","Y","16"
+"TZMSK r32V, r/m32","TZMSKL r/m32, r32V","tzmskl r/m32, r32V","XOP.NDD.128.09.WIG 01 /4","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"TZMSK r64V, r/m64","TZMSKQ r/m64, r64V","tzmskq r/m64, r64V","XOP.NDD.128.09.WIG 01 /4","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"UCOMISD xmm1, xmm2/m64","UCOMISD xmm2/m64, xmm1","ucomisd xmm2/m64, xmm1","66 0F 2E /r","V","V","SSE2","","r,r","",""
+"UCOMISS xmm1, xmm2/m32","UCOMISS xmm2/m32, xmm1","ucomiss xmm2/m32, xmm1","0F 2E /r","V","V","SSE","","r,r","",""
+"UD0 r32, r/m32","UD0 r/m32, r32","ud0 r/m32, r32","0F FF /r","V","V","PPRO","","r,r","",""
+"UD1 r32, r/m32","UD1 r/m32, r32","ud1 r/m32, r32","0F B9 /r","V","V","PPRO","","r,r","",""
+"UD2","UD2","ud2","0F 0B","V","V","PPRO","","","",""
+"UNPCKHPD xmm1, xmm2/m128","UNPCKHPD xmm2/m128, xmm1","unpckhpd xmm2/m128, xmm1","66 0F 15 /r","V","V","SSE2","","rw,r","",""
+"UNPCKHPS xmm1, xmm2/m128","UNPCKHPS xmm2/m128, xmm1","unpckhps xmm2/m128, xmm1","0F 15 /r","V","V","SSE","","rw,r","",""
+"UNPCKLPD xmm1, xmm2/m128","UNPCKLPD xmm2/m128, xmm1","unpcklpd xmm2/m128, xmm1","66 0F 14 /r","V","V","SSE2","","rw,r","",""
+"UNPCKLPS xmm1, xmm2/m128","UNPCKLPS xmm2/m128, xmm1","unpcklps xmm2/m128, xmm1","0F 14 /r","V","V","SSE","","rw,r","",""
+"V4FMADDPS zmm1, {k}{z}, zmmV+3, m128","V4FMADDPS m128, zmmV+3, {k}{z}, zmm1","v4fmaddps m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 9A /r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","",""
+"V4FMADDSS xmm1, {k}{z}, xmmV+3, m128","V4FMADDSS m128, xmmV+3, {k}{z}, xmm1","v4fmaddss m128, xmmV+3, {k}{z}, xmm1","EVEX.DDS.LIG.F2.0F38.W0 9B /r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","",""
+"V4FNMADDPS zmm1, {k}{z}, zmmV+3, m128","V4FNMADDPS m128, zmmV+3, {k}{z}, zmm1","v4fnmaddps m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 AA /r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","",""
+"V4FNMADDSS xmm1, {k}{z}, xmmV+3, m128","V4FNMADDSS m128, xmmV+3, {k}{z}, xmm1","v4fnmaddss m128, xmmV+3, {k}{z}, xmm1","EVEX.DDS.LIG.F2.0F38.W0 AB /r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","",""
+"VADDPD xmm1, xmmV, xmm2/m128","VADDPD xmm2/m128, xmmV, xmm1","vaddpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
+"VADDPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VADDPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vaddpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 58 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VADDPD ymm1, ymmV, ymm2/m256","VADDPD ymm2/m256, ymmV, ymm1","vaddpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
+"VADDPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VADDPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vaddpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 58 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VADDPD zmm1{er}, {k}{z}, zmmV, zmm2","VADDPD zmm2, zmmV, {k}{z}, zmm1{er}","vaddpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 58 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VADDPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VADDPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vaddpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 58 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VADDPS xmm1, xmmV, xmm2/m128","VADDPS xmm2/m128, xmmV, xmm1","vaddps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
+"VADDPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VADDPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vaddps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 58 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VADDPS ymm1, ymmV, ymm2/m256","VADDPS ymm2/m256, ymmV, ymm1","vaddps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
+"VADDPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VADDPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vaddps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 58 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VADDPS zmm1{er}, {k}{z}, zmmV, zmm2","VADDPS zmm2, zmmV, {k}{z}, zmm1{er}","vaddps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 58 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VADDPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VADDPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vaddps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 58 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VADDSD xmm1{er}, {k}{z}, xmmV, xmm2","VADDSD xmm2, xmmV, {k}{z}, xmm1{er}","vaddsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 58 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VADDSD xmm1, xmmV, xmm2/m64","VADDSD xmm2/m64, xmmV, xmm1","vaddsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
+"VADDSD xmm1, {k}{z}, xmmV, xmm2/m64","VADDSD xmm2/m64, xmmV, {k}{z}, xmm1","vaddsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 58 /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VADDSS xmm1{er}, {k}{z}, xmmV, xmm2","VADDSS xmm2, xmmV, {k}{z}, xmm1{er}","vaddss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 58 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VADDSS xmm1, xmmV, xmm2/m32","VADDSS xmm2/m32, xmmV, xmm1","vaddss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 58 /r","V","V","AVX","","w,r,r","",""
+"VADDSS xmm1, {k}{z}, xmmV, xmm2/m32","VADDSS xmm2/m32, xmmV, {k}{z}, xmm1","vaddss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 58 /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VADDSUBPD xmm1, xmmV, xmm2/m128","VADDSUBPD xmm2/m128, xmmV, xmm1","vaddsubpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D0 /r","V","V","AVX","","w,r,r","",""
+"VADDSUBPD ymm1, ymmV, ymm2/m256","VADDSUBPD ymm2/m256, ymmV, ymm1","vaddsubpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D0 /r","V","V","AVX","","w,r,r","",""
+"VADDSUBPS xmm1, xmmV, xmm2/m128","VADDSUBPS xmm2/m128, xmmV, xmm1","vaddsubps xmm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG D0 /r","V","V","AVX","","w,r,r","",""
+"VADDSUBPS ymm1, ymmV, ymm2/m256","VADDSUBPS ymm2/m256, ymmV, ymm1","vaddsubps ymm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG D0 /r","V","V","AVX","","w,r,r","",""
+"VAESDEC xmm1, xmmV, xmm2/m128","VAESDEC xmm2/m128, xmmV, xmm1","vaesdec xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DE /r","V","V","AES+AVX512VL","scale16","w,r,r","",""
+"VAESDEC xmm1, xmmV, xmm2/m128","VAESDEC xmm2/m128, xmmV, xmm1","vaesdec xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DE /r","V","V","AES+AVX","","w,r,r","",""
+"VAESDEC ymm1, ymmV, ymm2/m256","VAESDEC ymm2/m256, ymmV, ymm1","vaesdec ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DE /r","V","V","AES+AVX512VL","scale32","w,r,r","",""
+"VAESDEC ymm1, ymmV, ymm2/m256","VAESDEC ymm2/m256, ymmV, ymm1","vaesdec ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DE /r","V","V","VAES+AVX","","w,r,r","",""
+"VAESDEC zmm1, zmmV, zmm2/m512","VAESDEC zmm2/m512, zmmV, zmm1","vaesdec zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DE /r","V","V","AES+AVX512F","scale64","w,r,r","",""
+"VAESDECLAST xmm1, xmmV, xmm2/m128","VAESDECLAST xmm2/m128, xmmV, xmm1","vaesdeclast xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DF /r","V","V","AES+AVX512VL","scale16","w,r,r","",""
+"VAESDECLAST xmm1, xmmV, xmm2/m128","VAESDECLAST xmm2/m128, xmmV, xmm1","vaesdeclast xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DF /r","V","V","AES+AVX","","w,r,r","",""
+"VAESDECLAST ymm1, ymmV, ymm2/m256","VAESDECLAST ymm2/m256, ymmV, ymm1","vaesdeclast ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DF /r","V","V","AES+AVX512VL","scale32","w,r,r","",""
+"VAESDECLAST ymm1, ymmV, ymm2/m256","VAESDECLAST ymm2/m256, ymmV, ymm1","vaesdeclast ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DF /r","V","V","VAES+AVX","","w,r,r","",""
+"VAESDECLAST zmm1, zmmV, zmm2/m512","VAESDECLAST zmm2/m512, zmmV, zmm1","vaesdeclast zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DF /r","V","V","AES+AVX512F","scale64","w,r,r","",""
+"VAESENC xmm1, xmmV, xmm2/m128","VAESENC xmm2/m128, xmmV, xmm1","vaesenc xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DC /r","V","V","AES+AVX512VL","scale16","w,r,r","",""
+"VAESENC xmm1, xmmV, xmm2/m128","VAESENC xmm2/m128, xmmV, xmm1","vaesenc xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DC /r","V","V","AES+AVX","","w,r,r","",""
+"VAESENC ymm1, ymmV, ymm2/m256","VAESENC ymm2/m256, ymmV, ymm1","vaesenc ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DC /r","V","V","AES+AVX512VL","scale32","w,r,r","",""
+"VAESENC ymm1, ymmV, ymm2/m256","VAESENC ymm2/m256, ymmV, ymm1","vaesenc ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DC /r","V","V","VAES+AVX","","w,r,r","",""
+"VAESENC zmm1, zmmV, zmm2/m512","VAESENC zmm2/m512, zmmV, zmm1","vaesenc zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DC /r","V","V","AES+AVX512F","scale64","w,r,r","",""
+"VAESENCLAST xmm1, xmmV, xmm2/m128","VAESENCLAST xmm2/m128, xmmV, xmm1","vaesenclast xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DD /r","V","V","AES+AVX512VL","scale16","w,r,r","",""
+"VAESENCLAST xmm1, xmmV, xmm2/m128","VAESENCLAST xmm2/m128, xmmV, xmm1","vaesenclast xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DD /r","V","V","AES+AVX","","w,r,r","",""
+"VAESENCLAST ymm1, ymmV, ymm2/m256","VAESENCLAST ymm2/m256, ymmV, ymm1","vaesenclast ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DD /r","V","V","AES+AVX512VL","scale32","w,r,r","",""
+"VAESENCLAST ymm1, ymmV, ymm2/m256","VAESENCLAST ymm2/m256, ymmV, ymm1","vaesenclast ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DD /r","V","V","VAES+AVX","","w,r,r","",""
+"VAESENCLAST zmm1, zmmV, zmm2/m512","VAESENCLAST zmm2/m512, zmmV, zmm1","vaesenclast zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DD /r","V","V","AES+AVX512F","scale64","w,r,r","",""
+"VAESIMC xmm1, xmm2/m128","VAESIMC xmm2/m128, xmm1","vaesimc xmm2/m128, xmm1","VEX.128.66.0F38.WIG DB /r","V","V","AES+AVX","","w,r","",""
+"VAESKEYGENASSIST xmm1, xmm2/m128, imm8u","VAESKEYGENASSIST imm8u, xmm2/m128, xmm1","vaeskeygenassist imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG DF /r ib","V","V","AES+AVX","","w,r,r","",""
+"VALIGND xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VALIGND imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","valignd imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 03 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VALIGND ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VALIGND imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","valignd imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 03 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VALIGND zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VALIGND imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","valignd imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 03 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VALIGNQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VALIGNQ imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","valignq imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 03 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VALIGNQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VALIGNQ imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","valignq imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 03 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VALIGNQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VALIGNQ imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","valignq imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 03 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VANDNPD xmm1, xmmV, xmm2/m128","VANDNPD xmm2/m128, xmmV, xmm1","vandnpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 55 /r","V","V","AVX","","w,r,r","",""
+"VANDNPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VANDNPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vandnpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 55 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VANDNPD ymm1, ymmV, ymm2/m256","VANDNPD ymm2/m256, ymmV, ymm1","vandnpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 55 /r","V","V","AVX","","w,r,r","",""
+"VANDNPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VANDNPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vandnpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 55 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VANDNPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VANDNPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vandnpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 55 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
+"VANDNPS xmm1, xmmV, xmm2/m128","VANDNPS xmm2/m128, xmmV, xmm1","vandnps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 55 /r","V","V","AVX","","w,r,r","",""
+"VANDNPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VANDNPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vandnps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 55 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VANDNPS ymm1, ymmV, ymm2/m256","VANDNPS ymm2/m256, ymmV, ymm1","vandnps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 55 /r","V","V","AVX","","w,r,r","",""
+"VANDNPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VANDNPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vandnps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 55 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VANDNPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VANDNPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vandnps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 55 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
+"VANDPD xmm1, xmmV, xmm2/m128","VANDPD xmm2/m128, xmmV, xmm1","vandpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 54 /r","V","V","AVX","","w,r,r","",""
+"VANDPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VANDPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vandpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 54 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VANDPD ymm1, ymmV, ymm2/m256","VANDPD ymm2/m256, ymmV, ymm1","vandpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 54 /r","V","V","AVX","","w,r,r","",""
+"VANDPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VANDPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vandpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 54 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VANDPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VANDPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vandpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 54 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
+"VANDPS xmm1, xmmV, xmm2/m128","VANDPS xmm2/m128, xmmV, xmm1","vandps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 54 /r","V","V","AVX","","w,r,r","",""
+"VANDPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VANDPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vandps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 54 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VANDPS ymm1, ymmV, ymm2/m256","VANDPS ymm2/m256, ymmV, ymm1","vandps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 54 /r","V","V","AVX","","w,r,r","",""
+"VANDPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VANDPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vandps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 54 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VANDPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VANDPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vandps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 54 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
+"VBLENDMPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VBLENDMPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vblendmpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 65 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VBLENDMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VBLENDMPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vblendmpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 65 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VBLENDMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VBLENDMPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vblendmpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 65 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VBLENDMPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VBLENDMPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vblendmps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 65 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VBLENDMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VBLENDMPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vblendmps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 65 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VBLENDMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VBLENDMPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vblendmps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 65 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VBLENDPD xmm1, xmmV, xmm2/m128, imm8u","VBLENDPD imm8u, xmm2/m128, xmmV, xmm1","vblendpd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0D /r ib","V","V","AVX","","w,r,r,r","",""
+"VBLENDPD ymm1, ymmV, ymm2/m256, imm8u","VBLENDPD imm8u, ymm2/m256, ymmV, ymm1","vblendpd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0D /r ib","V","V","AVX","","w,r,r,r","",""
+"VBLENDPS xmm1, xmmV, xmm2/m128, imm8u","VBLENDPS imm8u, xmm2/m128, xmmV, xmm1","vblendps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0C /r ib","V","V","AVX","","w,r,r,r","",""
+"VBLENDPS ymm1, ymmV, ymm2/m256, imm8u","VBLENDPS imm8u, ymm2/m256, ymmV, ymm1","vblendps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0C /r ib","V","V","AVX","","w,r,r,r","",""
+"VBLENDVPD xmm1, xmmV, xmm2/m128, xmmIH","VBLENDVPD xmmIH, xmm2/m128, xmmV, xmm1","vblendvpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4B /r /is4","V","V","AVX","","w,r,r,r","",""
+"VBLENDVPD ymm1, ymmV, ymm2/m256, ymmIH","VBLENDVPD ymmIH, ymm2/m256, ymmV, ymm1","vblendvpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4B /r /is4","V","V","AVX","","w,r,r,r","",""
+"VBLENDVPS xmm1, xmmV, xmm2/m128, xmmIH","VBLENDVPS xmmIH, xmm2/m128, xmmV, xmm1","vblendvps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4A /r /is4","V","V","AVX","","w,r,r,r","",""
+"VBLENDVPS ymm1, ymmV, ymm2/m256, ymmIH","VBLENDVPS ymmIH, ymm2/m256, ymmV, ymm1","vblendvps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4A /r /is4","V","V","AVX","","w,r,r,r","",""
+"VBROADCASTF128 ymm1, m128","VBROADCASTF128 m128, ymm1","vbroadcastf128 m128, ymm1","VEX.256.66.0F38.W0 1A /r","V","V","AVX","modrm_memonly","w,r","",""
+"VBROADCASTF32X2 ymm1, {k}{z}, xmm2/m64","VBROADCASTF32X2 xmm2/m64, {k}{z}, ymm1","vbroadcastf32x2 xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W0 19 /r","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","",""
+"VBROADCASTF32X2 zmm1, {k}{z}, xmm2/m64","VBROADCASTF32X2 xmm2/m64, {k}{z}, zmm1","vbroadcastf32x2 xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W0 19 /r","V","V","AVX512DQ","scale8","w,r,r","",""
+"VBROADCASTF32X4 ymm1, {k}{z}, m128","VBROADCASTF32X4 m128, {k}{z}, ymm1","vbroadcastf32x4 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 1A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTF32X4 zmm1, {k}{z}, m128","VBROADCASTF32X4 m128, {k}{z}, zmm1","vbroadcastf32x4 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1A /r","V","V","AVX512F","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTF32X8 zmm1, {k}{z}, m256","VBROADCASTF32X8 m256, {k}{z}, zmm1","vbroadcastf32x8 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1B /r","V","V","AVX512DQ","modrm_memonly,scale32","w,r,r","",""
+"VBROADCASTF64X2 ymm1, {k}{z}, m128","VBROADCASTF64X2 m128, {k}{z}, ymm1","vbroadcastf64x2 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W1 1A /r","V","V","AVX512DQ+AVX512VL","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTF64X2 zmm1, {k}{z}, m128","VBROADCASTF64X2 m128, {k}{z}, zmm1","vbroadcastf64x2 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W1 1A /r","V","V","AVX512DQ","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTF64X4 zmm1, {k}{z}, m256","VBROADCASTF64X4 m256, {k}{z}, zmm1","vbroadcastf64x4 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W1 1B /r","V","V","AVX512F","modrm_memonly,scale32","w,r,r","",""
+"VBROADCASTI128 ymm1, m128","VBROADCASTI128 m128, ymm1","vbroadcasti128 m128, ymm1","VEX.256.66.0F38.W0 5A /r","V","V","AVX2","modrm_memonly","w,r","",""
+"VBROADCASTI32X2 xmm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}, xmm1","vbroadcasti32x2 xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 59 /r","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","",""
+"VBROADCASTI32X2 ymm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}, ymm1","vbroadcasti32x2 xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W0 59 /r","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","",""
+"VBROADCASTI32X2 zmm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}, zmm1","vbroadcasti32x2 xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W0 59 /r","V","V","AVX512DQ","scale8","w,r,r","",""
+"VBROADCASTI32X4 ymm1, {k}{z}, m128","VBROADCASTI32X4 m128, {k}{z}, ymm1","vbroadcasti32x4 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 5A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTI32X4 zmm1, {k}{z}, m128","VBROADCASTI32X4 m128, {k}{z}, zmm1","vbroadcasti32x4 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W0 5A /r","V","V","AVX512F","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTI32X8 zmm1, {k}{z}, m256","VBROADCASTI32X8 m256, {k}{z}, zmm1","vbroadcasti32x8 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 5B /r","V","V","AVX512DQ","modrm_memonly,scale32","w,r,r","",""
+"VBROADCASTI64X2 ymm1, {k}{z}, m128","VBROADCASTI64X2 m128, {k}{z}, ymm1","vbroadcasti64x2 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W1 5A /r","V","V","AVX512DQ+AVX512VL","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTI64X2 zmm1, {k}{z}, m128","VBROADCASTI64X2 m128, {k}{z}, zmm1","vbroadcasti64x2 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W1 5A /r","V","V","AVX512DQ","modrm_memonly,scale16","w,r,r","",""
+"VBROADCASTI64X4 zmm1, {k}{z}, m256","VBROADCASTI64X4 m256, {k}{z}, zmm1","vbroadcasti64x4 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W1 5B /r","V","V","AVX512F","modrm_memonly,scale32","w,r,r","",""
+"VBROADCASTSD ymm1, m64","VBROADCASTSD m64, ymm1","vbroadcastsd m64, ymm1","VEX.256.66.0F38.W0 19 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VBROADCASTSD ymm1, xmm2","VBROADCASTSD xmm2, ymm1","vbroadcastsd xmm2, ymm1","VEX.256.66.0F38.W0 19 /r","V","V","AVX2","modrm_regonly","w,r","",""
+"VBROADCASTSD ymm1, {k}{z}, xmm2/m64","VBROADCASTSD xmm2/m64, {k}{z}, ymm1","vbroadcastsd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 19 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VBROADCASTSD zmm1, {k}{z}, xmm2/m64","VBROADCASTSD xmm2/m64, {k}{z}, zmm1","vbroadcastsd xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 19 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VBROADCASTSS xmm1, m32","VBROADCASTSS m32, xmm1","vbroadcastss m32, xmm1","VEX.128.66.0F38.W0 18 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VBROADCASTSS ymm1, m32","VBROADCASTSS m32, ymm1","vbroadcastss m32, ymm1","VEX.256.66.0F38.W0 18 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VBROADCASTSS xmm1, xmm2","VBROADCASTSS xmm2, xmm1","vbroadcastss xmm2, xmm1","VEX.128.66.0F38.W0 18 /r","V","V","AVX2","modrm_regonly","w,r","",""
+"VBROADCASTSS ymm1, xmm2","VBROADCASTSS xmm2, ymm1","vbroadcastss xmm2, ymm1","VEX.256.66.0F38.W0 18 /r","V","V","AVX2","modrm_regonly","w,r","",""
+"VBROADCASTSS xmm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, xmm1","vbroadcastss xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 18 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VBROADCASTSS ymm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, ymm1","vbroadcastss xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 18 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VBROADCASTSS zmm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, zmm1","vbroadcastss xmm2/m32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 18 /r","V","V","AVX512F","scale4","w,r,r","",""
+"VCMPPD xmm1, xmmV, xmm2/m128, imm8u","VCMPPD imm8u, xmm2/m128, xmmV, xmm1","vcmppd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
+"VCMPPD k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VCMPPD imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","vcmppd imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F.W1 C2 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VCMPPD ymm1, ymmV, ymm2/m256, imm8u","VCMPPD imm8u, ymm2/m256, ymmV, ymm1","vcmppd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
+"VCMPPD k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VCMPPD imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","vcmppd imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F.W1 C2 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VCMPPD k1{sae}, {k}, zmmV, zmm2, imm8u","VCMPPD imm8u, zmm2, zmmV, {k}, k1{sae}","vcmppd imm8u, zmm2, zmmV, {k}, k1{sae}","EVEX.NDS.512.66.0F.W1 C2 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VCMPPD k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VCMPPD imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","vcmppd imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F.W1 C2 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VCMPPS xmm1, xmmV, xmm2/m128, imm8u","VCMPPS imm8u, xmm2/m128, xmmV, xmm1","vcmpps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
+"VCMPPS k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VCMPPS imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","vcmpps imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.0F.W0 C2 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VCMPPS ymm1, ymmV, ymm2/m256, imm8u","VCMPPS imm8u, ymm2/m256, ymmV, ymm1","vcmpps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
+"VCMPPS k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VCMPPS imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","vcmpps imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.0F.W0 C2 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VCMPPS k1{sae}, {k}, zmmV, zmm2, imm8u","VCMPPS imm8u, zmm2, zmmV, {k}, k1{sae}","vcmpps imm8u, zmm2, zmmV, {k}, k1{sae}","EVEX.NDS.512.0F.W0 C2 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VCMPPS k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VCMPPS imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","vcmpps imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.0F.W0 C2 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VCMPSD k1{sae}, {k}, xmmV, xmm2, imm8u","VCMPSD imm8u, xmm2, xmmV, {k}, k1{sae}","vcmpsd imm8u, xmm2, xmmV, {k}, k1{sae}","EVEX.NDS.128.F2.0F.W1 C2 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VCMPSD xmm1, xmmV, xmm2/m64, imm8u","VCMPSD imm8u, xmm2/m64, xmmV, xmm1","vcmpsd imm8u, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
+"VCMPSD k1, {k}, xmmV, xmm2/m64, imm8u","VCMPSD imm8u, xmm2/m64, xmmV, {k}, k1","vcmpsd imm8u, xmm2/m64, xmmV, {k}, k1","EVEX.NDS.LIG.F2.0F.W1 C2 /r ib","V","V","AVX512F","scale8","w,r,r,r,r","",""
+"VCMPSS k1{sae}, {k}, xmmV, xmm2, imm8u","VCMPSS imm8u, xmm2, xmmV, {k}, k1{sae}","vcmpss imm8u, xmm2, xmmV, {k}, k1{sae}","EVEX.NDS.128.F3.0F.W0 C2 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VCMPSS xmm1, xmmV, xmm2/m32, imm8u","VCMPSS imm8u, xmm2/m32, xmmV, xmm1","vcmpss imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG C2 /r ib","V","V","AVX","","w,r,r,r","",""
+"VCMPSS k1, {k}, xmmV, xmm2/m32, imm8u","VCMPSS imm8u, xmm2/m32, xmmV, {k}, k1","vcmpss imm8u, xmm2/m32, xmmV, {k}, k1","EVEX.NDS.LIG.F3.0F.W0 C2 /r ib","V","V","AVX512F","scale4","w,r,r,r,r","",""
+"VCOMISD xmm1{sae}, xmm2","VCOMISD xmm2, xmm1{sae}","vcomisd xmm2, xmm1{sae}","EVEX.128.66.0F.W1 2F /r","V","V","AVX512F","modrm_regonly","r,r","",""
+"VCOMISD xmm1, xmm2/m64","VCOMISD xmm2/m64, xmm1","vcomisd xmm2/m64, xmm1","EVEX.LIG.66.0F.W1 2F /r","V","V","AVX512F","scale8","r,r","",""
+"VCOMISD xmm1, xmm2/m64","VCOMISD xmm2/m64, xmm1","vcomisd xmm2/m64, xmm1","VEX.LIG.66.0F.WIG 2F /r","V","V","AVX","","r,r","",""
+"VCOMISS xmm1{sae}, xmm2","VCOMISS xmm2, xmm1{sae}","vcomiss xmm2, xmm1{sae}","EVEX.128.0F.W0 2F /r","V","V","AVX512F","modrm_regonly","r,r","",""
+"VCOMISS xmm1, xmm2/m32","VCOMISS xmm2/m32, xmm1","vcomiss xmm2/m32, xmm1","EVEX.LIG.0F.W0 2F /r","V","V","AVX512F","scale4","r,r","",""
+"VCOMISS xmm1, xmm2/m32","VCOMISS xmm2/m32, xmm1","vcomiss xmm2/m32, xmm1","VEX.LIG.0F.WIG 2F /r","V","V","AVX","","r,r","",""
+"VCOMPRESSPD xmm2/m128, {k}{z}, xmm1","VCOMPRESSPD xmm1, {k}{z}, xmm2/m128","vcompresspd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 8A /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VCOMPRESSPD ymm2/m256, {k}{z}, ymm1","VCOMPRESSPD ymm1, {k}{z}, ymm2/m256","vcompresspd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 8A /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VCOMPRESSPD zmm2/m512, {k}{z}, zmm1","VCOMPRESSPD zmm1, {k}{z}, zmm2/m512","vcompresspd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 8A /r","V","V","AVX512F","scale8","w,r,r","",""
+"VCOMPRESSPS xmm2/m128, {k}{z}, xmm1","VCOMPRESSPS xmm1, {k}{z}, xmm2/m128","vcompressps xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 8A /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VCOMPRESSPS ymm2/m256, {k}{z}, ymm1","VCOMPRESSPS ymm1, {k}{z}, ymm2/m256","vcompressps ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 8A /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VCOMPRESSPS zmm2/m512, {k}{z}, zmm1","VCOMPRESSPS zmm1, {k}{z}, zmm2/m512","vcompressps zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 8A /r","V","V","AVX512F","scale4","w,r,r","",""
+"VCVTDQ2PD ymm1, xmm2/m128","VCVTDQ2PD xmm2/m128, ymm1","vcvtdq2pd xmm2/m128, ymm1","VEX.256.F3.0F.WIG E6 /r","V","V","AVX","","w,r","",""
+"VCVTDQ2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTDQ2PD xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtdq2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W0 E6 /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTDQ2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTDQ2PD xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtdq2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W0 E6 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTDQ2PD xmm1, xmm2/m64","VCVTDQ2PD xmm2/m64, xmm1","vcvtdq2pd xmm2/m64, xmm1","VEX.128.F3.0F.WIG E6 /r","V","V","AVX","","w,r","",""
+"VCVTDQ2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTDQ2PD ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtdq2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W0 E6 /r","V","V","AVX512F","bscale4,scale32","w,r,r","",""
+"VCVTDQ2PS xmm1, xmm2/m128","VCVTDQ2PS xmm2/m128, xmm1","vcvtdq2ps xmm2/m128, xmm1","VEX.128.0F.WIG 5B /r","V","V","AVX","","w,r","",""
+"VCVTDQ2PS xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTDQ2PS xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtdq2ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTDQ2PS ymm1, ymm2/m256","VCVTDQ2PS ymm2/m256, ymm1","vcvtdq2ps ymm2/m256, ymm1","VEX.256.0F.WIG 5B /r","V","V","AVX","","w,r","",""
+"VCVTDQ2PS ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTDQ2PS ymm2/m256/m32bcst, {k}{z}, ymm1","vcvtdq2ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VCVTDQ2PS zmm1{er}, {k}{z}, zmm2","VCVTDQ2PS zmm2, {k}{z}, zmm1{er}","vcvtdq2ps zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 5B /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTDQ2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTDQ2PS zmm2/m512/m32bcst, {k}{z}, zmm1","vcvtdq2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 5B /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VCVTPD2DQ ymm1{er}, {k}{z}, zmm2","VCVTPD2DQ zmm2, {k}{z}, ymm1{er}","vcvtpd2dq zmm2, {k}{z}, ymm1{er}","EVEX.512.F2.0F.W1 E6 /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
+"VCVTPD2DQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2DQ zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtpd2dq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.F2.0F.W1 E6 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
+"VCVTPD2DQ xmm1, xmm2/m128","VCVTPD2DQX xmm2/m128, xmm1","vcvtpd2dqx xmm2/m128, xmm1","VEX.128.F2.0F.WIG E6 /r","V","V","AVX","","w,r","Y","128"
+"VCVTPD2DQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2DQX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2dqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F2.0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTPD2DQ xmm1, ymm2/m256","VCVTPD2DQY ymm2/m256, xmm1","vcvtpd2dqy ymm2/m256, xmm1","VEX.256.F2.0F.WIG E6 /r","V","V","AVX","","w,r","Y","256"
+"VCVTPD2DQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2DQY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtpd2dqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.F2.0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTPD2PS ymm1{er}, {k}{z}, zmm2","VCVTPD2PS zmm2, {k}{z}, ymm1{er}","vcvtpd2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.66.0F.W1 5A /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
+"VCVTPD2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2PS zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtpd2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.66.0F.W1 5A /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
+"VCVTPD2PS xmm1, xmm2/m128","VCVTPD2PSX xmm2/m128, xmm1","vcvtpd2psx xmm2/m128, xmm1","VEX.128.66.0F.WIG 5A /r","V","V","AVX","","w,r","Y","128"
+"VCVTPD2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2PSX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 5A /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTPD2PS xmm1, ymm2/m256","VCVTPD2PSY ymm2/m256, xmm1","vcvtpd2psy ymm2/m256, xmm1","VEX.256.66.0F.WIG 5A /r","V","V","AVX","","w,r","Y","256"
+"VCVTPD2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2PSY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtpd2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.66.0F.W1 5A /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTPD2QQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2QQ xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2qq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 7B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VCVTPD2QQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2QQ ymm2/m256/m64bcst, {k}{z}, ymm1","vcvtpd2qq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 7B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VCVTPD2QQ zmm1{er}, {k}{z}, zmm2","VCVTPD2QQ zmm2, {k}{z}, zmm1{er}","vcvtpd2qq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 7B /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTPD2QQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2QQ zmm2/m512/m64bcst, {k}{z}, zmm1","vcvtpd2qq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 7B /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
+"VCVTPD2UDQ ymm1{er}, {k}{z}, zmm2","VCVTPD2UDQ zmm2, {k}{z}, ymm1{er}","vcvtpd2udq zmm2, {k}{z}, ymm1{er}","EVEX.512.0F.W1 79 /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
+"VCVTPD2UDQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2UDQ zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtpd2udq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0F.W1 79 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
+"VCVTPD2UDQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2UDQX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2udqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.0F.W1 79 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTPD2UDQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2UDQY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtpd2udqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.0F.W1 79 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTPD2UQQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2UQQ xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtpd2uqq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 79 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VCVTPD2UQQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2UQQ ymm2/m256/m64bcst, {k}{z}, ymm1","vcvtpd2uqq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 79 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VCVTPD2UQQ zmm1{er}, {k}{z}, zmm2","VCVTPD2UQQ zmm2, {k}{z}, zmm1{er}","vcvtpd2uqq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 79 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTPD2UQQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2UQQ zmm2/m512/m64bcst, {k}{z}, zmm1","vcvtpd2uqq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 79 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
+"VCVTPH2PS ymm1, xmm2/m128","VCVTPH2PS xmm2/m128, ymm1","vcvtph2ps xmm2/m128, ymm1","VEX.256.66.0F38.W0 13 /r","V","V","F16C","","w,r","",""
+"VCVTPH2PS ymm1, {k}{z}, xmm2/m128","VCVTPH2PS xmm2/m128, {k}{z}, ymm1","vcvtph2ps xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 13 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VCVTPH2PS xmm1, xmm2/m64","VCVTPH2PS xmm2/m64, xmm1","vcvtph2ps xmm2/m64, xmm1","VEX.128.66.0F38.W0 13 /r","V","V","F16C","","w,r","",""
+"VCVTPH2PS xmm1, {k}{z}, xmm2/m64","VCVTPH2PS xmm2/m64, {k}{z}, xmm1","vcvtph2ps xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 13 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VCVTPH2PS zmm1{sae}, {k}{z}, ymm2","VCVTPH2PS ymm2, {k}{z}, zmm1{sae}","vcvtph2ps ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 13 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTPH2PS zmm1, {k}{z}, ymm2/m256","VCVTPH2PS ymm2/m256, {k}{z}, zmm1","vcvtph2ps ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 13 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VCVTPS2DQ xmm1, xmm2/m128","VCVTPS2DQ xmm2/m128, xmm1","vcvtps2dq xmm2/m128, xmm1","VEX.128.66.0F.WIG 5B /r","V","V","AVX","","w,r","",""
+"VCVTPS2DQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2DQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2dq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTPS2DQ ymm1, ymm2/m256","VCVTPS2DQ ymm2/m256, ymm1","vcvtps2dq ymm2/m256, ymm1","VEX.256.66.0F.WIG 5B /r","V","V","AVX","","w,r","",""
+"VCVTPS2DQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTPS2DQ ymm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2dq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VCVTPS2DQ zmm1{er}, {k}{z}, zmm2","VCVTPS2DQ zmm2, {k}{z}, zmm1{er}","vcvtps2dq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 5B /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTPS2DQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTPS2DQ zmm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2dq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 5B /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VCVTPS2PD ymm1, xmm2/m128","VCVTPS2PD xmm2/m128, ymm1","vcvtps2pd xmm2/m128, ymm1","VEX.256.0F.WIG 5A /r","V","V","AVX","","w,r","",""
+"VCVTPS2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2PD xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 5A /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTPS2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2PD xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 5A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTPS2PD xmm1, xmm2/m64","VCVTPS2PD xmm2/m64, xmm1","vcvtps2pd xmm2/m64, xmm1","VEX.128.0F.WIG 5A /r","V","V","AVX","","w,r","",""
+"VCVTPS2PD zmm1{sae}, {k}{z}, ymm2","VCVTPS2PD ymm2, {k}{z}, zmm1{sae}","vcvtps2pd ymm2, {k}{z}, zmm1{sae}","EVEX.512.0F.W0 5A /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTPS2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2PD ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 5A /r","V","V","AVX512F","bscale4,scale32","w,r,r","",""
+"VCVTPS2PH xmm2/m64, xmm1, imm8u","VCVTPS2PH imm8u, xmm1, xmm2/m64","vcvtps2ph imm8u, xmm1, xmm2/m64","VEX.128.66.0F3A.W0 1D /r ib","V","V","F16C","","w,r,r","",""
+"VCVTPS2PH xmm2/m64, {k}{z}, xmm1, imm8u","VCVTPS2PH imm8u, xmm1, {k}{z}, xmm2/m64","vcvtps2ph imm8u, xmm1, {k}{z}, xmm2/m64","EVEX.128.66.0F3A.W0 1D /r ib","V","V","AVX512F+AVX512VL","scale8","w,r,r,r","",""
+"VCVTPS2PH xmm2/m128, ymm1, imm8u","VCVTPS2PH imm8u, ymm1, xmm2/m128","vcvtps2ph imm8u, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 1D /r ib","V","V","F16C","","w,r,r","",""
+"VCVTPS2PH xmm2/m128, {k}{z}, ymm1, imm8u","VCVTPS2PH imm8u, ymm1, {k}{z}, xmm2/m128","vcvtps2ph imm8u, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W0 1D /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VCVTPS2PH ymm2/m256, {k}{z}, zmm1, imm8u","VCVTPS2PH imm8u, zmm1, {k}{z}, ymm2/m256","vcvtps2ph imm8u, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W0 1D /r ib","V","V","AVX512F","scale32","w,r,r,r","",""
+"VCVTPS2PH ymm2{sae}, {k}{z}, zmm1, imm8u","VCVTPS2PH imm8u, zmm1, {k}{z}, ymm2{sae}","vcvtps2ph imm8u, zmm1, {k}{z}, ymm2{sae}","EVEX.512.66.0F3A.W0 1D /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VCVTPS2QQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2QQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2qq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 7B /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTPS2QQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2QQ xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2qq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 7B /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTPS2QQ zmm1{er}, {k}{z}, ymm2","VCVTPS2QQ ymm2, {k}{z}, zmm1{er}","vcvtps2qq ymm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 7B /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTPS2QQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2QQ ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2qq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 7B /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","",""
+"VCVTPS2UDQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2UDQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2udq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 79 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTPS2UDQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTPS2UDQ ymm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2udq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 79 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VCVTPS2UDQ zmm1{er}, {k}{z}, zmm2","VCVTPS2UDQ zmm2, {k}{z}, zmm1{er}","vcvtps2udq zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 79 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTPS2UDQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTPS2UDQ zmm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2udq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 79 /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VCVTPS2UQQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2UQQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtps2uqq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 79 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTPS2UQQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2UQQ xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtps2uqq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 79 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTPS2UQQ zmm1{er}, {k}{z}, ymm2","VCVTPS2UQQ ymm2, {k}{z}, zmm1{er}","vcvtps2uqq ymm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 79 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTPS2UQQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2UQQ ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtps2uqq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 79 /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","",""
+"VCVTQQ2PD xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTQQ2PD xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtqq2pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W1 E6 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VCVTQQ2PD ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTQQ2PD ymm2/m256/m64bcst, {k}{z}, ymm1","vcvtqq2pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W1 E6 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VCVTQQ2PD zmm1{er}, {k}{z}, zmm2","VCVTQQ2PD zmm2, {k}{z}, zmm1{er}","vcvtqq2pd zmm2, {k}{z}, zmm1{er}","EVEX.512.F3.0F.W1 E6 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTQQ2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTQQ2PD zmm2/m512/m64bcst, {k}{z}, zmm1","vcvtqq2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W1 E6 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
+"VCVTQQ2PS ymm1{er}, {k}{z}, zmm2","VCVTQQ2PS zmm2, {k}{z}, ymm1{er}","vcvtqq2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.0F.W1 5B /r","V","V","AVX512DQ","modrm_regonly","w,r,r","Y",""
+"VCVTQQ2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTQQ2PS zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtqq2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0F.W1 5B /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","Y","512"
+"VCVTQQ2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTQQ2PSX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtqq2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.0F.W1 5B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTQQ2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTQQ2PSY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtqq2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.0F.W1 5B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTSD2SI r32{er}, xmm2","VCVTSD2SI xmm2, r32{er}","vcvtsd2si xmm2, r32{er}","EVEX.128.F2.0F.W0 2D /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTSD2SI r32, xmm2/m64","VCVTSD2SI xmm2/m64, r32","vcvtsd2si xmm2/m64, r32","EVEX.LIG.F2.0F.W0 2D /r","V","V","AVX512F","scale8","w,r","Y","32"
+"VCVTSD2SI r32, xmm2/m64","VCVTSD2SI xmm2/m64, r32","vcvtsd2si xmm2/m64, r32","VEX.LIG.F2.0F.W0 2D /r","V","V","AVX","","w,r","Y","32"
+"VCVTSD2SI r64{er}, xmm2","VCVTSD2SIQ xmm2, r64{er}","vcvtsd2siq xmm2, r64{er}","EVEX.128.F2.0F.W1 2D /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTSD2SI r64, xmm2/m64","VCVTSD2SIQ xmm2/m64, r64","vcvtsd2siq xmm2/m64, r64","EVEX.LIG.F2.0F.W1 2D /r","N.S.","V","AVX512F","scale8","w,r","Y","64"
+"VCVTSD2SI r64, xmm2/m64","VCVTSD2SIQ xmm2/m64, r64","vcvtsd2siq xmm2/m64, r64","VEX.LIG.F2.0F.W1 2D /r","N.S.","V","AVX","","w,r","Y","64"
+"VCVTSD2SS xmm1{er}, {k}{z}, xmmV, xmm2","VCVTSD2SS xmm2, xmmV, {k}{z}, xmm1{er}","vcvtsd2ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5A /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VCVTSD2SS xmm1, xmmV, xmm2/m64","VCVTSD2SS xmm2/m64, xmmV, xmm1","vcvtsd2ss xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5A /r","V","V","AVX","","w,r,r","",""
+"VCVTSD2SS xmm1, {k}{z}, xmmV, xmm2/m64","VCVTSD2SS xmm2/m64, xmmV, {k}{z}, xmm1","vcvtsd2ss xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5A /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VCVTSD2USI r32{er}, xmm2","VCVTSD2USIL xmm2, r32{er}","vcvtsd2usi xmm2, r32{er}","EVEX.128.F2.0F.W0 79 /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTSD2USI r32, xmm2/m64","VCVTSD2USIL xmm2/m64, r32","vcvtsd2usi xmm2/m64, r32","EVEX.LIG.F2.0F.W0 79 /r","V","V","AVX512F","scale8","w,r","Y","32"
+"VCVTSD2USI r64{er}, xmm2","VCVTSD2USIQ xmm2, r64{er}","vcvtsd2usi xmm2, r64{er}","EVEX.128.F2.0F.W1 79 /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTSD2USI r64, xmm2/m64","VCVTSD2USIQ xmm2/m64, r64","vcvtsd2usi xmm2/m64, r64","EVEX.LIG.F2.0F.W1 79 /r","N.S.","V","AVX512F","scale8","w,r","Y","64"
+"VCVTSI2SD xmm1, xmmV, r/m32","VCVTSI2SDL r/m32, xmmV, xmm1","vcvtsi2sdl r/m32, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W0 2A /r","V","V","AVX512F","scale4","w,r,r","Y","32"
+"VCVTSI2SD xmm1, xmmV, r/m32","VCVTSI2SDL r/m32, xmmV, xmm1","vcvtsi2sdl r/m32, xmmV, xmm1","VEX.NDS.LIG.F2.0F.W0 2A /r","V","V","AVX","","w,r,r","Y","32"
+"VCVTSI2SD xmm1, xmmV, r/m64","VCVTSI2SDQ r/m64, xmmV, xmm1","vcvtsi2sdq r/m64, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W1 2A /r","N.S.","V","AVX512F","scale8","w,r,r","Y","64"
+"VCVTSI2SD xmm1, xmmV, r/m64","VCVTSI2SDQ r/m64, xmmV, xmm1","vcvtsi2sdq r/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.W1 2A /r","N.S.","V","AVX","","w,r,r","Y","64"
+"VCVTSI2SD xmm1{er}, xmmV, rmr64","VCVTSI2SDQ rmr64, xmmV, xmm1{er}","vcvtsi2sdq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F2.0F.W1 2A /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","Y","64"
+"VCVTSI2SS xmm1, xmmV, r/m32","VCVTSI2SSL r/m32, xmmV, xmm1","vcvtsi2ssl r/m32, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W0 2A /r","V","V","AVX512F","scale4","w,r,r","Y","32"
+"VCVTSI2SS xmm1, xmmV, r/m32","VCVTSI2SSL r/m32, xmmV, xmm1","vcvtsi2ssl r/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.W0 2A /r","V","V","AVX","","w,r,r","Y","32"
+"VCVTSI2SS xmm1{er}, xmmV, rmr32","VCVTSI2SSL rmr32, xmmV, xmm1{er}","vcvtsi2ssl rmr32, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W0 2A /r","V","V","AVX512F","modrm_regonly","w,r,r","Y","32"
+"VCVTSI2SS xmm1, xmmV, r/m64","VCVTSI2SSQ r/m64, xmmV, xmm1","vcvtsi2ssq r/m64, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W1 2A /r","N.S.","V","AVX512F","scale8","w,r,r","Y","64"
+"VCVTSI2SS xmm1, xmmV, r/m64","VCVTSI2SSQ r/m64, xmmV, xmm1","vcvtsi2ssq r/m64, xmmV, xmm1","VEX.NDS.LIG.F3.0F.W1 2A /r","N.S.","V","AVX","","w,r,r","Y","64"
+"VCVTSI2SS xmm1{er}, xmmV, rmr64","VCVTSI2SSQ rmr64, xmmV, xmm1{er}","vcvtsi2ssq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W1 2A /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","Y","64"
+"VCVTSS2SD xmm1{sae}, {k}{z}, xmmV, xmm2","VCVTSS2SD xmm2, xmmV, {k}{z}, xmm1{sae}","vcvtss2sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 5A /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VCVTSS2SD xmm1, xmmV, xmm2/m32","VCVTSS2SD xmm2/m32, xmmV, xmm1","vcvtss2sd xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5A /r","V","V","AVX","","w,r,r","",""
+"VCVTSS2SD xmm1, {k}{z}, xmmV, xmm2/m32","VCVTSS2SD xmm2/m32, xmmV, {k}{z}, xmm1","vcvtss2sd xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5A /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VCVTSS2SI r32{er}, xmm2","VCVTSS2SI xmm2, r32{er}","vcvtss2si xmm2, r32{er}","EVEX.128.F3.0F.W0 2D /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTSS2SI r32, xmm2/m32","VCVTSS2SI xmm2/m32, r32","vcvtss2si xmm2/m32, r32","EVEX.LIG.F3.0F.W0 2D /r","V","V","AVX512F","scale4","w,r","Y","32"
+"VCVTSS2SI r32, xmm2/m32","VCVTSS2SI xmm2/m32, r32","vcvtss2si xmm2/m32, r32","VEX.LIG.F3.0F.W0 2D /r","V","V","AVX","","w,r","Y","32"
+"VCVTSS2SI r64{er}, xmm2","VCVTSS2SIQ xmm2, r64{er}","vcvtss2siq xmm2, r64{er}","EVEX.128.F3.0F.W1 2D /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTSS2SI r64, xmm2/m32","VCVTSS2SIQ xmm2/m32, r64","vcvtss2siq xmm2/m32, r64","EVEX.LIG.F3.0F.W1 2D /r","N.S.","V","AVX512F","scale4","w,r","Y","64"
+"VCVTSS2SI r64, xmm2/m32","VCVTSS2SIQ xmm2/m32, r64","vcvtss2siq xmm2/m32, r64","VEX.LIG.F3.0F.W1 2D /r","N.S.","V","AVX","","w,r","Y","64"
+"VCVTSS2USI r32{er}, xmm2","VCVTSS2USIL xmm2, r32{er}","vcvtss2usil xmm2, r32{er}","EVEX.128.F3.0F.W0 79 /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTSS2USI r32, xmm2/m32","VCVTSS2USIL xmm2/m32, r32","vcvtss2usil xmm2/m32, r32","EVEX.LIG.F3.0F.W0 79 /r","V","V","AVX512F","scale4","w,r","Y","32"
+"VCVTSS2USI r64{er}, xmm2","VCVTSS2USIQ xmm2, r64{er}","vcvtss2usiq xmm2, r64{er}","EVEX.128.F3.0F.W1 79 /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTSS2USI r64, xmm2/m32","VCVTSS2USIQ xmm2/m32, r64","vcvtss2usiq xmm2/m32, r64","EVEX.LIG.F3.0F.W1 79 /r","N.S.","V","AVX512F","scale4","w,r","Y","64"
+"VCVTTPD2DQ ymm1{sae}, {k}{z}, zmm2","VCVTTPD2DQ zmm2, {k}{z}, ymm1{sae}","vcvttpd2dq zmm2, {k}{z}, ymm1{sae}","EVEX.512.66.0F.W1 E6 /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
+"VCVTTPD2DQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2DQ zmm2/m512/m64bcst, {k}{z}, ymm1","vcvttpd2dq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.66.0F.W1 E6 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
+"VCVTTPD2DQ xmm1, xmm2/m128","VCVTTPD2DQX xmm2/m128, xmm1","vcvttpd2dqx xmm2/m128, xmm1","VEX.128.66.0F.WIG E6 /r","V","V","AVX","","w,r","Y","128"
+"VCVTTPD2DQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2DQX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvttpd2dqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTTPD2DQ xmm1, ymm2/m256","VCVTTPD2DQY ymm2/m256, xmm1","vcvttpd2dqy ymm2/m256, xmm1","VEX.256.66.0F.WIG E6 /r","V","V","AVX","","w,r","Y","256"
+"VCVTTPD2DQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2DQY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvttpd2dqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.66.0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTTPD2QQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2QQ xmm2/m128/m64bcst, {k}{z}, xmm1","vcvttpd2qq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VCVTTPD2QQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2QQ ymm2/m256/m64bcst, {k}{z}, ymm1","vcvttpd2qq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VCVTTPD2QQ zmm1{sae}, {k}{z}, zmm2","VCVTTPD2QQ zmm2, {k}{z}, zmm1{sae}","vcvttpd2qq zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W1 7A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTTPD2QQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2QQ zmm2/m512/m64bcst, {k}{z}, zmm1","vcvttpd2qq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
+"VCVTTPD2UDQ ymm1{sae}, {k}{z}, zmm2","VCVTTPD2UDQ zmm2, {k}{z}, ymm1{sae}","vcvttpd2udq zmm2, {k}{z}, ymm1{sae}","EVEX.512.0F.W1 78 /r","V","V","AVX512F","modrm_regonly","w,r,r","Y",""
+"VCVTTPD2UDQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2UDQ zmm2/m512/m64bcst, {k}{z}, ymm1","vcvttpd2udq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0F.W1 78 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512"
+"VCVTTPD2UDQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2UDQX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvttpd2udqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.0F.W1 78 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTTPD2UDQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2UDQY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvttpd2udqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.0F.W1 78 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTTPD2UQQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2UQQ xmm2/m128/m64bcst, {k}{z}, xmm1","vcvttpd2uqq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 78 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VCVTTPD2UQQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2UQQ ymm2/m256/m64bcst, {k}{z}, ymm1","vcvttpd2uqq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 78 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VCVTTPD2UQQ zmm1{sae}, {k}{z}, zmm2","VCVTTPD2UQQ zmm2, {k}{z}, zmm1{sae}","vcvttpd2uqq zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W1 78 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTTPD2UQQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2UQQ zmm2/m512/m64bcst, {k}{z}, zmm1","vcvttpd2uqq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 78 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
+"VCVTTPS2DQ xmm1, xmm2/m128","VCVTTPS2DQ xmm2/m128, xmm1","vcvttps2dq xmm2/m128, xmm1","VEX.128.F3.0F.WIG 5B /r","V","V","AVX","","w,r","",""
+"VCVTTPS2DQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2DQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvttps2dq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTTPS2DQ ymm1, ymm2/m256","VCVTTPS2DQ ymm2/m256, ymm1","vcvttps2dq ymm2/m256, ymm1","VEX.256.F3.0F.WIG 5B /r","V","V","AVX","","w,r","",""
+"VCVTTPS2DQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTTPS2DQ ymm2/m256/m32bcst, {k}{z}, ymm1","vcvttps2dq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VCVTTPS2DQ zmm1{sae}, {k}{z}, zmm2","VCVTTPS2DQ zmm2, {k}{z}, zmm1{sae}","vcvttps2dq zmm2, {k}{z}, zmm1{sae}","EVEX.512.F3.0F.W0 5B /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTTPS2DQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTTPS2DQ zmm2/m512/m32bcst, {k}{z}, zmm1","vcvttps2dq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W0 5B /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VCVTTPS2QQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2QQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvttps2qq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 7A /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTTPS2QQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTTPS2QQ xmm2/m256/m32bcst, {k}{z}, ymm1","vcvttps2qq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 7A /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTTPS2QQ zmm1{sae}, {k}{z}, ymm2","VCVTTPS2QQ ymm2, {k}{z}, zmm1{sae}","vcvttps2qq ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W0 7A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTTPS2QQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTTPS2QQ ymm2/m512/m32bcst, {k}{z}, zmm1","vcvttps2qq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 7A /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","",""
+"VCVTTPS2UDQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2UDQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvttps2udq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 78 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTTPS2UDQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTTPS2UDQ ymm2/m256/m32bcst, {k}{z}, ymm1","vcvttps2udq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 78 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VCVTTPS2UDQ zmm1{sae}, {k}{z}, zmm2","VCVTTPS2UDQ zmm2, {k}{z}, zmm1{sae}","vcvttps2udq zmm2, {k}{z}, zmm1{sae}","EVEX.512.0F.W0 78 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTTPS2UDQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTTPS2UDQ zmm2/m512/m32bcst, {k}{z}, zmm1","vcvttps2udq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 78 /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VCVTTPS2UQQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2UQQ xmm2/m128/m32bcst, {k}{z}, xmm1","vcvttps2uqq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 78 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTTPS2UQQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTTPS2UQQ xmm2/m256/m32bcst, {k}{z}, ymm1","vcvttps2uqq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 78 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTTPS2UQQ zmm1{sae}, {k}{z}, ymm2","VCVTTPS2UQQ ymm2, {k}{z}, zmm1{sae}","vcvttps2uqq ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W0 78 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTTPS2UQQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTTPS2UQQ ymm2/m512/m32bcst, {k}{z}, zmm1","vcvttps2uqq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 78 /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","",""
+"VCVTTSD2SI r32{sae}, xmm2","VCVTTSD2SI xmm2, r32{sae}","vcvttsd2si xmm2, r32{sae}","EVEX.128.F2.0F.W0 2C /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTTSD2SI r32, xmm2/m64","VCVTTSD2SI xmm2/m64, r32","vcvttsd2si xmm2/m64, r32","EVEX.LIG.F2.0F.W0 2C /r","V","V","AVX512F","scale8","w,r","Y","32"
+"VCVTTSD2SI r32, xmm2/m64","VCVTTSD2SI xmm2/m64, r32","vcvttsd2si xmm2/m64, r32","VEX.LIG.F2.0F.W0 2C /r","V","V","AVX","","w,r","Y","32"
+"VCVTTSD2SI r64{sae}, xmm2","VCVTTSD2SIQ xmm2, r64{sae}","vcvttsd2siq xmm2, r64{sae}","EVEX.128.F2.0F.W1 2C /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTTSD2SI r64, xmm2/m64","VCVTTSD2SIQ xmm2/m64, r64","vcvttsd2siq xmm2/m64, r64","EVEX.LIG.F2.0F.W1 2C /r","N.S.","V","AVX512F","scale8","w,r","Y","64"
+"VCVTTSD2SI r64, xmm2/m64","VCVTTSD2SIQ xmm2/m64, r64","vcvttsd2siq xmm2/m64, r64","VEX.LIG.F2.0F.W1 2C /r","N.S.","V","AVX","","w,r","Y","64"
+"VCVTTSD2USI r32{sae}, xmm2","VCVTTSD2USIL xmm2, r32{sae}","vcvttsd2usil xmm2, r32{sae}","EVEX.128.F2.0F.W0 78 /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTTSD2USI r32, xmm2/m64","VCVTTSD2USIL xmm2/m64, r32","vcvttsd2usil xmm2/m64, r32","EVEX.LIG.F2.0F.W0 78 /r","V","V","AVX512F","scale8","w,r","Y","32"
+"VCVTTSD2USI r64{sae}, xmm2","VCVTTSD2USIQ xmm2, r64{sae}","vcvttsd2usiq xmm2, r64{sae}","EVEX.128.F2.0F.W1 78 /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTTSD2USI r64, xmm2/m64","VCVTTSD2USIQ xmm2/m64, r64","vcvttsd2usiq xmm2/m64, r64","EVEX.LIG.F2.0F.W1 78 /r","N.S.","V","AVX512F","scale8","w,r","Y","64"
+"VCVTTSS2SI r32{sae}, xmm2","VCVTTSS2SI xmm2, r32{sae}","vcvttss2si xmm2, r32{sae}","EVEX.128.F3.0F.W0 2C /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTTSS2SI r32, xmm2/m32","VCVTTSS2SI xmm2/m32, r32","vcvttss2si xmm2/m32, r32","EVEX.LIG.F3.0F.W0 2C /r","V","V","AVX512F","scale4","w,r","Y","32"
+"VCVTTSS2SI r32, xmm2/m32","VCVTTSS2SI xmm2/m32, r32","vcvttss2si xmm2/m32, r32","VEX.LIG.F3.0F.W0 2C /r","V","V","AVX","","w,r","Y","32"
+"VCVTTSS2SI r64{sae}, xmm2","VCVTTSS2SIQ xmm2, r64{sae}","vcvttss2siq xmm2, r64{sae}","EVEX.128.F3.0F.W1 2C /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTTSS2SI r64, xmm2/m32","VCVTTSS2SIQ xmm2/m32, r64","vcvttss2siq xmm2/m32, r64","EVEX.LIG.F3.0F.W1 2C /r","N.S.","V","AVX512F","scale4","w,r","Y","64"
+"VCVTTSS2SI r64, xmm2/m32","VCVTTSS2SIQ xmm2/m32, r64","vcvttss2siq xmm2/m32, r64","VEX.LIG.F3.0F.W1 2C /r","N.S.","V","AVX","","w,r","Y","64"
+"VCVTTSS2USI r32{sae}, xmm2","VCVTTSS2USIL xmm2, r32{sae}","vcvttss2usil xmm2, r32{sae}","EVEX.128.F3.0F.W0 78 /r","V","V","AVX512F","modrm_regonly","w,r","Y","32"
+"VCVTTSS2USI r32, xmm2/m32","VCVTTSS2USIL xmm2/m32, r32","vcvttss2usil xmm2/m32, r32","EVEX.LIG.F3.0F.W0 78 /r","V","V","AVX512F","scale4","w,r","Y","32"
+"VCVTTSS2USI r64{sae}, xmm2","VCVTTSS2USIQ xmm2, r64{sae}","vcvttss2usiq xmm2, r64{sae}","EVEX.128.F3.0F.W1 78 /r","N.S.","V","AVX512F","modrm_regonly","w,r","Y","64"
+"VCVTTSS2USI r64, xmm2/m32","VCVTTSS2USIQ xmm2/m32, r64","vcvttss2usiq xmm2/m32, r64","EVEX.LIG.F3.0F.W1 78 /r","N.S.","V","AVX512F","scale4","w,r","Y","64"
+"VCVTUDQ2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTUDQ2PD xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtudq2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","",""
+"VCVTUDQ2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTUDQ2PD xmm2/m256/m32bcst, {k}{z}, ymm1","vcvtudq2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTUDQ2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTUDQ2PD ymm2/m512/m32bcst, {k}{z}, zmm1","vcvtudq2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W0 7A /r","V","V","AVX512F","bscale4,scale32","w,r,r","",""
+"VCVTUDQ2PS xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTUDQ2PS xmm2/m128/m32bcst, {k}{z}, xmm1","vcvtudq2ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F2.0F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VCVTUDQ2PS ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTUDQ2PS ymm2/m256/m32bcst, {k}{z}, ymm1","vcvtudq2ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F2.0F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VCVTUDQ2PS zmm1{er}, {k}{z}, zmm2","VCVTUDQ2PS zmm2, {k}{z}, zmm1{er}","vcvtudq2ps zmm2, {k}{z}, zmm1{er}","EVEX.512.F2.0F.W0 7A /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VCVTUDQ2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTUDQ2PS zmm2/m512/m32bcst, {k}{z}, zmm1","vcvtudq2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F2.0F.W0 7A /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VCVTUQQ2PD xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTUQQ2PD xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtuqq2pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VCVTUQQ2PD ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTUQQ2PD ymm2/m256/m64bcst, {k}{z}, ymm1","vcvtuqq2pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VCVTUQQ2PD zmm1{er}, {k}{z}, zmm2","VCVTUQQ2PD zmm2, {k}{z}, zmm1{er}","vcvtuqq2pd zmm2, {k}{z}, zmm1{er}","EVEX.512.F3.0F.W1 7A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","",""
+"VCVTUQQ2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTUQQ2PD zmm2/m512/m64bcst, {k}{z}, zmm1","vcvtuqq2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","",""
+"VCVTUQQ2PS ymm1{er}, {k}{z}, zmm2","VCVTUQQ2PS zmm2, {k}{z}, ymm1{er}","vcvtuqq2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.F2.0F.W1 7A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","Y",""
+"VCVTUQQ2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTUQQ2PS zmm2/m512/m64bcst, {k}{z}, ymm1","vcvtuqq2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.F2.0F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","Y","512"
+"VCVTUQQ2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTUQQ2PSX xmm2/m128/m64bcst, {k}{z}, xmm1","vcvtuqq2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F2.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","Y","128"
+"VCVTUQQ2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTUQQ2PSY ymm2/m256/m64bcst, {k}{z}, xmm1","vcvtuqq2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.F2.0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","Y","256"
+"VCVTUSI2SD xmm1, xmmV, r/m32","VCVTUSI2SDL r/m32, xmmV, xmm1","vcvtusi2sd r/m32, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W0 7B /r","V","V","AVX512F","scale4","w,r,r","Y","32"
+"VCVTUSI2SD xmm1, xmmV, r/m64","VCVTUSI2SDQ r/m64, xmmV, xmm1","vcvtusi2sd r/m64, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W1 7B /r","N.S.","V","AVX512F","scale8","w,r,r","Y","64"
+"VCVTUSI2SD xmm1{er}, xmmV, rmr64","VCVTUSI2SDQ rmr64, xmmV, xmm1{er}","vcvtusi2sd rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F2.0F.W1 7B /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","Y","64"
+"VCVTUSI2SS xmm1, xmmV, r/m32","VCVTUSI2SSL r/m32, xmmV, xmm1","vcvtusi2ssl r/m32, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W0 7B /r","V","V","AVX512F","scale4","w,r,r","Y","32"
+"VCVTUSI2SS xmm1{er}, xmmV, rmr32","VCVTUSI2SSL rmr32, xmmV, xmm1{er}","vcvtusi2ssl rmr32, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W0 7B /r","V","V","AVX512F","modrm_regonly","w,r,r","Y","32"
+"VCVTUSI2SS xmm1, xmmV, r/m64","VCVTUSI2SSQ r/m64, xmmV, xmm1","vcvtusi2ssq r/m64, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W1 7B /r","N.S.","V","AVX512F","scale8","w,r,r","Y","64"
+"VCVTUSI2SS xmm1{er}, xmmV, rmr64","VCVTUSI2SSQ rmr64, xmmV, xmm1{er}","vcvtusi2ssq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W1 7B /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","Y","64"
+"VDBPSADBW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VDBPSADBW imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","vdbpsadbw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 42 /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
+"VDBPSADBW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VDBPSADBW imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","vdbpsadbw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 42 /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
+"VDBPSADBW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VDBPSADBW imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","vdbpsadbw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 42 /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
+"VDIVPD xmm1, xmmV, xmm2/m128","VDIVPD xmm2/m128, xmmV, xmm1","vdivpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
+"VDIVPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VDIVPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vdivpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 5E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VDIVPD ymm1, ymmV, ymm2/m256","VDIVPD ymm2/m256, ymmV, ymm1","vdivpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
+"VDIVPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VDIVPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vdivpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 5E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VDIVPD zmm1{er}, {k}{z}, zmmV, zmm2","VDIVPD zmm2, zmmV, {k}{z}, zmm1{er}","vdivpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 5E /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VDIVPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VDIVPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vdivpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 5E /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VDIVPS xmm1, xmmV, xmm2/m128","VDIVPS xmm2/m128, xmmV, xmm1","vdivps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
+"VDIVPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VDIVPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vdivps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 5E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VDIVPS ymm1, ymmV, ymm2/m256","VDIVPS ymm2/m256, ymmV, ymm1","vdivps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
+"VDIVPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VDIVPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vdivps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 5E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VDIVPS zmm1{er}, {k}{z}, zmmV, zmm2","VDIVPS zmm2, zmmV, {k}{z}, zmm1{er}","vdivps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 5E /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VDIVPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VDIVPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vdivps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 5E /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VDIVSD xmm1{er}, {k}{z}, xmmV, xmm2","VDIVSD xmm2, xmmV, {k}{z}, xmm1{er}","vdivsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5E /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VDIVSD xmm1, xmmV, xmm2/m64","VDIVSD xmm2/m64, xmmV, xmm1","vdivsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
+"VDIVSD xmm1, {k}{z}, xmmV, xmm2/m64","VDIVSD xmm2/m64, xmmV, {k}{z}, xmm1","vdivsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5E /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VDIVSS xmm1{er}, {k}{z}, xmmV, xmm2","VDIVSS xmm2, xmmV, {k}{z}, xmm1{er}","vdivss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 5E /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VDIVSS xmm1, xmmV, xmm2/m32","VDIVSS xmm2/m32, xmmV, xmm1","vdivss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5E /r","V","V","AVX","","w,r,r","",""
+"VDIVSS xmm1, {k}{z}, xmmV, xmm2/m32","VDIVSS xmm2/m32, xmmV, {k}{z}, xmm1","vdivss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5E /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VDPPD xmm1, xmmV, xmm2/m128, imm8u","VDPPD imm8u, xmm2/m128, xmmV, xmm1","vdppd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 41 /r ib","V","V","AVX","","w,r,r,r","",""
+"VDPPS xmm1, xmmV, xmm2/m128, imm8u","VDPPS imm8u, xmm2/m128, xmmV, xmm1","vdpps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 40 /r ib","V","V","AVX","","w,r,r,r","",""
+"VDPPS ymm1, ymmV, ymm2/m256, imm8u","VDPPS imm8u, ymm2/m256, ymmV, ymm1","vdpps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 40 /r ib","V","V","AVX","","w,r,r,r","",""
+"VERR r/m16","VERR r/m16","verr r/m16","0F 00 /4","V","V","","","r","",""
+"VERW r/m16","VERW r/m16","verw r/m16","0F 00 /5","V","V","","","r","",""
+"VEXP2PD zmm1{sae}, {k}{z}, zmm2","VEXP2PD zmm2, {k}{z}, zmm1{sae}","vexp2pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 C8 /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
+"VEXP2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VEXP2PD zmm2/m512/m64bcst, {k}{z}, zmm1","vexp2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 C8 /r","V","V","AVX512ER","bscale8,scale64","w,r,r","",""
+"VEXP2PS zmm1{sae}, {k}{z}, zmm2","VEXP2PS zmm2, {k}{z}, zmm1{sae}","vexp2ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 C8 /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
+"VEXP2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VEXP2PS zmm2/m512/m32bcst, {k}{z}, zmm1","vexp2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 C8 /r","V","V","AVX512ER","bscale4,scale64","w,r,r","",""
+"VEXPANDPD xmm1, {k}{z}, xmm2/m128","VEXPANDPD xmm2/m128, {k}{z}, xmm1","vexpandpd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 88 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VEXPANDPD ymm1, {k}{z}, ymm2/m256","VEXPANDPD ymm2/m256, {k}{z}, ymm1","vexpandpd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 88 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VEXPANDPD zmm1, {k}{z}, zmm2/m512","VEXPANDPD zmm2/m512, {k}{z}, zmm1","vexpandpd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 88 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VEXPANDPS xmm1, {k}{z}, xmm2/m128","VEXPANDPS xmm2/m128, {k}{z}, xmm1","vexpandps xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 88 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VEXPANDPS ymm1, {k}{z}, ymm2/m256","VEXPANDPS ymm2/m256, {k}{z}, ymm1","vexpandps ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 88 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VEXPANDPS zmm1, {k}{z}, zmm2/m512","VEXPANDPS zmm2/m512, {k}{z}, zmm1","vexpandps zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 88 /r","V","V","AVX512F","scale4","w,r,r","",""
+"VEXTRACTF128 xmm2/m128, ymm1, imm8u:1","VEXTRACTF128 imm8u:1, ymm1, xmm2/m128","vextractf128 imm8u:1, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 19 /r ib","V","V","AVX","","w,r,r","",""
+"VEXTRACTF32X4 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTF32X4 imm8u:1, ymm1, {k}{z}, xmm2/m128","vextractf32x4 imm8u:1, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W0 19 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VEXTRACTF32X4 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTF32X4 imm8u:2, zmm1, {k}{z}, xmm2/m128","vextractf32x4 imm8u:2, zmm1, {k}{z}, xmm2/m128","EVEX.512.66.0F3A.W0 19 /r ib","V","V","AVX512F","scale16","w,r,r,r","",""
+"VEXTRACTF32X8 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTF32X8 imm8u:1, zmm1, {k}{z}, ymm2/m256","vextractf32x8 imm8u:1, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W0 1B /r ib","V","V","AVX512DQ","scale32","w,r,r,r","",""
+"VEXTRACTF64X2 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTF64X2 imm8u:1, ymm1, {k}{z}, xmm2/m128","vextractf64x2 imm8u:1, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W1 19 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r","",""
+"VEXTRACTF64X2 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTF64X2 imm8u:2, zmm1, {k}{z}, xmm2/m128","vextractf64x2 imm8u:2, zmm1, {k}{z}, xmm2/m128","EVEX.512.66.0F3A.W1 19 /r ib","V","V","AVX512DQ","scale16","w,r,r,r","",""
+"VEXTRACTF64X4 ymm2/m256, {k}{z}, zmm1, imm8u","VEXTRACTF64X4 imm8u, zmm1, {k}{z}, ymm2/m256","vextractf64x4 imm8u, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W1 1B /r ib","V","V","AVX512F","scale32","w,r,r,r","",""
+"VEXTRACTI128 xmm2/m128, ymm1, imm8u:1","VEXTRACTI128 imm8u:1, ymm1, xmm2/m128","vextracti128 imm8u:1, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 39 /r ib","V","V","AVX2","","w,r,r","",""
+"VEXTRACTI32X4 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTI32X4 imm8u:1, ymm1, {k}{z}, xmm2/m128","vextracti32x4 imm8u:1, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W0 39 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VEXTRACTI32X4 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTI32X4 imm8u:2, zmm1, {k}{z}, xmm2/m128","vextracti32x4 imm8u:2, zmm1, {k}{z}, xmm2/m128","EVEX.512.66.0F3A.W0 39 /r ib","V","V","AVX512F","scale16","w,r,r,r","",""
+"VEXTRACTI32X8 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTI32X8 imm8u:1, zmm1, {k}{z}, ymm2/m256","vextracti32x8 imm8u:1, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W0 3B /r ib","V","V","AVX512DQ","scale32","w,r,r,r","",""
+"VEXTRACTI64X2 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTI64X2 imm8u:1, ymm1, {k}{z}, xmm2/m128","vextracti64x2 imm8u:1, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W1 39 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r","",""
+"VEXTRACTI64X2 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTI64X2 imm8u:2, zmm1, {k}{z}, xmm2/m128","vextracti64x2 imm8u:2, zmm1, {k}{z}, xmm2/m128","EVEX.512.66.0F3A.W1 39 /r ib","V","V","AVX512DQ","scale16","w,r,r,r","",""
+"VEXTRACTI64X4 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTI64X4 imm8u:1, zmm1, {k}{z}, ymm2/m256","vextracti64x4 imm8u:1, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W1 3B /r ib","V","V","AVX512F","scale32","w,r,r,r","",""
+"VEXTRACTPS r/m32, xmm1, imm8u:2","VEXTRACTPS imm8u:2, xmm1, r/m32","vextractps imm8u:2, xmm1, r/m32","EVEX.128.66.0F3A.WIG 17 /r ib","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VEXTRACTPS r/m32, xmm1, imm8u:2","VEXTRACTPS imm8u:2, xmm1, r/m32","vextractps imm8u:2, xmm1, r/m32","VEX.128.66.0F3A.WIG 17 /r ib","V","V","AVX","","w,r,r","",""
+"VFIXUPIMMPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VFIXUPIMMPD imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfixupimmpd imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W1 54 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r,r","",""
+"VFIXUPIMMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VFIXUPIMMPD imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfixupimmpd imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W1 54 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r,r","",""
+"VFIXUPIMMPD zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u","VFIXUPIMMPD imm8u, zmm2, zmmV, {k}{z}, zmm1{sae}","vfixupimmpd imm8u, zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.DDS.512.66.0F3A.W1 54 /r ib","V","V","AVX512F","modrm_regonly","rw,r,r,r,r","",""
+"VFIXUPIMMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VFIXUPIMMPD imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfixupimmpd imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W1 54 /r ib","V","V","AVX512F","bscale8,scale64","rw,r,r,r,r","",""
+"VFIXUPIMMPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VFIXUPIMMPS imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfixupimmps imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W0 54 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r,r","",""
+"VFIXUPIMMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VFIXUPIMMPS imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfixupimmps imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W0 54 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r,r","",""
+"VFIXUPIMMPS zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u","VFIXUPIMMPS imm8u, zmm2, zmmV, {k}{z}, zmm1{sae}","vfixupimmps imm8u, zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.DDS.512.66.0F3A.W0 54 /r ib","V","V","AVX512F","modrm_regonly","rw,r,r,r,r","",""
+"VFIXUPIMMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VFIXUPIMMPS imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfixupimmps imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W0 54 /r ib","V","V","AVX512F","bscale4,scale64","rw,r,r,r,r","",""
+"VFIXUPIMMSD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VFIXUPIMMSD imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vfixupimmsd imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.DDS.128.66.0F3A.W1 55 /r ib","V","V","AVX512F","modrm_regonly","rw,r,r,r,r","",""
+"VFIXUPIMMSD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VFIXUPIMMSD imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","vfixupimmsd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3A.W1 55 /r ib","V","V","AVX512F","scale8","rw,r,r,r,r","",""
+"VFIXUPIMMSS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VFIXUPIMMSS imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vfixupimmss imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.DDS.128.66.0F3A.W0 55 /r ib","V","V","AVX512F","modrm_regonly","rw,r,r,r,r","",""
+"VFIXUPIMMSS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VFIXUPIMMSS imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","vfixupimmss imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3A.W0 55 /r ib","V","V","AVX512F","scale4","rw,r,r,r,r","",""
+"VFMADD132PD xmm1, xmmV, xmm2/m128","VFMADD132PD xmm2/m128, xmmV, xmm1","vfmadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 98 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmadd132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 98 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMADD132PD ymm1, ymmV, ymm2/m256","VFMADD132PD ymm2/m256, ymmV, ymm1","vfmadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 98 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmadd132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 98 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 98 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmadd132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 98 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMADD132PS xmm1, xmmV, xmm2/m128","VFMADD132PS xmm2/m128, xmmV, xmm1","vfmadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 98 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmadd132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 98 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMADD132PS ymm1, ymmV, ymm2/m256","VFMADD132PS ymm2/m256, ymmV, ymm1","vfmadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 98 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmadd132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 98 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 98 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmadd132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 98 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMADD132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD132SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 99 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD132SD xmm1, xmmV, xmm2/m64","VFMADD132SD xmm2/m64, xmmV, xmm1","vfmadd132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 99 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD132SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmadd132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 99 /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFMADD132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD132SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 99 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD132SS xmm1, xmmV, xmm2/m32","VFMADD132SS xmm2/m32, xmmV, xmm1","vfmadd132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 99 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD132SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmadd132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 99 /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFMADD213PD xmm1, xmmV, xmm2/m128","VFMADD213PD xmm2/m128, xmmV, xmm1","vfmadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmadd213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 A8 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMADD213PD ymm1, ymmV, ymm2/m256","VFMADD213PD ymm2/m256, ymmV, ymm1","vfmadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmadd213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 A8 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 A8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmadd213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 A8 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMADD213PS xmm1, xmmV, xmm2/m128","VFMADD213PS xmm2/m128, xmmV, xmm1","vfmadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmadd213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 A8 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMADD213PS ymm1, ymmV, ymm2/m256","VFMADD213PS ymm2/m256, ymmV, ymm1","vfmadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmadd213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 A8 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 A8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmadd213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 A8 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMADD213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD213SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 A9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD213SD xmm1, xmmV, xmm2/m64","VFMADD213SD xmm2/m64, xmmV, xmm1","vfmadd213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 A9 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD213SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmadd213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 A9 /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFMADD213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD213SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 A9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD213SS xmm1, xmmV, xmm2/m32","VFMADD213SS xmm2/m32, xmmV, xmm1","vfmadd213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 A9 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD213SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmadd213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 A9 /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFMADD231PD xmm1, xmmV, xmm2/m128","VFMADD231PD xmm2/m128, xmmV, xmm1","vfmadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmadd231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B8 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMADD231PD ymm1, ymmV, ymm2/m256","VFMADD231PD ymm2/m256, ymmV, ymm1","vfmadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmadd231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B8 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 B8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmadd231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B8 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMADD231PS xmm1, xmmV, xmm2/m128","VFMADD231PS xmm2/m128, xmmV, xmm1","vfmadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmadd231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 B8 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMADD231PS ymm1, ymmV, ymm2/m256","VFMADD231PS ymm2/m256, ymmV, ymm1","vfmadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B8 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmadd231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 B8 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmadd231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 B8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmadd231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 B8 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMADD231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD231SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 B9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD231SD xmm1, xmmV, xmm2/m64","VFMADD231SD xmm2/m64, xmmV, xmm1","vfmadd231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 B9 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD231SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmadd231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 B9 /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFMADD231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD231SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmadd231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 B9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADD231SS xmm1, xmmV, xmm2/m32","VFMADD231SS xmm2/m32, xmmV, xmm1","vfmadd231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 B9 /r","V","V","FMA","","rw,r,r","",""
+"VFMADD231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD231SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmadd231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 B9 /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFMADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFMADDPD xmm2/m128, xmmIH, xmmV, xmm1","vfmaddpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 69 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFMADDPD xmmIH, xmm2/m128, xmmV, xmm1","vfmaddpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 69 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFMADDPD ymm2/m256, ymmIH, ymmV, ymm1","vfmaddpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 69 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFMADDPD ymmIH, ymm2/m256, ymmV, ymm1","vfmaddpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 69 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFMADDPS xmm2/m128, xmmIH, xmmV, xmm1","vfmaddps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 68 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFMADDPS xmmIH, xmm2/m128, xmmV, xmm1","vfmaddps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 68 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFMADDPS ymm2/m256, ymmIH, ymmV, ymm1","vfmaddps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 68 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFMADDPS ymmIH, ymm2/m256, ymmV, ymm1","vfmaddps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 68 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSD xmm1, xmmV, xmmIH, xmm2/m64","VFMADDSD xmm2/m64, xmmIH, xmmV, xmm1","vfmaddsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6B /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSD xmm1, xmmV, xmm2/m64, xmmIH","VFMADDSD xmmIH, xmm2/m64, xmmV, xmm1","vfmaddsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6B /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSS xmm1, xmmV, xmmIH, xmm2/m32","VFMADDSS xmm2/m32, xmmIH, xmmV, xmm1","vfmaddss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6A /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSS xmm1, xmmV, xmm2/m32, xmmIH","VFMADDSS xmmIH, xmm2/m32, xmmV, xmm1","vfmaddss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6A /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUB132PD xmm1, xmmV, xmm2/m128","VFMADDSUB132PD xmm2/m128, xmmV, xmm1","vfmaddsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 96 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 96 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMADDSUB132PD ymm1, ymmV, ymm2/m256","VFMADDSUB132PD ymm2/m256, ymmV, ymm1","vfmaddsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 96 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 96 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMADDSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 96 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADDSUB132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 96 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMADDSUB132PS xmm1, xmmV, xmm2/m128","VFMADDSUB132PS xmm2/m128, xmmV, xmm1","vfmaddsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 96 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 96 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMADDSUB132PS ymm1, ymmV, ymm2/m256","VFMADDSUB132PS ymm2/m256, ymmV, ymm1","vfmaddsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 96 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 96 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMADDSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 96 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADDSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 96 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMADDSUB213PD xmm1, xmmV, xmm2/m128","VFMADDSUB213PD xmm2/m128, xmmV, xmm1","vfmaddsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 A6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMADDSUB213PD ymm1, ymmV, ymm2/m256","VFMADDSUB213PD ymm2/m256, ymmV, ymm1","vfmaddsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 A6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMADDSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 A6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADDSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 A6 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMADDSUB213PS xmm1, xmmV, xmm2/m128","VFMADDSUB213PS xmm2/m128, xmmV, xmm1","vfmaddsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 A6 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMADDSUB213PS ymm1, ymmV, ymm2/m256","VFMADDSUB213PS ymm2/m256, ymmV, ymm1","vfmaddsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 A6 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMADDSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 A6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADDSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 A6 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMADDSUB231PD xmm1, xmmV, xmm2/m128","VFMADDSUB231PD xmm2/m128, xmmV, xmm1","vfmaddsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMADDSUB231PD ymm1, ymmV, ymm2/m256","VFMADDSUB231PD ymm2/m256, ymmV, ymm1","vfmaddsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMADDSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 B6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADDSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B6 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMADDSUB231PS xmm1, xmmV, xmm2/m128","VFMADDSUB231PS xmm2/m128, xmmV, xmm1","vfmaddsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 B6 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMADDSUB231PS ymm1, ymmV, ymm2/m256","VFMADDSUB231PS ymm2/m256, ymmV, ymm1","vfmaddsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B6 /r","V","V","FMA","","rw,r,r","",""
+"VFMADDSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 B6 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMADDSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmaddsub231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 B6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMADDSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 B6 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMADDSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFMADDSUBPD xmm2/m128, xmmIH, xmmV, xmm1","vfmaddsubpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 5D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFMADDSUBPD xmmIH, xmm2/m128, xmmV, xmm1","vfmaddsubpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 5D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFMADDSUBPD ymm2/m256, ymmIH, ymmV, ymm1","vfmaddsubpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 5D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFMADDSUBPD ymmIH, ymm2/m256, ymmV, ymm1","vfmaddsubpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 5D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFMADDSUBPS xmm2/m128, xmmIH, xmmV, xmm1","vfmaddsubps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 5C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFMADDSUBPS xmmIH, xmm2/m128, xmmV, xmm1","vfmaddsubps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 5C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFMADDSUBPS ymm2/m256, ymmIH, ymmV, ymm1","vfmaddsubps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 5C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMADDSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFMADDSUBPS ymmIH, ymm2/m256, ymmV, ymm1","vfmaddsubps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 5C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUB132PD xmm1, xmmV, xmm2/m128","VFMSUB132PD xmm2/m128, xmmV, xmm1","vfmsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9A /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsub132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 9A /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMSUB132PD ymm1, ymmV, ymm2/m256","VFMSUB132PD ymm2/m256, ymmV, ymm1","vfmsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9A /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsub132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 9A /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmsub132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 9A /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsub132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 9A /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMSUB132PS xmm1, xmmV, xmm2/m128","VFMSUB132PS xmm2/m128, xmmV, xmm1","vfmsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9A /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsub132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 9A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMSUB132PS ymm1, ymmV, ymm2/m256","VFMSUB132PS ymm2/m256, ymmV, ymm1","vfmsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9A /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsub132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 9A /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmsub132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 9A /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsub132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 9A /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMSUB132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB132SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmsub132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 9B /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB132SD xmm1, xmmV, xmm2/m64","VFMSUB132SD xmm2/m64, xmmV, xmm1","vfmsub132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9B /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB132SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmsub132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 9B /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFMSUB132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB132SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmsub132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 9B /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB132SS xmm1, xmmV, xmm2/m32","VFMSUB132SS xmm2/m32, xmmV, xmm1","vfmsub132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9B /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB132SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmsub132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 9B /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFMSUB213PD xmm1, xmmV, xmm2/m128","VFMSUB213PD xmm2/m128, xmmV, xmm1","vfmsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsub213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 AA /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMSUB213PD ymm1, ymmV, ymm2/m256","VFMSUB213PD ymm2/m256, ymmV, ymm1","vfmsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsub213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 AA /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmsub213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 AA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsub213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 AA /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMSUB213PS xmm1, xmmV, xmm2/m128","VFMSUB213PS xmm2/m128, xmmV, xmm1","vfmsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsub213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 AA /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMSUB213PS ymm1, ymmV, ymm2/m256","VFMSUB213PS ymm2/m256, ymmV, ymm1","vfmsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsub213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 AA /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmsub213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 AA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsub213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 AA /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMSUB213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB213SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmsub213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 AB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB213SD xmm1, xmmV, xmm2/m64","VFMSUB213SD xmm2/m64, xmmV, xmm1","vfmsub213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AB /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB213SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmsub213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 AB /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFMSUB213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB213SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmsub213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 AB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB213SS xmm1, xmmV, xmm2/m32","VFMSUB213SS xmm2/m32, xmmV, xmm1","vfmsub213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AB /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB213SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmsub213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 AB /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFMSUB231PD xmm1, xmmV, xmm2/m128","VFMSUB231PD xmm2/m128, xmmV, xmm1","vfmsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsub231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 BA /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMSUB231PD ymm1, ymmV, ymm2/m256","VFMSUB231PD ymm2/m256, ymmV, ymm1","vfmsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsub231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 BA /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmsub231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 BA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsub231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 BA /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMSUB231PS xmm1, xmmV, xmm2/m128","VFMSUB231PS xmm2/m128, xmmV, xmm1","vfmsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsub231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 BA /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMSUB231PS ymm1, ymmV, ymm2/m256","VFMSUB231PS ymm2/m256, ymmV, ymm1","vfmsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BA /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsub231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 BA /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmsub231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 BA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsub231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 BA /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMSUB231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB231SD xmm2, xmmV, {k}{z}, xmm1{er}","vfmsub231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 BB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB231SD xmm1, xmmV, xmm2/m64","VFMSUB231SD xmm2/m64, xmmV, xmm1","vfmsub231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BB /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB231SD xmm2/m64, xmmV, {k}{z}, xmm1","vfmsub231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 BB /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFMSUB231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB231SS xmm2, xmmV, {k}{z}, xmm1{er}","vfmsub231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 BB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUB231SS xmm1, xmmV, xmm2/m32","VFMSUB231SS xmm2/m32, xmmV, xmm1","vfmsub231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BB /r","V","V","FMA","","rw,r,r","",""
+"VFMSUB231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB231SS xmm2/m32, xmmV, {k}{z}, xmm1","vfmsub231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 BB /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFMSUBADD132PD xmm1, xmmV, xmm2/m128","VFMSUBADD132PD xmm2/m128, xmmV, xmm1","vfmsubadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 97 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUBADD132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsubadd132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 97 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMSUBADD132PD ymm1, ymmV, ymm2/m256","VFMSUBADD132PD ymm2/m256, ymmV, ymm1","vfmsubadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 97 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUBADD132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsubadd132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 97 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMSUBADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmsubadd132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 97 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUBADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUBADD132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsubadd132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 97 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMSUBADD132PS xmm1, xmmV, xmm2/m128","VFMSUBADD132PS xmm2/m128, xmmV, xmm1","vfmsubadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 97 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUBADD132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsubadd132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 97 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMSUBADD132PS ymm1, ymmV, ymm2/m256","VFMSUBADD132PS ymm2/m256, ymmV, ymm1","vfmsubadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 97 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUBADD132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsubadd132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 97 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMSUBADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmsubadd132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 97 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUBADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUBADD132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsubadd132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 97 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMSUBADD213PD xmm1, xmmV, xmm2/m128","VFMSUBADD213PD xmm2/m128, xmmV, xmm1","vfmsubadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUBADD213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsubadd213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 A7 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMSUBADD213PD ymm1, ymmV, ymm2/m256","VFMSUBADD213PD ymm2/m256, ymmV, ymm1","vfmsubadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUBADD213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsubadd213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 A7 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMSUBADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmsubadd213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 A7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUBADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUBADD213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsubadd213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 A7 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMSUBADD213PS xmm1, xmmV, xmm2/m128","VFMSUBADD213PS xmm2/m128, xmmV, xmm1","vfmsubadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUBADD213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsubadd213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 A7 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMSUBADD213PS ymm1, ymmV, ymm2/m256","VFMSUBADD213PS ymm2/m256, ymmV, ymm1","vfmsubadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUBADD213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsubadd213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 A7 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMSUBADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmsubadd213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 A7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUBADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUBADD213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsubadd213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 A7 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMSUBADD231PD xmm1, xmmV, xmm2/m128","VFMSUBADD231PD xmm2/m128, xmmV, xmm1","vfmsubadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUBADD231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsubadd231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B7 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFMSUBADD231PD ymm1, ymmV, ymm2/m256","VFMSUBADD231PD ymm2/m256, ymmV, ymm1","vfmsubadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUBADD231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsubadd231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B7 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFMSUBADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfmsubadd231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 B7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUBADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUBADD231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsubadd231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B7 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFMSUBADD231PS xmm1, xmmV, xmm2/m128","VFMSUBADD231PS xmm2/m128, xmmV, xmm1","vfmsubadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUBADD231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsubadd231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 B7 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFMSUBADD231PS ymm1, ymmV, ymm2/m256","VFMSUBADD231PS ymm2/m256, ymmV, ymm1","vfmsubadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B7 /r","V","V","FMA","","rw,r,r","",""
+"VFMSUBADD231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUBADD231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsubadd231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 B7 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFMSUBADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfmsubadd231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 B7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFMSUBADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUBADD231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsubadd231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 B7 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFMSUBADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBADDPD xmm2/m128, xmmIH, xmmV, xmm1","vfmsubaddpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 5F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBADDPD xmmIH, xmm2/m128, xmmV, xmm1","vfmsubaddpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 5F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBADDPD ymm2/m256, ymmIH, ymmV, ymm1","vfmsubaddpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 5F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBADDPD ymmIH, ymm2/m256, ymmV, ymm1","vfmsubaddpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 5F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBADDPS xmm2/m128, xmmIH, xmmV, xmm1","vfmsubaddps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 5E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBADDPS xmmIH, xmm2/m128, xmmV, xmm1","vfmsubaddps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 5E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBADDPS ymm2/m256, ymmIH, ymmV, ymm1","vfmsubaddps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 5E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBADDPS ymmIH, ymm2/m256, ymmV, ymm1","vfmsubaddps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 5E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBPD xmm2/m128, xmmIH, xmmV, xmm1","vfmsubpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 6D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBPD xmmIH, xmm2/m128, xmmV, xmm1","vfmsubpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 6D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBPD ymm2/m256, ymmIH, ymmV, ymm1","vfmsubpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 6D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBPD ymmIH, ymm2/m256, ymmV, ymm1","vfmsubpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 6D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBPS xmm2/m128, xmmIH, xmmV, xmm1","vfmsubps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 6C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBPS xmmIH, xmm2/m128, xmmV, xmm1","vfmsubps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 6C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBPS ymm2/m256, ymmIH, ymmV, ymm1","vfmsubps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 6C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBPS ymmIH, ymm2/m256, ymmV, ymm1","vfmsubps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 6C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBSD xmm1, xmmV, xmmIH, xmm2/m64","VFMSUBSD xmm2/m64, xmmIH, xmmV, xmm1","vfmsubsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBSD xmm1, xmmV, xmm2/m64, xmmIH","VFMSUBSD xmmIH, xmm2/m64, xmmV, xmm1","vfmsubsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBSS xmm1, xmmV, xmmIH, xmm2/m32","VFMSUBSS xmm2/m32, xmmIH, xmmV, xmm1","vfmsubss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFMSUBSS xmm1, xmmV, xmm2/m32, xmmIH","VFMSUBSS xmmIH, xmm2/m32, xmmV, xmm1","vfmsubss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADD132PD xmm1, xmmV, xmm2/m128","VFNMADD132PD xmm2/m128, xmmV, xmm1","vfnmadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9C /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 9C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFNMADD132PD ymm1, ymmV, ymm2/m256","VFNMADD132PD ymm2/m256, ymmV, ymm1","vfnmadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9C /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 9C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFNMADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfnmadd132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 9C /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 9C /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFNMADD132PS xmm1, xmmV, xmm2/m128","VFNMADD132PS xmm2/m128, xmmV, xmm1","vfnmadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9C /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 9C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFNMADD132PS ymm1, ymmV, ymm2/m256","VFNMADD132PS ymm2/m256, ymmV, ymm1","vfnmadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9C /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 9C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFNMADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfnmadd132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 9C /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 9C /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFNMADD132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD132SD xmm2, xmmV, {k}{z}, xmm1{er}","vfnmadd132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 9D /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD132SD xmm1, xmmV, xmm2/m64","VFNMADD132SD xmm2/m64, xmmV, xmm1","vfnmadd132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9D /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD132SD xmm2/m64, xmmV, {k}{z}, xmm1","vfnmadd132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 9D /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFNMADD132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD132SS xmm2, xmmV, {k}{z}, xmm1{er}","vfnmadd132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 9D /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD132SS xmm1, xmmV, xmm2/m32","VFNMADD132SS xmm2/m32, xmmV, xmm1","vfnmadd132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9D /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD132SS xmm2/m32, xmmV, {k}{z}, xmm1","vfnmadd132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 9D /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFNMADD213PD xmm1, xmmV, xmm2/m128","VFNMADD213PD xmm2/m128, xmmV, xmm1","vfnmadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 AC /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFNMADD213PD ymm1, ymmV, ymm2/m256","VFNMADD213PD ymm2/m256, ymmV, ymm1","vfnmadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 AC /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFNMADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfnmadd213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 AC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 AC /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFNMADD213PS xmm1, xmmV, xmm2/m128","VFNMADD213PS xmm2/m128, xmmV, xmm1","vfnmadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 AC /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFNMADD213PS ymm1, ymmV, ymm2/m256","VFNMADD213PS ymm2/m256, ymmV, ymm1","vfnmadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 AC /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFNMADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfnmadd213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 AC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 AC /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFNMADD213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD213SD xmm2, xmmV, {k}{z}, xmm1{er}","vfnmadd213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 AD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD213SD xmm1, xmmV, xmm2/m64","VFNMADD213SD xmm2/m64, xmmV, xmm1","vfnmadd213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AD /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD213SD xmm2/m64, xmmV, {k}{z}, xmm1","vfnmadd213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 AD /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFNMADD213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD213SS xmm2, xmmV, {k}{z}, xmm1{er}","vfnmadd213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 AD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD213SS xmm1, xmmV, xmm2/m32","VFNMADD213SS xmm2/m32, xmmV, xmm1","vfnmadd213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AD /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD213SS xmm2/m32, xmmV, {k}{z}, xmm1","vfnmadd213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 AD /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFNMADD231PD xmm1, xmmV, xmm2/m128","VFNMADD231PD xmm2/m128, xmmV, xmm1","vfnmadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 BC /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFNMADD231PD ymm1, ymmV, ymm2/m256","VFNMADD231PD ymm2/m256, ymmV, ymm1","vfnmadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 BC /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFNMADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfnmadd231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 BC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 BC /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFNMADD231PS xmm1, xmmV, xmm2/m128","VFNMADD231PS xmm2/m128, xmmV, xmm1","vfnmadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 BC /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFNMADD231PS ymm1, ymmV, ymm2/m256","VFNMADD231PS ymm2/m256, ymmV, ymm1","vfnmadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BC /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 BC /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFNMADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfnmadd231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 BC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 BC /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFNMADD231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD231SD xmm2, xmmV, {k}{z}, xmm1{er}","vfnmadd231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 BD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD231SD xmm1, xmmV, xmm2/m64","VFNMADD231SD xmm2/m64, xmmV, xmm1","vfnmadd231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BD /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD231SD xmm2/m64, xmmV, {k}{z}, xmm1","vfnmadd231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 BD /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFNMADD231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD231SS xmm2, xmmV, {k}{z}, xmm1{er}","vfnmadd231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 BD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMADD231SS xmm1, xmmV, xmm2/m32","VFNMADD231SS xmm2/m32, xmmV, xmm1","vfnmadd231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BD /r","V","V","FMA","","rw,r,r","",""
+"VFNMADD231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD231SS xmm2/m32, xmmV, {k}{z}, xmm1","vfnmadd231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 BD /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFNMADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFNMADDPD xmm2/m128, xmmIH, xmmV, xmm1","vfnmaddpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 79 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFNMADDPD xmmIH, xmm2/m128, xmmV, xmm1","vfnmaddpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 79 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFNMADDPD ymm2/m256, ymmIH, ymmV, ymm1","vfnmaddpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 79 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFNMADDPD ymmIH, ymm2/m256, ymmV, ymm1","vfnmaddpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 79 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFNMADDPS xmm2/m128, xmmIH, xmmV, xmm1","vfnmaddps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 78 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFNMADDPS xmmIH, xmm2/m128, xmmV, xmm1","vfnmaddps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 78 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFNMADDPS ymm2/m256, ymmIH, ymmV, ymm1","vfnmaddps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 78 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFNMADDPS ymmIH, ymm2/m256, ymmV, ymm1","vfnmaddps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 78 /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDSD xmm1, xmmV, xmmIH, xmm2/m64","VFNMADDSD xmm2/m64, xmmIH, xmmV, xmm1","vfnmaddsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7B /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDSD xmm1, xmmV, xmm2/m64, xmmIH","VFNMADDSD xmmIH, xmm2/m64, xmmV, xmm1","vfnmaddsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7B /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDSS xmm1, xmmV, xmmIH, xmm2/m32","VFNMADDSS xmm2/m32, xmmIH, xmmV, xmm1","vfnmaddss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7A /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMADDSS xmm1, xmmV, xmm2/m32, xmmIH","VFNMADDSS xmmIH, xmm2/m32, xmmV, xmm1","vfnmaddss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7A /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUB132PD xmm1, xmmV, xmm2/m128","VFNMSUB132PD xmm2/m128, xmmV, xmm1","vfnmsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9E /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB132PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub132pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 9E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFNMSUB132PD ymm1, ymmV, ymm2/m256","VFNMSUB132PD ymm2/m256, ymmV, ymm1","vfnmsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9E /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB132PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub132pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 9E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFNMSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB132PD zmm2, zmmV, {k}{z}, zmm1{er}","vfnmsub132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 9E /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB132PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub132pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 9E /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFNMSUB132PS xmm1, xmmV, xmm2/m128","VFNMSUB132PS xmm2/m128, xmmV, xmm1","vfnmsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9E /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB132PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub132ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 9E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFNMSUB132PS ymm1, ymmV, ymm2/m256","VFNMSUB132PS ymm2/m256, ymmV, ymm1","vfnmsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9E /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB132PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub132ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 9E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFNMSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB132PS zmm2, zmmV, {k}{z}, zmm1{er}","vfnmsub132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 9E /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB132PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub132ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 9E /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFNMSUB132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB132SD xmm2, xmmV, {k}{z}, xmm1{er}","vfnmsub132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 9F /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB132SD xmm1, xmmV, xmm2/m64","VFNMSUB132SD xmm2/m64, xmmV, xmm1","vfnmsub132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9F /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB132SD xmm2/m64, xmmV, {k}{z}, xmm1","vfnmsub132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 9F /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFNMSUB132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB132SS xmm2, xmmV, {k}{z}, xmm1{er}","vfnmsub132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 9F /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB132SS xmm1, xmmV, xmm2/m32","VFNMSUB132SS xmm2/m32, xmmV, xmm1","vfnmsub132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9F /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB132SS xmm2/m32, xmmV, {k}{z}, xmm1","vfnmsub132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 9F /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFNMSUB213PD xmm1, xmmV, xmm2/m128","VFNMSUB213PD xmm2/m128, xmmV, xmm1","vfnmsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB213PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub213pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 AE /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFNMSUB213PD ymm1, ymmV, ymm2/m256","VFNMSUB213PD ymm2/m256, ymmV, ymm1","vfnmsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB213PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub213pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 AE /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFNMSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB213PD zmm2, zmmV, {k}{z}, zmm1{er}","vfnmsub213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 AE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB213PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub213pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 AE /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFNMSUB213PS xmm1, xmmV, xmm2/m128","VFNMSUB213PS xmm2/m128, xmmV, xmm1","vfnmsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB213PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub213ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 AE /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFNMSUB213PS ymm1, ymmV, ymm2/m256","VFNMSUB213PS ymm2/m256, ymmV, ymm1","vfnmsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB213PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub213ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 AE /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFNMSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB213PS zmm2, zmmV, {k}{z}, zmm1{er}","vfnmsub213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 AE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB213PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub213ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 AE /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFNMSUB213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB213SD xmm2, xmmV, {k}{z}, xmm1{er}","vfnmsub213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 AF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB213SD xmm1, xmmV, xmm2/m64","VFNMSUB213SD xmm2/m64, xmmV, xmm1","vfnmsub213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AF /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB213SD xmm2/m64, xmmV, {k}{z}, xmm1","vfnmsub213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 AF /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFNMSUB213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB213SS xmm2, xmmV, {k}{z}, xmm1{er}","vfnmsub213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 AF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB213SS xmm1, xmmV, xmm2/m32","VFNMSUB213SS xmm2/m32, xmmV, xmm1","vfnmsub213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AF /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB213SS xmm2/m32, xmmV, {k}{z}, xmm1","vfnmsub213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 AF /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFNMSUB231PD xmm1, xmmV, xmm2/m128","VFNMSUB231PD xmm2/m128, xmmV, xmm1","vfnmsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB231PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub231pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 BE /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VFNMSUB231PD ymm1, ymmV, ymm2/m256","VFNMSUB231PD ymm2/m256, ymmV, ymm1","vfnmsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB231PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub231pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 BE /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VFNMSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB231PD zmm2, zmmV, {k}{z}, zmm1{er}","vfnmsub231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W1 BE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB231PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub231pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 BE /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VFNMSUB231PS xmm1, xmmV, xmm2/m128","VFNMSUB231PS xmm2/m128, xmmV, xmm1","vfnmsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB231PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub231ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 BE /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VFNMSUB231PS ymm1, ymmV, ymm2/m256","VFNMSUB231PS ymm2/m256, ymmV, ymm1","vfnmsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BE /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB231PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub231ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 BE /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VFNMSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB231PS zmm2, zmmV, {k}{z}, zmm1{er}","vfnmsub231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F38.W0 BE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB231PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub231ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 BE /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VFNMSUB231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB231SD xmm2, xmmV, {k}{z}, xmm1{er}","vfnmsub231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W1 BF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB231SD xmm1, xmmV, xmm2/m64","VFNMSUB231SD xmm2/m64, xmmV, xmm1","vfnmsub231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BF /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB231SD xmm2/m64, xmmV, {k}{z}, xmm1","vfnmsub231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W1 BF /r","V","V","AVX512F","scale8","rw,r,r,r","",""
+"VFNMSUB231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB231SS xmm2, xmmV, {k}{z}, xmm1{er}","vfnmsub231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F38.W0 BF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","",""
+"VFNMSUB231SS xmm1, xmmV, xmm2/m32","VFNMSUB231SS xmm2/m32, xmmV, xmm1","vfnmsub231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BF /r","V","V","FMA","","rw,r,r","",""
+"VFNMSUB231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB231SS xmm2/m32, xmmV, {k}{z}, xmm1","vfnmsub231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F38.W0 BF /r","V","V","AVX512F","scale4","rw,r,r,r","",""
+"VFNMSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFNMSUBPD xmm2/m128, xmmIH, xmmV, xmm1","vfnmsubpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 7D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFNMSUBPD xmmIH, xmm2/m128, xmmV, xmm1","vfnmsubpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 7D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFNMSUBPD ymm2/m256, ymmIH, ymmV, ymm1","vfnmsubpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 7D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFNMSUBPD ymmIH, ymm2/m256, ymmV, ymm1","vfnmsubpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 7D /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFNMSUBPS xmm2/m128, xmmIH, xmmV, xmm1","vfnmsubps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 7C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFNMSUBPS xmmIH, xmm2/m128, xmmV, xmm1","vfnmsubps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 7C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFNMSUBPS ymm2/m256, ymmIH, ymmV, ymm1","vfnmsubps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 7C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFNMSUBPS ymmIH, ymm2/m256, ymmV, ymm1","vfnmsubps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 7C /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBSD xmm1, xmmV, xmmIH, xmm2/m64","VFNMSUBSD xmm2/m64, xmmIH, xmmV, xmm1","vfnmsubsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBSD xmm1, xmmV, xmm2/m64, xmmIH","VFNMSUBSD xmmIH, xmm2/m64, xmmV, xmm1","vfnmsubsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7F /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBSS xmm1, xmmV, xmmIH, xmm2/m32","VFNMSUBSS xmm2/m32, xmmIH, xmmV, xmm1","vfnmsubss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFNMSUBSS xmm1, xmmV, xmm2/m32, xmmIH","VFNMSUBSS xmmIH, xmm2/m32, xmmV, xmm1","vfnmsubss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7E /r /is4","V","V","FMA4","amd","w,r,r,r","",""
+"VFPCLASSPD k1, {k}, xmm2/m128/m64bcst, imm8u","VFPCLASSPDX imm8u, xmm2/m128/m64bcst, {k}, k1","vfpclasspdx imm8u, xmm2/m128/m64bcst, {k}, k1","EVEX.128.66.0F3A.W1 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","Y","128"
+"VFPCLASSPD k1, {k}, ymm2/m256/m64bcst, imm8u","VFPCLASSPDY imm8u, ymm2/m256/m64bcst, {k}, k1","vfpclasspdy imm8u, ymm2/m256/m64bcst, {k}, k1","EVEX.256.66.0F3A.W1 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","Y","256"
+"VFPCLASSPD k1, {k}, zmm2/m512/m64bcst, imm8u","VFPCLASSPDZ imm8u, zmm2/m512/m64bcst, {k}, k1","vfpclasspdz imm8u, zmm2/m512/m64bcst, {k}, k1","EVEX.512.66.0F3A.W1 66 /r ib","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","Y","512"
+"VFPCLASSPS k1, {k}, xmm2/m128/m32bcst, imm8u","VFPCLASSPSX imm8u, xmm2/m128/m32bcst, {k}, k1","vfpclasspsx imm8u, xmm2/m128/m32bcst, {k}, k1","EVEX.128.66.0F3A.W0 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","Y","128"
+"VFPCLASSPS k1, {k}, ymm2/m256/m32bcst, imm8u","VFPCLASSPSY imm8u, ymm2/m256/m32bcst, {k}, k1","vfpclasspsy imm8u, ymm2/m256/m32bcst, {k}, k1","EVEX.256.66.0F3A.W0 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","Y","256"
+"VFPCLASSPS k1, {k}, zmm2/m512/m32bcst, imm8u","VFPCLASSPSZ imm8u, zmm2/m512/m32bcst, {k}, k1","vfpclasspsz imm8u, zmm2/m512/m32bcst, {k}, k1","EVEX.512.66.0F3A.W0 66 /r ib","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","Y","512"
+"VFPCLASSSD k1, {k}, xmm2/m64, imm8u","VFPCLASSSD imm8u, xmm2/m64, {k}, k1","vfpclasssd imm8u, xmm2/m64, {k}, k1","EVEX.LIG.66.0F3A.W1 67 /r ib","V","V","AVX512DQ","scale8","w,r,r,r","",""
+"VFPCLASSSS k1, {k}, xmm2/m32, imm8u","VFPCLASSSS imm8u, xmm2/m32, {k}, k1","vfpclassss imm8u, xmm2/m32, {k}, k1","EVEX.LIG.66.0F3A.W0 67 /r ib","V","V","AVX512DQ","scale4","w,r,r,r","",""
+"VFRCZPD xmm1, xmm2/m128","VFRCZPD xmm2/m128, xmm1","vfrczpd xmm2/m128, xmm1","XOP.128.09.W0 81 /r","V","V","XOP","amd","w,r","",""
+"VFRCZPD ymm1, ymm2/m256","VFRCZPD ymm2/m256, ymm1","vfrczpd ymm2/m256, ymm1","XOP.256.09.W0 81 /r","V","V","XOP","amd","w,r","",""
+"VFRCZPS xmm1, xmm2/m128","VFRCZPS xmm2/m128, xmm1","vfrczps xmm2/m128, xmm1","XOP.128.09.W0 80 /r","V","V","XOP","amd","w,r","",""
+"VFRCZPS ymm1, ymm2/m256","VFRCZPS ymm2/m256, ymm1","vfrczps ymm2/m256, ymm1","XOP.256.09.W0 80 /r","V","V","XOP","amd","w,r","",""
+"VFRCZSD xmm1, xmm2/m64","VFRCZSD xmm2/m64, xmm1","vfrczsd xmm2/m64, xmm1","XOP.128.09.W0 83 /r","V","V","XOP","amd","w,r","",""
+"VFRCZSS xmm1, xmm2/m32","VFRCZSS xmm2/m32, xmm1","vfrczss xmm2/m32, xmm1","XOP.128.09.W0 82 /r","V","V","XOP","amd","w,r","",""
+"VGATHERDPD xmm1, {k1-k7}, vm32x","VGATHERDPD vm32x, {k1-k7}, xmm1","vgatherdpd vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 92 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VGATHERDPD ymm1, {k1-k7}, vm32x","VGATHERDPD vm32x, {k1-k7}, ymm1","vgatherdpd vm32x, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 92 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VGATHERDPD zmm1, {k1-k7}, vm32y","VGATHERDPD vm32y, {k1-k7}, zmm1","vgatherdpd vm32y, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 92 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VGATHERDPD xmm1, vm32x, xmmV","VGATHERDPD xmmV, vm32x, xmm1","vgatherdpd xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W1 92 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERDPD ymm1, vm32x, ymmV","VGATHERDPD ymmV, vm32x, ymm1","vgatherdpd ymmV, vm32x, ymm1","VEX.DDS.256.66.0F38.W1 92 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERDPS xmm1, {k1-k7}, vm32x","VGATHERDPS vm32x, {k1-k7}, xmm1","vgatherdps vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 92 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VGATHERDPS ymm1, {k1-k7}, vm32y","VGATHERDPS vm32y, {k1-k7}, ymm1","vgatherdps vm32y, {k1-k7}, ymm1","EVEX.256.66.0F38.W0 92 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VGATHERDPS zmm1, {k1-k7}, vm32z","VGATHERDPS vm32z, {k1-k7}, zmm1","vgatherdps vm32z, {k1-k7}, zmm1","EVEX.512.66.0F38.W0 92 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VGATHERDPS xmm1, vm32x, xmmV","VGATHERDPS xmmV, vm32x, xmm1","vgatherdps xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W0 92 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERDPS ymm1, vm32y, ymmV","VGATHERDPS ymmV, vm32y, ymm1","vgatherdps ymmV, vm32y, ymm1","VEX.DDS.256.66.0F38.W0 92 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERPF0DPD vm32y, {k1-k7}","VGATHERPF0DPD {k1-k7}, vm32y","vgatherpf0dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /1","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VGATHERPF0DPS vm32z, {k1-k7}","VGATHERPF0DPS {k1-k7}, vm32z","vgatherpf0dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /1","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VGATHERPF0QPD vm64z, {k1-k7}","VGATHERPF0QPD {k1-k7}, vm64z","vgatherpf0qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /1","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VGATHERPF0QPS vm64z, {k1-k7}","VGATHERPF0QPS {k1-k7}, vm64z","vgatherpf0qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /1","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VGATHERPF1DPD vm32y, {k1-k7}","VGATHERPF1DPD {k1-k7}, vm32y","vgatherpf1dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /2","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VGATHERPF1DPS vm32z, {k1-k7}","VGATHERPF1DPS {k1-k7}, vm32z","vgatherpf1dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /2","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VGATHERPF1QPD vm64z, {k1-k7}","VGATHERPF1QPD {k1-k7}, vm64z","vgatherpf1qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /2","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VGATHERPF1QPS vm64z, {k1-k7}","VGATHERPF1QPS {k1-k7}, vm64z","vgatherpf1qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /2","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VGATHERQPD xmm1, {k1-k7}, vm64x","VGATHERQPD vm64x, {k1-k7}, xmm1","vgatherqpd vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 93 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VGATHERQPD ymm1, {k1-k7}, vm64y","VGATHERQPD vm64y, {k1-k7}, ymm1","vgatherqpd vm64y, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 93 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VGATHERQPD zmm1, {k1-k7}, vm64z","VGATHERQPD vm64z, {k1-k7}, zmm1","vgatherqpd vm64z, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 93 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VGATHERQPD xmm1, vm64x, xmmV","VGATHERQPD xmmV, vm64x, xmm1","vgatherqpd xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W1 93 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERQPD ymm1, vm64y, ymmV","VGATHERQPD ymmV, vm64y, ymm1","vgatherqpd ymmV, vm64y, ymm1","VEX.DDS.256.66.0F38.W1 93 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERQPS xmm1, {k1-k7}, vm64x","VGATHERQPS vm64x, {k1-k7}, xmm1","vgatherqps vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 93 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VGATHERQPS xmm1, {k1-k7}, vm64y","VGATHERQPS vm64y, {k1-k7}, xmm1","vgatherqps vm64y, {k1-k7}, xmm1","EVEX.256.66.0F38.W0 93 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VGATHERQPS ymm1, {k1-k7}, vm64z","VGATHERQPS vm64z, {k1-k7}, ymm1","vgatherqps vm64z, {k1-k7}, ymm1","EVEX.512.66.0F38.W0 93 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VGATHERQPS xmm1, vm64x, xmmV","VGATHERQPS xmmV, vm64x, xmm1","vgatherqps xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W0 93 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGATHERQPS xmm1, vm64y, xmmV","VGATHERQPS xmmV, vm64y, xmm1","vgatherqps xmmV, vm64y, xmm1","VEX.DDS.256.66.0F38.W0 93 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VGETEXPPD xmm1, {k}{z}, xmm2/m128/m64bcst","VGETEXPPD xmm2/m128/m64bcst, {k}{z}, xmm1","vgetexppd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 42 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
+"VGETEXPPD ymm1, {k}{z}, ymm2/m256/m64bcst","VGETEXPPD ymm2/m256/m64bcst, {k}{z}, ymm1","vgetexppd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 42 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
+"VGETEXPPD zmm1{sae}, {k}{z}, zmm2","VGETEXPPD zmm2, {k}{z}, zmm1{sae}","vgetexppd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 42 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VGETEXPPD zmm1, {k}{z}, zmm2/m512/m64bcst","VGETEXPPD zmm2/m512/m64bcst, {k}{z}, zmm1","vgetexppd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 42 /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
+"VGETEXPPS xmm1, {k}{z}, xmm2/m128/m32bcst","VGETEXPPS xmm2/m128/m32bcst, {k}{z}, xmm1","vgetexpps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 42 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VGETEXPPS ymm1, {k}{z}, ymm2/m256/m32bcst","VGETEXPPS ymm2/m256/m32bcst, {k}{z}, ymm1","vgetexpps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 42 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VGETEXPPS zmm1{sae}, {k}{z}, zmm2","VGETEXPPS zmm2, {k}{z}, zmm1{sae}","vgetexpps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 42 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VGETEXPPS zmm1, {k}{z}, zmm2/m512/m32bcst","VGETEXPPS zmm2/m512/m32bcst, {k}{z}, zmm1","vgetexpps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 42 /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VGETEXPSD xmm1{sae}, {k}{z}, xmmV, xmm2","VGETEXPSD xmm2, xmmV, {k}{z}, xmm1{sae}","vgetexpsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W1 43 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VGETEXPSD xmm1, {k}{z}, xmmV, xmm2/m64","VGETEXPSD xmm2/m64, xmmV, {k}{z}, xmm1","vgetexpsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 43 /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VGETEXPSS xmm1{sae}, {k}{z}, xmmV, xmm2","VGETEXPSS xmm2, xmmV, {k}{z}, xmm1{sae}","vgetexpss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W0 43 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VGETEXPSS xmm1, {k}{z}, xmmV, xmm2/m32","VGETEXPSS xmm2/m32, xmmV, {k}{z}, xmm1","vgetexpss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 43 /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VGETMANTPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u:4","VGETMANTPD imm8u:4, xmm2/m128/m64bcst, {k}{z}, xmm1","vgetmantpd imm8u:4, xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W1 26 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VGETMANTPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u:4","VGETMANTPD imm8u:4, ymm2/m256/m64bcst, {k}{z}, ymm1","vgetmantpd imm8u:4, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 26 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VGETMANTPD zmm1{sae}, {k}{z}, zmm2, imm8u:4","VGETMANTPD imm8u:4, zmm2, {k}{z}, zmm1{sae}","vgetmantpd imm8u:4, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W1 26 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VGETMANTPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u:4","VGETMANTPD imm8u:4, zmm2/m512/m64bcst, {k}{z}, zmm1","vgetmantpd imm8u:4, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 26 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VGETMANTPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u:4","VGETMANTPS imm8u:4, xmm2/m128/m32bcst, {k}{z}, xmm1","vgetmantps imm8u:4, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W0 26 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VGETMANTPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u:4","VGETMANTPS imm8u:4, ymm2/m256/m32bcst, {k}{z}, ymm1","vgetmantps imm8u:4, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W0 26 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VGETMANTPS zmm1{sae}, {k}{z}, zmm2, imm8u:4","VGETMANTPS imm8u:4, zmm2, {k}{z}, zmm1{sae}","vgetmantps imm8u:4, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W0 26 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VGETMANTPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u:4","VGETMANTPS imm8u:4, zmm2/m512/m32bcst, {k}{z}, zmm1","vgetmantps imm8u:4, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W0 26 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VGETMANTSD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VGETMANTSD imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","vgetmantsd imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W1 27 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VGETMANTSD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u:4","VGETMANTSD imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","vgetmantsd imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W1 27 /r ib","V","V","AVX512F","scale8","w,r,r,r,r","",""
+"VGETMANTSS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VGETMANTSS imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","vgetmantss imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W0 27 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VGETMANTSS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u:4","VGETMANTSS imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","vgetmantss imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W0 27 /r ib","V","V","AVX512F","scale4","w,r,r,r,r","",""
+"VGF2P8AFFINEINVQB xmm1, xmmV, xmm2/m128, imm8u","VGF2P8AFFINEINVQB imm8u, xmm2/m128, xmmV, xmm1","vgf2p8affineinvqb imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX","","w,r,r,r","",""
+"VGF2P8AFFINEINVQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VGF2P8AFFINEINVQB imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vgf2p8affineinvqb imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VGF2P8AFFINEINVQB ymm1, ymmV, ymm2/m256, imm8u","VGF2P8AFFINEINVQB imm8u, ymm2/m256, ymmV, ymm1","vgf2p8affineinvqb imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX","","w,r,r,r","",""
+"VGF2P8AFFINEINVQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VGF2P8AFFINEINVQB imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vgf2p8affineinvqb imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VGF2P8AFFINEINVQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VGF2P8AFFINEINVQB imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vgf2p8affineinvqb imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VGF2P8AFFINEQB xmm1, xmmV, xmm2/m128, imm8u","VGF2P8AFFINEQB imm8u, xmm2/m128, xmmV, xmm1","vgf2p8affineqb imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX","","w,r,r,r","",""
+"VGF2P8AFFINEQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VGF2P8AFFINEQB imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vgf2p8affineqb imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VGF2P8AFFINEQB ymm1, ymmV, ymm2/m256, imm8u","VGF2P8AFFINEQB imm8u, ymm2/m256, ymmV, ymm1","vgf2p8affineqb imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX","","w,r,r,r","",""
+"VGF2P8AFFINEQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VGF2P8AFFINEQB imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vgf2p8affineqb imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VGF2P8AFFINEQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VGF2P8AFFINEQB imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vgf2p8affineqb imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VGF2P8MULB xmm1, xmmV, xmm2/m128","VGF2P8MULB xmm2/m128, xmmV, xmm1","vgf2p8mulb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 CF /r","V","V","GFNI+AVX","","w,r,r","",""
+"VGF2P8MULB xmm1, {k}{z}, xmmV, xmm2/m128","VGF2P8MULB xmm2/m128, xmmV, {k}{z}, xmm1","vgf2p8mulb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 CF /r","V","V","GFNI+AVX512VL","scale16","w,r,r,r","",""
+"VGF2P8MULB ymm1, ymmV, ymm2/m256","VGF2P8MULB ymm2/m256, ymmV, ymm1","vgf2p8mulb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 CF /r","V","V","GFNI+AVX","","w,r,r","",""
+"VGF2P8MULB ymm1, {k}{z}, ymmV, ymm2/m256","VGF2P8MULB ymm2/m256, ymmV, {k}{z}, ymm1","vgf2p8mulb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 CF /r","V","V","GFNI+AVX512VL","scale32","w,r,r,r","",""
+"VGF2P8MULB zmm1, {k}{z}, zmmV, zmm2/m512","VGF2P8MULB zmm2/m512, zmmV, {k}{z}, zmm1","vgf2p8mulb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 CF /r","V","V","GFNI+AVX512F","scale64","w,r,r,r","",""
+"VHADDPD xmm1, xmmV, xmm2/m128","VHADDPD xmm2/m128, xmmV, xmm1","vhaddpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 7C /r","V","V","AVX","","w,r,r","",""
+"VHADDPD ymm1, ymmV, ymm2/m256","VHADDPD ymm2/m256, ymmV, ymm1","vhaddpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 7C /r","V","V","AVX","","w,r,r","",""
+"VHADDPS xmm1, xmmV, xmm2/m128","VHADDPS xmm2/m128, xmmV, xmm1","vhaddps xmm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG 7C /r","V","V","AVX","","w,r,r","",""
+"VHADDPS ymm1, ymmV, ymm2/m256","VHADDPS ymm2/m256, ymmV, ymm1","vhaddps ymm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG 7C /r","V","V","AVX","","w,r,r","",""
+"VHSUBPD xmm1, xmmV, xmm2/m128","VHSUBPD xmm2/m128, xmmV, xmm1","vhsubpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 7D /r","V","V","AVX","","w,r,r","",""
+"VHSUBPD ymm1, ymmV, ymm2/m256","VHSUBPD ymm2/m256, ymmV, ymm1","vhsubpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 7D /r","V","V","AVX","","w,r,r","",""
+"VHSUBPS xmm1, xmmV, xmm2/m128","VHSUBPS xmm2/m128, xmmV, xmm1","vhsubps xmm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG 7D /r","V","V","AVX","","w,r,r","",""
+"VHSUBPS ymm1, ymmV, ymm2/m256","VHSUBPS ymm2/m256, ymmV, ymm1","vhsubps ymm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG 7D /r","V","V","AVX","","w,r,r","",""
+"VINSERTF128 ymm1, ymmV, xmm2/m128, imm8u:1","VINSERTF128 imm8u:1, xmm2/m128, ymmV, ymm1","vinsertf128 imm8u:1, xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 18 /r ib","V","V","AVX","","w,r,r,r","",""
+"VINSERTF32X4 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTF32X4 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","vinsertf32x4 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 18 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r,r","",""
+"VINSERTF32X4 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTF32X4 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","vinsertf32x4 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 18 /r ib","V","V","AVX512F","scale16","w,r,r,r,r","",""
+"VINSERTF32X8 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTF32X8 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","vinsertf32x8 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 1A /r ib","V","V","AVX512DQ","scale32","w,r,r,r,r","",""
+"VINSERTF64X2 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTF64X2 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","vinsertf64x2 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 18 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r,r","",""
+"VINSERTF64X2 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTF64X2 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","vinsertf64x2 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 18 /r ib","V","V","AVX512DQ","scale16","w,r,r,r,r","",""
+"VINSERTF64X4 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTF64X4 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","vinsertf64x4 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 1A /r ib","V","V","AVX512F","scale32","w,r,r,r,r","",""
+"VINSERTI128 ymm1, ymmV, xmm2/m128, imm8u:1","VINSERTI128 imm8u:1, xmm2/m128, ymmV, ymm1","vinserti128 imm8u:1, xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 38 /r ib","V","V","AVX2","","w,r,r,r","",""
+"VINSERTI32X4 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTI32X4 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","vinserti32x4 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 38 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r,r","",""
+"VINSERTI32X4 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTI32X4 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","vinserti32x4 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 38 /r ib","V","V","AVX512F","scale16","w,r,r,r,r","",""
+"VINSERTI32X8 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTI32X8 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","vinserti32x8 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 3A /r ib","V","V","AVX512DQ","scale32","w,r,r,r,r","",""
+"VINSERTI64X2 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTI64X2 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","vinserti64x2 imm8u:1, xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 38 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r,r","",""
+"VINSERTI64X2 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTI64X2 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","vinserti64x2 imm8u:2, xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 38 /r ib","V","V","AVX512DQ","scale16","w,r,r,r,r","",""
+"VINSERTI64X4 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTI64X4 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","vinserti64x4 imm8u:1, ymm2/m256, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 3A /r ib","V","V","AVX512F","scale32","w,r,r,r,r","",""
+"VINSERTPS xmm1, xmmV, xmm2/m32, imm8u","VINSERTPS imm8u, xmm2/m32, xmmV, xmm1","vinsertps imm8u, xmm2/m32, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W0 21 /r ib","V","V","AVX512F+AVX512VL","scale4","w,r,r,r","",""
+"VINSERTPS xmm1, xmmV, xmm2/m32, imm8u","VINSERTPS imm8u, xmm2/m32, xmmV, xmm1","vinsertps imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 21 /r ib","V","V","AVX","","w,r,r,r","",""
+"VLDDQU xmm1, m128","VLDDQU m128, xmm1","vlddqu m128, xmm1","VEX.128.F2.0F.WIG F0 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VLDDQU ymm1, m256","VLDDQU m256, ymm1","vlddqu m256, ymm1","VEX.256.F2.0F.WIG F0 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VLDMXCSR m32","VLDMXCSR m32","vldmxcsr m32","VEX.128.0F.WIG AE /2","V","V","AVX","modrm_memonly","r","",""
+"VMASKMOVDQU xmm1, xmm2","VMASKMOVDQU xmm2, xmm1","vmaskmovdqu xmm2, xmm1","VEX.128.66.0F.WIG F7 /r","V","V","AVX","modrm_regonly","r,r","",""
+"VMASKMOVPD xmm1, xmmV, m128","VMASKMOVPD m128, xmmV, xmm1","vmaskmovpd m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 2D /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPD ymm1, ymmV, m256","VMASKMOVPD m256, ymmV, ymm1","vmaskmovpd m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 2D /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPD m128, xmmV, xmm1","VMASKMOVPD xmm1, xmmV, m128","vmaskmovpd xmm1, xmmV, m128","VEX.NDS.128.66.0F38.W0 2F /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPD m256, ymmV, ymm1","VMASKMOVPD ymm1, ymmV, m256","vmaskmovpd ymm1, ymmV, m256","VEX.NDS.256.66.0F38.W0 2F /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPS xmm1, xmmV, m128","VMASKMOVPS m128, xmmV, xmm1","vmaskmovps m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 2C /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPS ymm1, ymmV, m256","VMASKMOVPS m256, ymmV, ymm1","vmaskmovps m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 2C /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPS m128, xmmV, xmm1","VMASKMOVPS xmm1, xmmV, m128","vmaskmovps xmm1, xmmV, m128","VEX.NDS.128.66.0F38.W0 2E /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMASKMOVPS m256, ymmV, ymm1","VMASKMOVPS ymm1, ymmV, m256","vmaskmovps ymm1, ymmV, m256","VEX.NDS.256.66.0F38.W0 2E /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMAXPD xmm1, xmmV, xmm2/m128","VMAXPD xmm2/m128, xmmV, xmm1","vmaxpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
+"VMAXPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMAXPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vmaxpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 5F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VMAXPD ymm1, ymmV, ymm2/m256","VMAXPD ymm2/m256, ymmV, ymm1","vmaxpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
+"VMAXPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMAXPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vmaxpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 5F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VMAXPD zmm1{sae}, {k}{z}, zmmV, zmm2","VMAXPD zmm2, zmmV, {k}{z}, zmm1{sae}","vmaxpd zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F.W1 5F /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMAXPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMAXPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vmaxpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 5F /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VMAXPS xmm1, xmmV, xmm2/m128","VMAXPS xmm2/m128, xmmV, xmm1","vmaxps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
+"VMAXPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMAXPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vmaxps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 5F /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VMAXPS ymm1, ymmV, ymm2/m256","VMAXPS ymm2/m256, ymmV, ymm1","vmaxps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
+"VMAXPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMAXPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vmaxps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 5F /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VMAXPS zmm1{sae}, {k}{z}, zmmV, zmm2","VMAXPS zmm2, zmmV, {k}{z}, zmm1{sae}","vmaxps zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.0F.W0 5F /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMAXPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMAXPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vmaxps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 5F /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VMAXSD xmm1{sae}, {k}{z}, xmmV, xmm2","VMAXSD xmm2, xmmV, {k}{z}, xmm1{sae}","vmaxsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F2.0F.W1 5F /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMAXSD xmm1, xmmV, xmm2/m64","VMAXSD xmm2/m64, xmmV, xmm1","vmaxsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
+"VMAXSD xmm1, {k}{z}, xmmV, xmm2/m64","VMAXSD xmm2/m64, xmmV, {k}{z}, xmm1","vmaxsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5F /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VMAXSS xmm1{sae}, {k}{z}, xmmV, xmm2","VMAXSS xmm2, xmmV, {k}{z}, xmm1{sae}","vmaxss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 5F /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMAXSS xmm1, xmmV, xmm2/m32","VMAXSS xmm2/m32, xmmV, xmm1","vmaxss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5F /r","V","V","AVX","","w,r,r","",""
+"VMAXSS xmm1, {k}{z}, xmmV, xmm2/m32","VMAXSS xmm2/m32, xmmV, {k}{z}, xmm1","vmaxss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5F /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VMCALL","VMCALL","vmcall","0F 01 C1","V","V","VTX","","","",""
+"VMCLEAR m64","VMCLEAR m64","vmclear m64","66 0F C7 /6","V","V","VTX","modrm_memonly","r","",""
+"VMFUNC","VMFUNC","vmfunc","0F 01 D4","V","V","","","","",""
+"VMINPD xmm1, xmmV, xmm2/m128","VMINPD xmm2/m128, xmmV, xmm1","vminpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
+"VMINPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMINPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vminpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 5D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VMINPD ymm1, ymmV, ymm2/m256","VMINPD ymm2/m256, ymmV, ymm1","vminpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
+"VMINPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMINPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vminpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 5D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VMINPD zmm1{sae}, {k}{z}, zmmV, zmm2","VMINPD zmm2, zmmV, {k}{z}, zmm1{sae}","vminpd zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F.W1 5D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMINPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMINPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vminpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 5D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VMINPS xmm1, xmmV, xmm2/m128","VMINPS xmm2/m128, xmmV, xmm1","vminps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
+"VMINPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMINPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vminps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 5D /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VMINPS ymm1, ymmV, ymm2/m256","VMINPS ymm2/m256, ymmV, ymm1","vminps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
+"VMINPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMINPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vminps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 5D /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VMINPS zmm1{sae}, {k}{z}, zmmV, zmm2","VMINPS zmm2, zmmV, {k}{z}, zmm1{sae}","vminps zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.0F.W0 5D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMINPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMINPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vminps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 5D /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VMINSD xmm1{sae}, {k}{z}, xmmV, xmm2","VMINSD xmm2, xmmV, {k}{z}, xmm1{sae}","vminsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F2.0F.W1 5D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMINSD xmm1, xmmV, xmm2/m64","VMINSD xmm2/m64, xmmV, xmm1","vminsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
+"VMINSD xmm1, {k}{z}, xmmV, xmm2/m64","VMINSD xmm2/m64, xmmV, {k}{z}, xmm1","vminsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5D /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VMINSS xmm1{sae}, {k}{z}, xmmV, xmm2","VMINSS xmm2, xmmV, {k}{z}, xmm1{sae}","vminss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 5D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMINSS xmm1, xmmV, xmm2/m32","VMINSS xmm2/m32, xmmV, xmm1","vminss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5D /r","V","V","AVX","","w,r,r","",""
+"VMINSS xmm1, {k}{z}, xmmV, xmm2/m32","VMINSS xmm2/m32, xmmV, {k}{z}, xmm1","vminss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5D /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VMLAUNCH","VMLAUNCH","vmlaunch","0F 01 C2","V","V","VTX","","","",""
+"VMLOAD EAX","VMLOADL EAX","vmloadl EAX","0F 01 DA","V","V","SVM","amd,modrm_regonly,operand32","r","Y","32"
+"VMLOAD RAX","VMLOADQ RAX","vmloadq RAX","REX.W 0F 01 DA","N.S.","V","SVM","amd,modrm_regonly","r","Y","64"
+"VMLOAD AX","VMLOADW AX","vmloadw AX","0F 01 DA","V","V","SVM","amd,modrm_regonly,operand16","r","Y","16"
+"VMMCALL","VMMCALL","vmmcall","0F 01 D9","V","V","SVM","amd","","",""
+"VMOVAPD xmm2/m128, xmm1","VMOVAPD xmm1, xmm2/m128","vmovapd xmm1, xmm2/m128","VEX.128.66.0F.WIG 29 /r","V","V","AVX","","w,r","",""
+"VMOVAPD xmm2/m128, {k}{z}, xmm1","VMOVAPD xmm1, {k}{z}, xmm2/m128","vmovapd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 29 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVAPD xmm1, xmm2/m128","VMOVAPD xmm2/m128, xmm1","vmovapd xmm2/m128, xmm1","VEX.128.66.0F.WIG 28 /r","V","V","AVX","","w,r","",""
+"VMOVAPD xmm1, {k}{z}, xmm2/m128","VMOVAPD xmm2/m128, {k}{z}, xmm1","vmovapd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 28 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVAPD ymm2/m256, ymm1","VMOVAPD ymm1, ymm2/m256","vmovapd ymm1, ymm2/m256","VEX.256.66.0F.WIG 29 /r","V","V","AVX","","w,r","",""
+"VMOVAPD ymm2/m256, {k}{z}, ymm1","VMOVAPD ymm1, {k}{z}, ymm2/m256","vmovapd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 29 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVAPD ymm1, ymm2/m256","VMOVAPD ymm2/m256, ymm1","vmovapd ymm2/m256, ymm1","VEX.256.66.0F.WIG 28 /r","V","V","AVX","","w,r","",""
+"VMOVAPD ymm1, {k}{z}, ymm2/m256","VMOVAPD ymm2/m256, {k}{z}, ymm1","vmovapd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 28 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVAPD zmm2/m512, {k}{z}, zmm1","VMOVAPD zmm1, {k}{z}, zmm2/m512","vmovapd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 29 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVAPD zmm1, {k}{z}, zmm2/m512","VMOVAPD zmm2/m512, {k}{z}, zmm1","vmovapd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 28 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVAPS xmm2/m128, xmm1","VMOVAPS xmm1, xmm2/m128","vmovaps xmm1, xmm2/m128","VEX.128.0F.WIG 29 /r","V","V","AVX","","w,r","",""
+"VMOVAPS xmm2/m128, {k}{z}, xmm1","VMOVAPS xmm1, {k}{z}, xmm2/m128","vmovaps xmm1, {k}{z}, xmm2/m128","EVEX.128.0F.W0 29 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVAPS xmm1, xmm2/m128","VMOVAPS xmm2/m128, xmm1","vmovaps xmm2/m128, xmm1","VEX.128.0F.WIG 28 /r","V","V","AVX","","w,r","",""
+"VMOVAPS xmm1, {k}{z}, xmm2/m128","VMOVAPS xmm2/m128, {k}{z}, xmm1","vmovaps xmm2/m128, {k}{z}, xmm1","EVEX.128.0F.W0 28 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVAPS ymm2/m256, ymm1","VMOVAPS ymm1, ymm2/m256","vmovaps ymm1, ymm2/m256","VEX.256.0F.WIG 29 /r","V","V","AVX","","w,r","",""
+"VMOVAPS ymm2/m256, {k}{z}, ymm1","VMOVAPS ymm1, {k}{z}, ymm2/m256","vmovaps ymm1, {k}{z}, ymm2/m256","EVEX.256.0F.W0 29 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVAPS ymm1, ymm2/m256","VMOVAPS ymm2/m256, ymm1","vmovaps ymm2/m256, ymm1","VEX.256.0F.WIG 28 /r","V","V","AVX","","w,r","",""
+"VMOVAPS ymm1, {k}{z}, ymm2/m256","VMOVAPS ymm2/m256, {k}{z}, ymm1","vmovaps ymm2/m256, {k}{z}, ymm1","EVEX.256.0F.W0 28 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVAPS zmm2/m512, {k}{z}, zmm1","VMOVAPS zmm1, {k}{z}, zmm2/m512","vmovaps zmm1, {k}{z}, zmm2/m512","EVEX.512.0F.W0 29 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVAPS zmm1, {k}{z}, zmm2/m512","VMOVAPS zmm2/m512, {k}{z}, zmm1","vmovaps zmm2/m512, {k}{z}, zmm1","EVEX.512.0F.W0 28 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVD xmm1, r/m32","VMOVD r/m32, xmm1","vmovd r/m32, xmm1","EVEX.128.66.0F.W0 6E /r","V","V","AVX512F+AVX512VL","scale4","w,r","",""
+"VMOVD xmm1, r/m32","VMOVD r/m32, xmm1","vmovd r/m32, xmm1","VEX.128.66.0F.W0 6E /r","V","V","AVX","","w,r","",""
+"VMOVD r/m32, xmm1","VMOVD xmm1, r/m32","vmovd xmm1, r/m32","EVEX.128.66.0F.W0 7E /r","V","V","AVX512F+AVX512VL","scale4","w,r","",""
+"VMOVD r/m32, xmm1","VMOVD xmm1, r/m32","vmovd xmm1, r/m32","VEX.128.66.0F.W0 7E /r","V","V","AVX","","w,r","",""
+"VMOVDDUP xmm1, xmm2/m64","VMOVDDUP xmm2/m64, xmm1","vmovddup xmm2/m64, xmm1","VEX.128.F2.0F.WIG 12 /r","V","V","AVX","","w,r","",""
+"VMOVDDUP xmm1, {k}{z}, xmm2/m64","VMOVDDUP xmm2/m64, {k}{z}, xmm1","vmovddup xmm2/m64, {k}{z}, xmm1","EVEX.128.F2.0F.W1 12 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VMOVDDUP ymm1, ymm2/m256","VMOVDDUP ymm2/m256, ymm1","vmovddup ymm2/m256, ymm1","VEX.256.F2.0F.WIG 12 /r","V","V","AVX","","w,r","",""
+"VMOVDDUP ymm1, {k}{z}, ymm2/m256","VMOVDDUP ymm2/m256, {k}{z}, ymm1","vmovddup ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W1 12 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVDDUP zmm1, {k}{z}, zmm2/m512","VMOVDDUP zmm2/m512, {k}{z}, zmm1","vmovddup zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W1 12 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVDQA xmm2/m128, xmm1","VMOVDQA xmm1, xmm2/m128","vmovdqa xmm1, xmm2/m128","VEX.128.66.0F.WIG 7F /r","V","V","AVX","","w,r","",""
+"VMOVDQA xmm1, xmm2/m128","VMOVDQA xmm2/m128, xmm1","vmovdqa xmm2/m128, xmm1","VEX.128.66.0F.WIG 6F /r","V","V","AVX","","w,r","",""
+"VMOVDQA ymm2/m256, ymm1","VMOVDQA ymm1, ymm2/m256","vmovdqa ymm1, ymm2/m256","VEX.256.66.0F.WIG 7F /r","V","V","AVX","","w,r","",""
+"VMOVDQA ymm1, ymm2/m256","VMOVDQA ymm2/m256, ymm1","vmovdqa ymm2/m256, ymm1","VEX.256.66.0F.WIG 6F /r","V","V","AVX","","w,r","",""
+"VMOVDQA32 xmm2/m128, {k}{z}, xmm1","VMOVDQA32 xmm1, {k}{z}, xmm2/m128","vmovdqa32 xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W0 7F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVDQA32 xmm1, {k}{z}, xmm2/m128","VMOVDQA32 xmm2/m128, {k}{z}, xmm1","vmovdqa32 xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W0 6F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVDQA32 ymm2/m256, {k}{z}, ymm1","VMOVDQA32 ymm1, {k}{z}, ymm2/m256","vmovdqa32 ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W0 7F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVDQA32 ymm1, {k}{z}, ymm2/m256","VMOVDQA32 ymm2/m256, {k}{z}, ymm1","vmovdqa32 ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W0 6F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVDQA32 zmm2/m512, {k}{z}, zmm1","VMOVDQA32 zmm1, {k}{z}, zmm2/m512","vmovdqa32 zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W0 7F /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVDQA32 zmm1, {k}{z}, zmm2/m512","VMOVDQA32 zmm2/m512, {k}{z}, zmm1","vmovdqa32 zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W0 6F /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVDQA64 xmm2/m128, {k}{z}, xmm1","VMOVDQA64 xmm1, {k}{z}, xmm2/m128","vmovdqa64 xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 7F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQA64 xmm1, {k}{z}, xmm2/m128","VMOVDQA64 xmm2/m128, {k}{z}, xmm1","vmovdqa64 xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 6F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQA64 ymm2/m256, {k}{z}, ymm1","VMOVDQA64 ymm1, {k}{z}, ymm2/m256","vmovdqa64 ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 7F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQA64 ymm1, {k}{z}, ymm2/m256","VMOVDQA64 ymm2/m256, {k}{z}, ymm1","vmovdqa64 ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 6F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQA64 zmm2/m512, {k}{z}, zmm1","VMOVDQA64 zmm1, {k}{z}, zmm2/m512","vmovdqa64 zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 7F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
+"VMOVDQA64 zmm1, {k}{z}, zmm2/m512","VMOVDQA64 zmm2/m512, {k}{z}, zmm1","vmovdqa64 zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 6F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
+"VMOVDQU xmm2/m128, xmm1","VMOVDQU xmm1, xmm2/m128","vmovdqu xmm1, xmm2/m128","VEX.128.F3.0F.WIG 7F /r","V","V","AVX","","w,r","",""
+"VMOVDQU xmm1, xmm2/m128","VMOVDQU xmm2/m128, xmm1","vmovdqu xmm2/m128, xmm1","VEX.128.F3.0F.WIG 6F /r","V","V","AVX","","w,r","",""
+"VMOVDQU ymm2/m256, ymm1","VMOVDQU ymm1, ymm2/m256","vmovdqu ymm1, ymm2/m256","VEX.256.F3.0F.WIG 7F /r","V","V","AVX","","w,r","",""
+"VMOVDQU ymm1, ymm2/m256","VMOVDQU ymm2/m256, ymm1","vmovdqu ymm2/m256, ymm1","VEX.256.F3.0F.WIG 6F /r","V","V","AVX","","w,r","",""
+"VMOVDQU16 xmm2/m128, {k}{z}, xmm1","VMOVDQU16 xmm1, {k}{z}, xmm2/m128","vmovdqu16 xmm1, {k}{z}, xmm2/m128","EVEX.128.F2.0F.W1 7F /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VMOVDQU16 xmm1, {k}{z}, xmm2/m128","VMOVDQU16 xmm2/m128, {k}{z}, xmm1","vmovdqu16 xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.W1 6F /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VMOVDQU16 ymm2/m256, {k}{z}, ymm1","VMOVDQU16 ymm1, {k}{z}, ymm2/m256","vmovdqu16 ymm1, {k}{z}, ymm2/m256","EVEX.256.F2.0F.W1 7F /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VMOVDQU16 ymm1, {k}{z}, ymm2/m256","VMOVDQU16 ymm2/m256, {k}{z}, ymm1","vmovdqu16 ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W1 6F /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VMOVDQU16 zmm2/m512, {k}{z}, zmm1","VMOVDQU16 zmm1, {k}{z}, zmm2/m512","vmovdqu16 zmm1, {k}{z}, zmm2/m512","EVEX.512.F2.0F.W1 7F /r","V","V","AVX512BW","scale64","w,r,r","",""
+"VMOVDQU16 zmm1, {k}{z}, zmm2/m512","VMOVDQU16 zmm2/m512, {k}{z}, zmm1","vmovdqu16 zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W1 6F /r","V","V","AVX512BW","scale64","w,r,r","",""
+"VMOVDQU32 xmm2/m128, {k}{z}, xmm1","VMOVDQU32 xmm1, {k}{z}, xmm2/m128","vmovdqu32 xmm1, {k}{z}, xmm2/m128","EVEX.128.F3.0F.W0 7F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQU32 xmm1, {k}{z}, xmm2/m128","VMOVDQU32 xmm2/m128, {k}{z}, xmm1","vmovdqu32 xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 6F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQU32 ymm2/m256, {k}{z}, ymm1","VMOVDQU32 ymm1, {k}{z}, ymm2/m256","vmovdqu32 ymm1, {k}{z}, ymm2/m256","EVEX.256.F3.0F.W0 7F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQU32 ymm1, {k}{z}, ymm2/m256","VMOVDQU32 ymm2/m256, {k}{z}, ymm1","vmovdqu32 ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 6F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQU32 zmm2/m512, {k}{z}, zmm1","VMOVDQU32 zmm1, {k}{z}, zmm2/m512","vmovdqu32 zmm1, {k}{z}, zmm2/m512","EVEX.512.F3.0F.W0 7F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
+"VMOVDQU32 zmm1, {k}{z}, zmm2/m512","VMOVDQU32 zmm2/m512, {k}{z}, zmm1","vmovdqu32 zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 6F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
+"VMOVDQU64 xmm2/m128, {k}{z}, xmm1","VMOVDQU64 xmm1, {k}{z}, xmm2/m128","vmovdqu64 xmm1, {k}{z}, xmm2/m128","EVEX.128.F3.0F.W1 7F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQU64 xmm1, {k}{z}, xmm2/m128","VMOVDQU64 xmm2/m128, {k}{z}, xmm1","vmovdqu64 xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W1 6F /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQU64 ymm2/m256, {k}{z}, ymm1","VMOVDQU64 ymm1, {k}{z}, ymm2/m256","vmovdqu64 ymm1, {k}{z}, ymm2/m256","EVEX.256.F3.0F.W1 7F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQU64 ymm1, {k}{z}, ymm2/m256","VMOVDQU64 ymm2/m256, {k}{z}, ymm1","vmovdqu64 ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W1 6F /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQU64 zmm2/m512, {k}{z}, zmm1","VMOVDQU64 zmm1, {k}{z}, zmm2/m512","vmovdqu64 zmm1, {k}{z}, zmm2/m512","EVEX.512.F3.0F.W1 7F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
+"VMOVDQU64 zmm1, {k}{z}, zmm2/m512","VMOVDQU64 zmm2/m512, {k}{z}, zmm1","vmovdqu64 zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W1 6F /r","V","V","AVX512F","scale64","w,r,r","Y","512"
+"VMOVDQU8 xmm2/m128, {k}{z}, xmm1","VMOVDQU8 xmm1, {k}{z}, xmm2/m128","vmovdqu8 xmm1, {k}{z}, xmm2/m128","EVEX.128.F2.0F.W0 7F /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQU8 xmm1, {k}{z}, xmm2/m128","VMOVDQU8 xmm2/m128, {k}{z}, xmm1","vmovdqu8 xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.W0 6F /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","Y","128"
+"VMOVDQU8 ymm2/m256, {k}{z}, ymm1","VMOVDQU8 ymm1, {k}{z}, ymm2/m256","vmovdqu8 ymm1, {k}{z}, ymm2/m256","EVEX.256.F2.0F.W0 7F /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQU8 ymm1, {k}{z}, ymm2/m256","VMOVDQU8 ymm2/m256, {k}{z}, ymm1","vmovdqu8 ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W0 6F /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","Y","256"
+"VMOVDQU8 zmm2/m512, {k}{z}, zmm1","VMOVDQU8 zmm1, {k}{z}, zmm2/m512","vmovdqu8 zmm1, {k}{z}, zmm2/m512","EVEX.512.F2.0F.W0 7F /r","V","V","AVX512BW","scale64","w,r,r","Y","512"
+"VMOVDQU8 zmm1, {k}{z}, zmm2/m512","VMOVDQU8 zmm2/m512, {k}{z}, zmm1","vmovdqu8 zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W0 6F /r","V","V","AVX512BW","scale64","w,r,r","Y","512"
+"VMOVHLPS xmm1, xmmV, xmm2","VMOVHLPS xmm2, xmmV, xmm1","vmovhlps xmm2, xmmV, xmm1","EVEX.NDS.128.0F.W0 12 /r","V","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
+"VMOVHLPS xmm1, xmmV, xmm2","VMOVHLPS xmm2, xmmV, xmm1","vmovhlps xmm2, xmmV, xmm1","VEX.NDS.128.0F.WIG 12 /r","V","V","AVX","modrm_regonly","w,r,r","",""
+"VMOVHPD xmm1, xmmV, m64","VMOVHPD m64, xmmV, xmm1","vmovhpd m64, xmmV, xmm1","EVEX.NDS.LIG.66.0F.W1 16 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r,r","",""
+"VMOVHPD xmm1, xmmV, m64","VMOVHPD m64, xmmV, xmm1","vmovhpd m64, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 16 /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMOVHPD m64, xmm1","VMOVHPD xmm1, m64","vmovhpd xmm1, m64","EVEX.LIG.66.0F.W1 17 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","",""
+"VMOVHPD m64, xmm1","VMOVHPD xmm1, m64","vmovhpd xmm1, m64","VEX.128.66.0F.WIG 17 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVHPS xmm1, xmmV, m64","VMOVHPS m64, xmmV, xmm1","vmovhps m64, xmmV, xmm1","EVEX.NDS.128.0F.W0 16 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r,r","",""
+"VMOVHPS xmm1, xmmV, m64","VMOVHPS m64, xmmV, xmm1","vmovhps m64, xmmV, xmm1","VEX.NDS.128.0F.WIG 16 /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMOVHPS m64, xmm1","VMOVHPS xmm1, m64","vmovhps xmm1, m64","EVEX.128.0F.W0 17 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","",""
+"VMOVHPS m64, xmm1","VMOVHPS xmm1, m64","vmovhps xmm1, m64","VEX.128.0F.WIG 17 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVLHPS xmm1, xmmV, xmm2","VMOVLHPS xmm2, xmmV, xmm1","vmovlhps xmm2, xmmV, xmm1","EVEX.NDS.128.0F.W0 16 /r","V","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
+"VMOVLHPS xmm1, xmmV, xmm2","VMOVLHPS xmm2, xmmV, xmm1","vmovlhps xmm2, xmmV, xmm1","VEX.NDS.128.0F.WIG 16 /r","V","V","AVX","modrm_regonly","w,r,r","",""
+"VMOVLPD xmm1, xmmV, m64","VMOVLPD m64, xmmV, xmm1","vmovlpd m64, xmmV, xmm1","EVEX.NDS.LIG.66.0F.W1 12 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r,r","",""
+"VMOVLPD xmm1, xmmV, m64","VMOVLPD m64, xmmV, xmm1","vmovlpd m64, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 12 /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMOVLPD m64, xmm1","VMOVLPD xmm1, m64","vmovlpd xmm1, m64","EVEX.LIG.66.0F.W1 13 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","",""
+"VMOVLPD m64, xmm1","VMOVLPD xmm1, m64","vmovlpd xmm1, m64","VEX.128.66.0F.WIG 13 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVLPS xmm1, xmmV, m64","VMOVLPS m64, xmmV, xmm1","vmovlps m64, xmmV, xmm1","EVEX.NDS.128.0F.W0 12 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r,r","",""
+"VMOVLPS xmm1, xmmV, m64","VMOVLPS m64, xmmV, xmm1","vmovlps m64, xmmV, xmm1","VEX.NDS.128.0F.WIG 12 /r","V","V","AVX","modrm_memonly","w,r,r","",""
+"VMOVLPS m64, xmm1","VMOVLPS xmm1, m64","vmovlps xmm1, m64","EVEX.128.0F.W0 13 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","",""
+"VMOVLPS m64, xmm1","VMOVLPS xmm1, m64","vmovlps xmm1, m64","VEX.128.0F.WIG 13 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVMSKPD r32, xmm2","VMOVMSKPD xmm2, r32","vmovmskpd xmm2, r32","VEX.128.66.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","",""
+"VMOVMSKPD r32, ymm2","VMOVMSKPD ymm2, r32","vmovmskpd ymm2, r32","VEX.256.66.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","",""
+"VMOVMSKPS r32, xmm2","VMOVMSKPS xmm2, r32","vmovmskps xmm2, r32","VEX.128.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","",""
+"VMOVMSKPS r32, ymm2","VMOVMSKPS ymm2, r32","vmovmskps ymm2, r32","VEX.256.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","",""
+"VMOVNTDQ m128, xmm1","VMOVNTDQ xmm1, m128","vmovntdq xmm1, m128","EVEX.128.66.0F.W0 E7 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",""
+"VMOVNTDQ m128, xmm1","VMOVNTDQ xmm1, m128","vmovntdq xmm1, m128","VEX.128.66.0F.WIG E7 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTDQ m256, ymm1","VMOVNTDQ ymm1, m256","vmovntdq ymm1, m256","EVEX.256.66.0F.W0 E7 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",""
+"VMOVNTDQ m256, ymm1","VMOVNTDQ ymm1, m256","vmovntdq ymm1, m256","VEX.256.66.0F.WIG E7 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTDQ m512, zmm1","VMOVNTDQ zmm1, m512","vmovntdq zmm1, m512","EVEX.512.66.0F.W0 E7 /r","V","V","AVX512F","modrm_memonly,scale64","w,r","",""
+"VMOVNTDQA xmm1, m128","VMOVNTDQA m128, xmm1","vmovntdqa m128, xmm1","EVEX.128.66.0F38.W0 2A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",""
+"VMOVNTDQA xmm1, m128","VMOVNTDQA m128, xmm1","vmovntdqa m128, xmm1","VEX.128.66.0F38.WIG 2A /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTDQA ymm1, m256","VMOVNTDQA m256, ymm1","vmovntdqa m256, ymm1","EVEX.256.66.0F38.W0 2A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",""
+"VMOVNTDQA ymm1, m256","VMOVNTDQA m256, ymm1","vmovntdqa m256, ymm1","VEX.256.66.0F38.WIG 2A /r","V","V","AVX2","modrm_memonly","w,r","",""
+"VMOVNTDQA zmm1, m512","VMOVNTDQA m512, zmm1","vmovntdqa m512, zmm1","EVEX.512.66.0F38.W0 2A /r","V","V","AVX512F","modrm_memonly,scale64","w,r","",""
+"VMOVNTPD m128, xmm1","VMOVNTPD xmm1, m128","vmovntpd xmm1, m128","EVEX.128.66.0F.W1 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",""
+"VMOVNTPD m128, xmm1","VMOVNTPD xmm1, m128","vmovntpd xmm1, m128","VEX.128.66.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTPD m256, ymm1","VMOVNTPD ymm1, m256","vmovntpd ymm1, m256","EVEX.256.66.0F.W1 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",""
+"VMOVNTPD m256, ymm1","VMOVNTPD ymm1, m256","vmovntpd ymm1, m256","VEX.256.66.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTPD m512, zmm1","VMOVNTPD zmm1, m512","vmovntpd zmm1, m512","EVEX.512.66.0F.W1 2B /r","V","V","AVX512F","modrm_memonly,scale64","w,r","",""
+"VMOVNTPS m128, xmm1","VMOVNTPS xmm1, m128","vmovntps xmm1, m128","EVEX.128.0F.W0 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",""
+"VMOVNTPS m128, xmm1","VMOVNTPS xmm1, m128","vmovntps xmm1, m128","VEX.128.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTPS m256, ymm1","VMOVNTPS ymm1, m256","vmovntps ymm1, m256","EVEX.256.0F.W0 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",""
+"VMOVNTPS m256, ymm1","VMOVNTPS ymm1, m256","vmovntps ymm1, m256","VEX.256.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVNTPS m512, zmm1","VMOVNTPS zmm1, m512","vmovntps zmm1, m512","EVEX.512.0F.W0 2B /r","V","V","AVX512F","modrm_memonly,scale64","w,r","",""
+"VMOVQ xmm1, r/m64","VMOVQ r/m64, xmm1","vmovq r/m64, xmm1","EVEX.128.66.0F.W1 6E /r","N.S.","V","AVX512F+AVX512VL","scale8","w,r","",""
+"VMOVQ xmm1, r/m64","VMOVQ r/m64, xmm1","vmovq r/m64, xmm1","VEX.128.66.0F.W1 6E /r","N.S.","V","AVX","","w,r","",""
+"VMOVQ r/m64, xmm1","VMOVQ xmm1, r/m64","vmovq xmm1, r/m64","EVEX.128.66.0F.W1 7E /r","N.S.","V","AVX512F+AVX512VL","scale8","w,r","",""
+"VMOVQ r/m64, xmm1","VMOVQ xmm1, r/m64","vmovq xmm1, r/m64","VEX.128.66.0F.W1 7E /r","N.S.","V","AVX","","w,r","",""
+"VMOVQ xmm2/m64, xmm1","VMOVQ xmm1, xmm2/m64","vmovq xmm1, xmm2/m64","EVEX.LIG.66.0F.W1 D6 /r","V","V","AVX512F+AVX512VL","scale8","w,r","",""
+"VMOVQ xmm2/m64, xmm1","VMOVQ xmm1, xmm2/m64","vmovq xmm1, xmm2/m64","VEX.128.66.0F.WIG D6 /r","V","V","AVX","","w,r","",""
+"VMOVQ xmm1, xmm2/m64","VMOVQ xmm2/m64, xmm1","vmovq xmm2/m64, xmm1","EVEX.LIG.F3.0F.W1 7E /r","V","V","AVX512F+AVX512VL","scale8","w,r","",""
+"VMOVQ xmm1, xmm2/m64","VMOVQ xmm2/m64, xmm1","vmovq xmm2/m64, xmm1","VEX.128.F3.0F.WIG 7E /r","V","V","AVX","","w,r","",""
+"VMOVSD xmm1, m64","VMOVSD m64, xmm1","vmovsd m64, xmm1","VEX.LIG.F2.0F.WIG 10 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVSD xmm1, {k}{z}, m64","VMOVSD m64, {k}{z}, xmm1","vmovsd m64, {k}{z}, xmm1","EVEX.LIG.F2.0F.W1 10 /r","V","V","AVX512F","modrm_memonly,scale8","w,r,r","",""
+"VMOVSD m64, xmm1","VMOVSD xmm1, m64","vmovsd xmm1, m64","VEX.LIG.F2.0F.WIG 11 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVSD xmm2, xmmV, xmm1","VMOVSD xmm1, xmmV, xmm2","vmovsd xmm1, xmmV, xmm2","VEX.NDS.LIG.F2.0F.WIG 11 /r","V","V","AVX","modrm_regonly","w,r,r","",""
+"VMOVSD xmm2, {k}{z}, xmmV, xmm1","VMOVSD xmm1, xmmV, {k}{z}, xmm2","vmovsd xmm1, xmmV, {k}{z}, xmm2","EVEX.NDS.LIG.F2.0F.W1 11 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMOVSD m64, {k}, xmm1","VMOVSD xmm1, {k}, m64","vmovsd xmm1, {k}, m64","EVEX.LIG.F2.0F.W1 11 /r","V","V","AVX512F","modrm_memonly,scale8","w,r,r","",""
+"VMOVSD xmm1, xmmV, xmm2","VMOVSD xmm2, xmmV, xmm1","vmovsd xmm2, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 10 /r","V","V","AVX","modrm_regonly","w,r,r","",""
+"VMOVSD xmm1, {k}{z}, xmmV, xmm2","VMOVSD xmm2, xmmV, {k}{z}, xmm1","vmovsd xmm2, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 10 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMOVSHDUP xmm1, xmm2/m128","VMOVSHDUP xmm2/m128, xmm1","vmovshdup xmm2/m128, xmm1","VEX.128.F3.0F.WIG 16 /r","V","V","AVX","","w,r","",""
+"VMOVSHDUP xmm1, {k}{z}, xmm2/m128","VMOVSHDUP xmm2/m128, {k}{z}, xmm1","vmovshdup xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 16 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVSHDUP ymm1, ymm2/m256","VMOVSHDUP ymm2/m256, ymm1","vmovshdup ymm2/m256, ymm1","VEX.256.F3.0F.WIG 16 /r","V","V","AVX","","w,r","",""
+"VMOVSHDUP ymm1, {k}{z}, ymm2/m256","VMOVSHDUP ymm2/m256, {k}{z}, ymm1","vmovshdup ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 16 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVSHDUP zmm1, {k}{z}, zmm2/m512","VMOVSHDUP zmm2/m512, {k}{z}, zmm1","vmovshdup zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 16 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVSLDUP xmm1, xmm2/m128","VMOVSLDUP xmm2/m128, xmm1","vmovsldup xmm2/m128, xmm1","VEX.128.F3.0F.WIG 12 /r","V","V","AVX","","w,r","",""
+"VMOVSLDUP xmm1, {k}{z}, xmm2/m128","VMOVSLDUP xmm2/m128, {k}{z}, xmm1","vmovsldup xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 12 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVSLDUP ymm1, ymm2/m256","VMOVSLDUP ymm2/m256, ymm1","vmovsldup ymm2/m256, ymm1","VEX.256.F3.0F.WIG 12 /r","V","V","AVX","","w,r","",""
+"VMOVSLDUP ymm1, {k}{z}, ymm2/m256","VMOVSLDUP ymm2/m256, {k}{z}, ymm1","vmovsldup ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 12 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVSLDUP zmm1, {k}{z}, zmm2/m512","VMOVSLDUP zmm2/m512, {k}{z}, zmm1","vmovsldup zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 12 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVSS xmm1, m32","VMOVSS m32, xmm1","vmovss m32, xmm1","VEX.LIG.F3.0F.WIG 10 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVSS xmm1, {k}{z}, m32","VMOVSS m32, {k}{z}, xmm1","vmovss m32, {k}{z}, xmm1","EVEX.LIG.F3.0F.W0 10 /r","V","V","AVX512F","modrm_memonly,scale4","w,r,r","",""
+"VMOVSS m32, xmm1","VMOVSS xmm1, m32","vmovss xmm1, m32","VEX.LIG.F3.0F.WIG 11 /r","V","V","AVX","modrm_memonly","w,r","",""
+"VMOVSS xmm2, xmmV, xmm1","VMOVSS xmm1, xmmV, xmm2","vmovss xmm1, xmmV, xmm2","VEX.NDS.LIG.F3.0F.WIG 11 /r","V","V","AVX","modrm_regonly","w,r,r","",""
+"VMOVSS xmm2, {k}{z}, xmmV, xmm1","VMOVSS xmm1, xmmV, {k}{z}, xmm2","vmovss xmm1, xmmV, {k}{z}, xmm2","EVEX.NDS.LIG.F3.0F.W0 11 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMOVSS m32, {k}, xmm1","VMOVSS xmm1, {k}, m32","vmovss xmm1, {k}, m32","EVEX.LIG.F3.0F.W0 11 /r","V","V","AVX512F","modrm_memonly,scale4","w,r,r","",""
+"VMOVSS xmm1, xmmV, xmm2","VMOVSS xmm2, xmmV, xmm1","vmovss xmm2, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 10 /r","V","V","AVX","modrm_regonly","w,r,r","",""
+"VMOVSS xmm1, {k}{z}, xmmV, xmm2","VMOVSS xmm2, xmmV, {k}{z}, xmm1","vmovss xmm2, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 10 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMOVUPD xmm2/m128, xmm1","VMOVUPD xmm1, xmm2/m128","vmovupd xmm1, xmm2/m128","VEX.128.66.0F.WIG 11 /r","V","V","AVX","","w,r","",""
+"VMOVUPD xmm2/m128, {k}{z}, xmm1","VMOVUPD xmm1, {k}{z}, xmm2/m128","vmovupd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 11 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVUPD xmm1, xmm2/m128","VMOVUPD xmm2/m128, xmm1","vmovupd xmm2/m128, xmm1","VEX.128.66.0F.WIG 10 /r","V","V","AVX","","w,r","",""
+"VMOVUPD xmm1, {k}{z}, xmm2/m128","VMOVUPD xmm2/m128, {k}{z}, xmm1","vmovupd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 10 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVUPD ymm2/m256, ymm1","VMOVUPD ymm1, ymm2/m256","vmovupd ymm1, ymm2/m256","VEX.256.66.0F.WIG 11 /r","V","V","AVX","","w,r","",""
+"VMOVUPD ymm2/m256, {k}{z}, ymm1","VMOVUPD ymm1, {k}{z}, ymm2/m256","vmovupd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 11 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVUPD ymm1, ymm2/m256","VMOVUPD ymm2/m256, ymm1","vmovupd ymm2/m256, ymm1","VEX.256.66.0F.WIG 10 /r","V","V","AVX","","w,r","",""
+"VMOVUPD ymm1, {k}{z}, ymm2/m256","VMOVUPD ymm2/m256, {k}{z}, ymm1","vmovupd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 10 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVUPD zmm2/m512, {k}{z}, zmm1","VMOVUPD zmm1, {k}{z}, zmm2/m512","vmovupd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 11 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVUPD zmm1, {k}{z}, zmm2/m512","VMOVUPD zmm2/m512, {k}{z}, zmm1","vmovupd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 10 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVUPS xmm2/m128, xmm1","VMOVUPS xmm1, xmm2/m128","vmovups xmm1, xmm2/m128","VEX.128.0F.WIG 11 /r","V","V","AVX","","w,r","",""
+"VMOVUPS xmm2/m128, {k}{z}, xmm1","VMOVUPS xmm1, {k}{z}, xmm2/m128","vmovups xmm1, {k}{z}, xmm2/m128","EVEX.128.0F.W0 11 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVUPS xmm1, xmm2/m128","VMOVUPS xmm2/m128, xmm1","vmovups xmm2/m128, xmm1","VEX.128.0F.WIG 10 /r","V","V","AVX","","w,r","",""
+"VMOVUPS xmm1, {k}{z}, xmm2/m128","VMOVUPS xmm2/m128, {k}{z}, xmm1","vmovups xmm2/m128, {k}{z}, xmm1","EVEX.128.0F.W0 10 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VMOVUPS ymm2/m256, ymm1","VMOVUPS ymm1, ymm2/m256","vmovups ymm1, ymm2/m256","VEX.256.0F.WIG 11 /r","V","V","AVX","","w,r","",""
+"VMOVUPS ymm2/m256, {k}{z}, ymm1","VMOVUPS ymm1, {k}{z}, ymm2/m256","vmovups ymm1, {k}{z}, ymm2/m256","EVEX.256.0F.W0 11 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVUPS ymm1, ymm2/m256","VMOVUPS ymm2/m256, ymm1","vmovups ymm2/m256, ymm1","VEX.256.0F.WIG 10 /r","V","V","AVX","","w,r","",""
+"VMOVUPS ymm1, {k}{z}, ymm2/m256","VMOVUPS ymm2/m256, {k}{z}, ymm1","vmovups ymm2/m256, {k}{z}, ymm1","EVEX.256.0F.W0 10 /r","V","V","AVX512F+AVX512VL","scale32","w,r,r","",""
+"VMOVUPS zmm2/m512, {k}{z}, zmm1","VMOVUPS zmm1, {k}{z}, zmm2/m512","vmovups zmm1, {k}{z}, zmm2/m512","EVEX.512.0F.W0 11 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMOVUPS zmm1, {k}{z}, zmm2/m512","VMOVUPS zmm2/m512, {k}{z}, zmm1","vmovups zmm2/m512, {k}{z}, zmm1","EVEX.512.0F.W0 10 /r","V","V","AVX512F","scale64","w,r,r","",""
+"VMPSADBW xmm1, xmmV, xmm2/m128, imm8u","VMPSADBW imm8u, xmm2/m128, xmmV, xmm1","vmpsadbw imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 42 /r ib","V","V","AVX","","w,r,r,r","",""
+"VMPSADBW ymm1, ymmV, ymm2/m256, imm8u","VMPSADBW imm8u, ymm2/m256, ymmV, ymm1","vmpsadbw imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 42 /r ib","V","V","AVX2","","w,r,r,r","",""
+"VMPTRLD m64","VMPTRLD m64","vmptrld m64","0F C7 /6","V","V","VTX","modrm_memonly","r","",""
+"VMPTRST m64","VMPTRST m64","vmptrst m64","0F C7 /7","V","V","VTX","modrm_memonly","w","",""
+"VMREAD r/m32, r32","VMREAD r32, r/m32","vmread r32, r/m32","0F 78 /r","V","N.S.","VTX","","rw,r","",""
+"VMREAD r/m64, r64","VMREAD r64, r/m64","vmread r64, r/m64","0F 78 /r","N.S.","V","VTX","default64","rw,r","",""
+"VMRESUME","VMRESUME","vmresume","0F 01 C3","V","V","VTX","","","",""
+"VMRUN EAX","VMRUNL EAX","vmrunl EAX","0F 01 D8","V","V","SVM","amd,modrm_regonly,operand32","r","Y","32"
+"VMRUN RAX","VMRUNQ RAX","vmrunq RAX","REX.W 0F 01 D8","N.S.","V","SVM","amd,modrm_regonly","r","Y","64"
+"VMRUN AX","VMRUNW AX","vmrunw AX","0F 01 D8","V","V","SVM","amd,modrm_regonly,operand16","r","Y","16"
+"VMSAVE","VMSAVE","vmsave","0F 01 DB","V","V","SVM","amd","","",""
+"VMULPD xmm1, xmmV, xmm2/m128","VMULPD xmm2/m128, xmmV, xmm1","vmulpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
+"VMULPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMULPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vmulpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 59 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VMULPD ymm1, ymmV, ymm2/m256","VMULPD ymm2/m256, ymmV, ymm1","vmulpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
+"VMULPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMULPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vmulpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 59 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VMULPD zmm1{er}, {k}{z}, zmmV, zmm2","VMULPD zmm2, zmmV, {k}{z}, zmm1{er}","vmulpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 59 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMULPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMULPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vmulpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 59 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VMULPS xmm1, xmmV, xmm2/m128","VMULPS xmm2/m128, xmmV, xmm1","vmulps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
+"VMULPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMULPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vmulps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 59 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VMULPS ymm1, ymmV, ymm2/m256","VMULPS ymm2/m256, ymmV, ymm1","vmulps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
+"VMULPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMULPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vmulps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 59 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VMULPS zmm1{er}, {k}{z}, zmmV, zmm2","VMULPS zmm2, zmmV, {k}{z}, zmm1{er}","vmulps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 59 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMULPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMULPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vmulps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 59 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VMULSD xmm1{er}, {k}{z}, xmmV, xmm2","VMULSD xmm2, xmmV, {k}{z}, xmm1{er}","vmulsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 59 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMULSD xmm1, xmmV, xmm2/m64","VMULSD xmm2/m64, xmmV, xmm1","vmulsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
+"VMULSD xmm1, {k}{z}, xmmV, xmm2/m64","VMULSD xmm2/m64, xmmV, {k}{z}, xmm1","vmulsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 59 /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VMULSS xmm1{er}, {k}{z}, xmmV, xmm2","VMULSS xmm2, xmmV, {k}{z}, xmm1{er}","vmulss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 59 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VMULSS xmm1, xmmV, xmm2/m32","VMULSS xmm2/m32, xmmV, xmm1","vmulss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 59 /r","V","V","AVX","","w,r,r","",""
+"VMULSS xmm1, {k}{z}, xmmV, xmm2/m32","VMULSS xmm2/m32, xmmV, {k}{z}, xmm1","vmulss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 59 /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VMWRITE r32, r/m32","VMWRITE r/m32, r32","vmwrite r/m32, r32","0F 79 /r","V","N.S.","VTX","","r,r","",""
+"VMWRITE r64, r/m64","VMWRITE r/m64, r64","vmwrite r/m64, r64","0F 79 /r","N.S.","V","VTX","default64","r,r","",""
+"VMXOFF","VMXOFF","vmxoff","0F 01 C4","V","V","VTX","","","",""
+"VMXON m64","VMXON m64","vmxon m64","F3 0F C7 /6","V","V","VTX","modrm_memonly","r","",""
+"VORPD xmm1, xmmV, xmm2/m128","VORPD xmm2/m128, xmmV, xmm1","vorpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 56 /r","V","V","AVX","","w,r,r","",""
+"VORPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VORPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vorpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 56 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VORPD ymm1, ymmV, ymm2/m256","VORPD ymm2/m256, ymmV, ymm1","vorpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 56 /r","V","V","AVX","","w,r,r","",""
+"VORPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VORPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vorpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 56 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VORPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VORPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vorpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 56 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
+"VORPS xmm1, xmmV, xmm2/m128","VORPS xmm2/m128, xmmV, xmm1","vorps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 56 /r","V","V","AVX","","w,r,r","",""
+"VORPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VORPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vorps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 56 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VORPS ymm1, ymmV, ymm2/m256","VORPS ymm2/m256, ymmV, ymm1","vorps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 56 /r","V","V","AVX","","w,r,r","",""
+"VORPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VORPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vorps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 56 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VORPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VORPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vorps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 56 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
+"VP4DPWSSD zmm1, {k}{z}, zmmV+3, m128","VP4DPWSSD m128, zmmV+3, {k}{z}, zmm1","vp4dpwssd m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 52 /r","V","V","AVX512_4VNNIW","modrm_memonly,scale16","rw,r,r,r","",""
+"VP4DPWSSDS zmm1, {k}{z}, zmmV+3, m128","VP4DPWSSDS m128, zmmV+3, {k}{z}, zmm1","vp4dpwssds m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 53 /r","V","V","AVX512_4VNNIW","modrm_memonly,scale16","rw,r,r,r","",""
+"VPABSB xmm1, xmm2/m128","VPABSB xmm2/m128, xmm1","vpabsb xmm2/m128, xmm1","VEX.128.66.0F38.WIG 1C /r","V","V","AVX","","w,r","",""
+"VPABSB xmm1, {k}{z}, xmm2/m128","VPABSB xmm2/m128, {k}{z}, xmm1","vpabsb xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 1C /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPABSB ymm1, ymm2/m256","VPABSB ymm2/m256, ymm1","vpabsb ymm2/m256, ymm1","VEX.256.66.0F38.WIG 1C /r","V","V","AVX2","","w,r","",""
+"VPABSB ymm1, {k}{z}, ymm2/m256","VPABSB ymm2/m256, {k}{z}, ymm1","vpabsb ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 1C /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VPABSB zmm1, {k}{z}, zmm2/m512","VPABSB zmm2/m512, {k}{z}, zmm1","vpabsb zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 1C /r","V","V","AVX512BW","scale64","w,r,r","",""
+"VPABSD xmm1, xmm2/m128","VPABSD xmm2/m128, xmm1","vpabsd xmm2/m128, xmm1","VEX.128.66.0F38.WIG 1E /r","V","V","AVX","","w,r","",""
+"VPABSD xmm1, {k}{z}, xmm2/m128/m32bcst","VPABSD xmm2/m128/m32bcst, {k}{z}, xmm1","vpabsd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 1E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VPABSD ymm1, ymm2/m256","VPABSD ymm2/m256, ymm1","vpabsd ymm2/m256, ymm1","VEX.256.66.0F38.WIG 1E /r","V","V","AVX2","","w,r","",""
+"VPABSD ymm1, {k}{z}, ymm2/m256/m32bcst","VPABSD ymm2/m256/m32bcst, {k}{z}, ymm1","vpabsd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 1E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VPABSD zmm1, {k}{z}, zmm2/m512/m32bcst","VPABSD zmm2/m512/m32bcst, {k}{z}, zmm1","vpabsd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1E /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VPABSQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPABSQ xmm2/m128/m64bcst, {k}{z}, xmm1","vpabsq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 1F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
+"VPABSQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPABSQ ymm2/m256/m64bcst, {k}{z}, ymm1","vpabsq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 1F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
+"VPABSQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPABSQ zmm2/m512/m64bcst, {k}{z}, zmm1","vpabsq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 1F /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
+"VPABSW xmm1, xmm2/m128","VPABSW xmm2/m128, xmm1","vpabsw xmm2/m128, xmm1","VEX.128.66.0F38.WIG 1D /r","V","V","AVX","","w,r","",""
+"VPABSW xmm1, {k}{z}, xmm2/m128","VPABSW xmm2/m128, {k}{z}, xmm1","vpabsw xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 1D /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPABSW ymm1, ymm2/m256","VPABSW ymm2/m256, ymm1","vpabsw ymm2/m256, ymm1","VEX.256.66.0F38.WIG 1D /r","V","V","AVX2","","w,r","",""
+"VPABSW ymm1, {k}{z}, ymm2/m256","VPABSW ymm2/m256, {k}{z}, ymm1","vpabsw ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 1D /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VPABSW zmm1, {k}{z}, zmm2/m512","VPABSW zmm2/m512, {k}{z}, zmm1","vpabsw zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 1D /r","V","V","AVX512BW","scale64","w,r,r","",""
+"VPACKSSDW xmm1, xmmV, xmm2/m128","VPACKSSDW xmm2/m128, xmmV, xmm1","vpackssdw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6B /r","V","V","AVX","","w,r,r","",""
+"VPACKSSDW xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPACKSSDW xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpackssdw xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 6B /r","V","V","AVX512BW+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPACKSSDW ymm1, ymmV, ymm2/m256","VPACKSSDW ymm2/m256, ymmV, ymm1","vpackssdw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6B /r","V","V","AVX2","","w,r,r","",""
+"VPACKSSDW ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPACKSSDW ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpackssdw ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 6B /r","V","V","AVX512BW+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPACKSSDW zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPACKSSDW zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpackssdw zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 6B /r","V","V","AVX512BW","bscale4,scale64","w,r,r,r","",""
+"VPACKSSWB xmm1, xmmV, xmm2/m128","VPACKSSWB xmm2/m128, xmmV, xmm1","vpacksswb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 63 /r","V","V","AVX","","w,r,r","",""
+"VPACKSSWB xmm1, {k}{z}, xmmV, xmm2/m128","VPACKSSWB xmm2/m128, xmmV, {k}{z}, xmm1","vpacksswb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG 63 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPACKSSWB ymm1, ymmV, ymm2/m256","VPACKSSWB ymm2/m256, ymmV, ymm1","vpacksswb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 63 /r","V","V","AVX2","","w,r,r","",""
+"VPACKSSWB ymm1, {k}{z}, ymmV, ymm2/m256","VPACKSSWB ymm2/m256, ymmV, {k}{z}, ymm1","vpacksswb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG 63 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPACKSSWB zmm1, {k}{z}, zmmV, zmm2/m512","VPACKSSWB zmm2/m512, zmmV, {k}{z}, zmm1","vpacksswb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG 63 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPACKUSDW xmm1, xmmV, xmm2/m128","VPACKUSDW xmm2/m128, xmmV, xmm1","vpackusdw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 2B /r","V","V","AVX","","w,r,r","",""
+"VPACKUSDW xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPACKUSDW xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpackusdw xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 2B /r","V","V","AVX512BW+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPACKUSDW ymm1, ymmV, ymm2/m256","VPACKUSDW ymm2/m256, ymmV, ymm1","vpackusdw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 2B /r","V","V","AVX2","","w,r,r","",""
+"VPACKUSDW ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPACKUSDW ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpackusdw ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 2B /r","V","V","AVX512BW+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPACKUSDW zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPACKUSDW zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpackusdw zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 2B /r","V","V","AVX512BW","bscale4,scale64","w,r,r,r","",""
+"VPACKUSWB xmm1, xmmV, xmm2/m128","VPACKUSWB xmm2/m128, xmmV, xmm1","vpackuswb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 67 /r","V","V","AVX","","w,r,r","",""
+"VPACKUSWB xmm1, {k}{z}, xmmV, xmm2/m128","VPACKUSWB xmm2/m128, xmmV, {k}{z}, xmm1","vpackuswb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG 67 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPACKUSWB ymm1, ymmV, ymm2/m256","VPACKUSWB ymm2/m256, ymmV, ymm1","vpackuswb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 67 /r","V","V","AVX2","","w,r,r","",""
+"VPACKUSWB ymm1, {k}{z}, ymmV, ymm2/m256","VPACKUSWB ymm2/m256, ymmV, {k}{z}, ymm1","vpackuswb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG 67 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPACKUSWB zmm1, {k}{z}, zmmV, zmm2/m512","VPACKUSWB zmm2/m512, zmmV, {k}{z}, zmm1","vpackuswb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG 67 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPADDB xmm1, xmmV, xmm2/m128","VPADDB xmm2/m128, xmmV, xmm1","vpaddb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FC /r","V","V","AVX","","w,r,r","",""
+"VPADDB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDB xmm2/m128, xmmV, {k}{z}, xmm1","vpaddb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG FC /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPADDB ymm1, ymmV, ymm2/m256","VPADDB ymm2/m256, ymmV, ymm1","vpaddb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FC /r","V","V","AVX2","","w,r,r","",""
+"VPADDB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDB ymm2/m256, ymmV, {k}{z}, ymm1","vpaddb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG FC /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPADDB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDB zmm2/m512, zmmV, {k}{z}, zmm1","vpaddb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG FC /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPADDD xmm1, xmmV, xmm2/m128","VPADDD xmm2/m128, xmmV, xmm1","vpaddd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FE /r","V","V","AVX","","w,r,r","",""
+"VPADDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPADDD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpaddd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 FE /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPADDD ymm1, ymmV, ymm2/m256","VPADDD ymm2/m256, ymmV, ymm1","vpaddd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FE /r","V","V","AVX2","","w,r,r","",""
+"VPADDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPADDD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpaddd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 FE /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPADDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPADDD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpaddd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 FE /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPADDQ xmm1, xmmV, xmm2/m128","VPADDQ xmm2/m128, xmmV, xmm1","vpaddq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D4 /r","V","V","AVX","","w,r,r","",""
+"VPADDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPADDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpaddq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 D4 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPADDQ ymm1, ymmV, ymm2/m256","VPADDQ ymm2/m256, ymmV, ymm1","vpaddq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D4 /r","V","V","AVX2","","w,r,r","",""
+"VPADDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPADDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpaddq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 D4 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPADDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPADDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpaddq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 D4 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPADDSB xmm1, xmmV, xmm2/m128","VPADDSB xmm2/m128, xmmV, xmm1","vpaddsb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EC /r","V","V","AVX","","w,r,r","",""
+"VPADDSB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDSB xmm2/m128, xmmV, {k}{z}, xmm1","vpaddsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EC /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPADDSB ymm1, ymmV, ymm2/m256","VPADDSB ymm2/m256, ymmV, ymm1","vpaddsb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EC /r","V","V","AVX2","","w,r,r","",""
+"VPADDSB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDSB ymm2/m256, ymmV, {k}{z}, ymm1","vpaddsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EC /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPADDSB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDSB zmm2/m512, zmmV, {k}{z}, zmm1","vpaddsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EC /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPADDSW xmm1, xmmV, xmm2/m128","VPADDSW xmm2/m128, xmmV, xmm1","vpaddsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG ED /r","V","V","AVX","","w,r,r","",""
+"VPADDSW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDSW xmm2/m128, xmmV, {k}{z}, xmm1","vpaddsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG ED /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPADDSW ymm1, ymmV, ymm2/m256","VPADDSW ymm2/m256, ymmV, ymm1","vpaddsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG ED /r","V","V","AVX2","","w,r,r","",""
+"VPADDSW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDSW ymm2/m256, ymmV, {k}{z}, ymm1","vpaddsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG ED /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPADDSW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDSW zmm2/m512, zmmV, {k}{z}, zmm1","vpaddsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG ED /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPADDUSB xmm1, xmmV, xmm2/m128","VPADDUSB xmm2/m128, xmmV, xmm1","vpaddusb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DC /r","V","V","AVX","","w,r,r","",""
+"VPADDUSB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDUSB xmm2/m128, xmmV, {k}{z}, xmm1","vpaddusb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DC /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPADDUSB ymm1, ymmV, ymm2/m256","VPADDUSB ymm2/m256, ymmV, ymm1","vpaddusb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DC /r","V","V","AVX2","","w,r,r","",""
+"VPADDUSB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDUSB ymm2/m256, ymmV, {k}{z}, ymm1","vpaddusb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DC /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPADDUSB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDUSB zmm2/m512, zmmV, {k}{z}, zmm1","vpaddusb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DC /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPADDUSW xmm1, xmmV, xmm2/m128","VPADDUSW xmm2/m128, xmmV, xmm1","vpaddusw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DD /r","V","V","AVX","","w,r,r","",""
+"VPADDUSW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDUSW xmm2/m128, xmmV, {k}{z}, xmm1","vpaddusw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DD /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPADDUSW ymm1, ymmV, ymm2/m256","VPADDUSW ymm2/m256, ymmV, ymm1","vpaddusw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DD /r","V","V","AVX2","","w,r,r","",""
+"VPADDUSW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDUSW ymm2/m256, ymmV, {k}{z}, ymm1","vpaddusw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DD /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPADDUSW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDUSW zmm2/m512, zmmV, {k}{z}, zmm1","vpaddusw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DD /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPADDW xmm1, xmmV, xmm2/m128","VPADDW xmm2/m128, xmmV, xmm1","vpaddw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FD /r","V","V","AVX","","w,r,r","",""
+"VPADDW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDW xmm2/m128, xmmV, {k}{z}, xmm1","vpaddw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG FD /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPADDW ymm1, ymmV, ymm2/m256","VPADDW ymm2/m256, ymmV, ymm1","vpaddw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FD /r","V","V","AVX2","","w,r,r","",""
+"VPADDW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDW ymm2/m256, ymmV, {k}{z}, ymm1","vpaddw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG FD /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPADDW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDW zmm2/m512, zmmV, {k}{z}, zmm1","vpaddw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG FD /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPALIGNR xmm1, xmmV, xmm2/m128, imm8u","VPALIGNR imm8u, xmm2/m128, xmmV, xmm1","vpalignr imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0F /r ib","V","V","AVX","","w,r,r,r","",""
+"VPALIGNR xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPALIGNR imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","vpalignr imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.WIG 0F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
+"VPALIGNR ymm1, ymmV, ymm2/m256, imm8u","VPALIGNR imm8u, ymm2/m256, ymmV, ymm1","vpalignr imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0F /r ib","V","V","AVX2","","w,r,r,r","",""
+"VPALIGNR ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPALIGNR imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","vpalignr imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.WIG 0F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
+"VPALIGNR zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPALIGNR imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","vpalignr imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.WIG 0F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
+"VPAND xmm1, xmmV, xmm2/m128","VPAND xmm2/m128, xmmV, xmm1","vpand xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DB /r","V","V","AVX","","w,r,r","",""
+"VPAND ymm1, ymmV, ymm2/m256","VPAND ymm2/m256, ymmV, ymm1","vpand ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DB /r","V","V","AVX2","","w,r,r","",""
+"VPANDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPANDD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpandd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 DB /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPANDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPANDD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpandd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 DB /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPANDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPANDD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpandd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 DB /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPANDN xmm1, xmmV, xmm2/m128","VPANDN xmm2/m128, xmmV, xmm1","vpandn xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DF /r","V","V","AVX","","w,r,r","",""
+"VPANDN ymm1, ymmV, ymm2/m256","VPANDN ymm2/m256, ymmV, ymm1","vpandn ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DF /r","V","V","AVX2","","w,r,r","",""
+"VPANDND xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPANDND xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpandnd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 DF /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPANDND ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPANDND ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpandnd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 DF /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPANDND zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPANDND zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpandnd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 DF /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPANDNQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPANDNQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpandnq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 DF /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPANDNQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPANDNQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpandnq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 DF /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPANDNQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPANDNQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpandnq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 DF /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPANDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPANDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpandq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 DB /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPANDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPANDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpandq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 DB /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPANDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPANDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpandq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 DB /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPAVGB xmm1, xmmV, xmm2/m128","VPAVGB xmm2/m128, xmmV, xmm1","vpavgb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E0 /r","V","V","AVX","","w,r,r","",""
+"VPAVGB xmm1, {k}{z}, xmmV, xmm2/m128","VPAVGB xmm2/m128, xmmV, {k}{z}, xmm1","vpavgb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E0 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPAVGB ymm1, ymmV, ymm2/m256","VPAVGB ymm2/m256, ymmV, ymm1","vpavgb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E0 /r","V","V","AVX2","","w,r,r","",""
+"VPAVGB ymm1, {k}{z}, ymmV, ymm2/m256","VPAVGB ymm2/m256, ymmV, {k}{z}, ymm1","vpavgb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E0 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPAVGB zmm1, {k}{z}, zmmV, zmm2/m512","VPAVGB zmm2/m512, zmmV, {k}{z}, zmm1","vpavgb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E0 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPAVGW xmm1, xmmV, xmm2/m128","VPAVGW xmm2/m128, xmmV, xmm1","vpavgw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E3 /r","V","V","AVX","","w,r,r","",""
+"VPAVGW xmm1, {k}{z}, xmmV, xmm2/m128","VPAVGW xmm2/m128, xmmV, {k}{z}, xmm1","vpavgw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E3 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPAVGW ymm1, ymmV, ymm2/m256","VPAVGW ymm2/m256, ymmV, ymm1","vpavgw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E3 /r","V","V","AVX2","","w,r,r","",""
+"VPAVGW ymm1, {k}{z}, ymmV, ymm2/m256","VPAVGW ymm2/m256, ymmV, {k}{z}, ymm1","vpavgw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E3 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPAVGW zmm1, {k}{z}, zmmV, zmm2/m512","VPAVGW zmm2/m512, zmmV, {k}{z}, zmm1","vpavgw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E3 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPBLENDD xmm1, xmmV, xmm2/m128, imm8u","VPBLENDD imm8u, xmm2/m128, xmmV, xmm1","vpblendd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 02 /r ib","V","V","AVX2","","w,r,r,r","",""
+"VPBLENDD ymm1, ymmV, ymm2/m256, imm8u","VPBLENDD imm8u, ymm2/m256, ymmV, ymm1","vpblendd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 02 /r ib","V","V","AVX2","","w,r,r,r","",""
+"VPBLENDMB xmm1, {k}{z}, xmmV, xmm2/m128","VPBLENDMB xmm2/m128, xmmV, {k}{z}, xmm1","vpblendmb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 66 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPBLENDMB ymm1, {k}{z}, ymmV, ymm2/m256","VPBLENDMB ymm2/m256, ymmV, {k}{z}, ymm1","vpblendmb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 66 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPBLENDMB zmm1, {k}{z}, zmmV, zmm2/m512","VPBLENDMB zmm2/m512, zmmV, {k}{z}, zmm1","vpblendmb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 66 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPBLENDMD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPBLENDMD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpblendmd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 64 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPBLENDMD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPBLENDMD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpblendmd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 64 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPBLENDMD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPBLENDMD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpblendmd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 64 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPBLENDMQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPBLENDMQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpblendmq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 64 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPBLENDMQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPBLENDMQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpblendmq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 64 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPBLENDMQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPBLENDMQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpblendmq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 64 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPBLENDMW xmm1, {k}{z}, xmmV, xmm2/m128","VPBLENDMW xmm2/m128, xmmV, {k}{z}, xmm1","vpblendmw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 66 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPBLENDMW ymm1, {k}{z}, ymmV, ymm2/m256","VPBLENDMW ymm2/m256, ymmV, {k}{z}, ymm1","vpblendmw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 66 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPBLENDMW zmm1, {k}{z}, zmmV, zmm2/m512","VPBLENDMW zmm2/m512, zmmV, {k}{z}, zmm1","vpblendmw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 66 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPBLENDVB xmm1, xmmV, xmm2/m128, xmmIH","VPBLENDVB xmmIH, xmm2/m128, xmmV, xmm1","vpblendvb xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4C /r /is4","V","V","AVX","","w,r,r,r","",""
+"VPBLENDVB ymm1, ymmV, ymm2/m256, ymmIH","VPBLENDVB ymmIH, ymm2/m256, ymmV, ymm1","vpblendvb ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4C /r /is4","V","V","AVX2","","w,r,r,r","",""
+"VPBLENDW xmm1, xmmV, xmm2/m128, imm8u","VPBLENDW imm8u, xmm2/m128, xmmV, xmm1","vpblendw imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0E /r ib","V","V","AVX","","w,r,r,r","",""
+"VPBLENDW ymm1, ymmV, ymm2/m256, imm8u","VPBLENDW imm8u, ymm2/m256, ymmV, ymm1","vpblendw imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0E /r ib","V","V","AVX2","","w,r,r,r","",""
+"VPBROADCASTB xmm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, xmm1","vpbroadcastb rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7A /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTB ymm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, ymm1","vpbroadcastb rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7A /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTB zmm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, zmm1","vpbroadcastb rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7A /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"VPBROADCASTB xmm1, xmm2/m8","VPBROADCASTB xmm2/m8, xmm1","vpbroadcastb xmm2/m8, xmm1","VEX.128.66.0F38.W0 78 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTB ymm1, xmm2/m8","VPBROADCASTB xmm2/m8, ymm1","vpbroadcastb xmm2/m8, ymm1","VEX.256.66.0F38.W0 78 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTB xmm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, xmm1","vpbroadcastb xmm2/m8, {k}{z}, xmm1","EVEX.128.66.0F38.W0 78 /r","V","V","AVX512BW+AVX512VL","scale1","w,r,r","",""
+"VPBROADCASTB ymm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, ymm1","vpbroadcastb xmm2/m8, {k}{z}, ymm1","EVEX.256.66.0F38.W0 78 /r","V","V","AVX512BW+AVX512VL","scale1","w,r,r","",""
+"VPBROADCASTB zmm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, zmm1","vpbroadcastb xmm2/m8, {k}{z}, zmm1","EVEX.512.66.0F38.W0 78 /r","V","V","AVX512BW","scale1","w,r,r","",""
+"VPBROADCASTD xmm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, xmm1","vpbroadcastd rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7C /r","V","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTD ymm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, ymm1","vpbroadcastd rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7C /r","V","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTD zmm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, zmm1","vpbroadcastd rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7C /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VPBROADCASTD xmm1, xmm2/m32","VPBROADCASTD xmm2/m32, xmm1","vpbroadcastd xmm2/m32, xmm1","VEX.128.66.0F38.W0 58 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTD ymm1, xmm2/m32","VPBROADCASTD xmm2/m32, ymm1","vpbroadcastd xmm2/m32, ymm1","VEX.256.66.0F38.W0 58 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTD xmm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, xmm1","vpbroadcastd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 58 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPBROADCASTD ymm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, ymm1","vpbroadcastd xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 58 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPBROADCASTD zmm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, zmm1","vpbroadcastd xmm2/m32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 58 /r","V","V","AVX512F","scale4","w,r,r","",""
+"VPBROADCASTMB2Q xmm1, k2","VPBROADCASTMB2Q k2, xmm1","vpbroadcastmb2q k2, xmm1","EVEX.128.F3.0F38.W1 2A /r","V","V","AVX512CD+AVX512VL","modrm_regonly","w,r","",""
+"VPBROADCASTMB2Q ymm1, k2","VPBROADCASTMB2Q k2, ymm1","vpbroadcastmb2q k2, ymm1","EVEX.256.F3.0F38.W1 2A /r","V","V","AVX512CD+AVX512VL","modrm_regonly","w,r","",""
+"VPBROADCASTMB2Q zmm1, k2","VPBROADCASTMB2Q k2, zmm1","vpbroadcastmb2q k2, zmm1","EVEX.512.F3.0F38.W1 2A /r","V","V","AVX512CD","modrm_regonly","w,r","",""
+"VPBROADCASTMW2D xmm1, k2","VPBROADCASTMW2D k2, xmm1","vpbroadcastmw2d k2, xmm1","EVEX.128.F3.0F38.W0 3A /r","V","V","AVX512CD+AVX512VL","modrm_regonly","w,r","",""
+"VPBROADCASTMW2D ymm1, k2","VPBROADCASTMW2D k2, ymm1","vpbroadcastmw2d k2, ymm1","EVEX.256.F3.0F38.W0 3A /r","V","V","AVX512CD+AVX512VL","modrm_regonly","w,r","",""
+"VPBROADCASTMW2D zmm1, k2","VPBROADCASTMW2D k2, zmm1","vpbroadcastmw2d k2, zmm1","EVEX.512.F3.0F38.W0 3A /r","V","V","AVX512CD","modrm_regonly","w,r","",""
+"VPBROADCASTQ xmm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, xmm1","vpbroadcastq rmr64, {k}{z}, xmm1","EVEX.128.66.0F38.W1 7C /r","N.S.","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTQ ymm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, ymm1","vpbroadcastq rmr64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 7C /r","N.S.","V","AVX512F+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTQ zmm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, zmm1","vpbroadcastq rmr64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 7C /r","N.S.","V","AVX512F","modrm_regonly","w,r,r","",""
+"VPBROADCASTQ xmm1, xmm2/m64","VPBROADCASTQ xmm2/m64, xmm1","vpbroadcastq xmm2/m64, xmm1","VEX.128.66.0F38.W0 59 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTQ ymm1, xmm2/m64","VPBROADCASTQ xmm2/m64, ymm1","vpbroadcastq xmm2/m64, ymm1","VEX.256.66.0F38.W0 59 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTQ xmm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, xmm1","vpbroadcastq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W1 59 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPBROADCASTQ ymm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, ymm1","vpbroadcastq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 59 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPBROADCASTQ zmm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, zmm1","vpbroadcastq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 59 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPBROADCASTW xmm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, xmm1","vpbroadcastw rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7B /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTW ymm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, ymm1","vpbroadcastw rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7B /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
+"VPBROADCASTW zmm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, zmm1","vpbroadcastw rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7B /r","V","V","AVX512BW","modrm_regonly","w,r,r","",""
+"VPBROADCASTW xmm1, xmm2/m16","VPBROADCASTW xmm2/m16, xmm1","vpbroadcastw xmm2/m16, xmm1","VEX.128.66.0F38.W0 79 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTW ymm1, xmm2/m16","VPBROADCASTW xmm2/m16, ymm1","vpbroadcastw xmm2/m16, ymm1","VEX.256.66.0F38.W0 79 /r","V","V","AVX2","","w,r","",""
+"VPBROADCASTW xmm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, xmm1","vpbroadcastw xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.W0 79 /r","V","V","AVX512BW+AVX512VL","scale2","w,r,r","",""
+"VPBROADCASTW ymm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, ymm1","vpbroadcastw xmm2/m16, {k}{z}, ymm1","EVEX.256.66.0F38.W0 79 /r","V","V","AVX512BW+AVX512VL","scale2","w,r,r","",""
+"VPBROADCASTW zmm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, zmm1","vpbroadcastw xmm2/m16, {k}{z}, zmm1","EVEX.512.66.0F38.W0 79 /r","V","V","AVX512BW","scale2","w,r,r","",""
+"VPCLMULQDQ xmm1, xmmV, xmm2/m128, imm8u","VPCLMULQDQ imm8u, xmm2/m128, xmmV, xmm1","vpclmulqdq imm8u, xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F3A.WIG 44 /r ib","V","V","VPCLMULQDQ+AVX512VL","scale16","w,r,r,r","",""
+"VPCLMULQDQ xmm1, xmmV, xmm2/m128, imm8u","VPCLMULQDQ imm8u, xmm2/m128, xmmV, xmm1","vpclmulqdq imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 44 /r ib","V","V","PCLMULQDQ+AVX","","w,r,r,r","",""
+"VPCLMULQDQ ymm1, ymmV, ymm2/m256, imm8u","VPCLMULQDQ imm8u, ymm2/m256, ymmV, ymm1","vpclmulqdq imm8u, ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F3A.WIG 44 /r ib","V","V","VPCLMULQDQ+AVX512VL","scale32","w,r,r,r","",""
+"VPCLMULQDQ ymm1, ymmV, ymm2/m256, imm8u","VPCLMULQDQ imm8u, ymm2/m256, ymmV, ymm1","vpclmulqdq imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 44 /r ib","V","V","VPCLMULQDQ","","w,r,r,r","",""
+"VPCLMULQDQ zmm1, zmmV, zmm2/m512, imm8u","VPCLMULQDQ imm8u, zmm2/m512, zmmV, zmm1","vpclmulqdq imm8u, zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F3A.WIG 44 /r ib","V","V","VPCLMULQDQ+AVX512F","scale64","w,r,r,r","",""
+"VPCMOV xmm1, xmmV, xmmIH, xmm2/m128","VPCMOV xmm2/m128, xmmIH, xmmV, xmm1","vpcmov xmm2/m128, xmmIH, xmmV, xmm1","XOP.NDS.128.08.W1 A2 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPCMOV xmm1, xmmV, xmm2/m128, xmmIH","VPCMOV xmmIH, xmm2/m128, xmmV, xmm1","vpcmov xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A2 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPCMOV ymm1, ymmV, ymmIH, ymm2/m256","VPCMOV ymm2/m256, ymmIH, ymmV, ymm1","vpcmov ymm2/m256, ymmIH, ymmV, ymm1","XOP.NDS.256.08.W1 A2 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPCMOV ymm1, ymmV, ymm2/m256, ymmIH","VPCMOV ymmIH, ymm2/m256, ymmV, ymm1","vpcmov ymmIH, ymm2/m256, ymmV, ymm1","XOP.NDS.256.08.W0 A2 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPCMPB k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPB imm8u, xmm2/m128, xmmV, {k}, k1","vpcmpb imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 3F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
+"VPCMPB k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPB imm8u, ymm2/m256, ymmV, {k}, k1","vpcmpb imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 3F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
+"VPCMPB k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPB imm8u, zmm2/m512, zmmV, {k}, k1","vpcmpb imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 3F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
+"VPCMPD k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VPCMPD imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","vpcmpd imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 1F /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VPCMPD k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VPCMPD imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","vpcmpd imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 1F /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VPCMPD k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VPCMPD imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","vpcmpd imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 1F /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VPCMPEQB xmm1, xmmV, xmm2/m128","VPCMPEQB xmm2/m128, xmmV, xmm1","vpcmpeqb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 74 /r","V","V","AVX","","w,r,r","",""
+"VPCMPEQB k1, {k}, xmmV, xmm2/m128","VPCMPEQB xmm2/m128, xmmV, {k}, k1","vpcmpeqb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 74 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPCMPEQB ymm1, ymmV, ymm2/m256","VPCMPEQB ymm2/m256, ymmV, ymm1","vpcmpeqb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 74 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPEQB k1, {k}, ymmV, ymm2/m256","VPCMPEQB ymm2/m256, ymmV, {k}, k1","vpcmpeqb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 74 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPCMPEQB k1, {k}, zmmV, zmm2/m512","VPCMPEQB zmm2/m512, zmmV, {k}, k1","vpcmpeqb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 74 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPCMPEQD xmm1, xmmV, xmm2/m128","VPCMPEQD xmm2/m128, xmmV, xmm1","vpcmpeqd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 76 /r","V","V","AVX","","w,r,r","",""
+"VPCMPEQD k1, {k}, xmmV, xmm2/m128/m32bcst","VPCMPEQD xmm2/m128/m32bcst, xmmV, {k}, k1","vpcmpeqd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPCMPEQD ymm1, ymmV, ymm2/m256","VPCMPEQD ymm2/m256, ymmV, ymm1","vpcmpeqd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 76 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPEQD k1, {k}, ymmV, ymm2/m256/m32bcst","VPCMPEQD ymm2/m256/m32bcst, ymmV, {k}, k1","vpcmpeqd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPCMPEQD k1, {k}, zmmV, zmm2/m512/m32bcst","VPCMPEQD zmm2/m512/m32bcst, zmmV, {k}, k1","vpcmpeqd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F.W0 76 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPCMPEQQ xmm1, xmmV, xmm2/m128","VPCMPEQQ xmm2/m128, xmmV, xmm1","vpcmpeqq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 29 /r","V","V","AVX","","w,r,r","",""
+"VPCMPEQQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPCMPEQQ xmm2/m128/m64bcst, xmmV, {k}, k1","vpcmpeqq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W1 29 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPCMPEQQ ymm1, ymmV, ymm2/m256","VPCMPEQQ ymm2/m256, ymmV, ymm1","vpcmpeqq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 29 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPEQQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPCMPEQQ ymm2/m256/m64bcst, ymmV, {k}, k1","vpcmpeqq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W1 29 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPCMPEQQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPCMPEQQ zmm2/m512/m64bcst, zmmV, {k}, k1","vpcmpeqq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W1 29 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPCMPEQW xmm1, xmmV, xmm2/m128","VPCMPEQW xmm2/m128, xmmV, xmm1","vpcmpeqw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 75 /r","V","V","AVX","","w,r,r","",""
+"VPCMPEQW k1, {k}, xmmV, xmm2/m128","VPCMPEQW xmm2/m128, xmmV, {k}, k1","vpcmpeqw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 75 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPCMPEQW ymm1, ymmV, ymm2/m256","VPCMPEQW ymm2/m256, ymmV, ymm1","vpcmpeqw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 75 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPEQW k1, {k}, ymmV, ymm2/m256","VPCMPEQW ymm2/m256, ymmV, {k}, k1","vpcmpeqw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 75 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPCMPEQW k1, {k}, zmmV, zmm2/m512","VPCMPEQW zmm2/m512, zmmV, {k}, k1","vpcmpeqw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 75 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPCMPESTRI xmm1, xmm2/m128, imm8u","VPCMPESTRI imm8u, xmm2/m128, xmm1","vpcmpestri imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 61 /r ib","V","V","AVX","","r,r,r","",""
+"VPCMPESTRM xmm1, xmm2/m128, imm8u","VPCMPESTRM imm8u, xmm2/m128, xmm1","vpcmpestrm imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 60 /r ib","V","V","AVX","","r,r,r","",""
+"VPCMPGTB xmm1, xmmV, xmm2/m128","VPCMPGTB xmm2/m128, xmmV, xmm1","vpcmpgtb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 64 /r","V","V","AVX","","w,r,r","",""
+"VPCMPGTB k1, {k}, xmmV, xmm2/m128","VPCMPGTB xmm2/m128, xmmV, {k}, k1","vpcmpgtb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 64 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPCMPGTB ymm1, ymmV, ymm2/m256","VPCMPGTB ymm2/m256, ymmV, ymm1","vpcmpgtb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 64 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPGTB k1, {k}, ymmV, ymm2/m256","VPCMPGTB ymm2/m256, ymmV, {k}, k1","vpcmpgtb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 64 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPCMPGTB k1, {k}, zmmV, zmm2/m512","VPCMPGTB zmm2/m512, zmmV, {k}, k1","vpcmpgtb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 64 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPCMPGTD xmm1, xmmV, xmm2/m128","VPCMPGTD xmm2/m128, xmmV, xmm1","vpcmpgtd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 66 /r","V","V","AVX","","w,r,r","",""
+"VPCMPGTD k1, {k}, xmmV, xmm2/m128/m32bcst","VPCMPGTD xmm2/m128/m32bcst, xmmV, {k}, k1","vpcmpgtd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F.W0 66 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPCMPGTD ymm1, ymmV, ymm2/m256","VPCMPGTD ymm2/m256, ymmV, ymm1","vpcmpgtd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 66 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPGTD k1, {k}, ymmV, ymm2/m256/m32bcst","VPCMPGTD ymm2/m256/m32bcst, ymmV, {k}, k1","vpcmpgtd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F.W0 66 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPCMPGTD k1, {k}, zmmV, zmm2/m512/m32bcst","VPCMPGTD zmm2/m512/m32bcst, zmmV, {k}, k1","vpcmpgtd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F.W0 66 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPCMPGTQ xmm1, xmmV, xmm2/m128","VPCMPGTQ xmm2/m128, xmmV, xmm1","vpcmpgtq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 37 /r","V","V","AVX","","w,r,r","",""
+"VPCMPGTQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPCMPGTQ xmm2/m128/m64bcst, xmmV, {k}, k1","vpcmpgtq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W1 37 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPCMPGTQ ymm1, ymmV, ymm2/m256","VPCMPGTQ ymm2/m256, ymmV, ymm1","vpcmpgtq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 37 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPGTQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPCMPGTQ ymm2/m256/m64bcst, ymmV, {k}, k1","vpcmpgtq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W1 37 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPCMPGTQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPCMPGTQ zmm2/m512/m64bcst, zmmV, {k}, k1","vpcmpgtq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W1 37 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPCMPGTW xmm1, xmmV, xmm2/m128","VPCMPGTW xmm2/m128, xmmV, xmm1","vpcmpgtw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 65 /r","V","V","AVX","","w,r,r","",""
+"VPCMPGTW k1, {k}, xmmV, xmm2/m128","VPCMPGTW xmm2/m128, xmmV, {k}, k1","vpcmpgtw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 65 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPCMPGTW ymm1, ymmV, ymm2/m256","VPCMPGTW ymm2/m256, ymmV, ymm1","vpcmpgtw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 65 /r","V","V","AVX2","","w,r,r","",""
+"VPCMPGTW k1, {k}, ymmV, ymm2/m256","VPCMPGTW ymm2/m256, ymmV, {k}, k1","vpcmpgtw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 65 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPCMPGTW k1, {k}, zmmV, zmm2/m512","VPCMPGTW zmm2/m512, zmmV, {k}, k1","vpcmpgtw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 65 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPCMPISTRI xmm1, xmm2/m128, imm8u","VPCMPISTRI imm8u, xmm2/m128, xmm1","vpcmpistri imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 63 /r ib","V","V","AVX","","r,r,r","",""
+"VPCMPISTRM xmm1, xmm2/m128, imm8u","VPCMPISTRM imm8u, xmm2/m128, xmm1","vpcmpistrm imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 62 /r ib","V","V","AVX","","r,r,r","",""
+"VPCMPQ k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VPCMPQ imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","vpcmpq imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 1F /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VPCMPQ k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VPCMPQ imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","vpcmpq imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 1F /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VPCMPQ k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VPCMPQ imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","vpcmpq imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 1F /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VPCMPUB k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPUB imm8u, xmm2/m128, xmmV, {k}, k1","vpcmpub imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 3E /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
+"VPCMPUB k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPUB imm8u, ymm2/m256, ymmV, {k}, k1","vpcmpub imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 3E /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
+"VPCMPUB k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPUB imm8u, zmm2/m512, zmmV, {k}, k1","vpcmpub imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 3E /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
+"VPCMPUD k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VPCMPUD imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","vpcmpud imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 1E /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VPCMPUD k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VPCMPUD imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","vpcmpud imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 1E /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VPCMPUD k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VPCMPUD imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","vpcmpud imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 1E /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VPCMPUQ k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VPCMPUQ imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","vpcmpuq imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 1E /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VPCMPUQ k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VPCMPUQ imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","vpcmpuq imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 1E /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VPCMPUQ k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VPCMPUQ imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","vpcmpuq imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 1E /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VPCMPUW k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPUW imm8u, xmm2/m128, xmmV, {k}, k1","vpcmpuw imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 3E /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
+"VPCMPUW k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPUW imm8u, ymm2/m256, ymmV, {k}, k1","vpcmpuw imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 3E /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
+"VPCMPUW k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPUW imm8u, zmm2/m512, zmmV, {k}, k1","vpcmpuw imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 3E /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
+"VPCMPW k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPW imm8u, xmm2/m128, xmmV, {k}, k1","vpcmpw imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 3F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","",""
+"VPCMPW k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPW imm8u, ymm2/m256, ymmV, {k}, k1","vpcmpw imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 3F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","",""
+"VPCMPW k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPW imm8u, zmm2/m512, zmmV, {k}, k1","vpcmpw imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 3F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",""
+"VPCOMB xmm1, xmmV, xmm2/m128, imm8u","VPCOMB imm8u, xmm2/m128, xmmV, xmm1","vpcomb imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CC /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMD xmm1, xmmV, xmm2/m128, imm8u","VPCOMD imm8u, xmm2/m128, xmmV, xmm1","vpcomd imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CE /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMPRESSB xmm2/m128, {k}{z}, xmm1","VPCOMPRESSB xmm1, {k}{z}, xmm2/m128","vpcompressb xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 63 /r","V","V","AVX512_VBMI2+AVX512VL","scale1","w,r,r","",""
+"VPCOMPRESSB ymm2/m256, {k}{z}, ymm1","VPCOMPRESSB ymm1, {k}{z}, ymm2/m256","vpcompressb ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 63 /r","V","V","AVX512_VBMI2+AVX512VL","scale1","w,r,r","",""
+"VPCOMPRESSB zmm2/m512, {k}{z}, zmm1","VPCOMPRESSB zmm1, {k}{z}, zmm2/m512","vpcompressb zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 63 /r","V","V","AVX512_VBMI2","scale1","w,r,r","",""
+"VPCOMPRESSD xmm2/m128, {k}{z}, xmm1","VPCOMPRESSD xmm1, {k}{z}, xmm2/m128","vpcompressd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 8B /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPCOMPRESSD ymm2/m256, {k}{z}, ymm1","VPCOMPRESSD ymm1, {k}{z}, ymm2/m256","vpcompressd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 8B /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPCOMPRESSD zmm2/m512, {k}{z}, zmm1","VPCOMPRESSD zmm1, {k}{z}, zmm2/m512","vpcompressd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 8B /r","V","V","AVX512F","scale4","w,r,r","",""
+"VPCOMPRESSQ xmm2/m128, {k}{z}, xmm1","VPCOMPRESSQ xmm1, {k}{z}, xmm2/m128","vpcompressq xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 8B /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPCOMPRESSQ ymm2/m256, {k}{z}, ymm1","VPCOMPRESSQ ymm1, {k}{z}, ymm2/m256","vpcompressq ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 8B /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPCOMPRESSQ zmm2/m512, {k}{z}, zmm1","VPCOMPRESSQ zmm1, {k}{z}, zmm2/m512","vpcompressq zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 8B /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPCOMPRESSW xmm2/m128, {k}{z}, xmm1","VPCOMPRESSW xmm1, {k}{z}, xmm2/m128","vpcompressw xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 63 /r","V","V","AVX512_VBMI2+AVX512VL","scale2","w,r,r","",""
+"VPCOMPRESSW ymm2/m256, {k}{z}, ymm1","VPCOMPRESSW ymm1, {k}{z}, ymm2/m256","vpcompressw ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 63 /r","V","V","AVX512_VBMI2+AVX512VL","scale2","w,r,r","",""
+"VPCOMPRESSW zmm2/m512, {k}{z}, zmm1","VPCOMPRESSW zmm1, {k}{z}, zmm2/m512","vpcompressw zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 63 /r","V","V","AVX512_VBMI2","scale2","w,r,r","",""
+"VPCOMQ xmm1, xmmV, xmm2/m128, imm8u","VPCOMQ imm8u, xmm2/m128, xmmV, xmm1","vpcomq imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CF /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMUB xmm1, xmmV, xmm2/m128, imm8u","VPCOMUB imm8u, xmm2/m128, xmmV, xmm1","vpcomub imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EC /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMUD xmm1, xmmV, xmm2/m128, imm8u","VPCOMUD imm8u, xmm2/m128, xmmV, xmm1","vpcomud imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EE /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMUQ xmm1, xmmV, xmm2/m128, imm8u","VPCOMUQ imm8u, xmm2/m128, xmmV, xmm1","vpcomuq imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EF /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMUW xmm1, xmmV, xmm2/m128, imm8u","VPCOMUW imm8u, xmm2/m128, xmmV, xmm1","vpcomuw imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 ED /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCOMW xmm1, xmmV, xmm2/m128, imm8u","VPCOMW imm8u, xmm2/m128, xmmV, xmm1","vpcomw imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CD /r ib","V","V","XOP","amd","w,r,r,r","",""
+"VPCONFLICTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPCONFLICTD xmm2/m128/m32bcst, {k}{z}, xmm1","vpconflictd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 C4 /r","V","V","AVX512CD+AVX512VL","bscale4,scale16","w,r,r","",""
+"VPCONFLICTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPCONFLICTD ymm2/m256/m32bcst, {k}{z}, ymm1","vpconflictd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 C4 /r","V","V","AVX512CD+AVX512VL","bscale4,scale32","w,r,r","",""
+"VPCONFLICTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPCONFLICTD zmm2/m512/m32bcst, {k}{z}, zmm1","vpconflictd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 C4 /r","V","V","AVX512CD","bscale4,scale64","w,r,r","",""
+"VPCONFLICTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPCONFLICTQ xmm2/m128/m64bcst, {k}{z}, xmm1","vpconflictq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 C4 /r","V","V","AVX512CD+AVX512VL","bscale8,scale16","w,r,r","",""
+"VPCONFLICTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPCONFLICTQ ymm2/m256/m64bcst, {k}{z}, ymm1","vpconflictq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 C4 /r","V","V","AVX512CD+AVX512VL","bscale8,scale32","w,r,r","",""
+"VPCONFLICTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPCONFLICTQ zmm2/m512/m64bcst, {k}{z}, zmm1","vpconflictq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 C4 /r","V","V","AVX512CD","bscale8,scale64","w,r,r","",""
+"VPDPBUSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPBUSD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpdpbusd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 50 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPDPBUSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPBUSD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpdpbusd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 50 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPDPBUSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPBUSD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpdpbusd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 50 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r,r,r","",""
+"VPDPBUSDS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPBUSDS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpdpbusds xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 51 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPDPBUSDS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPBUSDS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpdpbusds ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 51 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPDPBUSDS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPBUSDS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpdpbusds zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 51 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r,r,r","",""
+"VPDPWSSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPWSSD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpdpwssd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 52 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPDPWSSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPWSSD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpdpwssd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 52 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPDPWSSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPWSSD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpdpwssd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 52 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r,r,r","",""
+"VPDPWSSDS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPWSSDS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpdpwssds xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 53 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPDPWSSDS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPWSSDS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpdpwssds ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 53 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPDPWSSDS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPWSSDS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpdpwssds zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 53 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r,r,r","",""
+"VPERM2F128 ymm1, ymmV, ymm2/m256, imm8u","VPERM2F128 imm8u, ymm2/m256, ymmV, ymm1","vperm2f128 imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 06 /r ib","V","V","AVX","","w,r,r,r","",""
+"VPERM2I128 ymm1, ymmV, ymm2/m256, imm8u","VPERM2I128 imm8u, ymm2/m256, ymmV, ymm1","vperm2i128 imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 46 /r ib","V","V","AVX2","","w,r,r,r","",""
+"VPERMB xmm1, {k}{z}, xmmV, xmm2/m128","VPERMB xmm2/m128, xmmV, {k}{z}, xmm1","vpermb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 8D /r","V","V","AVX512_VBMI+AVX512VL","scale16","w,r,r,r","",""
+"VPERMB ymm1, {k}{z}, ymmV, ymm2/m256","VPERMB ymm2/m256, ymmV, {k}{z}, ymm1","vpermb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 8D /r","V","V","AVX512_VBMI+AVX512VL","scale32","w,r,r,r","",""
+"VPERMB zmm1, {k}{z}, zmmV, zmm2/m512","VPERMB zmm2/m512, zmmV, {k}{z}, zmm1","vpermb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 8D /r","V","V","AVX512_VBMI","scale64","w,r,r,r","",""
+"VPERMD ymm1, ymmV, ymm2/m256","VPERMD ymm2/m256, ymmV, ymm1","vpermd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 36 /r","V","V","AVX2","","w,r,r","",""
+"VPERMD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 36 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPERMD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 36 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPERMI2B xmm1, {k}{z}, xmmV, xmm2/m128","VPERMI2B xmm2/m128, xmmV, {k}{z}, xmm1","vpermi2b xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 75 /r","V","V","AVX512_VBMI+AVX512VL","scale16","rw,r,r,r","",""
+"VPERMI2B ymm1, {k}{z}, ymmV, ymm2/m256","VPERMI2B ymm2/m256, ymmV, {k}{z}, ymm1","vpermi2b ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 75 /r","V","V","AVX512_VBMI+AVX512VL","scale32","rw,r,r,r","",""
+"VPERMI2B zmm1, {k}{z}, zmmV, zmm2/m512","VPERMI2B zmm2/m512, zmmV, {k}{z}, zmm1","vpermi2b zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 75 /r","V","V","AVX512_VBMI","scale64","rw,r,r,r","",""
+"VPERMI2D xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMI2D xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermi2d xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPERMI2D ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMI2D ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermi2d ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPERMI2D zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMI2D zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermi2d zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 76 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VPERMI2PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMI2PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermi2pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 77 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPERMI2PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMI2PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermi2pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 77 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPERMI2PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMI2PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermi2pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 77 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VPERMI2PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMI2PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermi2ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 77 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPERMI2PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMI2PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermi2ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 77 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPERMI2PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMI2PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermi2ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 77 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VPERMI2Q xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMI2Q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermi2q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 76 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPERMI2Q ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMI2Q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermi2q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 76 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPERMI2Q zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMI2Q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermi2q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 76 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VPERMI2W xmm1, {k}{z}, xmmV, xmm2/m128","VPERMI2W xmm2/m128, xmmV, {k}{z}, xmm1","vpermi2w xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 75 /r","V","V","AVX512BW+AVX512VL","scale16","rw,r,r,r","",""
+"VPERMI2W ymm1, {k}{z}, ymmV, ymm2/m256","VPERMI2W ymm2/m256, ymmV, {k}{z}, ymm1","vpermi2w ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 75 /r","V","V","AVX512BW+AVX512VL","scale32","rw,r,r,r","",""
+"VPERMI2W zmm1, {k}{z}, zmmV, zmm2/m512","VPERMI2W zmm2/m512, zmmV, {k}{z}, zmm1","vpermi2w zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 75 /r","V","V","AVX512BW","scale64","rw,r,r,r","",""
+"VPERMIL2PD xmm1, xmmV, xmmIH, xmm2/m128, imm8u","VPERMIL2PD imm8u, xmm2/m128, xmmIH, xmmV, xmm1","vpermil2pd imm8u, xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PD xmm1, xmmV, xmm2/m128, xmmIH, imm8u","VPERMIL2PD imm8u, xmmIH, xmm2/m128, xmmV, xmm1","vpermil2pd imm8u, xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PD ymm1, ymmV, ymmIH, ymm2/m256, imm8u","VPERMIL2PD imm8u, ymm2/m256, ymmIH, ymmV, ymm1","vpermil2pd imm8u, ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PD ymm1, ymmV, ymm2/m256, ymmIH, imm8u","VPERMIL2PD imm8u, ymmIH, ymm2/m256, ymmV, ymm1","vpermil2pd imm8u, ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PS xmm1, xmmV, xmmIH, xmm2/m128, imm8u","VPERMIL2PS imm8u, xmm2/m128, xmmIH, xmmV, xmm1","vpermil2ps imm8u, xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PS xmm1, xmmV, xmm2/m128, xmmIH, imm8u","VPERMIL2PS imm8u, xmmIH, xmm2/m128, xmmV, xmm1","vpermil2ps imm8u, xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PS ymm1, ymmV, ymmIH, ymm2/m256, imm8u","VPERMIL2PS imm8u, ymm2/m256, ymmIH, ymmV, ymm1","vpermil2ps imm8u, ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMIL2PS ymm1, ymmV, ymm2/m256, ymmIH, imm8u","VPERMIL2PS imm8u, ymmIH, ymm2/m256, ymmV, ymm1","vpermil2ps imm8u, ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","",""
+"VPERMILPD xmm1, xmm2/m128, imm8u","VPERMILPD imm8u, xmm2/m128, xmm1","vpermilpd imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.W0 05 /r ib","V","V","AVX","","w,r,r","",""
+"VPERMILPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VPERMILPD imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","vpermilpd imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W1 05 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPERMILPD ymm1, ymm2/m256, imm8u","VPERMILPD imm8u, ymm2/m256, ymm1","vpermilpd imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W0 05 /r ib","V","V","AVX","","w,r,r","",""
+"VPERMILPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMILPD imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vpermilpd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 05 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPERMILPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMILPD imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vpermilpd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 05 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPERMILPD xmm1, xmmV, xmm2/m128","VPERMILPD xmm2/m128, xmmV, xmm1","vpermilpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 0D /r","V","V","AVX","","w,r,r","",""
+"VPERMILPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMILPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermilpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 0D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPERMILPD ymm1, ymmV, ymm2/m256","VPERMILPD ymm2/m256, ymmV, ymm1","vpermilpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 0D /r","V","V","AVX","","w,r,r","",""
+"VPERMILPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMILPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermilpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 0D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPERMILPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMILPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermilpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 0D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPERMILPS xmm1, xmm2/m128, imm8u","VPERMILPS imm8u, xmm2/m128, xmm1","vpermilps imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.W0 04 /r ib","V","V","AVX","","w,r,r","",""
+"VPERMILPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VPERMILPS imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","vpermilps imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W0 04 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPERMILPS ymm1, ymm2/m256, imm8u","VPERMILPS imm8u, ymm2/m256, ymm1","vpermilps imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W0 04 /r ib","V","V","AVX","","w,r,r","",""
+"VPERMILPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VPERMILPS imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","vpermilps imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W0 04 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPERMILPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VPERMILPS imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","vpermilps imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W0 04 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPERMILPS xmm1, xmmV, xmm2/m128","VPERMILPS xmm2/m128, xmmV, xmm1","vpermilps xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 0C /r","V","V","AVX","","w,r,r","",""
+"VPERMILPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMILPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermilps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 0C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPERMILPS ymm1, ymmV, ymm2/m256","VPERMILPS ymm2/m256, ymmV, ymm1","vpermilps ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 0C /r","V","V","AVX","","w,r,r","",""
+"VPERMILPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMILPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermilps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 0C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPERMILPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMILPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermilps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 0C /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPERMPD ymm1, ymm2/m256, imm8u","VPERMPD imm8u, ymm2/m256, ymm1","vpermpd imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W1 01 /r ib","V","V","AVX2","","w,r,r","",""
+"VPERMPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMPD imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vpermpd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 01 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPERMPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMPD imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vpermpd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 01 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPERMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 16 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPERMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 16 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPERMPS ymm1, ymmV, ymm2/m256","VPERMPS ymm2/m256, ymmV, ymm1","vpermps ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 16 /r","V","V","AVX2","","w,r,r","",""
+"VPERMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 16 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPERMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 16 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPERMQ ymm1, ymm2/m256, imm8u","VPERMQ imm8u, ymm2/m256, ymm1","vpermq imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W1 00 /r ib","V","V","AVX2","","w,r,r","",""
+"VPERMQ ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vpermq imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 00 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPERMQ zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vpermq imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 00 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPERMQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 36 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPERMQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 36 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPERMT2B xmm1, {k}{z}, xmmV, xmm2/m128","VPERMT2B xmm2/m128, xmmV, {k}{z}, xmm1","vpermt2b xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 7D /r","V","V","AVX512_VBMI+AVX512VL","scale16","rw,r,r,r","",""
+"VPERMT2B ymm1, {k}{z}, ymmV, ymm2/m256","VPERMT2B ymm2/m256, ymmV, {k}{z}, ymm1","vpermt2b ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 7D /r","V","V","AVX512_VBMI+AVX512VL","scale32","rw,r,r,r","",""
+"VPERMT2B zmm1, {k}{z}, zmmV, zmm2/m512","VPERMT2B zmm2/m512, zmmV, {k}{z}, zmm1","vpermt2b zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 7D /r","V","V","AVX512_VBMI","scale64","rw,r,r,r","",""
+"VPERMT2D xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMT2D xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermt2d xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 7E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPERMT2D ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMT2D ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermt2d ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 7E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPERMT2D zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMT2D zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermt2d zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 7E /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VPERMT2PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMT2PD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermt2pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPERMT2PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMT2PD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermt2pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPERMT2PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMT2PD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermt2pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7F /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VPERMT2PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMT2PS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpermt2ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 7F /r","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPERMT2PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMT2PS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpermt2ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 7F /r","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPERMT2PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMT2PS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpermt2ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 7F /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r","",""
+"VPERMT2Q xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMT2Q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpermt2q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPERMT2Q ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMT2Q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpermt2q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPERMT2Q zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMT2Q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpermt2q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7E /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r","",""
+"VPERMT2W xmm1, {k}{z}, xmmV, xmm2/m128","VPERMT2W xmm2/m128, xmmV, {k}{z}, xmm1","vpermt2w xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7D /r","V","V","AVX512BW+AVX512VL","scale16","rw,r,r,r","",""
+"VPERMT2W ymm1, {k}{z}, ymmV, ymm2/m256","VPERMT2W ymm2/m256, ymmV, {k}{z}, ymm1","vpermt2w ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7D /r","V","V","AVX512BW+AVX512VL","scale32","rw,r,r,r","",""
+"VPERMT2W zmm1, {k}{z}, zmmV, zmm2/m512","VPERMT2W zmm2/m512, zmmV, {k}{z}, zmm1","vpermt2w zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7D /r","V","V","AVX512BW","scale64","rw,r,r,r","",""
+"VPERMW xmm1, {k}{z}, xmmV, xmm2/m128","VPERMW xmm2/m128, xmmV, {k}{z}, xmm1","vpermw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 8D /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPERMW ymm1, {k}{z}, ymmV, ymm2/m256","VPERMW ymm2/m256, ymmV, {k}{z}, ymm1","vpermw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 8D /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPERMW zmm1, {k}{z}, zmmV, zmm2/m512","VPERMW zmm2/m512, zmmV, {k}{z}, zmm1","vpermw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 8D /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPEXPANDB xmm1, {k}{z}, xmm2/m128","VPEXPANDB xmm2/m128, {k}{z}, xmm1","vpexpandb xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 62 /r","V","V","AVX512_VBMI2+AVX512VL","scale1","w,r,r","",""
+"VPEXPANDB ymm1, {k}{z}, ymm2/m256","VPEXPANDB ymm2/m256, {k}{z}, ymm1","vpexpandb ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 62 /r","V","V","AVX512_VBMI2+AVX512VL","scale1","w,r,r","",""
+"VPEXPANDB zmm1, {k}{z}, zmm2/m512","VPEXPANDB zmm2/m512, {k}{z}, zmm1","vpexpandb zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 62 /r","V","V","AVX512_VBMI2","scale1","w,r,r","",""
+"VPEXPANDD xmm1, {k}{z}, xmm2/m128","VPEXPANDD xmm2/m128, {k}{z}, xmm1","vpexpandd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 89 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPEXPANDD ymm1, {k}{z}, ymm2/m256","VPEXPANDD ymm2/m256, {k}{z}, ymm1","vpexpandd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 89 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPEXPANDD zmm1, {k}{z}, zmm2/m512","VPEXPANDD zmm2/m512, {k}{z}, zmm1","vpexpandd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 89 /r","V","V","AVX512F","scale4","w,r,r","",""
+"VPEXPANDQ xmm1, {k}{z}, xmm2/m128","VPEXPANDQ xmm2/m128, {k}{z}, xmm1","vpexpandq xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 89 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPEXPANDQ ymm1, {k}{z}, ymm2/m256","VPEXPANDQ ymm2/m256, {k}{z}, ymm1","vpexpandq ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 89 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPEXPANDQ zmm1, {k}{z}, zmm2/m512","VPEXPANDQ zmm2/m512, {k}{z}, zmm1","vpexpandq zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 89 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPEXPANDW xmm1, {k}{z}, xmm2/m128","VPEXPANDW xmm2/m128, {k}{z}, xmm1","vpexpandw xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 62 /r","V","V","AVX512_VBMI2+AVX512VL","scale2","w,r,r","",""
+"VPEXPANDW ymm1, {k}{z}, ymm2/m256","VPEXPANDW ymm2/m256, {k}{z}, ymm1","vpexpandw ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 62 /r","V","V","AVX512_VBMI2+AVX512VL","scale2","w,r,r","",""
+"VPEXPANDW zmm1, {k}{z}, zmm2/m512","VPEXPANDW zmm2/m512, {k}{z}, zmm1","vpexpandw zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 62 /r","V","V","AVX512_VBMI2","scale2","w,r,r","",""
+"VPEXTRB r32/m8, xmm1, imm8u","VPEXTRB imm8u, xmm1, r32/m8","vpextrb imm8u, xmm1, r32/m8","EVEX.128.66.0F3A.WIG 14 /r ib","V","V","AVX512BW+AVX512VL","scale1","w,r,r","",""
+"VPEXTRB r32/m8, xmm1, imm8u","VPEXTRB imm8u, xmm1, r32/m8","vpextrb imm8u, xmm1, r32/m8","VEX.128.66.0F3A.WIG 14 /r ib","V","V","AVX","","w,r,r","",""
+"VPEXTRD r/m32, xmm1, imm8u","VPEXTRD imm8u, xmm1, r/m32","vpextrd imm8u, xmm1, r/m32","EVEX.128.66.0F3A.W0 16 /r ib","V","V","AVX512DQ+AVX512VL","scale4","w,r,r","",""
+"VPEXTRD r/m32, xmm1, imm8u","VPEXTRD imm8u, xmm1, r/m32","vpextrd imm8u, xmm1, r/m32","VEX.128.66.0F3A.W0 16 /r ib","V","V","AVX","","w,r,r","",""
+"VPEXTRQ r/m64, xmm1, imm8u","VPEXTRQ imm8u, xmm1, r/m64","vpextrq imm8u, xmm1, r/m64","EVEX.128.66.0F3A.W1 16 /r ib","N.S.","V","AVX512DQ+AVX512VL","scale8","w,r,r","",""
+"VPEXTRQ r/m64, xmm1, imm8u","VPEXTRQ imm8u, xmm1, r/m64","vpextrq imm8u, xmm1, r/m64","VEX.128.66.0F3A.W1 16 /r ib","N.S.","V","AVX","","w,r,r","",""
+"VPEXTRW r32/m16, xmm1, imm8u","VPEXTRW imm8u, xmm1, r32/m16","vpextrw imm8u, xmm1, r32/m16","EVEX.128.66.0F3A.WIG 15 /r ib","V","V","AVX512BW+AVX512VL","scale2","w,r,r","",""
+"VPEXTRW r32/m16, xmm1, imm8u","VPEXTRW imm8u, xmm1, r32/m16","vpextrw imm8u, xmm1, r32/m16","VEX.128.66.0F3A.WIG 15 /r ib","V","V","AVX","","w,r,r","",""
+"VPEXTRW r32, xmm2, imm8u","VPEXTRW imm8u, xmm2, r32","vpextrw imm8u, xmm2, r32","EVEX.128.66.0F.WIG C5 /r ib","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r,r","",""
+"VPEXTRW r32, xmm2, imm8u","VPEXTRW imm8u, xmm2, r32","vpextrw imm8u, xmm2, r32","VEX.128.66.0F.WIG C5 /r ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPGATHERDD xmm1, {k1-k7}, vm32x","VPGATHERDD vm32x, {k1-k7}, xmm1","vpgatherdd vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 90 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPGATHERDD ymm1, {k1-k7}, vm32y","VPGATHERDD vm32y, {k1-k7}, ymm1","vpgatherdd vm32y, {k1-k7}, ymm1","EVEX.256.66.0F38.W0 90 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPGATHERDD zmm1, {k1-k7}, vm32z","VPGATHERDD vm32z, {k1-k7}, zmm1","vpgatherdd vm32z, {k1-k7}, zmm1","EVEX.512.66.0F38.W0 90 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VPGATHERDD xmm1, vm32x, xmmV","VPGATHERDD xmmV, vm32x, xmm1","vpgatherdd xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W0 90 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERDD ymm1, vm32y, ymmV","VPGATHERDD ymmV, vm32y, ymm1","vpgatherdd ymmV, vm32y, ymm1","VEX.DDS.256.66.0F38.W0 90 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERDQ xmm1, {k1-k7}, vm32x","VPGATHERDQ vm32x, {k1-k7}, xmm1","vpgatherdq vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 90 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPGATHERDQ ymm1, {k1-k7}, vm32x","VPGATHERDQ vm32x, {k1-k7}, ymm1","vpgatherdq vm32x, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 90 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPGATHERDQ zmm1, {k1-k7}, vm32y","VPGATHERDQ vm32y, {k1-k7}, zmm1","vpgatherdq vm32y, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 90 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VPGATHERDQ xmm1, vm32x, xmmV","VPGATHERDQ xmmV, vm32x, xmm1","vpgatherdq xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W1 90 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERDQ ymm1, vm32x, ymmV","VPGATHERDQ ymmV, vm32x, ymm1","vpgatherdq ymmV, vm32x, ymm1","VEX.DDS.256.66.0F38.W1 90 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERQD xmm1, {k1-k7}, vm64x","VPGATHERQD vm64x, {k1-k7}, xmm1","vpgatherqd vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 91 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPGATHERQD xmm1, {k1-k7}, vm64y","VPGATHERQD vm64y, {k1-k7}, xmm1","vpgatherqd vm64y, {k1-k7}, xmm1","EVEX.256.66.0F38.W0 91 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPGATHERQD ymm1, {k1-k7}, vm64z","VPGATHERQD vm64z, {k1-k7}, ymm1","vpgatherqd vm64z, {k1-k7}, ymm1","EVEX.512.66.0F38.W0 91 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VPGATHERQD xmm1, vm64x, xmmV","VPGATHERQD xmmV, vm64x, xmm1","vpgatherqd xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W0 91 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERQD xmm1, vm64y, xmmV","VPGATHERQD xmmV, vm64y, xmm1","vpgatherqd xmmV, vm64y, xmm1","VEX.DDS.256.66.0F38.W0 91 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERQQ xmm1, {k1-k7}, vm64x","VPGATHERQQ vm64x, {k1-k7}, xmm1","vpgatherqq vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 91 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPGATHERQQ ymm1, {k1-k7}, vm64y","VPGATHERQQ vm64y, {k1-k7}, ymm1","vpgatherqq vm64y, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 91 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPGATHERQQ zmm1, {k1-k7}, vm64z","VPGATHERQQ vm64z, {k1-k7}, zmm1","vpgatherqq vm64z, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 91 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VPGATHERQQ xmm1, vm64x, xmmV","VPGATHERQQ xmmV, vm64x, xmm1","vpgatherqq xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W1 91 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPGATHERQQ ymm1, vm64y, ymmV","VPGATHERQQ ymmV, vm64y, ymm1","vpgatherqq ymmV, vm64y, ymm1","VEX.DDS.256.66.0F38.W1 91 /r","V","V","AVX2","modrm_memonly","rw,r,rw","",""
+"VPHADDBD xmm1, xmm2/m128","VPHADDBD xmm2/m128, xmm1","vphaddbd xmm2/m128, xmm1","XOP.128.09.W0 C2 /r","V","V","XOP","amd","w,r","",""
+"VPHADDBQ xmm1, xmm2/m128","VPHADDBQ xmm2/m128, xmm1","vphaddbq xmm2/m128, xmm1","XOP.128.09.W0 C3 /r","V","V","XOP","amd","w,r","",""
+"VPHADDBW xmm1, xmm2/m128","VPHADDBW xmm2/m128, xmm1","vphaddbw xmm2/m128, xmm1","XOP.128.09.W0 C1 /r","V","V","XOP","amd","w,r","",""
+"VPHADDD xmm1, xmmV, xmm2/m128","VPHADDD xmm2/m128, xmmV, xmm1","vphaddd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 02 /r","V","V","AVX","","w,r,r","",""
+"VPHADDD ymm1, ymmV, ymm2/m256","VPHADDD ymm2/m256, ymmV, ymm1","vphaddd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 02 /r","V","V","AVX2","","w,r,r","",""
+"VPHADDDQ xmm1, xmm2/m128","VPHADDDQ xmm2/m128, xmm1","vphadddq xmm2/m128, xmm1","XOP.128.09.W0 CB /r","V","V","XOP","amd","w,r","",""
+"VPHADDSW xmm1, xmmV, xmm2/m128","VPHADDSW xmm2/m128, xmmV, xmm1","vphaddsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 03 /r","V","V","AVX","","w,r,r","",""
+"VPHADDSW ymm1, ymmV, ymm2/m256","VPHADDSW ymm2/m256, ymmV, ymm1","vphaddsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 03 /r","V","V","AVX2","","w,r,r","",""
+"VPHADDUBD xmm1, xmm2/m128","VPHADDUBD xmm2/m128, xmm1","vphaddubd xmm2/m128, xmm1","XOP.128.09.W0 D2 /r","V","V","XOP","amd","w,r","",""
+"VPHADDUBQ xmm1, xmm2/m128","VPHADDUBQ xmm2/m128, xmm1","vphaddubq xmm2/m128, xmm1","XOP.128.09.W0 D3 /r","V","V","XOP","amd","w,r","",""
+"VPHADDUBW xmm1, xmm2/m128","VPHADDUBW xmm2/m128, xmm1","vphaddubw xmm2/m128, xmm1","XOP.128.09.W0 D1 /r","V","V","XOP","amd","w,r","",""
+"VPHADDUDQ xmm1, xmm2/m128","VPHADDUDQ xmm2/m128, xmm1","vphaddudq xmm2/m128, xmm1","XOP.128.09.W0 DB /r","V","V","XOP","amd","w,r","",""
+"VPHADDUWD xmm1, xmm2/m128","VPHADDUWD xmm2/m128, xmm1","vphadduwd xmm2/m128, xmm1","XOP.128.09.W0 D6 /r","V","V","XOP","amd","w,r","",""
+"VPHADDUWQ xmm1, xmm2/m128","VPHADDUWQ xmm2/m128, xmm1","vphadduwq xmm2/m128, xmm1","XOP.128.09.W0 D7 /r","V","V","XOP","amd","w,r","",""
+"VPHADDW xmm1, xmmV, xmm2/m128","VPHADDW xmm2/m128, xmmV, xmm1","vphaddw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 01 /r","V","V","AVX","","w,r,r","",""
+"VPHADDW ymm1, ymmV, ymm2/m256","VPHADDW ymm2/m256, ymmV, ymm1","vphaddw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 01 /r","V","V","AVX2","","w,r,r","",""
+"VPHADDWD xmm1, xmm2/m128","VPHADDWD xmm2/m128, xmm1","vphaddwd xmm2/m128, xmm1","XOP.128.09.W0 C6 /r","V","V","XOP","amd","w,r","",""
+"VPHADDWQ xmm1, xmm2/m128","VPHADDWQ xmm2/m128, xmm1","vphaddwq xmm2/m128, xmm1","XOP.128.09.W0 C7 /r","V","V","XOP","amd","w,r","",""
+"VPHMINPOSUW xmm1, xmm2/m128","VPHMINPOSUW xmm2/m128, xmm1","vphminposuw xmm2/m128, xmm1","VEX.128.66.0F38.WIG 41 /r","V","V","AVX","","w,r","",""
+"VPHSUBBW xmm1, xmm2/m128","VPHSUBBW xmm2/m128, xmm1","vphsubbw xmm2/m128, xmm1","XOP.128.09.W0 E1 /r","V","V","XOP","amd","w,r","",""
+"VPHSUBD xmm1, xmmV, xmm2/m128","VPHSUBD xmm2/m128, xmmV, xmm1","vphsubd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 06 /r","V","V","AVX","","w,r,r","",""
+"VPHSUBD ymm1, ymmV, ymm2/m256","VPHSUBD ymm2/m256, ymmV, ymm1","vphsubd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 06 /r","V","V","AVX2","","w,r,r","",""
+"VPHSUBDQ xmm1, xmm2/m128","VPHSUBDQ xmm2/m128, xmm1","vphsubdq xmm2/m128, xmm1","XOP.128.09.W0 E3 /r","V","V","XOP","amd","w,r","",""
+"VPHSUBSW xmm1, xmmV, xmm2/m128","VPHSUBSW xmm2/m128, xmmV, xmm1","vphsubsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 07 /r","V","V","AVX","","w,r,r","",""
+"VPHSUBSW ymm1, ymmV, ymm2/m256","VPHSUBSW ymm2/m256, ymmV, ymm1","vphsubsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 07 /r","V","V","AVX2","","w,r,r","",""
+"VPHSUBW xmm1, xmmV, xmm2/m128","VPHSUBW xmm2/m128, xmmV, xmm1","vphsubw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 05 /r","V","V","AVX","","w,r,r","",""
+"VPHSUBW ymm1, ymmV, ymm2/m256","VPHSUBW ymm2/m256, ymmV, ymm1","vphsubw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 05 /r","V","V","AVX2","","w,r,r","",""
+"VPHSUBWD xmm1, xmm2/m128","VPHSUBWD xmm2/m128, xmm1","vphsubwd xmm2/m128, xmm1","XOP.128.09.W0 E2 /r","V","V","XOP","amd","w,r","",""
+"VPINSRB xmm1, xmmV, r32/m8, imm8u","VPINSRB imm8u, r32/m8, xmmV, xmm1","vpinsrb imm8u, r32/m8, xmmV, xmm1","EVEX.NDS.128.66.0F3A.WIG 20 /r ib","V","V","AVX512BW+AVX512VL","scale1","w,r,r,r","",""
+"VPINSRB xmm1, xmmV, r32/m8, imm8u","VPINSRB imm8u, r32/m8, xmmV, xmm1","vpinsrb imm8u, r32/m8, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 20 /r ib","V","V","AVX","","w,r,r,r","",""
+"VPINSRD xmm1, xmmV, r/m32, imm8u","VPINSRD imm8u, r/m32, xmmV, xmm1","vpinsrd imm8u, r/m32, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W0 22 /r ib","V","V","AVX512DQ+AVX512VL","scale4","w,r,r,r","",""
+"VPINSRD xmm1, xmmV, r/m32, imm8u","VPINSRD imm8u, r/m32, xmmV, xmm1","vpinsrd imm8u, r/m32, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 22 /r ib","V","V","AVX","","w,r,r,r","",""
+"VPINSRQ xmm1, xmmV, r/m64, imm8u","VPINSRQ imm8u, r/m64, xmmV, xmm1","vpinsrq imm8u, r/m64, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W1 22 /r ib","N.S.","V","AVX512DQ+AVX512VL","scale8","w,r,r,r","",""
+"VPINSRQ xmm1, xmmV, r/m64, imm8u","VPINSRQ imm8u, r/m64, xmmV, xmm1","vpinsrq imm8u, r/m64, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 22 /r ib","N.S.","V","AVX","","w,r,r,r","",""
+"VPINSRW xmm1, xmmV, r32/m16, imm8u","VPINSRW imm8u, r32/m16, xmmV, xmm1","vpinsrw imm8u, r32/m16, xmmV, xmm1","EVEX.NDS.128.66.0F.WIG C4 /r ib","V","V","AVX512BW+AVX512VL","scale2","w,r,r,r","",""
+"VPINSRW xmm1, xmmV, r32/m16, imm8u","VPINSRW imm8u, r32/m16, xmmV, xmm1","vpinsrw imm8u, r32/m16, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C4 /r ib","V","V","AVX","","w,r,r,r","",""
+"VPLZCNTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPLZCNTD xmm2/m128/m32bcst, {k}{z}, xmm1","vplzcntd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 44 /r","V","V","AVX512CD+AVX512VL","bscale4,scale16","w,r,r","",""
+"VPLZCNTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPLZCNTD ymm2/m256/m32bcst, {k}{z}, ymm1","vplzcntd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 44 /r","V","V","AVX512CD+AVX512VL","bscale4,scale32","w,r,r","",""
+"VPLZCNTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPLZCNTD zmm2/m512/m32bcst, {k}{z}, zmm1","vplzcntd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 44 /r","V","V","AVX512CD","bscale4,scale64","w,r,r","",""
+"VPLZCNTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPLZCNTQ xmm2/m128/m64bcst, {k}{z}, xmm1","vplzcntq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 44 /r","V","V","AVX512CD+AVX512VL","bscale8,scale16","w,r,r","",""
+"VPLZCNTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPLZCNTQ ymm2/m256/m64bcst, {k}{z}, ymm1","vplzcntq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 44 /r","V","V","AVX512CD+AVX512VL","bscale8,scale32","w,r,r","",""
+"VPLZCNTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPLZCNTQ zmm2/m512/m64bcst, {k}{z}, zmm1","vplzcntq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 44 /r","V","V","AVX512CD","bscale8,scale64","w,r,r","",""
+"VPMACSDD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDD xmmIH, xmm2/m128, xmmV, xmm1","vpmacsdd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 9E /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSDQH xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDQH xmmIH, xmm2/m128, xmmV, xmm1","vpmacsdqh xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 9F /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSDQL xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDQL xmmIH, xmm2/m128, xmmV, xmm1","vpmacsdql xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 97 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSSDD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDD xmmIH, xmm2/m128, xmmV, xmm1","vpmacssdd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 8E /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSSDQH xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDQH xmmIH, xmm2/m128, xmmV, xmm1","vpmacssdqh xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 8F /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSSDQL xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDQL xmmIH, xmm2/m128, xmmV, xmm1","vpmacssdql xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 87 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSWD xmmIH, xmm2/m128, xmmV, xmm1","vpmacsswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 86 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSSWW xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSWW xmmIH, xmm2/m128, xmmV, xmm1","vpmacssww xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 85 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSWD xmmIH, xmm2/m128, xmmV, xmm1","vpmacswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 96 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMACSWW xmm1, xmmV, xmm2/m128, xmmIH","VPMACSWW xmmIH, xmm2/m128, xmmV, xmm1","vpmacsww xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 95 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMADCSSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMADCSSWD xmmIH, xmm2/m128, xmmV, xmm1","vpmadcsswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A6 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMADCSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMADCSWD xmmIH, xmm2/m128, xmmV, xmm1","vpmadcswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 B6 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPMADD52HUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMADD52HUQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmadd52huq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B5 /r","V","V","AVX512_IFMA+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPMADD52HUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMADD52HUQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmadd52huq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B5 /r","V","V","AVX512_IFMA+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPMADD52HUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMADD52HUQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmadd52huq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B5 /r","V","V","AVX512_IFMA","bscale8,scale64","rw,r,r,r","",""
+"VPMADD52LUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMADD52LUQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmadd52luq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B4 /r","V","V","AVX512_IFMA+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPMADD52LUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMADD52LUQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmadd52luq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B4 /r","V","V","AVX512_IFMA+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPMADD52LUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMADD52LUQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmadd52luq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B4 /r","V","V","AVX512_IFMA","bscale8,scale64","rw,r,r,r","",""
+"VPMADDUBSW xmm1, xmmV, xmm2/m128","VPMADDUBSW xmm2/m128, xmmV, xmm1","vpmaddubsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 04 /r","V","V","AVX","","w,r,r","",""
+"VPMADDUBSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMADDUBSW xmm2/m128, xmmV, {k}{z}, xmm1","vpmaddubsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 04 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMADDUBSW ymm1, ymmV, ymm2/m256","VPMADDUBSW ymm2/m256, ymmV, ymm1","vpmaddubsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 04 /r","V","V","AVX2","","w,r,r","",""
+"VPMADDUBSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMADDUBSW ymm2/m256, ymmV, {k}{z}, ymm1","vpmaddubsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 04 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMADDUBSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMADDUBSW zmm2/m512, zmmV, {k}{z}, zmm1","vpmaddubsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 04 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMADDWD xmm1, xmmV, xmm2/m128","VPMADDWD xmm2/m128, xmmV, xmm1","vpmaddwd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F5 /r","V","V","AVX","","w,r,r","",""
+"VPMADDWD xmm1, {k}{z}, xmmV, xmm2/m128","VPMADDWD xmm2/m128, xmmV, {k}{z}, xmm1","vpmaddwd xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F5 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMADDWD ymm1, ymmV, ymm2/m256","VPMADDWD ymm2/m256, ymmV, ymm1","vpmaddwd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F5 /r","V","V","AVX2","","w,r,r","",""
+"VPMADDWD ymm1, {k}{z}, ymmV, ymm2/m256","VPMADDWD ymm2/m256, ymmV, {k}{z}, ymm1","vpmaddwd ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F5 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMADDWD zmm1, {k}{z}, zmmV, zmm2/m512","VPMADDWD zmm2/m512, zmmV, {k}{z}, zmm1","vpmaddwd zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F5 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMASKMOVD xmm1, xmmV, m128","VPMASKMOVD m128, xmmV, xmm1","vpmaskmovd m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 8C /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVD ymm1, ymmV, m256","VPMASKMOVD m256, ymmV, ymm1","vpmaskmovd m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 8C /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVD m128, xmmV, xmm1","VPMASKMOVD xmm1, xmmV, m128","vpmaskmovd xmm1, xmmV, m128","VEX.NDS.128.66.0F38.W0 8E /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVD m256, ymmV, ymm1","VPMASKMOVD ymm1, ymmV, m256","vpmaskmovd ymm1, ymmV, m256","VEX.NDS.256.66.0F38.W0 8E /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVQ xmm1, xmmV, m128","VPMASKMOVQ m128, xmmV, xmm1","vpmaskmovq m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 8C /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVQ ymm1, ymmV, m256","VPMASKMOVQ m256, ymmV, ymm1","vpmaskmovq m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 8C /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVQ m128, xmmV, xmm1","VPMASKMOVQ xmm1, xmmV, m128","vpmaskmovq xmm1, xmmV, m128","VEX.NDS.128.66.0F38.W1 8E /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMASKMOVQ m256, ymmV, ymm1","VPMASKMOVQ ymm1, ymmV, m256","vpmaskmovq ymm1, ymmV, m256","VEX.NDS.256.66.0F38.W1 8E /r","V","V","AVX2","modrm_memonly","w,r,r","",""
+"VPMAXSB xmm1, xmmV, xmm2/m128","VPMAXSB xmm2/m128, xmmV, xmm1","vpmaxsb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3C /r","V","V","AVX","","w,r,r","",""
+"VPMAXSB xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXSB xmm2/m128, xmmV, {k}{z}, xmm1","vpmaxsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3C /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMAXSB ymm1, ymmV, ymm2/m256","VPMAXSB ymm2/m256, ymmV, ymm1","vpmaxsb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3C /r","V","V","AVX2","","w,r,r","",""
+"VPMAXSB ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXSB ymm2/m256, ymmV, {k}{z}, ymm1","vpmaxsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3C /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMAXSB zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXSB zmm2/m512, zmmV, {k}{z}, zmm1","vpmaxsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3C /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMAXSD xmm1, xmmV, xmm2/m128","VPMAXSD xmm2/m128, xmmV, xmm1","vpmaxsd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3D /r","V","V","AVX","","w,r,r","",""
+"VPMAXSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMAXSD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpmaxsd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 3D /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPMAXSD ymm1, ymmV, ymm2/m256","VPMAXSD ymm2/m256, ymmV, ymm1","vpmaxsd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3D /r","V","V","AVX2","","w,r,r","",""
+"VPMAXSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMAXSD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpmaxsd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 3D /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPMAXSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMAXSD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpmaxsd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 3D /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPMAXSQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMAXSQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmaxsq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 3D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMAXSQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMAXSQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmaxsq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 3D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMAXSQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMAXSQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmaxsq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 3D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPMAXSW xmm1, xmmV, xmm2/m128","VPMAXSW xmm2/m128, xmmV, xmm1","vpmaxsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EE /r","V","V","AVX","","w,r,r","",""
+"VPMAXSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXSW xmm2/m128, xmmV, {k}{z}, xmm1","vpmaxsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EE /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMAXSW ymm1, ymmV, ymm2/m256","VPMAXSW ymm2/m256, ymmV, ymm1","vpmaxsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EE /r","V","V","AVX2","","w,r,r","",""
+"VPMAXSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXSW ymm2/m256, ymmV, {k}{z}, ymm1","vpmaxsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EE /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMAXSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXSW zmm2/m512, zmmV, {k}{z}, zmm1","vpmaxsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EE /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMAXUB xmm1, xmmV, xmm2/m128","VPMAXUB xmm2/m128, xmmV, xmm1","vpmaxub xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DE /r","V","V","AVX","","w,r,r","",""
+"VPMAXUB xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXUB xmm2/m128, xmmV, {k}{z}, xmm1","vpmaxub xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DE /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMAXUB ymm1, ymmV, ymm2/m256","VPMAXUB ymm2/m256, ymmV, ymm1","vpmaxub ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DE /r","V","V","AVX2","","w,r,r","",""
+"VPMAXUB ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXUB ymm2/m256, ymmV, {k}{z}, ymm1","vpmaxub ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DE /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMAXUB zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXUB zmm2/m512, zmmV, {k}{z}, zmm1","vpmaxub zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DE /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMAXUD xmm1, xmmV, xmm2/m128","VPMAXUD xmm2/m128, xmmV, xmm1","vpmaxud xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3F /r","V","V","AVX","","w,r,r","",""
+"VPMAXUD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMAXUD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpmaxud xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 3F /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPMAXUD ymm1, ymmV, ymm2/m256","VPMAXUD ymm2/m256, ymmV, ymm1","vpmaxud ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3F /r","V","V","AVX2","","w,r,r","",""
+"VPMAXUD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMAXUD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpmaxud ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 3F /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPMAXUD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMAXUD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpmaxud zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 3F /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPMAXUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMAXUQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmaxuq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 3F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMAXUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMAXUQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmaxuq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 3F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMAXUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMAXUQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmaxuq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 3F /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPMAXUW xmm1, xmmV, xmm2/m128","VPMAXUW xmm2/m128, xmmV, xmm1","vpmaxuw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3E /r","V","V","AVX","","w,r,r","",""
+"VPMAXUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXUW xmm2/m128, xmmV, {k}{z}, xmm1","vpmaxuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3E /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMAXUW ymm1, ymmV, ymm2/m256","VPMAXUW ymm2/m256, ymmV, ymm1","vpmaxuw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3E /r","V","V","AVX2","","w,r,r","",""
+"VPMAXUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXUW ymm2/m256, ymmV, {k}{z}, ymm1","vpmaxuw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3E /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMAXUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXUW zmm2/m512, zmmV, {k}{z}, zmm1","vpmaxuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3E /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMINSB xmm1, xmmV, xmm2/m128","VPMINSB xmm2/m128, xmmV, xmm1","vpminsb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 38 /r","V","V","AVX","","w,r,r","",""
+"VPMINSB xmm1, {k}{z}, xmmV, xmm2/m128","VPMINSB xmm2/m128, xmmV, {k}{z}, xmm1","vpminsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 38 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMINSB ymm1, ymmV, ymm2/m256","VPMINSB ymm2/m256, ymmV, ymm1","vpminsb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 38 /r","V","V","AVX2","","w,r,r","",""
+"VPMINSB ymm1, {k}{z}, ymmV, ymm2/m256","VPMINSB ymm2/m256, ymmV, {k}{z}, ymm1","vpminsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 38 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMINSB zmm1, {k}{z}, zmmV, zmm2/m512","VPMINSB zmm2/m512, zmmV, {k}{z}, zmm1","vpminsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 38 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMINSD xmm1, xmmV, xmm2/m128","VPMINSD xmm2/m128, xmmV, xmm1","vpminsd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 39 /r","V","V","AVX","","w,r,r","",""
+"VPMINSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMINSD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpminsd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 39 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPMINSD ymm1, ymmV, ymm2/m256","VPMINSD ymm2/m256, ymmV, ymm1","vpminsd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 39 /r","V","V","AVX2","","w,r,r","",""
+"VPMINSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMINSD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpminsd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 39 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPMINSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMINSD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpminsd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 39 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPMINSQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMINSQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpminsq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 39 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMINSQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMINSQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpminsq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 39 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMINSQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMINSQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpminsq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 39 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPMINSW xmm1, xmmV, xmm2/m128","VPMINSW xmm2/m128, xmmV, xmm1","vpminsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EA /r","V","V","AVX","","w,r,r","",""
+"VPMINSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMINSW xmm2/m128, xmmV, {k}{z}, xmm1","vpminsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EA /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMINSW ymm1, ymmV, ymm2/m256","VPMINSW ymm2/m256, ymmV, ymm1","vpminsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EA /r","V","V","AVX2","","w,r,r","",""
+"VPMINSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMINSW ymm2/m256, ymmV, {k}{z}, ymm1","vpminsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EA /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMINSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMINSW zmm2/m512, zmmV, {k}{z}, zmm1","vpminsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EA /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMINUB xmm1, xmmV, xmm2/m128","VPMINUB xmm2/m128, xmmV, xmm1","vpminub xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DA /r","V","V","AVX","","w,r,r","",""
+"VPMINUB xmm1, {k}{z}, xmmV, xmm2/m128","VPMINUB xmm2/m128, xmmV, {k}{z}, xmm1","vpminub xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DA /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMINUB ymm1, ymmV, ymm2/m256","VPMINUB ymm2/m256, ymmV, ymm1","vpminub ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DA /r","V","V","AVX2","","w,r,r","",""
+"VPMINUB ymm1, {k}{z}, ymmV, ymm2/m256","VPMINUB ymm2/m256, ymmV, {k}{z}, ymm1","vpminub ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DA /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMINUB zmm1, {k}{z}, zmmV, zmm2/m512","VPMINUB zmm2/m512, zmmV, {k}{z}, zmm1","vpminub zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DA /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMINUD xmm1, xmmV, xmm2/m128","VPMINUD xmm2/m128, xmmV, xmm1","vpminud xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3B /r","V","V","AVX","","w,r,r","",""
+"VPMINUD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMINUD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpminud xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 3B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPMINUD ymm1, ymmV, ymm2/m256","VPMINUD ymm2/m256, ymmV, ymm1","vpminud ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3B /r","V","V","AVX2","","w,r,r","",""
+"VPMINUD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMINUD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpminud ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 3B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPMINUD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMINUD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpminud zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 3B /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPMINUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMINUQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpminuq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 3B /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMINUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMINUQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpminuq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 3B /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMINUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMINUQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpminuq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 3B /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPMINUW xmm1, xmmV, xmm2/m128","VPMINUW xmm2/m128, xmmV, xmm1","vpminuw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3A /r","V","V","AVX","","w,r,r","",""
+"VPMINUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMINUW xmm2/m128, xmmV, {k}{z}, xmm1","vpminuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3A /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMINUW ymm1, ymmV, ymm2/m256","VPMINUW ymm2/m256, ymmV, ymm1","vpminuw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3A /r","V","V","AVX2","","w,r,r","",""
+"VPMINUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMINUW ymm2/m256, ymmV, {k}{z}, ymm1","vpminuw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3A /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMINUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMINUW zmm2/m512, zmmV, {k}{z}, zmm1","vpminuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3A /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMOVB2M k1, xmm2","VPMOVB2M xmm2, k1","vpmovb2m xmm2, k1","EVEX.128.F3.0F38.W0 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVB2M k1, ymm2","VPMOVB2M ymm2, k1","vpmovb2m ymm2, k1","EVEX.256.F3.0F38.W0 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVB2M k1, zmm2","VPMOVB2M zmm2, k1","vpmovb2m zmm2, k1","EVEX.512.F3.0F38.W0 29 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"VPMOVD2M k1, xmm2","VPMOVD2M xmm2, k1","vpmovd2m xmm2, k1","EVEX.128.F3.0F38.W0 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVD2M k1, ymm2","VPMOVD2M ymm2, k1","vpmovd2m ymm2, k1","EVEX.256.F3.0F38.W0 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVD2M k1, zmm2","VPMOVD2M zmm2, k1","vpmovd2m zmm2, k1","EVEX.512.F3.0F38.W0 39 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"VPMOVDB xmm2/m32, {k}{z}, xmm1","VPMOVDB xmm1, {k}{z}, xmm2/m32","vpmovdb xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 31 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVDB xmm2/m64, {k}{z}, ymm1","VPMOVDB ymm1, {k}{z}, xmm2/m64","vpmovdb ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 31 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVDB xmm2/m128, {k}{z}, zmm1","VPMOVDB zmm1, {k}{z}, xmm2/m128","vpmovdb zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 31 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVDW xmm2/m64, {k}{z}, xmm1","VPMOVDW xmm1, {k}{z}, xmm2/m64","vpmovdw xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 33 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVDW xmm2/m128, {k}{z}, ymm1","VPMOVDW ymm1, {k}{z}, xmm2/m128","vpmovdw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 33 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVDW ymm2/m256, {k}{z}, zmm1","VPMOVDW zmm1, {k}{z}, ymm2/m256","vpmovdw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 33 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVM2B xmm1, k2","VPMOVM2B k2, xmm1","vpmovm2b k2, xmm1","EVEX.128.F3.0F38.W0 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2B ymm1, k2","VPMOVM2B k2, ymm1","vpmovm2b k2, ymm1","EVEX.256.F3.0F38.W0 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2B zmm1, k2","VPMOVM2B k2, zmm1","vpmovm2b k2, zmm1","EVEX.512.F3.0F38.W0 28 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"VPMOVM2D xmm1, k2","VPMOVM2D k2, xmm1","vpmovm2d k2, xmm1","EVEX.128.F3.0F38.W0 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2D ymm1, k2","VPMOVM2D k2, ymm1","vpmovm2d k2, ymm1","EVEX.256.F3.0F38.W0 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2D zmm1, k2","VPMOVM2D k2, zmm1","vpmovm2d k2, zmm1","EVEX.512.F3.0F38.W0 38 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"VPMOVM2Q xmm1, k2","VPMOVM2Q k2, xmm1","vpmovm2q k2, xmm1","EVEX.128.F3.0F38.W1 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2Q ymm1, k2","VPMOVM2Q k2, ymm1","vpmovm2q k2, ymm1","EVEX.256.F3.0F38.W1 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2Q zmm1, k2","VPMOVM2Q k2, zmm1","vpmovm2q k2, zmm1","EVEX.512.F3.0F38.W1 38 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"VPMOVM2W xmm1, k2","VPMOVM2W k2, xmm1","vpmovm2w k2, xmm1","EVEX.128.F3.0F38.W1 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2W ymm1, k2","VPMOVM2W k2, ymm1","vpmovm2w k2, ymm1","EVEX.256.F3.0F38.W1 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVM2W zmm1, k2","VPMOVM2W k2, zmm1","vpmovm2w k2, zmm1","EVEX.512.F3.0F38.W1 28 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"VPMOVMSKB r32, xmm2","VPMOVMSKB xmm2, r32","vpmovmskb xmm2, r32","VEX.128.66.0F.WIG D7 /r","V","V","AVX","modrm_regonly","w,r","",""
+"VPMOVMSKB r32, ymm2","VPMOVMSKB ymm2, r32","vpmovmskb ymm2, r32","VEX.256.66.0F.WIG D7 /r","V","V","AVX2","modrm_regonly","w,r","",""
+"VPMOVQ2M k1, xmm2","VPMOVQ2M xmm2, k1","vpmovq2m xmm2, k1","EVEX.128.F3.0F38.W1 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVQ2M k1, ymm2","VPMOVQ2M ymm2, k1","vpmovq2m ymm2, k1","EVEX.256.F3.0F38.W1 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVQ2M k1, zmm2","VPMOVQ2M zmm2, k1","vpmovq2m zmm2, k1","EVEX.512.F3.0F38.W1 39 /r","V","V","AVX512DQ","modrm_regonly","w,r","",""
+"VPMOVQB xmm2/m16, {k}{z}, xmm1","VPMOVQB xmm1, {k}{z}, xmm2/m16","vpmovqb xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 32 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
+"VPMOVQB xmm2/m32, {k}{z}, ymm1","VPMOVQB ymm1, {k}{z}, xmm2/m32","vpmovqb ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 32 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVQB xmm2/m64, {k}{z}, zmm1","VPMOVQB zmm1, {k}{z}, xmm2/m64","vpmovqb zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 32 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPMOVQD xmm2/m64, {k}{z}, xmm1","VPMOVQD xmm1, {k}{z}, xmm2/m64","vpmovqd xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 35 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVQD xmm2/m128, {k}{z}, ymm1","VPMOVQD ymm1, {k}{z}, xmm2/m128","vpmovqd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 35 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVQD ymm2/m256, {k}{z}, zmm1","VPMOVQD zmm1, {k}{z}, ymm2/m256","vpmovqd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 35 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVQW xmm2/m32, {k}{z}, xmm1","VPMOVQW xmm1, {k}{z}, xmm2/m32","vpmovqw xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 34 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVQW xmm2/m64, {k}{z}, ymm1","VPMOVQW ymm1, {k}{z}, xmm2/m64","vpmovqw ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 34 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVQW xmm2/m128, {k}{z}, zmm1","VPMOVQW zmm1, {k}{z}, xmm2/m128","vpmovqw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 34 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVSDB xmm2/m32, {k}{z}, xmm1","VPMOVSDB xmm1, {k}{z}, xmm2/m32","vpmovsdb xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 21 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVSDB xmm2/m64, {k}{z}, ymm1","VPMOVSDB ymm1, {k}{z}, xmm2/m64","vpmovsdb ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 21 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSDB xmm2/m128, {k}{z}, zmm1","VPMOVSDB zmm1, {k}{z}, xmm2/m128","vpmovsdb zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 21 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVSDW xmm2/m64, {k}{z}, xmm1","VPMOVSDW xmm1, {k}{z}, xmm2/m64","vpmovsdw xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 23 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSDW xmm2/m128, {k}{z}, ymm1","VPMOVSDW ymm1, {k}{z}, xmm2/m128","vpmovsdw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 23 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVSDW ymm2/m256, {k}{z}, zmm1","VPMOVSDW zmm1, {k}{z}, ymm2/m256","vpmovsdw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 23 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVSQB xmm2/m16, {k}{z}, xmm1","VPMOVSQB xmm1, {k}{z}, xmm2/m16","vpmovsqb xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 22 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
+"VPMOVSQB xmm2/m32, {k}{z}, ymm1","VPMOVSQB ymm1, {k}{z}, xmm2/m32","vpmovsqb ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 22 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVSQB xmm2/m64, {k}{z}, zmm1","VPMOVSQB zmm1, {k}{z}, xmm2/m64","vpmovsqb zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 22 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPMOVSQD xmm2/m64, {k}{z}, xmm1","VPMOVSQD xmm1, {k}{z}, xmm2/m64","vpmovsqd xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 25 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSQD xmm2/m128, {k}{z}, ymm1","VPMOVSQD ymm1, {k}{z}, xmm2/m128","vpmovsqd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 25 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVSQD ymm2/m256, {k}{z}, zmm1","VPMOVSQD zmm1, {k}{z}, ymm2/m256","vpmovsqd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 25 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVSQW xmm2/m32, {k}{z}, xmm1","VPMOVSQW xmm1, {k}{z}, xmm2/m32","vpmovsqw xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 24 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVSQW xmm2/m64, {k}{z}, ymm1","VPMOVSQW ymm1, {k}{z}, xmm2/m64","vpmovsqw ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 24 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSQW xmm2/m128, {k}{z}, zmm1","VPMOVSQW zmm1, {k}{z}, xmm2/m128","vpmovsqw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 24 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVSWB xmm2/m64, {k}{z}, xmm1","VPMOVSWB xmm1, {k}{z}, xmm2/m64","vpmovswb xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 20 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
+"VPMOVSWB xmm2/m128, {k}{z}, ymm1","VPMOVSWB ymm1, {k}{z}, xmm2/m128","vpmovswb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 20 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPMOVSWB ymm2/m256, {k}{z}, zmm1","VPMOVSWB zmm1, {k}{z}, ymm2/m256","vpmovswb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 20 /r","V","V","AVX512BW","scale32","w,r,r","",""
+"VPMOVSXBD zmm1, {k}{z}, xmm2/m128","VPMOVSXBD xmm2/m128, {k}{z}, zmm1","vpmovsxbd xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 21 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVSXBD xmm1, xmm2/m32","VPMOVSXBD xmm2/m32, xmm1","vpmovsxbd xmm2/m32, xmm1","VEX.128.66.0F38.WIG 21 /r","V","V","AVX","","w,r","",""
+"VPMOVSXBD xmm1, {k}{z}, xmm2/m32","VPMOVSXBD xmm2/m32, {k}{z}, xmm1","vpmovsxbd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 21 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVSXBD ymm1, xmm2/m64","VPMOVSXBD xmm2/m64, ymm1","vpmovsxbd xmm2/m64, ymm1","VEX.256.66.0F38.WIG 21 /r","V","V","AVX2","","w,r","",""
+"VPMOVSXBD ymm1, {k}{z}, xmm2/m64","VPMOVSXBD xmm2/m64, {k}{z}, ymm1","vpmovsxbd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 21 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSXBQ xmm1, xmm2/m16","VPMOVSXBQ xmm2/m16, xmm1","vpmovsxbq xmm2/m16, xmm1","VEX.128.66.0F38.WIG 22 /r","V","V","AVX","","w,r","",""
+"VPMOVSXBQ xmm1, {k}{z}, xmm2/m16","VPMOVSXBQ xmm2/m16, {k}{z}, xmm1","vpmovsxbq xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 22 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
+"VPMOVSXBQ ymm1, xmm2/m32","VPMOVSXBQ xmm2/m32, ymm1","vpmovsxbq xmm2/m32, ymm1","VEX.256.66.0F38.WIG 22 /r","V","V","AVX2","","w,r","",""
+"VPMOVSXBQ ymm1, {k}{z}, xmm2/m32","VPMOVSXBQ xmm2/m32, {k}{z}, ymm1","vpmovsxbq xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 22 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVSXBQ zmm1, {k}{z}, xmm2/m64","VPMOVSXBQ xmm2/m64, {k}{z}, zmm1","vpmovsxbq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 22 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPMOVSXBW ymm1, xmm2/m128","VPMOVSXBW xmm2/m128, ymm1","vpmovsxbw xmm2/m128, ymm1","VEX.256.66.0F38.WIG 20 /r","V","V","AVX2","","w,r","",""
+"VPMOVSXBW ymm1, {k}{z}, xmm2/m128","VPMOVSXBW xmm2/m128, {k}{z}, ymm1","vpmovsxbw xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 20 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPMOVSXBW xmm1, xmm2/m64","VPMOVSXBW xmm2/m64, xmm1","vpmovsxbw xmm2/m64, xmm1","VEX.128.66.0F38.WIG 20 /r","V","V","AVX","","w,r","",""
+"VPMOVSXBW xmm1, {k}{z}, xmm2/m64","VPMOVSXBW xmm2/m64, {k}{z}, xmm1","vpmovsxbw xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 20 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
+"VPMOVSXBW zmm1, {k}{z}, ymm2/m256","VPMOVSXBW ymm2/m256, {k}{z}, zmm1","vpmovsxbw ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 20 /r","V","V","AVX512BW","scale32","w,r,r","",""
+"VPMOVSXDQ ymm1, xmm2/m128","VPMOVSXDQ xmm2/m128, ymm1","vpmovsxdq xmm2/m128, ymm1","VEX.256.66.0F38.WIG 25 /r","V","V","AVX2","","w,r","",""
+"VPMOVSXDQ ymm1, {k}{z}, xmm2/m128","VPMOVSXDQ xmm2/m128, {k}{z}, ymm1","vpmovsxdq xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 25 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVSXDQ xmm1, xmm2/m64","VPMOVSXDQ xmm2/m64, xmm1","vpmovsxdq xmm2/m64, xmm1","VEX.128.66.0F38.WIG 25 /r","V","V","AVX","","w,r","",""
+"VPMOVSXDQ xmm1, {k}{z}, xmm2/m64","VPMOVSXDQ xmm2/m64, {k}{z}, xmm1","vpmovsxdq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 25 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSXDQ zmm1, {k}{z}, ymm2/m256","VPMOVSXDQ ymm2/m256, {k}{z}, zmm1","vpmovsxdq ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 25 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVSXWD ymm1, xmm2/m128","VPMOVSXWD xmm2/m128, ymm1","vpmovsxwd xmm2/m128, ymm1","VEX.256.66.0F38.WIG 23 /r","V","V","AVX2","","w,r","",""
+"VPMOVSXWD ymm1, {k}{z}, xmm2/m128","VPMOVSXWD xmm2/m128, {k}{z}, ymm1","vpmovsxwd xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 23 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVSXWD xmm1, xmm2/m64","VPMOVSXWD xmm2/m64, xmm1","vpmovsxwd xmm2/m64, xmm1","VEX.128.66.0F38.WIG 23 /r","V","V","AVX","","w,r","",""
+"VPMOVSXWD xmm1, {k}{z}, xmm2/m64","VPMOVSXWD xmm2/m64, {k}{z}, xmm1","vpmovsxwd xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 23 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVSXWD zmm1, {k}{z}, ymm2/m256","VPMOVSXWD ymm2/m256, {k}{z}, zmm1","vpmovsxwd ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 23 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVSXWQ zmm1, {k}{z}, xmm2/m128","VPMOVSXWQ xmm2/m128, {k}{z}, zmm1","vpmovsxwq xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 24 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVSXWQ xmm1, xmm2/m32","VPMOVSXWQ xmm2/m32, xmm1","vpmovsxwq xmm2/m32, xmm1","VEX.128.66.0F38.WIG 24 /r","V","V","AVX","","w,r","",""
+"VPMOVSXWQ xmm1, {k}{z}, xmm2/m32","VPMOVSXWQ xmm2/m32, {k}{z}, xmm1","vpmovsxwq xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 24 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVSXWQ ymm1, xmm2/m64","VPMOVSXWQ xmm2/m64, ymm1","vpmovsxwq xmm2/m64, ymm1","VEX.256.66.0F38.WIG 24 /r","V","V","AVX2","","w,r","",""
+"VPMOVSXWQ ymm1, {k}{z}, xmm2/m64","VPMOVSXWQ xmm2/m64, {k}{z}, ymm1","vpmovsxwq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 24 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVUSDB xmm2/m32, {k}{z}, xmm1","VPMOVUSDB xmm1, {k}{z}, xmm2/m32","vpmovusdb xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 11 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVUSDB xmm2/m64, {k}{z}, ymm1","VPMOVUSDB ymm1, {k}{z}, xmm2/m64","vpmovusdb ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 11 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVUSDB xmm2/m128, {k}{z}, zmm1","VPMOVUSDB zmm1, {k}{z}, xmm2/m128","vpmovusdb zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 11 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVUSDW xmm2/m64, {k}{z}, xmm1","VPMOVUSDW xmm1, {k}{z}, xmm2/m64","vpmovusdw xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 13 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVUSDW xmm2/m128, {k}{z}, ymm1","VPMOVUSDW ymm1, {k}{z}, xmm2/m128","vpmovusdw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 13 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVUSDW ymm2/m256, {k}{z}, zmm1","VPMOVUSDW zmm1, {k}{z}, ymm2/m256","vpmovusdw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 13 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVUSQB xmm2/m16, {k}{z}, xmm1","VPMOVUSQB xmm1, {k}{z}, xmm2/m16","vpmovusqb xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 12 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
+"VPMOVUSQB xmm2/m32, {k}{z}, ymm1","VPMOVUSQB ymm1, {k}{z}, xmm2/m32","vpmovusqb ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 12 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVUSQB xmm2/m64, {k}{z}, zmm1","VPMOVUSQB zmm1, {k}{z}, xmm2/m64","vpmovusqb zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 12 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPMOVUSQD xmm2/m64, {k}{z}, xmm1","VPMOVUSQD xmm1, {k}{z}, xmm2/m64","vpmovusqd xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVUSQD xmm2/m128, {k}{z}, ymm1","VPMOVUSQD ymm1, {k}{z}, xmm2/m128","vpmovusqd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVUSQD ymm2/m256, {k}{z}, zmm1","VPMOVUSQD zmm1, {k}{z}, ymm2/m256","vpmovusqd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 15 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVUSQW xmm2/m32, {k}{z}, xmm1","VPMOVUSQW xmm1, {k}{z}, xmm2/m32","vpmovusqw xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVUSQW xmm2/m64, {k}{z}, ymm1","VPMOVUSQW ymm1, {k}{z}, xmm2/m64","vpmovusqw ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVUSQW xmm2/m128, {k}{z}, zmm1","VPMOVUSQW zmm1, {k}{z}, xmm2/m128","vpmovusqw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 14 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVUSWB xmm2/m64, {k}{z}, xmm1","VPMOVUSWB xmm1, {k}{z}, xmm2/m64","vpmovuswb xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 10 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
+"VPMOVUSWB xmm2/m128, {k}{z}, ymm1","VPMOVUSWB ymm1, {k}{z}, xmm2/m128","vpmovuswb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 10 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPMOVUSWB ymm2/m256, {k}{z}, zmm1","VPMOVUSWB zmm1, {k}{z}, ymm2/m256","vpmovuswb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 10 /r","V","V","AVX512BW","scale32","w,r,r","",""
+"VPMOVW2M k1, xmm2","VPMOVW2M xmm2, k1","vpmovw2m xmm2, k1","EVEX.128.F3.0F38.W1 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVW2M k1, ymm2","VPMOVW2M ymm2, k1","vpmovw2m ymm2, k1","EVEX.256.F3.0F38.W1 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","",""
+"VPMOVW2M k1, zmm2","VPMOVW2M zmm2, k1","vpmovw2m zmm2, k1","EVEX.512.F3.0F38.W1 29 /r","V","V","AVX512BW","modrm_regonly","w,r","",""
+"VPMOVWB xmm2/m64, {k}{z}, xmm1","VPMOVWB xmm1, {k}{z}, xmm2/m64","vpmovwb xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 30 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
+"VPMOVWB xmm2/m128, {k}{z}, ymm1","VPMOVWB ymm1, {k}{z}, xmm2/m128","vpmovwb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 30 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPMOVWB ymm2/m256, {k}{z}, zmm1","VPMOVWB zmm1, {k}{z}, ymm2/m256","vpmovwb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 30 /r","V","V","AVX512BW","scale32","w,r,r","",""
+"VPMOVZXBD zmm1, {k}{z}, xmm2/m128","VPMOVZXBD xmm2/m128, {k}{z}, zmm1","vpmovzxbd xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 31 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVZXBD xmm1, xmm2/m32","VPMOVZXBD xmm2/m32, xmm1","vpmovzxbd xmm2/m32, xmm1","VEX.128.66.0F38.WIG 31 /r","V","V","AVX","","w,r","",""
+"VPMOVZXBD xmm1, {k}{z}, xmm2/m32","VPMOVZXBD xmm2/m32, {k}{z}, xmm1","vpmovzxbd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 31 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVZXBD ymm1, xmm2/m64","VPMOVZXBD xmm2/m64, ymm1","vpmovzxbd xmm2/m64, ymm1","VEX.256.66.0F38.WIG 31 /r","V","V","AVX2","","w,r","",""
+"VPMOVZXBD ymm1, {k}{z}, xmm2/m64","VPMOVZXBD xmm2/m64, {k}{z}, ymm1","vpmovzxbd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 31 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVZXBQ xmm1, xmm2/m16","VPMOVZXBQ xmm2/m16, xmm1","vpmovzxbq xmm2/m16, xmm1","VEX.128.66.0F38.WIG 32 /r","V","V","AVX","","w,r","",""
+"VPMOVZXBQ xmm1, {k}{z}, xmm2/m16","VPMOVZXBQ xmm2/m16, {k}{z}, xmm1","vpmovzxbq xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 32 /r","V","V","AVX512F+AVX512VL","scale2","w,r,r","",""
+"VPMOVZXBQ ymm1, xmm2/m32","VPMOVZXBQ xmm2/m32, ymm1","vpmovzxbq xmm2/m32, ymm1","VEX.256.66.0F38.WIG 32 /r","V","V","AVX2","","w,r","",""
+"VPMOVZXBQ ymm1, {k}{z}, xmm2/m32","VPMOVZXBQ xmm2/m32, {k}{z}, ymm1","vpmovzxbq xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 32 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVZXBQ zmm1, {k}{z}, xmm2/m64","VPMOVZXBQ xmm2/m64, {k}{z}, zmm1","vpmovzxbq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 32 /r","V","V","AVX512F","scale8","w,r,r","",""
+"VPMOVZXBW ymm1, xmm2/m128","VPMOVZXBW xmm2/m128, ymm1","vpmovzxbw xmm2/m128, ymm1","VEX.256.66.0F38.WIG 30 /r","V","V","AVX2","","w,r","",""
+"VPMOVZXBW ymm1, {k}{z}, xmm2/m128","VPMOVZXBW xmm2/m128, {k}{z}, ymm1","vpmovzxbw xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 30 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPMOVZXBW xmm1, xmm2/m64","VPMOVZXBW xmm2/m64, xmm1","vpmovzxbw xmm2/m64, xmm1","VEX.128.66.0F38.WIG 30 /r","V","V","AVX","","w,r","",""
+"VPMOVZXBW xmm1, {k}{z}, xmm2/m64","VPMOVZXBW xmm2/m64, {k}{z}, xmm1","vpmovzxbw xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 30 /r","V","V","AVX512BW+AVX512VL","scale8","w,r,r","",""
+"VPMOVZXBW zmm1, {k}{z}, ymm2/m256","VPMOVZXBW ymm2/m256, {k}{z}, zmm1","vpmovzxbw ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 30 /r","V","V","AVX512BW","scale32","w,r,r","",""
+"VPMOVZXDQ ymm1, xmm2/m128","VPMOVZXDQ xmm2/m128, ymm1","vpmovzxdq xmm2/m128, ymm1","VEX.256.66.0F38.WIG 35 /r","V","V","AVX2","","w,r","",""
+"VPMOVZXDQ ymm1, {k}{z}, xmm2/m128","VPMOVZXDQ xmm2/m128, {k}{z}, ymm1","vpmovzxdq xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 35 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVZXDQ xmm1, xmm2/m64","VPMOVZXDQ xmm2/m64, xmm1","vpmovzxdq xmm2/m64, xmm1","VEX.128.66.0F38.WIG 35 /r","V","V","AVX","","w,r","",""
+"VPMOVZXDQ xmm1, {k}{z}, xmm2/m64","VPMOVZXDQ xmm2/m64, {k}{z}, xmm1","vpmovzxdq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 35 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVZXDQ zmm1, {k}{z}, ymm2/m256","VPMOVZXDQ ymm2/m256, {k}{z}, zmm1","vpmovzxdq ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 35 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVZXWD ymm1, xmm2/m128","VPMOVZXWD xmm2/m128, ymm1","vpmovzxwd xmm2/m128, ymm1","VEX.256.66.0F38.WIG 33 /r","V","V","AVX2","","w,r","",""
+"VPMOVZXWD ymm1, {k}{z}, xmm2/m128","VPMOVZXWD xmm2/m128, {k}{z}, ymm1","vpmovzxwd xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 33 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r","",""
+"VPMOVZXWD xmm1, xmm2/m64","VPMOVZXWD xmm2/m64, xmm1","vpmovzxwd xmm2/m64, xmm1","VEX.128.66.0F38.WIG 33 /r","V","V","AVX","","w,r","",""
+"VPMOVZXWD xmm1, {k}{z}, xmm2/m64","VPMOVZXWD xmm2/m64, {k}{z}, xmm1","vpmovzxwd xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 33 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMOVZXWD zmm1, {k}{z}, ymm2/m256","VPMOVZXWD ymm2/m256, {k}{z}, zmm1","vpmovzxwd ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 33 /r","V","V","AVX512F","scale32","w,r,r","",""
+"VPMOVZXWQ zmm1, {k}{z}, xmm2/m128","VPMOVZXWQ xmm2/m128, {k}{z}, zmm1","vpmovzxwq xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 34 /r","V","V","AVX512F","scale16","w,r,r","",""
+"VPMOVZXWQ xmm1, xmm2/m32","VPMOVZXWQ xmm2/m32, xmm1","vpmovzxwq xmm2/m32, xmm1","VEX.128.66.0F38.WIG 34 /r","V","V","AVX","","w,r","",""
+"VPMOVZXWQ xmm1, {k}{z}, xmm2/m32","VPMOVZXWQ xmm2/m32, {k}{z}, xmm1","vpmovzxwq xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 34 /r","V","V","AVX512F+AVX512VL","scale4","w,r,r","",""
+"VPMOVZXWQ ymm1, xmm2/m64","VPMOVZXWQ xmm2/m64, ymm1","vpmovzxwq xmm2/m64, ymm1","VEX.256.66.0F38.WIG 34 /r","V","V","AVX2","","w,r","",""
+"VPMOVZXWQ ymm1, {k}{z}, xmm2/m64","VPMOVZXWQ xmm2/m64, {k}{z}, ymm1","vpmovzxwq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 34 /r","V","V","AVX512F+AVX512VL","scale8","w,r,r","",""
+"VPMULDQ xmm1, xmmV, xmm2/m128","VPMULDQ xmm2/m128, xmmV, xmm1","vpmuldq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 28 /r","V","V","AVX","","w,r,r","",""
+"VPMULDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmuldq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 28 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMULDQ ymm1, ymmV, ymm2/m256","VPMULDQ ymm2/m256, ymmV, ymm1","vpmuldq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 28 /r","V","V","AVX2","","w,r,r","",""
+"VPMULDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmuldq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 28 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMULDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmuldq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 28 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPMULHRSW xmm1, xmmV, xmm2/m128","VPMULHRSW xmm2/m128, xmmV, xmm1","vpmulhrsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 0B /r","V","V","AVX","","w,r,r","",""
+"VPMULHRSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHRSW xmm2/m128, xmmV, {k}{z}, xmm1","vpmulhrsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 0B /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMULHRSW ymm1, ymmV, ymm2/m256","VPMULHRSW ymm2/m256, ymmV, ymm1","vpmulhrsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 0B /r","V","V","AVX2","","w,r,r","",""
+"VPMULHRSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHRSW ymm2/m256, ymmV, {k}{z}, ymm1","vpmulhrsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 0B /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMULHRSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHRSW zmm2/m512, zmmV, {k}{z}, zmm1","vpmulhrsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 0B /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMULHUW xmm1, xmmV, xmm2/m128","VPMULHUW xmm2/m128, xmmV, xmm1","vpmulhuw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E4 /r","V","V","AVX","","w,r,r","",""
+"VPMULHUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHUW xmm2/m128, xmmV, {k}{z}, xmm1","vpmulhuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E4 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMULHUW ymm1, ymmV, ymm2/m256","VPMULHUW ymm2/m256, ymmV, ymm1","vpmulhuw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E4 /r","V","V","AVX2","","w,r,r","",""
+"VPMULHUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHUW ymm2/m256, ymmV, {k}{z}, ymm1","vpmulhuw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E4 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMULHUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHUW zmm2/m512, zmmV, {k}{z}, zmm1","vpmulhuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E4 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMULHW xmm1, xmmV, xmm2/m128","VPMULHW xmm2/m128, xmmV, xmm1","vpmulhw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E5 /r","V","V","AVX","","w,r,r","",""
+"VPMULHW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHW xmm2/m128, xmmV, {k}{z}, xmm1","vpmulhw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E5 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMULHW ymm1, ymmV, ymm2/m256","VPMULHW ymm2/m256, ymmV, ymm1","vpmulhw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E5 /r","V","V","AVX2","","w,r,r","",""
+"VPMULHW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHW ymm2/m256, ymmV, {k}{z}, ymm1","vpmulhw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E5 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMULHW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHW zmm2/m512, zmmV, {k}{z}, zmm1","vpmulhw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E5 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMULLD xmm1, xmmV, xmm2/m128","VPMULLD xmm2/m128, xmmV, xmm1","vpmulld xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 40 /r","V","V","AVX","","w,r,r","",""
+"VPMULLD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMULLD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpmulld xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 40 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPMULLD ymm1, ymmV, ymm2/m256","VPMULLD ymm2/m256, ymmV, ymm1","vpmulld ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 40 /r","V","V","AVX2","","w,r,r","",""
+"VPMULLD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMULLD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpmulld ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 40 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPMULLD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMULLD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpmulld zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 40 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPMULLQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULLQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmullq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 40 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMULLQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULLQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmullq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 40 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMULLQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULLQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmullq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 40 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
+"VPMULLW xmm1, xmmV, xmm2/m128","VPMULLW xmm2/m128, xmmV, xmm1","vpmullw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D5 /r","V","V","AVX","","w,r,r","",""
+"VPMULLW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULLW xmm2/m128, xmmV, {k}{z}, xmm1","vpmullw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D5 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPMULLW ymm1, ymmV, ymm2/m256","VPMULLW ymm2/m256, ymmV, ymm1","vpmullw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D5 /r","V","V","AVX2","","w,r,r","",""
+"VPMULLW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULLW ymm2/m256, ymmV, {k}{z}, ymm1","vpmullw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D5 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPMULLW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULLW zmm2/m512, zmmV, {k}{z}, zmm1","vpmullw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D5 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPMULTISHIFTQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULTISHIFTQB xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmultishiftqb xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 83 /r","V","V","AVX512_VBMI+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMULTISHIFTQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULTISHIFTQB ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmultishiftqb ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 83 /r","V","V","AVX512_VBMI+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMULTISHIFTQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULTISHIFTQB zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmultishiftqb zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 83 /r","V","V","AVX512_VBMI","bscale8,scale64","w,r,r,r","",""
+"VPMULUDQ xmm1, xmmV, xmm2/m128","VPMULUDQ xmm2/m128, xmmV, xmm1","vpmuludq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F4 /r","V","V","AVX","","w,r,r","",""
+"VPMULUDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULUDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmuludq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 F4 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPMULUDQ ymm1, ymmV, ymm2/m256","VPMULUDQ ymm2/m256, ymmV, ymm1","vpmuludq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F4 /r","V","V","AVX2","","w,r,r","",""
+"VPMULUDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULUDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmuludq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 F4 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPMULUDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULUDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmuludq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 F4 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPOPCNTB xmm1, {k}{z}, xmm2/m128","VPOPCNTB xmm2/m128, {k}{z}, xmm1","vpopcntb xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 54 /r","V","V","AVX512_BITALG+AVX512VL","scale16","w,r,r","",""
+"VPOPCNTB ymm1, {k}{z}, ymm2/m256","VPOPCNTB ymm2/m256, {k}{z}, ymm1","vpopcntb ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 54 /r","V","V","AVX512_BITALG+AVX512VL","scale32","w,r,r","",""
+"VPOPCNTB zmm1, {k}{z}, zmm2/m512","VPOPCNTB zmm2/m512, {k}{z}, zmm1","vpopcntb zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 54 /r","V","V","AVX512_BITALG","scale64","w,r,r","",""
+"VPOPCNTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPOPCNTD xmm2/m128/m32bcst, {k}{z}, xmm1","vpopcntd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale4,scale16","w,r,r","",""
+"VPOPCNTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPOPCNTD ymm2/m256/m32bcst, {k}{z}, ymm1","vpopcntd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale4,scale32","w,r,r","",""
+"VPOPCNTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPOPCNTD zmm2/m512/m32bcst, {k}{z}, zmm1","vpopcntd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 55 /r","V","V","AVX512_VPOPCNTDQ","bscale4,scale64","w,r,r","",""
+"VPOPCNTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPOPCNTQ xmm2/m128/m64bcst, {k}{z}, xmm1","vpopcntq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale8,scale16","w,r,r","",""
+"VPOPCNTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPOPCNTQ ymm2/m256/m64bcst, {k}{z}, ymm1","vpopcntq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale8,scale32","w,r,r","",""
+"VPOPCNTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPOPCNTQ zmm2/m512/m64bcst, {k}{z}, zmm1","vpopcntq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 55 /r","V","V","AVX512_VPOPCNTDQ","bscale8,scale64","w,r,r","",""
+"VPOPCNTW xmm1, {k}{z}, xmm2/m128","VPOPCNTW xmm2/m128, {k}{z}, xmm1","vpopcntw xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 54 /r","V","V","AVX512_BITALG+AVX512VL","scale16","w,r,r","",""
+"VPOPCNTW ymm1, {k}{z}, ymm2/m256","VPOPCNTW ymm2/m256, {k}{z}, ymm1","vpopcntw ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 54 /r","V","V","AVX512_BITALG+AVX512VL","scale32","w,r,r","",""
+"VPOPCNTW zmm1, {k}{z}, zmm2/m512","VPOPCNTW zmm2/m512, {k}{z}, zmm1","vpopcntw zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 54 /r","V","V","AVX512_BITALG","scale64","w,r,r","",""
+"VPOR xmm1, xmmV, xmm2/m128","VPOR xmm2/m128, xmmV, xmm1","vpor xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EB /r","V","V","AVX","","w,r,r","",""
+"VPOR ymm1, ymmV, ymm2/m256","VPOR ymm2/m256, ymmV, ymm1","vpor ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EB /r","V","V","AVX2","","w,r,r","",""
+"VPORD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPORD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpord xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 EB /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPORD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPORD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpord ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 EB /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPORD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPORD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpord zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 EB /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPORQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPORQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vporq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 EB /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPORQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPORQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vporq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 EB /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPORQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPORQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vporq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 EB /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPPERM xmm1, xmmV, xmmIH, xmm2/m128","VPPERM xmm2/m128, xmmIH, xmmV, xmm1","vpperm xmm2/m128, xmmIH, xmmV, xmm1","XOP.NDS.128.08.W1 A3 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPPERM xmm1, xmmV, xmm2/m128, xmmIH","VPPERM xmmIH, xmm2/m128, xmmV, xmm1","vpperm xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A3 /r /is4","V","V","XOP","amd","w,r,r,r","",""
+"VPROLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPROLD imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","vprold imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /1 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPROLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPROLD imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","vprold imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /1 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPROLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPROLD imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","vprold imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /1 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPROLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPROLQ imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","vprolq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 72 /1 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPROLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPROLQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","vprolq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 72 /1 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPROLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPROLQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","vprolq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 72 /1 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPROLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPROLVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vprolvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPROLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPROLVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vprolvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPROLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPROLVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vprolvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 15 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPROLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPROLVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vprolvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPROLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPROLVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vprolvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPROLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPROLVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vprolvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 15 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPRORD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPRORD imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","vprord imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /0 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPRORD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPRORD imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","vprord imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /0 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPRORD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPRORD imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","vprord imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /0 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPRORQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPRORQ imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","vprorq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 72 /0 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPRORQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPRORQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","vprorq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 72 /0 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPRORQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPRORQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","vprorq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 72 /0 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPRORVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPRORVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vprorvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPRORVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPRORVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vprorvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPRORVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPRORVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vprorvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 14 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPRORVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPRORVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vprorvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPRORVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPRORVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vprorvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPRORVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPRORVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vprorvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 14 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPROTB xmm1, xmm2/m128, imm8u","VPROTB imm8u, xmm2/m128, xmm1","vprotb imm8u, xmm2/m128, xmm1","XOP.128.08.W0 C0 /r ib","V","V","XOP","amd","w,r,r","",""
+"VPROTB xmm1, xmmV, xmm2/m128","VPROTB xmm2/m128, xmmV, xmm1","vprotb xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 90 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTB xmm1, xmm2/m128, xmmV","VPROTB xmmV, xmm2/m128, xmm1","vprotb xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 90 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTD xmm1, xmm2/m128, imm8u","VPROTD imm8u, xmm2/m128, xmm1","vprotd imm8u, xmm2/m128, xmm1","XOP.128.08.W0 C2 /r ib","V","V","XOP","amd","w,r,r","",""
+"VPROTD xmm1, xmmV, xmm2/m128","VPROTD xmm2/m128, xmmV, xmm1","vprotd xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 92 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTD xmm1, xmm2/m128, xmmV","VPROTD xmmV, xmm2/m128, xmm1","vprotd xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 92 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTQ xmm1, xmm2/m128, imm8u","VPROTQ imm8u, xmm2/m128, xmm1","vprotq imm8u, xmm2/m128, xmm1","XOP.128.08.W0 C3 /r ib","V","V","XOP","amd","w,r,r","",""
+"VPROTQ xmm1, xmmV, xmm2/m128","VPROTQ xmm2/m128, xmmV, xmm1","vprotq xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 93 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTQ xmm1, xmm2/m128, xmmV","VPROTQ xmmV, xmm2/m128, xmm1","vprotq xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 93 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTW xmm1, xmm2/m128, imm8u","VPROTW imm8u, xmm2/m128, xmm1","vprotw imm8u, xmm2/m128, xmm1","XOP.128.08.W0 C1 /r ib","V","V","XOP","amd","w,r,r","",""
+"VPROTW xmm1, xmmV, xmm2/m128","VPROTW xmm2/m128, xmmV, xmm1","vprotw xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 91 /r","V","V","XOP","amd","w,r,r","",""
+"VPROTW xmm1, xmm2/m128, xmmV","VPROTW xmmV, xmm2/m128, xmm1","vprotw xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 91 /r","V","V","XOP","amd","w,r,r","",""
+"VPSADBW xmm1, xmmV, xmm2/m128","VPSADBW xmm2/m128, xmmV, xmm1","vpsadbw xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F.WIG F6 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPSADBW xmm1, xmmV, xmm2/m128","VPSADBW xmm2/m128, xmmV, xmm1","vpsadbw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F6 /r","V","V","AVX","","w,r,r","",""
+"VPSADBW ymm1, ymmV, ymm2/m256","VPSADBW ymm2/m256, ymmV, ymm1","vpsadbw ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F.WIG F6 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VPSADBW ymm1, ymmV, ymm2/m256","VPSADBW ymm2/m256, ymmV, ymm1","vpsadbw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F6 /r","V","V","AVX2","","w,r,r","",""
+"VPSADBW zmm1, zmmV, zmm2/m512","VPSADBW zmm2/m512, zmmV, zmm1","vpsadbw zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F.WIG F6 /r","V","V","AVX512BW","scale64","w,r,r","",""
+"VPSCATTERDD vm32x, {k1-k7}, xmm1","VPSCATTERDD xmm1, {k1-k7}, vm32x","vpscatterdd xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W0 A0 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPSCATTERDD vm32y, {k1-k7}, ymm1","VPSCATTERDD ymm1, {k1-k7}, vm32y","vpscatterdd ymm1, {k1-k7}, vm32y","EVEX.256.66.0F38.W0 A0 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPSCATTERDD vm32z, {k1-k7}, zmm1","VPSCATTERDD zmm1, {k1-k7}, vm32z","vpscatterdd zmm1, {k1-k7}, vm32z","EVEX.512.66.0F38.W0 A0 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VPSCATTERDQ vm32x, {k1-k7}, xmm1","VPSCATTERDQ xmm1, {k1-k7}, vm32x","vpscatterdq xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W1 A0 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPSCATTERDQ vm32x, {k1-k7}, ymm1","VPSCATTERDQ ymm1, {k1-k7}, vm32x","vpscatterdq ymm1, {k1-k7}, vm32x","EVEX.256.66.0F38.W1 A0 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPSCATTERDQ vm32y, {k1-k7}, zmm1","VPSCATTERDQ zmm1, {k1-k7}, vm32y","vpscatterdq zmm1, {k1-k7}, vm32y","EVEX.512.66.0F38.W1 A0 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VPSCATTERQD vm64x, {k1-k7}, xmm1","VPSCATTERQD xmm1, {k1-k7}, vm64x","vpscatterqd xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W0 A1 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPSCATTERQD vm64y, {k1-k7}, xmm1","VPSCATTERQD xmm1, {k1-k7}, vm64y","vpscatterqd xmm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W0 A1 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VPSCATTERQD vm64z, {k1-k7}, ymm1","VPSCATTERQD ymm1, {k1-k7}, vm64z","vpscatterqd ymm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W0 A1 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VPSCATTERQQ vm64x, {k1-k7}, xmm1","VPSCATTERQQ xmm1, {k1-k7}, vm64x","vpscatterqq xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W1 A1 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPSCATTERQQ vm64y, {k1-k7}, ymm1","VPSCATTERQQ ymm1, {k1-k7}, vm64y","vpscatterqq ymm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W1 A1 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VPSCATTERQQ vm64z, {k1-k7}, zmm1","VPSCATTERQQ zmm1, {k1-k7}, vm64z","vpscatterqq zmm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W1 A1 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VPSHAB xmm1, xmmV, xmm2/m128","VPSHAB xmm2/m128, xmmV, xmm1","vpshab xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 98 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAB xmm1, xmm2/m128, xmmV","VPSHAB xmmV, xmm2/m128, xmm1","vpshab xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 98 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAD xmm1, xmmV, xmm2/m128","VPSHAD xmm2/m128, xmmV, xmm1","vpshad xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 9A /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAD xmm1, xmm2/m128, xmmV","VPSHAD xmmV, xmm2/m128, xmm1","vpshad xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 9A /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAQ xmm1, xmmV, xmm2/m128","VPSHAQ xmm2/m128, xmmV, xmm1","vpshaq xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 9B /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAQ xmm1, xmm2/m128, xmmV","VPSHAQ xmmV, xmm2/m128, xmm1","vpshaq xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 9B /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAW xmm1, xmmV, xmm2/m128","VPSHAW xmm2/m128, xmmV, xmm1","vpshaw xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 99 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHAW xmm1, xmm2/m128, xmmV","VPSHAW xmmV, xmm2/m128, xmm1","vpshaw xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 99 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLB xmm1, xmmV, xmm2/m128","VPSHLB xmm2/m128, xmmV, xmm1","vpshlb xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 94 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLB xmm1, xmm2/m128, xmmV","VPSHLB xmmV, xmm2/m128, xmm1","vpshlb xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 94 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLD xmm1, xmmV, xmm2/m128","VPSHLD xmm2/m128, xmmV, xmm1","vpshld xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 96 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLD xmm1, xmm2/m128, xmmV","VPSHLD xmmV, xmm2/m128, xmm1","vpshld xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 96 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPSHLDD imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshldd imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 71 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VPSHLDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPSHLDD imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshldd imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 71 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VPSHLDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPSHLDD imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshldd imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 71 /r ib","V","V","AVX512_VBMI2","bscale4,scale64","w,r,r,r,r","",""
+"VPSHLDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPSHLDQ imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshldq imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VPSHLDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPSHLDQ imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshldq imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VPSHLDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPSHLDQ imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshldq imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2","bscale8,scale64","w,r,r,r,r","",""
+"VPSHLDVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSHLDVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshldvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPSHLDVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSHLDVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshldvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPSHLDVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSHLDVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshldvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 71 /r","V","V","AVX512_VBMI2","bscale4,scale64","rw,r,r,r","",""
+"VPSHLDVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSHLDVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshldvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPSHLDVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSHLDVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshldvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPSHLDVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSHLDVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshldvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 71 /r","V","V","AVX512_VBMI2","bscale8,scale64","rw,r,r,r","",""
+"VPSHLDVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSHLDVW xmm2/m128, xmmV, {k}{z}, xmm1","vpshldvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 70 /r","V","V","AVX512_VBMI2+AVX512VL","scale16","rw,r,r,r","",""
+"VPSHLDVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSHLDVW ymm2/m256, ymmV, {k}{z}, ymm1","vpshldvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 70 /r","V","V","AVX512_VBMI2+AVX512VL","scale32","rw,r,r,r","",""
+"VPSHLDVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSHLDVW zmm2/m512, zmmV, {k}{z}, zmm1","vpshldvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 70 /r","V","V","AVX512_VBMI2","scale64","rw,r,r,r","",""
+"VPSHLDW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPSHLDW imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","vpshldw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale16","w,r,r,r,r","",""
+"VPSHLDW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPSHLDW imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","vpshldw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale32","w,r,r,r,r","",""
+"VPSHLDW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPSHLDW imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","vpshldw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2","scale64","w,r,r,r,r","",""
+"VPSHLQ xmm1, xmmV, xmm2/m128","VPSHLQ xmm2/m128, xmmV, xmm1","vpshlq xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 97 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLQ xmm1, xmm2/m128, xmmV","VPSHLQ xmmV, xmm2/m128, xmm1","vpshlq xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 97 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLW xmm1, xmmV, xmm2/m128","VPSHLW xmm2/m128, xmmV, xmm1","vpshlw xmm2/m128, xmmV, xmm1","XOP.NDS.128.09.W1 95 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHLW xmm1, xmm2/m128, xmmV","VPSHLW xmmV, xmm2/m128, xmm1","vpshlw xmmV, xmm2/m128, xmm1","XOP.NDS.128.09.W0 95 /r","V","V","XOP","amd","w,r,r","",""
+"VPSHRDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPSHRDD imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshrdd imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VPSHRDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPSHRDD imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshrdd imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VPSHRDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPSHRDD imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshrdd imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2","bscale4,scale64","w,r,r,r,r","",""
+"VPSHRDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPSHRDQ imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshrdq imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VPSHRDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPSHRDQ imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshrdq imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VPSHRDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPSHRDQ imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshrdq imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2","bscale8,scale64","w,r,r,r,r","",""
+"VPSHRDVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSHRDVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshrdvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale16","rw,r,r,r","",""
+"VPSHRDVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSHRDVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshrdvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scale32","rw,r,r,r","",""
+"VPSHRDVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSHRDVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshrdvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 73 /r","V","V","AVX512_VBMI2","bscale4,scale64","rw,r,r,r","",""
+"VPSHRDVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSHRDVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshrdvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale16","rw,r,r,r","",""
+"VPSHRDVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSHRDVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshrdvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scale32","rw,r,r,r","",""
+"VPSHRDVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSHRDVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshrdvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 73 /r","V","V","AVX512_VBMI2","bscale8,scale64","rw,r,r,r","",""
+"VPSHRDVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSHRDVW xmm2/m128, xmmV, {k}{z}, xmm1","vpshrdvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 72 /r","V","V","AVX512_VBMI2+AVX512VL","scale16","rw,r,r,r","",""
+"VPSHRDVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSHRDVW ymm2/m256, ymmV, {k}{z}, ymm1","vpshrdvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 72 /r","V","V","AVX512_VBMI2+AVX512VL","scale32","rw,r,r,r","",""
+"VPSHRDVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSHRDVW zmm2/m512, zmmV, {k}{z}, zmm1","vpshrdvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 72 /r","V","V","AVX512_VBMI2","scale64","rw,r,r,r","",""
+"VPSHRDW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPSHRDW imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","vpshrdw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale16","w,r,r,r,r","",""
+"VPSHRDW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPSHRDW imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","vpshrdw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale32","w,r,r,r,r","",""
+"VPSHRDW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPSHRDW imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","vpshrdw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2","scale64","w,r,r,r,r","",""
+"VPSHUFB xmm1, xmmV, xmm2/m128","VPSHUFB xmm2/m128, xmmV, xmm1","vpshufb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 00 /r","V","V","AVX","","w,r,r","",""
+"VPSHUFB xmm1, {k}{z}, xmmV, xmm2/m128","VPSHUFB xmm2/m128, xmmV, {k}{z}, xmm1","vpshufb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 00 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSHUFB ymm1, ymmV, ymm2/m256","VPSHUFB ymm2/m256, ymmV, ymm1","vpshufb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 00 /r","V","V","AVX2","","w,r,r","",""
+"VPSHUFB ymm1, {k}{z}, ymmV, ymm2/m256","VPSHUFB ymm2/m256, ymmV, {k}{z}, ymm1","vpshufb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 00 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSHUFB zmm1, {k}{z}, zmmV, zmm2/m512","VPSHUFB zmm2/m512, zmmV, {k}{z}, zmm1","vpshufb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 00 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSHUFBITQMB k1, {k}, xmmV, xmm2/m128","VPSHUFBITQMB xmm2/m128, xmmV, {k}, k1","vpshufbitqmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W0 8F /r","V","V","AVX512_BITALG+AVX512VL","scale16","w,r,r,r","",""
+"VPSHUFBITQMB k1, {k}, ymmV, ymm2/m256","VPSHUFBITQMB ymm2/m256, ymmV, {k}, k1","vpshufbitqmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W0 8F /r","V","V","AVX512_BITALG+AVX512VL","scale32","w,r,r,r","",""
+"VPSHUFBITQMB k1, {k}, zmmV, zmm2/m512","VPSHUFBITQMB zmm2/m512, zmmV, {k}, k1","vpshufbitqmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W0 8F /r","V","V","AVX512_BITALG","scale64","w,r,r,r","",""
+"VPSHUFD xmm1, xmm2/m128, imm8u","VPSHUFD imm8u, xmm2/m128, xmm1","vpshufd imm8u, xmm2/m128, xmm1","VEX.128.66.0F.WIG 70 /r ib","V","V","AVX","","w,r,r","",""
+"VPSHUFD xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSHUFD imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","vpshufd imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W0 70 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSHUFD ymm1, ymm2/m256, imm8u","VPSHUFD imm8u, ymm2/m256, ymm1","vpshufd imm8u, ymm2/m256, ymm1","VEX.256.66.0F.WIG 70 /r ib","V","V","AVX2","","w,r,r","",""
+"VPSHUFD ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSHUFD imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","vpshufd imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W0 70 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSHUFD zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSHUFD imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","vpshufd imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W0 70 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSHUFHW xmm1, xmm2/m128, imm8u","VPSHUFHW imm8u, xmm2/m128, xmm1","vpshufhw imm8u, xmm2/m128, xmm1","VEX.128.F3.0F.WIG 70 /r ib","V","V","AVX","","w,r,r","",""
+"VPSHUFHW xmm1, {k}{z}, xmm2/m128, imm8u","VPSHUFHW imm8u, xmm2/m128, {k}{z}, xmm1","vpshufhw imm8u, xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.WIG 70 /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSHUFHW ymm1, ymm2/m256, imm8u","VPSHUFHW imm8u, ymm2/m256, ymm1","vpshufhw imm8u, ymm2/m256, ymm1","VEX.256.F3.0F.WIG 70 /r ib","V","V","AVX2","","w,r,r","",""
+"VPSHUFHW ymm1, {k}{z}, ymm2/m256, imm8u","VPSHUFHW imm8u, ymm2/m256, {k}{z}, ymm1","vpshufhw imm8u, ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.WIG 70 /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSHUFHW zmm1, {k}{z}, zmm2/m512, imm8u","VPSHUFHW imm8u, zmm2/m512, {k}{z}, zmm1","vpshufhw imm8u, zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.WIG 70 /r ib","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSHUFLW xmm1, xmm2/m128, imm8u","VPSHUFLW imm8u, xmm2/m128, xmm1","vpshuflw imm8u, xmm2/m128, xmm1","VEX.128.F2.0F.WIG 70 /r ib","V","V","AVX","","w,r,r","",""
+"VPSHUFLW xmm1, {k}{z}, xmm2/m128, imm8u","VPSHUFLW imm8u, xmm2/m128, {k}{z}, xmm1","vpshuflw imm8u, xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.WIG 70 /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSHUFLW ymm1, ymm2/m256, imm8u","VPSHUFLW imm8u, ymm2/m256, ymm1","vpshuflw imm8u, ymm2/m256, ymm1","VEX.256.F2.0F.WIG 70 /r ib","V","V","AVX2","","w,r,r","",""
+"VPSHUFLW ymm1, {k}{z}, ymm2/m256, imm8u","VPSHUFLW imm8u, ymm2/m256, {k}{z}, ymm1","vpshuflw imm8u, ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.WIG 70 /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSHUFLW zmm1, {k}{z}, zmm2/m512, imm8u","VPSHUFLW imm8u, zmm2/m512, {k}{z}, zmm1","vpshuflw imm8u, zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.WIG 70 /r ib","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSIGNB xmm1, xmmV, xmm2/m128","VPSIGNB xmm2/m128, xmmV, xmm1","vpsignb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 08 /r","V","V","AVX","","w,r,r","",""
+"VPSIGNB ymm1, ymmV, ymm2/m256","VPSIGNB ymm2/m256, ymmV, ymm1","vpsignb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 08 /r","V","V","AVX2","","w,r,r","",""
+"VPSIGND xmm1, xmmV, xmm2/m128","VPSIGND xmm2/m128, xmmV, xmm1","vpsignd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 0A /r","V","V","AVX","","w,r,r","",""
+"VPSIGND ymm1, ymmV, ymm2/m256","VPSIGND ymm2/m256, ymmV, ymm1","vpsignd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 0A /r","V","V","AVX2","","w,r,r","",""
+"VPSIGNW xmm1, xmmV, xmm2/m128","VPSIGNW xmm2/m128, xmmV, xmm1","vpsignw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 09 /r","V","V","AVX","","w,r,r","",""
+"VPSIGNW ymm1, ymmV, ymm2/m256","VPSIGNW ymm2/m256, ymmV, ymm1","vpsignw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 09 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLD xmmV, xmm2, imm8u","VPSLLD imm8u, xmm2, xmmV","vpslld imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 72 /6 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSLLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSLLD imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","vpslld imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /6 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSLLD ymmV, ymm2, imm8u","VPSLLD imm8u, ymm2, ymmV","vpslld imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 72 /6 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSLLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSLLD imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","vpslld imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /6 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSLLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSLLD imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","vpslld imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /6 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSLLD xmm1, xmmV, xmm2/m128","VPSLLD xmm2/m128, xmmV, xmm1","vpslld xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F2 /r","V","V","AVX","","w,r,r","",""
+"VPSLLD xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLD xmm2/m128, xmmV, {k}{z}, xmm1","vpslld xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 F2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLD ymm1, ymmV, xmm2/m128","VPSLLD xmm2/m128, ymmV, ymm1","vpslld xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F2 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLD ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLD xmm2/m128, ymmV, {k}{z}, ymm1","vpslld xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 F2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLD zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLD xmm2/m128, zmmV, {k}{z}, zmm1","vpslld xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 F2 /r","V","V","AVX512F","scale16","w,r,r,r","",""
+"VPSLLDQ xmmV, xmm2, imm8u","VPSLLDQ imm8u, xmm2, xmmV","vpslldq imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 73 /7 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSLLDQ xmmV, xmm2/m128, imm8u","VPSLLDQ imm8u, xmm2/m128, xmmV","vpslldq imm8u, xmm2/m128, xmmV","EVEX.NDD.128.66.0F.WIG 73 /7 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPSLLDQ ymmV, ymm2, imm8u","VPSLLDQ imm8u, ymm2, ymmV","vpslldq imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 73 /7 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSLLDQ ymmV, ymm2/m256, imm8u","VPSLLDQ imm8u, ymm2/m256, ymmV","vpslldq imm8u, ymm2/m256, ymmV","EVEX.NDD.256.66.0F.WIG 73 /7 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VPSLLDQ zmmV, zmm2/m512, imm8u","VPSLLDQ imm8u, zmm2/m512, zmmV","vpslldq imm8u, zmm2/m512, zmmV","EVEX.NDD.512.66.0F.WIG 73 /7 ib","V","V","AVX512BW","scale64","w,r,r","",""
+"VPSLLQ xmmV, xmm2, imm8u","VPSLLQ imm8u, xmm2, xmmV","vpsllq imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 73 /6 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSLLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSLLQ imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","vpsllq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 73 /6 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSLLQ ymmV, ymm2, imm8u","VPSLLQ imm8u, ymm2, ymmV","vpsllq imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 73 /6 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSLLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSLLQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","vpsllq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 73 /6 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSLLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSLLQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","vpsllq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 73 /6 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSLLQ xmm1, xmmV, xmm2/m128","VPSLLQ xmm2/m128, xmmV, xmm1","vpsllq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F3 /r","V","V","AVX","","w,r,r","",""
+"VPSLLQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLQ xmm2/m128, xmmV, {k}{z}, xmm1","vpsllq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 F3 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLQ ymm1, ymmV, xmm2/m128","VPSLLQ xmm2/m128, ymmV, ymm1","vpsllq xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F3 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLQ xmm2/m128, ymmV, {k}{z}, ymm1","vpsllq xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 F3 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLQ xmm2/m128, zmmV, {k}{z}, zmm1","vpsllq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 F3 /r","V","V","AVX512F","scale16","w,r,r,r","",""
+"VPSLLVD xmm1, xmmV, xmm2/m128","VPSLLVD xmm2/m128, xmmV, xmm1","vpsllvd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 47 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSLLVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpsllvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 47 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSLLVD ymm1, ymmV, ymm2/m256","VPSLLVD ymm2/m256, ymmV, ymm1","vpsllvd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 47 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSLLVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpsllvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 47 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSLLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSLLVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpsllvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 47 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSLLVQ xmm1, xmmV, xmm2/m128","VPSLLVQ xmm2/m128, xmmV, xmm1","vpsllvq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 47 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSLLVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpsllvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 47 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSLLVQ ymm1, ymmV, ymm2/m256","VPSLLVQ ymm2/m256, ymmV, ymm1","vpsllvq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 47 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSLLVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpsllvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 47 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSLLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSLLVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpsllvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 47 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSLLVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLVW xmm2/m128, xmmV, {k}{z}, xmm1","vpsllvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 12 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSLLVW ymm2/m256, ymmV, {k}{z}, ymm1","vpsllvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 12 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSLLVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSLLVW zmm2/m512, zmmV, {k}{z}, zmm1","vpsllvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 12 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSLLW xmmV, xmm2, imm8u","VPSLLW imm8u, xmm2, xmmV","vpsllw imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 71 /6 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSLLW xmmV, {k}{z}, xmm2/m128, imm8u","VPSLLW imm8u, xmm2/m128, {k}{z}, xmmV","vpsllw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /6 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLW ymmV, ymm2, imm8u","VPSLLW imm8u, ymm2, ymmV","vpsllw imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 71 /6 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSLLW ymmV, {k}{z}, ymm2/m256, imm8u","VPSLLW imm8u, ymm2/m256, {k}{z}, ymmV","vpsllw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /6 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSLLW zmmV, {k}{z}, zmm2/m512, imm8u","VPSLLW imm8u, zmm2/m512, {k}{z}, zmmV","vpsllw imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /6 ib","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSLLW xmm1, xmmV, xmm2/m128","VPSLLW xmm2/m128, xmmV, xmm1","vpsllw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F1 /r","V","V","AVX","","w,r,r","",""
+"VPSLLW xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLW xmm2/m128, xmmV, {k}{z}, xmm1","vpsllw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLW ymm1, ymmV, xmm2/m128","VPSLLW xmm2/m128, ymmV, ymm1","vpsllw xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F1 /r","V","V","AVX2","","w,r,r","",""
+"VPSLLW ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLW xmm2/m128, ymmV, {k}{z}, ymm1","vpsllw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSLLW zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLW xmm2/m128, zmmV, {k}{z}, zmm1","vpsllw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F1 /r","V","V","AVX512BW","scale16","w,r,r,r","",""
+"VPSRAD xmmV, xmm2, imm8u","VPSRAD imm8u, xmm2, xmmV","vpsrad imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 72 /4 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSRAD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSRAD imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","vpsrad imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /4 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSRAD ymmV, ymm2, imm8u","VPSRAD imm8u, ymm2, ymmV","vpsrad imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 72 /4 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSRAD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSRAD imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","vpsrad imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /4 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSRAD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSRAD imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","vpsrad imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /4 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSRAD xmm1, xmmV, xmm2/m128","VPSRAD xmm2/m128, xmmV, xmm1","vpsrad xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E2 /r","V","V","AVX","","w,r,r","",""
+"VPSRAD xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAD xmm2/m128, xmmV, {k}{z}, xmm1","vpsrad xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 E2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAD ymm1, ymmV, xmm2/m128","VPSRAD xmm2/m128, ymmV, ymm1","vpsrad xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E2 /r","V","V","AVX2","","w,r,r","",""
+"VPSRAD ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAD xmm2/m128, ymmV, {k}{z}, ymm1","vpsrad xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 E2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAD zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAD xmm2/m128, zmmV, {k}{z}, zmm1","vpsrad xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 E2 /r","V","V","AVX512F","scale16","w,r,r,r","",""
+"VPSRAQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSRAQ imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","vpsraq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 72 /4 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSRAQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSRAQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","vpsraq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 72 /4 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSRAQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSRAQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","vpsraq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 72 /4 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSRAQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAQ xmm2/m128, xmmV, {k}{z}, xmm1","vpsraq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 E2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAQ xmm2/m128, ymmV, {k}{z}, ymm1","vpsraq xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 E2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAQ xmm2/m128, zmmV, {k}{z}, zmm1","vpsraq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 E2 /r","V","V","AVX512F","scale16","w,r,r,r","",""
+"VPSRAVD xmm1, xmmV, xmm2/m128","VPSRAVD xmm2/m128, xmmV, xmm1","vpsravd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 46 /r","V","V","AVX2","","w,r,r","",""
+"VPSRAVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSRAVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpsravd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 46 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSRAVD ymm1, ymmV, ymm2/m256","VPSRAVD ymm2/m256, ymmV, ymm1","vpsravd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 46 /r","V","V","AVX2","","w,r,r","",""
+"VPSRAVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSRAVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpsravd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 46 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSRAVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSRAVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpsravd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 46 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSRAVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSRAVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpsravq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 46 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSRAVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSRAVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpsravq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 46 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSRAVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSRAVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpsravq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 46 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSRAVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAVW xmm2/m128, xmmV, {k}{z}, xmm1","vpsravw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 11 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSRAVW ymm2/m256, ymmV, {k}{z}, ymm1","vpsravw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 11 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSRAVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSRAVW zmm2/m512, zmmV, {k}{z}, zmm1","vpsravw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 11 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSRAW xmmV, xmm2, imm8u","VPSRAW imm8u, xmm2, xmmV","vpsraw imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 71 /4 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSRAW xmmV, {k}{z}, xmm2/m128, imm8u","VPSRAW imm8u, xmm2/m128, {k}{z}, xmmV","vpsraw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /4 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAW ymmV, ymm2, imm8u","VPSRAW imm8u, ymm2, ymmV","vpsraw imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 71 /4 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSRAW ymmV, {k}{z}, ymm2/m256, imm8u","VPSRAW imm8u, ymm2/m256, {k}{z}, ymmV","vpsraw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /4 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSRAW zmmV, {k}{z}, zmm2/m512, imm8u","VPSRAW imm8u, zmm2/m512, {k}{z}, zmmV","vpsraw imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /4 ib","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSRAW xmm1, xmmV, xmm2/m128","VPSRAW xmm2/m128, xmmV, xmm1","vpsraw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E1 /r","V","V","AVX","","w,r,r","",""
+"VPSRAW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAW xmm2/m128, xmmV, {k}{z}, xmm1","vpsraw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAW ymm1, ymmV, xmm2/m128","VPSRAW xmm2/m128, ymmV, ymm1","vpsraw xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E1 /r","V","V","AVX2","","w,r,r","",""
+"VPSRAW ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAW xmm2/m128, ymmV, {k}{z}, ymm1","vpsraw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRAW zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAW xmm2/m128, zmmV, {k}{z}, zmm1","vpsraw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E1 /r","V","V","AVX512BW","scale16","w,r,r,r","",""
+"VPSRLD xmmV, xmm2, imm8u","VPSRLD imm8u, xmm2, xmmV","vpsrld imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 72 /2 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSRLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSRLD imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","vpsrld imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W0 72 /2 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSRLD ymmV, ymm2, imm8u","VPSRLD imm8u, ymm2, ymmV","vpsrld imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 72 /2 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSRLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSRLD imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","vpsrld imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W0 72 /2 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSRLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSRLD imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","vpsrld imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W0 72 /2 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSRLD xmm1, xmmV, xmm2/m128","VPSRLD xmm2/m128, xmmV, xmm1","vpsrld xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D2 /r","V","V","AVX","","w,r,r","",""
+"VPSRLD xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLD xmm2/m128, xmmV, {k}{z}, xmm1","vpsrld xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 D2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLD ymm1, ymmV, xmm2/m128","VPSRLD xmm2/m128, ymmV, ymm1","vpsrld xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D2 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLD ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLD xmm2/m128, ymmV, {k}{z}, ymm1","vpsrld xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 D2 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLD zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLD xmm2/m128, zmmV, {k}{z}, zmm1","vpsrld xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 D2 /r","V","V","AVX512F","scale16","w,r,r,r","",""
+"VPSRLDQ xmmV, xmm2, imm8u","VPSRLDQ imm8u, xmm2, xmmV","vpsrldq imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 73 /3 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSRLDQ xmmV, xmm2/m128, imm8u","VPSRLDQ imm8u, xmm2/m128, xmmV","vpsrldq imm8u, xmm2/m128, xmmV","EVEX.NDD.128.66.0F.WIG 73 /3 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r","",""
+"VPSRLDQ ymmV, ymm2, imm8u","VPSRLDQ imm8u, ymm2, ymmV","vpsrldq imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 73 /3 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSRLDQ ymmV, ymm2/m256, imm8u","VPSRLDQ imm8u, ymm2/m256, ymmV","vpsrldq imm8u, ymm2/m256, ymmV","EVEX.NDD.256.66.0F.WIG 73 /3 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r","",""
+"VPSRLDQ zmmV, zmm2/m512, imm8u","VPSRLDQ imm8u, zmm2/m512, zmmV","vpsrldq imm8u, zmm2/m512, zmmV","EVEX.NDD.512.66.0F.WIG 73 /3 ib","V","V","AVX512BW","scale64","w,r,r","",""
+"VPSRLQ xmmV, xmm2, imm8u","VPSRLQ imm8u, xmm2, xmmV","vpsrlq imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 73 /2 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSRLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSRLQ imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","vpsrlq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX.NDD.128.66.0F.W1 73 /2 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSRLQ ymmV, ymm2, imm8u","VPSRLQ imm8u, ymm2, ymmV","vpsrlq imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 73 /2 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSRLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSRLQ imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","vpsrlq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX.NDD.256.66.0F.W1 73 /2 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSRLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSRLQ imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","vpsrlq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX.NDD.512.66.0F.W1 73 /2 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSRLQ xmm1, xmmV, xmm2/m128","VPSRLQ xmm2/m128, xmmV, xmm1","vpsrlq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D3 /r","V","V","AVX","","w,r,r","",""
+"VPSRLQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLQ xmm2/m128, xmmV, {k}{z}, xmm1","vpsrlq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 D3 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLQ ymm1, ymmV, xmm2/m128","VPSRLQ xmm2/m128, ymmV, ymm1","vpsrlq xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D3 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLQ xmm2/m128, ymmV, {k}{z}, ymm1","vpsrlq xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 D3 /r","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLQ xmm2/m128, zmmV, {k}{z}, zmm1","vpsrlq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 D3 /r","V","V","AVX512F","scale16","w,r,r,r","",""
+"VPSRLVD xmm1, xmmV, xmm2/m128","VPSRLVD xmm2/m128, xmmV, xmm1","vpsrlvd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 45 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSRLVD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpsrlvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 45 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSRLVD ymm1, ymmV, ymm2/m256","VPSRLVD ymm2/m256, ymmV, ymm1","vpsrlvd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 45 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSRLVD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpsrlvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 45 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSRLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSRLVD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpsrlvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 45 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSRLVQ xmm1, xmmV, xmm2/m128","VPSRLVQ xmm2/m128, xmmV, xmm1","vpsrlvq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 45 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSRLVQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpsrlvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 45 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSRLVQ ymm1, ymmV, ymm2/m256","VPSRLVQ ymm2/m256, ymmV, ymm1","vpsrlvq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 45 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSRLVQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpsrlvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 45 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSRLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSRLVQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpsrlvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 45 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSRLVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLVW xmm2/m128, xmmV, {k}{z}, xmm1","vpsrlvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 10 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSRLVW ymm2/m256, ymmV, {k}{z}, ymm1","vpsrlvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 10 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSRLVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSRLVW zmm2/m512, zmmV, {k}{z}, zmm1","vpsrlvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 10 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSRLW xmmV, xmm2, imm8u","VPSRLW imm8u, xmm2, xmmV","vpsrlw imm8u, xmm2, xmmV","VEX.NDD.128.66.0F.WIG 71 /2 ib","V","V","AVX","modrm_regonly","w,r,r","",""
+"VPSRLW xmmV, {k}{z}, xmm2/m128, imm8u","VPSRLW imm8u, xmm2/m128, {k}{z}, xmmV","vpsrlw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /2 ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLW ymmV, ymm2, imm8u","VPSRLW imm8u, ymm2, ymmV","vpsrlw imm8u, ymm2, ymmV","VEX.NDD.256.66.0F.WIG 71 /2 ib","V","V","AVX2","modrm_regonly","w,r,r","",""
+"VPSRLW ymmV, {k}{z}, ymm2/m256, imm8u","VPSRLW imm8u, ymm2/m256, {k}{z}, ymmV","vpsrlw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /2 ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSRLW zmmV, {k}{z}, zmm2/m512, imm8u","VPSRLW imm8u, zmm2/m512, {k}{z}, zmmV","vpsrlw imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /2 ib","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSRLW xmm1, xmmV, xmm2/m128","VPSRLW xmm2/m128, xmmV, xmm1","vpsrlw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D1 /r","V","V","AVX","","w,r,r","",""
+"VPSRLW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLW xmm2/m128, xmmV, {k}{z}, xmm1","vpsrlw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLW ymm1, ymmV, xmm2/m128","VPSRLW xmm2/m128, ymmV, ymm1","vpsrlw xmm2/m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D1 /r","V","V","AVX2","","w,r,r","",""
+"VPSRLW ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLW xmm2/m128, ymmV, {k}{z}, ymm1","vpsrlw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D1 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSRLW zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLW xmm2/m128, zmmV, {k}{z}, zmm1","vpsrlw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D1 /r","V","V","AVX512BW","scale16","w,r,r,r","",""
+"VPSUBB xmm1, xmmV, xmm2/m128","VPSUBB xmm2/m128, xmmV, xmm1","vpsubb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F8 /r","V","V","AVX","","w,r,r","",""
+"VPSUBB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBB xmm2/m128, xmmV, {k}{z}, xmm1","vpsubb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F8 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSUBB ymm1, ymmV, ymm2/m256","VPSUBB ymm2/m256, ymmV, ymm1","vpsubb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F8 /r","V","V","AVX2","","w,r,r","",""
+"VPSUBB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBB ymm2/m256, ymmV, {k}{z}, ymm1","vpsubb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F8 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSUBB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBB zmm2/m512, zmmV, {k}{z}, zmm1","vpsubb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F8 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSUBD xmm1, xmmV, xmm2/m128","VPSUBD xmm2/m128, xmmV, xmm1","vpsubd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FA /r","V","V","AVX","","w,r,r","",""
+"VPSUBD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSUBD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpsubd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 FA /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPSUBD ymm1, ymmV, ymm2/m256","VPSUBD ymm2/m256, ymmV, ymm1","vpsubd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FA /r","V","V","AVX2","","w,r,r","",""
+"VPSUBD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSUBD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpsubd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 FA /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPSUBD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSUBD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpsubd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 FA /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPSUBQ xmm1, xmmV, xmm2/m128","VPSUBQ xmm2/m128, xmmV, xmm1","vpsubq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FB /r","V","V","AVX","","w,r,r","",""
+"VPSUBQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSUBQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpsubq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 FB /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPSUBQ ymm1, ymmV, ymm2/m256","VPSUBQ ymm2/m256, ymmV, ymm1","vpsubq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FB /r","V","V","AVX2","","w,r,r","",""
+"VPSUBQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSUBQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpsubq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 FB /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPSUBQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSUBQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpsubq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 FB /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPSUBSB xmm1, xmmV, xmm2/m128","VPSUBSB xmm2/m128, xmmV, xmm1","vpsubsb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E8 /r","V","V","AVX","","w,r,r","",""
+"VPSUBSB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBSB xmm2/m128, xmmV, {k}{z}, xmm1","vpsubsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E8 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSUBSB ymm1, ymmV, ymm2/m256","VPSUBSB ymm2/m256, ymmV, ymm1","vpsubsb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E8 /r","V","V","AVX2","","w,r,r","",""
+"VPSUBSB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBSB ymm2/m256, ymmV, {k}{z}, ymm1","vpsubsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E8 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSUBSB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBSB zmm2/m512, zmmV, {k}{z}, zmm1","vpsubsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E8 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSUBSW xmm1, xmmV, xmm2/m128","VPSUBSW xmm2/m128, xmmV, xmm1","vpsubsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E9 /r","V","V","AVX","","w,r,r","",""
+"VPSUBSW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBSW xmm2/m128, xmmV, {k}{z}, xmm1","vpsubsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E9 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSUBSW ymm1, ymmV, ymm2/m256","VPSUBSW ymm2/m256, ymmV, ymm1","vpsubsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E9 /r","V","V","AVX2","","w,r,r","",""
+"VPSUBSW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBSW ymm2/m256, ymmV, {k}{z}, ymm1","vpsubsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E9 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSUBSW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBSW zmm2/m512, zmmV, {k}{z}, zmm1","vpsubsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E9 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSUBUSB xmm1, xmmV, xmm2/m128","VPSUBUSB xmm2/m128, xmmV, xmm1","vpsubusb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D8 /r","V","V","AVX","","w,r,r","",""
+"VPSUBUSB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBUSB xmm2/m128, xmmV, {k}{z}, xmm1","vpsubusb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D8 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSUBUSB ymm1, ymmV, ymm2/m256","VPSUBUSB ymm2/m256, ymmV, ymm1","vpsubusb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D8 /r","V","V","AVX2","","w,r,r","",""
+"VPSUBUSB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBUSB ymm2/m256, ymmV, {k}{z}, ymm1","vpsubusb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D8 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSUBUSB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBUSB zmm2/m512, zmmV, {k}{z}, zmm1","vpsubusb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D8 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSUBUSW xmm1, xmmV, xmm2/m128","VPSUBUSW xmm2/m128, xmmV, xmm1","vpsubusw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D9 /r","V","V","AVX","","w,r,r","",""
+"VPSUBUSW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBUSW xmm2/m128, xmmV, {k}{z}, xmm1","vpsubusw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D9 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSUBUSW ymm1, ymmV, ymm2/m256","VPSUBUSW ymm2/m256, ymmV, ymm1","vpsubusw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D9 /r","V","V","AVX2","","w,r,r","",""
+"VPSUBUSW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBUSW ymm2/m256, ymmV, {k}{z}, ymm1","vpsubusw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D9 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSUBUSW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBUSW zmm2/m512, zmmV, {k}{z}, zmm1","vpsubusw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D9 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPSUBW xmm1, xmmV, xmm2/m128","VPSUBW xmm2/m128, xmmV, xmm1","vpsubw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F9 /r","V","V","AVX","","w,r,r","",""
+"VPSUBW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBW xmm2/m128, xmmV, {k}{z}, xmm1","vpsubw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F9 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPSUBW ymm1, ymmV, ymm2/m256","VPSUBW ymm2/m256, ymmV, ymm1","vpsubw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F9 /r","V","V","AVX2","","w,r,r","",""
+"VPSUBW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBW ymm2/m256, ymmV, {k}{z}, ymm1","vpsubw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F9 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPSUBW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBW zmm2/m512, zmmV, {k}{z}, zmm1","vpsubw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F9 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPTERNLOGD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPTERNLOGD imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpternlogd imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W0 25 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","rw,r,r,r,r","",""
+"VPTERNLOGD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPTERNLOGD imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpternlogd imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W0 25 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","rw,r,r,r,r","",""
+"VPTERNLOGD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPTERNLOGD imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpternlogd imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W0 25 /r ib","V","V","AVX512F","bscale4,scale64","rw,r,r,r,r","",""
+"VPTERNLOGQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPTERNLOGQ imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpternlogq imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W1 25 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","rw,r,r,r,r","",""
+"VPTERNLOGQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPTERNLOGQ imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpternlogq imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W1 25 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","rw,r,r,r,r","",""
+"VPTERNLOGQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPTERNLOGQ imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpternlogq imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W1 25 /r ib","V","V","AVX512F","bscale8,scale64","rw,r,r,r,r","",""
+"VPTEST xmm1, xmm2/m128","VPTEST xmm2/m128, xmm1","vptest xmm2/m128, xmm1","VEX.128.66.0F38.WIG 17 /r","V","V","AVX","","r,r","",""
+"VPTEST ymm1, ymm2/m256","VPTEST ymm2/m256, ymm1","vptest ymm2/m256, ymm1","VEX.256.66.0F38.WIG 17 /r","V","V","AVX","","r,r","",""
+"VPTESTMB k1, {k}, xmmV, xmm2/m128","VPTESTMB xmm2/m128, xmmV, {k}, k1","vptestmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W0 26 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPTESTMB k1, {k}, ymmV, ymm2/m256","VPTESTMB ymm2/m256, ymmV, {k}, k1","vptestmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W0 26 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPTESTMB k1, {k}, zmmV, zmm2/m512","VPTESTMB zmm2/m512, zmmV, {k}, k1","vptestmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W0 26 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPTESTMD k1, {k}, xmmV, xmm2/m128/m32bcst","VPTESTMD xmm2/m128/m32bcst, xmmV, {k}, k1","vptestmd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPTESTMD k1, {k}, ymmV, ymm2/m256/m32bcst","VPTESTMD ymm2/m256/m32bcst, ymmV, {k}, k1","vptestmd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPTESTMD k1, {k}, zmmV, zmm2/m512/m32bcst","VPTESTMD zmm2/m512/m32bcst, zmmV, {k}, k1","vptestmd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W0 27 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPTESTMQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPTESTMQ xmm2/m128/m64bcst, xmmV, {k}, k1","vptestmq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPTESTMQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPTESTMQ ymm2/m256/m64bcst, ymmV, {k}, k1","vptestmq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPTESTMQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPTESTMQ zmm2/m512/m64bcst, zmmV, {k}, k1","vptestmq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W1 27 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPTESTMW k1, {k}, xmmV, xmm2/m128","VPTESTMW xmm2/m128, xmmV, {k}, k1","vptestmw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W1 26 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPTESTMW k1, {k}, ymmV, ymm2/m256","VPTESTMW ymm2/m256, ymmV, {k}, k1","vptestmw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W1 26 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPTESTMW k1, {k}, zmmV, zmm2/m512","VPTESTMW zmm2/m512, zmmV, {k}, k1","vptestmw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W1 26 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPTESTNMB k1, {k}, xmmV, xmm2/m128","VPTESTNMB xmm2/m128, xmmV, {k}, k1","vptestnmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W0 26 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPTESTNMB k1, {k}, ymmV, ymm2/m256","VPTESTNMB ymm2/m256, ymmV, {k}, k1","vptestnmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W0 26 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPTESTNMB k1, {k}, zmmV, zmm2/m512","VPTESTNMB zmm2/m512, zmmV, {k}, k1","vptestnmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W0 26 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPTESTNMD k1, {k}, xmmV, xmm2/m128/m32bcst","VPTESTNMD xmm2/m128/m32bcst, xmmV, {k}, k1","vptestnmd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPTESTNMD k1, {k}, ymmV, ymm2/m256/m32bcst","VPTESTNMD ymm2/m256/m32bcst, ymmV, {k}, k1","vptestnmd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPTESTNMD k1, {k}, zmmV, zmm2/m512/m32bcst","VPTESTNMD zmm2/m512/m32bcst, zmmV, {k}, k1","vptestnmd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W0 27 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPTESTNMQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPTESTNMQ xmm2/m128/m64bcst, xmmV, {k}, k1","vptestnmq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPTESTNMQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPTESTNMQ ymm2/m256/m64bcst, ymmV, {k}, k1","vptestnmq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPTESTNMQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPTESTNMQ zmm2/m512/m64bcst, zmmV, {k}, k1","vptestnmq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W1 27 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPTESTNMW k1, {k}, xmmV, xmm2/m128","VPTESTNMW xmm2/m128, xmmV, {k}, k1","vptestnmw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W1 26 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPTESTNMW k1, {k}, ymmV, ymm2/m256","VPTESTNMW ymm2/m256, ymmV, {k}, k1","vptestnmw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W1 26 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPTESTNMW k1, {k}, zmmV, zmm2/m512","VPTESTNMW zmm2/m512, zmmV, {k}, k1","vptestnmw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W1 26 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPUNPCKHBW xmm1, xmmV, xmm2/m128","VPUNPCKHBW xmm2/m128, xmmV, xmm1","vpunpckhbw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 68 /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKHBW xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKHBW xmm2/m128, xmmV, {k}{z}, xmm1","vpunpckhbw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG 68 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPUNPCKHBW ymm1, ymmV, ymm2/m256","VPUNPCKHBW ymm2/m256, ymmV, ymm1","vpunpckhbw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 68 /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKHBW ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKHBW ymm2/m256, ymmV, {k}{z}, ymm1","vpunpckhbw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG 68 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPUNPCKHBW zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKHBW zmm2/m512, zmmV, {k}{z}, zmm1","vpunpckhbw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG 68 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPUNPCKHDQ xmm1, xmmV, xmm2/m128","VPUNPCKHDQ xmm2/m128, xmmV, xmm1","vpunpckhdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6A /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKHDQ xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPUNPCKHDQ xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpunpckhdq xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 6A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPUNPCKHDQ ymm1, ymmV, ymm2/m256","VPUNPCKHDQ ymm2/m256, ymmV, ymm1","vpunpckhdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6A /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKHDQ ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPUNPCKHDQ ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpunpckhdq ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 6A /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPUNPCKHDQ zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPUNPCKHDQ zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpunpckhdq zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 6A /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPUNPCKHQDQ xmm1, xmmV, xmm2/m128","VPUNPCKHQDQ xmm2/m128, xmmV, xmm1","vpunpckhqdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6D /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKHQDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPUNPCKHQDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpunpckhqdq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 6D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPUNPCKHQDQ ymm1, ymmV, ymm2/m256","VPUNPCKHQDQ ymm2/m256, ymmV, ymm1","vpunpckhqdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6D /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKHQDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPUNPCKHQDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpunpckhqdq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 6D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPUNPCKHQDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPUNPCKHQDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpunpckhqdq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 6D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPUNPCKHWD xmm1, xmmV, xmm2/m128","VPUNPCKHWD xmm2/m128, xmmV, xmm1","vpunpckhwd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 69 /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKHWD xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKHWD xmm2/m128, xmmV, {k}{z}, xmm1","vpunpckhwd xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG 69 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPUNPCKHWD ymm1, ymmV, ymm2/m256","VPUNPCKHWD ymm2/m256, ymmV, ymm1","vpunpckhwd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 69 /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKHWD ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKHWD ymm2/m256, ymmV, {k}{z}, ymm1","vpunpckhwd ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG 69 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPUNPCKHWD zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKHWD zmm2/m512, zmmV, {k}{z}, zmm1","vpunpckhwd zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG 69 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPUNPCKLBW xmm1, xmmV, xmm2/m128","VPUNPCKLBW xmm2/m128, xmmV, xmm1","vpunpcklbw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 60 /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKLBW xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKLBW xmm2/m128, xmmV, {k}{z}, xmm1","vpunpcklbw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG 60 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPUNPCKLBW ymm1, ymmV, ymm2/m256","VPUNPCKLBW ymm2/m256, ymmV, ymm1","vpunpcklbw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 60 /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKLBW ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKLBW ymm2/m256, ymmV, {k}{z}, ymm1","vpunpcklbw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG 60 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPUNPCKLBW zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKLBW zmm2/m512, zmmV, {k}{z}, zmm1","vpunpcklbw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG 60 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPUNPCKLDQ xmm1, xmmV, xmm2/m128","VPUNPCKLDQ xmm2/m128, xmmV, xmm1","vpunpckldq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 62 /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKLDQ xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPUNPCKLDQ xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpunpckldq xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 62 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPUNPCKLDQ ymm1, ymmV, ymm2/m256","VPUNPCKLDQ ymm2/m256, ymmV, ymm1","vpunpckldq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 62 /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKLDQ ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPUNPCKLDQ ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpunpckldq ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 62 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPUNPCKLDQ zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPUNPCKLDQ zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpunpckldq zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 62 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPUNPCKLQDQ xmm1, xmmV, xmm2/m128","VPUNPCKLQDQ xmm2/m128, xmmV, xmm1","vpunpcklqdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6C /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKLQDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPUNPCKLQDQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpunpcklqdq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 6C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPUNPCKLQDQ ymm1, ymmV, ymm2/m256","VPUNPCKLQDQ ymm2/m256, ymmV, ymm1","vpunpcklqdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6C /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKLQDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPUNPCKLQDQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpunpcklqdq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 6C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPUNPCKLQDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPUNPCKLQDQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpunpcklqdq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 6C /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VPUNPCKLWD xmm1, xmmV, xmm2/m128","VPUNPCKLWD xmm2/m128, xmmV, xmm1","vpunpcklwd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 61 /r","V","V","AVX","","w,r,r","",""
+"VPUNPCKLWD xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKLWD xmm2/m128, xmmV, {k}{z}, xmm1","vpunpcklwd xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG 61 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","",""
+"VPUNPCKLWD ymm1, ymmV, ymm2/m256","VPUNPCKLWD ymm2/m256, ymmV, ymm1","vpunpcklwd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 61 /r","V","V","AVX2","","w,r,r","",""
+"VPUNPCKLWD ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKLWD ymm2/m256, ymmV, {k}{z}, ymm1","vpunpcklwd ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG 61 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","",""
+"VPUNPCKLWD zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKLWD zmm2/m512, zmmV, {k}{z}, zmm1","vpunpcklwd zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG 61 /r","V","V","AVX512BW","scale64","w,r,r,r","",""
+"VPXOR xmm1, xmmV, xmm2/m128","VPXOR xmm2/m128, xmmV, xmm1","vpxor xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EF /r","V","V","AVX","","w,r,r","",""
+"VPXOR ymm1, ymmV, ymm2/m256","VPXOR ymm2/m256, ymmV, ymm1","vpxor ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EF /r","V","V","AVX2","","w,r,r","",""
+"VPXORD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPXORD xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpxord xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 EF /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VPXORD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPXORD ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpxord ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 EF /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VPXORD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPXORD zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpxord zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 EF /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VPXORQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPXORQ xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpxorq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 EF /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VPXORQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPXORQ ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpxorq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 EF /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VPXORQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPXORQ zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpxorq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 EF /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VRANGEPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u:4","VRANGEPD imm8u:4, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vrangepd imm8u:4, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 50 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VRANGEPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u:4","VRANGEPD imm8u:4, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vrangepd imm8u:4, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 50 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VRANGEPD zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u:4","VRANGEPD imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}","vrangepd imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F3A.W1 50 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
+"VRANGEPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u:4","VRANGEPD imm8u:4, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vrangepd imm8u:4, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 50 /r ib","V","V","AVX512DQ","bscale8,scale64","w,r,r,r,r","",""
+"VRANGEPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u:4","VRANGEPS imm8u:4, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vrangeps imm8u:4, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 50 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VRANGEPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u:4","VRANGEPS imm8u:4, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vrangeps imm8u:4, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 50 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VRANGEPS zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u:4","VRANGEPS imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}","vrangeps imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F3A.W0 50 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
+"VRANGEPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u:4","VRANGEPS imm8u:4, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vrangeps imm8u:4, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 50 /r ib","V","V","AVX512DQ","bscale4,scale64","w,r,r,r,r","",""
+"VRANGESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VRANGESD imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","vrangesd imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W1 51 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
+"VRANGESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u:4","VRANGESD imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","vrangesd imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W1 51 /r ib","V","V","AVX512DQ","scale8","w,r,r,r,r","",""
+"VRANGESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VRANGESS imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","vrangess imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W0 51 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
+"VRANGESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u:4","VRANGESS imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","vrangess imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W0 51 /r ib","V","V","AVX512DQ","scale4","w,r,r,r,r","",""
+"VRCP14PD xmm1, {k}{z}, xmm2/m128/m64bcst","VRCP14PD xmm2/m128/m64bcst, {k}{z}, xmm1","vrcp14pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 4C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
+"VRCP14PD ymm1, {k}{z}, ymm2/m256/m64bcst","VRCP14PD ymm2/m256/m64bcst, {k}{z}, ymm1","vrcp14pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 4C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
+"VRCP14PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRCP14PD zmm2/m512/m64bcst, {k}{z}, zmm1","vrcp14pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 4C /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
+"VRCP14PS xmm1, {k}{z}, xmm2/m128/m32bcst","VRCP14PS xmm2/m128/m32bcst, {k}{z}, xmm1","vrcp14ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 4C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VRCP14PS ymm1, {k}{z}, ymm2/m256/m32bcst","VRCP14PS ymm2/m256/m32bcst, {k}{z}, ymm1","vrcp14ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 4C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VRCP14PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRCP14PS zmm2/m512/m32bcst, {k}{z}, zmm1","vrcp14ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 4C /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VRCP14SD xmm1, {k}{z}, xmmV, xmm2/m64","VRCP14SD xmm2/m64, xmmV, {k}{z}, xmm1","vrcp14sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 4D /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VRCP14SS xmm1, {k}{z}, xmmV, xmm2/m32","VRCP14SS xmm2/m32, xmmV, {k}{z}, xmm1","vrcp14ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 4D /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VRCP28PD zmm1{sae}, {k}{z}, zmm2","VRCP28PD zmm2, {k}{z}, zmm1{sae}","vrcp28pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 CA /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
+"VRCP28PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRCP28PD zmm2/m512/m64bcst, {k}{z}, zmm1","vrcp28pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 CA /r","V","V","AVX512ER","bscale8,scale64","w,r,r","",""
+"VRCP28PS zmm1{sae}, {k}{z}, zmm2","VRCP28PS zmm2, {k}{z}, zmm1{sae}","vrcp28ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 CA /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
+"VRCP28PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRCP28PS zmm2/m512/m32bcst, {k}{z}, zmm1","vrcp28ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 CA /r","V","V","AVX512ER","bscale4,scale64","w,r,r","",""
+"VRCP28SD xmm1{sae}, {k}{z}, xmmV, xmm2","VRCP28SD xmm2, xmmV, {k}{z}, xmm1{sae}","vrcp28sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W1 CB /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","",""
+"VRCP28SD xmm1, {k}{z}, xmmV, xmm2/m64","VRCP28SD xmm2/m64, xmmV, {k}{z}, xmm1","vrcp28sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 CB /r","V","V","AVX512ER","scale8","w,r,r,r","",""
+"VRCP28SS xmm1{sae}, {k}{z}, xmmV, xmm2","VRCP28SS xmm2, xmmV, {k}{z}, xmm1{sae}","vrcp28ss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W0 CB /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","",""
+"VRCP28SS xmm1, {k}{z}, xmmV, xmm2/m32","VRCP28SS xmm2/m32, xmmV, {k}{z}, xmm1","vrcp28ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 CB /r","V","V","AVX512ER","scale4","w,r,r,r","",""
+"VRCPPS xmm1, xmm2/m128","VRCPPS xmm2/m128, xmm1","vrcpps xmm2/m128, xmm1","VEX.128.0F.WIG 53 /r","V","V","AVX","","w,r","",""
+"VRCPPS ymm1, ymm2/m256","VRCPPS ymm2/m256, ymm1","vrcpps ymm2/m256, ymm1","VEX.256.0F.WIG 53 /r","V","V","AVX","","w,r","",""
+"VRCPSS xmm1, xmmV, xmm2/m32","VRCPSS xmm2/m32, xmmV, xmm1","vrcpss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 53 /r","V","V","AVX","","w,r,r","",""
+"VREDUCEPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VREDUCEPD imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","vreducepd imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W1 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VREDUCEPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VREDUCEPD imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vreducepd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VREDUCEPD zmm1{sae}, {k}{z}, zmm2, imm8u","VREDUCEPD imm8u, zmm2, {k}{z}, zmm1{sae}","vreducepd imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W1 56 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r","",""
+"VREDUCEPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VREDUCEPD imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vreducepd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 56 /r ib","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
+"VREDUCEPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VREDUCEPS imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","vreduceps imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W0 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VREDUCEPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VREDUCEPS imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","vreduceps imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W0 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VREDUCEPS zmm1{sae}, {k}{z}, zmm2, imm8u","VREDUCEPS imm8u, zmm2, {k}{z}, zmm1{sae}","vreduceps imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W0 56 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r","",""
+"VREDUCEPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VREDUCEPS imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","vreduceps imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W0 56 /r ib","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
+"VREDUCESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VREDUCESD imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vreducesd imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W1 57 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
+"VREDUCESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VREDUCESD imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","vreducesd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W1 57 /r ib","V","V","AVX512DQ","scale8","w,r,r,r,r","",""
+"VREDUCESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VREDUCESS imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vreducess imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W0 57 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r,r","",""
+"VREDUCESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VREDUCESS imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","vreducess imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W0 57 /r ib","V","V","AVX512DQ","scale4","w,r,r,r,r","",""
+"VRNDSCALEPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VRNDSCALEPD imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","vrndscalepd imm8u, xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W1 09 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VRNDSCALEPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VRNDSCALEPD imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","vrndscalepd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W1 09 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VRNDSCALEPD zmm1{sae}, {k}{z}, zmm2, imm8u","VRNDSCALEPD imm8u, zmm2, {k}{z}, zmm1{sae}","vrndscalepd imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W1 09 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VRNDSCALEPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VRNDSCALEPD imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","vrndscalepd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W1 09 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VRNDSCALEPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VRNDSCALEPS imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","vrndscaleps imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F3A.W0 08 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VRNDSCALEPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VRNDSCALEPS imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","vrndscaleps imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F3A.W0 08 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VRNDSCALEPS zmm1{sae}, {k}{z}, zmm2, imm8u","VRNDSCALEPS imm8u, zmm2, {k}{z}, zmm1{sae}","vrndscaleps imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W0 08 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VRNDSCALEPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VRNDSCALEPS imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","vrndscaleps imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F3A.W0 08 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VRNDSCALESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VRNDSCALESD imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vrndscalesd imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W1 0B /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VRNDSCALESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VRNDSCALESD imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","vrndscalesd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W1 0B /r ib","V","V","AVX512F","scale8","w,r,r,r,r","",""
+"VRNDSCALESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VRNDSCALESS imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","vrndscaless imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3A.W0 0A /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","",""
+"VRNDSCALESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VRNDSCALESS imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","vrndscaless imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F3A.W0 0A /r ib","V","V","AVX512F","scale4","w,r,r,r,r","",""
+"VROUNDPD xmm1, xmm2/m128, imm8u","VROUNDPD imm8u, xmm2/m128, xmm1","vroundpd imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 09 /r ib","V","V","AVX","","w,r,r","",""
+"VROUNDPD ymm1, ymm2/m256, imm8u","VROUNDPD imm8u, ymm2/m256, ymm1","vroundpd imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.WIG 09 /r ib","V","V","AVX","","w,r,r","",""
+"VROUNDPS xmm1, xmm2/m128, imm8u","VROUNDPS imm8u, xmm2/m128, xmm1","vroundps imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 08 /r ib","V","V","AVX","","w,r,r","",""
+"VROUNDPS ymm1, ymm2/m256, imm8u","VROUNDPS imm8u, ymm2/m256, ymm1","vroundps imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.WIG 08 /r ib","V","V","AVX","","w,r,r","",""
+"VROUNDSD xmm1, xmmV, xmm2/m64, imm8u","VROUNDSD imm8u, xmm2/m64, xmmV, xmm1","vroundsd imm8u, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.WIG 0B /r ib","V","V","AVX","","w,r,r,r","",""
+"VROUNDSS xmm1, xmmV, xmm2/m32, imm8u","VROUNDSS imm8u, xmm2/m32, xmmV, xmm1","vroundss imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.WIG 0A /r ib","V","V","AVX","","w,r,r,r","",""
+"VRSQRT14PD xmm1, {k}{z}, xmm2/m128/m64bcst","VRSQRT14PD xmm2/m128/m64bcst, {k}{z}, xmm1","vrsqrt14pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 4E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
+"VRSQRT14PD ymm1, {k}{z}, ymm2/m256/m64bcst","VRSQRT14PD ymm2/m256/m64bcst, {k}{z}, ymm1","vrsqrt14pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 4E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
+"VRSQRT14PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRSQRT14PD zmm2/m512/m64bcst, {k}{z}, zmm1","vrsqrt14pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 4E /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
+"VRSQRT14PS xmm1, {k}{z}, xmm2/m128/m32bcst","VRSQRT14PS xmm2/m128/m32bcst, {k}{z}, xmm1","vrsqrt14ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 4E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VRSQRT14PS ymm1, {k}{z}, ymm2/m256/m32bcst","VRSQRT14PS ymm2/m256/m32bcst, {k}{z}, ymm1","vrsqrt14ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 4E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VRSQRT14PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRSQRT14PS zmm2/m512/m32bcst, {k}{z}, zmm1","vrsqrt14ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 4E /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VRSQRT14SD xmm1, {k}{z}, xmmV, xmm2/m64","VRSQRT14SD xmm2/m64, xmmV, {k}{z}, xmm1","vrsqrt14sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 4F /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VRSQRT14SS xmm1, {k}{z}, xmmV, xmm2/m32","VRSQRT14SS xmm2/m32, xmmV, {k}{z}, xmm1","vrsqrt14ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 4F /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VRSQRT28PD zmm1{sae}, {k}{z}, zmm2","VRSQRT28PD zmm2, {k}{z}, zmm1{sae}","vrsqrt28pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 CC /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
+"VRSQRT28PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRSQRT28PD zmm2/m512/m64bcst, {k}{z}, zmm1","vrsqrt28pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 CC /r","V","V","AVX512ER","bscale8,scale64","w,r,r","",""
+"VRSQRT28PS zmm1{sae}, {k}{z}, zmm2","VRSQRT28PS zmm2, {k}{z}, zmm1{sae}","vrsqrt28ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 CC /r","V","V","AVX512ER","modrm_regonly","w,r,r","",""
+"VRSQRT28PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRSQRT28PS zmm2/m512/m32bcst, {k}{z}, zmm1","vrsqrt28ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 CC /r","V","V","AVX512ER","bscale4,scale64","w,r,r","",""
+"VRSQRT28SD xmm1{sae}, {k}{z}, xmmV, xmm2","VRSQRT28SD xmm2, xmmV, {k}{z}, xmm1{sae}","vrsqrt28sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W1 CD /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","",""
+"VRSQRT28SD xmm1, {k}{z}, xmmV, xmm2/m64","VRSQRT28SD xmm2/m64, xmmV, {k}{z}, xmm1","vrsqrt28sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 CD /r","V","V","AVX512ER","scale8","w,r,r,r","",""
+"VRSQRT28SS xmm1{sae}, {k}{z}, xmmV, xmm2","VRSQRT28SS xmm2, xmmV, {k}{z}, xmm1{sae}","vrsqrt28ss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W0 CD /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","",""
+"VRSQRT28SS xmm1, {k}{z}, xmmV, xmm2/m32","VRSQRT28SS xmm2/m32, xmmV, {k}{z}, xmm1","vrsqrt28ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 CD /r","V","V","AVX512ER","scale4","w,r,r,r","",""
+"VRSQRTPS xmm1, xmm2/m128","VRSQRTPS xmm2/m128, xmm1","vrsqrtps xmm2/m128, xmm1","VEX.128.0F.WIG 52 /r","V","V","AVX","","w,r","",""
+"VRSQRTPS ymm1, ymm2/m256","VRSQRTPS ymm2/m256, ymm1","vrsqrtps ymm2/m256, ymm1","VEX.256.0F.WIG 52 /r","V","V","AVX","","w,r","",""
+"VRSQRTSS xmm1, xmmV, xmm2/m32","VRSQRTSS xmm2/m32, xmmV, xmm1","vrsqrtss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 52 /r","V","V","AVX","","w,r,r","",""
+"VSCALEFPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VSCALEFPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vscalefpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 2C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VSCALEFPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VSCALEFPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vscalefpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 2C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VSCALEFPD zmm1{er}, {k}{z}, zmmV, zmm2","VSCALEFPD zmm2, zmmV, {k}{z}, zmm1{er}","vscalefpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F38.W1 2C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSCALEFPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VSCALEFPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vscalefpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 2C /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VSCALEFPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VSCALEFPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vscalefps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 2C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VSCALEFPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VSCALEFPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vscalefps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 2C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VSCALEFPS zmm1{er}, {k}{z}, zmmV, zmm2","VSCALEFPS zmm2, zmmV, {k}{z}, zmm1{er}","vscalefps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F38.W0 2C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSCALEFPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VSCALEFPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vscalefps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 2C /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VSCALEFSD xmm1{er}, {k}{z}, xmmV, xmm2","VSCALEFSD xmm2, xmmV, {k}{z}, xmm1{er}","vscalefsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.66.0F38.W1 2D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSCALEFSD xmm1, {k}{z}, xmmV, xmm2/m64","VSCALEFSD xmm2/m64, xmmV, {k}{z}, xmm1","vscalefsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 2D /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VSCALEFSS xmm1{er}, {k}{z}, xmmV, xmm2","VSCALEFSS xmm2, xmmV, {k}{z}, xmm1{er}","vscalefss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.66.0F38.W0 2D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSCALEFSS xmm1, {k}{z}, xmmV, xmm2/m32","VSCALEFSS xmm2/m32, xmmV, {k}{z}, xmm1","vscalefss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 2D /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VSCATTERDPD vm32x, {k1-k7}, xmm1","VSCATTERDPD xmm1, {k1-k7}, vm32x","vscatterdpd xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W1 A2 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VSCATTERDPD vm32x, {k1-k7}, ymm1","VSCATTERDPD ymm1, {k1-k7}, vm32x","vscatterdpd ymm1, {k1-k7}, vm32x","EVEX.256.66.0F38.W1 A2 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VSCATTERDPD vm32y, {k1-k7}, zmm1","VSCATTERDPD zmm1, {k1-k7}, vm32y","vscatterdpd zmm1, {k1-k7}, vm32y","EVEX.512.66.0F38.W1 A2 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VSCATTERDPS vm32x, {k1-k7}, xmm1","VSCATTERDPS xmm1, {k1-k7}, vm32x","vscatterdps xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W0 A2 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VSCATTERDPS vm32y, {k1-k7}, ymm1","VSCATTERDPS ymm1, {k1-k7}, vm32y","vscatterdps ymm1, {k1-k7}, vm32y","EVEX.256.66.0F38.W0 A2 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VSCATTERDPS vm32z, {k1-k7}, zmm1","VSCATTERDPS zmm1, {k1-k7}, vm32z","vscatterdps zmm1, {k1-k7}, vm32z","EVEX.512.66.0F38.W0 A2 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VSCATTERPF0DPD vm32y, {k1-k7}","VSCATTERPF0DPD {k1-k7}, vm32y","vscatterpf0dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /5","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VSCATTERPF0DPS vm32z, {k1-k7}","VSCATTERPF0DPS {k1-k7}, vm32z","vscatterpf0dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /5","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VSCATTERPF0QPD vm64z, {k1-k7}","VSCATTERPF0QPD {k1-k7}, vm64z","vscatterpf0qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /5","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VSCATTERPF0QPS vm64z, {k1-k7}","VSCATTERPF0QPS {k1-k7}, vm64z","vscatterpf0qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /5","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VSCATTERPF1DPD vm32y, {k1-k7}","VSCATTERPF1DPD {k1-k7}, vm32y","vscatterpf1dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /6","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VSCATTERPF1DPS vm32z, {k1-k7}","VSCATTERPF1DPS {k1-k7}, vm32z","vscatterpf1dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /6","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VSCATTERPF1QPD vm64z, {k1-k7}","VSCATTERPF1QPD {k1-k7}, vm64z","vscatterpf1qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /6","V","V","AVX512PF","modrm_memonly,scale8","r,rw","",""
+"VSCATTERPF1QPS vm64z, {k1-k7}","VSCATTERPF1QPS {k1-k7}, vm64z","vscatterpf1qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /6","V","V","AVX512PF","modrm_memonly,scale4","r,rw","",""
+"VSCATTERQPD vm64x, {k1-k7}, xmm1","VSCATTERQPD xmm1, {k1-k7}, vm64x","vscatterqpd xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W1 A3 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VSCATTERQPD vm64y, {k1-k7}, ymm1","VSCATTERQPD ymm1, {k1-k7}, vm64y","vscatterqpd ymm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W1 A3 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,rw,r","",""
+"VSCATTERQPD vm64z, {k1-k7}, zmm1","VSCATTERQPD zmm1, {k1-k7}, vm64z","vscatterqpd zmm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W1 A3 /vsib","V","V","AVX512F","modrm_memonly,scale8","w,rw,r","",""
+"VSCATTERQPS vm64x, {k1-k7}, xmm1","VSCATTERQPS xmm1, {k1-k7}, vm64x","vscatterqps xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W0 A3 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VSCATTERQPS vm64y, {k1-k7}, xmm1","VSCATTERQPS xmm1, {k1-k7}, vm64y","vscatterqps xmm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W0 A3 /vsib","V","V","AVX512F+AVX512VL","modrm_memonly,scale4","w,rw,r","",""
+"VSCATTERQPS vm64z, {k1-k7}, ymm1","VSCATTERQPS ymm1, {k1-k7}, vm64z","vscatterqps ymm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W0 A3 /vsib","V","V","AVX512F","modrm_memonly,scale4","w,rw,r","",""
+"VSHUFF32X4 ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFF32X4 imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshuff32x4 imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 23 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VSHUFF32X4 zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFF32X4 imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshuff32x4 imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 23 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VSHUFF64X2 ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFF64X2 imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshuff64x2 imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 23 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VSHUFF64X2 zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFF64X2 imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshuff64x2 imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 23 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VSHUFI32X4 ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFI32X4 imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshufi32x4 imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 43 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VSHUFI32X4 zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFI32X4 imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshufi32x4 imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 43 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VSHUFI64X2 ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFI64X2 imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshufi64x2 imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 43 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VSHUFI64X2 zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFI64X2 imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshufi64x2 imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 43 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VSHUFPD xmm1, xmmV, xmm2/m128, imm8u","VSHUFPD imm8u, xmm2/m128, xmmV, xmm1","vshufpd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C6 /r ib","V","V","AVX","","w,r,r,r","",""
+"VSHUFPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VSHUFPD imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vshufpd imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 C6 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r,r","",""
+"VSHUFPD ymm1, ymmV, ymm2/m256, imm8u","VSHUFPD imm8u, ymm2/m256, ymmV, ymm1","vshufpd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG C6 /r ib","V","V","AVX","","w,r,r,r","",""
+"VSHUFPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFPD imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshufpd imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 C6 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r,r","",""
+"VSHUFPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFPD imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshufpd imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 C6 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r,r","",""
+"VSHUFPS xmm1, xmmV, xmm2/m128, imm8u","VSHUFPS imm8u, xmm2/m128, xmmV, xmm1","vshufps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG C6 /r ib","V","V","AVX","","w,r,r,r","",""
+"VSHUFPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VSHUFPS imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vshufps imm8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 C6 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r,r","",""
+"VSHUFPS ymm1, ymmV, ymm2/m256, imm8u","VSHUFPS imm8u, ymm2/m256, ymmV, ymm1","vshufps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG C6 /r ib","V","V","AVX","","w,r,r,r","",""
+"VSHUFPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFPS imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshufps imm8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 C6 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r,r","",""
+"VSHUFPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFPS imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshufps imm8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 C6 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r","",""
+"VSQRTPD xmm1, xmm2/m128","VSQRTPD xmm2/m128, xmm1","vsqrtpd xmm2/m128, xmm1","VEX.128.66.0F.WIG 51 /r","V","V","AVX","","w,r","",""
+"VSQRTPD xmm1, {k}{z}, xmm2/m128/m64bcst","VSQRTPD xmm2/m128/m64bcst, {k}{z}, xmm1","vsqrtpd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 51 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","",""
+"VSQRTPD ymm1, ymm2/m256","VSQRTPD ymm2/m256, ymm1","vsqrtpd ymm2/m256, ymm1","VEX.256.66.0F.WIG 51 /r","V","V","AVX","","w,r","",""
+"VSQRTPD ymm1, {k}{z}, ymm2/m256/m64bcst","VSQRTPD ymm2/m256/m64bcst, {k}{z}, ymm1","vsqrtpd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 51 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","",""
+"VSQRTPD zmm1{er}, {k}{z}, zmm2","VSQRTPD zmm2, {k}{z}, zmm1{er}","vsqrtpd zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 51 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VSQRTPD zmm1, {k}{z}, zmm2/m512/m64bcst","VSQRTPD zmm2/m512/m64bcst, {k}{z}, zmm1","vsqrtpd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 51 /r","V","V","AVX512F","bscale8,scale64","w,r,r","",""
+"VSQRTPS xmm1, xmm2/m128","VSQRTPS xmm2/m128, xmm1","vsqrtps xmm2/m128, xmm1","VEX.128.0F.WIG 51 /r","V","V","AVX","","w,r","",""
+"VSQRTPS xmm1, {k}{z}, xmm2/m128/m32bcst","VSQRTPS xmm2/m128/m32bcst, {k}{z}, xmm1","vsqrtps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 51 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","",""
+"VSQRTPS ymm1, ymm2/m256","VSQRTPS ymm2/m256, ymm1","vsqrtps ymm2/m256, ymm1","VEX.256.0F.WIG 51 /r","V","V","AVX","","w,r","",""
+"VSQRTPS ymm1, {k}{z}, ymm2/m256/m32bcst","VSQRTPS ymm2/m256/m32bcst, {k}{z}, ymm1","vsqrtps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 51 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","",""
+"VSQRTPS zmm1{er}, {k}{z}, zmm2","VSQRTPS zmm2, {k}{z}, zmm1{er}","vsqrtps zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 51 /r","V","V","AVX512F","modrm_regonly","w,r,r","",""
+"VSQRTPS zmm1, {k}{z}, zmm2/m512/m32bcst","VSQRTPS zmm2/m512/m32bcst, {k}{z}, zmm1","vsqrtps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 51 /r","V","V","AVX512F","bscale4,scale64","w,r,r","",""
+"VSQRTSD xmm1{er}, {k}{z}, xmmV, xmm2","VSQRTSD xmm2, xmmV, {k}{z}, xmm1{er}","vsqrtsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 51 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSQRTSD xmm1, xmmV, xmm2/m64","VSQRTSD xmm2/m64, xmmV, xmm1","vsqrtsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 51 /r","V","V","AVX","","w,r,r","",""
+"VSQRTSD xmm1, {k}{z}, xmmV, xmm2/m64","VSQRTSD xmm2/m64, xmmV, {k}{z}, xmm1","vsqrtsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 51 /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VSQRTSS xmm1{er}, {k}{z}, xmmV, xmm2","VSQRTSS xmm2, xmmV, {k}{z}, xmm1{er}","vsqrtss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 51 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSQRTSS xmm1, xmmV, xmm2/m32","VSQRTSS xmm2/m32, xmmV, xmm1","vsqrtss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 51 /r","V","V","AVX","","w,r,r","",""
+"VSQRTSS xmm1, {k}{z}, xmmV, xmm2/m32","VSQRTSS xmm2/m32, xmmV, {k}{z}, xmm1","vsqrtss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 51 /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VSTMXCSR m32","VSTMXCSR m32","vstmxcsr m32","VEX.128.0F.WIG AE /3","V","V","AVX","modrm_memonly","w","",""
+"VSUBPD xmm1, xmmV, xmm2/m128","VSUBPD xmm2/m128, xmmV, xmm1","vsubpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
+"VSUBPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VSUBPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vsubpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 5C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VSUBPD ymm1, ymmV, ymm2/m256","VSUBPD ymm2/m256, ymmV, ymm1","vsubpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
+"VSUBPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VSUBPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vsubpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 5C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VSUBPD zmm1{er}, {k}{z}, zmmV, zmm2","VSUBPD zmm2, zmmV, {k}{z}, zmm1{er}","vsubpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 5C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSUBPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VSUBPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vsubpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 5C /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VSUBPS xmm1, xmmV, xmm2/m128","VSUBPS xmm2/m128, xmmV, xmm1","vsubps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
+"VSUBPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VSUBPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vsubps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 5C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VSUBPS ymm1, ymmV, ymm2/m256","VSUBPS ymm2/m256, ymmV, ymm1","vsubps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
+"VSUBPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VSUBPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vsubps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 5C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VSUBPS zmm1{er}, {k}{z}, zmmV, zmm2","VSUBPS zmm2, zmmV, {k}{z}, zmm1{er}","vsubps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 5C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSUBPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VSUBPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vsubps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 5C /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VSUBSD xmm1{er}, {k}{z}, xmmV, xmm2","VSUBSD xmm2, xmmV, {k}{z}, xmm1{er}","vsubsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSUBSD xmm1, xmmV, xmm2/m64","VSUBSD xmm2/m64, xmmV, xmm1","vsubsd xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
+"VSUBSD xmm1, {k}{z}, xmmV, xmm2/m64","VSUBSD xmm2/m64, xmmV, {k}{z}, xmm1","vsubsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5C /r","V","V","AVX512F","scale8","w,r,r,r","",""
+"VSUBSS xmm1{er}, {k}{z}, xmmV, xmm2","VSUBSS xmm2, xmmV, {k}{z}, xmm1{er}","vsubss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 5C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","",""
+"VSUBSS xmm1, xmmV, xmm2/m32","VSUBSS xmm2/m32, xmmV, xmm1","vsubss xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5C /r","V","V","AVX","","w,r,r","",""
+"VSUBSS xmm1, {k}{z}, xmmV, xmm2/m32","VSUBSS xmm2/m32, xmmV, {k}{z}, xmm1","vsubss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5C /r","V","V","AVX512F","scale4","w,r,r,r","",""
+"VTESTPD xmm1, xmm2/m128","VTESTPD xmm2/m128, xmm1","vtestpd xmm2/m128, xmm1","VEX.128.66.0F38.W0 0F /r","V","V","AVX","","r,r","",""
+"VTESTPD ymm1, ymm2/m256","VTESTPD ymm2/m256, ymm1","vtestpd ymm2/m256, ymm1","VEX.256.66.0F38.W0 0F /r","V","V","AVX","","r,r","",""
+"VTESTPS xmm1, xmm2/m128","VTESTPS xmm2/m128, xmm1","vtestps xmm2/m128, xmm1","VEX.128.66.0F38.W0 0E /r","V","V","AVX","","r,r","",""
+"VTESTPS ymm1, ymm2/m256","VTESTPS ymm2/m256, ymm1","vtestps ymm2/m256, ymm1","VEX.256.66.0F38.W0 0E /r","V","V","AVX","","r,r","",""
+"VUCOMISD xmm1{sae}, xmm2","VUCOMISD xmm2, xmm1{sae}","vucomisd xmm2, xmm1{sae}","EVEX.128.66.0F.W1 2E /r","V","V","AVX512F","modrm_regonly","r,r","",""
+"VUCOMISD xmm1, xmm2/m64","VUCOMISD xmm2/m64, xmm1","vucomisd xmm2/m64, xmm1","EVEX.LIG.66.0F.W1 2E /r","V","V","AVX512F","scale8","r,r","",""
+"VUCOMISD xmm1, xmm2/m64","VUCOMISD xmm2/m64, xmm1","vucomisd xmm2/m64, xmm1","VEX.LIG.66.0F.WIG 2E /r","V","V","AVX","","r,r","",""
+"VUCOMISS xmm1{sae}, xmm2","VUCOMISS xmm2, xmm1{sae}","vucomiss xmm2, xmm1{sae}","EVEX.128.0F.W0 2E /r","V","V","AVX512F","modrm_regonly","r,r","",""
+"VUCOMISS xmm1, xmm2/m32","VUCOMISS xmm2/m32, xmm1","vucomiss xmm2/m32, xmm1","EVEX.LIG.0F.W0 2E /r","V","V","AVX512F","scale4","r,r","",""
+"VUCOMISS xmm1, xmm2/m32","VUCOMISS xmm2/m32, xmm1","vucomiss xmm2/m32, xmm1","VEX.LIG.0F.WIG 2E /r","V","V","AVX","","r,r","",""
+"VUNPCKHPD xmm1, xmmV, xmm2/m128","VUNPCKHPD xmm2/m128, xmmV, xmm1","vunpckhpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 15 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKHPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VUNPCKHPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vunpckhpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VUNPCKHPD ymm1, ymmV, ymm2/m256","VUNPCKHPD ymm2/m256, ymmV, ymm1","vunpckhpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 15 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKHPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VUNPCKHPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vunpckhpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VUNPCKHPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VUNPCKHPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vunpckhpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 15 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VUNPCKHPS xmm1, xmmV, xmm2/m128","VUNPCKHPS xmm2/m128, xmmV, xmm1","vunpckhps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 15 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKHPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VUNPCKHPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vunpckhps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VUNPCKHPS ymm1, ymmV, ymm2/m256","VUNPCKHPS ymm2/m256, ymmV, ymm1","vunpckhps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 15 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKHPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VUNPCKHPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vunpckhps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VUNPCKHPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VUNPCKHPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vunpckhps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 15 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VUNPCKLPD xmm1, xmmV, xmm2/m128","VUNPCKLPD xmm2/m128, xmmV, xmm1","vunpcklpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 14 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKLPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VUNPCKLPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vunpcklpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VUNPCKLPD ymm1, ymmV, ymm2/m256","VUNPCKLPD ymm2/m256, ymmV, ymm1","vunpcklpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 14 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKLPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VUNPCKLPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vunpcklpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VUNPCKLPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VUNPCKLPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vunpcklpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 14 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","",""
+"VUNPCKLPS xmm1, xmmV, xmm2/m128","VUNPCKLPS xmm2/m128, xmmV, xmm1","vunpcklps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 14 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKLPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VUNPCKLPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vunpcklps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VUNPCKLPS ymm1, ymmV, ymm2/m256","VUNPCKLPS ymm2/m256, ymmV, ymm1","vunpcklps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 14 /r","V","V","AVX","","w,r,r","",""
+"VUNPCKLPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VUNPCKLPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vunpcklps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VUNPCKLPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VUNPCKLPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vunpcklps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 14 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","",""
+"VXORPD xmm1, xmmV, xmm2/m128","VXORPD xmm2/m128, xmmV, xmm1","vxorpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 57 /r","V","V","AVX","","w,r,r","",""
+"VXORPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VXORPD xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vxorpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 57 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r","",""
+"VXORPD ymm1, ymmV, ymm2/m256","VXORPD ymm2/m256, ymmV, ymm1","vxorpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 57 /r","V","V","AVX","","w,r,r","",""
+"VXORPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VXORPD ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vxorpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 57 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r","",""
+"VXORPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VXORPD zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vxorpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 57 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",""
+"VXORPS xmm1, xmmV, xmm2/m128","VXORPS xmm2/m128, xmmV, xmm1","vxorps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 57 /r","V","V","AVX","","w,r,r","",""
+"VXORPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VXORPS xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vxorps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.0F.W0 57 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",""
+"VXORPS ymm1, ymmV, ymm2/m256","VXORPS ymm2/m256, ymmV, ymm1","vxorps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 57 /r","V","V","AVX","","w,r,r","",""
+"VXORPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VXORPS ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vxorps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.0F.W0 57 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",""
+"VXORPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VXORPS zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vxorps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.0F.W0 57 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","",""
+"VZEROALL","VZEROALL","vzeroall","VEX.256.0F.WIG 77","V","V","AVX","","","",""
+"VZEROUPPER","VZEROUPPER","vzeroupper","VEX.128.0F.WIG 77","V","V","AVX","","","",""
+"WAIT","WAIT","wait","9B","V","V","","pseudo","","",""
+"WBINVD","WBINVD","wbinvd","0F 09","V","V","486","","","",""
+"WRFSBASE rmr32","WRFSBASE rmr32","wrfsbase rmr32","F3 0F AE /2","N.S.","V","FSGSBASE","modrm_regonly,operand16,operand32","r","Y","32"
+"WRFSBASE rmr64","WRFSBASE rmr64","wrfsbase rmr64","F3 REX.W 0F AE /2","N.S.","V","FSGSBASE","modrm_regonly","r","Y","64"
+"WRGSBASE rmr32","WRGSBASE rmr32","wrgsbase rmr32","F3 0F AE /3","N.S.","V","FSGSBASE","modrm_regonly,operand16,operand32","r","Y","32"
+"WRGSBASE rmr64","WRGSBASE rmr64","wrgsbase rmr64","F3 REX.W 0F AE /3","N.S.","V","FSGSBASE","modrm_regonly","r","Y","64"
+"WRMSR","WRMSR","wrmsr","0F 30","V","V","Pentium","","","",""
+"WRPKRU","WRPKRU","wrpkru","0F 01 EF","V","V","PKU","","","",""
+"WRSSD m32, r32","WRSSD r32, m32","wrssd r32, m32","0F 38 F6 /r","V","V","CET","modrm_memonly,operand16,operand32","w,r","",""
+"WRSSQ m64, r64","WRSSQ r64, m64","wrssq r64, m64","REX.W 0F 38 F6 /r","N.S.","V","CET","modrm_memonly","w,r","",""
+"WRUSSD m32, r32","WRUSSD r32, m32","wrussd r32, m32","66 0F 38 F5 /r","V","V","CET","modrm_memonly,operand16,operand32","w,r","",""
+"WRUSSQ m64, r64","WRUSSQ r64, m64","wrussq r64, m64","66 REX.W 0F 38 F5 /r","N.S.","V","CET","modrm_memonly","w,r","",""
+"XABORT imm8u","XABORT imm8u","xabort imm8u","C6 F8 ib","V","V","RTM","modrm_regonly","r","",""
+"XACQUIRE","XACQUIRE","xacquire","F2","V","V","HLE","pseudo","","",""
+"XADD r/m8, r8","XADDB r8, r/m8","xaddb r8, r/m8","0F C0 /r","V","V","486","","rw,rw","Y","8"
+"XADD r/m8, r8","XADDB r8, r/m8","xaddb r8, r/m8","REX 0F C0 /r","N.E.","V","","pseudo64","rw,w","Y","8"
+"XADD r/m32, r32","XADDL r32, r/m32","xaddl r32, r/m32","0F C1 /r","V","V","486","operand32","rw,rw","Y","32"
+"XADD r/m64, r64","XADDQ r64, r/m64","xaddq r64, r/m64","REX.W 0F C1 /r","N.S.","V","486","","rw,rw","Y","64"
+"XADD r/m16, r16","XADDW r16, r/m16","xaddw r16, r/m16","0F C1 /r","V","V","486","operand16","rw,rw","Y","16"
+"XBEGIN rel16","XBEGIN rel16","xbegin rel16","C7 F8 cw","V","V","RTM","modrm_regonly,operand16","r","",""
+"XBEGIN rel32","XBEGIN rel32","xbegin rel32","C7 F8 cd","V","V","RTM","modrm_regonly,operand32,operand64","r","",""
+"XCHG r8, r/m8","XCHGB r/m8, r8","xchgb r/m8, r8","86 /r","V","V","","pseudo","w,r","Y","8"
+"XCHG r8, r/m8","XCHGB r/m8, r8","xchgb r/m8, r8","REX 86 /r","N.E.","V","","pseudo","w,r","Y","8"
+"XCHG r/m8, r8","XCHGB r8, r/m8","xchgb r8, r/m8","86 /r","V","V","","","rw,rw","Y","8"
+"XCHG r/m8, r8","XCHGB r8, r/m8","xchgb r8, r/m8","REX 86 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"XCHG r32op, EAX","XCHGL EAX, r32op","xchgl EAX, r32op","90+rd","V","V","","operand32","rw,rw","Y","32"
+"XCHG r32, r/m32","XCHGL r/m32, r32","xchgl r/m32, r32","87 /r","V","V","","operand32,pseudo","w,r","Y","32"
+"XCHG r/m32, r32","XCHGL r32, r/m32","xchgl r32, r/m32","87 /r","V","V","","operand32","rw,rw","Y","32"
+"XCHG EAX, r32op","XCHGL r32op, EAX","xchgl r32op, EAX","90+rd","V","V","","operand32,pseudo","rw,rw","Y","32"
+"XCHG r64op, RAX","XCHGQ RAX, r64op","xchgq RAX, r64op","REX.W 90+ro","N.S.","V","","","rw,rw","Y","64"
+"XCHG r64, r/m64","XCHGQ r/m64, r64","xchgq r/m64, r64","REX.W 87 /r","N.E.","V","","pseudo","w,r","Y","64"
+"XCHG r/m64, r64","XCHGQ r64, r/m64","xchgq r64, r/m64","REX.W 87 /r","N.S.","V","","","rw,rw","Y","64"
+"XCHG RAX, r64op","XCHGQ r64op, RAX","xchgq r64op, RAX","REX.W 90+rd","N.E.","V","","pseudo","rw,rw","Y","64"
+"XCHG r16op, AX","XCHGW AX, r16op","xchgw AX, r16op","90+rw","V","V","","operand16","rw,rw","Y","16"
+"XCHG r16, r/m16","XCHGW r/m16, r16","xchgw r/m16, r16","87 /r","V","V","","operand16,pseudo","w,r","Y","16"
+"XCHG r/m16, r16","XCHGW r16, r/m16","xchgw r16, r/m16","87 /r","V","V","","operand16","rw,rw","Y","16"
+"XCHG AX, r16op","XCHGW r16op, AX","xchgw r16op, AX","90+rw","V","V","","operand16,pseudo","rw,rw","Y","16"
+"XEND","XEND","xend","0F 01 D5","V","V","RTM","","","",""
+"XGETBV","XGETBV","xgetbv","0F 01 D0","V","V","XSAVE","","","",""
+"XLATB","XLAT","xlat","D7","V","V","","","","",""
+"XLATB","XLAT","xlat","REX.W D7","N.E.","V","","pseudo","","",""
+"XOR r/m8, imm8","XORB imm8, r/m8","xorb imm8, r/m8","REX 80 /6 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"XOR AL, imm8u","XORB imm8u, AL","xorb imm8u, AL","34 ib","V","V","","","rw,r","Y","8"
+"XOR r/m8, imm8u","XORB imm8u, r/m8","xorb imm8u, r/m8","80 /6 ib","V","V","","","rw,r","Y","8"
+"XOR r/m8, imm8u","XORB imm8u, r/m8","xorb imm8u, r/m8","82 /6 ib","V","N.S.","","","rw,r","Y","8"
+"XOR r8, r/m8","XORB r/m8, r8","xorb r/m8, r8","32 /r","V","V","","","rw,r","Y","8"
+"XOR r8, r/m8","XORB r/m8, r8","xorb r/m8, r8","REX 32 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"XOR r/m8, r8","XORB r8, r/m8","xorb r8, r/m8","30 /r","V","V","","","rw,r","Y","8"
+"XOR r/m8, r8","XORB r8, r/m8","xorb r8, r/m8","REX 30 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"XOR EAX, imm32","XORL imm32, EAX","xorl imm32, EAX","35 id","V","V","","operand32","rw,r","Y","32"
+"XOR r/m32, imm32","XORL imm32, r/m32","xorl imm32, r/m32","81 /6 id","V","V","","operand32","rw,r","Y","32"
+"XOR r/m32, imm8","XORL imm8, r/m32","xorl imm8, r/m32","83 /6 ib","V","V","","operand32","rw,r","Y","32"
+"XOR r32, r/m32","XORL r/m32, r32","xorl r/m32, r32","33 /r","V","V","","operand32","rw,r","Y","32"
+"XOR r/m32, r32","XORL r32, r/m32","xorl r32, r/m32","31 /r","V","V","","operand32","rw,r","Y","32"
+"XORPD xmm1, xmm2/m128","XORPD xmm2/m128, xmm1","xorpd xmm2/m128, xmm1","66 0F 57 /r","V","V","SSE2","","rw,r","",""
+"XORPS xmm1, xmm2/m128","XORPS xmm2/m128, xmm1","xorps xmm2/m128, xmm1","0F 57 /r","V","V","SSE","","rw,r","",""
+"XOR RAX, imm32","XORQ imm32, RAX","xorq imm32, RAX","REX.W 35 id","N.S.","V","","","rw,r","Y","64"
+"XOR r/m64, imm32","XORQ imm32, r/m64","xorq imm32, r/m64","REX.W 81 /6 id","N.S.","V","","","rw,r","Y","64"
+"XOR r/m64, imm8","XORQ imm8, r/m64","xorq imm8, r/m64","REX.W 83 /6 ib","N.S.","V","","","rw,r","Y","64"
+"XOR r64, r/m64","XORQ r/m64, r64","xorq r/m64, r64","REX.W 33 /r","N.S.","V","","","rw,r","Y","64"
+"XOR r/m64, r64","XORQ r64, r/m64","xorq r64, r/m64","REX.W 31 /r","N.S.","V","","","rw,r","Y","64"
+"XOR AX, imm16","XORW imm16, AX","xorw imm16, AX","35 iw","V","V","","operand16","rw,r","Y","16"
+"XOR r/m16, imm16","XORW imm16, r/m16","xorw imm16, r/m16","81 /6 iw","V","V","","operand16","rw,r","Y","16"
+"XOR r/m16, imm8","XORW imm8, r/m16","xorw imm8, r/m16","83 /6 ib","V","V","","operand16","rw,r","Y","16"
+"XOR r16, r/m16","XORW r/m16, r16","xorw r/m16, r16","33 /r","V","V","","operand16","rw,r","Y","16"
+"XOR r/m16, r16","XORW r16, r/m16","xorw r16, r/m16","31 /r","V","V","","operand16","rw,r","Y","16"
+"XRELEASE","XRELEASE","xrelease","F3","V","V","HLE","pseudo","","",""
+"XRSTOR mem","XRSTOR mem","xrstor mem","0F AE /5","V","V","XSAVE","modrm_memonly,operand16,operand32","r","",""
+"XRSTOR64 mem","XRSTOR64 mem","xrstor64 mem","REX.W 0F AE /5","N.S.","V","XSAVE","modrm_memonly","r","",""
+"XRSTORS mem","XRSTORS mem","xrstors mem","0F C7 /3","V","V","XSAVES","modrm_memonly,operand16,operand32","r","",""
+"XRSTORS64 mem","XRSTORS64 mem","xrstors64 mem","REX.W 0F C7 /3","N.S.","V","XSAVES","modrm_memonly","r","",""
+"XSAVE mem","XSAVE mem","xsave mem","0F AE /4","V","V","XSAVE","modrm_memonly,operand16,operand32","w","",""
+"XSAVE64 mem","XSAVE64 mem","xsave64 mem","REX.W 0F AE /4","N.S.","V","XSAVE","modrm_memonly","w","",""
+"XSAVEC mem","XSAVEC mem","xsavec mem","0F C7 /4","V","V","XSAVEC","modrm_memonly,operand16,operand32","w","",""
+"XSAVEC64 mem","XSAVEC64 mem","xsavec64 mem","REX.W 0F C7 /4","N.S.","V","XSAVEC","modrm_memonly","w","",""
+"XSAVEOPT mem","XSAVEOPT mem","xsaveopt mem","0F AE /6","V","V","XSAVEOPT","modrm_memonly,operand16,operand32","w","",""
+"XSAVEOPT64 mem","XSAVEOPT64 mem","xsaveopt64 mem","REX.W 0F AE /6","N.S.","V","XSAVEOPT","modrm_memonly","w","",""
+"XSAVES mem","XSAVES mem","xsaves mem","0F C7 /5","V","V","XSAVES","modrm_memonly,operand16,operand32","w","",""
+"XSAVES64 mem","XSAVES64 mem","xsaves64 mem","REX.W 0F C7 /5","N.S.","V","XSAVES","modrm_memonly","w","",""
+"XSETBV","XSETBV","xsetbv","0F 01 D1","V","V","XSAVE","","","",""
+"XTEST","XTEST","xtest","0F 01 D6","V","V","HLE or RTM","","","",""
-- 
2.36.0




* [PATCH v2 42/42] i386: Add sha512-avx test
  2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
                   ` (44 preceding siblings ...)
  2022-04-24 22:02 ` [PATCH v2 41/42] AVX tests Paul Brook
@ 2022-04-24 22:02 ` Paul Brook
  45 siblings, 0 replies; 67+ messages in thread
From: Paul Brook @ 2022-04-24 22:02 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here, Paul Brook

Include sha512 built with avx[2] in the tcg tests.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 tests/tcg/i386/Makefile.target | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
index eb06f7eb89..a0335fff6d 100644
--- a/tests/tcg/i386/Makefile.target
+++ b/tests/tcg/i386/Makefile.target
@@ -79,7 +79,14 @@ sha512-sse: sha512.c
 run-sha512-sse: QEMU_OPTS+=-cpu max
 run-plugin-sha512-sse-with-%: QEMU_OPTS+=-cpu max
 
-TESTS+=sha512-sse
+sha512-avx: CFLAGS=-mavx2 -mavx -O3
+sha512-avx: sha512.c
+	$(CC) $(CFLAGS) $(EXTRA_CFLAGS) $< -o $@ $(LDFLAGS)
+
+run-sha512-avx: QEMU_OPTS+=-cpu max
+run-plugin-sha512-avx-with-%: QEMU_OPTS+=-cpu max
+
+TESTS+=sha512-sse sha512-avx
 
 test-avx.h: test-avx.py x86.csv
 	$(PYTHON) $(I386_SRC)/test-avx.py $(I386_SRC)/x86.csv $@
-- 
2.36.0




* Re: [PATCH v2 01/42] i386: pcmpestr 64-bit sign extension bug
  2022-04-24 22:01 ` [PATCH v2 01/42] i386: pcmpestr 64-bit sign extension bug Paul Brook
@ 2022-04-25 15:50   ` Richard Henderson
  2022-04-27  7:00   ` Paolo Bonzini
  1 sibling, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2022-04-25 15:50 UTC (permalink / raw)
  To: Paul Brook, Paolo Bonzini, Eduardo Habkost; +Cc: open list:All patches CC here

On 4/24/22 15:01, Paul Brook wrote:
> The abs1 function in ops_sse.h only works correctly when the result fits
> in a signed int. This is fine most of the time because we're only dealing
> with byte sized values.
> 
> However pcmp_elen helper function uses abs1 to calculate the absolute value
> of a cpu register. This incorrectly truncates to 32 bits, and will give
> the wrong answer for the most negative value.
> 
> Fix by open coding the saturation check before taking the absolute value.
> 
> Signed-off-by: Paul Brook <paul@nowt.org>
> ---
>   target/i386/ops_sse.h | 20 +++++++++-----------
>   1 file changed, 9 insertions(+), 11 deletions(-)

This works, since the bound comes first, so
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

> +    if ((val > limit) || (val < -limit)) {
> +        return limit;
> +    }
> +    return abs1(val);

But you could also have used uabs64() for one fewer compare.
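
For illustration, a minimal standalone sketch of the unsigned-absolute-value approach. The names uabs64_sketch and clamp_len are hypothetical stand-ins written for this example; they are not the code in the patch, and the real helper reads the value from a CPU register rather than taking it as an argument.

```c
#include <stdint.h>

/*
 * Local stand-in for an unsigned 64-bit absolute value.
 * Negating in unsigned arithmetic is well defined even for INT64_MIN.
 */
static uint64_t uabs64_sketch(int64_t v)
{
    return v < 0 ? -(uint64_t)v : (uint64_t)v;
}

/*
 * Hypothetical reduction of the length computation: saturate |val|
 * to limit (8 or 16) using a single comparison on the magnitude.
 */
static int clamp_len(int64_t val, int limit)
{
    uint64_t mag = uabs64_sketch(val);

    return mag > (uint64_t)limit ? limit : (int)mag;
}
```

Because the magnitude is kept in 64 bits until after the comparison, a value such as 0x100000003 saturates to the limit instead of being truncated to 3.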


r~



* Re: [PATCH v2 02/42] i386: DPPS rounding fix
  2022-04-24 22:01 ` [PATCH v2 02/42] i386: DPPS rounding fix Paul Brook
@ 2022-04-25 16:09   ` Richard Henderson
  0 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2022-04-25 16:09 UTC (permalink / raw)
  To: Paul Brook, Paolo Bonzini, Eduardo Habkost; +Cc: open list:All patches CC here

On 4/24/22 15:01, Paul Brook wrote:
> The DPPS (Dot Product) instruction is defined to first sum pairs of
> intermediate results, then sum those values to get the final result.
> i.e. (A+B)+(C+D)
> 
> We incrementally sum the results, i.e. ((A+B)+C)+D, which can result
> in incorrect rounding.
> 
> For consistency, also remove the redundant (but harmless) add operation
> from DPPD
> 
> Signed-off-by: Paul Brook <paul@nowt.org>
> ---
>   target/i386/ops_sse.h | 47 +++++++++++++++++++++++--------------------
>   1 file changed, 25 insertions(+), 22 deletions(-)
> 
> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
> index 535440f882..a5a48a20f6 100644
> --- a/target/i386/ops_sse.h
> +++ b/target/i386/ops_sse.h
> @@ -1934,32 +1934,36 @@ SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP)
>   
>   void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
>   {
> -    float32 iresult = float32_zero;
> +    float32 prod, iresult, iresult2;
>   
> +    /*
> +     * We must evaluate (A+B)+(C+D), not ((A+B)+C)+D
> +     * to correctly round the intermediate results
> +     */
>       if (mask & (1 << 4)) {
> -        iresult = float32_add(iresult,
> -                              float32_mul(d->ZMM_S(0), s->ZMM_S(0),
> -                                          &env->sse_status),
> -                              &env->sse_status);
> +        iresult = float32_mul(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
> +    } else {
> +        iresult = float32_zero;
>       }
>       if (mask & (1 << 5)) {
> -        iresult = float32_add(iresult,
> -                              float32_mul(d->ZMM_S(1), s->ZMM_S(1),
> -                                          &env->sse_status),
> -                              &env->sse_status);
> +        prod = float32_mul(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
> +    } else {
> +        prod = float32_zero;
>       }
> +    iresult = float32_add(iresult, prod, &env->sse_status);
>       if (mask & (1 << 6)) {
> -        iresult = float32_add(iresult,
> -                              float32_mul(d->ZMM_S(2), s->ZMM_S(2),
> -                                          &env->sse_status),
> -                              &env->sse_status);
> +        iresult2 = float32_mul(d->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
> +    } else {
> +        iresult2 = float32_zero;
>       }
>       if (mask & (1 << 7)) {
> -        iresult = float32_add(iresult,
> -                              float32_mul(d->ZMM_S(3), s->ZMM_S(3),
> -                                          &env->sse_status),
> -                              &env->sse_status);
> +        prod = float32_mul(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
> +    } else {
> +        prod = float32_zero;
>       }
> +    iresult2 = float32_add(iresult2, prod, &env->sse_status);
> +    iresult = float32_add(iresult, iresult2, &env->sse_status);
> +
>       d->ZMM_S(0) = (mask & (1 << 0)) ? iresult : float32_zero;
>       d->ZMM_S(1) = (mask & (1 << 1)) ? iresult : float32_zero;
>       d->ZMM_S(2) = (mask & (1 << 2)) ? iresult : float32_zero;

Here I believe you're producing correct results, but reuse of variable names does not aid 
clarity.  Better written with prod[0-3], iresult[0-1], and result.
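
As a standalone illustration of why the summation order matters, the sketch below uses host float arithmetic rather than the softfloat calls in the patch, and assumes the default round-to-nearest-even mode. The function names are made up for this example.

```c
/*
 * Sequential sum ((a + b) + c) + d. The volatile temporary forces
 * each intermediate result to be rounded to float.
 */
static float sum_sequential(float a, float b, float c, float d)
{
    volatile float t = a + b;

    t = t + c;
    t = t + d;
    return t;
}

/* Pairwise sum (a + b) + (c + d), the order described for DPPS. */
static float sum_pairwise(float a, float b, float c, float d)
{
    volatile float lo = a + b;
    volatile float hi = c + d;
    volatile float r = lo + hi;

    return r;
}
```

With a = 16777216 (2^24), b = 0 and c = d = 1, each sequential step adds 1 to 2^24, which is a tie that rounds back to 2^24, whereas the pairwise form adds 2 exactly and yields 16777218.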

> @@ -1968,13 +1972,12 @@ void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
>   
>   void glue(helper_dppd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
>   {
> -    float64 iresult = float64_zero;
> +    float64 iresult;
>   
>       if (mask & (1 << 4)) {
> -        iresult = float64_add(iresult,
> -                              float64_mul(d->ZMM_D(0), s->ZMM_D(0),
> -                                          &env->sse_status),
> -                              &env->sse_status);
> +        iresult = float64_mul(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
> +    } else {
> +        iresult = float64_zero;
>       }
>       if (mask & (1 << 5)) {
>           iresult = float64_add(iresult,

This is incorrect.  By skipping the add if 1<<5 is not set, you can produce an incorrect 
result of -0 from the 1<<4 mul.
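
A small sketch of the signed-zero effect, again using host double arithmetic and assuming round-to-nearest; the function names are hypothetical. Adding +0 to a product of -0 gives +0, so dropping the add changes the sign of the result.

```c
#include <math.h>

/* Product only: the "skip the add" path. */
static double dot_lane_skip_add(double a, double b)
{
    return a * b;
}

/* Product followed by an add of +0, as in the original accumulation. */
static double dot_lane_with_add(double a, double b)
{
    volatile double prod = a * b;

    return prod + 0.0;
}
```

For a = -1.0 and b = 0.0 the first function returns -0 and the second returns +0, which can be observed with signbit().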


r~



* Re: [PATCH v2 03/42] Add AVX_EN hflag
  2022-04-24 22:01 ` [PATCH v2 03/42] Add AVX_EN hflag Paul Brook
@ 2022-04-25 17:27   ` Richard Henderson
  0 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2022-04-25 17:27 UTC (permalink / raw)
  To: Paul Brook, Paolo Bonzini, Eduardo Habkost; +Cc: open list:All patches CC here

On 4/24/22 15:01, Paul Brook wrote:
> Add a new hflag bit to determine whether AVX instructions are allowed
> 
> Signed-off-by: Paul Brook <paul@nowt.org>
> ---
>   target/i386/cpu.h            |  3 +++
>   target/i386/helper.c         | 12 ++++++++++++
>   target/i386/tcg/fpu_helper.c |  1 +
>   3 files changed, 16 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH v2 06/42] i386: Add CHECK_NO_VEX
  2022-04-24 22:01 ` [PATCH v2 06/42] i386: Add CHECK_NO_VEX Paul Brook
@ 2022-04-25 20:39   ` Richard Henderson
  2022-04-25 20:41   ` Richard Henderson
  1 sibling, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2022-04-25 20:39 UTC (permalink / raw)
  To: Paul Brook, Paolo Bonzini, Eduardo Habkost; +Cc: open list:All patches CC here

On 4/24/22 15:01, Paul Brook wrote:
> Reject invalid VEX encodings on MMX instructions.
> 
> Signed-off-by: Paul Brook <paul@nowt.org>
> ---
>   target/i386/tcg/translate.c | 26 ++++++++++++++++++++++++++
>   1 file changed, 26 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH v2 06/42] i386: Add CHECK_NO_VEX
  2022-04-24 22:01 ` [PATCH v2 06/42] i386: Add CHECK_NO_VEX Paul Brook
  2022-04-25 20:39   ` Richard Henderson
@ 2022-04-25 20:41   ` Richard Henderson
  1 sibling, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2022-04-25 20:41 UTC (permalink / raw)
  To: Paul Brook, Paolo Bonzini, Eduardo Habkost; +Cc: open list:All patches CC here

On 4/24/22 15:01, Paul Brook wrote:
> +/* VEX prefix not allowed */
> +#define CHECK_NO_VEX(s) do { \
> +    if (s->prefix & PREFIX_VEX) \
> +        goto illegal_op; \
> +    } while (0)

Make the do/while align, and add the required braces for the if, per coding style.
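
A sketch of what the requested layout might look like, wrapped in a tiny harness so it compiles on its own. DisasContextSketch, decode_mmx_sketch and the PREFIX_VEX value are assumptions for this example only; in the real translator the macro runs inside a function that already has an illegal_op label.

```c
#include <stdbool.h>

#define PREFIX_VEX 0x20 /* assumed value, for the sketch only */

/* Hypothetical stand-in for the decoder context. */
typedef struct {
    int prefix;
} DisasContextSketch;

/* VEX prefix not allowed: do/while aligned, if body braced. */
#define CHECK_NO_VEX(s)                     \
    do {                                    \
        if ((s)->prefix & PREFIX_VEX) {     \
            goto illegal_op;                \
        }                                   \
    } while (0)

/* Returns false when the encoding is rejected. */
static bool decode_mmx_sketch(DisasContextSketch *s)
{
    CHECK_NO_VEX(s);
    return true;

illegal_op:
    return false;
}
```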


r~



* Re: [PATCH v2 07/42] Enforce VEX encoding restrictions
  2022-04-24 22:01 ` [PATCH v2 07/42] Enforce VEX encoding restrictions Paul Brook
@ 2022-04-25 20:42   ` Richard Henderson
  2022-04-25 21:00   ` Richard Henderson
  2022-04-27  9:08   ` Paolo Bonzini
  2 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2022-04-25 20:42 UTC (permalink / raw)
  To: Paul Brook, Paolo Bonzini, Eduardo Habkost; +Cc: open list:All patches CC here

On 4/24/22 15:01, Paul Brook wrote:
> +/*
> + * VEX encodings require AVX
> + * Allow legacy SSE encodings even if AVX not enabled
> + */
> +#define CHECK_AVX(s) do { \
> +    if ((s->prefix & PREFIX_VEX) \
> +        && !(env->hflags & HF_AVX_EN_MASK)) \
> +        goto illegal_op; \
> +    } while (0)

Likewise, fix coding style.


r~



* Re: [PATCH v2 07/42] Enforce VEX encoding restrictions
  2022-04-24 22:01 ` [PATCH v2 07/42] Enforce VEX encoding restrictions Paul Brook
  2022-04-25 20:42   ` Richard Henderson
@ 2022-04-25 21:00   ` Richard Henderson
  2022-04-27  9:08   ` Paolo Bonzini
  2 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2022-04-25 21:00 UTC (permalink / raw)
  To: Paul Brook, Paolo Bonzini, Eduardo Habkost; +Cc: open list:All patches CC here

On 4/24/22 15:01, Paul Brook wrote:
> Add CHECK_AVX* macros, and use them to validate VEX encoded AVX instructions
> 
> All AVX instructions require both CPU and OS support, this is encapsulated
> by HF_AVX_EN.
> 
> Some also require specific values in the VEX.L and VEX.V fields.
> Some (mostly integer operations) also require AVX2
> 
> Signed-off-by: Paul Brook <paul@nowt.org>
> ---
>   target/i386/tcg/translate.c | 159 +++++++++++++++++++++++++++++++++---
>   1 file changed, 149 insertions(+), 10 deletions(-)
> 
> diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
> index 66ba690b7d..2f5cc24e0c 100644
> --- a/target/i386/tcg/translate.c
> +++ b/target/i386/tcg/translate.c
> @@ -3185,10 +3185,54 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
>           goto illegal_op; \
>       } while (0)
>   
> +/*
> + * VEX encodings require AVX
> + * Allow legacy SSE encodings even if AVX not enabled
> + */
> +#define CHECK_AVX(s) do { \
> +    if ((s->prefix & PREFIX_VEX) \
> +        && !(env->hflags & HF_AVX_EN_MASK)) \
> +        goto illegal_op; \
> +    } while (0)
> +
> +/* If a VEX prefix is used then it must have V=1111b */
> +#define CHECK_AVX_V0(s) do { \
> +    CHECK_AVX(s); \
> +    if ((s->prefix & PREFIX_VEX) && (s->vex_v != 0)) \
> +        goto illegal_op; \
> +    } while (0)
> +
> +/* If a VEX prefix is used then it must have L=0 */
> +#define CHECK_AVX_128(s) do { \
> +    CHECK_AVX(s); \
> +    if ((s->prefix & PREFIX_VEX) && (s->vex_l != 0)) \
> +        goto illegal_op; \
> +    } while (0)
> +
> +/* If a VEX prefix is used then it must have V=1111b and L=0 */
> +#define CHECK_AVX_V0_128(s) do { \
> +    CHECK_AVX(s); \
> +    if ((s->prefix & PREFIX_VEX) && (s->vex_v != 0 || s->vex_l != 0)) \
> +        goto illegal_op; \
> +    } while (0)

These predicates have some overlap, but awkwardly.  It leaves you with cases like

> +                if (op6.flags & SSE_OPF_V0) {
> +                    CHECK_AVX_V0(s);
> +                } else {
> +                    CHECK_AVX(s);
> +                }

this, where clearly the CHECK_AVX is common across the IF, and would be better written as

     CHECK_AVX(s);
     if (flags & SSE_OPF_V0) {
         CHECK_V0(s);
     }

> +            CHECK_AVX(s);
> +            scalar_op = (s->prefix & PREFIX_VEX)
> +                && (op7.flags & SSE_OPF_SCALAR)
> +                && !(op7.flags & SSE_OPF_CMP);
> +            if (is_xmm && (op7.flags & SSE_OPF_MMX)) {
> +                CHECK_AVX2_256(s);
> +            }
> +            if (op7.flags & SSE_OPF_AVX2) {
> +                CHECK_AVX2(s);
> +            }
> +            if ((op7.flags & SSE_OPF_V0) && !scalar_op) {
> +                CHECK_AVX_V0(s);
> +            }

And these.  Also, it would appear as if there's overlap between the AVX2 checks.  Is this 
clearer as

     CHECK_AVX(s);
     if (v0 && !scalar) {
        CHECK_V0(s);
     }
     if ((flags & AVX2) || ((flags & MMX) && s->vex_l)) {
         CHECK_AVX2(s);
     }

and perhaps these could be broken out into helpers, so that

>           if (is_xmm) {
> +            scalar_op = (s->prefix & PREFIX_VEX)
> +                && (sse_op.flags & SSE_OPF_SCALAR)
> +                && !(sse_op.flags & SSE_OPF_CMP)
> +                && (b1 == 2 || b1 == 3);
> +            /* VEX encoded scalar ops always have 3 operands! */
> +            if ((sse_op.flags & SSE_OPF_V0) && !scalar_op) {
> +                CHECK_AVX_V0(s);
> +            } else {
> +                CHECK_AVX(s);
> +            }
> +            if (sse_op.flags & SSE_OPF_MMX) {
> +                CHECK_AVX2_256(s);
> +            }

... you don't have to keep repeating stuff.  This is where a better decoder could really help.


r~



* Re: [PATCH v2 08/42] i386: Add ZMM_OFFSET macro
  2022-04-24 22:01 ` [PATCH v2 08/42] i386: Add ZMM_OFFSET macro Paul Brook
@ 2022-04-25 21:03   ` Richard Henderson
  0 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2022-04-25 21:03 UTC (permalink / raw)
  To: Paul Brook, Paolo Bonzini, Eduardo Habkost; +Cc: open list:All patches CC here

On 4/24/22 15:01, Paul Brook wrote:
> Add a convenience macro to get the address of an xmm_regs element within
> CPUX86State.
> 
> This was originally going to be the basis of an implementation that broke
> operations into 128 bit chunks. I scrapped that idea, so this is now a purely
> cosmetic change. But I think a worthwhile one - it reduces the number of
> function calls that need to be split over multiple lines.
> 
> No functional changes.
> 
> Signed-off-by: Paul Brook <paul@nowt.org>
> ---
>   target/i386/tcg/translate.c | 60 +++++++++++++++++--------------------
>   1 file changed, 27 insertions(+), 33 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH v2 10/42] i386: Rewrite vector shift helper
  2022-04-24 22:01 ` [PATCH v2 10/42] i386: Rewrite vector shift helper Paul Brook
@ 2022-04-25 21:33   ` Richard Henderson
  2022-04-27  6:51     ` Paolo Bonzini
  0 siblings, 1 reply; 67+ messages in thread
From: Richard Henderson @ 2022-04-25 21:33 UTC (permalink / raw)
  To: Paul Brook, Paolo Bonzini, Eduardo Habkost; +Cc: open list:All patches CC here

On 4/24/22 15:01, Paul Brook wrote:
> Rewrite the vector shift helpers in preparation for AVX support (3 operand
> form and 256 bit vectors).
> 
> For now keep the existing two operand interface.
> 
> No functional changes to existing helpers.
> 
> Signed-off-by: Paul Brook <paul@nowt.org>
> ---
>   target/i386/ops_sse.h | 250 ++++++++++++++++++++++--------------------
>   1 file changed, 133 insertions(+), 117 deletions(-)
> 
> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
> index 23daab6b50..9297c96d04 100644
> --- a/target/i386/ops_sse.h
> +++ b/target/i386/ops_sse.h
> @@ -63,199 +63,215 @@
>   #define MOVE(d, r) memcpy(&(d).B(0), &(r).B(0), SIZE)
>   #endif
>   
> -void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
> +#if SHIFT == 0
> +#define SHIFT_HELPER_BODY(n, elem, F) do {      \
> +    d->elem(0) = F(s->elem(0), shift);          \
> +    if ((n) > 1) {                              \
> +        d->elem(1) = F(s->elem(1), shift);      \
> +    }                                           \
> +    if ((n) > 2) {                              \
> +        d->elem(2) = F(s->elem(2), shift);      \
> +        d->elem(3) = F(s->elem(3), shift);      \
> +    }                                           \
> +    if ((n) > 4) {                              \
> +        d->elem(4) = F(s->elem(4), shift);      \
> +        d->elem(5) = F(s->elem(5), shift);      \
> +        d->elem(6) = F(s->elem(6), shift);      \
> +        d->elem(7) = F(s->elem(7), shift);      \
> +    }                                           \
> +    if ((n) > 8) {                              \
> +        d->elem(8) = F(s->elem(8), shift);      \
> +        d->elem(9) = F(s->elem(9), shift);      \
> +        d->elem(10) = F(s->elem(10), shift);    \
> +        d->elem(11) = F(s->elem(11), shift);    \
> +        d->elem(12) = F(s->elem(12), shift);    \
> +        d->elem(13) = F(s->elem(13), shift);    \
> +        d->elem(14) = F(s->elem(14), shift);    \
> +        d->elem(15) = F(s->elem(15), shift);    \
> +    }                                           \
> +    } while (0)
> +
> +#define FPSRL(x, c) ((x) >> shift)
> +#define FPSRAW(x, c) ((int16_t)(x) >> shift)
> +#define FPSRAL(x, c) ((int32_t)(x) >> shift)
> +#define FPSLL(x, c) ((x) << shift)
> +#endif
> +
> +void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
>   {
> +    Reg *s = d;
>       int shift;
> -
> -    if (s->Q(0) > 15) {
> +    if (c->Q(0) > 15) {
>           d->Q(0) = 0;
> -#if SHIFT == 1
> -        d->Q(1) = 0;
> -#endif
> +        XMM_ONLY(d->Q(1) = 0;)
> +        YMM_ONLY(
> +                d->Q(2) = 0;
> +                d->Q(3) = 0;
> +                )
>       } else {
> -        shift = s->B(0);
> -        d->W(0) >>= shift;
> -        d->W(1) >>= shift;
> -        d->W(2) >>= shift;
> -        d->W(3) >>= shift;
> -#if SHIFT == 1
> -        d->W(4) >>= shift;
> -        d->W(5) >>= shift;
> -        d->W(6) >>= shift;
> -        d->W(7) >>= shift;
> -#endif
> +        shift = c->B(0);
> +        SHIFT_HELPER_BODY(4 << SHIFT, W, FPSRL);
>       }

I do not think it worthwhile to unroll these loops by hand.
If we're that keen on it, it should be written

#pragma GCC unroll 4 << SHIFT
     for (i = 0; i < 4 << SHIFT; ++i) {
         something
     }

However, I would much rather you rework the users to use tcg_gen_gvec_3.  Note that you 
can't use tcg_gen_gvec_shls directly because of the shift-overflow-to-zero behaviour. 
There are examples in target/arm/translate.c, though of course the arm shift semantics are 
different, so it's not cut-and-paste.
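
To make the shift-overflow-to-zero behaviour concrete, here is a plain C loop over 16-bit lanes. It is only a model of the semantics, not the tcg_gen_gvec_3 based implementation suggested above; psrlw_sketch and its flat-array signature are assumptions for this example.

```c
#include <stdint.h>

/*
 * Logical right shift of n 16-bit lanes by a 64-bit count.
 * A count above 15 clears every lane instead of being masked,
 * which is why a generic masked vector shift cannot be used directly.
 * d and s may alias.
 */
static void psrlw_sketch(uint16_t *d, const uint16_t *s, int n,
                         uint64_t count)
{
    for (int i = 0; i < n; i++) {
        d[i] = count > 15 ? 0 : (uint16_t)(s[i] >> count);
    }
}
```

Passing n = 4, 8 or 16 covers the MMX, XMM and YMM widths without unrolling by hand.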



r~



* Re: [PATCH v2 10/42] i386: Rewrite vector shift helper
  2022-04-25 21:33   ` Richard Henderson
@ 2022-04-27  6:51     ` Paolo Bonzini
  0 siblings, 0 replies; 67+ messages in thread
From: Paolo Bonzini @ 2022-04-27  6:51 UTC (permalink / raw)
  To: Richard Henderson, Paul Brook, Eduardo Habkost
  Cc: open list:All patches CC here

On 4/25/22 23:33, Richard Henderson wrote:
> I do not think it worthwhile to unroll these loops by hand.

Totally agree, as it would also remove most of the uses of 
XMM_ONLY/YMM_ONLY.

I also saw GCC -Warray-bounds complain about

	if (SHIFT >= 1) {
		d->elem[8] = s->elem[8];
	}

though this should probably be treated as a GCC bug.

Paolo

> If we're that keen on it, it should be written
> 
> #pragma GCC unroll 4 << SHIFT
>      for (i = 0; i < 4 << SHIFT; ++i) {
>          something
>      }




* Re: [PATCH v2 13/42] i386: Destructive vector helpers for AVX
  2022-04-24 22:01 ` [PATCH v2 13/42] i386: Destructive vector helpers for AVX Paul Brook
@ 2022-04-27  6:53   ` Paolo Bonzini
  0 siblings, 0 replies; 67+ messages in thread
From: Paolo Bonzini @ 2022-04-27  6:53 UTC (permalink / raw)
  To: Paul Brook, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here

On 4/25/22 00:01, Paul Brook wrote:
> +#define SHUFFLE4(F, a, b, offset) do {      \
> +    r0 = a->F((order & 3) + offset);        \
> +    r1 = a->F(((order >> 2) & 3) + offset); \
> +    r2 = b->F(((order >> 4) & 3) + offset); \
> +    r3 = b->F(((order >> 6) & 3) + offset); \
> +    d->F(offset) = r0;                      \
> +    d->F(offset + 1) = r1;                  \
> +    d->F(offset + 2) = r2;                  \
> +    d->F(offset + 3) = r3;                  \
> +    } while (0)
> +
>   #if SHIFT == 0
>   void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int order)
>   {
> -    Reg r;
> +    uint16_t r0, r1, r2, r3;
>   
> -    r.W(0) = s->W(order & 3);
> -    r.W(1) = s->W((order >> 2) & 3);
> -    r.W(2) = s->W((order >> 4) & 3);
> -    r.W(3) = s->W((order >> 6) & 3);
> -    MOVE(*d, r);
> +    SHUFFLE4(W, s, s, 0);

I am not particularly attached to the MOVE macro, but replacing the Reg 
variable with scalars seems worse.
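
For comparison, a sketch of the shuffle that keeps a temporary register-shaped variable, so that the destination may alias the source. MMXRegSketch and pshufw_sketch are hypothetical names for this example, and a plain struct assignment stands in for the MOVE macro.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit MMX register viewed as 4 words. */
typedef struct {
    uint16_t w[4];
} MMXRegSketch;

/*
 * Each 2-bit field of order selects the source word for one
 * destination word. The result is built in a temporary first,
 * then copied, so d == s is safe.
 */
static void pshufw_sketch(MMXRegSketch *d, const MMXRegSketch *s, int order)
{
    MMXRegSketch r;

    for (int i = 0; i < 4; i++) {
        r.w[i] = s->w[(order >> (2 * i)) & 3];
    }
    *d = r;
}
```

With order = 0x1b the four words are reversed, including when the same register is passed as source and destination.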

Paolo



* Re: [PATCH v2 01/42] i386: pcmpestr 64-bit sign extension bug
  2022-04-24 22:01 ` [PATCH v2 01/42] i386: pcmpestr 64-bit sign extension bug Paul Brook
  2022-04-25 15:50   ` Richard Henderson
@ 2022-04-27  7:00   ` Paolo Bonzini
  1 sibling, 0 replies; 67+ messages in thread
From: Paolo Bonzini @ 2022-04-27  7:00 UTC (permalink / raw)
  To: Paul Brook, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here

On 4/25/22 00:01, Paul Brook wrote:
> The abs1 function in ops_sse.h only works correctly when the result fits
> in a signed int. This is fine most of the time because we're only dealing
> with byte sized values.
> 
> However pcmp_elen helper function uses abs1 to calculate the absolute value
> of a cpu register. This incorrectly truncates to 32 bits, and will give
> the wrong answer for the most negative value.
> 
> Fix by open coding the saturation check before taking the absolute value.
> 
> Signed-off-by: Paul Brook <paul@nowt.org>

Queued, thanks.

Paolo

> ---
>   target/i386/ops_sse.h | 20 +++++++++-----------
>   1 file changed, 9 insertions(+), 11 deletions(-)
> 
> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
> index e4d74b814a..535440f882 100644
> --- a/target/i386/ops_sse.h
> +++ b/target/i386/ops_sse.h
> @@ -2011,25 +2011,23 @@ SSE_HELPER_Q(helper_pcmpgtq, FCMPGTQ)
>   
>   static inline int pcmp_elen(CPUX86State *env, int reg, uint32_t ctrl)
>   {
> -    int val;
> +    target_long val, limit;
>   
>       /* Presence of REX.W is indicated by a bit higher than 7 set */
>       if (ctrl >> 8) {
> -        val = abs1((int64_t)env->regs[reg]);
> +        val = (target_long)env->regs[reg];
>       } else {
> -        val = abs1((int32_t)env->regs[reg]);
> +        val = (int32_t)env->regs[reg];
>       }
> -
>       if (ctrl & 1) {
> -        if (val > 8) {
> -            return 8;
> -        }
> +        limit = 8;
>       } else {
> -        if (val > 16) {
> -            return 16;
> -        }
> +        limit = 16;
>       }
> -    return val;
> +    if ((val > limit) || (val < -limit)) {
> +        return limit;
> +    }
> +    return abs1(val);
>   }
>   
>   static inline int pcmp_ilen(Reg *r, uint8_t ctrl)




* Re: [PATCH v2 07/42] Enforce VEX encoding restrictions
  2022-04-24 22:01 ` [PATCH v2 07/42] Enforce VEX encoding restrictions Paul Brook
  2022-04-25 20:42   ` Richard Henderson
  2022-04-25 21:00   ` Richard Henderson
@ 2022-04-27  9:08   ` Paolo Bonzini
  2 siblings, 0 replies; 67+ messages in thread
From: Paolo Bonzini @ 2022-04-27  9:08 UTC (permalink / raw)
  To: Paul Brook, Richard Henderson, Eduardo Habkost
  Cc: open list:All patches CC here

On 4/25/22 00:01, Paul Brook wrote:
> +/* If a VEX prefix is used then it must have V=1111b */
> +#define CHECK_AVX_V0(s) do { \
> +    CHECK_AVX(s); \
> +    if ((s->prefix & PREFIX_VEX) && (s->vex_v != 0)) \
> +        goto illegal_op; \
> +    } while (0)
> +

What do you think about

#define CHECK_AVX(s, flags) \
     do { \
         if ((s->prefix & PREFIX_VEX) && !(env->hflags & HF_AVX_EN_MASK)) { \
             goto illegal_op; \
         } \
         if ((flags) & SSE_OPF_AVX2) { \
             CHECK_AVX2(s); \
         } \
         if ((flags) & SSE_OPF_AVX_128) { \
             CHECK_AVX_128(s); \
         } \
         if ((flags) & SSE_OPF_V0) { \
             CHECK_V0(s); \
         } \
     } while (0)

Macros such as CHECK_AVX_V0_128(s) would become CHECK_AVX(s, SSE_OPF_V0 
| SSE_OPF_AVX_128); a bit longer but still bearable.  And here you would 
have:

>           case 0x210: /* movss xmm, ea */
>               if (mod != 3) {
> +                CHECK_AVX_V0_128(s);
>                   gen_lea_modrm(env, s, modrm);
>                   gen_op_ld_v(s, MO_32, s->T0, s->A0);
>                   tcg_gen_st32_tl(s->T0, cpu_env,
> @@ -3379,6 +3432,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
>                   tcg_gen_st32_tl(s->T0, cpu_env,
>                                   offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)));
>               } else {
> +                CHECK_AVX_128(s);

     CHECK_AVX(s, SSE_OPF_AVX_128);
     if (mod != 3) {
         CHECK_V0(s);
         ...
     } else {
         ...
     }

Another possibility is to add SSE_OPF_V0_MEM (i.e. V0 if mod != 3), and use

     CHECK_AVX(s, SSE_OPF_AVX_128 | SSE_OPF_V0_MEM);


It's okay not to move _all_ flags checks in the macros, but for example 
here:

> +            if (op6.ext_mask == CPUID_EXT_AVX
> +                    && (s->prefix & PREFIX_VEX) == 0) {
> +                goto illegal_op;
> +            }
> +            if (op6.flags & SSE_OPF_AVX2) {
> +                CHECK_AVX2(s);
> +            }
> +
>               if (b1) {
> +                if (op6.flags & SSE_OPF_V0) {
> +                    CHECK_AVX_V0(s);
> +                } else {
> +                    CHECK_AVX(s);
> +                }
>                   op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
> +
> +                if (op6.flags & SSE_OPF_MMX) {
> +                    CHECK_AVX2_256(s);
> +                }

there is a lot of room for using a flags-extended CHECK_AVX macro.
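
One way to picture a flags-driven check is as a predicate function, sketched below. Everything here is an assumption for illustration: the DecodeSketch fields, the OPF_* bit positions and the PREFIX_VEX value are invented, avx_enabled stands in for the HF_AVX_EN hflag, and has_avx2 is a guess at what CHECK_AVX2 tests. As the quoted macros imply, vex_v == 0 is taken to mean the encoded V field is 1111b.

```c
#include <stdbool.h>
#include <stdint.h>

enum { PREFIX_VEX = 0x20 }; /* assumed value */

enum {                      /* hypothetical bit positions */
    OPF_V0      = 1 << 0,   /* VEX.V must be unused */
    OPF_AVX_128 = 1 << 1,   /* VEX.L must be 0 */
    OPF_AVX2    = 1 << 2,   /* needs AVX2 */
};

typedef struct {
    int prefix;
    int vex_l;
    int vex_v;
    bool avx_enabled;       /* stands in for HF_AVX_EN_MASK */
    bool has_avx2;          /* stands in for the CPUID AVX2 bit */
} DecodeSketch;

/* Returns true if the encoding is acceptable for the given op flags. */
static bool check_avx(const DecodeSketch *s, uint32_t flags)
{
    if (!(s->prefix & PREFIX_VEX)) {
        return true;        /* legacy SSE encoding: no VEX constraints */
    }
    if (!s->avx_enabled) {
        return false;
    }
    if ((flags & OPF_AVX2) && !s->has_avx2) {
        return false;
    }
    if ((flags & OPF_AVX_128) && s->vex_l != 0) {
        return false;
    }
    if ((flags & OPF_V0) && s->vex_v != 0) {
        return false;
    }
    return true;
}
```

A caller would then write a single test such as check_avx(s, op.flags) and branch to the illegal-opcode path on failure, rather than choosing between several CHECK_AVX_* variants.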


Also, SSE_OPF_V0 seems overloaded, because it means depending on the 
place in the code:

- always 2-operand

- 2-operand except if SCALAR && !CMP

- 2-operand except if SCALAR && !CMP && has REPZ/REPNZ prefixes

It is not clear to me if the former overlaps with the last (i.e. if 
there are any SCALAR && !CMP operations that are always 2-operand). If 
so, please use different constants for all three; if not, please use a 
different constant for the last, e.g. SSE_OPF_V0 and SSE_OPF_VEC_V0, so 
that the difference is visible in the flags-extended CHECK_AVX macro.

Also related to overloading, here and in patch 37 there is code like this:

> +            if (op7.flags & SSE_OPF_BLENDV && !(s->prefix & PREFIX_VEX)) {
> +                /* Only VEX encodings are valid for these blendv opcodes */
> +                goto illegal_op;
> +            }

If this is for all SSE_OPF_BLENDV operations, it can be handled in the 
flags-extended CHECK_AVX macro above.  If it is only for some, it 
should be a new flag SSE_OPF_VEX_ONLY.
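
A dedicated flag keeps the check trivial.  A minimal sketch, with made-up
bit values and a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define SSE_OPF_VEX_ONLY  (1u << 0)  /* opcode has no legacy SSE encoding */
#define PREFIX_VEX        (1u << 0)

/* Returns false when a VEX-only opcode is seen without a VEX prefix. */
static bool encoding_allowed(uint32_t flags, uint32_t prefix)
{
    if ((flags & SSE_OPF_VEX_ONLY) && !(prefix & PREFIX_VEX)) {
        return false;
    }
    return true;
}
```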

Finally (replying here just to keep things together), patch 29 has "We 
abuse the SSE_OPF_SCALAR flag to select the memory operand width 
appropriately".  Please don't; use a separate function that takes in "b" 
and returns a bool, with just a switch statement in it.
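
Something like the following is what I have in mind.  This is only a sketch:
the function name is hypothetical, and I am assuming that the low byte of
"b" is the opcode byte in the 0F 38 map and that the bool should mean "the
memory operand is a single element rather than a full 128-bit vector".

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if the broadcast opcode loads a single
 * element from memory, false if it loads 128 bits (or is not a broadcast).
 */
static bool vbroadcast_mem_is_element(int b)
{
    switch (b & 0xff) {
    case 0x18:  /* VBROADCASTSS */
    case 0x19:  /* VBROADCASTSD */
    case 0x58:  /* VPBROADCASTD */
    case 0x59:  /* VPBROADCASTQ */
    case 0x78:  /* VPBROADCASTB */
    case 0x79:  /* VPBROADCASTW */
        return true;
    case 0x1a:  /* VBROADCASTF128 */
    case 0x5a:  /* VBROADCASTI128 */
        return false;
    default:
        return false;
    }
}
```

This way SSE_OPF_SCALAR keeps a single meaning and the operand width
decision is visible in one place.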

> +            CHECK_AVX(s);
> +            scalar_op = (s->prefix & PREFIX_VEX)
> +                && (op7.flags & SSE_OPF_SCALAR)
> +                && !(op7.flags & SSE_OPF_CMP);
> +            if (is_xmm && (op7.flags & SSE_OPF_MMX)) {
> +                CHECK_AVX2_256(s);
> +            }

I think the is_xmm check is always true here (inside case 0x03a: case 
0x13a:, i.e. b is inside the 0x10..0x5f range)?

Paolo



end of thread, other threads:[~2022-04-27  9:14 UTC | newest]

Thread overview: 67+ messages
2022-04-18 17:39 [PATCH 0/3] AVX guest implementation Paul Brook
2022-04-18 17:39 ` [PATCH 1/4] Add AVX_EN hflag Paul Brook
2022-04-18 17:39 ` [PATCH 2/4] TCG support for AVX Paul Brook
2022-04-18 19:33   ` Peter Maydell
2022-04-18 19:45     ` Paul Brook
2022-04-18 19:50       ` Peter Maydell
2022-04-18 23:14       ` Richard Henderson
2022-04-20 14:19       ` Paolo Bonzini
2022-04-20 18:59         ` Paul Brook
2022-04-18 17:39 ` [PATCH 3/4] Enable all x86-64 cpu features in user mode Paul Brook
2022-04-18 17:39 ` [PATCH 4/4] AVX tests Paul Brook
2022-04-19 10:34   ` Alex Bennée
2022-04-24 22:01 ` [PATCH v2 01/42] i386: pcmpestr 64-bit sign extension bug Paul Brook
2022-04-25 15:50   ` Richard Henderson
2022-04-27  7:00   ` Paolo Bonzini
2022-04-24 22:01 ` [PATCH v2 02/42] i386: DPPS rounding fix Paul Brook
2022-04-25 16:09   ` Richard Henderson
2022-04-24 22:01 ` [PATCH v2 03/42] Add AVX_EN hflag Paul Brook
2022-04-25 17:27   ` Richard Henderson
2022-04-24 22:01 ` [PATCH v2 04/42] i386: Rework sse_op_table1 Paul Brook
2022-04-24 22:01 ` [PATCH v2 05/42] i386: Rework sse_op_table6/7 Paul Brook
2022-04-24 22:01 ` [PATCH v2 06/42] i386: Add CHECK_NO_VEX Paul Brook
2022-04-25 20:39   ` Richard Henderson
2022-04-25 20:41   ` Richard Henderson
2022-04-24 22:01 ` [PATCH v2 07/42] Enforce VEX encoding restrictions Paul Brook
2022-04-25 20:42   ` Richard Henderson
2022-04-25 21:00   ` Richard Henderson
2022-04-27  9:08   ` Paolo Bonzini
2022-04-24 22:01 ` [PATCH v2 08/42] i386: Add ZMM_OFFSET macro Paul Brook
2022-04-25 21:03   ` Richard Henderson
2022-04-24 22:01 ` [PATCH v2 09/42] i386: Helper macro for 256 bit AVX helpers Paul Brook
2022-04-24 22:01 ` [PATCH v2 10/42] i386: Rewrite vector shift helper Paul Brook
2022-04-25 21:33   ` Richard Henderson
2022-04-27  6:51     ` Paolo Bonzini
2022-04-24 22:01 ` [PATCH v2 11/42] i386: Rewrite simple integer vector helpers Paul Brook
2022-04-24 22:01 ` [PATCH v2 12/42] i386: Misc integer AVX helper prep Paul Brook
2022-04-24 22:01 ` [PATCH v2 13/42] i386: Destructive vector helpers for AVX Paul Brook
2022-04-27  6:53   ` Paolo Bonzini
2022-04-24 22:01 ` [PATCH v2 14/42] i386: Add size suffix to vector FP helpers Paul Brook
2022-04-24 22:01 ` [PATCH v2 15/42] i386: Floating point atithmetic helper AVX prep Paul Brook
2022-04-24 22:01 ` [PATCH v2 16/42] i386: Dot product AVX helper prep Paul Brook
2022-04-24 22:01 ` [PATCH v2 17/42] i386: Destructive FP helpers for AVX Paul Brook
2022-04-24 22:01 ` [PATCH v2 18/42] i386: Misc AVX helper prep Paul Brook
2022-04-24 22:01 ` [PATCH v2 19/42] i386: Rewrite blendv helpers Paul Brook
2022-04-24 22:01 ` [PATCH v2 20/42] i386: AVX pclmulqdq Paul Brook
2022-04-24 22:01 ` [PATCH v2 21/42] i386: AVX+AES helpers Paul Brook
2022-04-24 22:01 ` [PATCH v2 22/42] i386: Update ops_sse_helper.h ready for 256 bit AVX Paul Brook
2022-04-24 22:01 ` [PATCH v2 23/42] i386: AVX comparison helpers Paul Brook
2022-04-24 22:01 ` [PATCH v2 24/42] i386: Move 3DNOW decoder Paul Brook
2022-04-24 22:01 ` [PATCH v2 25/42] i386: VEX.V encodings (3 operand) Paul Brook
2022-04-24 22:01 ` [PATCH v2 26/42] i386: Utility function for 128 bit AVX Paul Brook
2022-04-24 22:01 ` [PATCH v2 27/42] i386: Translate 256 bit AVX instructions Paul Brook
2022-04-24 22:01 ` [PATCH v2 28/42] i386: Implement VZEROALL and VZEROUPPER Paul Brook
2022-04-24 22:01 ` [PATCH v2 29/42] i386: Implement VBROADCAST Paul Brook
2022-04-24 22:01 ` [PATCH v2 30/42] i386: Implement VPERMIL Paul Brook
2022-04-24 22:01 ` [PATCH v2 31/42] i386: Implement AVX variable shifts Paul Brook
2022-04-24 22:01 ` [PATCH v2 32/42] i386: Implement VTEST Paul Brook
2022-04-24 22:01 ` [PATCH v2 33/42] i386: Implement VMASKMOV Paul Brook
2022-04-24 22:01 ` [PATCH v2 34/42] i386: Implement VGATHER Paul Brook
2022-04-24 22:01 ` [PATCH v2 35/42] i386: Implement VPERM Paul Brook
2022-04-24 22:01 ` [PATCH v2 36/42] i386: Implement VINSERT128/VEXTRACT128 Paul Brook
2022-04-24 22:01 ` [PATCH v2 37/42] i386: Implement VBLENDV Paul Brook
2022-04-24 22:02 ` [PATCH v2 38/42] i386: Implement VPBLENDD Paul Brook
2022-04-24 22:02 ` [PATCH v2 39/42] i386: Enable AVX cpuid bits when using TCG Paul Brook
2022-04-24 22:02 ` [PATCH v2 40/42] Enable all x86-64 cpu features in user mode Paul Brook
2022-04-24 22:02 ` [PATCH v2 41/42] AVX tests Paul Brook
2022-04-24 22:02 ` [PATCH v2 42/42] i386: Add sha512-avx test Paul Brook
