* [PATCH 01/10] x86emul: adjustments to mem access / write logic testing
2020-08-03 14:47 [PATCH 00/10] x86emul: full coverage mem access / write testing Jan Beulich
@ 2020-08-03 14:50 ` Jan Beulich
2020-08-03 14:50 ` [PATCH 02/10] x86emul: extend decoding / mem access testing to FPU insns Jan Beulich
` (10 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2020-08-03 14:50 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
Specifying a ModR/M byte with the upper two bits set (i.e. a register
form encoding) together with the modrm field set to T is pointless: the
same test gets executed twice, making the overall run slower for no
extra gain. I can only assume this was a copy-and-paste mistake of mine,
made without enough subsequent editing.
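For illustration only, a minimal sketch of why such entries are redundant.
It assumes (this is not shown in the patch) that the harness derives the
register form by forcing ModR/M.mod to 3; the helper names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ModR/M.mod lives in bits 7:6; mod == 3 selects the register form. */
static bool modrm_is_reg_form(uint8_t modrm)
{
    return (modrm & 0xc0) == 0xc0;
}

/*
 * Hypothetical helper: what a tester might do for "modrm = T" to derive
 * the register form from a table entry (an assumption, not Xen code).
 */
static uint8_t modrm_force_reg(uint8_t modrm)
{
    return modrm | 0xc0;
}
```

For an entry such as { 0x01, 0xc1 } the second byte already has mod == 3,
so forcing the register form yields the very same encoding again.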
Furthermore adjust the base type of a few bit fields to shrink table
size, as subsequently quite a few new entries will get added to the
tables using this type.
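A small sketch of the size effect, using stand-alone copies of the two
layouts (the struct and function names are invented for this example).
Note that bit fields of type uint8_t are implementation-defined in ISO C;
GCC and Clang accept them, and the exact sizes depend on the ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Layout before the patch: unsigned int bit fields. */
struct entry_wide {
    uint8_t opc[8];
    uint8_t len[2];
    bool modrm:1;
    unsigned int mem:2;
    unsigned int pfx:2;
};

/* Layout after the patch: uint8_t bit fields. */
struct entry_narrow {
    uint8_t opc[8];
    uint8_t len[2];
    bool modrm:1;
    uint8_t mem:2;
    uint8_t pfx:2;
};

static size_t wide_size(void)   { return sizeof(struct entry_wide); }
static size_t narrow_size(void) { return sizeof(struct entry_narrow); }
```

With unsigned int as the base type the struct typically inherits 4-byte
alignment, so every table entry gets padded up to a multiple of 4.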
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -21,8 +21,8 @@ static const struct {
uint8_t opc[8];
uint8_t len[2]; /* 32- and 64-bit mode */
bool modrm:1; /* Should register form (also) be tested? */
- unsigned int mem:2;
- unsigned int pfx:2;
+ uint8_t mem:2;
+ uint8_t pfx:2;
#define REG(opc, more...) \
{ { (opc) | 0 }, more }, /* %?ax */ \
{ { (opc) | 1 }, more }, /* %?cx */ \
@@ -334,53 +334,53 @@ static const struct {
/*{ 0x01, 0x28 }, { 2, 2 }, F, W, pfx_f3 }, rstorssp */
{ { 0x01, 0x30 }, { 2, 2 }, T, R }, /* lmsw */
{ { 0x01, 0x38 }, { 2, 2 }, F, N }, /* invlpg */
- { { 0x01, 0xc0 }, { 2, 2 }, T, N }, /* enclv */
- { { 0x01, 0xc1 }, { 2, 2 }, T, N }, /* vmcall */
+ { { 0x01, 0xc0 }, { 2, 2 }, F, N }, /* enclv */
+ { { 0x01, 0xc1 }, { 2, 2 }, F, N }, /* vmcall */
/*{ 0x01, 0xc2 }, { 2, 2 }, F, R }, vmlaunch */
/*{ 0x01, 0xc3 }, { 2, 2 }, F, R }, vmresume */
- { { 0x01, 0xc4 }, { 2, 2 }, T, N }, /* vmxoff */
- { { 0x01, 0xc5 }, { 2, 2 }, T, N }, /* pconfig */
- { { 0x01, 0xc8 }, { 2, 2 }, T, N }, /* monitor */
- { { 0x01, 0xc9 }, { 2, 2 }, T, N }, /* mwait */
- { { 0x01, 0xca }, { 2, 2 }, T, N }, /* clac */
- { { 0x01, 0xcb }, { 2, 2 }, T, N }, /* stac */
- { { 0x01, 0xcf }, { 2, 2 }, T, N }, /* encls */
- { { 0x01, 0xd0 }, { 2, 2 }, T, N }, /* xgetbv */
- { { 0x01, 0xd1 }, { 2, 2 }, T, N }, /* xsetbv */
- { { 0x01, 0xd4 }, { 2, 2 }, T, N }, /* vmfunc */
- { { 0x01, 0xd5 }, { 2, 2 }, T, N }, /* xend */
- { { 0x01, 0xd6 }, { 2, 2 }, T, N }, /* xtest */
- { { 0x01, 0xd7 }, { 2, 2 }, T, N }, /* enclu */
+ { { 0x01, 0xc4 }, { 2, 2 }, F, N }, /* vmxoff */
+ { { 0x01, 0xc5 }, { 2, 2 }, F, N }, /* pconfig */
+ { { 0x01, 0xc8 }, { 2, 2 }, F, N }, /* monitor */
+ { { 0x01, 0xc9 }, { 2, 2 }, F, N }, /* mwait */
+ { { 0x01, 0xca }, { 2, 2 }, F, N }, /* clac */
+ { { 0x01, 0xcb }, { 2, 2 }, F, N }, /* stac */
+ { { 0x01, 0xcf }, { 2, 2 }, F, N }, /* encls */
+ { { 0x01, 0xd0 }, { 2, 2 }, F, N }, /* xgetbv */
+ { { 0x01, 0xd1 }, { 2, 2 }, F, N }, /* xsetbv */
+ { { 0x01, 0xd4 }, { 2, 2 }, F, N }, /* vmfunc */
+ { { 0x01, 0xd5 }, { 2, 2 }, F, N }, /* xend */
+ { { 0x01, 0xd6 }, { 2, 2 }, F, N }, /* xtest */
+ { { 0x01, 0xd7 }, { 2, 2 }, F, N }, /* enclu */
/*{ 0x01, 0xd8 }, { 2, 2 }, F, R }, vmrun */
- { { 0x01, 0xd9 }, { 2, 2 }, T, N }, /* vmcall */
- { { 0x01, 0xd9 }, { 2, 2 }, T, N, pfx_f3 }, /* vmgexit */
- { { 0x01, 0xd9 }, { 2, 2 }, T, N, pfx_f2 }, /* vmgexit */
+ { { 0x01, 0xd9 }, { 2, 2 }, F, N }, /* vmcall */
+ { { 0x01, 0xd9 }, { 2, 2 }, F, N, pfx_f3 }, /* vmgexit */
+ { { 0x01, 0xd9 }, { 2, 2 }, F, N, pfx_f2 }, /* vmgexit */
/*{ 0x01, 0xda }, { 2, 2 }, F, R }, vmload */
/*{ 0x01, 0xdb }, { 2, 2 }, F, W }, vmsave */
- { { 0x01, 0xdc }, { 2, 2 }, T, N }, /* stgi */
- { { 0x01, 0xdd }, { 2, 2 }, T, N }, /* clgi */
+ { { 0x01, 0xdc }, { 2, 2 }, F, N }, /* stgi */
+ { { 0x01, 0xdd }, { 2, 2 }, F, N }, /* clgi */
/*{ 0x01, 0xde }, { 2, 2 }, F, R }, skinit */
- { { 0x01, 0xdf }, { 2, 2 }, T, N }, /* invlpga */
- { { 0x01, 0xe8 }, { 2, 2 }, T, N }, /* serialize */
+ { { 0x01, 0xdf }, { 2, 2 }, F, N }, /* invlpga */
+ { { 0x01, 0xe8 }, { 2, 2 }, F, N }, /* serialize */
/*{ 0x01, 0xe8 }, { 2, 2 }, F, W, pfx_f3 }, setssbsy */
- { { 0x01, 0xe8 }, { 2, 2 }, T, N, pfx_f2 }, /* xsusldtrk */
- { { 0x01, 0xe9 }, { 2, 2 }, T, N, pfx_f2 }, /* xresldtrk */
+ { { 0x01, 0xe8 }, { 2, 2 }, F, N, pfx_f2 }, /* xsusldtrk */
+ { { 0x01, 0xe9 }, { 2, 2 }, F, N, pfx_f2 }, /* xresldtrk */
/*{ 0x01, 0xea }, { 2, 2 }, F, W, pfx_f3 }, saveprevssp */
- { { 0x01, 0xee }, { 2, 2 }, T, N }, /* rdpkru */
- { { 0x01, 0xef }, { 2, 2 }, T, N }, /* wrpkru */
- { { 0x01, 0xf8 }, { 0, 2 }, T, N }, /* swapgs */
- { { 0x01, 0xf9 }, { 2, 2 }, T, N }, /* rdtscp */
- { { 0x01, 0xfa }, { 2, 2 }, T, N }, /* monitorx */
- { { 0x01, 0xfa }, { 2, 2 }, T, N, pfx_f3 }, /* mcommit */
- { { 0x01, 0xfb }, { 2, 2 }, T, N }, /* mwaitx */
+ { { 0x01, 0xee }, { 2, 2 }, F, N }, /* rdpkru */
+ { { 0x01, 0xef }, { 2, 2 }, F, N }, /* wrpkru */
+ { { 0x01, 0xf8 }, { 0, 2 }, F, N }, /* swapgs */
+ { { 0x01, 0xf9 }, { 2, 2 }, F, N }, /* rdtscp */
+ { { 0x01, 0xfa }, { 2, 2 }, F, N }, /* monitorx */
+ { { 0x01, 0xfa }, { 2, 2 }, F, N, pfx_f3 }, /* mcommit */
+ { { 0x01, 0xfb }, { 2, 2 }, F, N }, /* mwaitx */
{ { 0x01, 0xfc }, { 2, 2 }, F, W }, /* clzero */
- { { 0x01, 0xfd }, { 2, 2 }, T, N }, /* rdpru */
- { { 0x01, 0xfe }, { 2, 2 }, T, N }, /* invlpgb */
- { { 0x01, 0xfe }, { 0, 2 }, T, N, pfx_f3 }, /* rmpadjust */
- { { 0x01, 0xfe }, { 0, 2 }, T, N, pfx_f2 }, /* rmpupdate */
- { { 0x01, 0xff }, { 2, 2 }, T, N }, /* tlbsync */
- { { 0x01, 0xff }, { 0, 2 }, T, N, pfx_f3 }, /* psmash */
- { { 0x01, 0xff }, { 0, 2 }, T, N, pfx_f2 }, /* pvalidate */
+ { { 0x01, 0xfd }, { 2, 2 }, F, N }, /* rdpru */
+ { { 0x01, 0xfe }, { 2, 2 }, F, N }, /* invlpgb */
+ { { 0x01, 0xfe }, { 0, 2 }, F, N, pfx_f3 }, /* rmpadjust */
+ { { 0x01, 0xfe }, { 0, 2 }, F, N, pfx_f2 }, /* rmpupdate */
+ { { 0x01, 0xff }, { 2, 2 }, F, N }, /* tlbsync */
+ { { 0x01, 0xff }, { 0, 2 }, F, N, pfx_f3 }, /* psmash */
+ { { 0x01, 0xff }, { 0, 2 }, F, N, pfx_f2 }, /* pvalidate */
{ { 0x02 }, { 2, 2 }, T, R }, /* lar */
{ { 0x03 }, { 2, 2 }, T, R }, /* lsl */
{ { 0x05 }, { 1, 1 }, F, N }, /* syscall */
* [PATCH 02/10] x86emul: extend decoding / mem access testing to FPU insns
From: Jan Beulich @ 2020-08-03 14:50 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -517,6 +517,138 @@ static const struct {
};
#undef CND
#undef REG
+static const struct {
+ uint8_t opc[2];
+ bool modrm:1; /* Should register form (also) be tested? */
+ uint8_t mem:2;
+} fpu[] = {
+ { { 0xd8, 0x00 }, T, R }, /* fadd */
+ { { 0xd8, 0x08 }, T, R }, /* fmul */
+ { { 0xd8, 0x10 }, T, R }, /* fcom */
+ { { 0xd8, 0x18 }, T, R }, /* fcomp */
+ { { 0xd8, 0x20 }, T, R }, /* fsub */
+ { { 0xd8, 0x28 }, T, R }, /* fsubr */
+ { { 0xd8, 0x30 }, T, R }, /* fdiv */
+ { { 0xd8, 0x38 }, T, R }, /* fdivr */
+ { { 0xd9, 0x00 }, T, R }, /* fld */
+ { { 0xd9, 0x10 }, F, W }, /* fst */
+ { { 0xd9, 0x18 }, T, W }, /* fstp */
+ { { 0xd9, 0x20 }, F, R }, /* fldenv */
+ { { 0xd9, 0x28 }, F, R }, /* fldcw */
+ { { 0xd9, 0x30 }, F, W }, /* fnstenv */
+ { { 0xd9, 0x38 }, F, W }, /* fnstcw */
+ { { 0xd9, 0xc8 }, F, N }, /* fxch */
+ { { 0xd9, 0xd0 }, F, N }, /* fnop */
+ { { 0xd9, 0xe0 }, F, N }, /* fchs */
+ { { 0xd9, 0xe1 }, F, N }, /* fabs */
+ { { 0xd9, 0xe4 }, F, N }, /* ftst */
+ { { 0xd9, 0xe5 }, F, N }, /* fxam */
+ { { 0xd9, 0xe6 }, F, N }, /* ftstp */
+ { { 0xd9, 0xe8 }, F, N }, /* fld1 */
+ { { 0xd9, 0xe9 }, F, N }, /* fldl2t */
+ { { 0xd9, 0xea }, F, N }, /* fldl2e */
+ { { 0xd9, 0xeb }, F, N }, /* fldpi */
+ { { 0xd9, 0xec }, F, N }, /* fldlg2 */
+ { { 0xd9, 0xed }, F, N }, /* fldln2 */
+ { { 0xd9, 0xee }, F, N }, /* fldz */
+ { { 0xd9, 0xf0 }, F, N }, /* f2xm1 */
+ { { 0xd9, 0xf1 }, F, N }, /* fyl2x */
+ { { 0xd9, 0xf2 }, F, N }, /* fptan */
+ { { 0xd9, 0xf3 }, F, N }, /* fpatan */
+ { { 0xd9, 0xf4 }, F, N }, /* fxtract */
+ { { 0xd9, 0xf5 }, F, N }, /* fprem1 */
+ { { 0xd9, 0xf6 }, F, N }, /* fdecstp */
+ { { 0xd9, 0xf7 }, F, N }, /* fincstp */
+ { { 0xd9, 0xf8 }, F, N }, /* fprem */
+ { { 0xd9, 0xf9 }, F, N }, /* fyl2xp1 */
+ { { 0xd9, 0xfa }, F, N }, /* fsqrt */
+ { { 0xd9, 0xfb }, F, N }, /* fsincos */
+ { { 0xd9, 0xfc }, F, N }, /* frndint */
+ { { 0xd9, 0xfd }, F, N }, /* fscale */
+ { { 0xd9, 0xfe }, F, N }, /* fsin */
+ { { 0xd9, 0xff }, F, N }, /* fcos */
+ { { 0xda, 0x00 }, F, R }, /* fiadd */
+ { { 0xda, 0x08 }, F, R }, /* fimul */
+ { { 0xda, 0x10 }, F, R }, /* ficom */
+ { { 0xda, 0x18 }, F, R }, /* ficomp */
+ { { 0xda, 0x20 }, F, R }, /* fisub */
+ { { 0xda, 0x28 }, F, R }, /* fisubr */
+ { { 0xda, 0x30 }, F, R }, /* fidiv */
+ { { 0xda, 0x38 }, F, R }, /* fidivr */
+ { { 0xda, 0xc0 }, F, N }, /* fcmovb */
+ { { 0xda, 0xc8 }, F, N }, /* fcmove */
+ { { 0xda, 0xd0 }, F, N }, /* fcmovbe */
+ { { 0xda, 0xd8 }, F, N }, /* fcmovu */
+ { { 0xda, 0xe9 }, F, N }, /* fucompp */
+ { { 0xdb, 0x00 }, F, R }, /* fild */
+ { { 0xdb, 0x08 }, F, W }, /* fisttp */
+ { { 0xdb, 0x10 }, F, W }, /* fist */
+ { { 0xdb, 0x18 }, F, W }, /* fistp */
+ { { 0xdb, 0x28 }, F, R }, /* fld */
+ { { 0xdb, 0x38 }, F, W }, /* fstp */
+ { { 0xdb, 0xc0 }, F, N }, /* fcmovnb */
+ { { 0xdb, 0xc8 }, F, N }, /* fcmovne */
+ { { 0xdb, 0xd0 }, F, N }, /* fcmovnbe */
+ { { 0xdb, 0xd8 }, F, N }, /* fcmovnu */
+ { { 0xdb, 0xe0 }, F, N }, /* fneni */
+ { { 0xdb, 0xe1 }, F, N }, /* fndisi */
+ { { 0xdb, 0xe2 }, F, N }, /* fnclex */
+ { { 0xdb, 0xe3 }, F, N }, /* fninit */
+ { { 0xdb, 0xe4 }, F, N }, /* fsetpm */
+ { { 0xdb, 0xe5 }, F, N }, /* frstpm */
+ { { 0xdb, 0xe8 }, F, N }, /* fucomi */
+ { { 0xdb, 0xf0 }, F, N }, /* fcomi */
+ { { 0xdc, 0x00 }, T, R }, /* fadd */
+ { { 0xdc, 0x08 }, T, R }, /* fmul */
+ { { 0xdc, 0x10 }, T, R }, /* fcom */
+ { { 0xdc, 0x18 }, T, R }, /* fcomp */
+ { { 0xdc, 0x20 }, T, R }, /* fsub */
+ { { 0xdc, 0x28 }, T, R }, /* fsubr */
+ { { 0xdc, 0x30 }, T, R }, /* fdiv */
+ { { 0xdc, 0x38 }, T, R }, /* fdivr */
+ { { 0xdd, 0x00 }, F, R }, /* fld */
+ { { 0xdd, 0x08 }, F, W }, /* fisttp */
+ { { 0xdd, 0x10 }, T, W }, /* fst */
+ { { 0xdd, 0x18 }, T, W }, /* fstp */
+ { { 0xdd, 0x20 }, F, R }, /* frstor */
+ { { 0xdd, 0x30 }, F, W }, /* fnsave */
+ { { 0xdd, 0x38 }, F, W }, /* fnstsw */
+ { { 0xdd, 0xc0 }, F, N }, /* ffree */
+ { { 0xdd, 0xc8 }, F, N }, /* fxch */
+ { { 0xdd, 0xe0 }, F, N }, /* fucom */
+ { { 0xdd, 0xe8 }, F, N }, /* fucomp */
+ { { 0xde, 0x00 }, F, R }, /* fiadd */
+ { { 0xde, 0x08 }, F, R }, /* fimul */
+ { { 0xde, 0x10 }, F, R }, /* ficom */
+ { { 0xde, 0x18 }, F, R }, /* ficomp */
+ { { 0xde, 0x20 }, F, R }, /* fisub */
+ { { 0xde, 0x28 }, F, R }, /* fisubr */
+ { { 0xde, 0x30 }, F, R }, /* fidiv */
+ { { 0xde, 0x38 }, F, R }, /* fidivr */
+ { { 0xde, 0xc0 }, F, N }, /* faddp */
+ { { 0xde, 0xc8 }, F, N }, /* fmulp */
+ { { 0xde, 0xd0 }, F, N }, /* fcomp */
+ { { 0xde, 0xd9 }, F, N }, /* fcompp */
+ { { 0xde, 0xe0 }, F, N }, /* fsubrp */
+ { { 0xde, 0xe8 }, F, N }, /* fsubp */
+ { { 0xde, 0xf0 }, F, N }, /* fdivrp */
+ { { 0xde, 0xf8 }, F, N }, /* fdivp */
+ { { 0xdf, 0x00 }, F, R }, /* fild */
+ { { 0xdf, 0x08 }, F, W }, /* fisttp */
+ { { 0xdf, 0x10 }, F, W }, /* fist */
+ { { 0xdf, 0x18 }, F, W }, /* fistp */
+ { { 0xdf, 0x20 }, F, R }, /* fbld */
+ { { 0xdf, 0x28 }, F, R }, /* fild */
+ { { 0xdf, 0x30 }, F, W }, /* fbstp */
+ { { 0xdf, 0x38 }, F, W }, /* fistp */
+ { { 0xdf, 0xc0 }, F, N }, /* ffreep */
+ { { 0xdf, 0xc8 }, F, N }, /* fxch */
+ { { 0xdf, 0xd0 }, F, N }, /* fstp */
+ { { 0xdf, 0xd8 }, F, N }, /* fstp */
+ { { 0xdf, 0xe0 }, F, N }, /* fnstsw */
+ { { 0xdf, 0xe8 }, F, N }, /* fucomip */
+ { { 0xdf, 0xf0 }, F, N }, /* fcomip */
+};
#undef F
#undef N
#undef R
@@ -667,6 +799,16 @@ void predicates_test(void *instr, struct
legacy_0f38[t].mem, ctxt, fetch);
}
+ memset(instr + ARRAY_SIZE(fpu[t].opc), 0xcc, 13);
+
+ for ( t = 0; t < ARRAY_SIZE(fpu); ++t )
+ {
+ memcpy(instr, fpu[t].opc, ARRAY_SIZE(fpu[t].opc));
+
+ do_test(instr, ARRAY_SIZE(fpu[t].opc), fpu[t].modrm, fpu[t].mem,
+ ctxt, fetch);
+ }
+
if ( errors )
exit(1);
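As a reading aid for the hunk above, a hedged sketch of the buffer setup
per table entry. It assumes a 15-byte instruction buffer (2 opcode bytes
plus the 13 filler bytes seen above); struct fpu_entry and build_insn()
are hypothetical names, and the real do_test() is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Simplified stand-in for one fpu[] entry (plain members, no bit fields). */
struct fpu_entry {
    uint8_t opc[2];
    bool modrm;
    uint8_t mem;
};

/*
 * Hypothetical helper: pad everything past the opcode bytes with 0xcc,
 * then copy the opcode bytes in. Returns the number of opcode bytes.
 */
static size_t build_insn(uint8_t buf[15], const struct fpu_entry *e)
{
    memset(buf + ARRAY_SIZE(e->opc), 0xcc, 15 - ARRAY_SIZE(e->opc));
    memcpy(buf, e->opc, ARRAY_SIZE(e->opc));

    return ARRAY_SIZE(e->opc);
}
```

The patch does the memset() once ahead of the loop, since the opcode
length is the same for every fpu[] entry.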
* [PATCH 03/10] x86emul: extend decoding / mem access testing to MMX / SSE insns
From: Jan Beulich @ 2020-08-03 14:50 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
This covers just the legacy-encoded ones. For 3dNow! only a single
example is used, as those insns are all similar in nature, both
encoding- and operand-wise.
Adjust a slightly misleading (but not wrong) memcpy() invocation, as
noticed while further cloning that code.
Rename pfx_none to pfx_no, so it can be used to improve readability /
column alignment.
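A sketch of how the pfx enumerators presumably relate to prefixes[]:
pfx_no is 0 and means "no prefix", the others index the array at pfx - 1.
The emit_0f() helper is hypothetical and only illustrates that mapping for
an opcode in the 0x0f map; the actual test code may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum pfx { pfx_no, pfx_66, pfx_f3, pfx_f2 };
static const uint8_t prefixes[] = { 0x66, 0xf3, 0xf2 };

/*
 * Hypothetical helper: write the optional legacy prefix, the 0x0f escape
 * byte and the opcode byte. Returns the number of bytes written.
 */
static size_t emit_0f(uint8_t *buf, enum pfx pfx, uint8_t opc)
{
    size_t n = 0;

    if ( pfx != pfx_no )
        buf[n++] = prefixes[pfx - 1];
    buf[n++] = 0x0f;
    buf[n++] = opc;

    return n;
}
```

E.g. emit_0f(buf, pfx_f3, 0x10) produces f3 0f 10, the opcode bytes of
the movss load form.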
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -3,7 +3,7 @@
#include <stdio.h>
enum mem_access { mem_none, mem_read, mem_write };
-enum pfx { pfx_none, pfx_66, pfx_f3, pfx_f2 };
+enum pfx { pfx_no, pfx_66, pfx_f3, pfx_f2 };
static const uint8_t prefixes[] = { 0x66, 0xf3, 0xf2 };
#define F false
@@ -393,6 +393,30 @@ static const struct {
{ { 0x0d, 0x00 }, { 2, 2 }, F, N }, /* prefetch */
{ { 0x0d, 0x08 }, { 2, 2 }, F, N }, /* prefetchw */
{ { 0x0e }, { 1, 1 }, F, N }, /* femms */
+ { { 0x0f, 0x00, 0x9e }, { 3, 3 }, T, R }, /* pfadd */
+ { { 0x10 }, { 2, 2 }, T, R, pfx_no }, /* movups */
+ { { 0x10 }, { 2, 2 }, T, R, pfx_66 }, /* movupd */
+ { { 0x10 }, { 2, 2 }, T, R, pfx_f3 }, /* movss */
+ { { 0x10 }, { 2, 2 }, T, R, pfx_f2 }, /* movsd */
+ { { 0x11 }, { 2, 2 }, T, W, pfx_no }, /* movups */
+ { { 0x11 }, { 2, 2 }, T, W, pfx_66 }, /* movupd */
+ { { 0x11 }, { 2, 2 }, T, W, pfx_f3 }, /* movss */
+ { { 0x11 }, { 2, 2 }, T, W, pfx_f2 }, /* movsd */
+ { { 0x12 }, { 2, 2 }, T, R, pfx_no }, /* movlps / movhlps */
+ { { 0x12 }, { 2, 2 }, F, R, pfx_66 }, /* movlpd */
+ { { 0x12 }, { 2, 2 }, T, R, pfx_f3 }, /* movsldup */
+ { { 0x12 }, { 2, 2 }, T, R, pfx_f2 }, /* movddup */
+ { { 0x13 }, { 2, 2 }, F, W, pfx_no }, /* movlps */
+ { { 0x13 }, { 2, 2 }, F, W, pfx_66 }, /* movlpd */
+ { { 0x14 }, { 2, 2 }, T, R, pfx_no }, /* unpcklps */
+ { { 0x14 }, { 2, 2 }, T, R, pfx_66 }, /* unpcklpd */
+ { { 0x15 }, { 2, 2 }, T, R, pfx_no }, /* unpckhps */
+ { { 0x15 }, { 2, 2 }, T, R, pfx_66 }, /* unpckhpd */
+ { { 0x16 }, { 2, 2 }, T, R, pfx_no }, /* movhps / movlhps */
+ { { 0x16 }, { 2, 2 }, F, R, pfx_66 }, /* movhpd */
+ { { 0x16 }, { 2, 2 }, T, R, pfx_f3 }, /* movshdup */
+ { { 0x17 }, { 2, 2 }, F, W, pfx_no }, /* movhps */
+ { { 0x17 }, { 2, 2 }, F, W, pfx_66 }, /* movhpd */
{ { 0x18, 0x00 }, { 2, 2 }, F, N }, /* prefetchnta */
{ { 0x18, 0x08 }, { 2, 2 }, F, N }, /* prefetch0 */
{ { 0x18, 0x10 }, { 2, 2 }, F, N }, /* prefetch1 */
@@ -414,6 +438,30 @@ static const struct {
{ { 0x21 }, { 2, 2 }, T, N }, /* mov */
{ { 0x22 }, { 2, 2 }, T, N }, /* mov */
{ { 0x23 }, { 2, 2 }, T, N }, /* mov */
+ { { 0x28 }, { 2, 2 }, T, R, pfx_no }, /* movaps */
+ { { 0x28 }, { 2, 2 }, T, R, pfx_66 }, /* movapd */
+ { { 0x29 }, { 2, 2 }, T, W, pfx_no }, /* movaps */
+ { { 0x29 }, { 2, 2 }, T, W, pfx_66 }, /* movapd */
+ { { 0x2a }, { 2, 2 }, T, R, pfx_no }, /* cvtpi2ps */
+ { { 0x2a }, { 2, 2 }, T, R, pfx_66 }, /* cvtpi2pd */
+ { { 0x2a }, { 2, 2 }, T, R, pfx_f3 }, /* cvtsi2ss */
+ { { 0x2a }, { 2, 2 }, T, R, pfx_f2 }, /* cvtsi2sd */
+ { { 0x2b }, { 2, 2 }, T, W, pfx_no }, /* movntps */
+ { { 0x2b }, { 2, 2 }, T, W, pfx_66 }, /* movntpd */
+ { { 0x2b }, { 2, 2 }, T, W, pfx_f3 }, /* movntss */
+ { { 0x2b }, { 2, 2 }, T, W, pfx_f2 }, /* movntsd */
+ { { 0x2c }, { 2, 2 }, T, R, pfx_no }, /* cvttps2pi */
+ { { 0x2c }, { 2, 2 }, T, R, pfx_66 }, /* cvttpd2pi */
+ { { 0x2c }, { 2, 2 }, T, R, pfx_f3 }, /* cvttss2si */
+ { { 0x2c }, { 2, 2 }, T, R, pfx_f2 }, /* cvttsd2si */
+ { { 0x2d }, { 2, 2 }, T, R, pfx_no }, /* cvtps2pi */
+ { { 0x2d }, { 2, 2 }, T, R, pfx_66 }, /* cvtpd2pi */
+ { { 0x2d }, { 2, 2 }, T, R, pfx_f3 }, /* cvtss2si */
+ { { 0x2d }, { 2, 2 }, T, R, pfx_f2 }, /* cvtsd2si */
+ { { 0x2e }, { 2, 2 }, T, R, pfx_no }, /* ucomiss */
+ { { 0x2e }, { 2, 2 }, T, R, pfx_66 }, /* ucomisd */
+ { { 0x2f }, { 2, 2 }, T, R, pfx_no }, /* comiss */
+ { { 0x2f }, { 2, 2 }, T, R, pfx_66 }, /* comisd */
{ { 0x30 }, { 1, 1 }, F, N }, /* wrmsr */
{ { 0x31 }, { 1, 1 }, F, N }, /* rdtsc */
{ { 0x32 }, { 1, 1 }, F, N }, /* rdmsr */
@@ -421,8 +469,131 @@ static const struct {
{ { 0x34 }, { 1, 1 }, F, N }, /* sysenter */
{ { 0x35 }, { 1, 1 }, F, N }, /* sysexit */
CND(0x40, { 2, 2 }, T, R ), /* cmov<cc> */
+ { { 0x50, 0xc0 }, { 2, 2 }, F, N, pfx_no }, /* movmskps */
+ { { 0x50, 0xc0 }, { 2, 2 }, F, N, pfx_66 }, /* movmskpd */
+ { { 0x51 }, { 2, 2 }, T, R, pfx_no }, /* sqrtps */
+ { { 0x51 }, { 2, 2 }, T, R, pfx_66 }, /* sqrtpd */
+ { { 0x51 }, { 2, 2 }, T, R, pfx_f3 }, /* sqrtss */
+ { { 0x51 }, { 2, 2 }, T, R, pfx_f2 }, /* sqrtsd */
+ { { 0x52 }, { 2, 2 }, T, R, pfx_no }, /* rsqrtps */
+ { { 0x52 }, { 2, 2 }, T, R, pfx_f3 }, /* rsqrtss */
+ { { 0x53 }, { 2, 2 }, T, R, pfx_no }, /* rcpps */
+ { { 0x53 }, { 2, 2 }, T, R, pfx_f3 }, /* rcpss */
+ { { 0x54 }, { 2, 2 }, T, R, pfx_no }, /* andps */
+ { { 0x54 }, { 2, 2 }, T, R, pfx_66 }, /* andpd */
+ { { 0x55 }, { 2, 2 }, T, R, pfx_no }, /* andnps */
+ { { 0x55 }, { 2, 2 }, T, R, pfx_66 }, /* andnpd */
+ { { 0x56 }, { 2, 2 }, T, R, pfx_no }, /* orps */
+ { { 0x56 }, { 2, 2 }, T, R, pfx_66 }, /* orpd */
+ { { 0x57 }, { 2, 2 }, T, R, pfx_no }, /* xorps */
+ { { 0x57 }, { 2, 2 }, T, R, pfx_66 }, /* xorpd */
+ { { 0x58 }, { 2, 2 }, T, R, pfx_no }, /* addps */
+ { { 0x58 }, { 2, 2 }, T, R, pfx_66 }, /* addpd */
+ { { 0x58 }, { 2, 2 }, T, R, pfx_f3 }, /* addss */
+ { { 0x58 }, { 2, 2 }, T, R, pfx_f2 }, /* addsd */
+ { { 0x59 }, { 2, 2 }, T, R, pfx_no }, /* mulps */
+ { { 0x59 }, { 2, 2 }, T, R, pfx_66 }, /* mulpd */
+ { { 0x59 }, { 2, 2 }, T, R, pfx_f3 }, /* mulss */
+ { { 0x59 }, { 2, 2 }, T, R, pfx_f2 }, /* mulsd */
+ { { 0x5a }, { 2, 2 }, T, R, pfx_no }, /* cvtps2pd */
+ { { 0x5a }, { 2, 2 }, T, R, pfx_66 }, /* cvtpd2ps */
+ { { 0x5a }, { 2, 2 }, T, R, pfx_f3 }, /* cvtss2sd */
+ { { 0x5a }, { 2, 2 }, T, R, pfx_f2 }, /* cvtsd2ss */
+ { { 0x5b }, { 2, 2 }, T, R, pfx_no }, /* cvtdq2ps */
+ { { 0x5b }, { 2, 2 }, T, R, pfx_66 }, /* cvtps2dq */
+ { { 0x5b }, { 2, 2 }, T, R, pfx_f3 }, /* cvttps2dq */
+ { { 0x5c }, { 2, 2 }, T, R, pfx_no }, /* subps */
+ { { 0x5c }, { 2, 2 }, T, R, pfx_66 }, /* subpd */
+ { { 0x5c }, { 2, 2 }, T, R, pfx_f3 }, /* subss */
+ { { 0x5c }, { 2, 2 }, T, R, pfx_f2 }, /* subsd */
+ { { 0x5d }, { 2, 2 }, T, R, pfx_no }, /* minps */
+ { { 0x5d }, { 2, 2 }, T, R, pfx_66 }, /* minpd */
+ { { 0x5d }, { 2, 2 }, T, R, pfx_f3 }, /* minss */
+ { { 0x5d }, { 2, 2 }, T, R, pfx_f2 }, /* minsd */
+ { { 0x5e }, { 2, 2 }, T, R, pfx_no }, /* divps */
+ { { 0x5e }, { 2, 2 }, T, R, pfx_66 }, /* divpd */
+ { { 0x5e }, { 2, 2 }, T, R, pfx_f3 }, /* divss */
+ { { 0x5e }, { 2, 2 }, T, R, pfx_f2 }, /* divsd */
+ { { 0x5f }, { 2, 2 }, T, R, pfx_no }, /* maxps */
+ { { 0x5f }, { 2, 2 }, T, R, pfx_66 }, /* maxpd */
+ { { 0x5f }, { 2, 2 }, T, R, pfx_f3 }, /* maxss */
+ { { 0x5f }, { 2, 2 }, T, R, pfx_f2 }, /* maxsd */
+ { { 0x60 }, { 2, 2 }, T, R, pfx_no }, /* punpcklbw */
+ { { 0x60 }, { 2, 2 }, T, R, pfx_66 }, /* punpcklbw */
+ { { 0x61 }, { 2, 2 }, T, R, pfx_no }, /* punpcklwd */
+ { { 0x61 }, { 2, 2 }, T, R, pfx_66 }, /* punpcklwd */
+ { { 0x62 }, { 2, 2 }, T, R, pfx_no }, /* punpckldq */
+ { { 0x62 }, { 2, 2 }, T, R, pfx_66 }, /* punpckldq */
+ { { 0x63 }, { 2, 2 }, T, R, pfx_no }, /* packsswb */
+ { { 0x63 }, { 2, 2 }, T, R, pfx_66 }, /* packsswb */
+ { { 0x64 }, { 2, 2 }, T, R, pfx_no }, /* pcmpgtb */
+ { { 0x64 }, { 2, 2 }, T, R, pfx_66 }, /* pcmpgtb */
+ { { 0x65 }, { 2, 2 }, T, R, pfx_no }, /* pcmpgtw */
+ { { 0x65 }, { 2, 2 }, T, R, pfx_66 }, /* pcmpgtw */
+ { { 0x66 }, { 2, 2 }, T, R, pfx_no }, /* pcmpgtd */
+ { { 0x66 }, { 2, 2 }, T, R, pfx_66 }, /* pcmpgtd */
+ { { 0x67 }, { 2, 2 }, T, R, pfx_no }, /* packuswb */
+ { { 0x67 }, { 2, 2 }, T, R, pfx_66 }, /* packuswb */
+ { { 0x68 }, { 2, 2 }, T, R, pfx_no }, /* punpckhbw */
+ { { 0x68 }, { 2, 2 }, T, R, pfx_66 }, /* punpckhbw */
+ { { 0x69 }, { 2, 2 }, T, R, pfx_no }, /* punpckhwd */
+ { { 0x69 }, { 2, 2 }, T, R, pfx_66 }, /* punpckhwd */
+ { { 0x6a }, { 2, 2 }, T, R, pfx_no }, /* punpckhdq */
+ { { 0x6a }, { 2, 2 }, T, R, pfx_66 }, /* punpckhdq */
+ { { 0x6b }, { 2, 2 }, T, R, pfx_no }, /* packssdw */
+ { { 0x6b }, { 2, 2 }, T, R, pfx_66 }, /* packssdw */
+ { { 0x6c }, { 2, 2 }, T, R, pfx_66 }, /* punpcklqdq */
+ { { 0x6d }, { 2, 2 }, T, R, pfx_66 }, /* punpckhqdq */
+ { { 0x6e }, { 2, 2 }, T, R, pfx_no }, /* movd */
+ { { 0x6e }, { 2, 2 }, T, R, pfx_66 }, /* movd */
+ { { 0x6f }, { 2, 2 }, T, R, pfx_no }, /* movq */
+ { { 0x6f }, { 2, 2 }, T, R, pfx_66 }, /* movdqa */
+ { { 0x6f }, { 2, 2 }, T, R, pfx_f3 }, /* movdqu */
+ { { 0x70 }, { 3, 3 }, T, R, pfx_no }, /* pshufw */
+ { { 0x70 }, { 3, 3 }, T, R, pfx_66 }, /* pshufd */
+ { { 0x70 }, { 3, 3 }, T, R, pfx_f3 }, /* pshuflw */
+ { { 0x70 }, { 3, 3 }, T, R, pfx_f2 }, /* pshufhw */
+ { { 0x71, 0xd0 }, { 3, 3 }, F, N, pfx_no }, /* psrlw */
+ { { 0x71, 0xd0 }, { 3, 3 }, F, N, pfx_66 }, /* psrlw */
+ { { 0x71, 0xe0 }, { 3, 3 }, F, N, pfx_no }, /* psraw */
+ { { 0x71, 0xe0 }, { 3, 3 }, F, N, pfx_66 }, /* psraw */
+ { { 0x71, 0xf0 }, { 3, 3 }, F, N, pfx_no }, /* psllw */
+ { { 0x71, 0xf0 }, { 3, 3 }, F, N, pfx_66 }, /* psllw */
+ { { 0x72, 0xd0 }, { 3, 3 }, F, N, pfx_no }, /* psrld */
+ { { 0x72, 0xd0 }, { 3, 3 }, F, N, pfx_66 }, /* psrld */
+ { { 0x72, 0xe0 }, { 3, 3 }, F, N, pfx_no }, /* psrad */
+ { { 0x72, 0xe0 }, { 3, 3 }, F, N, pfx_66 }, /* psrad */
+ { { 0x72, 0xf0 }, { 3, 3 }, F, N, pfx_no }, /* pslld */
+ { { 0x72, 0xf0 }, { 3, 3 }, F, N, pfx_66 }, /* pslld */
+ { { 0x73, 0xd0 }, { 3, 3 }, F, N, pfx_no }, /* psrlq */
+ { { 0x73, 0xd0 }, { 3, 3 }, F, N, pfx_66 }, /* psrlq */
+ { { 0x73, 0xd8 }, { 3, 3 }, F, N, pfx_66 }, /* psrldq */
+ { { 0x73, 0xf0 }, { 3, 3 }, F, N, pfx_no }, /* psllq */
+ { { 0x73, 0xf0 }, { 3, 3 }, F, N, pfx_66 }, /* psllq */
+ { { 0x73, 0xf8 }, { 3, 3 }, F, N, pfx_66 }, /* pslldq */
+ { { 0x74 }, { 2, 2 }, T, R, pfx_no }, /* pcmpeqb */
+ { { 0x74 }, { 2, 2 }, T, R, pfx_66 }, /* pcmpeqb */
+ { { 0x75 }, { 2, 2 }, T, R, pfx_no }, /* pcmpeqw */
+ { { 0x75 }, { 2, 2 }, T, R, pfx_66 }, /* pcmpeqw */
+ { { 0x76 }, { 2, 2 }, T, R, pfx_no }, /* pcmpeqd */
+ { { 0x76 }, { 2, 2 }, T, R, pfx_66 }, /* pcmpeqd */
+ { { 0x77 }, { 1, 1 }, F, N }, /* emms */
/*{ 0x78 }, { 2, 2 }, T, W }, vmread */
+ { { 0x78, 0xc0 }, { 4, 4 }, F, N, pfx_66 }, /* extrq */
+ { { 0x78, 0xc0 }, { 4, 4 }, F, N, pfx_f2 }, /* insertq */
{ { 0x79 }, { 2, 2 }, T, R }, /* vmwrite */
+ { { 0x79, 0xc0 }, { 2, 2 }, F, N, pfx_66 }, /* extrq */
+ { { 0x79, 0xc0 }, { 2, 2 }, F, N, pfx_f2 }, /* insertq */
+ { { 0x7c }, { 2, 2 }, T, R, pfx_66 }, /* haddpd */
+ { { 0x7c }, { 2, 2 }, T, R, pfx_f2 }, /* haddps */
+ { { 0x7d }, { 2, 2 }, T, R, pfx_66 }, /* hsubpd */
+ { { 0x7d }, { 2, 2 }, T, R, pfx_f2 }, /* hsubps */
+ { { 0x7e }, { 2, 2 }, T, W, pfx_no }, /* movd */
+ { { 0x7e }, { 2, 2 }, T, W, pfx_66 }, /* movd */
+ { { 0x7e }, { 2, 2 }, T, R, pfx_f3 }, /* movq */
+ { { 0x7f }, { 2, 2 }, T, W, pfx_no }, /* movq */
+ { { 0x7f }, { 2, 2 }, T, W, pfx_66 }, /* movdqa */
+ { { 0x7f }, { 2, 2 }, T, W, pfx_f3 }, /* movdqu */
CND(0x80, { 5, 5 }, F, N ), /* j<cc> */
CND(0x90, { 2, 2 }, T, W ), /* set<cc> */
{ { 0xa0 }, { 1, 1 }, F, W }, /* push %fs */
@@ -484,7 +655,17 @@ static const struct {
{ { 0xbf }, { 2, 2 }, F, R }, /* movsx */
{ { 0xc0 }, { 2, 2 }, F, W }, /* xadd */
{ { 0xc1 }, { 2, 2 }, F, W }, /* xadd */
+ { { 0xc2 }, { 3, 3 }, T, R, pfx_no }, /* cmpps */
+ { { 0xc2 }, { 3, 3 }, T, R, pfx_66 }, /* cmppd */
+ { { 0xc2 }, { 3, 3 }, T, R, pfx_f3 }, /* cmpss */
+ { { 0xc2 }, { 3, 3 }, T, R, pfx_f2 }, /* cmpsd */
{ { 0xc3 }, { 2, 2 }, F, W }, /* movnti */
+ { { 0xc4 }, { 3, 3 }, T, R, pfx_no }, /* pinsrw */
+ { { 0xc4 }, { 3, 3 }, T, R, pfx_66 }, /* pinsrw */
+ { { 0xc5, 0xc0 }, { 3, 3 }, F, N, pfx_no }, /* pextrw */
+ { { 0xc5, 0xc0 }, { 3, 3 }, F, N, pfx_66 }, /* pextrw */
+ { { 0xc6 }, { 3, 3 }, T, R, pfx_no }, /* shufps */
+ { { 0xc6 }, { 3, 3 }, T, R, pfx_66 }, /* shufpd */
{ { 0xc7, 0x08 }, { 2, 2 }, F, W }, /* cmpxchg8b */
{ { 0xc7, 0x18 }, { 2, 2 }, F, R }, /* xrstors */
{ { 0xc7, 0x20 }, { 2, 2 }, F, W }, /* xsavec */
@@ -497,11 +678,179 @@ static const struct {
{ { 0xc7, 0xf8 }, { 2, 2 }, F, N }, /* rdseed */
{ { 0xc7, 0xf8 }, { 2, 2 }, F, N, pfx_f3 }, /* rdpid */
REG(0xc8, { 1, 1 }, F, N ), /* bswap */
+ { { 0xd0 }, { 2, 2 }, T, R, pfx_66 }, /* addsubpd */
+ { { 0xd0 }, { 2, 2 }, T, R, pfx_f2 }, /* addsubps */
+ { { 0xd1 }, { 2, 2 }, T, R, pfx_no }, /* psrlw */
+ { { 0xd1 }, { 2, 2 }, T, R, pfx_66 }, /* psrlw */
+ { { 0xd2 }, { 2, 2 }, T, R, pfx_no }, /* psrld */
+ { { 0xd2 }, { 2, 2 }, T, R, pfx_66 }, /* psrld */
+ { { 0xd3 }, { 2, 2 }, T, R, pfx_no }, /* psrlq */
+ { { 0xd3 }, { 2, 2 }, T, R, pfx_66 }, /* psrlq */
+ { { 0xd4 }, { 2, 2 }, T, R, pfx_no }, /* paddq */
+ { { 0xd4 }, { 2, 2 }, T, R, pfx_66 }, /* paddq */
+ { { 0xd5 }, { 2, 2 }, T, R, pfx_no }, /* pmullw */
+ { { 0xd5 }, { 2, 2 }, T, R, pfx_66 }, /* pmullw */
+ { { 0xd6 }, { 2, 2 }, T, W, pfx_66 }, /* movq */
+ { { 0xd6, 0xc0 }, { 2, 2 }, F, N, pfx_f3 }, /* movq2dq */
+ { { 0xd6, 0xc0 }, { 2, 2 }, F, N, pfx_f2 }, /* movdq2q */
+ { { 0xd7, 0xc0 }, { 2, 2 }, F, N, pfx_no }, /* pmovmskb */
+ { { 0xd7, 0xc0 }, { 2, 2 }, F, N, pfx_66 }, /* pmovmskb */
+ { { 0xd8 }, { 2, 2 }, T, R, pfx_no }, /* psubusb */
+ { { 0xd8 }, { 2, 2 }, T, R, pfx_66 }, /* psubusb */
+ { { 0xd9 }, { 2, 2 }, T, R, pfx_no }, /* psubusw */
+ { { 0xd9 }, { 2, 2 }, T, R, pfx_66 }, /* psubusw */
+ { { 0xda }, { 2, 2 }, T, R, pfx_no }, /* pminub */
+ { { 0xda }, { 2, 2 }, T, R, pfx_66 }, /* pminub */
+ { { 0xdb }, { 2, 2 }, T, R, pfx_no }, /* pand */
+ { { 0xdb }, { 2, 2 }, T, R, pfx_66 }, /* pand */
+ { { 0xdc }, { 2, 2 }, T, R, pfx_no }, /* paddusb */
+ { { 0xdc }, { 2, 2 }, T, R, pfx_66 }, /* paddusb */
+ { { 0xdd }, { 2, 2 }, T, R, pfx_no }, /* paddusw */
+ { { 0xdd }, { 2, 2 }, T, R, pfx_66 }, /* paddusw */
+ { { 0xde }, { 2, 2 }, T, R, pfx_no }, /* pmaxub */
+ { { 0xde }, { 2, 2 }, T, R, pfx_66 }, /* pmaxub */
+ { { 0xdf }, { 2, 2 }, T, R, pfx_no }, /* pandn */
+ { { 0xdf }, { 2, 2 }, T, R, pfx_66 }, /* pandn */
+ { { 0xe0 }, { 2, 2 }, T, R, pfx_no }, /* pavgb */
+ { { 0xe0 }, { 2, 2 }, T, R, pfx_66 }, /* pavgb */
+ { { 0xe1 }, { 2, 2 }, T, R, pfx_no }, /* psraw */
+ { { 0xe1 }, { 2, 2 }, T, R, pfx_66 }, /* psraw */
+ { { 0xe2 }, { 2, 2 }, T, R, pfx_no }, /* psrad */
+ { { 0xe2 }, { 2, 2 }, T, R, pfx_66 }, /* psrad */
+ { { 0xe3 }, { 2, 2 }, T, R, pfx_no }, /* pavgw */
+ { { 0xe3 }, { 2, 2 }, T, R, pfx_66 }, /* pavgw */
+ { { 0xe4 }, { 2, 2 }, T, R, pfx_no }, /* pmulhuw */
+ { { 0xe4 }, { 2, 2 }, T, R, pfx_66 }, /* pmulhuw */
+ { { 0xe5 }, { 2, 2 }, T, R, pfx_no }, /* pmulhw */
+ { { 0xe5 }, { 2, 2 }, T, R, pfx_66 }, /* pmulhw */
+ { { 0xe6 }, { 2, 2 }, T, R, pfx_66 }, /* cvttpd2dq */
+ { { 0xe6 }, { 2, 2 }, T, R, pfx_f3 }, /* cvtdq2pd */
+ { { 0xe6 }, { 2, 2 }, T, R, pfx_f2 }, /* cvtpd2dq */
+ { { 0xe7 }, { 2, 2 }, F, W, pfx_no }, /* movntq */
+ { { 0xe7 }, { 2, 2 }, F, W, pfx_66 }, /* movntdq */
+ { { 0xe8 }, { 2, 2 }, T, R, pfx_no }, /* psubsb */
+ { { 0xe8 }, { 2, 2 }, T, R, pfx_66 }, /* psubsb */
+ { { 0xe9 }, { 2, 2 }, T, R, pfx_no }, /* psubsw */
+ { { 0xe9 }, { 2, 2 }, T, R, pfx_66 }, /* psubsw */
+ { { 0xea }, { 2, 2 }, T, R, pfx_no }, /* pminsw */
+ { { 0xea }, { 2, 2 }, T, R, pfx_66 }, /* pminsw */
+ { { 0xeb }, { 2, 2 }, T, R, pfx_no }, /* por */
+ { { 0xeb }, { 2, 2 }, T, R, pfx_66 }, /* por */
+ { { 0xec }, { 2, 2 }, T, R, pfx_no }, /* paddsb */
+ { { 0xec }, { 2, 2 }, T, R, pfx_66 }, /* paddsb */
+ { { 0xed }, { 2, 2 }, T, R, pfx_no }, /* paddsw */
+ { { 0xed }, { 2, 2 }, T, R, pfx_66 }, /* paddsw */
+ { { 0xee }, { 2, 2 }, T, R, pfx_no }, /* pmaxsw */
+ { { 0xee }, { 2, 2 }, T, R, pfx_66 }, /* pmaxsw */
+ { { 0xef }, { 2, 2 }, T, R, pfx_no }, /* pxor */
+ { { 0xef }, { 2, 2 }, T, R, pfx_66 }, /* pxor */
+ { { 0xf0 }, { 2, 2 }, T, R, pfx_f2 }, /* lddqu */
+ { { 0xf1 }, { 2, 2 }, T, R, pfx_no }, /* psllw */
+ { { 0xf1 }, { 2, 2 }, T, R, pfx_66 }, /* psllw */
+ { { 0xf2 }, { 2, 2 }, T, R, pfx_no }, /* pslld */
+ { { 0xf2 }, { 2, 2 }, T, R, pfx_66 }, /* pslld */
+ { { 0xf3 }, { 2, 2 }, T, R, pfx_no }, /* psllq */
+ { { 0xf3 }, { 2, 2 }, T, R, pfx_66 }, /* psllq */
+ { { 0xf4 }, { 2, 2 }, T, R, pfx_no }, /* pmuludq */
+ { { 0xf4 }, { 2, 2 }, T, R, pfx_66 }, /* pmuludq */
+ { { 0xf5 }, { 2, 2 }, T, R, pfx_no }, /* pmaddwd */
+ { { 0xf5 }, { 2, 2 }, T, R, pfx_66 }, /* pmaddwd */
+ { { 0xf6 }, { 2, 2 }, T, R, pfx_no }, /* psadbw */
+ { { 0xf6 }, { 2, 2 }, T, R, pfx_66 }, /* psadbw */
+ { { 0xf7, 0xc0 }, { 2, 2 }, F, W, pfx_no }, /* maskmovq */
+ { { 0xf7, 0xc0 }, { 2, 2 }, F, W, pfx_66 }, /* maskmovdqu */
+ { { 0xf8 }, { 2, 2 }, T, R, pfx_no }, /* psubb */
+ { { 0xf8 }, { 2, 2 }, T, R, pfx_66 }, /* psubb */
+ { { 0xf9 }, { 2, 2 }, T, R, pfx_no }, /* psubw */
+ { { 0xf9 }, { 2, 2 }, T, R, pfx_66 }, /* psubw */
+ { { 0xfa }, { 2, 2 }, T, R, pfx_no }, /* psubd */
+ { { 0xfa }, { 2, 2 }, T, R, pfx_66 }, /* psubd */
+ { { 0xfb }, { 2, 2 }, T, R, pfx_no }, /* psubq */
+ { { 0xfb }, { 2, 2 }, T, R, pfx_66 }, /* psubq */
+ { { 0xfc }, { 2, 2 }, T, R, pfx_no }, /* paddb */
+ { { 0xfc }, { 2, 2 }, T, R, pfx_66 }, /* paddb */
+ { { 0xfd }, { 2, 2 }, T, R, pfx_no }, /* paddw */
+ { { 0xfd }, { 2, 2 }, T, R, pfx_66 }, /* paddw */
+ { { 0xfe }, { 2, 2 }, T, R, pfx_no }, /* paddd */
+ { { 0xfe }, { 2, 2 }, T, R, pfx_66 }, /* paddd */
{ { 0xff }, { 2, 2 }, F, N }, /* ud0 */
}, legacy_0f38[] = {
+ { { 0x00 }, { 2, 2 }, T, R, pfx_no }, /* pshufb */
+ { { 0x00 }, { 2, 2 }, T, R, pfx_66 }, /* pshufb */
+ { { 0x01 }, { 2, 2 }, T, R, pfx_no }, /* phaddw */
+ { { 0x01 }, { 2, 2 }, T, R, pfx_66 }, /* phaddw */
+ { { 0x02 }, { 2, 2 }, T, R, pfx_no }, /* phaddd */
+ { { 0x02 }, { 2, 2 }, T, R, pfx_66 }, /* phaddd */
+ { { 0x03 }, { 2, 2 }, T, R, pfx_no }, /* phaddsw */
+ { { 0x03 }, { 2, 2 }, T, R, pfx_66 }, /* phaddsw */
+ { { 0x04 }, { 2, 2 }, T, R, pfx_no }, /* pmaddubsw */
+ { { 0x04 }, { 2, 2 }, T, R, pfx_66 }, /* pmaddubsw */
+ { { 0x05 }, { 2, 2 }, T, R, pfx_no }, /* phsubw */
+ { { 0x05 }, { 2, 2 }, T, R, pfx_66 }, /* phsubw */
+ { { 0x06 }, { 2, 2 }, T, R, pfx_no }, /* phsubd */
+ { { 0x06 }, { 2, 2 }, T, R, pfx_66 }, /* phsubd */
+ { { 0x07 }, { 2, 2 }, T, R, pfx_no }, /* phsubsw */
+ { { 0x07 }, { 2, 2 }, T, R, pfx_66 }, /* phsubsw */
+ { { 0x08 }, { 2, 2 }, T, R, pfx_no }, /* psignb */
+ { { 0x08 }, { 2, 2 }, T, R, pfx_66 }, /* psignb */
+ { { 0x09 }, { 2, 2 }, T, R, pfx_no }, /* psignw */
+ { { 0x09 }, { 2, 2 }, T, R, pfx_66 }, /* psignw */
+ { { 0x0a }, { 2, 2 }, T, R, pfx_no }, /* psignd */
+ { { 0x0a }, { 2, 2 }, T, R, pfx_66 }, /* psignd */
+ { { 0x0b }, { 2, 2 }, T, R, pfx_no }, /* pmulhrsw */
+ { { 0x0b }, { 2, 2 }, T, R, pfx_66 }, /* pmulhrsw */
+ { { 0x10 }, { 2, 2 }, T, R, pfx_66 }, /* pblendvb */
+ { { 0x14 }, { 2, 2 }, T, R, pfx_66 }, /* blendvps */
+ { { 0x15 }, { 2, 2 }, T, R, pfx_66 }, /* blendvpd */
+ { { 0x17 }, { 2, 2 }, T, R, pfx_66 }, /* ptest */
+ { { 0x1c }, { 2, 2 }, T, R, pfx_no }, /* pabsb */
+ { { 0x1c }, { 2, 2 }, T, R, pfx_66 }, /* pabsb */
+ { { 0x1d }, { 2, 2 }, T, R, pfx_no }, /* pabsw */
+ { { 0x1d }, { 2, 2 }, T, R, pfx_66 }, /* pabsw */
+ { { 0x1e }, { 2, 2 }, T, R, pfx_no }, /* pabsd */
+ { { 0x1e }, { 2, 2 }, T, R, pfx_66 }, /* pabsd */
+ { { 0x20 }, { 2, 2 }, T, R, pfx_66 }, /* pmovsxbw */
+ { { 0x21 }, { 2, 2 }, T, R, pfx_66 }, /* pmovsxbd */
+ { { 0x22 }, { 2, 2 }, T, R, pfx_66 }, /* pmovsxbq */
+ { { 0x23 }, { 2, 2 }, T, R, pfx_66 }, /* pmovsxwd */
+ { { 0x24 }, { 2, 2 }, T, R, pfx_66 }, /* pmovsxwq */
+ { { 0x25 }, { 2, 2 }, T, R, pfx_66 }, /* pmovsxdq */
+ { { 0x28 }, { 2, 2 }, T, R, pfx_66 }, /* pmuldq */
+ { { 0x29 }, { 2, 2 }, T, R, pfx_66 }, /* pcmpeqq */
+ { { 0x2a }, { 2, 2 }, F, R, pfx_66 }, /* movntdqa */
+ { { 0x2b }, { 2, 2 }, T, R, pfx_66 }, /* packusdw */
+ { { 0x30 }, { 2, 2 }, T, R, pfx_66 }, /* pmovzxbw */
+ { { 0x31 }, { 2, 2 }, T, R, pfx_66 }, /* pmovzxbd */
+ { { 0x32 }, { 2, 2 }, T, R, pfx_66 }, /* pmovzxbq */
+ { { 0x33 }, { 2, 2 }, T, R, pfx_66 }, /* pmovzxwd */
+ { { 0x34 }, { 2, 2 }, T, R, pfx_66 }, /* pmovzxwq */
+ { { 0x35 }, { 2, 2 }, T, R, pfx_66 }, /* pmovzxdq */
+ { { 0x37 }, { 2, 2 }, T, R, pfx_66 }, /* pcmpgtq */
+ { { 0x38 }, { 2, 2 }, T, R, pfx_66 }, /* pminsb */
+ { { 0x39 }, { 2, 2 }, T, R, pfx_66 }, /* pminsd */
+ { { 0x3a }, { 2, 2 }, T, R, pfx_66 }, /* pminuw */
+ { { 0x3b }, { 2, 2 }, T, R, pfx_66 }, /* pminud */
+ { { 0x3c }, { 2, 2 }, T, R, pfx_66 }, /* pmaxsb */
+ { { 0x3d }, { 2, 2 }, T, R, pfx_66 }, /* pmaxsd */
+ { { 0x3e }, { 2, 2 }, T, R, pfx_66 }, /* pmaxuw */
+ { { 0x3f }, { 2, 2 }, T, R, pfx_66 }, /* pmaxud */
+ { { 0x40 }, { 2, 2 }, T, R, pfx_66 }, /* pmulld */
+ { { 0x41 }, { 2, 2 }, T, R, pfx_66 }, /* phminposuw */
{ { 0x80 }, { 2, 2 }, T, R, pfx_66 }, /* invept */
{ { 0x81 }, { 2, 2 }, T, R, pfx_66 }, /* invvpid */
{ { 0x82 }, { 2, 2 }, T, R, pfx_66 }, /* invpcid */
+ { { 0xc8 }, { 2, 2 }, T, R, pfx_no }, /* sha1nexte */
+ { { 0xc9 }, { 2, 2 }, T, R, pfx_no }, /* sha1msg1 */
+ { { 0xca }, { 2, 2 }, T, R, pfx_no }, /* sha1msg2 */
+ { { 0xcb }, { 2, 2 }, T, R, pfx_no }, /* sha256rnds2 */
+ { { 0xcc }, { 2, 2 }, T, R, pfx_no }, /* sha256msg1 */
+ { { 0xcd }, { 2, 2 }, T, R, pfx_no }, /* sha256msg2 */
+ { { 0xcf }, { 2, 2 }, T, R, pfx_66 }, /* gf2p8mulb */
+ { { 0xdb }, { 2, 2 }, T, R, pfx_66 }, /* aesimc */
+ { { 0xdc }, { 2, 2 }, T, R, pfx_66 }, /* aesenc */
+ { { 0xdd }, { 2, 2 }, T, R, pfx_66 }, /* aesenclast */
+ { { 0xde }, { 2, 2 }, T, R, pfx_66 }, /* aesdec */
+ { { 0xdf }, { 2, 2 }, T, R, pfx_66 }, /* aesdeclast */
{ { 0xf0 }, { 2, 2 }, T, R }, /* movbe */
{ { 0xf0 }, { 2, 2 }, T, R, pfx_f2 }, /* crc32 */
{ { 0xf1 }, { 2, 2 }, T, W }, /* movbe */
@@ -517,6 +866,42 @@ static const struct {
};
#undef CND
#undef REG
+
+static const struct {
+ uint8_t opc;
+ uint8_t mem:2;
+ uint8_t pfx:2;
+} legacy_0f3a[] = {
+ { 0x08, R, pfx_66 }, /* roundps */
+ { 0x09, R, pfx_66 }, /* roundpd */
+ { 0x0a, R, pfx_66 }, /* roundss */
+ { 0x0b, R, pfx_66 }, /* roundsd */
+ { 0x0c, R, pfx_66 }, /* blendps */
+ { 0x0d, R, pfx_66 }, /* blendpd */
+ { 0x0e, R, pfx_66 }, /* pblendw */
+ { 0x0f, R, pfx_no }, /* palignr */
+ { 0x0f, R, pfx_66 }, /* palignr */
+ { 0x14, W, pfx_66 }, /* pextrb */
+ { 0x15, W, pfx_66 }, /* pextrw */
+ { 0x16, W, pfx_66 }, /* pextrd */
+ { 0x17, W, pfx_66 }, /* extractps */
+ { 0x20, R, pfx_66 }, /* pinsrb */
+ { 0x21, R, pfx_66 }, /* insertps */
+ { 0x22, R, pfx_66 }, /* pinsrd */
+ { 0x40, R, pfx_66 }, /* dpps */
+ { 0x41, R, pfx_66 }, /* dppd */
+ { 0x42, R, pfx_66 }, /* mpsadbw */
+ { 0x44, R, pfx_66 }, /* pclmulqdq */
+ { 0x60, R, pfx_66 }, /* pcmpestrm */
+ { 0x61, R, pfx_66 }, /* pcmpestri */
+ { 0x62, R, pfx_66 }, /* pcmpistrm */
+ { 0x63, R, pfx_66 }, /* pcmpistri */
+ { 0xcc, R, pfx_no }, /* sha1rnds4 */
+ { 0xce, R, pfx_66 }, /* gf2p8affineqb */
+ { 0xcf, R, pfx_66 }, /* gf2p8affineinvqb */
+ { 0xdf, R, pfx_66 }, /* aeskeygenassist */
+};
+
static const struct {
uint8_t opc[2];
bool modrm:1; /* Should register form (also) be tested? */
@@ -799,6 +1184,23 @@ void predicates_test(void *instr, struct
legacy_0f38[t].mem, ctxt, fetch);
}
+ for ( t = 0; t < ARRAY_SIZE(legacy_0f3a); ++t )
+ {
+ uint8_t *ptr = instr;
+
+ memset(instr + 5, 0xcc, 10);
+ if ( legacy_0f3a[t].pfx )
+ *ptr++ = prefixes[legacy_0f3a[t].pfx - 1];
+ *ptr++ = 0x0f;
+ *ptr++ = 0x3a;
+ *ptr++ = legacy_0f3a[t].opc;
+ *ptr++ = 0x00; /* ModR/M */
+ *ptr++ = 0x00; /* imm8 */
+
+ do_test(instr, (void *)ptr - instr, (void *)ptr - instr - 2,
+ legacy_0f3a[t].mem, ctxt, fetch);
+ }
+
memset(instr + ARRAY_SIZE(fpu[t].opc), 0xcc, 13);
for ( t = 0; t < ARRAY_SIZE(fpu); ++t )
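
For readers following the new legacy_0f3a loop above, here is a minimal sketch of the byte sequence it assembles for each table entry: an optional legacy prefix, the 0f 3a escape, the opcode, a ModR/M byte of 0 (memory operand) and an imm8. The helper name encode_0f3a() is hypothetical. The enum pfx values and the prefixes[] contents are not part of this hunk and are assumed here. The third do_test() argument appears to be the ModR/M offset, which the sketch returns through modrm_off.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors the test's prefix enumeration (not shown in this hunk). */
enum pfx { pfx_no, pfx_66, pfx_f3, pfx_f2 };
static const uint8_t prefixes[] = { 0x66, 0xf3, 0xf2 };

/*
 * Hypothetical helper: encode one 0f3a-map test insn into buf.
 * Returns the insn length and stores the ModR/M byte offset in *modrm_off.
 */
static size_t encode_0f3a(uint8_t *buf, enum pfx pfx, uint8_t opc,
                          size_t *modrm_off)
{
    uint8_t *ptr = buf;

    assert(buf && modrm_off);

    if ( pfx != pfx_no )
        *ptr++ = prefixes[pfx - 1];   /* optional 66 / f3 / f2 prefix */
    *ptr++ = 0x0f;                    /* two-byte opcode escape */
    *ptr++ = 0x3a;                    /* 0f3a opcode map */
    *ptr++ = opc;
    *modrm_off = (size_t)(ptr - buf); /* same value as "length - 2" */
    *ptr++ = 0x00;                    /* ModR/M: mod=0, reg=0, rm=0 (memory) */
    *ptr++ = 0x00;                    /* imm8 */

    return (size_t)(ptr - buf);
}
```

With these assumptions, encode_0f3a(buf, pfx_66, 0x08, &off) (roundps) produces 66 0f 3a 08 00 00 with off equal to 4, and an unprefixed entry such as sha1rnds4 is one byte shorter.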
* Re: [PATCH 03/10] x86emul: extend decoding / mem access testing to MMX / SSE insns
2020-08-03 14:50 ` [PATCH 03/10] x86emul: extend decoding / mem access testing to MMX / SSE insns Jan Beulich
@ 2020-08-03 16:42 ` Andrew Cooper
2020-08-04 6:40 ` Jan Beulich
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Cooper @ 2020-08-03 16:42 UTC (permalink / raw)
To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monné
On 03/08/2020 15:50, Jan Beulich wrote:
> IOW just legacy encoded ones. For 3dNow! just one example is used, as
> they're all similar in nature both encoding- and operand-wise.
>
> Adjust a slightly misleading (but not wrong) memcpy() invocation, as
> noticed while further cloning that code.
I don't see any adjustment, in this or later patches.
Is the comment stale?
~Andrew
* Re: [PATCH 03/10] x86emul: extend decoding / mem access testing to MMX / SSE insns
2020-08-03 16:42 ` Andrew Cooper
@ 2020-08-04 6:40 ` Jan Beulich
0 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2020-08-04 6:40 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monné
On 03.08.2020 18:42, Andrew Cooper wrote:
> On 03/08/2020 15:50, Jan Beulich wrote:
>> IOW just legacy encoded ones. For 3dNow! just one example is used, as
>> they're all similar in nature both encoding- and operand-wise.
>>
>> Adjust a slightly misleading (but not wrong) memcpy() invocation, as
>> noticed while further cloning that code.
>
> I don't see any adjustment, in this or later patches.
>
> Is the comment stale?
Indeed it is, thanks for noticing. That change was merged back into
the patch that has already gone in (and afaict now it was a memset(),
not a memcpy()).
Jan
* [PATCH 04/10] x86emul: extend decoding / mem access testing to VEX-encoded insns
2020-08-03 14:47 [PATCH 00/10] x86emul: full coverage mem access / write testing Jan Beulich
` (2 preceding siblings ...)
2020-08-03 14:50 ` [PATCH 03/10] x86emul: extend decoding / mem access testing to MMX / SSE insns Jan Beulich
@ 2020-08-03 14:51 ` Jan Beulich
2020-08-03 14:51 ` [PATCH 05/10] x86emul: extend decoding / mem access testing to XOP-encoded insns Jan Beulich
` (7 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2020-08-03 14:51 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
Signed-off-by: Jan Beulich <jbeulich@suse.com>
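
As orientation for the tables and loops in the diff below, here is a minimal sketch of how the VEX prefix bytes are put together and how the w / l flags of a table entry select the variants to test. The helper names vex2(), vex3() and wants() are hypothetical and exist only for illustration. The enum pfx values are assumed, chosen to match the VEX.pp encoding implied by "0xf8 | pfx". The WIG/W0/W1 and LIG/L0/L1 values are copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: same ordering as VEX.pp (00 = none, 01 = 66, 10 = f3, 11 = f2). */
enum pfx { pfx_no, pfx_66, pfx_f3, pfx_f2 };

/* Flag values as defined by the patch. */
#define WIG 0
#define W0  1
#define W1  2
#define LIG 0
#define L0  1
#define L1  2

/* Hypothetical: 2-byte VEX prefix (0f map only, W implicitly 0). */
static void vex2(uint8_t out[2], enum pfx pfx, int l1)
{
    out[0] = 0xc5;
    /* bit 7: inverted R, bits 6:3: inverted vvvv (unused), bit 2: L, bits 1:0: pp */
    out[1] = 0xf8 | (l1 ? 4 : 0) | pfx;
}

/* Hypothetical: 3-byte VEX prefix; map is 0 for 0f, 1 for 0f38, 2 for 0f3a. */
static void vex3(uint8_t out[3], unsigned int map, enum pfx pfx, int w1, int l1)
{
    out[0] = 0xc4;
    out[1] = 0xe1 + map;              /* inverted R/X/B all set, mmmmm = map + 1 */
    out[2] = (w1 ? 0xf8 : 0x78)       /* bit 7: W, bits 6:3: inverted vvvv */
             | (l1 ? 4 : 0)           /* bit 2: L */
             | pfx;                   /* bits 1:0: pp */
}

/* An "ignored" field (WIG / LIG, i.e. 0) requests both variants. */
static int wants(unsigned int field, unsigned int bit)
{
    return field == 0 || (field & bit) != 0;
}
```

Under these assumptions, an entry tagged W1, L1 with pfx_66 in the 0f3a table (vpermq) gets the prefix c4 e3 fd. Because the 2-byte form cannot express W=1, it only makes sense for entries where wants(w, W0) holds.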
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -1034,6 +1034,449 @@ static const struct {
{ { 0xdf, 0xe8 }, F, N }, /* fucomip */
{ { 0xdf, 0xf0 }, F, N }, /* fcomip */
};
+
+#define VSIB(n) 0x04 | ((n) << 3), 0x38 /* reg: %xmm<n>, mem: (%eax,%xmm7) */
+
+static const struct vex {
+ uint8_t opc[3];
+ uint8_t len:3;
+ bool modrm:1; /* Should register form (also) be tested? */
+ uint8_t mem:2;
+ uint8_t pfx:2;
+ uint8_t w:2;
+#define WIG 0
+#define W0 1
+#define W1 2
+#define Wn (W0 | W1)
+ uint8_t l:2;
+#define LIG 0
+#define L0 1
+#define L1 2
+#define Ln (L0 | L1)
+} vex_0f[] = {
+ { { 0x10 }, 2, T, R, pfx_no, WIG, Ln }, /* vmovups */
+ { { 0x10 }, 2, T, R, pfx_66, WIG, Ln }, /* vmovupd */
+ { { 0x10 }, 2, T, R, pfx_f3, WIG, LIG }, /* vmovss */
+ { { 0x10 }, 2, T, R, pfx_f2, WIG, LIG }, /* vmovsd */
+ { { 0x11 }, 2, T, W, pfx_no, WIG, Ln }, /* vmovups */
+ { { 0x11 }, 2, T, W, pfx_66, WIG, Ln }, /* vmovupd */
+ { { 0x11 }, 2, T, W, pfx_f3, WIG, LIG }, /* vmovss */
+ { { 0x11 }, 2, T, W, pfx_f2, WIG, LIG }, /* vmovsd */
+ { { 0x12 }, 2, T, R, pfx_no, WIG, L0 }, /* vmovlps / vmovhlps */
+ { { 0x12 }, 2, F, R, pfx_66, WIG, L0 }, /* vmovlpd */
+ { { 0x12 }, 2, T, R, pfx_f3, WIG, Ln }, /* vmovsldup */
+ { { 0x12 }, 2, T, R, pfx_f2, WIG, Ln }, /* vmovddup */
+ { { 0x13 }, 2, F, W, pfx_no, WIG, L0 }, /* vmovlps */
+ { { 0x13 }, 2, F, W, pfx_66, WIG, L0 }, /* vmovlpd */
+ { { 0x14 }, 2, T, R, pfx_no, WIG, Ln }, /* vunpcklps */
+ { { 0x14 }, 2, T, R, pfx_66, WIG, Ln }, /* vunpcklpd */
+ { { 0x15 }, 2, T, R, pfx_no, WIG, Ln }, /* vunpckhps */
+ { { 0x15 }, 2, T, R, pfx_66, WIG, Ln }, /* vunpckhpd */
+ { { 0x16 }, 2, T, R, pfx_no, WIG, L0 }, /* vmovhps / vmovlhps */
+ { { 0x16 }, 2, F, R, pfx_66, WIG, L0 }, /* vmovhpd */
+ { { 0x16 }, 2, T, R, pfx_f3, WIG, Ln }, /* vmovshdup */
+ { { 0x17 }, 2, F, W, pfx_no, WIG, L0 }, /* vmovhps */
+ { { 0x17 }, 2, F, W, pfx_66, WIG, L0 }, /* vmovhpd */
+ { { 0x28 }, 2, T, R, pfx_no, WIG, Ln }, /* vmovaps */
+ { { 0x28 }, 2, T, R, pfx_66, WIG, Ln }, /* vmovapd */
+ { { 0x29 }, 2, T, W, pfx_no, WIG, Ln }, /* vmovaps */
+ { { 0x29 }, 2, T, W, pfx_66, WIG, Ln }, /* vmovapd */
+ { { 0x2a }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvtsi2ss */
+ { { 0x2a }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvtsi2sd */
+ { { 0x2b }, 2, T, W, pfx_no, WIG, Ln }, /* vmovntps */
+ { { 0x2b }, 2, T, W, pfx_66, WIG, Ln }, /* vmovntpd */
+ { { 0x2c }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvttss2si */
+ { { 0x2c }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvttsd2si */
+ { { 0x2d }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvtss2si */
+ { { 0x2d }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvtsd2si */
+ { { 0x2e }, 2, T, R, pfx_no, WIG, LIG }, /* vucomiss */
+ { { 0x2e }, 2, T, R, pfx_66, WIG, LIG }, /* vucomisd */
+ { { 0x2f }, 2, T, R, pfx_no, WIG, LIG }, /* vcomiss */
+ { { 0x2f }, 2, T, R, pfx_66, WIG, LIG }, /* vcomisd */
+ { { 0x41, 0xc0 }, 2, F, N, pfx_no, Wn, L1 }, /* kand{w,q} */
+ { { 0x41, 0xc0 }, 2, F, N, pfx_66, Wn, L1 }, /* kand{b,d} */
+ { { 0x42, 0xc0 }, 2, F, N, pfx_no, Wn, L1 }, /* kandn{w,q} */
+ { { 0x42, 0xc0 }, 2, F, N, pfx_66, Wn, L1 }, /* kandn{b,d} */
+ { { 0x44, 0xc0 }, 2, F, N, pfx_no, Wn, L0 }, /* knot{w,q} */
+ { { 0x44, 0xc0 }, 2, F, N, pfx_66, Wn, L0 }, /* knot{b,d} */
+ { { 0x45, 0xc0 }, 2, F, N, pfx_no, Wn, L1 }, /* kor{w,q} */
+ { { 0x45, 0xc0 }, 2, F, N, pfx_66, Wn, L1 }, /* kor{b,d} */
+ { { 0x46, 0xc0 }, 2, F, N, pfx_no, Wn, L1 }, /* kxnor{w,q} */
+ { { 0x46, 0xc0 }, 2, F, N, pfx_66, Wn, L1 }, /* kxnor{b,d} */
+ { { 0x47, 0xc0 }, 2, F, N, pfx_no, Wn, L1 }, /* kxor{w,q} */
+ { { 0x47, 0xc0 }, 2, F, N, pfx_66, Wn, L1 }, /* kxor{b,d} */
+ { { 0x4a, 0xc0 }, 2, F, N, pfx_no, Wn, L1 }, /* kadd{w,q} */
+ { { 0x4a, 0xc0 }, 2, F, N, pfx_66, Wn, L1 }, /* kadd{b,d} */
+ { { 0x4b, 0xc0 }, 2, F, N, pfx_no, Wn, L1 }, /* kunpck{wd,dq} */
+ { { 0x4b, 0xc0 }, 2, F, N, pfx_66, W0, L1 }, /* kunpckbw */
+ { { 0x50, 0xc0 }, 2, F, N, pfx_no, WIG, Ln }, /* vmovmskps */
+ { { 0x50, 0xc0 }, 2, F, N, pfx_66, WIG, Ln }, /* vmovmskpd */
+ { { 0x51 }, 2, T, R, pfx_no, WIG, Ln }, /* vsqrtps */
+ { { 0x51 }, 2, T, R, pfx_66, WIG, Ln }, /* vsqrtpd */
+ { { 0x51 }, 2, T, R, pfx_f3, WIG, LIG }, /* vsqrtss */
+ { { 0x51 }, 2, T, R, pfx_f2, WIG, LIG }, /* vsqrtsd */
+ { { 0x52 }, 2, T, R, pfx_no, WIG, Ln }, /* vrsqrtps */
+ { { 0x52 }, 2, T, R, pfx_f3, WIG, LIG }, /* vrsqrtss */
+ { { 0x53 }, 2, T, R, pfx_no, WIG, Ln }, /* vrcpps */
+ { { 0x53 }, 2, T, R, pfx_f3, WIG, LIG }, /* vrcpss */
+ { { 0x54 }, 2, T, R, pfx_no, WIG, Ln }, /* vandps */
+ { { 0x54 }, 2, T, R, pfx_66, WIG, Ln }, /* vandpd */
+ { { 0x55 }, 2, T, R, pfx_no, WIG, Ln }, /* vandnps */
+ { { 0x55 }, 2, T, R, pfx_66, WIG, Ln }, /* vandnpd */
+ { { 0x56 }, 2, T, R, pfx_no, WIG, Ln }, /* vorps */
+ { { 0x56 }, 2, T, R, pfx_66, WIG, Ln }, /* vorpd */
+ { { 0x57 }, 2, T, R, pfx_no, WIG, Ln }, /* vxorps */
+ { { 0x57 }, 2, T, R, pfx_66, WIG, Ln }, /* vxorpd */
+ { { 0x58 }, 2, T, R, pfx_no, WIG, Ln }, /* vaddps */
+ { { 0x58 }, 2, T, R, pfx_66, WIG, Ln }, /* vaddpd */
+ { { 0x58 }, 2, T, R, pfx_f3, WIG, LIG }, /* vaddss */
+ { { 0x58 }, 2, T, R, pfx_f2, WIG, LIG }, /* vaddsd */
+ { { 0x59 }, 2, T, R, pfx_no, WIG, Ln }, /* vmulps */
+ { { 0x59 }, 2, T, R, pfx_66, WIG, Ln }, /* vmulpd */
+ { { 0x59 }, 2, T, R, pfx_f3, WIG, LIG }, /* vmulss */
+ { { 0x59 }, 2, T, R, pfx_f2, WIG, LIG }, /* vmulsd */
+ { { 0x5a }, 2, T, R, pfx_no, WIG, Ln }, /* vcvtps2pd */
+ { { 0x5a }, 2, T, R, pfx_66, WIG, Ln }, /* vcvtpd2ps */
+ { { 0x5a }, 2, T, R, pfx_f3, WIG, LIG }, /* vcvtss2sd */
+ { { 0x5a }, 2, T, R, pfx_f2, WIG, LIG }, /* vcvtsd2ss */
+ { { 0x5b }, 2, T, R, pfx_no, WIG, Ln }, /* vcvtdq2ps */
+ { { 0x5b }, 2, T, R, pfx_66, WIG, Ln }, /* vcvtps2dq */
+ { { 0x5b }, 2, T, R, pfx_f3, WIG, Ln }, /* vcvttps2dq */
+ { { 0x5c }, 2, T, R, pfx_no, WIG, Ln }, /* vsubps */
+ { { 0x5c }, 2, T, R, pfx_66, WIG, Ln }, /* vsubpd */
+ { { 0x5c }, 2, T, R, pfx_f3, WIG, LIG }, /* vsubss */
+ { { 0x5c }, 2, T, R, pfx_f2, WIG, LIG }, /* vsubsd */
+ { { 0x5d }, 2, T, R, pfx_no, WIG, Ln }, /* vminps */
+ { { 0x5d }, 2, T, R, pfx_66, WIG, Ln }, /* vminpd */
+ { { 0x5d }, 2, T, R, pfx_f3, WIG, LIG }, /* vminss */
+ { { 0x5d }, 2, T, R, pfx_f2, WIG, LIG }, /* vminsd */
+ { { 0x5e }, 2, T, R, pfx_no, WIG, Ln }, /* vdivps */
+ { { 0x5e }, 2, T, R, pfx_66, WIG, Ln }, /* vdivpd */
+ { { 0x5e }, 2, T, R, pfx_f3, WIG, LIG }, /* vdivss */
+ { { 0x5e }, 2, T, R, pfx_f2, WIG, LIG }, /* vdivsd */
+ { { 0x5f }, 2, T, R, pfx_no, WIG, Ln }, /* vmaxps */
+ { { 0x5f }, 2, T, R, pfx_66, WIG, Ln }, /* vmaxpd */
+ { { 0x5f }, 2, T, R, pfx_f3, WIG, LIG }, /* vmaxss */
+ { { 0x5f }, 2, T, R, pfx_f2, WIG, LIG }, /* vmaxsd */
+ { { 0x60 }, 2, T, R, pfx_66, WIG, Ln }, /* vpunpcklbw */
+ { { 0x61 }, 2, T, R, pfx_66, WIG, Ln }, /* vpunpcklwd */
+ { { 0x62 }, 2, T, R, pfx_66, WIG, Ln }, /* vpunpckldq */
+ { { 0x63 }, 2, T, R, pfx_66, WIG, Ln }, /* vpacksswb */
+ { { 0x64 }, 2, T, R, pfx_66, WIG, Ln }, /* vpcmpgtb */
+ { { 0x65 }, 2, T, R, pfx_66, WIG, Ln }, /* vpcmpgtw */
+ { { 0x66 }, 2, T, R, pfx_66, WIG, Ln }, /* vpcmpgtd */
+ { { 0x67 }, 2, T, R, pfx_66, WIG, Ln }, /* vpackuswb */
+ { { 0x68 }, 2, T, R, pfx_66, WIG, Ln }, /* vpunpckhbw */
+ { { 0x69 }, 2, T, R, pfx_66, WIG, Ln }, /* vpunpckhwd */
+ { { 0x6a }, 2, T, R, pfx_66, WIG, Ln }, /* vpunpckhdq */
+ { { 0x6b }, 2, T, R, pfx_66, WIG, Ln }, /* vpackssdw */
+ { { 0x6c }, 2, T, R, pfx_66, WIG, Ln }, /* vpunpcklqdq */
+ { { 0x6d }, 2, T, R, pfx_66, WIG, Ln }, /* vpunpckhqdq */
+ { { 0x6e }, 2, T, R, pfx_66, Wn, L0 }, /* vmov{d,q} */
+ { { 0x6f }, 2, T, R, pfx_66, WIG, Ln }, /* vmovdqa */
+ { { 0x6f }, 2, T, R, pfx_f3, WIG, Ln }, /* vmovdqu */
+ { { 0x70 }, 3, T, R, pfx_66, WIG, Ln }, /* vpshufd */
+ { { 0x70 }, 3, T, R, pfx_f3, WIG, Ln }, /* vpshuflw */
+ { { 0x70 }, 3, T, R, pfx_f2, WIG, Ln }, /* vpshufhw */
+ { { 0x71, 0xd0 }, 3, F, N, pfx_66, WIG, Ln }, /* vpsrlw */
+ { { 0x71, 0xe0 }, 3, F, N, pfx_66, WIG, Ln }, /* vpsraw */
+ { { 0x71, 0xf0 }, 3, F, N, pfx_66, WIG, Ln }, /* vpsllw */
+ { { 0x72, 0xd0 }, 3, F, N, pfx_66, WIG, Ln }, /* vpsrld */
+ { { 0x72, 0xe0 }, 3, F, N, pfx_66, WIG, Ln }, /* vpsrad */
+ { { 0x72, 0xf0 }, 3, F, N, pfx_66, WIG, Ln }, /* vpslld */
+ { { 0x73, 0xd0 }, 3, F, N, pfx_66, WIG, Ln }, /* vpsrlq */
+ { { 0x73, 0xd8 }, 3, F, N, pfx_66, WIG, Ln }, /* vpsrldq */
+ { { 0x73, 0xf0 }, 3, F, N, pfx_66, WIG, Ln }, /* vpsllq */
+ { { 0x73, 0xf8 }, 3, F, N, pfx_66, WIG, Ln }, /* vpslldq */
+ { { 0x74 }, 2, T, R, pfx_66, WIG, Ln }, /* vpcmpeqb */
+ { { 0x75 }, 2, T, R, pfx_66, WIG, Ln }, /* vpcmpeqw */
+ { { 0x76 }, 2, T, R, pfx_66, WIG, Ln }, /* vpcmpeqd */
+ { { 0x77 }, 1, F, N, pfx_no, WIG, Ln }, /* vzero{upper,all} */
+ { { 0x7c }, 2, T, R, pfx_66, WIG, Ln }, /* vhaddpd */
+ { { 0x7c }, 2, T, R, pfx_f2, WIG, Ln }, /* vhaddps */
+ { { 0x7d }, 2, T, R, pfx_66, WIG, Ln }, /* vhsubpd */
+ { { 0x7d }, 2, T, R, pfx_f2, WIG, Ln }, /* vhsubps */
+ { { 0x7e }, 2, T, W, pfx_66, Wn, L0 }, /* vmov{d,q} */
+ { { 0x7e }, 2, T, R, pfx_f3, WIG, L0 }, /* vmovq */
+ { { 0x7f }, 2, T, W, pfx_66, WIG, Ln }, /* vmovdqa */
+ { { 0x7f }, 2, T, W, pfx_f3, WIG, Ln }, /* vmovdqu */
+ { { 0x90 }, 2, T, R, pfx_no, Wn, L0 }, /* kmov{w,q} */
+ { { 0x90 }, 2, T, R, pfx_66, Wn, L0 }, /* kmov{b,d} */
+ { { 0x91 }, 2, F, W, pfx_no, Wn, L0 }, /* kmov{w,q} */
+ { { 0x91 }, 2, F, W, pfx_66, Wn, L0 }, /* kmov{b,d} */
+ { { 0x92, 0xc0 }, 2, F, N, pfx_no, W0, L0 }, /* kmovw */
+ { { 0x92, 0xc0 }, 2, F, N, pfx_66, W0, L0 }, /* kmovb */
+ { { 0x92, 0xc0 }, 2, F, N, pfx_f2, Wn, L0 }, /* kmov{d,q} */
+ { { 0x93, 0xc0 }, 2, F, N, pfx_no, W0, L0 }, /* kmovw */
+ { { 0x93, 0xc0 }, 2, F, N, pfx_66, W0, L0 }, /* kmovb */
+ { { 0x93, 0xc0 }, 2, F, N, pfx_f2, Wn, L0 }, /* kmov{d,q} */
+ { { 0x98, 0xc0 }, 2, F, N, pfx_no, Wn, L0 }, /* kortest{w,q} */
+ { { 0x98, 0xc0 }, 2, F, N, pfx_66, Wn, L0 }, /* kortest{b,d} */
+ { { 0x99, 0xc0 }, 2, F, N, pfx_no, Wn, L0 }, /* ktest{w,q} */
+ { { 0x99, 0xc0 }, 2, F, N, pfx_66, Wn, L0 }, /* ktest{b,d} */
+ { { 0xae, 0x10 }, 2, F, R, pfx_no, WIG, L0 }, /* vldmxcsr */
+ { { 0xae, 0x18 }, 2, F, W, pfx_no, WIG, L0 }, /* vstmxcsr */
+ { { 0xc2 }, 3, T, R, pfx_no, WIG, Ln }, /* vcmpps */
+ { { 0xc2 }, 3, T, R, pfx_66, WIG, Ln }, /* vcmppd */
+ { { 0xc2 }, 3, T, R, pfx_f3, WIG, LIG }, /* vcmpss */
+ { { 0xc2 }, 3, T, R, pfx_f2, WIG, LIG }, /* vcmpsd */
+ { { 0xc4 }, 3, T, R, pfx_66, WIG, L0 }, /* vpinsrw */
+ { { 0xc5, 0xc0 }, 3, F, N, pfx_66, WIG, L0 }, /* vpextrw */
+ { { 0xc6 }, 3, T, R, pfx_no, WIG, Ln }, /* vshufps */
+ { { 0xc6 }, 3, T, R, pfx_66, WIG, Ln }, /* vshufpd */
+ { { 0xd0 }, 2, T, R, pfx_66, WIG, Ln }, /* vaddsubpd */
+ { { 0xd0 }, 2, T, R, pfx_f2, WIG, Ln }, /* vaddsubps */
+ { { 0xd1 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsrlw */
+ { { 0xd2 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsrld */
+ { { 0xd3 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsrlq */
+ { { 0xd4 }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddq */
+ { { 0xd5 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmullw */
+ { { 0xd6 }, 2, T, W, pfx_66, WIG, L0 }, /* vmovq */
+ { { 0xd7, 0xc0 }, 2, F, N, pfx_66, WIG, Ln }, /* vpmovmskb */
+ { { 0xd8 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubusb */
+ { { 0xd9 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubusw */
+ { { 0xda }, 2, T, R, pfx_66, WIG, Ln }, /* vpminub */
+ { { 0xdb }, 2, T, R, pfx_66, WIG, Ln }, /* vpand */
+ { { 0xdc }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddusb */
+ { { 0xdd }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddusw */
+ { { 0xde }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaxub */
+ { { 0xdf }, 2, T, R, pfx_66, WIG, Ln }, /* vpandn */
+ { { 0xe0 }, 2, T, R, pfx_66, WIG, Ln }, /* vpavgb */
+ { { 0xe1 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsraw */
+ { { 0xe2 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsrad */
+ { { 0xe3 }, 2, T, R, pfx_66, WIG, Ln }, /* vpavgw */
+ { { 0xe4 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmulhuw */
+ { { 0xe5 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmulhw */
+ { { 0xe6 }, 2, T, R, pfx_66, WIG, Ln }, /* vcvttpd2dq */
+ { { 0xe6 }, 2, T, R, pfx_f3, WIG, Ln }, /* vcvtdq2pd */
+ { { 0xe6 }, 2, T, R, pfx_f2, WIG, Ln }, /* vcvtpd2dq */
+ { { 0xe7 }, 2, F, W, pfx_66, WIG, Ln }, /* vmovntdq */
+ { { 0xe8 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubsb */
+ { { 0xe9 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubsw */
+ { { 0xea }, 2, T, R, pfx_66, WIG, Ln }, /* vpminsw */
+ { { 0xeb }, 2, T, R, pfx_66, WIG, Ln }, /* vpor */
+ { { 0xec }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddsb */
+ { { 0xed }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddsw */
+ { { 0xee }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaxsw */
+ { { 0xef }, 2, T, R, pfx_66, WIG, Ln }, /* vpxor */
+ { { 0xf0 }, 2, T, R, pfx_f2, WIG, Ln }, /* vlddqu */
+ { { 0xf1 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsllw */
+ { { 0xf2 }, 2, T, R, pfx_66, WIG, Ln }, /* vpslld */
+ { { 0xf3 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsllq */
+ { { 0xf4 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmuludq */
+ { { 0xf5 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaddwd */
+ { { 0xf6 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsadbw */
+ { { 0xf7, 0xc0 }, 2, F, W, pfx_66, WIG, L0 }, /* vmaskmovdqu */
+ { { 0xf8 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubb */
+ { { 0xf9 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubw */
+ { { 0xfa }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubd */
+ { { 0xfb }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubq */
+ { { 0xfc }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddb */
+ { { 0xfd }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddw */
+ { { 0xfe }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddd */
+}, vex_0f38[] = {
+ { { 0x00 }, 2, T, R, pfx_66, WIG, Ln }, /* vpshufb */
+ { { 0x01 }, 2, T, R, pfx_66, WIG, Ln }, /* vphaddw */
+ { { 0x02 }, 2, T, R, pfx_66, WIG, Ln }, /* vphaddd */
+ { { 0x03 }, 2, T, R, pfx_66, WIG, Ln }, /* vphaddsw */
+ { { 0x04 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaddubsw */
+ { { 0x05 }, 2, T, R, pfx_66, WIG, Ln }, /* vphsubw */
+ { { 0x06 }, 2, T, R, pfx_66, WIG, Ln }, /* vphsubd */
+ { { 0x07 }, 2, T, R, pfx_66, WIG, Ln }, /* vphsubsw */
+ { { 0x08 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsignb */
+ { { 0x09 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsignw */
+ { { 0x0a }, 2, T, R, pfx_66, WIG, Ln }, /* vpsignd */
+ { { 0x0b }, 2, T, R, pfx_66, WIG, Ln }, /* vpmulhrsw */
+ { { 0x0c }, 2, T, R, pfx_66, W0, Ln }, /* vpermilps */
+ { { 0x0d }, 2, T, R, pfx_66, W0, Ln }, /* vpermilpd */
+ { { 0x0e }, 2, T, R, pfx_66, W0, Ln }, /* vtestps */
+ { { 0x0f }, 2, T, R, pfx_66, W0, Ln }, /* vtestpd */
+ { { 0x13 }, 2, T, R, pfx_66, W0, Ln }, /* vcvtph2ps */
+ { { 0x16 }, 2, T, R, pfx_66, W0, L1 }, /* vpermps */
+ { { 0x17 }, 2, T, R, pfx_66, WIG, Ln }, /* vptest */
+ { { 0x18 }, 2, T, R, pfx_66, W0, Ln }, /* vbroadcastss */
+ { { 0x19 }, 2, T, R, pfx_66, W0, L1 }, /* vbroadcastsd */
+ { { 0x1a }, 2, F, R, pfx_66, W0, L1 }, /* vbroadcastf128 */
+ { { 0x1c }, 2, T, R, pfx_66, WIG, Ln }, /* vpabsb */
+ { { 0x1d }, 2, T, R, pfx_66, WIG, Ln }, /* vpabsw */
+ { { 0x1e }, 2, T, R, pfx_66, WIG, Ln }, /* vpabsd */
+ { { 0x20 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovsxbw */
+ { { 0x21 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovsxbd */
+ { { 0x22 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovsxbq */
+ { { 0x23 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovsxwd */
+ { { 0x24 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovsxwq */
+ { { 0x25 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovsxdq */
+ { { 0x28 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmuldq */
+ { { 0x29 }, 2, T, R, pfx_66, WIG, Ln }, /* vpcmpeqq */
+ { { 0x2a }, 2, F, R, pfx_66, WIG, Ln }, /* vmovntdqa */
+ { { 0x2b }, 2, T, R, pfx_66, WIG, Ln }, /* vpackusdw */
+ { { 0x2c }, 2, F, R, pfx_66, W0, Ln }, /* vmaskmovps */
+ { { 0x2d }, 2, F, R, pfx_66, W0, Ln }, /* vmaskmovpd */
+ { { 0x2e }, 2, F, W, pfx_66, W0, Ln }, /* vmaskmovps */
+ { { 0x2f }, 2, F, W, pfx_66, W0, Ln }, /* vmaskmovpd */
+ { { 0x30 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovzxbw */
+ { { 0x31 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovzxbd */
+ { { 0x32 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovzxbq */
+ { { 0x33 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovzxwd */
+ { { 0x34 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovzxwq */
+ { { 0x35 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovzxdq */
+ { { 0x36 }, 2, T, R, pfx_66, W0, L1 }, /* vpermd */
+ { { 0x37 }, 2, T, R, pfx_66, WIG, Ln }, /* vpcmpgtq */
+ { { 0x38 }, 2, T, R, pfx_66, WIG, Ln }, /* vpminsb */
+ { { 0x39 }, 2, T, R, pfx_66, WIG, Ln }, /* vpminsd */
+ { { 0x3a }, 2, T, R, pfx_66, WIG, Ln }, /* vpminuw */
+ { { 0x3b }, 2, T, R, pfx_66, WIG, Ln }, /* vpminud */
+ { { 0x3c }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaxsb */
+ { { 0x3d }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaxsd */
+ { { 0x3e }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaxuw */
+ { { 0x3f }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaxud */
+ { { 0x40 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmulld */
+ { { 0x41 }, 2, T, R, pfx_66, WIG, L0 }, /* vphminposuw */
+ { { 0x45 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsrlv{d,q} */
+ { { 0x46 }, 2, T, R, pfx_66, W0, Ln }, /* vpsravd */
+ { { 0x47 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsllv{d,q} */
+ { { 0x58 }, 2, T, R, pfx_66, W0, Ln }, /* vpbroadcastd */
+ { { 0x59 }, 2, T, R, pfx_66, W0, Ln }, /* vpbroadcastq */
+ { { 0x5a }, 2, F, R, pfx_66, W0, L1 }, /* vbroadcasti128 */
+ { { 0x78 }, 2, T, R, pfx_66, W0, Ln }, /* vpbroadcastb */
+ { { 0x79 }, 2, T, R, pfx_66, W0, Ln }, /* vpbroadcastw */
+ { { 0x8c }, 2, F, R, pfx_66, Wn, Ln }, /* vpmaskmov{d,q} */
+ { { 0x8e }, 2, F, W, pfx_66, Wn, Ln }, /* vpmaskmov{d,q} */
+ { { 0x90, VSIB(1) }, 3, F, R, pfx_66, Wn, Ln }, /* vpgatherd{d,q} */
+ { { 0x91, VSIB(1) }, 3, F, R, pfx_66, Wn, Ln }, /* vpgatherq{d,q} */
+ { { 0x92, VSIB(1) }, 3, F, R, pfx_66, Wn, Ln }, /* vgatherdp{s,d} */
+ { { 0x93, VSIB(1) }, 3, F, R, pfx_66, Wn, Ln }, /* vgatherqp{s,d} */
+ { { 0x96 }, 2, T, R, pfx_66, Wn, Ln }, /* vmaddsub132p{s,d} */
+ { { 0x97 }, 2, T, R, pfx_66, Wn, Ln }, /* vmsubadd132p{s,d} */
+ { { 0x98 }, 2, T, R, pfx_66, Wn, Ln }, /* vmadd132p{s,d} */
+ { { 0x99 }, 2, T, R, pfx_66, Wn, LIG }, /* vmadd132s{s,d} */
+ { { 0x9a }, 2, T, R, pfx_66, Wn, Ln }, /* vmsub132p{s,d} */
+ { { 0x9b }, 2, T, R, pfx_66, Wn, LIG }, /* vmsub132s{s,d} */
+ { { 0x9c }, 2, T, R, pfx_66, Wn, Ln }, /* vnmadd132p{s,d} */
+ { { 0x9d }, 2, T, R, pfx_66, Wn, LIG }, /* vnmadd132s{s,d} */
+ { { 0x9e }, 2, T, R, pfx_66, Wn, Ln }, /* vnmsub132p{s,d} */
+ { { 0x9f }, 2, T, R, pfx_66, Wn, LIG }, /* vnmsub132s{s,d} */
+ { { 0xa6 }, 2, T, R, pfx_66, Wn, Ln }, /* vmaddsub213p{s,d} */
+ { { 0xa7 }, 2, T, R, pfx_66, Wn, Ln }, /* vmsubadd213p{s,d} */
+ { { 0xa8 }, 2, T, R, pfx_66, Wn, Ln }, /* vmadd213p{s,d} */
+ { { 0xa9 }, 2, T, R, pfx_66, Wn, LIG }, /* vmadd213s{s,d} */
+ { { 0xaa }, 2, T, R, pfx_66, Wn, Ln }, /* vmsub213p{s,d} */
+ { { 0xab }, 2, T, R, pfx_66, Wn, LIG }, /* vmsub213s{s,d} */
+ { { 0xac }, 2, T, R, pfx_66, Wn, Ln }, /* vnmadd213p{s,d} */
+ { { 0xad }, 2, T, R, pfx_66, Wn, LIG }, /* vnmadd213s{s,d} */
+ { { 0xae }, 2, T, R, pfx_66, Wn, Ln }, /* vnmsub213p{s,d} */
+ { { 0xaf }, 2, T, R, pfx_66, Wn, LIG }, /* vnmsub213s{s,d} */
+ { { 0xb6 }, 2, T, R, pfx_66, Wn, Ln }, /* vmaddsub231p{s,d} */
+ { { 0xb7 }, 2, T, R, pfx_66, Wn, Ln }, /* vmsubadd231p{s,d} */
+ { { 0xb8 }, 2, T, R, pfx_66, Wn, Ln }, /* vmadd231p{s,d} */
+ { { 0xb9 }, 2, T, R, pfx_66, Wn, LIG }, /* vmadd231s{s,d} */
+ { { 0xba }, 2, T, R, pfx_66, Wn, Ln }, /* vmsub231p{s,d} */
+ { { 0xbb }, 2, T, R, pfx_66, Wn, LIG }, /* vmsub231s{s,d} */
+ { { 0xbc }, 2, T, R, pfx_66, Wn, Ln }, /* vnmadd231p{s,d} */
+ { { 0xbd }, 2, T, R, pfx_66, Wn, LIG }, /* vnmadd231s{s,d} */
+ { { 0xbe }, 2, T, R, pfx_66, Wn, Ln }, /* vnmsub231p{s,d} */
+ { { 0xbf }, 2, T, R, pfx_66, Wn, LIG }, /* vnmsub231s{s,d} */
+ { { 0xcf }, 2, T, R, pfx_66, W0, Ln }, /* vgf2p8mulb */
+ { { 0xdb }, 2, T, R, pfx_66, WIG, L0 }, /* vaesimc */
+ { { 0xdc }, 2, T, R, pfx_66, WIG, Ln }, /* vaesenc */
+ { { 0xdd }, 2, T, R, pfx_66, WIG, Ln }, /* vaesenclast */
+ { { 0xde }, 2, T, R, pfx_66, WIG, Ln }, /* vaesdec */
+ { { 0xdf }, 2, T, R, pfx_66, WIG, Ln }, /* vaesdeclast */
+ { { 0xf2 }, 2, T, R, pfx_no, Wn, L0 }, /* andn */
+ { { 0xf3, 0x08 }, 2, T, R, pfx_no, Wn, L0 }, /* blsr */
+ { { 0xf3, 0x10 }, 2, T, R, pfx_no, Wn, L0 }, /* blsmsk */
+ { { 0xf3, 0x18 }, 2, T, R, pfx_no, Wn, L0 }, /* blsi */
+ { { 0xf5 }, 2, T, R, pfx_no, Wn, L0 }, /* bzhi */
+ { { 0xf5 }, 2, T, R, pfx_f3, Wn, L0 }, /* pext */
+ { { 0xf5 }, 2, T, R, pfx_f2, Wn, L0 }, /* pdep */
+ { { 0xf6 }, 2, T, R, pfx_f2, Wn, L0 }, /* mulx */
+ { { 0xf7 }, 2, T, R, pfx_no, Wn, L0 }, /* bextr */
+ { { 0xf7 }, 2, T, R, pfx_66, Wn, L0 }, /* shlx */
+ { { 0xf7 }, 2, T, R, pfx_f3, Wn, L0 }, /* sarx */
+ { { 0xf7 }, 2, T, R, pfx_f2, Wn, L0 }, /* shrx */
+}, vex_0f3a[] = {
+ { { 0x00 }, 3, T, R, pfx_66, W1, L1 }, /* vpermq */
+ { { 0x01 }, 3, T, R, pfx_66, W1, L1 }, /* vpermpd */
+ { { 0x02 }, 3, T, R, pfx_66, W0, Ln }, /* vpblendd */
+ { { 0x04 }, 3, T, R, pfx_66, W0, Ln }, /* vpermilps */
+ { { 0x05 }, 3, T, R, pfx_66, W0, Ln }, /* vpermilpd */
+ { { 0x06 }, 3, T, R, pfx_66, W0, L1 }, /* vperm2f128 */
+ { { 0x08 }, 3, T, R, pfx_66, WIG, Ln }, /* vroundps */
+ { { 0x09 }, 3, T, R, pfx_66, WIG, Ln }, /* vroundpd */
+ { { 0x0a }, 3, T, R, pfx_66, WIG, LIG }, /* vroundss */
+ { { 0x0b }, 3, T, R, pfx_66, WIG, LIG }, /* vroundsd */
+ { { 0x0c }, 3, T, R, pfx_66, WIG, Ln }, /* vblendps */
+ { { 0x0d }, 3, T, R, pfx_66, WIG, Ln }, /* vblendpd */
+ { { 0x0e }, 3, T, R, pfx_66, WIG, Ln }, /* vpblendw */
+ { { 0x0f }, 3, T, R, pfx_66, WIG, Ln }, /* vpalignr */
+ { { 0x14 }, 3, T, W, pfx_66, WIG, L0 }, /* vpextrb */
+ { { 0x15 }, 3, T, W, pfx_66, WIG, L0 }, /* vpextrw */
+ { { 0x16 }, 3, T, W, pfx_66, Wn, L0 }, /* vpextr{d,q} */
+ { { 0x17 }, 3, T, W, pfx_66, WIG, L0 }, /* vextractps */
+ { { 0x18 }, 3, T, R, pfx_66, W0, L1 }, /* vinsertf128 */
+ { { 0x19 }, 3, T, W, pfx_66, W0, L1 }, /* vextractf128 */
+ { { 0x1d }, 3, T, W, pfx_66, W0, Ln }, /* vcvtps2ph */
+ { { 0x20 }, 3, T, R, pfx_66, WIG, L0 }, /* vpinsrb */
+ { { 0x21 }, 3, T, R, pfx_66, WIG, L0 }, /* vinsertps */
+ { { 0x22 }, 3, T, R, pfx_66, Wn, L0 }, /* vpinsr{d,q} */
+ { { 0x30, 0xc0 }, 3, F, N, pfx_66, Wn, L0 }, /* kshiftr{b,w} */
+ { { 0x31, 0xc0 }, 3, F, N, pfx_66, Wn, L0 }, /* kshiftr{d,q} */
+ { { 0x32, 0xc0 }, 3, F, N, pfx_66, Wn, L0 }, /* kshiftl{b,w} */
+ { { 0x33, 0xc0 }, 3, F, N, pfx_66, Wn, L0 }, /* kshiftl{d,q} */
+ { { 0x38 }, 3, T, R, pfx_66, W0, L1 }, /* vinserti128 */
+ { { 0x39 }, 3, T, W, pfx_66, W0, L1 }, /* vextracti128 */
+ { { 0x40 }, 3, T, R, pfx_66, WIG, Ln }, /* vdpps */
+ { { 0x41 }, 3, T, R, pfx_66, WIG, Ln }, /* vdppd */
+ { { 0x42 }, 3, T, R, pfx_66, WIG, Ln }, /* vmpsadbw */
+ { { 0x44 }, 3, T, R, pfx_66, WIG, Ln }, /* vpclmulqdq */
+ { { 0x46 }, 3, T, R, pfx_66, W0, L1 }, /* vperm2i128 */
+ { { 0x48 }, 3, T, R, pfx_66, Wn, Ln }, /* vpermil2ps */
+ { { 0x49 }, 3, T, R, pfx_66, Wn, Ln }, /* vpermil2pd */
+ { { 0x4a }, 3, T, R, pfx_66, W0, Ln }, /* vblendvps */
+ { { 0x4b }, 3, T, R, pfx_66, W0, Ln }, /* vblendvpd */
+ { { 0x4c }, 3, T, R, pfx_66, W0, Ln }, /* vpblendvb */
+ { { 0x5c }, 3, T, R, pfx_66, Wn, Ln }, /* vfmaddsubps */
+ { { 0x5d }, 3, T, R, pfx_66, Wn, Ln }, /* vfmaddsubpd */
+ { { 0x5e }, 3, T, R, pfx_66, Wn, Ln }, /* vfmsubaddps */
+ { { 0x5f }, 3, T, R, pfx_66, Wn, Ln }, /* vfmsubaddpd */
+ { { 0x60 }, 3, T, R, pfx_66, WIG, L0 }, /* vpcmpestrm */
+ { { 0x61 }, 3, T, R, pfx_66, WIG, L0 }, /* vpcmpestri */
+ { { 0x62 }, 3, T, R, pfx_66, WIG, L0 }, /* vpcmpistrm */
+ { { 0x63 }, 3, T, R, pfx_66, WIG, L0 }, /* vpcmpistri */
+ { { 0x68 }, 3, T, R, pfx_66, Wn, Ln }, /* vfmaddps */
+ { { 0x69 }, 3, T, R, pfx_66, Wn, Ln }, /* vfmaddpd */
+ { { 0x6a }, 3, T, R, pfx_66, Wn, LIG }, /* vfmaddss */
+ { { 0x6b }, 3, T, R, pfx_66, Wn, LIG }, /* vfmaddsd */
+ { { 0x6c }, 3, T, R, pfx_66, Wn, Ln }, /* vfmsubps */
+ { { 0x6d }, 3, T, R, pfx_66, Wn, Ln }, /* vfmsubpd */
+ { { 0x6e }, 3, T, R, pfx_66, Wn, LIG }, /* vfmsubss */
+ { { 0x6f }, 3, T, R, pfx_66, Wn, LIG }, /* vfmsubsd */
+ { { 0x78 }, 3, T, R, pfx_66, Wn, Ln }, /* vfnmaddps */
+ { { 0x79 }, 3, T, R, pfx_66, Wn, Ln }, /* vfnmaddpd */
+ { { 0x7a }, 3, T, R, pfx_66, Wn, LIG }, /* vfnmaddss */
+ { { 0x7b }, 3, T, R, pfx_66, Wn, LIG }, /* vfnmaddsd */
+ { { 0x7c }, 3, T, R, pfx_66, Wn, Ln }, /* vfnmsubps */
+ { { 0x7d }, 3, T, R, pfx_66, Wn, Ln }, /* vfnmsubpd */
+ { { 0x7e }, 3, T, R, pfx_66, Wn, LIG }, /* vfnmsubss */
+ { { 0x7f }, 3, T, R, pfx_66, Wn, LIG }, /* vfnmsubsd */
+ { { 0xce }, 3, T, R, pfx_66, W1, Ln }, /* vgf2p8affineqb */
+ { { 0xcf }, 3, T, R, pfx_66, W1, Ln }, /* vgf2p8affineinvqb */
+ { { 0xdf }, 3, T, R, pfx_66, WIG, Ln }, /* vaeskeygenassist */
+ { { 0xf0 }, 3, T, R, pfx_f2, Wn, L0 }, /* rorx */
+};
+
+static const struct {
+ const struct vex *tbl;
+ unsigned int num;
+} vex[] = {
+ { vex_0f, ARRAY_SIZE(vex_0f) },
+ { vex_0f38, ARRAY_SIZE(vex_0f38) },
+ { vex_0f3a, ARRAY_SIZE(vex_0f3a) },
+};
+
+#undef Wn
+#undef Ln
+
#undef F
#undef N
#undef R
@@ -1125,7 +1568,7 @@ void predicates_test(void *instr, struct
for ( m = 0; m < sizeof(long) / sizeof(int); ++m )
{
- unsigned int t;
+ unsigned int t, x;
ctxt->addr_size = 32 << m;
ctxt->sp_size = 32 << m;
@@ -1211,6 +1654,90 @@ void predicates_test(void *instr, struct
ctxt, fetch);
}
+ for ( t = 0; t < ARRAY_SIZE(vex_0f); ++t )
+ {
+ if ( vex_0f[t].w == WIG || (vex_0f[t].w & W0) )
+ {
+ uint8_t *ptr = instr;
+
+ memset(instr + 3, 0xcc, 12);
+
+ *ptr++ = 0xc5;
+ *ptr++ = 0xf8 | vex_0f[t].pfx;
+ memcpy(ptr, vex_0f[t].opc, vex_0f[t].len);
+
+ if ( vex_0f[t].l == LIG || (vex_0f[t].l & L0) )
+ do_test(instr, vex_0f[t].len + ((void *)ptr - instr),
+ vex_0f[t].modrm ? (void *)ptr - instr + 1 : 0,
+ vex_0f[t].mem, ctxt, fetch);
+
+ if ( vex_0f[t].l == LIG || (vex_0f[t].l & L1) )
+ {
+ ptr[-1] |= 4;
+ memcpy(ptr, vex_0f[t].opc, vex_0f[t].len);
+
+ do_test(instr, vex_0f[t].len + ((void *)ptr - instr),
+ vex_0f[t].modrm ? (void *)ptr - instr + 1 : 0,
+ vex_0f[t].mem, ctxt, fetch);
+ }
+ }
+ }
+
+ for ( x = 0; x < ARRAY_SIZE(vex); ++x )
+ {
+ for ( t = 0; t < vex[x].num; ++t )
+ {
+ uint8_t *ptr = instr;
+
+ memset(instr + 4, 0xcc, 11);
+
+ *ptr++ = 0xc4;
+ *ptr++ = 0xe1 + x;
+ *ptr++ = 0x78 | vex[x].tbl[t].pfx;
+
+ if ( vex[x].tbl[t].w == WIG || (vex[x].tbl[t].w & W0) )
+ {
+ memcpy(ptr, vex[x].tbl[t].opc, vex[x].tbl[t].len);
+
+ if ( vex[x].tbl[t].l == LIG || (vex[x].tbl[t].l & L0) )
+ do_test(instr, vex[x].tbl[t].len + ((void *)ptr - instr),
+ vex[x].tbl[t].modrm ? (void *)ptr - instr + 1 : 0,
+ vex[x].tbl[t].mem, ctxt, fetch);
+
+ if ( vex[x].tbl[t].l == LIG || (vex[x].tbl[t].l & L1) )
+ {
+ ptr[-1] |= 4;
+ memcpy(ptr, vex[x].tbl[t].opc, vex[x].tbl[t].len);
+
+ do_test(instr, vex[x].tbl[t].len + ((void *)ptr - instr),
+ vex[x].tbl[t].modrm ? (void *)ptr - instr + 1 : 0,
+ vex[x].tbl[t].mem, ctxt, fetch);
+ }
+ }
+
+ if ( vex[x].tbl[t].w == WIG || (vex[x].tbl[t].w & W1) )
+ {
+ ptr[-1] = 0xf8 | vex[x].tbl[t].pfx;
+ memcpy(ptr, vex[x].tbl[t].opc, vex[x].tbl[t].len);
+
+ if ( vex[x].tbl[t].l == LIG || (vex[x].tbl[t].l & L0) )
+ do_test(instr, vex[x].tbl[t].len + ((void *)ptr - instr),
+ vex[x].tbl[t].modrm ? (void *)ptr - instr + 1 : 0,
+ vex[x].tbl[t].mem, ctxt, fetch);
+
+ if ( vex[x].tbl[t].l == LIG || (vex[x].tbl[t].l & L1) )
+ {
+ ptr[-1] |= 4;
+ memcpy(ptr, vex[x].tbl[t].opc, vex[x].tbl[t].len);
+
+ do_test(instr, vex[x].tbl[t].len + ((void *)ptr - instr),
+ vex[x].tbl[t].modrm ? (void *)ptr - instr + 1 : 0,
+ vex[x].tbl[t].mem, ctxt, fetch);
+ }
+ }
+ }
+ }
+
if ( errors )
exit(1);
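
The prefix bytes poked into the buffer by the two VEX loops above follow the usual VEX layout. A minimal sketch of that layout is given here for orientation only: `vex2_build()` and `vex3_build()` are hypothetical helpers, not part of the patch, and they fix the inverted R/X/B and vvvv fields to all-ones as the test does.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not from the patch) mirroring the prefix bytes
 * written by the VEX test loops.
 * pfx: 0 = none, 1 = 66, 2 = F3, 3 = F2 (the VEX "pp" encoding).
 */

/* 2-byte VEX: C5, then R=1 (inverted), vvvv=1111, L, pp. Implies map 0F, W=0. */
static size_t vex2_build(uint8_t *buf, unsigned int l, unsigned int pfx)
{
    buf[0] = 0xc5;
    buf[1] = 0xf8 | ((l & 1) << 2) | (pfx & 3);
    return 2;
}

/*
 * 3-byte VEX: C4, then inverted R/X/B = 111 plus the opcode map
 * (1 = 0F, 2 = 0F38, 3 = 0F3A), then W, vvvv=1111, L, pp.
 */
static size_t vex3_build(uint8_t *buf, unsigned int map, unsigned int w,
                         unsigned int l, unsigned int pfx)
{
    buf[0] = 0xc4;
    buf[1] = 0xe0 | (map & 0x1f);
    buf[2] = ((w & 1) << 7) | 0x78 | ((l & 1) << 2) | (pfx & 3);
    return 3;
}
```

With these, `0xe1 + x` in the second loop corresponds to `map = 1 + x`, and `ptr[-1] |= 4` corresponds to setting `l = 1`.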
* [PATCH 05/10] x86emul: extend decoding / mem access testing to XOP-encoded insns
From: Jan Beulich @ 2020-08-03 14:51 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -1474,6 +1474,96 @@ static const struct {
{ vex_0f3a, ARRAY_SIZE(vex_0f3a) },
};
+static const struct xop {
+ uint8_t opc[2];
+ uint8_t w:2;
+ uint8_t l:2;
+} xop_08[] = {
+ { { 0x85 }, W0, L0 }, /* vpmacssww */
+ { { 0x86 }, W0, L0 }, /* vpmacsswd */
+ { { 0x87 }, W0, L0 }, /* vpmacssdql */
+ { { 0x8e }, W0, L0 }, /* vpmacssdd */
+ { { 0x8f }, W0, L0 }, /* vpmacssdqh */
+ { { 0x95 }, W0, L0 }, /* vpmacsww */
+ { { 0x96 }, W0, L0 }, /* vpmacswd */
+ { { 0x97 }, W0, L0 }, /* vpmacsdql */
+ { { 0x9e }, W0, L0 }, /* vpmacsdd */
+ { { 0x9f }, W0, L0 }, /* vpmacsdqh */
+ { { 0xa2 }, Wn, Ln }, /* vpcmov */
+ { { 0xa3 }, Wn, L0 }, /* vpperm */
+ { { 0xa6 }, W0, L0 }, /* vpmadcsswd */
+ { { 0xb6 }, W0, L0 }, /* vpmadcswd */
+ { { 0xc0 }, W0, L0 }, /* vprotb */
+ { { 0xc1 }, W0, L0 }, /* vprotw */
+ { { 0xc2 }, W0, L0 }, /* vprotd */
+ { { 0xc3 }, W0, L0 }, /* vprotq */
+ { { 0xcc }, W0, L0 }, /* vpcomb */
+ { { 0xcd }, W0, L0 }, /* vpcomw */
+ { { 0xce }, W0, L0 }, /* vpcomd */
+ { { 0xcf }, W0, L0 }, /* vpcomq */
+ { { 0xec }, W0, L0 }, /* vpcomub */
+ { { 0xed }, W0, L0 }, /* vpcomuw */
+ { { 0xee }, W0, L0 }, /* vpcomud */
+ { { 0xef }, W0, L0 }, /* vpcomuq */
+}, xop_09[] = {
+ { { 0x01, 0x08 }, Wn, L0 }, /* blcfill */
+ { { 0x01, 0x10 }, Wn, L0 }, /* blsfill */
+ { { 0x01, 0x18 }, Wn, L0 }, /* blcs */
+ { { 0x01, 0x20 }, Wn, L0 }, /* tzmsk */
+ { { 0x01, 0x28 }, Wn, L0 }, /* blcic */
+ { { 0x01, 0x30 }, Wn, L0 }, /* blsic */
+ { { 0x01, 0x38 }, Wn, L0 }, /* t1mskc */
+ { { 0x02, 0x08 }, Wn, L0 }, /* blcmsk */
+ { { 0x02, 0x30 }, Wn, L0 }, /* blci */
+ { { 0x02, 0xc0 }, Wn, L0 }, /* llwpcb */
+ { { 0x02, 0xc8 }, Wn, L0 }, /* slwpcb */
+ { { 0x80 }, W0, Ln }, /* vfrczps */
+ { { 0x81 }, W0, Ln }, /* vfrczpd */
+ { { 0x82 }, W0, L0 }, /* vfrczss */
+ { { 0x83 }, W0, L0 }, /* vfrczsd */
+ { { 0x90 }, Wn, L0 }, /* vprotb */
+ { { 0x91 }, Wn, L0 }, /* vprotw */
+ { { 0x92 }, Wn, L0 }, /* vprotd */
+ { { 0x93 }, Wn, L0 }, /* vprotq */
+ { { 0x94 }, Wn, L0 }, /* vpshlb */
+ { { 0x95 }, Wn, L0 }, /* vpshlw */
+ { { 0x96 }, Wn, L0 }, /* vpshld */
+ { { 0x97 }, Wn, L0 }, /* vpshlq */
+ { { 0x9c }, Wn, L0 }, /* vpshab */
+ { { 0x9d }, Wn, L0 }, /* vpshaw */
+ { { 0x9e }, Wn, L0 }, /* vpshad */
+ { { 0x9f }, Wn, L0 }, /* vpshaq */
+ { { 0xc1 }, W0, L0 }, /* vphaddbw */
+ { { 0xc2 }, W0, L0 }, /* vphaddbd */
+ { { 0xc3 }, W0, L0 }, /* vphaddbq */
+ { { 0xc6 }, W0, L0 }, /* vphaddwd */
+ { { 0xc7 }, W0, L0 }, /* vphaddwq */
+ { { 0xcb }, W0, L0 }, /* vphadddq */
+ { { 0xd1 }, W0, L0 }, /* vphaddubw */
+ { { 0xd2 }, W0, L0 }, /* vphaddubd */
+ { { 0xd3 }, W0, L0 }, /* vphaddubq */
+ { { 0xd6 }, W0, L0 }, /* vphadduwd */
+ { { 0xd7 }, W0, L0 }, /* vphadduwq */
+ { { 0xdb }, W0, L0 }, /* vphaddudq */
+ { { 0xe1 }, W0, L0 }, /* vphsubbw */
+ { { 0xe2 }, W0, L0 }, /* vphsubwd */
+ { { 0xe3 }, W0, L0 }, /* vphsubdq */
+}, xop_0a[] = {
+ { { 0x10 }, Wn, L0 }, /* bextr */
+ { { 0x12, 0x00 }, Wn, L0 }, /* lwpins */
+ { { 0x12, 0x08 }, Wn, L0 }, /* lwpval */
+};
+
+static const struct {
+ const struct xop *tbl;
+ unsigned int num;
+ unsigned int imm;
+} xop[] = {
+ { xop_08, ARRAY_SIZE(xop_08), 1 },
+ { xop_09, ARRAY_SIZE(xop_09), 0 },
+ { xop_0a, ARRAY_SIZE(xop_0a), 4 },
+};
+
#undef Wn
#undef Ln
@@ -1736,6 +1826,63 @@ void predicates_test(void *instr, struct
}
}
}
+ }
+
+ for ( x = 0; x < ARRAY_SIZE(xop); ++x )
+ {
+ for ( t = 0; t < xop[x].num; ++t )
+ {
+ uint8_t *ptr = instr;
+ unsigned int modrm;
+ enum mem_access mem;
+
+ memset(instr + 5, 0xcc, 10);
+
+ *ptr++ = 0x8f;
+ *ptr++ = 0xe8 + x;
+ *ptr++ = 0x78;
+ memcpy(ptr, xop[x].tbl[t].opc, 2);
+ memset(ptr + 2, 0, xop[x].imm);
+
+ modrm = ptr[1] & 0xc0 ? 0 : 4;
+ mem = ptr[1] & 0xc0 ? mem_none : mem_read;
+
+ assert(xop[x].tbl[t].w != WIG);
+ assert(xop[x].tbl[t].l != LIG);
+
+ if ( xop[x].tbl[t].w & W0 )
+ {
+ if ( xop[x].tbl[t].l & L0 )
+ do_test(instr, 5 + xop[x].imm, modrm, mem, ctxt, fetch);
+
+ if ( xop[x].tbl[t].l & L1 )
+ {
+ ptr[-1] = 0x7c;
+ ptr[1] = mem != mem_none ? 0x00 : 0xc0;
+
+ do_test(instr, 5 + xop[x].imm, modrm, mem, ctxt, fetch);
+ }
+ }
+
+ if ( xop[x].tbl[t].w & W1 )
+ {
+ if ( xop[x].tbl[t].l & L0 )
+ {
+ ptr[-1] = 0xf8;
+ ptr[1] = mem != mem_none ? 0x00 : 0xc0;
+
+ do_test(instr, 5 + xop[x].imm, modrm, mem, ctxt, fetch);
+ }
+
+ if ( xop[x].tbl[t].l & L1 )
+ {
+ ptr[-1] = 0xfc;
+ ptr[1] = mem != mem_none ? 0x00 : 0xc0;
+
+ do_test(instr, 5 + xop[x].imm, modrm, mem, ctxt, fetch);
+ }
+ }
+ }
}
if ( errors )
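
The constants in the XOP loop above (length `5 + xop[x].imm`, third prefix byte `0x78` / `0x7c` / `0xf8` / `0xfc`) can be derived as in the following sketch. The helper names are made up for illustration and do not exist in the test harness. The sketch assumes, as the tables above do, that a ModR/M byte is either the mod=0 memory form without displacement or the mod=3 register form.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, not part of the patch. */

/* Immediate bytes per XOP map, as in the xop[] table: 08 -> 1, 09 -> 0, 0a -> 4. */
static unsigned int xop_imm_bytes(unsigned int map)
{
    switch ( map )
    {
    case 0x08: return 1;
    case 0x0a: return 4;
    default:   return 0;
    }
}

/* 3 prefix bytes (8F, map byte, W/vvvv/L/pp byte) + opcode + ModR/M + immediate. */
static unsigned int xop_insn_len(unsigned int map)
{
    return 3 + 1 + 1 + xop_imm_bytes(map);
}

/* Third prefix byte with vvvv = 1111 and pp = 00. */
static uint8_t xop_byte2(unsigned int w, unsigned int l)
{
    return ((w & 1) << 7) | 0x78 | ((l & 1) << 2);
}

/* Register form iff ModR/M.mod == 3; such encodings access no memory. */
static bool modrm_is_reg(uint8_t modrm)
{
    return (modrm & 0xc0) == 0xc0;
}
```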
* [PATCH 06/10] x86emul: AVX512{F, BW} down conversion moves are memory writes
From: Jan Beulich @ 2020-08-03 14:52 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
For this to be properly reported, the case labels need to move to a
different switch() block.
Fixes: 30e0bdf79828 ("x86emul: support AVX512{F,BW} down conversion moves")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -12359,6 +12359,14 @@ x86_insn_is_mem_write(const struct x86_e
case X86EMUL_OPC_F2(0x0f38, 0xf8): /* ENQCMD */
case X86EMUL_OPC_F3(0x0f38, 0xf8): /* ENQCMDS */
return true;
+
+ case X86EMUL_OPC_EVEX_F3(0x0f38, 0x10) ...
+ X86EMUL_OPC_EVEX_F3(0x0f38, 0x15): /* VPMOVUS* */
+ case X86EMUL_OPC_EVEX_F3(0x0f38, 0x20) ...
+ X86EMUL_OPC_EVEX_F3(0x0f38, 0x25): /* VPMOVS* */
+ case X86EMUL_OPC_EVEX_F3(0x0f38, 0x30) ...
+ X86EMUL_OPC_EVEX_F3(0x0f38, 0x35): /* VPMOV{D,Q,W}* */
+ return state->modrm_mod != 3;
}
return false;
@@ -12400,12 +12408,6 @@ x86_insn_is_mem_write(const struct x86_e
case X86EMUL_OPC(0x0f, 0xab): /* BTS */
case X86EMUL_OPC(0x0f, 0xb3): /* BTR */
case X86EMUL_OPC(0x0f, 0xbb): /* BTC */
- case X86EMUL_OPC_EVEX_F3(0x0f38, 0x10) ...
- X86EMUL_OPC_EVEX_F3(0x0f38, 0x15): /* VPMOVUS* */
- case X86EMUL_OPC_EVEX_F3(0x0f38, 0x20) ...
- X86EMUL_OPC_EVEX_F3(0x0f38, 0x25): /* VPMOVS* */
- case X86EMUL_OPC_EVEX_F3(0x0f38, 0x30) ...
- X86EMUL_OPC_EVEX_F3(0x0f38, 0x35): /* VPMOV{D,Q,W}* */
return true;
case 0xd9:
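
The `state->modrm_mod != 3` result in the hunk above expresses that a down conversion move writes memory only when its destination is not a register. A stand-alone sketch of that rule follows. `struct insn_state` and `is_down_conv_mem_write()` are hypothetical and much simpler than Xen's `struct x86_emulate_state`; the opcode field is assumed to be already known to lie in the EVEX.F3.0F38 space.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified decode state (not Xen's x86_emulate_state). */
struct insn_state {
    uint8_t opcode;     /* opcode byte within the EVEX.F3.0F38 space */
    uint8_t modrm_mod;  /* ModR/M bits 7:6 */
};

static bool is_down_conv_mem_write(const struct insn_state *s)
{
    uint8_t hi = s->opcode & 0xf0, lo = s->opcode & 0x0f;
    /* 0x10..0x15 (VPMOVUS*), 0x20..0x25 (VPMOVS*), 0x30..0x35 (VPMOV*) */
    bool down_conv = (hi == 0x10 || hi == 0x20 || hi == 0x30) && lo <= 5;

    /* mod == 3 selects a register destination, i.e. no memory write. */
    return down_conv && s->modrm_mod != 3;
}
```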
* [PATCH 07/10] x86emul: AVX512F scatter insns are memory writes
From: Jan Beulich @ 2020-08-03 14:53 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
While the custom handling renders the "to_mem" field generally unused,
x86_insn_is_mem_write() still (indirectly) consumes that information,
and hence the table entries want to be correct.
Fixes: ("x86emul: support AVX512F scatter insns")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -516,7 +516,7 @@ static const struct ext0f38_table {
[0x9d] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
[0x9e] = { .simd_size = simd_packed_fp, .d8s = d8s_vl },
[0x9f] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
- [0xa0 ... 0xa3] = { .simd_size = simd_other, .vsib = 1, .d8s = d8s_dq },
+ [0xa0 ... 0xa3] = { .simd_size = simd_other, .to_mem = 1, .vsib = 1, .d8s = d8s_dq },
[0xa6 ... 0xa8] = { .simd_size = simd_packed_fp, .d8s = d8s_vl },
[0xa9] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
[0xaa] = { .simd_size = simd_packed_fp, .d8s = d8s_vl },
* [PATCH 08/10] x86emul: AVX512PF insns aren't memory accesses
From: Jan Beulich @ 2020-08-03 14:53 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
These are prefetches, so should be treated just like other prefetches.
Fixes: ("x86emul: support AVX512PF insns")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -12265,6 +12265,8 @@ x86_insn_is_mem_access(const struct x86_
... X86EMUL_OPC_F2(0x0f, 0x1f): /* NOP space */
case X86EMUL_OPC(0x0f, 0xb9): /* UD1 */
case X86EMUL_OPC(0x0f, 0xff): /* UD0 */
+ case X86EMUL_OPC_EVEX_66(0x0f38, 0xc6): /* V{GATH,SCATT}ERPF*D* */
+ case X86EMUL_OPC_EVEX_66(0x0f38, 0xc7): /* V{GATH,SCATT}ERPF*Q* */
return false;
case X86EMUL_OPC(0x0f, 0x01):
* [PATCH 09/10] x86emul: extend decoding / mem access testing to EVEX-encoded insns
From: Jan Beulich @ 2020-08-03 14:54 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -1564,9 +1564,469 @@ static const struct {
{ xop_0a, ARRAY_SIZE(xop_0a), 4 },
};
-#undef Wn
#undef Ln
+static const struct evex {
+ uint8_t opc[3];
+ uint8_t len:3;
+ bool modrm:1; /* Should register form (also) be tested? */
+ uint8_t mem:2;
+ uint8_t pfx:2;
+ uint8_t w:2;
+ uint8_t l:3;
+ bool mask:1;
+#define L2 4
+#define Ln (L0 | L1 | L2)
+} evex_0f[] = {
+ { { 0x10 }, 2, T, R, pfx_no, W0, Ln }, /* vmovups */
+ { { 0x10 }, 2, T, R, pfx_66, W1, Ln }, /* vmovupd */
+ { { 0x10 }, 2, T, R, pfx_f3, W0, LIG }, /* vmovss */
+ { { 0x10 }, 2, T, R, pfx_f2, W1, LIG }, /* vmovsd */
+ { { 0x11 }, 2, T, W, pfx_no, W0, Ln }, /* vmovups */
+ { { 0x11 }, 2, T, W, pfx_66, W1, Ln }, /* vmovupd */
+ { { 0x11 }, 2, T, W, pfx_f3, W0, LIG }, /* vmovss */
+ { { 0x11 }, 2, T, W, pfx_f2, W1, LIG }, /* vmovsd */
+ { { 0x12 }, 2, T, R, pfx_no, W0, L0 }, /* vmovlps / vmovhlps */
+ { { 0x12 }, 2, F, R, pfx_66, W1, L0 }, /* vmovlpd */
+ { { 0x12 }, 2, T, R, pfx_f3, W0, Ln }, /* vmovsldup */
+ { { 0x12 }, 2, T, R, pfx_f2, W1, Ln }, /* vmovddup */
+ { { 0x13 }, 2, F, W, pfx_no, W0, L0 }, /* vmovlps */
+ { { 0x13 }, 2, F, W, pfx_66, W1, L0 }, /* vmovlpd */
+ { { 0x14 }, 2, T, R, pfx_no, W0, Ln }, /* vunpcklps */
+ { { 0x14 }, 2, T, R, pfx_66, W1, Ln }, /* vunpcklpd */
+ { { 0x15 }, 2, T, R, pfx_no, W0, Ln }, /* vunpckhps */
+ { { 0x15 }, 2, T, R, pfx_66, W1, Ln }, /* vunpckhpd */
+ { { 0x16 }, 2, T, R, pfx_no, W0, L0 }, /* vmovhps / vmovlhps */
+ { { 0x16 }, 2, F, R, pfx_66, W1, L0 }, /* vmovhpd */
+ { { 0x16 }, 2, T, R, pfx_f3, W0, Ln }, /* vmovshdup */
+ { { 0x17 }, 2, F, W, pfx_no, W0, L0 }, /* vmovhps */
+ { { 0x17 }, 2, F, W, pfx_66, W1, L0 }, /* vmovhpd */
+ { { 0x28 }, 2, T, R, pfx_no, W0, Ln }, /* vmovaps */
+ { { 0x28 }, 2, T, R, pfx_66, W1, Ln }, /* vmovapd */
+ { { 0x29 }, 2, T, W, pfx_no, W0, Ln }, /* vmovaps */
+ { { 0x29 }, 2, T, W, pfx_66, W1, Ln }, /* vmovapd */
+ { { 0x2a }, 2, T, R, pfx_f3, W0, LIG }, /* vcvtsi2ss */
+ { { 0x2a }, 2, T, R, pfx_f2, W1, LIG }, /* vcvtsi2sd */
+ { { 0x2b }, 2, T, W, pfx_no, W0, Ln }, /* vmovntps */
+ { { 0x2b }, 2, T, W, pfx_66, W1, Ln }, /* vmovntpd */
+ { { 0x2c }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvttss2si */
+ { { 0x2c }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvttsd2si */
+ { { 0x2d }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvtss2si */
+ { { 0x2d }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvtsd2si */
+ { { 0x2e }, 2, T, R, pfx_no, W0, LIG }, /* vucomiss */
+ { { 0x2e }, 2, T, R, pfx_66, W1, LIG }, /* vucomisd */
+ { { 0x2f }, 2, T, R, pfx_no, W0, LIG }, /* vcomiss */
+ { { 0x2f }, 2, T, R, pfx_66, W1, LIG }, /* vcomisd */
+ { { 0x51 }, 2, T, R, pfx_no, W0, Ln }, /* vsqrtps */
+ { { 0x51 }, 2, T, R, pfx_66, W1, Ln }, /* vsqrtpd */
+ { { 0x51 }, 2, T, R, pfx_f3, W0, LIG }, /* vsqrtss */
+ { { 0x51 }, 2, T, R, pfx_f2, W1, LIG }, /* vsqrtsd */
+ { { 0x54 }, 2, T, R, pfx_no, W0, Ln }, /* vandps */
+ { { 0x54 }, 2, T, R, pfx_66, W1, Ln }, /* vandpd */
+ { { 0x55 }, 2, T, R, pfx_no, W0, Ln }, /* vandnps */
+ { { 0x55 }, 2, T, R, pfx_66, W1, Ln }, /* vandnpd */
+ { { 0x56 }, 2, T, R, pfx_no, W0, Ln }, /* vorps */
+ { { 0x56 }, 2, T, R, pfx_66, W1, Ln }, /* vorpd */
+ { { 0x57 }, 2, T, R, pfx_no, W0, Ln }, /* vxorps */
+ { { 0x57 }, 2, T, R, pfx_66, W1, Ln }, /* vxorpd */
+ { { 0x58 }, 2, T, R, pfx_no, W0, Ln }, /* vaddps */
+ { { 0x58 }, 2, T, R, pfx_66, W1, Ln }, /* vaddpd */
+ { { 0x58 }, 2, T, R, pfx_f3, W0, LIG }, /* vaddss */
+ { { 0x58 }, 2, T, R, pfx_f2, W1, LIG }, /* vaddsd */
+ { { 0x59 }, 2, T, R, pfx_no, W0, Ln }, /* vmulps */
+ { { 0x59 }, 2, T, R, pfx_66, W1, Ln }, /* vmulpd */
+ { { 0x59 }, 2, T, R, pfx_f3, W0, LIG }, /* vmulss */
+ { { 0x59 }, 2, T, R, pfx_f2, W1, LIG }, /* vmulsd */
+ { { 0x5a }, 2, T, R, pfx_no, W0, Ln }, /* vcvtps2pd */
+ { { 0x5a }, 2, T, R, pfx_66, W1, Ln }, /* vcvtpd2ps */
+ { { 0x5a }, 2, T, R, pfx_f3, W0, LIG }, /* vcvtss2sd */
+ { { 0x5a }, 2, T, R, pfx_f2, W1, LIG }, /* vcvtsd2ss */
+ { { 0x5b }, 2, T, R, pfx_no, Wn, Ln }, /* vcvt{d,q}q2ps */
+ { { 0x5b }, 2, T, R, pfx_66, W0, Ln }, /* vcvtps2dq */
+ { { 0x5b }, 2, T, R, pfx_f3, W0, Ln }, /* vcvttps2dq */
+ { { 0x5c }, 2, T, R, pfx_no, W0, Ln }, /* vsubps */
+ { { 0x5c }, 2, T, R, pfx_66, W1, Ln }, /* vsubpd */
+ { { 0x5c }, 2, T, R, pfx_f3, W0, LIG }, /* vsubss */
+ { { 0x5c }, 2, T, R, pfx_f2, W1, LIG }, /* vsubsd */
+ { { 0x5d }, 2, T, R, pfx_no, W0, Ln }, /* vminps */
+ { { 0x5d }, 2, T, R, pfx_66, W1, Ln }, /* vminpd */
+ { { 0x5d }, 2, T, R, pfx_f3, W0, LIG }, /* vminss */
+ { { 0x5d }, 2, T, R, pfx_f2, W1, LIG }, /* vminsd */
+ { { 0x5e }, 2, T, R, pfx_no, W0, Ln }, /* vdivps */
+ { { 0x5e }, 2, T, R, pfx_66, W1, Ln }, /* vdivpd */
+ { { 0x5e }, 2, T, R, pfx_f3, W0, LIG }, /* vdivss */
+ { { 0x5e }, 2, T, R, pfx_f2, W1, LIG }, /* vdivsd */
+ { { 0x5f }, 2, T, R, pfx_no, W0, Ln }, /* vmaxps */
+ { { 0x5f }, 2, T, R, pfx_66, W1, Ln }, /* vmaxpd */
+ { { 0x5f }, 2, T, R, pfx_f3, W0, LIG }, /* vmaxss */
+ { { 0x5f }, 2, T, R, pfx_f2, W1, LIG }, /* vmaxsd */
+ { { 0x60 }, 2, T, R, pfx_66, WIG, Ln }, /* vpunpcklbw */
+ { { 0x61 }, 2, T, R, pfx_66, WIG, Ln }, /* vpunpcklwd */
+ { { 0x62 }, 2, T, R, pfx_66, W0, Ln }, /* vpunpckldq */
+ { { 0x63 }, 2, T, R, pfx_66, WIG, Ln }, /* vpacksswb */
+ { { 0x64 }, 2, T, R, pfx_66, WIG, Ln }, /* vpcmpgtb */
+ { { 0x65 }, 2, T, R, pfx_66, WIG, Ln }, /* vpcmpgtw */
+ { { 0x66 }, 2, T, R, pfx_66, W0, Ln }, /* vpcmpgtd */
+ { { 0x67 }, 2, T, R, pfx_66, WIG, Ln }, /* vpackuswb */
+ { { 0x68 }, 2, T, R, pfx_66, WIG, Ln }, /* vpunpckhbw */
+ { { 0x69 }, 2, T, R, pfx_66, WIG, Ln }, /* vpunpckhwd */
+ { { 0x6a }, 2, T, R, pfx_66, W0, Ln }, /* vpunpckhdq */
+ { { 0x6b }, 2, T, R, pfx_66, W0, Ln }, /* vpackssdw */
+ { { 0x6c }, 2, T, R, pfx_66, W1, Ln }, /* vpunpcklqdq */
+ { { 0x6d }, 2, T, R, pfx_66, W1, Ln }, /* vpunpckhqdq */
+ { { 0x6e }, 2, T, R, pfx_66, Wn, L0 }, /* vmov{d,q} */
+ { { 0x6f }, 2, T, R, pfx_66, Wn, Ln }, /* vmovdqa{32,64} */
+ { { 0x6f }, 2, T, R, pfx_f3, Wn, Ln }, /* vmovdqu{32,64} */
+ { { 0x6f }, 2, T, R, pfx_f2, Wn, Ln }, /* vmovdqu{8,16} */
+ { { 0x70 }, 3, T, R, pfx_66, W0, Ln }, /* vpshufd */
+ { { 0x70 }, 3, T, R, pfx_f3, WIG, Ln }, /* vpshuflw */
+ { { 0x70 }, 3, T, R, pfx_f2, WIG, Ln }, /* vpshufhw */
+ { { 0x71, 0xd0 }, 3, F, N, pfx_66, WIG, Ln }, /* vpsrlw */
+ { { 0x71, 0xe0 }, 3, F, N, pfx_66, WIG, Ln }, /* vpsraw */
+ { { 0x71, 0xf0 }, 3, F, N, pfx_66, WIG, Ln }, /* vpsllw */
+ { { 0x72, 0xc0 }, 3, F, N, pfx_66, Wn, Ln }, /* vpror{d,q} */
+ { { 0x72, 0xc8 }, 3, F, N, pfx_66, Wn, Ln }, /* vprol{d,q} */
+ { { 0x72, 0xd0 }, 3, F, N, pfx_66, W0, Ln }, /* vpsrld */
+ { { 0x72, 0xe0 }, 3, F, N, pfx_66, Wn, Ln }, /* vpsra{d,q} */
+ { { 0x72, 0xf0 }, 3, F, N, pfx_66, W0, Ln }, /* vpslld */
+ { { 0x73, 0xd0 }, 3, F, N, pfx_66, W1, Ln }, /* vpsrlq */
+ { { 0x73, 0xd8 }, 3, F, N, pfx_66, WIG, Ln }, /* vpsrldq */
+ { { 0x73, 0xf0 }, 3, F, N, pfx_66, W1, Ln }, /* vpsllq */
+ { { 0x73, 0xf8 }, 3, F, N, pfx_66, WIG, Ln }, /* vpslldq */
+ { { 0x74 }, 2, T, R, pfx_66, WIG, Ln }, /* vpcmpeqb */
+ { { 0x75 }, 2, T, R, pfx_66, WIG, Ln }, /* vpcmpeqw */
+ { { 0x76 }, 2, T, R, pfx_66, W0, Ln }, /* vpcmpeqd */
+ { { 0x78 }, 2, T, R, pfx_no, Wn, Ln }, /* vcvttp{s,d}2udq */
+ { { 0x78 }, 2, T, R, pfx_66, Wn, Ln }, /* vcvttp{s,d}2uqq */
+ { { 0x78 }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvttss2usi */
+ { { 0x78 }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvttsd2usi */
+ { { 0x79 }, 2, T, R, pfx_no, Wn, Ln }, /* vcvtp{s,d}2udq */
+ { { 0x79 }, 2, T, R, pfx_66, Wn, Ln }, /* vcvtp{s,d}2uqq */
+ { { 0x79 }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvtss2usi */
+ { { 0x79 }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvtsd2usi */
+ { { 0x7a }, 2, T, R, pfx_66, Wn, Ln }, /* vcvttp{s,d}2qq */
+ { { 0x7a }, 2, T, R, pfx_f3, Wn, Ln }, /* vcvtu{d,q}2pd */
+ { { 0x7a }, 2, T, R, pfx_f2, Wn, Ln }, /* vcvtu{d,q}2ps */
+ { { 0x7b }, 2, T, R, pfx_66, Wn, Ln }, /* vcvtp{s,d}2qq */
+ { { 0x7b }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvtusi2ss */
+ { { 0x7b }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvtusi2sd */
+ { { 0x7e }, 2, T, W, pfx_66, Wn, L0 }, /* vmov{d,q} */
+ { { 0x7e }, 2, T, R, pfx_f3, W1, L0 }, /* vmovq */
+ { { 0x7f }, 2, T, W, pfx_66, Wn, Ln }, /* vmovdqa{32,64} */
+ { { 0x7f }, 2, T, W, pfx_f3, Wn, Ln }, /* vmovdqu{32,64} */
+ { { 0x7f }, 2, T, W, pfx_f2, Wn, Ln }, /* vmovdqu{8,16} */
+ { { 0xc2 }, 3, T, R, pfx_no, W0, Ln }, /* vcmpps */
+ { { 0xc2 }, 3, T, R, pfx_66, W1, Ln }, /* vcmppd */
+ { { 0xc2 }, 3, T, R, pfx_f3, W0, LIG }, /* vcmpss */
+ { { 0xc2 }, 3, T, R, pfx_f2, W1, LIG }, /* vcmpsd */
+ { { 0xc4 }, 3, T, R, pfx_66, WIG, L0 }, /* vpinsrw */
+ { { 0xc5, 0xc0 }, 3, F, N, pfx_66, WIG, L0 }, /* vpextrw */
+ { { 0xc6 }, 3, T, R, pfx_no, W0, Ln }, /* vshufps */
+ { { 0xc6 }, 3, T, R, pfx_66, W1, Ln }, /* vshufpd */
+ { { 0xd1 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsrlw */
+ { { 0xd2 }, 2, T, R, pfx_66, W0, Ln }, /* vpsrld */
+ { { 0xd3 }, 2, T, R, pfx_66, W1, Ln }, /* vpsrlq */
+ { { 0xd4 }, 2, T, R, pfx_66, W1, Ln }, /* vpaddq */
+ { { 0xd5 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmullw */
+ { { 0xd6 }, 2, T, W, pfx_66, W1, L0 }, /* vmovq */
+ { { 0xd8 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubusb */
+ { { 0xd9 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubusw */
+ { { 0xda }, 2, T, R, pfx_66, WIG, Ln }, /* vpminub */
+ { { 0xdb }, 2, T, R, pfx_66, Wn, Ln }, /* vpand{d,q} */
+ { { 0xdc }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddusb */
+ { { 0xdd }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddusw */
+ { { 0xde }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaxub */
+ { { 0xdf }, 2, T, R, pfx_66, Wn, Ln }, /* vpandn{d,q} */
+ { { 0xe0 }, 2, T, R, pfx_66, WIG, Ln }, /* vpavgb */
+ { { 0xe1 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsraw */
+ { { 0xe2 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsra{d,q} */
+ { { 0xe3 }, 2, T, R, pfx_66, WIG, Ln }, /* vpavgw */
+ { { 0xe4 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmulhuw */
+ { { 0xe5 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmulhw */
+ { { 0xe6 }, 2, T, R, pfx_66, WIG, Ln }, /* vcvttpd2dq */
+ { { 0xe6 }, 2, T, R, pfx_f3, Wn, Ln }, /* vcvt{d,q}q2pd */
+ { { 0xe6 }, 2, T, R, pfx_f2, WIG, Ln }, /* vcvtpd2dq */
+ { { 0xe7 }, 2, F, W, pfx_66, W0, Ln }, /* vmovntdq */
+ { { 0xe8 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubsb */
+ { { 0xe9 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubsw */
+ { { 0xea }, 2, T, R, pfx_66, WIG, Ln }, /* vpminsw */
+ { { 0xeb }, 2, T, R, pfx_66, Wn, Ln }, /* vpor{d,q} */
+ { { 0xec }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddsb */
+ { { 0xed }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddsw */
+ { { 0xee }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaxsw */
+ { { 0xef }, 2, T, R, pfx_66, Wn, Ln }, /* vpxor{d,q} */
+ { { 0xf1 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsllw */
+ { { 0xf2 }, 2, T, R, pfx_66, W0, Ln }, /* vpslld */
+ { { 0xf3 }, 2, T, R, pfx_66, W1, Ln }, /* vpsllq */
+ { { 0xf4 }, 2, T, R, pfx_66, W1, Ln }, /* vpmuludq */
+ { { 0xf5 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaddwd */
+ { { 0xf6 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsadbw */
+ { { 0xf8 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubb */
+ { { 0xf9 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubw */
+ { { 0xfa }, 2, T, R, pfx_66, W0, Ln }, /* vpsubd */
+ { { 0xfb }, 2, T, R, pfx_66, W1, Ln }, /* vpsubq */
+ { { 0xfc }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddb */
+ { { 0xfd }, 2, T, R, pfx_66, WIG, Ln }, /* vpaddw */
+ { { 0xfe }, 2, T, R, pfx_66, W0, Ln }, /* vpaddd */
+}, evex_0f38[] = {
+ { { 0x00 }, 2, T, R, pfx_66, WIG, Ln }, /* vpshufb */
+ { { 0x04 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaddubsw */
+ { { 0x0b }, 2, T, R, pfx_66, WIG, Ln }, /* vpmulhrsw */
+ { { 0x0c }, 2, T, R, pfx_66, W0, Ln }, /* vpermilps */
+ { { 0x0d }, 2, T, R, pfx_66, W1, Ln }, /* vpermilpd */
+ { { 0x10 }, 2, T, R, pfx_66, W1, Ln }, /* vpsrlvw */
+ { { 0x10 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovuswb */
+ { { 0x11 }, 2, T, R, pfx_66, W1, Ln }, /* vpsravw */
+ { { 0x11 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovusdb */
+ { { 0x12 }, 2, T, R, pfx_66, W1, Ln }, /* vpsllvw */
+ { { 0x12 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovusqb */
+ { { 0x13 }, 2, T, R, pfx_66, W0, Ln }, /* vcvtph2ps */
+ { { 0x13 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovusdw */
+ { { 0x14 }, 2, T, R, pfx_66, Wn, Ln }, /* vprorv{d,q} */
+ { { 0x14 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovusqw */
+ { { 0x15 }, 2, T, R, pfx_66, Wn, Ln }, /* vprolv{d,q} */
+ { { 0x15 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovusqd */
+ { { 0x16 }, 2, T, R, pfx_66, Wn, L1|L2 }, /* vpermp{s,d} */
+ { { 0x18 }, 2, T, R, pfx_66, W0, Ln }, /* vbroadcastss */
+ { { 0x19 }, 2, T, R, pfx_66, Wn, L1|L2 }, /* vbroadcast{f32x2,sd} */
+ { { 0x1a }, 2, F, R, pfx_66, Wn, L1|L2 }, /* vbroadcastf{32x4,64x2} */
+ { { 0x1b }, 2, F, R, pfx_66, Wn, L2 }, /* vbroadcastf{32x8,64x4} */
+ { { 0x1c }, 2, T, R, pfx_66, WIG, Ln }, /* vpabsb */
+ { { 0x1d }, 2, T, R, pfx_66, WIG, Ln }, /* vpabsw */
+ { { 0x1e }, 2, T, R, pfx_66, W0, Ln }, /* vpabsd */
+ { { 0x1f }, 2, T, R, pfx_66, W1, Ln }, /* vpabsq */
+ { { 0x20 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovsxbw */
+ { { 0x20 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovswb */
+ { { 0x21 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovsxbd */
+ { { 0x21 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovsdb */
+ { { 0x22 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovsxbq */
+ { { 0x22 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovsqb */
+ { { 0x23 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovsxwd */
+ { { 0x23 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovsdw */
+ { { 0x24 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovsxwq */
+ { { 0x24 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovsqw */
+ { { 0x25 }, 2, T, R, pfx_66, W0, Ln }, /* vpmovsxdq */
+ { { 0x25 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovsqd */
+ { { 0x26 }, 2, T, R, pfx_66, Wn, Ln }, /* vptestm{b,w} */
+ { { 0x26 }, 2, T, R, pfx_f3, Wn, Ln }, /* vptestnm{b,w} */
+ { { 0x27 }, 2, T, R, pfx_66, Wn, Ln }, /* vptestm{d,q} */
+ { { 0x27 }, 2, T, R, pfx_f3, Wn, Ln }, /* vptestnm{d,q} */
+ { { 0x28 }, 2, T, R, pfx_66, W1, Ln }, /* vpmuldq */
+ { { 0x28, 0xc0 }, 2, F, N, pfx_f3, Wn, Ln }, /* vpmovm2{b,w} */
+ { { 0x29 }, 2, T, R, pfx_66, W1, Ln }, /* vpcmpeqq */
+ { { 0x29, 0xc0 }, 2, F, N, pfx_f3, Wn, Ln }, /* vpmov{b,w}2m */
+ { { 0x2a }, 2, F, R, pfx_66, W0, Ln }, /* vmovntdqa */
+ { { 0x2a, 0xc0 }, 2, F, N, pfx_f3, W1, Ln }, /* vpbroadcastmb2q */
+ { { 0x2b }, 2, T, R, pfx_66, W0, Ln }, /* vpackusdw */
+ { { 0x2c }, 2, F, R, pfx_66, Wn, Ln }, /* vscalefp{s,d} */
+ { { 0x2d }, 2, F, R, pfx_66, Wn, LIG }, /* vscalefs{s,d} */
+ { { 0x30 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovzxbw */
+ { { 0x30 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovwb */
+ { { 0x31 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovzxbd */
+ { { 0x31 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovdb */
+ { { 0x32 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovzxbq */
+ { { 0x32 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovqb */
+ { { 0x33 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovzxwd */
+ { { 0x33 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovdw */
+ { { 0x34 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmovzxwq */
+ { { 0x34 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovqw */
+ { { 0x35 }, 2, T, R, pfx_66, W0, Ln }, /* vpmovzxdq */
+ { { 0x35 }, 2, T, W, pfx_f3, W0, Ln }, /* vpmovqd */
+ { { 0x36 }, 2, T, R, pfx_66, Wn, L1|L2 }, /* vperm{d,q} */
+ { { 0x37 }, 2, T, R, pfx_66, W1, Ln }, /* vpcmpgtq */
+ { { 0x38 }, 2, T, R, pfx_66, WIG, Ln }, /* vpminsb */
+ { { 0x38, 0xc0 }, 2, F, N, pfx_f3, Wn, Ln }, /* vpmovm2{d,q} */
+ { { 0x39 }, 2, T, R, pfx_66, Wn, Ln }, /* vpmins{d,q} */
+ { { 0x39, 0xc0 }, 2, F, N, pfx_f3, Wn, Ln }, /* vpmov{d,q}2m */
+ { { 0x3a }, 2, T, R, pfx_66, WIG, Ln }, /* vpminuw */
+ { { 0x3a, 0xc0 }, 2, F, N, pfx_f3, W0, Ln }, /* vpbroadcastmw2d */
+ { { 0x3b }, 2, T, R, pfx_66, Wn, Ln }, /* vpminu{d,q} */
+ { { 0x3c }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaxsb */
+ { { 0x3d }, 2, T, R, pfx_66, Wn, Ln }, /* vpmaxs{d,q} */
+ { { 0x3e }, 2, T, R, pfx_66, WIG, Ln }, /* vpmaxuw */
+ { { 0x3f }, 2, T, R, pfx_66, Wn, Ln }, /* vpmaxu{d,q} */
+ { { 0x40 }, 2, T, R, pfx_66, Wn, Ln }, /* vpmull{d,q} */
+ { { 0x42 }, 2, T, R, pfx_66, Wn, Ln }, /* vgetexpp{s,d} */
+ { { 0x43 }, 2, T, R, pfx_66, Wn, LIG }, /* vgetexps{s,d} */
+ { { 0x44 }, 2, T, R, pfx_66, Wn, Ln }, /* vlzcnt{d,q} */
+ { { 0x45 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsrlv{d,q} */
+ { { 0x46 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsrav{d,q} */
+ { { 0x47 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsllv{d,q} */
+ { { 0x4c }, 2, T, R, pfx_66, Wn, Ln }, /* vrcp14p{s,d} */
+ { { 0x4d }, 2, T, R, pfx_66, Wn, LIG }, /* vrcp14s{s,d} */
+ { { 0x4e }, 2, T, R, pfx_66, Wn, Ln }, /* vrsqrt14p{s,d} */
+ { { 0x4f }, 2, T, R, pfx_66, Wn, LIG }, /* vrsqrt14s{s,d} */
+ { { 0x50 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpbusd */
+ { { 0x51 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpbusds */
+ { { 0x52 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpwssd */
+ { { 0x52 }, 2, T, R, pfx_f3, W0, Ln }, /* vdpbf16ps */
+ { { 0x52 }, 2, T, R, pfx_f2, W0, L2 }, /* vp4dpwssd */
+ { { 0x53 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpwssds */
+ { { 0x53 }, 2, T, R, pfx_f2, W0, L2 }, /* vp4dpwssds */
+ { { 0x54 }, 2, T, R, pfx_66, Wn, Ln }, /* vpopcnt{b,w} */
+ { { 0x55 }, 2, T, R, pfx_66, Wn, Ln }, /* vpopcnt{d,q} */
+ { { 0x58 }, 2, T, R, pfx_66, W0, Ln }, /* vpbroadcastd */
+ { { 0x59 }, 2, T, R, pfx_66, Wn, Ln }, /* vbroadcasti32x2 / vpbroadcastq */
+ { { 0x5a }, 2, F, R, pfx_66, Wn, L1|L2 }, /* vbroadcasti{32x4,64x2} */
+ { { 0x5b }, 2, F, R, pfx_66, Wn, L2 }, /* vbroadcasti{32x8,64x4} */
+ { { 0x62 }, 2, T, R, pfx_66, Wn, Ln }, /* vpexpand{b,w} */
+ { { 0x63 }, 2, T, W, pfx_66, Wn, Ln }, /* vpcompress{b,w} */
+ { { 0x64 }, 2, T, R, pfx_66, Wn, Ln }, /* vpblendm{d,q} */
+ { { 0x65 }, 2, T, R, pfx_66, Wn, Ln }, /* vblendmp{s,d} */
+ { { 0x66 }, 2, T, R, pfx_66, Wn, Ln }, /* vpblendm{b,w} */
+ { { 0x68 }, 2, T, R, pfx_f2, Wn, Ln }, /* vp2intersect{d,q} */
+ { { 0x70 }, 2, T, R, pfx_66, W1, Ln }, /* vpshldvw */
+ { { 0x71 }, 2, T, R, pfx_66, Wn, Ln }, /* vpshldv{d,q} */
+ { { 0x72 }, 2, T, R, pfx_66, W1, Ln }, /* vpshrdvw */
+ { { 0x72 }, 2, T, R, pfx_f3, W0, Ln }, /* vcvtneps2bf16 */
+ { { 0x72 }, 2, T, R, pfx_f2, W0, Ln }, /* vcvtne2ps2bf16 */
+ { { 0x73 }, 2, T, R, pfx_66, Wn, Ln }, /* vpshrdv{d,q} */
+ { { 0x75 }, 2, T, R, pfx_66, Wn, Ln }, /* vpermi2{b,w} */
+ { { 0x76 }, 2, T, R, pfx_66, Wn, Ln }, /* vpermi2{d,q} */
+ { { 0x77 }, 2, T, R, pfx_66, Wn, Ln }, /* vpermi2p{s,d} */
+ { { 0x78 }, 2, T, R, pfx_66, W0, Ln }, /* vpbroadcastb */
+ { { 0x79 }, 2, T, R, pfx_66, W0, Ln }, /* vpbroadcastw */
+ { { 0x7a, 0xc0 }, 2, F, N, pfx_66, W0, Ln }, /* vpbroadcastb */
+ { { 0x7b, 0xc0 }, 2, F, N, pfx_66, W0, Ln }, /* vpbroadcastw */
+ { { 0x7c, 0xc0 }, 2, F, N, pfx_66, W0, Ln }, /* vpbroadcast{d,q} */
+ { { 0x7d }, 2, T, R, pfx_66, Wn, Ln }, /* vpermt2{b,w} */
+ { { 0x7e }, 2, T, R, pfx_66, Wn, Ln }, /* vpermt2{d,q} */
+ { { 0x7f }, 2, T, R, pfx_66, Wn, Ln }, /* vpermt2p{s,d} */
+ { { 0x83 }, 2, T, R, pfx_66, W1, Ln }, /* vpmultishiftqb */
+ { { 0x88 }, 2, T, R, pfx_66, Wn, Ln }, /* vexpandp{s,d} */
+ { { 0x89 }, 2, T, R, pfx_66, Wn, Ln }, /* vpexpand{d,q} */
+ { { 0x8a }, 2, T, W, pfx_66, Wn, Ln }, /* vcompressp{s,d} */
+ { { 0x8b }, 2, T, W, pfx_66, Wn, Ln }, /* vpcompress{d,q} */
+ { { 0x8d }, 2, F, R, pfx_66, Wn, Ln }, /* vperm{b,w} */
+ { { 0x8f }, 2, F, R, pfx_66, W0, Ln }, /* vpshufbitqmb */
+ { { 0x90, VSIB(1) }, 3, F, R, pfx_66, Wn, Ln, T }, /* vpgatherd{d,q} */
+ { { 0x91, VSIB(1) }, 3, F, R, pfx_66, Wn, Ln, T }, /* vpgatherq{d,q} */
+ { { 0x92, VSIB(1) }, 3, F, R, pfx_66, Wn, Ln, T }, /* vgatherdp{s,d} */
+ { { 0x93, VSIB(1) }, 3, F, R, pfx_66, Wn, Ln, T }, /* vgatherqp{s,d} */
+ { { 0x96 }, 2, T, R, pfx_66, Wn, Ln }, /* vfmaddsub132p{s,d} */
+ { { 0x97 }, 2, T, R, pfx_66, Wn, Ln }, /* vfmsubadd132p{s,d} */
+ { { 0x98 }, 2, T, R, pfx_66, Wn, Ln }, /* vfmadd132p{s,d} */
+ { { 0x99 }, 2, T, R, pfx_66, Wn, LIG }, /* vfmadd132s{s,d} */
+ { { 0x9a }, 2, T, R, pfx_66, Wn, Ln }, /* vfmsub132p{s,d} */
+ { { 0x9a }, 2, T, R, pfx_f2, W0, L2 }, /* v4fmaddps */
+ { { 0x9b }, 2, T, R, pfx_66, Wn, LIG }, /* vfmsub132s{s,d} */
+ { { 0x9b }, 2, T, R, pfx_f2, W0, LIG }, /* v4fmaddss */
+ { { 0x9c }, 2, T, R, pfx_66, Wn, Ln }, /* vfnmadd132p{s,d} */
+ { { 0x9d }, 2, T, R, pfx_66, Wn, LIG }, /* vfnmadd132s{s,d} */
+ { { 0x9e }, 2, T, R, pfx_66, Wn, Ln }, /* vfnmsub132p{s,d} */
+ { { 0x9f }, 2, T, R, pfx_66, Wn, LIG }, /* vfnmsub132s{s,d} */
+ { { 0xa0, VSIB(1) }, 3, F, W, pfx_66, Wn, Ln, T }, /* vpscatterd{d,q} */
+ { { 0xa1, VSIB(1) }, 3, F, W, pfx_66, Wn, Ln, T }, /* vpscatterq{d,q} */
+ { { 0xa2, VSIB(1) }, 3, F, W, pfx_66, Wn, Ln, T }, /* vscatterdp{s,d} */
+ { { 0xa3, VSIB(1) }, 3, F, W, pfx_66, Wn, Ln, T }, /* vscatterqp{s,d} */
+ { { 0xa6 }, 2, T, R, pfx_66, Wn, Ln }, /* vfmaddsub213p{s,d} */
+ { { 0xa7 }, 2, T, R, pfx_66, Wn, Ln }, /* vfmsubadd213p{s,d} */
+ { { 0xa8 }, 2, T, R, pfx_66, Wn, Ln }, /* vfmadd213p{s,d} */
+ { { 0xa9 }, 2, T, R, pfx_66, Wn, LIG }, /* vfmadd213s{s,d} */
+ { { 0xaa }, 2, T, R, pfx_f2, W0, L2 }, /* v4fnmaddps */
+ { { 0xaa }, 2, T, R, pfx_66, Wn, Ln }, /* vfmsub213p{s,d} */
+ { { 0xab }, 2, T, R, pfx_66, Wn, LIG }, /* vfmsub213s{s,d} */
+ { { 0xab }, 2, T, R, pfx_f2, W0, LIG }, /* v4fnmaddss */
+ { { 0xac }, 2, T, R, pfx_66, Wn, Ln }, /* vfnmadd213p{s,d} */
+ { { 0xad }, 2, T, R, pfx_66, Wn, LIG }, /* vfnmadd213s{s,d} */
+ { { 0xae }, 2, T, R, pfx_66, Wn, Ln }, /* vfnmsub213p{s,d} */
+ { { 0xaf }, 2, T, R, pfx_66, Wn, LIG }, /* vfnmsub213s{s,d} */
+ { { 0xb4 }, 2, T, R, pfx_66, W1, Ln }, /* vpmadd52luq */
+ { { 0xb5 }, 2, T, R, pfx_66, W1, Ln }, /* vpmadd52huq */
+ { { 0xb6 }, 2, T, R, pfx_66, Wn, Ln }, /* vfmaddsub231p{s,d} */
+ { { 0xb7 }, 2, T, R, pfx_66, Wn, Ln }, /* vfmsubadd231p{s,d} */
+ { { 0xb8 }, 2, T, R, pfx_66, Wn, Ln }, /* vfmadd231p{s,d} */
+ { { 0xb9 }, 2, T, R, pfx_66, Wn, LIG }, /* vfmadd231s{s,d} */
+ { { 0xba }, 2, T, R, pfx_66, Wn, Ln }, /* vfmsub231p{s,d} */
+ { { 0xbb }, 2, T, R, pfx_66, Wn, LIG }, /* vfmsub231s{s,d} */
+ { { 0xbc }, 2, T, R, pfx_66, Wn, Ln }, /* vfnmadd231p{s,d} */
+ { { 0xbd }, 2, T, R, pfx_66, Wn, LIG }, /* vfnmadd231s{s,d} */
+ { { 0xbe }, 2, T, R, pfx_66, Wn, Ln }, /* vfnmsub231p{s,d} */
+ { { 0xbf }, 2, T, R, pfx_66, Wn, LIG }, /* vfnmsub231s{s,d} */
+ { { 0xc4 }, 2, T, R, pfx_66, Wn, Ln }, /* vpconflict{d,q} */
+ { { 0xc6, VSIB(1) }, 3, F, N, pfx_66, Wn, L2, T }, /* vgatherpf0dp{s,d} */
+ { { 0xc6, VSIB(2) }, 3, F, N, pfx_66, Wn, L2, T }, /* vgatherpf1dp{s,d} */
+ { { 0xc6, VSIB(5) }, 3, F, N, pfx_66, Wn, L2, T }, /* vscatterpf0dp{s,d} */
+ { { 0xc6, VSIB(6) }, 3, F, N, pfx_66, Wn, L2, T }, /* vscatterpf1dp{s,d} */
+ { { 0xc7, VSIB(1) }, 3, F, N, pfx_66, Wn, L2, T }, /* vgatherpf0qp{s,d} */
+ { { 0xc7, VSIB(2) }, 3, F, N, pfx_66, Wn, L2, T }, /* vgatherpf1qp{s,d} */
+ { { 0xc7, VSIB(5) }, 3, F, N, pfx_66, Wn, L2, T }, /* vscatterpf0qp{s,d} */
+ { { 0xc7, VSIB(6) }, 3, F, N, pfx_66, Wn, L2, T }, /* vscatterpf1qp{s,d} */
+ { { 0xc8 }, 2, T, R, pfx_66, Wn, L2 }, /* vexp2p{s,d} */
+ { { 0xca }, 2, T, R, pfx_66, Wn, L2 }, /* vrcp28p{s,d} */
+ { { 0xcb }, 2, T, R, pfx_66, Wn, LIG }, /* vrcp28s{s,d} */
+ { { 0xcc }, 2, T, R, pfx_66, Wn, L2 }, /* vrsqrt28p{s,d} */
+ { { 0xcd }, 2, T, R, pfx_66, Wn, LIG }, /* vrsqrt28s{s,d} */
+ { { 0xcf }, 2, T, R, pfx_66, W0, Ln }, /* vgf2p8mulb */
+ { { 0xdc }, 2, T, R, pfx_66, WIG, Ln }, /* vaesenc */
+ { { 0xdd }, 2, T, R, pfx_66, WIG, Ln }, /* vaesenclast */
+ { { 0xde }, 2, T, R, pfx_66, WIG, Ln }, /* vaesdec */
+ { { 0xdf }, 2, T, R, pfx_66, WIG, Ln }, /* vaesdeclast */
+}, evex_0f3a[] = {
+ { { 0x00 }, 3, T, R, pfx_66, W1, L1|L2 }, /* vpermq */
+ { { 0x01 }, 3, T, R, pfx_66, W1, L1|L2 }, /* vpermpd */
+ { { 0x03 }, 3, T, R, pfx_66, Wn, Ln }, /* valign{d,q} */
+ { { 0x04 }, 3, T, R, pfx_66, W0, Ln }, /* vpermilps */
+ { { 0x05 }, 3, T, R, pfx_66, W1, Ln }, /* vpermilpd */
+ { { 0x08 }, 3, T, R, pfx_66, W0, Ln }, /* vrndscaleps */
+ { { 0x09 }, 3, T, R, pfx_66, W1, Ln }, /* vrndscalepd */
+ { { 0x0a }, 3, T, R, pfx_66, WIG, LIG }, /* vrndscaless */
+ { { 0x0b }, 3, T, R, pfx_66, WIG, LIG }, /* vrndscalesd */
+ { { 0x0f }, 3, T, R, pfx_66, WIG, Ln }, /* vpalignr */
+ { { 0x14 }, 3, T, W, pfx_66, WIG, L0 }, /* vpextrb */
+ { { 0x15 }, 3, T, W, pfx_66, WIG, L0 }, /* vpextrw */
+ { { 0x16 }, 3, T, W, pfx_66, Wn, L0 }, /* vpextr{d,q} */
+ { { 0x17 }, 3, T, W, pfx_66, WIG, L0 }, /* vextractps */
+ { { 0x18 }, 3, T, R, pfx_66, Wn, L1|L2 }, /* vinsertf{32x4,64x2} */
+ { { 0x19 }, 3, T, W, pfx_66, Wn, L1|L2 }, /* vextractf{32x4,64x2} */
+ { { 0x1a }, 3, T, R, pfx_66, Wn, L2 }, /* vinsertf{32x8,64x4} */
+ { { 0x1b }, 3, T, W, pfx_66, Wn, L2 }, /* vextractf{32x8,64x4} */
+ { { 0x1d }, 3, T, W, pfx_66, W0, Ln }, /* vcvtps2ph */
+ { { 0x1e }, 3, T, R, pfx_66, Wn, Ln }, /* vpcmpu{d,q} */
+ { { 0x1f }, 3, T, R, pfx_66, Wn, Ln }, /* vpcmp{d,q} */
+ { { 0x20 }, 3, T, R, pfx_66, WIG, L0 }, /* vpinsrb */
+ { { 0x21 }, 3, T, R, pfx_66, WIG, L0 }, /* vinsertps */
+ { { 0x22 }, 3, T, R, pfx_66, Wn, L0 }, /* vpinsr{d,q} */
+ { { 0x23 }, 3, T, R, pfx_66, Wn, L1|L2 }, /* vshuff{32x4,64x2} */
+ { { 0x25 }, 3, T, R, pfx_66, Wn, Ln }, /* vpternlog{d,q} */
+ { { 0x26 }, 3, T, R, pfx_66, Wn, Ln }, /* vgetmantp{s,d} */
+ { { 0x27 }, 3, T, R, pfx_66, Wn, LIG }, /* vgetmants{s,d} */
+ { { 0x38 }, 3, T, R, pfx_66, Wn, L1|L2 }, /* vinserti{32x4,64x2} */
+ { { 0x39 }, 3, T, W, pfx_66, Wn, L1|L2 }, /* vextracti{32x4,64x2} */
+ { { 0x3a }, 3, T, R, pfx_66, Wn, L2 }, /* vinserti{32x8,64x4} */
+ { { 0x3b }, 3, T, W, pfx_66, Wn, L2 }, /* vextracti{32x8,64x4} */
+ { { 0x3e }, 3, T, R, pfx_66, Wn, Ln }, /* vpcmpu{b,w} */
+ { { 0x3f }, 3, T, R, pfx_66, Wn, Ln }, /* vpcmp{b,w} */
+ { { 0x42 }, 3, T, R, pfx_66, W0, Ln }, /* vdbpsadbw */
+ { { 0x43 }, 3, T, R, pfx_66, Wn, L1|L2 }, /* vshufi{32x4,64x2} */
+ { { 0x44 }, 3, T, R, pfx_66, WIG, Ln }, /* vpclmulqdq */
+ { { 0x50 }, 3, T, R, pfx_66, Wn, Ln }, /* vrangep{s,d} */
+ { { 0x51 }, 3, T, R, pfx_66, Wn, LIG }, /* vranges{s,d} */
+ { { 0x54 }, 3, T, R, pfx_66, Wn, Ln }, /* vfixupimmp{s,d} */
+ { { 0x55 }, 3, T, R, pfx_66, Wn, LIG }, /* vfixupimms{s,d} */
+ { { 0x56 }, 3, T, R, pfx_66, Wn, Ln }, /* vreducep{s,d} */
+ { { 0x57 }, 3, T, R, pfx_66, Wn, LIG }, /* vreduces{s,d} */
+ { { 0x66 }, 3, T, R, pfx_66, Wn, Ln }, /* vfpclassp{s,d} */
+ { { 0x67 }, 3, T, R, pfx_66, Wn, LIG }, /* vfpclasss{s,d} */
+ { { 0x70 }, 3, T, R, pfx_66, W1, Ln }, /* vpshldw */
+ { { 0x71 }, 3, T, R, pfx_66, Wn, Ln }, /* vpshld{d,q} */
+ { { 0x72 }, 3, T, R, pfx_66, W1, Ln }, /* vpshrdw */
+ { { 0x73 }, 3, T, R, pfx_66, Wn, Ln }, /* vpshrd{d,q} */
+ { { 0xce }, 3, T, R, pfx_66, W1, Ln }, /* vgf2p8affineqb */
+ { { 0xcf }, 3, T, R, pfx_66, W1, Ln }, /* vgf2p8affineinvqb */
+};
+
+static const struct {
+ const struct evex *tbl;
+ unsigned int num;
+} evex[] = {
+ { evex_0f, ARRAY_SIZE(evex_0f) },
+ { evex_0f38, ARRAY_SIZE(evex_0f38) },
+ { evex_0f3a, ARRAY_SIZE(evex_0f3a) },
+};
+
+#undef Wn
+
#undef F
#undef N
#undef R
@@ -1883,6 +2343,50 @@ void predicates_test(void *instr, struct
}
}
}
+ }
+
+ for ( x = 0; x < ARRAY_SIZE(evex); ++x )
+ {
+ for ( t = 0; t < evex[x].num; ++t )
+ {
+ uint8_t *ptr = instr;
+ unsigned int l;
+
+ memset(instr + 5, 0xcc, 10);
+
+ *ptr++ = 0x62;
+ *ptr++ = 0xf1 + x;
+ *ptr++ = 0x7c | evex[x].tbl[t].pfx;
+ *ptr++ = 0x08 | evex[x].tbl[t].mask;
+
+ for ( l = 3; l--; )
+ {
+ if ( evex[x].tbl[t].l != LIG && !(evex[x].tbl[t].l & (1u << l)) )
+ continue;
+
+ ptr[-1] &= ~0x60;
+ ptr[-1] |= l << 5;
+ memcpy(ptr, evex[x].tbl[t].opc, evex[x].tbl[t].len);
+
+ if ( evex[x].tbl[t].w == WIG || (evex[x].tbl[t].w & W0) )
+ {
+ ptr[-2] &= ~0x80;
+ do_test(instr, evex[x].tbl[t].len + ((void *)ptr - instr),
+ evex[x].tbl[t].modrm ? (void *)ptr - instr + 1 : 0,
+ evex[x].tbl[t].mem, ctxt, fetch);
+ }
+
+ if ( evex[x].tbl[t].w == WIG || (evex[x].tbl[t].w & W1) )
+ {
+ ptr[-2] |= 0x80;
+ memcpy(ptr, evex[x].tbl[t].opc, evex[x].tbl[t].len);
+
+ do_test(instr, evex[x].tbl[t].len + ((void *)ptr - instr),
+ evex[x].tbl[t].modrm ? (void *)ptr - instr + 1 : 0,
+ evex[x].tbl[t].mem, ctxt, fetch);
+ }
+ }
+ }
}
if ( errors )
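
For readers following the prefix bytes that the new loop in predicates_test() writes (0x62, 0xf1 + x, 0x7c | pfx, 0x08 | mask, with L'L and W patched in afterwards), here is a minimal sketch of how those constants map onto the EVEX prefix fields. build_evex_prefix() and its parameter names are hypothetical and not part of the patch. The sketch assumes no extended registers, an unused vvvv field, and neither zeroing nor broadcast; that the table's pfx values match the architectural pp encoding (0 = none, 1 = 66, 2 = F3, 3 = F2) is likewise an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: assemble the four EVEX
 * prefix bytes for the simple case exercised by the test loop.
 *
 *   map: 1 = 0f, 2 = 0f38, 3 = 0f3a (the loop's "0xf1 + x")
 *   pp:  0 = none, 1 = 66, 2 = f3, 3 = f2
 *   w:   EVEX.W
 *   ll:  EVEX.L'L (0 = 128-bit, 1 = 256-bit, 2 = 512-bit)
 *   aaa: opmask register number
 */
static size_t build_evex_prefix(uint8_t out[4], unsigned int map,
                                unsigned int pp, bool w,
                                unsigned int ll, unsigned int aaa)
{
    assert(map >= 1 && map <= 3);
    assert(pp <= 3 && ll <= 2 && aaa <= 7);

    out[0] = 0x62;
    /* R, X, B, R' are stored inverted; all set means "no extension". */
    out[1] = (uint8_t)(0xf0 | map);
    /* vvvv = 1111b (inverted, unused), bit 2 is fixed at 1. */
    out[2] = (uint8_t)(0x7c | pp | (w ? 0x80 : 0));
    /* V' is stored inverted (bit 3 set); z and b stay clear. */
    out[3] = (uint8_t)(0x08 | (ll << 5) | aaa);

    return 4;
}
```

With map = 2, pp = 1, W = 1, L'L = 2 and aaa = 1 this yields the bytes 62 f2 fd 49, which is what the loop would produce for a 512-bit, W1, 66-prefixed 0f38 entry with the mask field set.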
* [PATCH 10/10] x86emul: correct AVX512_BF16 insn names in EVEX Disp8 test
2020-08-03 14:47 [PATCH 00/10] x86emul: full coverage mem access / write testing Jan Beulich
` (8 preceding siblings ...)
2020-08-03 14:54 ` [PATCH 09/10] x86emul: extend decoding / mem access testing to EVEX-encoded insns Jan Beulich
@ 2020-08-03 14:54 ` Jan Beulich
2020-08-03 16:40 ` [PATCH 00/10] x86emul: full coverage mem access / write testing Andrew Cooper
2020-08-04 14:59 ` Andrew Cooper
11 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2020-08-03 14:54 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
The leading 'v' ought to be omitted from the table entries.

Fixes: 7ff66809ccd5 ("x86emul: support AVX512_BF16 insns")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -551,9 +551,9 @@ static const struct test avx512_4vnniw_5
};
static const struct test avx512_bf16_all[] = {
- INSN(vcvtne2ps2bf16, f2, 0f38, 72, vl, d, vl),
- INSN(vcvtneps2bf16, f3, 0f38, 72, vl, d, vl),
- INSN(vdpbf16ps, f3, 0f38, 52, vl, d, vl),
+ INSN(cvtne2ps2bf16, f2, 0f38, 72, vl, d, vl),
+ INSN(cvtneps2bf16, f3, 0f38, 72, vl, d, vl),
+ INSN(dpbf16ps, f3, 0f38, 52, vl, d, vl),
};
static const struct test avx512_bitalg_all[] = {
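
A rough illustration of why a stray leading 'v' in such a table would matter, assuming (this is not confirmed by the patch text) that the macro stringifies its first argument and that the harness adds the 'v' itself when printing a mnemonic. INSN_SKETCH, struct test_sketch and format_name() are made-up names for this sketch only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the real table type and INSN() macro. */
struct test_sketch {
    const char *mnemonic;
};

#define INSN_SKETCH(insn) { .mnemonic = #insn }

/*
 * Assumed reporting scheme: the harness prepends 'v', so table entries
 * must not carry one themselves.
 */
static int format_name(char *buf, size_t len, const struct test_sketch *t)
{
    assert(buf && len && t && t->mnemonic);

    return snprintf(buf, len, "v%s", t->mnemonic);
}
```

Under that assumption an entry written as INSN_SKETCH(vdpbf16ps) would be reported as "vvdpbf16ps", while INSN_SKETCH(dpbf16ps) gives the intended "vdpbf16ps".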
* Re: [PATCH 00/10] x86emul: full coverage mem access / write testing
2020-08-03 14:47 [PATCH 00/10] x86emul: full coverage mem access / write testing Jan Beulich
` (9 preceding siblings ...)
2020-08-03 14:54 ` [PATCH 10/10] x86emul: correct AVX512_BF16 insn names in EVEX Disp8 test Jan Beulich
@ 2020-08-03 16:40 ` Andrew Cooper
2020-08-04 6:42 ` Jan Beulich
2020-08-04 7:38 ` Jan Beulich
2020-08-04 14:59 ` Andrew Cooper
11 siblings, 2 replies; 17+ messages in thread
From: Andrew Cooper @ 2020-08-03 16:40 UTC (permalink / raw)
To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monné
On 03/08/2020 15:47, Jan Beulich wrote:
> ... and a few fixes resulting from this work. This completes what
> was started for legacy encoded GPR insns in a rush before 4.14.
>
> There's one thing I'm still planning on top of both this and the
> EVEX-disp8 checking: For all encodings we produce via general
> logic (and in particular without involvement of any assembler) I'd
> like to add a kind of logging mechanism, the output of which could
> be fed to gas and then some disassembler, to allow verification
> that the produced encodings are actually valid ones. See e.g. the
> first patch here or commit 5f55389d6960 - the problems addressed
> there could have been caught earlier if the generated encodings
> could be easily disassembled. What's not clear to me here is
> whether this is deemed generally useful, or whether I should make
> this a private addition of mine.
Seems fine to me.

I have encountered a failure on AMD Naples which I doubt is related to
this series, but is blocking testing on some of the content here.

Testing fnstenv 4(%ecx)... failed!

AMD Fam17 does have the fcs/fds save-as-zero logic which is still not
wired up anywhere in Xen, which seems like the most likely candidate
here (without having investigated the issue at all yet).

~Andrew
* Re: [PATCH 00/10] x86emul: full coverage mem access / write testing
2020-08-03 16:40 ` [PATCH 00/10] x86emul: full coverage mem access / write testing Andrew Cooper
@ 2020-08-04 6:42 ` Jan Beulich
2020-08-04 7:38 ` Jan Beulich
1 sibling, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2020-08-04 6:42 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monné
On 03.08.2020 18:40, Andrew Cooper wrote:
> On 03/08/2020 15:47, Jan Beulich wrote:
>> ... and a few fixes resulting from this work. This completes what
>> was started for legacy encoded GPR insns in a rush before 4.14.
>>
>> There's one thing I'm still planning on top of both this and the
>> EVEX-disp8 checking: For all encodings we produce via general
>> logic (and in particular without involvement of any assembler) I'd
>> like to add a kind of logging mechanism, the output of which could
>> be fed to gas and then some disassembler, to allow verification
>> that the produced encodings are actually valid ones. See e.g. the
>> first patch here or commit 5f55389d6960 - the problems addressed
>> there could have been caught earlier if the generated encodings
>> could be easily disassembled. What's not clear to me here is
>> whether this is deemed generally useful, or whether I should make
>> this a private addition of mine.
>
> Seems fine to me.
>
> I have encountered a failure on AMD Naples which I doubt is related to
> this series, but is blocking testing on some of the content here.
>
> Testing fnstenv 4(%ecx)... failed!
>
> AMD Fam17 does have the fcs/fds save-as-zero logic which is still not
> wired up anywhere in Xen, which seems like the most likely candidate
> here (without having investigated the issue at all yet).
There are two zap_fpsel() calls in place there, which I would have
thought would cover this. I'll see whether I can reproduce it on my
Rome box.
Jan
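
To make the fcs/fds discussion concrete, here is a small sketch of the 28-byte image that a 32-bit protected-mode FNSTENV stores, together with a comparison that ignores the two selector fields. This is not the harness's zap_fpsel(); struct fpu_env32, zap_selectors() and env_equal_ignoring_selectors() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* FNSTENV image, 32-bit protected-mode format (28 bytes). */
struct fpu_env32 {
    uint16_t fcw, rsvd0;   /* control word */
    uint16_t fsw, rsvd1;   /* status word */
    uint16_t ftw, rsvd2;   /* tag word */
    uint32_t fip;          /* last instruction pointer offset */
    uint16_t fcs;          /* last instruction pointer selector */
    uint16_t fop;          /* last opcode (low 11 bits) */
    uint32_t fdp;          /* last data pointer offset */
    uint16_t fds, rsvd3;   /* last data pointer selector */
};

_Static_assert(sizeof(struct fpu_env32) == 28, "unexpected padding");

/* Clear the selectors, which some CPUs save as zero. */
static void zap_selectors(struct fpu_env32 *env)
{
    assert(env);
    env->fcs = 0;
    env->fds = 0;
}

/* Compare two images while treating FCS/FDS as "don't care". */
static int env_equal_ignoring_selectors(const struct fpu_env32 *a,
                                        const struct fpu_env32 *b)
{
    struct fpu_env32 x = *a, y = *b;

    zap_selectors(&x);
    zap_selectors(&y);

    return memcmp(&x, &y, sizeof(x)) == 0;
}
```

A comparison of this kind tolerates hardware that stores zero selectors, but it would still flag a difference in FIP, FOP or FDP.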
* Re: [PATCH 00/10] x86emul: full coverage mem access / write testing
2020-08-03 16:40 ` [PATCH 00/10] x86emul: full coverage mem access / write testing Andrew Cooper
2020-08-04 6:42 ` Jan Beulich
@ 2020-08-04 7:38 ` Jan Beulich
1 sibling, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2020-08-04 7:38 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monné
On 03.08.2020 18:40, Andrew Cooper wrote:
> On 03/08/2020 15:47, Jan Beulich wrote:
>> ... and a few fixes resulting from this work. This completes what
>> was started for legacy encoded GPR insns in a rush before 4.14.
>>
>> There's one thing I'm still planning on top of both this and the
>> EVEX-disp8 checking: For all encodings we produce via general
>> logic (and in particular without involvement of any assembler) I'd
>> like to add a kind of logging mechanism, the output of which could
>> be fed to gas and then some disassembler, to allow verification
>> that the produced encodings are actually valid ones. See e.g. the
>> first patch here or commit 5f55389d6960 - the problems addressed
>> there could have been caught earlier if the generated encodings
>> could be easily disassembled. What's not clear to me here is
>> whether this is deemed generally useful, or whether I should make
>> this a private addition of mine.
>
> Seems fine to me.
>
> I have encountered a failure on AMD Naples which I doubt is related to
> this series, but is blocking testing on some of the content here.
>
> Testing fnstenv 4(%ecx)... failed!
>
> AMD Fam17 does have the fcs/fds save-as-zero logic which is still not
> wired up anywhere in Xen, which seems like the most likely candidate
> here (without having investigated the issue at all yet).
It seems FIP/FOP/FDP are lost over a context switch in Linux here.
I have no idea yet why a context switch would happen this reliably
on Fam17 but not on Fam15 (where I'd expect the behavior to be the
same as long as there's no unmasked exception).

Jan
* Re: [PATCH 00/10] x86emul: full coverage mem access / write testing
2020-08-03 14:47 [PATCH 00/10] x86emul: full coverage mem access / write testing Jan Beulich
` (10 preceding siblings ...)
2020-08-03 16:40 ` [PATCH 00/10] x86emul: full coverage mem access / write testing Andrew Cooper
@ 2020-08-04 14:59 ` Andrew Cooper
11 siblings, 0 replies; 17+ messages in thread
From: Andrew Cooper @ 2020-08-04 14:59 UTC (permalink / raw)
To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monné
On 03/08/2020 15:47, Jan Beulich wrote:
> ... and a few fixes resulting from this work. This completes what
> was started for legacy encoded GPR insns in a rush before 4.14.
>
> There's one thing I'm still planning on top of both this and the
> EVEX-disp8 checking: For all encodings we produce via general
> logic (and in particular without involvement of any assembler) I'd
> like to add a kind of logging mechanism, the output of which could
> be fed to gas and then some disassembler, to allow verification
> that the produced encodings are actually valid ones. See e.g. the
> first patch here or commit 5f55389d6960 - the problems addressed
> there could have been caught earlier if the generated encodings
> could be easily disassembled. What's not clear to me here is
> whether this is deemed generally useful, or whether I should make
> this a private addition of mine.
>
> 01: adjustments to mem access / write logic testing
> 02: extend decoding / mem access testing to FPU insns
> 03: extend decoding / mem access testing to MMX / SSE insns
> 04: extend decoding / mem access testing to VEX-encoded insns
> 05: extend decoding / mem access testing to XOP-encoded insns
> 06: AVX512{F,BW} down conversion moves are memory writes
> 07: AVX512F scatter insns are memory writes
> 08: AVX512PF insns aren't memory accesses
> 09: extend decoding / mem access testing to EVEX-encoded insns
> 10: correct AVX512_BF16 insn names in EVEX Disp8 test
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ideally with the commit message for patch 3 adjusted.
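
The logging mechanism floated in the cover letter (dumping generated encodings in a form that gas can assemble and a disassembler can then decode) might look roughly like the sketch below. It is only a guess at one possible shape: format_byte_directive() is a hypothetical helper, and emitting one `.byte` directive per instruction is an assumption rather than anything the series specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: render one generated encoding as a gas ".byte"
 * directive.  Returns the string length, or -1 if the buffer is too
 * small.
 */
static int format_byte_directive(char *buf, size_t buflen,
                                 const uint8_t *insn, size_t n)
{
    size_t pos = 0, i;

    assert(buf && buflen > 0 && insn && n > 0);

    for ( i = 0; i < n; ++i )
    {
        int rc = snprintf(buf + pos, buflen - pos, "%s0x%02x",
                          i ? ", " : "\t.byte ", (unsigned int)insn[i]);

        /* Treat truncation like an error so callers never log a partial line. */
        if ( rc < 0 || pos + (size_t)rc >= buflen )
            return -1;

        pos += (size_t)rc;
    }

    return (int)pos;
}
```

Each line produced this way could be collected into a file, assembled with gas, and the resulting object inspected with a disassembler such as objdump -d to check that every encoding decodes to the intended instruction.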