All of lore.kernel.org
 help / color / mirror / Atom feed
* [PULL 0/2] target/i386: fix cmpxchgl, lahf, sahf
@ 2022-11-14 23:38 Richard Henderson
  2022-11-14 23:38 ` [PULL 1/2] target/i386: fix cmpxchg with 32-bit register destination Richard Henderson
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Richard Henderson @ 2022-11-14 23:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: stefanha

The following changes since commit 98f10f0e2613ba1ac2ad3f57a5174014f6dcb03d:

  Merge tag 'pull-target-arm-20221114' of https://git.linaro.org/people/pmaydell/qemu-arm into staging (2022-11-14 13:31:17 -0500)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-x86-20221115

for you to fetch changes up to 35d95e4126d83c0bb0de83007494d184f6111b3d:

  target/i386: hardcode R_EAX as destination register for LAHF/SAHF (2022-11-15 09:34:42 +1000)

----------------------------------------------------------------
Fix cmpxchgl writeback to rax.
Fix lahf/sahf for 64-bit

----------------------------------------------------------------
Paolo Bonzini (2):
      target/i386: fix cmpxchg with 32-bit register destination
      target/i386: hardcode R_EAX as destination register for LAHF/SAHF

 target/i386/tcg/translate.c      | 86 +++++++++++++++++++++++++++-------------
 tests/tcg/x86_64/cmpxchg.c       | 42 ++++++++++++++++++++
 tests/tcg/x86_64/Makefile.target |  1 +
 3 files changed, 101 insertions(+), 28 deletions(-)
 create mode 100644 tests/tcg/x86_64/cmpxchg.c


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PULL 1/2] target/i386: fix cmpxchg with 32-bit register destination
  2022-11-14 23:38 [PULL 0/2] target/i386: fix cmpxchgl, lahf, sahf Richard Henderson
@ 2022-11-14 23:38 ` Richard Henderson
  2022-11-14 23:38 ` [PULL 2/2] target/i386: hardcode R_EAX as destination register for LAHF/SAHF Richard Henderson
  2022-11-15 23:54 ` [PULL 0/2] target/i386: fix cmpxchgl, lahf, sahf Stefan Hajnoczi
  2 siblings, 0 replies; 4+ messages in thread
From: Richard Henderson @ 2022-11-14 23:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: stefanha, Paolo Bonzini

From: Paolo Bonzini <pbonzini@redhat.com>

Unlike the memory case, where "the destination operand receives a write
cycle without regard to the result of the comparison", rm must not be
touched altogether if the write fails, including not zero-extending
it on 64-bit processors.  This is not how the movcond currently works,
because it is always followed by a gen_op_mov_reg_v to rm.

To fix it, introduce a new function that is similar to gen_op_mov_reg_v
but writes to a TCG temporary.

Considering that gen_extu(ot, oldv) is not needed in the memory case
either, the two cases for register and memory destinations are different
enough that one might as well fuse the two "if (mod == 3)" into one.
So do that too.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/508
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[rth: Add a test case ]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/i386/tcg/translate.c      | 82 ++++++++++++++++++++++----------
 tests/tcg/x86_64/cmpxchg.c       | 42 ++++++++++++++++
 tests/tcg/x86_64/Makefile.target |  1 +
 3 files changed, 99 insertions(+), 26 deletions(-)
 create mode 100644 tests/tcg/x86_64/cmpxchg.c

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 28a4e6dc1d..dbd6492778 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -439,32 +439,51 @@ static inline MemOp mo_b_d32(int b, MemOp ot)
     return b & 1 ? (ot == MO_16 ? MO_16 : MO_32) : MO_8;
 }
 
-static void gen_op_mov_reg_v(DisasContext *s, MemOp ot, int reg, TCGv t0)
+/* Compute the result of writing t0 to the OT-sized register REG.
+ *
+ * If DEST is NULL, store the result into the register and return the
+ * register's TCGv.
+ *
+ * If DEST is not NULL, store the result into DEST and return the
+ * register's TCGv.
+ */
+static TCGv gen_op_deposit_reg_v(DisasContext *s, MemOp ot, int reg, TCGv dest, TCGv t0)
 {
     switch(ot) {
     case MO_8:
-        if (!byte_reg_is_xH(s, reg)) {
-            tcg_gen_deposit_tl(cpu_regs[reg], cpu_regs[reg], t0, 0, 8);
-        } else {
-            tcg_gen_deposit_tl(cpu_regs[reg - 4], cpu_regs[reg - 4], t0, 8, 8);
+        if (byte_reg_is_xH(s, reg)) {
+            dest = dest ? dest : cpu_regs[reg - 4];
+            tcg_gen_deposit_tl(dest, cpu_regs[reg - 4], t0, 8, 8);
+            return cpu_regs[reg - 4];
         }
+        dest = dest ? dest : cpu_regs[reg];
+        tcg_gen_deposit_tl(dest, cpu_regs[reg], t0, 0, 8);
         break;
     case MO_16:
-        tcg_gen_deposit_tl(cpu_regs[reg], cpu_regs[reg], t0, 0, 16);
+        dest = dest ? dest : cpu_regs[reg];
+        tcg_gen_deposit_tl(dest, cpu_regs[reg], t0, 0, 16);
         break;
     case MO_32:
         /* For x86_64, this sets the higher half of register to zero.
            For i386, this is equivalent to a mov. */
-        tcg_gen_ext32u_tl(cpu_regs[reg], t0);
+        dest = dest ? dest : cpu_regs[reg];
+        tcg_gen_ext32u_tl(dest, t0);
         break;
 #ifdef TARGET_X86_64
     case MO_64:
-        tcg_gen_mov_tl(cpu_regs[reg], t0);
+        dest = dest ? dest : cpu_regs[reg];
+        tcg_gen_mov_tl(dest, t0);
         break;
 #endif
     default:
         tcg_abort();
     }
+    return cpu_regs[reg];
+}
+
+static void gen_op_mov_reg_v(DisasContext *s, MemOp ot, int reg, TCGv t0)
+{
+    gen_op_deposit_reg_v(s, ot, reg, NULL, t0);
 }
 
 static inline
@@ -3747,7 +3766,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
     case 0x1b0:
     case 0x1b1: /* cmpxchg Ev, Gv */
         {
-            TCGv oldv, newv, cmpv;
+            TCGv oldv, newv, cmpv, dest;
 
             ot = mo_b_d(b, dflag);
             modrm = x86_ldub_code(env, s);
@@ -3758,7 +3777,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
             cmpv = tcg_temp_new();
             gen_op_mov_v_reg(s, ot, newv, reg);
             tcg_gen_mov_tl(cmpv, cpu_regs[R_EAX]);
-
+            gen_extu(ot, cmpv);
             if (s->prefix & PREFIX_LOCK) {
                 if (mod == 3) {
                     goto illegal_op;
@@ -3766,32 +3785,43 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
                 gen_lea_modrm(env, s, modrm);
                 tcg_gen_atomic_cmpxchg_tl(oldv, s->A0, cmpv, newv,
                                           s->mem_index, ot | MO_LE);
-                gen_op_mov_reg_v(s, ot, R_EAX, oldv);
             } else {
                 if (mod == 3) {
                     rm = (modrm & 7) | REX_B(s);
                     gen_op_mov_v_reg(s, ot, oldv, rm);
+                    gen_extu(ot, oldv);
+
+                    /*
+                     * Unlike the memory case, where "the destination operand receives
+                     * a write cycle without regard to the result of the comparison",
+                     * rm must not be touched altogether if the write fails, including
+                     * not zero-extending it on 64-bit processors.  So, precompute
+                     * the result of a successful writeback and perform the movcond
+                     * directly on cpu_regs.  Also need to write accumulator first, in
+                     * case rm is part of RAX too.
+                     */
+                    dest = gen_op_deposit_reg_v(s, ot, rm, newv, newv);
+                    tcg_gen_movcond_tl(TCG_COND_EQ, dest, oldv, cmpv, newv, dest);
                 } else {
                     gen_lea_modrm(env, s, modrm);
                     gen_op_ld_v(s, ot, oldv, s->A0);
-                    rm = 0; /* avoid warning */
-                }
-                gen_extu(ot, oldv);
-                gen_extu(ot, cmpv);
-                /* store value = (old == cmp ? new : old);  */
-                tcg_gen_movcond_tl(TCG_COND_EQ, newv, oldv, cmpv, newv, oldv);
-                if (mod == 3) {
-                    gen_op_mov_reg_v(s, ot, R_EAX, oldv);
-                    gen_op_mov_reg_v(s, ot, rm, newv);
-                } else {
-                    /* Perform an unconditional store cycle like physical cpu;
-                       must be before changing accumulator to ensure
-                       idempotency if the store faults and the instruction
-                       is restarted */
+
+                    /*
+                     * Perform an unconditional store cycle like physical cpu;
+                     * must be before changing accumulator to ensure
+                     * idempotency if the store faults and the instruction
+                     * is restarted
+                     */
+                    tcg_gen_movcond_tl(TCG_COND_EQ, newv, oldv, cmpv, newv, oldv);
                     gen_op_st_v(s, ot, newv, s->A0);
-                    gen_op_mov_reg_v(s, ot, R_EAX, oldv);
                 }
             }
+	    /*
+	     * Write EAX only if the cmpxchg fails; reuse newv as the destination,
+	     * since it's dead here.
+	     */
+            dest = gen_op_deposit_reg_v(s, ot, R_EAX, newv, oldv);
+            tcg_gen_movcond_tl(TCG_COND_EQ, dest, oldv, cmpv, dest, newv);
             tcg_gen_mov_tl(cpu_cc_src, oldv);
             tcg_gen_mov_tl(s->cc_srcT, cmpv);
             tcg_gen_sub_tl(cpu_cc_dst, cmpv, oldv);
diff --git a/tests/tcg/x86_64/cmpxchg.c b/tests/tcg/x86_64/cmpxchg.c
new file mode 100644
index 0000000000..5891735161
--- /dev/null
+++ b/tests/tcg/x86_64/cmpxchg.c
@@ -0,0 +1,42 @@
+#include <assert.h>
+
+static int mem;
+
+static unsigned long test_cmpxchgb(unsigned long orig)
+{
+  unsigned long ret;
+  mem = orig;
+  asm("cmpxchgb %b[cmp],%[mem]"
+      : [ mem ] "+m"(mem), [ rax ] "=a"(ret)
+      : [ cmp ] "r"(0x77), "a"(orig));
+  return ret;
+}
+
+static unsigned long test_cmpxchgw(unsigned long orig)
+{
+  unsigned long ret;
+  mem = orig;
+  asm("cmpxchgw %w[cmp],%[mem]"
+      : [ mem ] "+m"(mem), [ rax ] "=a"(ret)
+      : [ cmp ] "r"(0x7777), "a"(orig));
+  return ret;
+}
+
+static unsigned long test_cmpxchgl(unsigned long orig)
+{
+  unsigned long ret;
+  mem = orig;
+  asm("cmpxchgl %[cmp],%[mem]"
+      : [ mem ] "+m"(mem), [ rax ] "=a"(ret)
+      : [ cmp ] "r"(0x77777777u), "a"(orig));
+  return ret;
+}
+
+int main()
+{
+  unsigned long test = 0xdeadbeef12345678ull;
+  assert(test == test_cmpxchgb(test));
+  assert(test == test_cmpxchgw(test));
+  assert(test == test_cmpxchgl(test));
+  return 0;
+}
diff --git a/tests/tcg/x86_64/Makefile.target b/tests/tcg/x86_64/Makefile.target
index 6895db1c43..4eac78293f 100644
--- a/tests/tcg/x86_64/Makefile.target
+++ b/tests/tcg/x86_64/Makefile.target
@@ -11,6 +11,7 @@ include $(SRC_PATH)/tests/tcg/i386/Makefile.target
 ifeq ($(filter %-linux-user, $(TARGET)),$(TARGET))
 X86_64_TESTS += vsyscall
 X86_64_TESTS += noexec
+X86_64_TESTS += cmpxchg
 TESTS=$(MULTIARCH_TESTS) $(X86_64_TESTS) test-x86_64
 else
 TESTS=$(MULTIARCH_TESTS)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PULL 2/2] target/i386: hardcode R_EAX as destination register for LAHF/SAHF
  2022-11-14 23:38 [PULL 0/2] target/i386: fix cmpxchgl, lahf, sahf Richard Henderson
  2022-11-14 23:38 ` [PULL 1/2] target/i386: fix cmpxchg with 32-bit register destination Richard Henderson
@ 2022-11-14 23:38 ` Richard Henderson
  2022-11-15 23:54 ` [PULL 0/2] target/i386: fix cmpxchgl, lahf, sahf Stefan Hajnoczi
  2 siblings, 0 replies; 4+ messages in thread
From: Richard Henderson @ 2022-11-14 23:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: stefanha, Paolo Bonzini

From: Paolo Bonzini <pbonzini@redhat.com>

When translating code that is using LAHF and SAHF in combination with the
REX prefix, the instructions should not use any other register than AH;
however, QEMU selects SPL (SP being register 4, just like AH) if the
REX prefix is present.  To fix this, use deposit directly without
going through gen_op_mov_v_reg and gen_op_mov_reg_v.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/130
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/i386/tcg/translate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index dbd6492778..7e0b2a709a 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -5230,7 +5230,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
     case 0x9e: /* sahf */
         if (CODE64(s) && !(s->cpuid_ext3_features & CPUID_EXT3_LAHF_LM))
             goto illegal_op;
-        gen_op_mov_v_reg(s, MO_8, s->T0, R_AH);
+        tcg_gen_shri_tl(s->T0, cpu_regs[R_EAX], 8);
         gen_compute_eflags(s);
         tcg_gen_andi_tl(cpu_cc_src, cpu_cc_src, CC_O);
         tcg_gen_andi_tl(s->T0, s->T0, CC_S | CC_Z | CC_A | CC_P | CC_C);
@@ -5242,7 +5242,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
         gen_compute_eflags(s);
         /* Note: gen_compute_eflags() only gives the condition codes */
         tcg_gen_ori_tl(s->T0, cpu_cc_src, 0x02);
-        gen_op_mov_reg_v(s, MO_8, R_AH, s->T0);
+        tcg_gen_deposit_tl(cpu_regs[R_EAX], cpu_regs[R_EAX], s->T0, 8, 8);
         break;
     case 0xf5: /* cmc */
         gen_compute_eflags(s);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PULL 0/2] target/i386: fix cmpxchgl, lahf, sahf
  2022-11-14 23:38 [PULL 0/2] target/i386: fix cmpxchgl, lahf, sahf Richard Henderson
  2022-11-14 23:38 ` [PULL 1/2] target/i386: fix cmpxchg with 32-bit register destination Richard Henderson
  2022-11-14 23:38 ` [PULL 2/2] target/i386: hardcode R_EAX as destination register for LAHF/SAHF Richard Henderson
@ 2022-11-15 23:54 ` Stefan Hajnoczi
  2 siblings, 0 replies; 4+ messages in thread
From: Stefan Hajnoczi @ 2022-11-15 23:54 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, stefanha

[-- Attachment #1: Type: text/plain, Size: 115 bytes --]

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/7.2 for any user-visible changes.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-11-15 23:55 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-14 23:38 [PULL 0/2] target/i386: fix cmpxchgl, lahf, sahf Richard Henderson
2022-11-14 23:38 ` [PULL 1/2] target/i386: fix cmpxchg with 32-bit register destination Richard Henderson
2022-11-14 23:38 ` [PULL 2/2] target/i386: hardcode R_EAX as destination register for LAHF/SAHF Richard Henderson
2022-11-15 23:54 ` [PULL 0/2] target/i386: fix cmpxchgl, lahf, sahf Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.