All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling
@ 2018-12-03 16:08 Richard Henderson
  2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 1/5] tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct Richard Henderson
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Richard Henderson @ 2018-12-03 16:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: cota, alex.bennee

This tidies guest_base handling such that (1) we require no scratch
registers, (2) we require no extra instructions besides the memory op,
and (3) we reduce the size of the memory op by omitting a prefix.

In principal point 3 is offset by adding additional opcodes to handle
zero-extension when converting 64-bit guest values back to 32-bit guest
addresses.  But those turn out to be hen's teeth, since 32-bit guests
often have no way of even producing 64-bit guest values.

In particular, I saw none in a simple pass through linux-user-test-0.3
for i386, arm, sh4, sparc.


r~


Richard Henderson (5):
  tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct
  tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests
  tcg/i386: Assume 32-bit values are zero-extended
  tcg/i386: Precompute all guest_base parameters
  tcg/i386: Add setup_guest_base_seg for FreeBSD

 tcg/i386/tcg-target.h     |   5 +-
 tcg/i386/tcg-target.inc.c | 188 ++++++++++++++++----------------------
 2 files changed, 83 insertions(+), 110 deletions(-)

-- 
2.17.2

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH for-4.0 1/5] tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct
  2018-12-03 16:08 [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling Richard Henderson
@ 2018-12-03 16:08 ` Richard Henderson
  2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 2/5] tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests Richard Henderson
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2018-12-03 16:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: cota, alex.bennee

This helps preserve the invariant that all TCG_TYPE_I32 values
are stored zero-extended in the 64-bit host registers.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.inc.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index 28192f4608..6bf4f84b20 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -1883,10 +1883,11 @@ static inline void setup_guest_base_seg(void) { }
 
 static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
                                    TCGReg base, int index, intptr_t ofs,
-                                   int seg, TCGMemOp memop)
+                                   int seg, bool is64, TCGMemOp memop)
 {
     const TCGMemOp real_bswap = memop & MO_BSWAP;
     TCGMemOp bswap = real_bswap;
+    int rexw = is64 * P_REXW;
     int movop = OPC_MOVL_GvEv;
 
     if (have_movbe && real_bswap) {
@@ -1900,7 +1901,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
                                  base, index, 0, ofs);
         break;
     case MO_SB:
-        tcg_out_modrm_sib_offset(s, OPC_MOVSBL + P_REXW + seg, datalo,
+        tcg_out_modrm_sib_offset(s, OPC_MOVSBL + rexw + seg, datalo,
                                  base, index, 0, ofs);
         break;
     case MO_UW:
@@ -1920,9 +1921,9 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
                                          base, index, 0, ofs);
                 tcg_out_rolw_8(s, datalo);
             }
-            tcg_out_modrm(s, OPC_MOVSWL + P_REXW, datalo, datalo);
+            tcg_out_modrm(s, OPC_MOVSWL + rexw, datalo, datalo);
         } else {
-            tcg_out_modrm_sib_offset(s, OPC_MOVSWL + P_REXW + seg,
+            tcg_out_modrm_sib_offset(s, OPC_MOVSWL + rexw + seg,
                                      datalo, base, index, 0, ofs);
         }
         break;
@@ -2010,7 +2011,7 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
                      label_ptr, offsetof(CPUTLBEntry, addr_read));
 
     /* TLB Hit.  */
-    tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, opc);
+    tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, is64, opc);
 
     /* Record the current context of a load into ldst label */
     add_qemu_ldst_label(s, true, oi, datalo, datahi, addrlo, addrhi,
@@ -2045,7 +2046,7 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
         }
 
         tcg_out_qemu_ld_direct(s, datalo, datahi,
-                               base, index, offset, seg, opc);
+                               base, index, offset, seg, is64, opc);
     }
 #endif
 }
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH for-4.0 2/5] tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests
  2018-12-03 16:08 [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling Richard Henderson
  2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 1/5] tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct Richard Henderson
@ 2018-12-03 16:08 ` Richard Henderson
  2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 3/5] tcg/i386: Assume 32-bit values are zero-extended Richard Henderson
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2018-12-03 16:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: cota, alex.bennee

This preserves the invariant that all TCG_TYPE_I32 values are
zero-extended in the 64-bit host register.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.h     | 5 +++--
 tcg/i386/tcg-target.inc.c | 6 ++++++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 2441658865..c523d5f5e1 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -135,8 +135,9 @@ extern bool have_avx2;
 #define TCG_TARGET_HAS_direct_jump      1
 
 #if TCG_TARGET_REG_BITS == 64
-#define TCG_TARGET_HAS_extrl_i64_i32    0
-#define TCG_TARGET_HAS_extrh_i64_i32    0
+/* Keep target addresses zero-extended in a register.  */
+#define TCG_TARGET_HAS_extrl_i64_i32    (TARGET_LONG_BITS == 32)
+#define TCG_TARGET_HAS_extrh_i64_i32    (TARGET_LONG_BITS == 32)
 #define TCG_TARGET_HAS_div2_i64         1
 #define TCG_TARGET_HAS_rot_i64          1
 #define TCG_TARGET_HAS_ext8s_i64        1
diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index 6bf4f84b20..ab31dfa66d 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -2546,12 +2546,16 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
     case INDEX_op_extu_i32_i64:
     case INDEX_op_ext32u_i64:
+    case INDEX_op_extrl_i64_i32:
         tcg_out_ext32u(s, a0, a1);
         break;
     case INDEX_op_ext_i32_i64:
     case INDEX_op_ext32s_i64:
         tcg_out_ext32s(s, a0, a1);
         break;
+    case INDEX_op_extrh_i64_i32:
+        tcg_out_shifti(s, SHIFT_SHR + P_REXW, a0, 32);
+        break;
 #endif
 
     OP_32_64(deposit):
@@ -2915,6 +2919,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op)
     case INDEX_op_neg_i64:
     case INDEX_op_not_i32:
     case INDEX_op_not_i64:
+    case INDEX_op_extrh_i64_i32:
         return &r_0;
 
     case INDEX_op_ext8s_i32:
@@ -2930,6 +2935,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op)
     case INDEX_op_ext32u_i64:
     case INDEX_op_ext_i32_i64:
     case INDEX_op_extu_i32_i64:
+    case INDEX_op_extrl_i64_i32:
     case INDEX_op_extract_i32:
     case INDEX_op_extract_i64:
     case INDEX_op_sextract_i32:
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH for-4.0 3/5] tcg/i386: Assume 32-bit values are zero-extended
  2018-12-03 16:08 [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling Richard Henderson
  2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 1/5] tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct Richard Henderson
  2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 2/5] tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests Richard Henderson
@ 2018-12-03 16:08 ` Richard Henderson
  2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 4/5] tcg/i386: Precompute all guest_base parameters Richard Henderson
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2018-12-03 16:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: cota, alex.bennee

We now have an invariant that all TCG_TYPE_I32 values are
zero-extended, which means that we do not need to extend
them again during qemu_ld/st, either explicitly via a separate
tcg_out_ext32u or implicitly via P_ADDR32.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.inc.c | 103 +++++++++++++++-----------------------
 1 file changed, 40 insertions(+), 63 deletions(-)

diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index ab31dfa66d..853c3c8465 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -309,13 +309,11 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type,
 #define P_EXT38         0x200           /* 0x0f 0x38 opcode prefix */
 #define P_DATA16        0x400           /* 0x66 opcode prefix */
 #if TCG_TARGET_REG_BITS == 64
-# define P_ADDR32       0x800           /* 0x67 opcode prefix */
 # define P_REXW         0x1000          /* Set REX.W = 1 */
 # define P_REXB_R       0x2000          /* REG field as byte register */
 # define P_REXB_RM      0x4000          /* R/M field as byte register */
 # define P_GS           0x8000          /* gs segment override */
 #else
-# define P_ADDR32	0
 # define P_REXW		0
 # define P_REXB_R	0
 # define P_REXB_RM	0
@@ -528,9 +526,6 @@ static void tcg_out_opc(TCGContext *s, int opc, int r, int rm, int x)
         tcg_debug_assert((opc & P_REXW) == 0);
         tcg_out8(s, 0x66);
     }
-    if (opc & P_ADDR32) {
-        tcg_out8(s, 0x67);
-    }
     if (opc & P_SIMDF3) {
         tcg_out8(s, 0xf3);
     } else if (opc & P_SIMDF2) {
@@ -1659,11 +1654,7 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
     tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw, r1, r0, 0);
 
     /* Prepare for both the fast path add of the tlb addend, and the slow
-       path function argument setup.  There are two cases worth note:
-       For 32-bit guest and x86_64 host, MOVL zero-extends the guest address
-       before the fastpath ADDQ below.  For 64-bit guest and x32 host, MOVQ
-       copies the entire guest address for the slow path, while truncation
-       for the 32-bit host happens with the fastpath ADDL below.  */
+       path function argument setup.  */
     tcg_out_mov(s, ttype, r1, addrlo);
 
     /* jne slow_path */
@@ -2019,41 +2010,31 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
 #else
     {
         int32_t offset = guest_base;
-        TCGReg base = addrlo;
         int index = -1;
         int seg = 0;
 
-        /* For a 32-bit guest, the high 32 bits may contain garbage.
-           We can do this with the ADDR32 prefix if we're not using
-           a guest base, or when using segmentation.  Otherwise we
-           need to zero-extend manually.  */
+        /*
+         * Recall we store 32-bit values zero-extended.  No need for
+         * further manual extension or an addr32 (0x67) prefix.
+         */
         if (guest_base == 0 || guest_base_flags) {
             seg = guest_base_flags;
             offset = 0;
-            if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
-                seg |= P_ADDR32;
-            }
-        } else if (TCG_TARGET_REG_BITS == 64) {
-            if (TARGET_LONG_BITS == 32) {
-                tcg_out_ext32u(s, TCG_REG_L0, base);
-                base = TCG_REG_L0;
-            }
-            if (offset != guest_base) {
-                tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_L1, guest_base);
-                index = TCG_REG_L1;
-                offset = 0;
-            }
+        } else if (TCG_TARGET_REG_BITS == 64 && offset != guest_base) {
+            tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_L1, guest_base);
+            index = TCG_REG_L1;
+            offset = 0;
         }
 
         tcg_out_qemu_ld_direct(s, datalo, datahi,
-                               base, index, offset, seg, is64, opc);
+                               addrlo, index, offset, seg, is64, opc);
     }
 #endif
 }
 
 static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
-                                   TCGReg base, intptr_t ofs, int seg,
-                                   TCGMemOp memop)
+                                   TCGReg base, int index, intptr_t ofs,
+                                   int seg, TCGMemOp memop)
 {
     /* ??? Ideally we wouldn't need a scratch register.  For user-only,
        we could perform the bswap twice to restore the original value
@@ -2077,8 +2058,8 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
             tcg_out_mov(s, TCG_TYPE_I32, scratch, datalo);
             datalo = scratch;
         }
-        tcg_out_modrm_offset(s, OPC_MOVB_EvGv + P_REXB_R + seg,
-                             datalo, base, ofs);
+        tcg_out_modrm_sib_offset(s, OPC_MOVB_EvGv + P_REXB_R + seg,
+                                 datalo, base, index, 0, ofs);
         break;
     case MO_16:
         if (bswap) {
@@ -2086,7 +2067,8 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
             tcg_out_rolw_8(s, scratch);
             datalo = scratch;
         }
-        tcg_out_modrm_offset(s, movop + P_DATA16 + seg, datalo, base, ofs);
+        tcg_out_modrm_sib_offset(s, movop + P_DATA16 + seg, datalo,
+                                 base, index, 0, ofs);
         break;
     case MO_32:
         if (bswap) {
@@ -2094,7 +2076,7 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
             tcg_out_bswap32(s, scratch);
             datalo = scratch;
         }
-        tcg_out_modrm_offset(s, movop + seg, datalo, base, ofs);
+        tcg_out_modrm_sib_offset(s, movop + seg, datalo, base, index, 0, ofs);
         break;
     case MO_64:
         if (TCG_TARGET_REG_BITS == 64) {
@@ -2103,22 +2085,27 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
                 tcg_out_bswap64(s, scratch);
                 datalo = scratch;
             }
-            tcg_out_modrm_offset(s, movop + P_REXW + seg, datalo, base, ofs);
+            tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo,
+                                     base, index, 0, ofs);
         } else if (bswap) {
             tcg_out_mov(s, TCG_TYPE_I32, scratch, datahi);
             tcg_out_bswap32(s, scratch);
-            tcg_out_modrm_offset(s, OPC_MOVL_EvGv + seg, scratch, base, ofs);
+            tcg_out_modrm_sib_offset(s, OPC_MOVL_EvGv + seg, scratch,
+                                     base, index, 0, ofs);
             tcg_out_mov(s, TCG_TYPE_I32, scratch, datalo);
             tcg_out_bswap32(s, scratch);
-            tcg_out_modrm_offset(s, OPC_MOVL_EvGv + seg, scratch, base, ofs+4);
+            tcg_out_modrm_sib_offset(s, OPC_MOVL_EvGv + seg, scratch,
+                                     base, index, 0, ofs + 4);
         } else {
             if (real_bswap) {
                 int t = datalo;
                 datalo = datahi;
                 datahi = t;
             }
-            tcg_out_modrm_offset(s, movop + seg, datalo, base, ofs);
-            tcg_out_modrm_offset(s, movop + seg, datahi, base, ofs+4);
+            tcg_out_modrm_sib_offset(s, movop + seg, datalo,
+                                     base, index, 0, ofs);
+            tcg_out_modrm_sib_offset(s, movop + seg, datahi,
+                                     base, index, 0, ofs + 4);
         }
         break;
     default:
@@ -2151,7 +2138,7 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
                      label_ptr, offsetof(CPUTLBEntry, addr_write));
 
     /* TLB Hit.  */
-    tcg_out_qemu_st_direct(s, datalo, datahi, TCG_REG_L1, 0, 0, opc);
+    tcg_out_qemu_st_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, opc);
 
     /* Record the current context of a store into ldst label */
     add_qemu_ldst_label(s, false, oi, datalo, datahi, addrlo, addrhi,
@@ -2159,35 +2146,25 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
 #else
     {
         int32_t offset = guest_base;
-        TCGReg base = addrlo;
+        int index = -1;
         int seg = 0;
 
-        /* See comment in tcg_out_qemu_ld re zero-extension of addrlo.  */
+        /*
+         * Recall we store 32-bit values zero-extended.  No need for
+         * further manual extension or an addr32 (0x67) prefix.
+         */
         if (guest_base == 0 || guest_base_flags) {
             seg = guest_base_flags;
             offset = 0;
-            if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
-                seg |= P_ADDR32;
-            }
-        } else if (TCG_TARGET_REG_BITS == 64) {
-            /* ??? Note that we can't use the same SIB addressing scheme
-               as for loads, since we require L0 free for bswap.  */
-            if (offset != guest_base) {
-                if (TARGET_LONG_BITS == 32) {
-                    tcg_out_ext32u(s, TCG_REG_L0, base);
-                    base = TCG_REG_L0;
-                }
-                tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_L1, guest_base);
-                tgen_arithr(s, ARITH_ADD + P_REXW, TCG_REG_L1, base);
-                base = TCG_REG_L1;
-                offset = 0;
-            } else if (TARGET_LONG_BITS == 32) {
-                tcg_out_ext32u(s, TCG_REG_L1, base);
-                base = TCG_REG_L1;
-            }
+        } else if (TCG_TARGET_REG_BITS == 64 && offset != guest_base) {
+            /* ??? Note that we require L0 free for bswap.  */
+            tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_L1, guest_base);
+            index = TCG_REG_L1;
+            offset = 0;
         }
 
-        tcg_out_qemu_st_direct(s, datalo, datahi, base, offset, seg, opc);
+        tcg_out_qemu_st_direct(s, datalo, datahi,
+                               addrlo, index, offset, seg, opc);
     }
 #endif
 }
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH for-4.0 4/5] tcg/i386: Precompute all guest_base parameters
  2018-12-03 16:08 [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling Richard Henderson
                   ` (2 preceding siblings ...)
  2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 3/5] tcg/i386: Assume 32-bit values are zero-extended Richard Henderson
@ 2018-12-03 16:08 ` Richard Henderson
  2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 5/5] tcg/i386: Add setup_guest_base_seg for FreeBSD Richard Henderson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2018-12-03 16:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: cota, alex.bennee

These values are constant between all qemu_ld/st invocations;
there is no need to figure this out each time.  If we cannot
use a segment or an offset directly for guest_base, load the
value into a register in the prologue.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.inc.c | 101 +++++++++++++++-----------------------
 1 file changed, 40 insertions(+), 61 deletions(-)

diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index 853c3c8465..b8d2dd5ba3 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -1854,22 +1854,31 @@ static void tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
     tcg_out_push(s, retaddr);
     tcg_out_jmp(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
 }
-#elif defined(__x86_64__) && defined(__linux__)
-# include <asm/prctl.h>
-# include <sys/prctl.h>
-
+#elif TCG_TARGET_REG_BITS == 32
+# define x86_guest_base_seg     0
+# define x86_guest_base_index   -1
+# define x86_guest_base_offset  guest_base
+#else
+static int x86_guest_base_seg;
+static int x86_guest_base_index = -1;
+static int32_t x86_guest_base_offset;
+# if defined(__x86_64__) && defined(__linux__)
+#  include <asm/prctl.h>
+#  include <sys/prctl.h>
 int arch_prctl(int code, unsigned long addr);
-
-static int guest_base_flags;
-static inline void setup_guest_base_seg(void)
+static inline int setup_guest_base_seg(void)
 {
     if (arch_prctl(ARCH_SET_GS, guest_base) == 0) {
-        guest_base_flags = P_GS;
+        return P_GS;
     }
+    return 0;
 }
-#else
-# define guest_base_flags 0
-static inline void setup_guest_base_seg(void) { }
+# else
+static inline int setup_guest_base_seg(void)
+{
+    return 0;
+}
+# endif
 #endif /* SOFTMMU */
 
 static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
@@ -2008,27 +2017,9 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
     add_qemu_ldst_label(s, true, oi, datalo, datahi, addrlo, addrhi,
                         s->code_ptr, label_ptr);
 #else
-    {
-        int32_t offset = guest_base;
-        int index = -1;
-        int seg = 0;
-
-        /*
-         * Recall we store 32-bit values zero-extended.  No need for
-         * further manual extension or an addr32 (0x67) prefix.
-         */
-        if (guest_base == 0 || guest_base_flags) {
-            seg = guest_base_flags;
-            offset = 0;
-        } else if (TCG_TARGET_REG_BITS == 64 && offset != guest_base) {
-            tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_L1, guest_base);
-            index = TCG_REG_L1;
-            offset = 0;
-        }
-
-        tcg_out_qemu_ld_direct(s, datalo, datahi,
-                               addrlo, index, offset, seg, is64, opc);
-    }
+    tcg_out_qemu_ld_direct(s, datalo, datahi, addrlo, x86_guest_base_index,
+                           x86_guest_base_offset, x86_guest_base_seg,
+                           is64, opc);
 #endif
 }
 
@@ -2144,28 +2135,8 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
     add_qemu_ldst_label(s, false, oi, datalo, datahi, addrlo, addrhi,
                         s->code_ptr, label_ptr);
 #else
-    {
-        int32_t offset = guest_base;
-        int index = -1;
-        int seg = 0;
-
-        /*
-         * Recall we store 32-bit values zero-extended.  No need for
-         * further manual extension or an addr32 (0x67) prefix.
-         */
-        if (guest_base == 0 || guest_base_flags) {
-            seg = guest_base_flags;
-            offset = 0;
-        } else if (TCG_TARGET_REG_BITS == 64 && offset != guest_base) {
-            /* ??? Note that we require L0 free for bswap.  */
-            tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_L1, guest_base);
-            index = TCG_REG_L1;
-            offset = 0;
-        }
-
-        tcg_out_qemu_st_direct(s, datalo, datahi,
-                               addrlo, index, offset, seg, opc);
-    }
+    tcg_out_qemu_st_direct(s, datalo, datahi, addrlo, x86_guest_base_index,
+                           x86_guest_base_offset, x86_guest_base_seg, opc);
 #endif
 }
 
@@ -3412,6 +3383,21 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 		         (ARRAY_SIZE(tcg_target_callee_save_regs) + 2) * 4
 			 + stack_addend);
 #else
+# ifndef CONFIG_SOFTMMU
+    if (guest_base) {
+        int seg = setup_guest_base_seg();
+        if (seg != 0) {
+            x86_guest_base_seg = seg;
+        } else if (guest_base == (int32_t)guest_base) {
+            x86_guest_base_offset = guest_base;
+        } else {
+            /* Choose R12 because, as a base, it requires a SIB byte. */
+            x86_guest_base_index = TCG_REG_R12;
+            tcg_out_mov(s, TCG_TYPE_PTR, x86_guest_base_index, guest_base);
+            tcg_regset_set_reg(s->reserved_regs, x86_guest_base_index);
+        }
+    }
+# endif
     tcg_out_mov(s, TCG_TYPE_PTR, TCG_AREG0, tcg_target_call_iarg_regs[0]);
     tcg_out_addi(s, TCG_REG_ESP, -stack_addend);
     /* jmp *tb.  */
@@ -3437,13 +3423,6 @@ static void tcg_target_qemu_prologue(TCGContext *s)
         tcg_out_pop(s, tcg_target_callee_save_regs[i]);
     }
     tcg_out_opc(s, OPC_RET, 0, 0, 0);
-
-#if !defined(CONFIG_SOFTMMU)
-    /* Try to set up a segment register to point to guest_base.  */
-    if (guest_base) {
-        setup_guest_base_seg();
-    }
-#endif
 }
 
 static void tcg_out_nop_fill(tcg_insn_unit *p, int count)
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH for-4.0 5/5] tcg/i386: Add setup_guest_base_seg for FreeBSD
  2018-12-03 16:08 [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling Richard Henderson
                   ` (3 preceding siblings ...)
  2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 4/5] tcg/i386: Precompute all guest_base parameters Richard Henderson
@ 2018-12-03 16:08 ` Richard Henderson
  2018-12-03 17:01   ` Kamil Rytarowski
  2018-12-03 22:28 ` [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling no-reply
  2018-12-10 21:53 ` Emilio G. Cota
  6 siblings, 1 reply; 10+ messages in thread
From: Richard Henderson @ 2018-12-03 16:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: cota, alex.bennee

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.inc.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index b8d2dd5ba3..3a39b51685 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -1873,6 +1873,15 @@ static inline int setup_guest_base_seg(void)
     }
     return 0;
 }
+# elif defined (__FreeBSD__) || defined (__FreeBSD_kernel__)
+#  include <machine/sysarch.h>
+static inline int setup_guest_base_seg(void)
+{
+    if (sysarch(AMD64_SET_GSBASE, &guest_base) == 0) {
+        return P_GS;
+    }
+    return 0;
+}
 # else
 static inline int setup_guest_base_seg(void)
 {
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 5/5] tcg/i386: Add setup_guest_base_seg for FreeBSD
  2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 5/5] tcg/i386: Add setup_guest_base_seg for FreeBSD Richard Henderson
@ 2018-12-03 17:01   ` Kamil Rytarowski
  2018-12-03 18:35     ` Richard Henderson
  0 siblings, 1 reply; 10+ messages in thread
From: Kamil Rytarowski @ 2018-12-03 17:01 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: cota, alex.bennee

[-- Attachment #1: Type: text/plain, Size: 939 bytes --]

On 03.12.2018 17:08, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/i386/tcg-target.inc.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
> index b8d2dd5ba3..3a39b51685 100644
> --- a/tcg/i386/tcg-target.inc.c
> +++ b/tcg/i386/tcg-target.inc.c
> @@ -1873,6 +1873,15 @@ static inline int setup_guest_base_seg(void)
>      }
>      return 0;
>  }
> +# elif defined (__FreeBSD__) || defined (__FreeBSD_kernel__)
> +#  include <machine/sysarch.h>
> +static inline int setup_guest_base_seg(void)
> +{
> +    if (sysarch(AMD64_SET_GSBASE, &guest_base) == 0) {
> +        return P_GS;
> +    }
> +    return 0;
> +}
>  # else
>  static inline int setup_guest_base_seg(void)
>  {
> 

There is also X86_SET_GSBASE in <include/sysarch.h> in NetBSD. Do we
need to set it for this OS too?


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 850 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 5/5] tcg/i386: Add setup_guest_base_seg for FreeBSD
  2018-12-03 17:01   ` Kamil Rytarowski
@ 2018-12-03 18:35     ` Richard Henderson
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2018-12-03 18:35 UTC (permalink / raw)
  To: Kamil Rytarowski, qemu-devel; +Cc: cota, alex.bennee

On 12/3/18 11:01 AM, Kamil Rytarowski wrote:
> On 03.12.2018 17:08, Richard Henderson wrote:
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>  tcg/i386/tcg-target.inc.c | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
>> index b8d2dd5ba3..3a39b51685 100644
>> --- a/tcg/i386/tcg-target.inc.c
>> +++ b/tcg/i386/tcg-target.inc.c
>> @@ -1873,6 +1873,15 @@ static inline int setup_guest_base_seg(void)
>>      }
>>      return 0;
>>  }
>> +# elif defined (__FreeBSD__) || defined (__FreeBSD_kernel__)
>> +#  include <machine/sysarch.h>
>> +static inline int setup_guest_base_seg(void)
>> +{
>> +    if (sysarch(AMD64_SET_GSBASE, &guest_base) == 0) {
>> +        return P_GS;
>> +    }
>> +    return 0;
>> +}
>>  # else
>>  static inline int setup_guest_base_seg(void)
>>  {
>>
> 
> There is also X86_SET_GSBASE in <include/sysarch.h> in NetBSD. Do we
> need to set it for this OS too?

You will want to do so, yes.  In the meantime NetBSD should not break; it is
only an optimization.


r~

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling
  2018-12-03 16:08 [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling Richard Henderson
                   ` (4 preceding siblings ...)
  2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 5/5] tcg/i386: Add setup_guest_base_seg for FreeBSD Richard Henderson
@ 2018-12-03 22:28 ` no-reply
  2018-12-10 21:53 ` Emilio G. Cota
  6 siblings, 0 replies; 10+ messages in thread
From: no-reply @ 2018-12-03 22:28 UTC (permalink / raw)
  To: richard.henderson; +Cc: famz, qemu-devel, cota, alex.bennee

Patchew URL: https://patchew.org/QEMU/20181203160840.15115-1-richard.henderson@linaro.org/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling
Message-id: 20181203160840.15115-1-richard.henderson@linaro.org
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
76861b4 tcg/i386: Add setup_guest_base_seg for FreeBSD
94f1510 tcg/i386: Precompute all guest_base parameters
979a645 tcg/i386: Assume 32-bit values are zero-extended
ec46a17 tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests
7eab43f tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct

=== OUTPUT BEGIN ===
Checking PATCH 1/5: tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct...
Checking PATCH 2/5: tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests...
Checking PATCH 3/5: tcg/i386: Assume 32-bit values are zero-extended...
Checking PATCH 4/5: tcg/i386: Precompute all guest_base parameters...
WARNING: architecture specific defines should be avoided
#34: FILE: tcg/i386/tcg-target.inc.c:1864:
+# if defined(__x86_64__) && defined(__linux__)

total: 0 errors, 1 warnings, 136 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 5/5: tcg/i386: Add setup_guest_base_seg for FreeBSD...
ERROR: space prohibited between function name and open parenthesis '('
#17: FILE: tcg/i386/tcg-target.inc.c:1875:
+# elif defined (__FreeBSD__) || defined (__FreeBSD_kernel__)

ERROR: space prohibited between function name and open parenthesis '('
#17: FILE: tcg/i386/tcg-target.inc.c:1875:
+# elif defined (__FreeBSD__) || defined (__FreeBSD_kernel__)

total: 2 errors, 0 warnings, 15 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20181203160840.15115-1-richard.henderson@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling
  2018-12-03 16:08 [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling Richard Henderson
                   ` (5 preceding siblings ...)
  2018-12-03 22:28 ` [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling no-reply
@ 2018-12-10 21:53 ` Emilio G. Cota
  6 siblings, 0 replies; 10+ messages in thread
From: Emilio G. Cota @ 2018-12-10 21:53 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, alex.bennee

On Mon, Dec 03, 2018 at 10:08:35 -0600, Richard Henderson wrote:
> This tidies guest_base handling such that (1) we require no scratch
> registers, (2) we require no extra instructions besides the memory op,
> and (3) we reduce the size of the memory op by omitting a prefix.
> 
> In principal point 3 is offset by adding additional opcodes to handle
> zero-extension when converting 64-bit guest values back to 32-bit guest
> addresses.  But those turn out to be hen's teeth, since 32-bit guests
> often have no way of even producing 64-bit guest values.
> 
> In particular, I saw none in a simple pass through linux-user-test-0.3
> for i386, arm, sh4, sparc.

Reviewed-by: Emilio G. Cota <cota@braap.org>

for the series.

Thanks,

		Emilio

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-12-10 21:53 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-03 16:08 [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling Richard Henderson
2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 1/5] tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct Richard Henderson
2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 2/5] tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests Richard Henderson
2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 3/5] tcg/i386: Assume 32-bit values are zero-extended Richard Henderson
2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 4/5] tcg/i386: Precompute all guest_base parameters Richard Henderson
2018-12-03 16:08 ` [Qemu-devel] [PATCH for-4.0 5/5] tcg/i386: Add setup_guest_base_seg for FreeBSD Richard Henderson
2018-12-03 17:01   ` Kamil Rytarowski
2018-12-03 18:35     ` Richard Henderson
2018-12-03 22:28 ` [Qemu-devel] [PATCH for-4.0 0/5] tcg/i386: Improve guest_base handling no-reply
2018-12-10 21:53 ` Emilio G. Cota

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.