* [Qemu-devel] [PATCH 00/15] tcg field extract primitives
@ 2016-10-16  3:37 Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 01/15] tcg: Add field extraction primitives Richard Henderson
                   ` (15 more replies)
  0 siblings, 16 replies; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC
  To: qemu-devel

I was fooling around with a new target last weekend, and got myself
all turned around with its field extract instruction.  How much more
handy would it be, I told myself, if we had such a thing generically?

In addition, many hosts have this natively.  So it seems a shame to
not take advantage of it when we can.
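
For hosts without a native instruction, the generic expander in patch 01
falls back to a pair of shifts.  A minimal sketch of that identity in plain
C, outside of TCG; the helper name shift_pair_extract32 is made up for
illustration and is not part of the series:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a QEMU function: unsigned extraction of the
   field occupying bits [ofs, ofs + len) using only two shifts.  */
static uint32_t shift_pair_extract32(uint32_t x, unsigned ofs, unsigned len)
{
    /* Same preconditions as the asserts in tcg_gen_extract_i32.  */
    assert(len > 0 && len <= 32 && ofs < 32 && ofs + len <= 32);

    /* Move the field to the top of the word, discarding higher bits...  */
    uint32_t top = x << (32 - len - ofs);
    /* ...then bring it back down, zero-filling from the left.  */
    return top >> (32 - len);
}
```

Both shift counts stay in the range 0..31 under the stated preconditions,
so the sketch avoids shifting by the full word width.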

Lightly tested on x86_64, ppc64le, arm32, and s390x hosts.


r~


Richard Henderson (15):
  tcg: Add field extraction primitives
  tcg: Minor adjustments to deposit expanders
  tcg/aarch64: Implement field extraction opcodes
  tcg/arm: Move isa detection to tcg-target.h
  tcg/arm: Implement field extraction opcodes
  tcg/i386: Implement field extraction opcodes
  tcg/mips: Implement field extraction opcodes
  tcg/ppc: Implement field extraction opcodes
  tcg/s390: Implement field extraction opcodes
  target-alpha: Use deposit and extract ops
  target-arm: Use tcg_gen_*extract
  target-i386: Use tcg_gen_extract_tl
  target-mips: Use tcg_gen_extract_*
  target-ppc: Use tcg_gen_extract_*
  target-s390: Use tcg_gen_extract_i64

 target-alpha/translate.c     |  67 +++++----
 target-arm/translate-a64.c   |  48 ++++---
 target-arm/translate.c       |  37 ++---
 target-i386/translate.c      |  45 +++---
 target-mips/translate.c      |  12 +-
 target-ppc/translate.c       |   9 +-
 target-s390x/translate.c     |  24 ++--
 tcg/aarch64/tcg-target.h     |   4 +
 tcg/aarch64/tcg-target.inc.c |  14 ++
 tcg/arm/tcg-target.h         |  38 +++++-
 tcg/arm/tcg-target.inc.c     |  63 ++++-----
 tcg/i386/tcg-target.h        |   7 +
 tcg/i386/tcg-target.inc.c    |  30 ++++
 tcg/ia64/tcg-target.h        |   4 +
 tcg/mips/tcg-target.h        |   2 +
 tcg/mips/tcg-target.inc.c    |   4 +
 tcg/optimize.c               |  29 ++++
 tcg/ppc/tcg-target.h         |   4 +
 tcg/ppc/tcg-target.inc.c     |  10 ++
 tcg/s390/tcg-target.h        |  12 +-
 tcg/s390/tcg-target.inc.c    |  13 +-
 tcg/sparc/tcg-target.h       |   4 +
 tcg/tcg-op.c                 | 319 ++++++++++++++++++++++++++++++++++++++++++-
 tcg/tcg-op.h                 |  12 ++
 tcg/tcg-opc.h                |   4 +
 tcg/tcg.h                    |   8 ++
 tcg/tci/tcg-target.h         |   4 +
 27 files changed, 655 insertions(+), 172 deletions(-)

-- 
2.7.4


* [Qemu-devel] [PATCH 01/15] tcg: Add field extraction primitives
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 02/15] tcg: Minor adjustments to deposit expanders Richard Henderson
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC
  To: qemu-devel

Adds tcg_gen_extract_* and tcg_gen_sextract_* for extraction of
fixed position bitfields, much like we already have for deposit.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
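
As a rough reference for what the new ops are meant to compute, here is a
plain C model of the 32-bit semantics.  The names ref_extract32 and
ref_sextract32 are hypothetical and exist only for this sketch; they are
not functions added by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model: zero-extended value of the len-bit
   field that starts at bit ofs.  */
static uint32_t ref_extract32(uint32_t value, unsigned ofs, unsigned len)
{
    assert(len > 0 && len <= 32 && ofs < 32 && ofs + len <= 32);
    return (value >> ofs) & (UINT32_MAX >> (32 - len));
}

/* Hypothetical reference model: the same field, sign-extended from its
   most significant bit.  */
static int32_t ref_sextract32(uint32_t value, unsigned ofs, unsigned len)
{
    uint32_t field = ref_extract32(value, ofs, len);
    uint32_t sign = UINT32_C(1) << (len - 1);

    /* Flip the sign bit and subtract it again; done in 64 bits so the
       result is in int32_t range before the final conversion.  */
    return (int32_t)((int64_t)(field ^ sign) - (int64_t)sign);
}
```

For example, the 4-bit field at offset 12 of 0x0000F000 reads as 15 through
the unsigned model and as -1 through the signed one.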
 tcg/aarch64/tcg-target.h |   4 +
 tcg/arm/tcg-target.h     |   2 +
 tcg/i386/tcg-target.h    |   4 +
 tcg/ia64/tcg-target.h    |   4 +
 tcg/mips/tcg-target.h    |   2 +
 tcg/optimize.c           |  29 +++++
 tcg/ppc/tcg-target.h     |   4 +
 tcg/s390/tcg-target.h    |   4 +
 tcg/sparc/tcg-target.h   |   4 +
 tcg/tcg-op.c             | 313 +++++++++++++++++++++++++++++++++++++++++++++++
 tcg/tcg-op.h             |  12 ++
 tcg/tcg-opc.h            |   4 +
 tcg/tcg.h                |   8 ++
 tcg/tci/tcg-target.h     |   4 +
 14 files changed, 398 insertions(+)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index a1d101f..410c31b 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -63,6 +63,8 @@ typedef enum {
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          0
 #define TCG_TARGET_HAS_deposit_i32      1
+#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_sextract_i32     0
 #define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_add2_i32         1
 #define TCG_TARGET_HAS_sub2_i32         1
@@ -93,6 +95,8 @@ typedef enum {
 #define TCG_TARGET_HAS_nand_i64         0
 #define TCG_TARGET_HAS_nor_i64          0
 #define TCG_TARGET_HAS_deposit_i64      1
+#define TCG_TARGET_HAS_extract_i64      0
+#define TCG_TARGET_HAS_sextract_i64     0
 #define TCG_TARGET_HAS_movcond_i64      1
 #define TCG_TARGET_HAS_add2_i64         1
 #define TCG_TARGET_HAS_sub2_i64         1
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index a0e1acf..8e724be 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -80,6 +80,8 @@ extern bool use_idiv_instructions;
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          0
 #define TCG_TARGET_HAS_deposit_i32      1
+#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_sextract_i32     0
 #define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_mulu2_i32        1
 #define TCG_TARGET_HAS_muls2_i32        1
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 524cfc6..7625188 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -94,6 +94,8 @@ extern bool have_bmi1;
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          0
 #define TCG_TARGET_HAS_deposit_i32      1
+#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_sextract_i32     0
 #define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_add2_i32         1
 #define TCG_TARGET_HAS_sub2_i32         1
@@ -124,6 +126,8 @@ extern bool have_bmi1;
 #define TCG_TARGET_HAS_nand_i64         0
 #define TCG_TARGET_HAS_nor_i64          0
 #define TCG_TARGET_HAS_deposit_i64      1
+#define TCG_TARGET_HAS_extract_i64      0
+#define TCG_TARGET_HAS_sextract_i64     0
 #define TCG_TARGET_HAS_movcond_i64      1
 #define TCG_TARGET_HAS_add2_i64         1
 #define TCG_TARGET_HAS_sub2_i64         1
diff --git a/tcg/ia64/tcg-target.h b/tcg/ia64/tcg-target.h
index 6dddb7f..8856dc8 100644
--- a/tcg/ia64/tcg-target.h
+++ b/tcg/ia64/tcg-target.h
@@ -149,6 +149,10 @@ typedef enum {
 #define TCG_TARGET_HAS_movcond_i64      1
 #define TCG_TARGET_HAS_deposit_i32      1
 #define TCG_TARGET_HAS_deposit_i64      1
+#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_extract_i64      0
+#define TCG_TARGET_HAS_sextract_i32     0
+#define TCG_TARGET_HAS_sextract_i64     0
 #define TCG_TARGET_HAS_add2_i32         0
 #define TCG_TARGET_HAS_add2_i64         0
 #define TCG_TARGET_HAS_sub2_i32         0
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index 3aeac87..1bcea3b 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -123,6 +123,8 @@ extern bool use_mips32r2_instructions;
 #define TCG_TARGET_HAS_bswap16_i32      use_mips32r2_instructions
 #define TCG_TARGET_HAS_bswap32_i32      use_mips32r2_instructions
 #define TCG_TARGET_HAS_deposit_i32      use_mips32r2_instructions
+#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_sextract_i32     0
 #define TCG_TARGET_HAS_ext8s_i32        use_mips32r2_instructions
 #define TCG_TARGET_HAS_ext16s_i32       use_mips32r2_instructions
 #define TCG_TARGET_HAS_rot_i32          use_mips32r2_instructions
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 0f13490..0be71f8 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -878,6 +878,19 @@ void tcg_optimize(TCGContext *s)
                              temps[args[2]].mask);
             break;
 
+        CASE_OP_32_64(extract):
+            mask = extract64(temps[args[1]].mask, args[2], args[3]);
+            if (args[2] == 0) {
+                affected = temps[args[1]].mask & ~mask;
+            }
+            break;
+        CASE_OP_32_64(sextract):
+            mask = sextract64(temps[args[1]].mask, args[2], args[3]);
+            if (args[2] == 0 && (tcg_target_long)mask >= 0) {
+                affected = temps[args[1]].mask & ~mask;
+            }
+            break;
+
         CASE_OP_32_64(or):
         CASE_OP_32_64(xor):
             mask = temps[args[1]].mask | temps[args[2]].mask;
@@ -1048,6 +1061,22 @@ void tcg_optimize(TCGContext *s)
             }
             goto do_default;
 
+        CASE_OP_32_64(extract):
+            if (temp_is_const(args[1])) {
+                tmp = extract64(temps[args[1]].val, args[2], args[3]);
+                tcg_opt_gen_movi(s, op, args, args[0], tmp);
+                break;
+            }
+            goto do_default;
+
+        CASE_OP_32_64(sextract):
+            if (temp_is_const(args[1])) {
+                tmp = sextract64(temps[args[1]].val, args[2], args[3]);
+                tcg_opt_gen_movi(s, op, args, args[0], tmp);
+                break;
+            }
+            goto do_default;
+
         CASE_OP_32_64(setcond):
             tmp = do_constant_folding_cond(opc, args[1], args[2], args[3]);
             if (tmp != 2) {
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index dd032f2..c765d3e 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -69,6 +69,8 @@ typedef enum {
 #define TCG_TARGET_HAS_nand_i32         1
 #define TCG_TARGET_HAS_nor_i32          1
 #define TCG_TARGET_HAS_deposit_i32      1
+#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_sextract_i32     0
 #define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_mulu2_i32        0
 #define TCG_TARGET_HAS_muls2_i32        0
@@ -100,6 +102,8 @@ typedef enum {
 #define TCG_TARGET_HAS_nand_i64         1
 #define TCG_TARGET_HAS_nor_i64          1
 #define TCG_TARGET_HAS_deposit_i64      1
+#define TCG_TARGET_HAS_extract_i64      0
+#define TCG_TARGET_HAS_sextract_i64     0
 #define TCG_TARGET_HAS_movcond_i64      1
 #define TCG_TARGET_HAS_add2_i64         1
 #define TCG_TARGET_HAS_sub2_i64         1
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 0c1af24..9583df4 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -66,6 +66,8 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          0
 #define TCG_TARGET_HAS_deposit_i32      1
+#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_sextract_i32     0
 #define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_add2_i32         1
 #define TCG_TARGET_HAS_sub2_i32         1
@@ -95,6 +97,8 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_nand_i64         0
 #define TCG_TARGET_HAS_nor_i64          0
 #define TCG_TARGET_HAS_deposit_i64      1
+#define TCG_TARGET_HAS_extract_i64      0
+#define TCG_TARGET_HAS_sextract_i64     0
 #define TCG_TARGET_HAS_movcond_i64      1
 #define TCG_TARGET_HAS_add2_i64         1
 #define TCG_TARGET_HAS_sub2_i64         1
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index 88f9c90..a212167 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -111,6 +111,8 @@ extern bool use_vis3_instructions;
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          0
 #define TCG_TARGET_HAS_deposit_i32      0
+#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_sextract_i32     0
 #define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_add2_i32         1
 #define TCG_TARGET_HAS_sub2_i32         1
@@ -141,6 +143,8 @@ extern bool use_vis3_instructions;
 #define TCG_TARGET_HAS_nand_i64         0
 #define TCG_TARGET_HAS_nor_i64          0
 #define TCG_TARGET_HAS_deposit_i64      0
+#define TCG_TARGET_HAS_extract_i64      0
+#define TCG_TARGET_HAS_sextract_i64     0
 #define TCG_TARGET_HAS_movcond_i64      1
 #define TCG_TARGET_HAS_add2_i64         1
 #define TCG_TARGET_HAS_sub2_i64         1
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 291d50b..cc4a331 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -570,6 +570,123 @@ void tcg_gen_deposit_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2,
     tcg_temp_free_i32(t1);
 }
 
+void tcg_gen_extract_i32(TCGv_i32 ret, TCGv_i32 arg,
+                         unsigned int ofs, unsigned int len)
+{
+    tcg_debug_assert(ofs < 32);
+    tcg_debug_assert(len > 0);
+    tcg_debug_assert(len <= 32);
+    tcg_debug_assert(ofs + len <= 32);
+
+    if (ofs + len == 32) {
+        tcg_gen_shri_i32(ret, arg, 32 - len);
+        return;
+    }
+
+    if (TCG_TARGET_HAS_extract_i32
+        && TCG_TARGET_extract_i32_valid(ofs, len)) {
+        tcg_gen_op4ii_i32(INDEX_op_extract_i32, ret, arg, ofs, len);
+        return;
+    }
+
+    switch (ofs + len) {
+    case 16:
+        if (TCG_TARGET_HAS_ext16u_i32) {
+            tcg_gen_ext16u_i32(ret, arg);
+            tcg_gen_shri_i32(ret, ret, ofs);
+            return;
+        }
+        break;
+    case 8:
+        if (TCG_TARGET_HAS_ext8u_i32) {
+            tcg_gen_ext8u_i32(ret, arg);
+            tcg_gen_shri_i32(ret, ret, ofs);
+            return;
+        }
+        break;
+    }
+
+    /* ??? Ideally we'd know what values are available for immediate AND.
+       Assume that 8 bits are available, plus the special case of 16,
+       so that we get ext8u, ext16u.  */
+    switch (len) {
+    case 1 ... 8: case 16:
+        if (ofs != 0) {
+            tcg_gen_shri_i32(ret, arg, ofs);
+            arg = ret;
+        }
+        tcg_gen_andi_i32(ret, arg, (1u << len) - 1);
+        break;
+    default:
+        tcg_gen_shli_i32(ret, arg, 32 - len - ofs);
+        tcg_gen_shri_i32(ret, ret, 32 - len);
+        break;
+    }
+}
+
+void tcg_gen_sextract_i32(TCGv_i32 ret, TCGv_i32 arg,
+                          unsigned int ofs, unsigned int len)
+{
+    tcg_debug_assert(ofs < 32);
+    tcg_debug_assert(len > 0);
+    tcg_debug_assert(len <= 32);
+    tcg_debug_assert(ofs + len <= 32);
+
+    if (ofs + len == 32) {
+        tcg_gen_sari_i32(ret, arg, 32 - len);
+        return;
+    }
+
+    if (TCG_TARGET_HAS_sextract_i32
+        && TCG_TARGET_extract_i32_valid(ofs, len)) {
+        tcg_gen_op4ii_i32(INDEX_op_sextract_i32, ret, arg, ofs, len);
+        return;
+    }
+
+    /* Assume that sign-extension, if available, is cheaper than a shift.  */
+    switch (ofs + len) {
+    case 16:
+        if (TCG_TARGET_HAS_ext16s_i32) {
+            tcg_gen_ext16s_i32(ret, arg);
+            tcg_gen_sari_i32(ret, ret, ofs);
+            return;
+        }
+        break;
+    case 8:
+        if (TCG_TARGET_HAS_ext8s_i32) {
+            tcg_gen_ext8s_i32(ret, arg);
+            tcg_gen_sari_i32(ret, ret, ofs);
+            return;
+        }
+        break;
+    }
+    switch (len) {
+    case 16:
+        if (TCG_TARGET_HAS_ext16s_i32) {
+            if (ofs != 0) {
+                tcg_gen_shri_i32(ret, arg, ofs);
+                arg = ret;
+            }
+            tcg_gen_ext16s_i32(ret, arg);
+            return;
+        }
+        break;
+    case 8:
+        if (TCG_TARGET_HAS_ext8s_i32) {
+            if (ofs != 0) {
+                tcg_gen_shri_i32(ret, arg, ofs);
+                arg = ret;
+            }
+            tcg_gen_ext8s_i32(ret, arg);
+            return;
+        }
+        break;
+    }
+
+    tcg_gen_shli_i32(ret, arg, 32 - len - ofs);
+    tcg_gen_sari_i32(ret, ret, 32 - len);
+}
+
 void tcg_gen_movcond_i32(TCGCond cond, TCGv_i32 ret, TCGv_i32 c1,
                          TCGv_i32 c2, TCGv_i32 v1, TCGv_i32 v2)
 {
@@ -1618,6 +1735,202 @@ void tcg_gen_deposit_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2,
     tcg_temp_free_i64(t1);
 }
 
+void tcg_gen_extract_i64(TCGv_i64 ret, TCGv_i64 arg,
+                         unsigned int ofs, unsigned int len)
+{
+    tcg_debug_assert(ofs < 64);
+    tcg_debug_assert(len > 0);
+    tcg_debug_assert(len <= 64);
+    tcg_debug_assert(ofs + len <= 64);
+
+    if (ofs + len == 64) {
+        tcg_gen_shri_i64(ret, arg, 64 - len);
+        return;
+    }
+
+    if (TCG_TARGET_REG_BITS == 32) {
+        /* Look for a 32-bit extract within one of the two words.  */
+        if (ofs >= 32) {
+            tcg_gen_extract_i32(TCGV_LOW(ret), TCGV_HIGH(arg), ofs - 32, len);
+            tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+            return;
+        }
+        if (ofs + len <= 32) {
+            tcg_gen_extract_i32(TCGV_LOW(ret), TCGV_LOW(arg), ofs, len);
+            tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+            return;
+        }
+        /* The field is split across the two words.  If the field itself
+           is 32-bits or less, we can perform one double-word shift to place
+           the field at the top of the low word and do the rest as a single
+           word shift.  */
+        if (len <= 32) {
+            if (len == 32) {
+                tcg_gen_shri_i64(ret, arg, ofs);
+            } else {
+                tcg_gen_shri_i64(ret, arg, ofs + len - 32);
+                tcg_gen_shri_i32(TCGV_LOW(ret), TCGV_LOW(ret), 32 - len);
+            }
+            tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+            return;
+        }
+    }
+
+    if (TCG_TARGET_HAS_extract_i64
+        && TCG_TARGET_extract_i64_valid(ofs, len)) {
+        tcg_gen_op4ii_i64(INDEX_op_extract_i64, ret, arg, ofs, len);
+        return;
+    }
+
+    switch (ofs + len) {
+    case 32:
+        if (TCG_TARGET_HAS_ext32u_i64) {
+            tcg_gen_ext32u_i64(ret, arg);
+            tcg_gen_shri_i64(ret, ret, ofs);
+            return;
+        }
+        break;
+    case 16:
+        if (TCG_TARGET_HAS_ext16u_i64) {
+            tcg_gen_ext16u_i64(ret, arg);
+            tcg_gen_shri_i64(ret, ret, ofs);
+            return;
+        }
+        break;
+    case 8:
+        if (TCG_TARGET_HAS_ext8u_i64) {
+            tcg_gen_ext8u_i64(ret, arg);
+            tcg_gen_shri_i64(ret, ret, ofs);
+            return;
+        }
+        break;
+    }
+
+    /* ??? Ideally we'd know what values are available for immediate AND.
+       Assume that 8 bits are available, plus the special cases of 16 and 32,
+       so that we get ext8u, ext16u, and ext32u.  */
+    switch (len) {
+    case 1 ... 8: case 16: case 32:
+        if (ofs != 0) {
+            tcg_gen_shri_i64(ret, arg, ofs);
+            arg = ret;
+        }
+        tcg_gen_andi_i64(ret, arg, (1ull << len) - 1);
+        break;
+    default:
+        tcg_gen_shli_i64(ret, arg, 64 - len - ofs);
+        tcg_gen_shri_i64(ret, ret, 64 - len);
+        break;
+    }
+}
+
+void tcg_gen_sextract_i64(TCGv_i64 ret, TCGv_i64 arg,
+                          unsigned int ofs, unsigned int len)
+{
+    tcg_debug_assert(ofs < 64);
+    tcg_debug_assert(len > 0);
+    tcg_debug_assert(len <= 64);
+    tcg_debug_assert(ofs + len <= 64);
+
+    if (ofs + len == 64) {
+        tcg_gen_sari_i64(ret, arg, 64 - len);
+        return;
+    }
+
+    if (TCG_TARGET_REG_BITS == 32) {
+        /* Look for a 32-bit extract within one of the two words.  */
+        if (ofs >= 32) {
+            tcg_gen_sextract_i32(TCGV_LOW(ret), TCGV_HIGH(arg), ofs - 32, len);
+            tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+            return;
+        }
+        if (ofs + len <= 32) {
+            tcg_gen_sextract_i32(TCGV_LOW(ret), TCGV_LOW(arg), ofs, len);
+            tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+            return;
+        }
+        /* The field is split across the two words.  If the field itself
+           is 32-bits or less, we can perform one double-word shift to place
+           the field at the top of the low word and do the rest as single
+           word shifts.  */
+        if (len <= 32) {
+            if (len == 32) {
+                tcg_gen_shri_i64(ret, arg, ofs);
+            } else {
+                tcg_gen_shri_i64(ret, arg, ofs + len - 32);
+                tcg_gen_sari_i32(TCGV_LOW(ret), TCGV_LOW(ret), 32 - len);
+            }
+            tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+            return;
+        }
+    }
+
+    if (TCG_TARGET_HAS_sextract_i64
+        && TCG_TARGET_extract_i64_valid(ofs, len)) {
+        tcg_gen_op4ii_i64(INDEX_op_sextract_i64, ret, arg, ofs, len);
+        return;
+    }
+
+    /* Assume that sign-extension, if available, is cheaper than a shift.  */
+    switch (ofs + len) {
+    case 32:
+        if (TCG_TARGET_HAS_ext32s_i64) {
+            tcg_gen_ext32s_i64(ret, arg);
+            tcg_gen_sari_i64(ret, ret, ofs);
+            return;
+        }
+        break;
+    case 16:
+        if (TCG_TARGET_HAS_ext16s_i64) {
+            tcg_gen_ext16s_i64(ret, arg);
+            tcg_gen_sari_i64(ret, ret, ofs);
+            return;
+        }
+        break;
+    case 8:
+        if (TCG_TARGET_HAS_ext8s_i64) {
+            tcg_gen_ext8s_i64(ret, arg);
+            tcg_gen_sari_i64(ret, ret, ofs);
+            return;
+        }
+        break;
+    }
+    switch (len) {
+    case 32:
+        if (TCG_TARGET_HAS_ext32s_i64) {
+            if (ofs != 0) {
+                tcg_gen_shri_i64(ret, arg, ofs);
+                arg = ret;
+            }
+            tcg_gen_ext32s_i64(ret, arg);
+            return;
+        }
+        break;
+    case 16:
+        if (TCG_TARGET_HAS_ext16s_i64) {
+            if (ofs != 0) {
+                tcg_gen_shri_i64(ret, arg, ofs);
+                arg = ret;
+            }
+            tcg_gen_ext16s_i64(ret, arg);
+            return;
+        }
+        break;
+    case 8:
+        if (TCG_TARGET_HAS_ext8s_i64) {
+            if (ofs != 0) {
+                tcg_gen_shri_i64(ret, arg, ofs);
+                arg = ret;
+            }
+            tcg_gen_ext8s_i64(ret, arg);
+            return;
+        }
+        break;
+    }
+    tcg_gen_shli_i64(ret, arg, 64 - len - ofs);
+    tcg_gen_sari_i64(ret, ret, 64 - len);
+}
+
 void tcg_gen_movcond_i64(TCGCond cond, TCGv_i64 ret, TCGv_i64 c1,
                          TCGv_i64 c2, TCGv_i64 v1, TCGv_i64 v2)
 {
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 02cb376..21d30cb 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -292,6 +292,10 @@ void tcg_gen_rotr_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2);
 void tcg_gen_rotri_i32(TCGv_i32 ret, TCGv_i32 arg1, unsigned arg2);
 void tcg_gen_deposit_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2,
                          unsigned int ofs, unsigned int len);
+void tcg_gen_extract_i32(TCGv_i32 ret, TCGv_i32 arg,
+                         unsigned int ofs, unsigned int len);
+void tcg_gen_sextract_i32(TCGv_i32 ret, TCGv_i32 arg,
+                          unsigned int ofs, unsigned int len);
 void tcg_gen_brcond_i32(TCGCond cond, TCGv_i32 arg1, TCGv_i32 arg2, TCGLabel *);
 void tcg_gen_brcondi_i32(TCGCond cond, TCGv_i32 arg1, int32_t arg2, TCGLabel *);
 void tcg_gen_setcond_i32(TCGCond cond, TCGv_i32 ret,
@@ -468,6 +472,10 @@ void tcg_gen_rotr_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
 void tcg_gen_rotri_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2);
 void tcg_gen_deposit_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2,
                          unsigned int ofs, unsigned int len);
+void tcg_gen_extract_i64(TCGv_i64 ret, TCGv_i64 arg,
+                         unsigned int ofs, unsigned int len);
+void tcg_gen_sextract_i64(TCGv_i64 ret, TCGv_i64 arg,
+                          unsigned int ofs, unsigned int len);
 void tcg_gen_brcond_i64(TCGCond cond, TCGv_i64 arg1, TCGv_i64 arg2, TCGLabel *);
 void tcg_gen_brcondi_i64(TCGCond cond, TCGv_i64 arg1, int64_t arg2, TCGLabel *);
 void tcg_gen_setcond_i64(TCGCond cond, TCGv_i64 ret,
@@ -925,6 +933,8 @@ static inline void tcg_gen_qemu_st64(TCGv_i64 arg, TCGv addr, int mem_index)
 #define tcg_gen_rotr_tl tcg_gen_rotr_i64
 #define tcg_gen_rotri_tl tcg_gen_rotri_i64
 #define tcg_gen_deposit_tl tcg_gen_deposit_i64
+#define tcg_gen_extract_tl tcg_gen_extract_i64
+#define tcg_gen_sextract_tl tcg_gen_sextract_i64
 #define tcg_const_tl tcg_const_i64
 #define tcg_const_local_tl tcg_const_local_i64
 #define tcg_gen_movcond_tl tcg_gen_movcond_i64
@@ -1002,6 +1012,8 @@ static inline void tcg_gen_qemu_st64(TCGv_i64 arg, TCGv addr, int mem_index)
 #define tcg_gen_rotr_tl tcg_gen_rotr_i32
 #define tcg_gen_rotri_tl tcg_gen_rotri_i32
 #define tcg_gen_deposit_tl tcg_gen_deposit_i32
+#define tcg_gen_extract_tl tcg_gen_extract_i32
+#define tcg_gen_sextract_tl tcg_gen_sextract_i32
 #define tcg_const_tl tcg_const_i32
 #define tcg_const_local_tl tcg_const_local_i32
 #define tcg_gen_movcond_tl tcg_gen_movcond_i32
diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
index 45528d2..11563ac 100644
--- a/tcg/tcg-opc.h
+++ b/tcg/tcg-opc.h
@@ -77,6 +77,8 @@ DEF(sar_i32, 1, 2, 0, 0)
 DEF(rotl_i32, 1, 2, 0, IMPL(TCG_TARGET_HAS_rot_i32))
 DEF(rotr_i32, 1, 2, 0, IMPL(TCG_TARGET_HAS_rot_i32))
 DEF(deposit_i32, 1, 2, 2, IMPL(TCG_TARGET_HAS_deposit_i32))
+DEF(extract_i32, 1, 1, 2, IMPL(TCG_TARGET_HAS_extract_i32))
+DEF(sextract_i32, 1, 1, 2, IMPL(TCG_TARGET_HAS_sextract_i32))
 
 DEF(brcond_i32, 0, 2, 2, TCG_OPF_BB_END)
 
@@ -139,6 +141,8 @@ DEF(sar_i64, 1, 2, 0, IMPL64)
 DEF(rotl_i64, 1, 2, 0, IMPL64 | IMPL(TCG_TARGET_HAS_rot_i64))
 DEF(rotr_i64, 1, 2, 0, IMPL64 | IMPL(TCG_TARGET_HAS_rot_i64))
 DEF(deposit_i64, 1, 2, 2, IMPL64 | IMPL(TCG_TARGET_HAS_deposit_i64))
+DEF(extract_i64, 1, 1, 2, IMPL64 | IMPL(TCG_TARGET_HAS_extract_i64))
+DEF(sextract_i64, 1, 1, 2, IMPL64 | IMPL(TCG_TARGET_HAS_sextract_i64))
 
 /* size changing ops */
 DEF(ext_i32_i64, 1, 1, 0, IMPL64)
diff --git a/tcg/tcg.h b/tcg/tcg.h
index c9949aa..e2d34b6 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -112,6 +112,8 @@ typedef uint64_t TCGRegSet;
 #define TCG_TARGET_HAS_nand_i64         0
 #define TCG_TARGET_HAS_nor_i64          0
 #define TCG_TARGET_HAS_deposit_i64      0
+#define TCG_TARGET_HAS_extract_i64      0
+#define TCG_TARGET_HAS_sextract_i64     0
 #define TCG_TARGET_HAS_movcond_i64      0
 #define TCG_TARGET_HAS_add2_i64         0
 #define TCG_TARGET_HAS_sub2_i64         0
@@ -130,6 +132,12 @@ typedef uint64_t TCGRegSet;
 #ifndef TCG_TARGET_deposit_i64_valid
 #define TCG_TARGET_deposit_i64_valid(ofs, len) 1
 #endif
+#ifndef TCG_TARGET_extract_i32_valid
+#define TCG_TARGET_extract_i32_valid(ofs, len) 1
+#endif
+#ifndef TCG_TARGET_extract_i64_valid
+#define TCG_TARGET_extract_i64_valid(ofs, len) 1
+#endif
 
 /* Only one of DIV or DIV2 should be defined.  */
 #if defined(TCG_TARGET_HAS_div_i32)
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
index 868228b..2065042 100644
--- a/tcg/tci/tcg-target.h
+++ b/tcg/tci/tcg-target.h
@@ -69,6 +69,8 @@
 #define TCG_TARGET_HAS_ext16u_i32       1
 #define TCG_TARGET_HAS_andc_i32         0
 #define TCG_TARGET_HAS_deposit_i32      1
+#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_sextract_i32     0
 #define TCG_TARGET_HAS_eqv_i32          0
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          0
@@ -88,6 +90,8 @@
 #define TCG_TARGET_HAS_bswap32_i64      1
 #define TCG_TARGET_HAS_bswap64_i64      1
 #define TCG_TARGET_HAS_deposit_i64      1
+#define TCG_TARGET_HAS_extract_i64      0
+#define TCG_TARGET_HAS_sextract_i64     0
 #define TCG_TARGET_HAS_div_i64          0
 #define TCG_TARGET_HAS_rem_i64          0
 #define TCG_TARGET_HAS_ext8s_i64        1
-- 
2.7.4


* [Qemu-devel] [PATCH 02/15] tcg: Minor adjustments to deposit expanders
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 01/15] tcg: Add field extraction primitives Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-17 15:23   ` Claudio Fontana
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 03/15] tcg/aarch64: Implement field extraction opcodes Richard Henderson
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC
  To: qemu-devel

Assert that len is not 0.

Since we have asserted that ofs + len <= N, a later
check for len == N implies that ofs == 0.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
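
The implication used in the commit message can be checked exhaustively.
The sketch below is only an illustration with a made-up helper name
(full_width_implies_zero_ofs); it walks every (ofs, len) pair that the
asserts accept and confirms that len == N never occurs with ofs != 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check, not part of the patch: n is the operation width
   in bits (32 or 64 for the deposit expanders).  */
static bool full_width_implies_zero_ofs(unsigned n)
{
    assert(n > 0);

    for (unsigned ofs = 0; ofs < n; ofs++) {
        for (unsigned len = 1; len <= n; len++) {
            if (ofs + len > n) {
                continue;       /* rejected by the ofs + len <= N assert */
            }
            if (len == n && ofs != 0) {
                return false;   /* would contradict the commit message */
            }
        }
    }
    return true;
}
```

This is why the simplified test on len alone selects the same cases as the
old test on both ofs and len.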
 tcg/tcg-op.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index cc4a331..39bab98 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -543,10 +543,11 @@ void tcg_gen_deposit_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2,
     TCGv_i32 t1;
 
     tcg_debug_assert(ofs < 32);
+    tcg_debug_assert(len > 0);
     tcg_debug_assert(len <= 32);
     tcg_debug_assert(ofs + len <= 32);
 
-    if (ofs == 0 && len == 32) {
+    if (len == 32) {
         tcg_gen_mov_i32(ret, arg2);
         return;
     }
@@ -1693,10 +1694,11 @@ void tcg_gen_deposit_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2,
     TCGv_i64 t1;
 
     tcg_debug_assert(ofs < 64);
+    tcg_debug_assert(len > 0);
     tcg_debug_assert(len <= 64);
     tcg_debug_assert(ofs + len <= 64);
 
-    if (ofs == 0 && len == 64) {
+    if (len == 64) {
         tcg_gen_mov_i64(ret, arg2);
         return;
     }
-- 
2.7.4


* [Qemu-devel] [PATCH 03/15] tcg/aarch64: Implement field extraction opcodes
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 01/15] tcg: Add field extraction primitives Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 02/15] tcg: Minor adjustments to deposit expanders Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-17 15:22   ` Claudio Fontana
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 04/15] tcg/arm: Move isa detection to tcg-target.h Richard Henderson
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC
  To: qemu-devel; +Cc: Claudio Fontana

Cc: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/aarch64/tcg-target.h     |  8 ++++----
 tcg/aarch64/tcg-target.inc.c | 14 ++++++++++++++
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 410c31b..4a74bd8 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -63,8 +63,8 @@ typedef enum {
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          0
 #define TCG_TARGET_HAS_deposit_i32      1
-#define TCG_TARGET_HAS_extract_i32      0
-#define TCG_TARGET_HAS_sextract_i32     0
+#define TCG_TARGET_HAS_extract_i32      1
+#define TCG_TARGET_HAS_sextract_i32     1
 #define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_add2_i32         1
 #define TCG_TARGET_HAS_sub2_i32         1
@@ -95,8 +95,8 @@ typedef enum {
 #define TCG_TARGET_HAS_nand_i64         0
 #define TCG_TARGET_HAS_nor_i64          0
 #define TCG_TARGET_HAS_deposit_i64      1
-#define TCG_TARGET_HAS_extract_i64      0
-#define TCG_TARGET_HAS_sextract_i64     0
+#define TCG_TARGET_HAS_extract_i64      1
+#define TCG_TARGET_HAS_sextract_i64     1
 #define TCG_TARGET_HAS_movcond_i64      1
 #define TCG_TARGET_HAS_add2_i64         1
 #define TCG_TARGET_HAS_sub2_i64         1
diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
index 1939d35..a496b3b 100644
--- a/tcg/aarch64/tcg-target.inc.c
+++ b/tcg/aarch64/tcg-target.inc.c
@@ -1640,6 +1640,16 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_dep(s, ext, a0, REG0(2), args[3], args[4]);
         break;
 
+    case INDEX_op_extract_i64:
+    case INDEX_op_extract_i32:
+        tcg_out_ubfm(s, ext, a0, a1, a2, a2 + args[3] - 1);
+        break;
+
+    case INDEX_op_sextract_i64:
+    case INDEX_op_sextract_i32:
+        tcg_out_sbfm(s, ext, a0, a1, a2, a2 + args[3] - 1);
+        break;
+
     case INDEX_op_add2_i32:
         tcg_out_addsub2(s, TCG_TYPE_I32, a0, a1, REG0(2), REG0(3),
                         (int32_t)args[4], args[5], const_args[4],
@@ -1785,6 +1795,10 @@ static const TCGTargetOpDef aarch64_op_defs[] = {
 
     { INDEX_op_deposit_i32, { "r", "0", "rZ" } },
     { INDEX_op_deposit_i64, { "r", "0", "rZ" } },
+    { INDEX_op_extract_i32, { "r", "r" } },
+    { INDEX_op_extract_i64, { "r", "r" } },
+    { INDEX_op_sextract_i32, { "r", "r" } },
+    { INDEX_op_sextract_i64, { "r", "r" } },
 
     { INDEX_op_add2_i32, { "r", "r", "rZ", "rZ", "rA", "rMZ" } },
     { INDEX_op_add2_i64, { "r", "r", "rZ", "rZ", "rA", "rMZ" } },
-- 
2.7.4


* [Qemu-devel] [PATCH 04/15] tcg/arm: Move isa detection to tcg-target.h
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
                   ` (2 preceding siblings ...)
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 03/15] tcg/aarch64: Implement field extraction opcodes Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 05/15] tcg/arm: Implement field extraction opcodes Richard Henderson
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: Andrzej Zaborowski, Peter Maydell

Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/arm/tcg-target.h     | 36 ++++++++++++++++++++++++++++++++----
 tcg/arm/tcg-target.inc.c | 41 +----------------------------------------
 2 files changed, 33 insertions(+), 44 deletions(-)

diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index 8e724be..d1fe12b 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -26,6 +26,37 @@
 #ifndef ARM_TCG_TARGET_H
 #define ARM_TCG_TARGET_H
 
+/* The __ARM_ARCH define is provided by gcc 4.8.  Construct it otherwise.  */
+#ifndef __ARM_ARCH
+# if defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) \
+     || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) \
+     || defined(__ARM_ARCH_7EM__)
+#  define __ARM_ARCH 7
+# elif defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) \
+       || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) \
+       || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6T2__)
+#  define __ARM_ARCH 6
+# elif defined(__ARM_ARCH_5__) || defined(__ARM_ARCH_5E__) \
+       || defined(__ARM_ARCH_5T__) || defined(__ARM_ARCH_5TE__) \
+       || defined(__ARM_ARCH_5TEJ__)
+#  define __ARM_ARCH 5
+# else
+#  define __ARM_ARCH 4
+# endif
+#endif
+
+extern int arm_arch;
+
+#if defined(__ARM_ARCH_5T__) \
+    || defined(__ARM_ARCH_5TE__) || defined(__ARM_ARCH_5TEJ__)
+# define use_armv5t_instructions 1
+#else
+# define use_armv5t_instructions use_armv6_instructions
+#endif
+
+#define use_armv6_instructions  (__ARM_ARCH >= 6 || arm_arch >= 6)
+#define use_armv7_instructions  (__ARM_ARCH >= 7 || arm_arch >= 7)
+
 #undef TCG_TARGET_STACK_GROWSUP
 #define TCG_TARGET_INSN_UNIT_SIZE 4
 #define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
@@ -79,7 +110,7 @@ extern bool use_idiv_instructions;
 #define TCG_TARGET_HAS_eqv_i32          0
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          0
-#define TCG_TARGET_HAS_deposit_i32      1
+#define TCG_TARGET_HAS_deposit_i32      use_armv7_instructions
 #define TCG_TARGET_HAS_extract_i32      0
 #define TCG_TARGET_HAS_sextract_i32     0
 #define TCG_TARGET_HAS_movcond_i32      1
@@ -90,9 +121,6 @@ extern bool use_idiv_instructions;
 #define TCG_TARGET_HAS_div_i32          use_idiv_instructions
 #define TCG_TARGET_HAS_rem_i32          0
 
-extern bool tcg_target_deposit_valid(int ofs, int len);
-#define TCG_TARGET_deposit_i32_valid  tcg_target_deposit_valid
-
 enum {
     TCG_AREG0 = TCG_REG_R6,
 };
diff --git a/tcg/arm/tcg-target.inc.c b/tcg/arm/tcg-target.inc.c
index ffa0d40..1415c27 100644
--- a/tcg/arm/tcg-target.inc.c
+++ b/tcg/arm/tcg-target.inc.c
@@ -25,36 +25,7 @@
 #include "elf.h"
 #include "tcg-be-ldst.h"
 
-/* The __ARM_ARCH define is provided by gcc 4.8.  Construct it otherwise.  */
-#ifndef __ARM_ARCH
-# if defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) \
-     || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) \
-     || defined(__ARM_ARCH_7EM__)
-#  define __ARM_ARCH 7
-# elif defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) \
-       || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) \
-       || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6T2__)
-#  define __ARM_ARCH 6
-# elif defined(__ARM_ARCH_5__) || defined(__ARM_ARCH_5E__) \
-       || defined(__ARM_ARCH_5T__) || defined(__ARM_ARCH_5TE__) \
-       || defined(__ARM_ARCH_5TEJ__)
-#  define __ARM_ARCH 5
-# else
-#  define __ARM_ARCH 4
-# endif
-#endif
-
-static int arm_arch = __ARM_ARCH;
-
-#if defined(__ARM_ARCH_5T__) \
-    || defined(__ARM_ARCH_5TE__) || defined(__ARM_ARCH_5TEJ__)
-# define use_armv5t_instructions 1
-#else
-# define use_armv5t_instructions use_armv6_instructions
-#endif
-
-#define use_armv6_instructions  (__ARM_ARCH >= 6 || arm_arch >= 6)
-#define use_armv7_instructions  (__ARM_ARCH >= 7 || arm_arch >= 7)
+int arm_arch = __ARM_ARCH;
 
 #ifndef use_idiv_instructions
 bool use_idiv_instructions;
@@ -730,16 +701,6 @@ static inline void tcg_out_bswap32(TCGContext *s, int cond, int rd, int rn)
     }
 }
 
-bool tcg_target_deposit_valid(int ofs, int len)
-{
-    /* ??? Without bfi, we could improve over generic code by combining
-       the right-shift from a non-zero ofs with the orr.  We do run into
-       problems when rd == rs, and the mask generated from ofs+len doesn't
-       fit into an immediate.  We would have to be careful not to pessimize
-       wrt the optimizations performed on the expanded code.  */
-    return use_armv7_instructions;
-}
-
 static inline void tcg_out_deposit(TCGContext *s, int cond, TCGReg rd,
                                    TCGArg a1, int ofs, int len, bool const_a1)
 {
-- 
2.7.4


* [Qemu-devel] [PATCH 05/15] tcg/arm: Implement field extraction opcodes
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
                   ` (3 preceding siblings ...)
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 04/15] tcg/arm: Move isa detection to tcg-target.h Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 06/15] tcg/i386: " Richard Henderson
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: Andrzej Zaborowski, Peter Maydell

Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
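Note for readers of the hunk below: the two magic constants are the
A32 UBFX and SBFX opcodes with all operand fields zero.  The sketch
here rebuilds the same instruction words outside of TCG so the field
layout is easier to see.  It is an illustration only; encode_ubfx(),
encode_sbfx() and the local COND_AL define are stand-ins, not QEMU
functions.

```c
#include <assert.h>
#include <stdint.h>

#define COND_AL 0xeu   /* "always" condition code */

/*
 * A32 UBFX Rd, Rn, #ofs, #len
 *   bits 31:28 cond, 20:16 len-1, 15:12 Rd, 11:7 ofs, 3:0 Rn
 */
uint32_t encode_ubfx(unsigned cond, unsigned rd, unsigned rn,
                     unsigned ofs, unsigned len)
{
    assert(len >= 1 && ofs + len <= 32);
    return 0x07e00050u | (cond << 28) | ((len - 1) << 16)
           | (rd << 12) | (ofs << 7) | rn;
}

/* SBFX uses the same layout; only bit 22 of the opcode differs. */
uint32_t encode_sbfx(unsigned cond, unsigned rd, unsigned rn,
                     unsigned ofs, unsigned len)
{
    assert(len >= 1 && ofs + len <= 32);
    return 0x07a00050u | (cond << 28) | ((len - 1) << 16)
           | (rd << 12) | (ofs << 7) | rn;
}
```

Unlike the deposit encoding (BFI), which stores ofs + len - 1, these
two store len - 1 in bits 20:16.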
 tcg/arm/tcg-target.h     |  4 ++--
 tcg/arm/tcg-target.inc.c | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index d1fe12b..4e30728 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -111,8 +111,8 @@ extern bool use_idiv_instructions;
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          0
 #define TCG_TARGET_HAS_deposit_i32      use_armv7_instructions
-#define TCG_TARGET_HAS_extract_i32      0
-#define TCG_TARGET_HAS_sextract_i32     0
+#define TCG_TARGET_HAS_extract_i32      use_armv7_instructions
+#define TCG_TARGET_HAS_sextract_i32     use_armv7_instructions
 #define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_mulu2_i32        1
 #define TCG_TARGET_HAS_muls2_i32        1
diff --git a/tcg/arm/tcg-target.inc.c b/tcg/arm/tcg-target.inc.c
index 1415c27..6765a9d 100644
--- a/tcg/arm/tcg-target.inc.c
+++ b/tcg/arm/tcg-target.inc.c
@@ -713,6 +713,20 @@ static inline void tcg_out_deposit(TCGContext *s, int cond, TCGReg rd,
               | (ofs << 7) | ((ofs + len - 1) << 16));
 }
 
+static inline void tcg_out_extract(TCGContext *s, int cond, TCGReg rd,
+                                   TCGArg a1, int ofs, int len)
+{
+    tcg_out32(s, 0x07e00050 | (cond << 28) | (rd << 12) | a1
+              | (ofs << 7) | ((len - 1) << 16));
+}
+
+static inline void tcg_out_sextract(TCGContext *s, int cond, TCGReg rd,
+                                    TCGArg a1, int ofs, int len)
+{
+    tcg_out32(s, 0x07a00050 | (cond << 28) | (rd << 12) | a1
+              | (ofs << 7) | ((len - 1) << 16));
+}
+
 /* Note that this routine is used for both LDR and LDRH formats, so we do
    not wish to include an immediate shift at this point.  */
 static void tcg_out_memop_r(TCGContext *s, int cond, ARMInsn opc, TCGReg rt,
@@ -1894,6 +1908,12 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_deposit(s, COND_AL, args[0], args[2],
                         args[3], args[4], const_args[2]);
         break;
+    case INDEX_op_extract_i32:
+        tcg_out_extract(s, COND_AL, args[0], args[1], args[2], args[3]);
+        break;
+    case INDEX_op_sextract_i32:
+        tcg_out_sextract(s, COND_AL, args[0], args[1], args[2], args[3]);
+        break;
 
     case INDEX_op_div_i32:
         tcg_out_sdiv(s, COND_AL, args[0], args[1], args[2]);
@@ -1976,6 +1996,8 @@ static const TCGTargetOpDef arm_op_defs[] = {
     { INDEX_op_ext16u_i32, { "r", "r" } },
 
     { INDEX_op_deposit_i32, { "r", "0", "rZ" } },
+    { INDEX_op_extract_i32, { "r", "r" } },
+    { INDEX_op_sextract_i32, { "r", "r" } },
 
     { INDEX_op_div_i32, { "r", "r", "r" } },
     { INDEX_op_divu_i32, { "r", "r", "r" } },
-- 
2.7.4


* [Qemu-devel] [PATCH 06/15] tcg/i386: Implement field extraction opcodes
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
                   ` (4 preceding siblings ...)
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 05/15] tcg/arm: Implement field extraction opcodes Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 07/15] tcg/mips: " Richard Henderson
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
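Note for readers of the hunk below: only the (ofs=8, len=8) case is
accepted, because that is what a byte move from %ah/%bh/%ch/%dh
computes.  The sketch models both the high-byte form and the
ext16 + shift fallback in plain C to show that they agree.  The
function names are made up for this note, and the arithmetic right
shift is spelled out by hand so the sketch does not rely on
implementation-defined behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* What "movzbl %ah, %eax" computes: bits 8..15, zero-extended. */
uint32_t extract_8_8(uint32_t x)
{
    return (x >> 8) & 0xff;
}

/* Fallback: zero-extend the low 16 bits, then shift right by 8. */
uint32_t ext16u_then_shr8(uint32_t x)
{
    return (x & 0xffff) >> 8;
}

/* What "movsbl %ah, %eax" computes: bits 8..15, sign-extended. */
int32_t sextract_8_8(uint32_t x)
{
    int32_t field = (int32_t)((x >> 8) & 0xff);

    return (field ^ 0x80) - 0x80;
}

/* Fallback: sign-extend the low 16 bits, then arithmetic shift by 8. */
int32_t ext16s_then_sar8(uint32_t x)
{
    int32_t v = (int32_t)(x & 0xffff);

    if (v & 0x8000) {
        v -= 0x10000;                       /* sign-extend from 16 bits */
    }
    return v < 0 ? ~(~v >> 8) : v >> 8;     /* portable arithmetic shift */
}
```

In both fallbacks the shift count is the constant 8, matching the
only (ofs, len) pair the _valid macro lets through.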
 tcg/i386/tcg-target.h     |  9 ++++++---
 tcg/i386/tcg-target.inc.c | 30 ++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 7625188..302ca96 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -94,8 +94,8 @@ extern bool have_bmi1;
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          0
 #define TCG_TARGET_HAS_deposit_i32      1
-#define TCG_TARGET_HAS_extract_i32      0
-#define TCG_TARGET_HAS_sextract_i32     0
+#define TCG_TARGET_HAS_extract_i32      1
+#define TCG_TARGET_HAS_sextract_i32     1
 #define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_add2_i32         1
 #define TCG_TARGET_HAS_sub2_i32         1
@@ -126,7 +126,7 @@ extern bool have_bmi1;
 #define TCG_TARGET_HAS_nand_i64         0
 #define TCG_TARGET_HAS_nor_i64          0
 #define TCG_TARGET_HAS_deposit_i64      1
-#define TCG_TARGET_HAS_extract_i64      0
+#define TCG_TARGET_HAS_extract_i64      1
 #define TCG_TARGET_HAS_sextract_i64     0
 #define TCG_TARGET_HAS_movcond_i64      1
 #define TCG_TARGET_HAS_add2_i64         1
@@ -142,6 +142,9 @@ extern bool have_bmi1;
      ((ofs) == 0 && (len) == 16))
 #define TCG_TARGET_deposit_i64_valid    TCG_TARGET_deposit_i32_valid
 
+#define TCG_TARGET_extract_i32_valid(ofs, len) ((ofs) == 8 && (len) == 8)
+#define TCG_TARGET_extract_i64_valid    TCG_TARGET_extract_i32_valid
+
 #if TCG_TARGET_REG_BITS == 64
 # define TCG_AREG0 TCG_REG_R14
 #else
diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index eeb1777..091c6ff 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -2143,6 +2143,32 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
+    OP_32_64(extract):
+        /* Note that TCG_TARGET_extract_*_valid only allows pos=8, len=8,
+           on the off-chance that we can use the high-byte registers.
+           Otherwise we emit the same ext16 + shift pattern that we would
+           have gotten from the normal tcg-op.c expansion.  */
+        if (args[1] < 4 && args[0] < 8) {
+            /* Do not set P_REXB_RM, so that we do get the %[abcd]h regs.  */
+            tcg_out_modrm(s, OPC_MOVZBL, args[0], args[1] + 4);
+        } else {
+            tcg_out_ext16u(s, args[0], args[1]);
+            tcg_out_shifti(s, SHIFT_SHR, args[0], 8);
+        }
+        break;
+
+    case INDEX_op_sextract_i32:
+        /* Note that we don't implement sextract_i64, as we cannot
+           sign-extend to 64-bits without using the REX prefix that
+           explicitly excludes access to the high-byte registers.  */
+        if (args[1] < 4 && args[0] < 8) {
+            tcg_out_modrm(s, OPC_MOVSBL, args[0], args[1] + 4);
+        } else {
+            tcg_out_ext16s(s, args[0], args[1], 0);
+            tcg_out_shifti(s, SHIFT_SAR, args[0], 8);
+        }
+        break;
+
     case INDEX_op_mb:
         tcg_out_mb(s, args[0]);
         break;
@@ -2204,6 +2230,9 @@ static const TCGTargetOpDef x86_op_defs[] = {
     { INDEX_op_setcond_i32, { "q", "r", "ri" } },
 
     { INDEX_op_deposit_i32, { "Q", "0", "Q" } },
+    { INDEX_op_extract_i32, { "r", "r" } },
+    { INDEX_op_sextract_i32, { "r", "r" } },
+
     { INDEX_op_movcond_i32, { "r", "r", "ri", "r", "0" } },
 
     { INDEX_op_mulu2_i32, { "a", "d", "a", "r" } },
@@ -2265,6 +2294,7 @@ static const TCGTargetOpDef x86_op_defs[] = {
     { INDEX_op_extu_i32_i64, { "r", "r" } },
 
     { INDEX_op_deposit_i64, { "Q", "0", "Q" } },
+    { INDEX_op_extract_i64, { "r", "r" } },
     { INDEX_op_movcond_i64, { "r", "r", "re", "r", "0" } },
 
     { INDEX_op_mulu2_i64, { "a", "d", "a", "r" } },
-- 
2.7.4


* [Qemu-devel] [PATCH 07/15] tcg/mips: Implement field extraction opcodes
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
                   ` (5 preceding siblings ...)
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 06/15] tcg/i386: " Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 08/15] tcg/ppc: " Richard Henderson
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
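Note for readers of the hunk below: as I read the MIPS32R2 manual,
INS and EXT share one instruction layout but differ in what the
"msb" field holds.  INS stores pos + size - 1, while EXT stores
size - 1.  A small sketch of the two encodings, with encode_bf(),
encode_ext() and encode_ins() as illustrative names rather than the
helpers used in tcg-target.inc.c:

```c
#include <assert.h>
#include <stdint.h>

#define OPC_SPECIAL3 0x7c000000u   /* major opcode 0x1f */
#define FUNCT_EXT    0x00u
#define FUNCT_INS    0x04u

/* Shared layout: rs (25:21), rt (20:16), msb field (15:11), lsb (10:6). */
uint32_t encode_bf(uint32_t funct, unsigned rt, unsigned rs,
                   unsigned msb_field, unsigned lsb)
{
    assert(rt < 32 && rs < 32 && msb_field < 32 && lsb < 32);
    return OPC_SPECIAL3 | (rs << 21) | (rt << 16)
           | (msb_field << 11) | (lsb << 6) | funct;
}

/* EXT rt, rs, pos, size: the msb field holds size - 1. */
uint32_t encode_ext(unsigned rt, unsigned rs, unsigned pos, unsigned size)
{
    assert(size >= 1 && pos + size <= 32);
    return encode_bf(FUNCT_EXT, rt, rs, size - 1, pos);
}

/* INS rt, rs, pos, size: the msb field holds pos + size - 1. */
uint32_t encode_ins(unsigned rt, unsigned rs, unsigned pos, unsigned size)
{
    assert(size >= 1 && pos + size <= 32);
    return encode_bf(FUNCT_INS, rt, rs, pos + size - 1, pos);
}
```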
 tcg/mips/tcg-target.h     | 2 +-
 tcg/mips/tcg-target.inc.c | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index 1bcea3b..f1c3137 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -123,7 +123,7 @@ extern bool use_mips32r2_instructions;
 #define TCG_TARGET_HAS_bswap16_i32      use_mips32r2_instructions
 #define TCG_TARGET_HAS_bswap32_i32      use_mips32r2_instructions
 #define TCG_TARGET_HAS_deposit_i32      use_mips32r2_instructions
-#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_extract_i32      use_mips32r2_instructions
 #define TCG_TARGET_HAS_sextract_i32     0
 #define TCG_TARGET_HAS_ext8s_i32        use_mips32r2_instructions
 #define TCG_TARGET_HAS_ext16s_i32       use_mips32r2_instructions
diff --git a/tcg/mips/tcg-target.inc.c b/tcg/mips/tcg-target.inc.c
index abce602..192dd49 100644
--- a/tcg/mips/tcg-target.inc.c
+++ b/tcg/mips/tcg-target.inc.c
@@ -1637,6 +1637,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_deposit_i32:
         tcg_out_opc_bf(s, OPC_INS, a0, a2, args[3] + args[4] - 1, args[3]);
         break;
+    case INDEX_op_extract_i32:
+        tcg_out_opc_bf(s, OPC_EXT, a0, a1, args[3] - 1, a2);
+        break;
 
     case INDEX_op_brcond_i32:
         tcg_out_brcond(s, a2, a0, a1, arg_label(args[3]));
@@ -1736,6 +1739,7 @@ static const TCGTargetOpDef mips_op_defs[] = {
     { INDEX_op_ext16s_i32, { "r", "rZ" } },
 
     { INDEX_op_deposit_i32, { "r", "0", "rZ" } },
+    { INDEX_op_extract_i32, { "r", "r" } },
 
     { INDEX_op_brcond_i32, { "rZ", "rZ" } },
 #if use_mips32r6_instructions
-- 
2.7.4


* [Qemu-devel] [PATCH 08/15] tcg/ppc: Implement field extraction opcodes
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
                   ` (6 preceding siblings ...)
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 07/15] tcg/mips: " Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-26  1:48   ` David Gibson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 09/15] tcg/s390: " Richard Henderson
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
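Note for readers of the hunk below: the extract is expressed as a
rotate-and-mask.  Rotating left by 32 - ofs is a rotate right by
ofs, and the mask MB = 32 - len, ME = 31 (PowerPC numbers bit 0 as
the most significant) keeps the low len bits.  A plain C model of
that reasoning, with rlwinm() as a simplified stand-in for the
instruction (no mask wrap-around) and all names chosen for this
note.  The sketch reduces the rotate count modulo 32 so that ofs = 0
stays well defined; that is a choice of the sketch, not a statement
about the encoder.

```c
#include <assert.h>
#include <stdint.h>

uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Mask covering bits MB..ME in PowerPC numbering (bit 0 is the MSB). */
uint32_t ppc_mask32(unsigned mb, unsigned me)
{
    assert(mb <= me && me <= 31);
    return (0xffffffffu >> mb) & (0xffffffffu << (31 - me));
}

/* Simplified rlwinm: rotate left, then AND with the MB..ME mask. */
uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32(rs, sh) & ppc_mask32(mb, me);
}

/* extract(x, ofs, len) written the way the hunk below writes it. */
uint32_t extract32_via_rlwinm(uint32_t x, unsigned ofs, unsigned len)
{
    assert(len >= 1 && ofs + len <= 32);
    return rlwinm(x, 32 - ofs, 32 - len, 31);
}
```

The 64-bit rldicl case follows the same pattern with 64 in place of
32 and an implicit ME of 63.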
 tcg/ppc/tcg-target.h     |  4 ++--
 tcg/ppc/tcg-target.inc.c | 10 ++++++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index c765d3e..b42c57a 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -69,7 +69,7 @@ typedef enum {
 #define TCG_TARGET_HAS_nand_i32         1
 #define TCG_TARGET_HAS_nor_i32          1
 #define TCG_TARGET_HAS_deposit_i32      1
-#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_extract_i32      1
 #define TCG_TARGET_HAS_sextract_i32     0
 #define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_mulu2_i32        0
@@ -102,7 +102,7 @@ typedef enum {
 #define TCG_TARGET_HAS_nand_i64         1
 #define TCG_TARGET_HAS_nor_i64          1
 #define TCG_TARGET_HAS_deposit_i64      1
-#define TCG_TARGET_HAS_extract_i64      0
+#define TCG_TARGET_HAS_extract_i64      1
 #define TCG_TARGET_HAS_sextract_i64     0
 #define TCG_TARGET_HAS_movcond_i64      1
 #define TCG_TARGET_HAS_add2_i64         1
diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c
index a3262cf..7ec54a2 100644
--- a/tcg/ppc/tcg-target.inc.c
+++ b/tcg/ppc/tcg-target.inc.c
@@ -2396,6 +2396,14 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
         }
         break;
 
+    case INDEX_op_extract_i32:
+        tcg_out_rlw(s, RLWINM, args[0], args[1],
+                    32 - args[2], 32 - args[3], 31);
+        break;
+    case INDEX_op_extract_i64:
+        tcg_out_rld(s, RLDICL, args[0], args[1], 64 - args[2], 64 - args[3]);
+        break;
+
     case INDEX_op_movcond_i32:
         tcg_out_movcond(s, TCG_TYPE_I32, args[5], args[0], args[1], args[2],
                         args[3], args[4], const_args[2]);
@@ -2530,6 +2538,7 @@ static const TCGTargetOpDef ppc_op_defs[] = {
     { INDEX_op_movcond_i32, { "r", "r", "ri", "rZ", "rZ" } },
 
     { INDEX_op_deposit_i32, { "r", "0", "rZ" } },
+    { INDEX_op_extract_i32, { "r", "r" } },
 
     { INDEX_op_muluh_i32, { "r", "r", "r" } },
     { INDEX_op_mulsh_i32, { "r", "r", "r" } },
@@ -2585,6 +2594,7 @@ static const TCGTargetOpDef ppc_op_defs[] = {
     { INDEX_op_movcond_i64, { "r", "r", "ri", "rZ", "rZ" } },
 
     { INDEX_op_deposit_i64, { "r", "0", "rZ" } },
+    { INDEX_op_extract_i64, { "r", "r" } },
 
     { INDEX_op_mulsh_i64, { "r", "r", "r" } },
     { INDEX_op_muluh_i64, { "r", "r", "r" } },
-- 
2.7.4


* [Qemu-devel] [PATCH 09/15] tcg/s390: Implement field extraction opcodes
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
                   ` (7 preceding siblings ...)
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 08/15] tcg/ppc: " Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 10/15] target-alpha: Use deposit and extract ops Richard Henderson
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf

Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
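Note for readers of the hunk below: tgen_extract() relies on RISBG
with the zero flag set.  The source is rotated left by 64 - ofs,
bits 64 - len through 63 (bit 0 is the most significant) are kept,
and everything else is cleared.  A plain C model under those
assumptions; risbg_z() is a simplified stand-in that covers only
the zeroing, non-wrapping form, and the names are invented here.

```c
#include <assert.h>
#include <stdint.h>

uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63;
    return n ? (x << n) | (x >> (64 - n)) : x;
}

/* RISBG with Z=1: rotate, keep bits i3..i4 (bit 0 is the MSB), zero rest. */
uint64_t risbg_z(uint64_t src, unsigned i3, unsigned i4, unsigned rot)
{
    uint64_t mask;

    assert(i3 <= i4 && i4 <= 63);
    mask = (~UINT64_C(0) >> i3) & (~UINT64_C(0) << (63 - i4));
    return rotl64(src, rot) & mask;
}

/* extract(x, ofs, len) with the same operands tgen_extract() passes. */
uint64_t extract64_via_risbg(uint64_t x, unsigned ofs, unsigned len)
{
    assert(len >= 1 && ofs + len <= 64);
    return risbg_z(x, 64 - len, 63, 64 - ofs);
}
```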
 tcg/s390/tcg-target.h     | 12 +++++++-----
 tcg/s390/tcg-target.inc.c | 13 ++++++++++++-
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 9583df4..cf8fbfd 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -66,7 +66,7 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          0
 #define TCG_TARGET_HAS_deposit_i32      1
-#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_extract_i32      1
 #define TCG_TARGET_HAS_sextract_i32     0
 #define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_add2_i32         1
@@ -97,7 +97,7 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_nand_i64         0
 #define TCG_TARGET_HAS_nor_i64          0
 #define TCG_TARGET_HAS_deposit_i64      1
-#define TCG_TARGET_HAS_extract_i64      0
+#define TCG_TARGET_HAS_extract_i64      1
 #define TCG_TARGET_HAS_sextract_i64     0
 #define TCG_TARGET_HAS_movcond_i64      1
 #define TCG_TARGET_HAS_add2_i64         1
@@ -107,9 +107,11 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_muluh_i64        0
 #define TCG_TARGET_HAS_mulsh_i64        0
 
-extern bool tcg_target_deposit_valid(int ofs, int len);
-#define TCG_TARGET_deposit_i32_valid  tcg_target_deposit_valid
-#define TCG_TARGET_deposit_i64_valid  tcg_target_deposit_valid
+extern bool tcg_target_have_gen_inst(void);
+#define TCG_TARGET_deposit_i32_valid(o,l)  tcg_target_have_gen_inst()
+#define TCG_TARGET_deposit_i64_valid(o,l)  tcg_target_have_gen_inst()
+#define TCG_TARGET_extract_i32_valid(o,l)  tcg_target_have_gen_inst()
+#define TCG_TARGET_extract_i64_valid(o,l)  tcg_target_have_gen_inst()
 
 /* used for function call generation */
 #define TCG_REG_CALL_STACK		TCG_REG_R15
diff --git a/tcg/s390/tcg-target.inc.c b/tcg/s390/tcg-target.inc.c
index 253d4a0..fa9e144 100644
--- a/tcg/s390/tcg-target.inc.c
+++ b/tcg/s390/tcg-target.inc.c
@@ -1250,7 +1250,7 @@ static void tgen_movcond(TCGContext *s, TCGType type, TCGCond c, TCGReg dest,
     }
 }
 
-bool tcg_target_deposit_valid(int ofs, int len)
+bool tcg_target_have_gen_inst(void)
 {
     return (facilities & FACILITY_GEN_INST_EXT) != 0;
 }
@@ -1263,6 +1263,12 @@ static void tgen_deposit(TCGContext *s, TCGReg dest, TCGReg src,
     tcg_out_risbg(s, dest, src, msb, lsb, ofs, 0);
 }
 
+static void tgen_extract(TCGContext *s, TCGReg dest, TCGReg src,
+                         int ofs, int len)
+{
+    tcg_out_risbg(s, dest, src, 64 - len, 63, 64 - ofs, 1);
+}
+
 static void tgen_gotoi(TCGContext *s, int cc, tcg_insn_unit *dest)
 {
     ptrdiff_t off = dest - s->code_ptr;
@@ -2169,6 +2175,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     OP_32_64(deposit):
         tgen_deposit(s, args[0], args[2], args[3], args[4]);
         break;
+    OP_32_64(extract):
+        tgen_extract(s, args[0], args[1], args[2], args[3]);
+        break;
 
     case INDEX_op_mb:
         /* The host memory model is quite strong, we simply need to
@@ -2238,6 +2247,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_setcond_i32, { "r", "r", "rC" } },
     { INDEX_op_movcond_i32, { "r", "r", "rC", "r", "0" } },
     { INDEX_op_deposit_i32, { "r", "0", "r" } },
+    { INDEX_op_extract_i32, { "r", "r" } },
 
     { INDEX_op_qemu_ld_i32, { "r", "L" } },
     { INDEX_op_qemu_ld_i64, { "r", "L" } },
@@ -2299,6 +2309,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_setcond_i64, { "r", "r", "rC" } },
     { INDEX_op_movcond_i64, { "r", "r", "rC", "r", "0" } },
     { INDEX_op_deposit_i64, { "r", "0", "r" } },
+    { INDEX_op_extract_i64, { "r", "r" } },
 
     { INDEX_op_mb, { } },
     { -1 },
-- 
2.7.4


* [Qemu-devel] [PATCH 10/15] target-alpha: Use deposit and extract ops
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
                   ` (8 preceding siblings ...)
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 09/15] tcg/s390: " Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 11/15] target-arm: Use tcg_gen_*extract Richard Henderson
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
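Note for readers of the hunk below: for a literal shift count, the
old EXTxL expansion was a right shift followed by a ZAPNOT with the
byte mask, and the new one is a single extract whose length is
clamped at the top of the register.  The sketch checks that the two
agree for the byte masks 0x01, 0x03, 0x0f and 0xff, which I assume
are the ones the EXTBL/EXTWL/EXTLL/EXTQL callers pass.  All function
names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Count trailing one bits of an 8-bit byte mask (cf. QEMU's cto32). */
unsigned count_trailing_ones8(uint8_t m)
{
    unsigned n = 0;

    while (m & 1) {
        n++;
        m >>= 1;
    }
    return n;
}

/* ZAPNOT: keep the bytes whose bit is set in byte_mask, clear the rest. */
uint64_t zapnot(uint64_t v, uint8_t byte_mask)
{
    uint64_t keep = 0;

    for (unsigned i = 0; i < 8; i++) {
        if (byte_mask & (1u << i)) {
            keep |= UINT64_C(0xff) << (i * 8);
        }
    }
    return v & keep;
}

/* Old expansion: shift right by the byte offset, then zap. */
uint64_t extl_shift_zap(uint64_t va, unsigned lit, uint8_t byte_mask)
{
    return zapnot(va >> ((lit & 7) * 8), byte_mask);
}

/* New expansion: one extract with the length clamped to the register. */
uint64_t extl_extract(uint64_t va, unsigned lit, uint8_t byte_mask)
{
    unsigned pos = (lit & 7) * 8;
    unsigned len = count_trailing_ones8(byte_mask) * 8;

    assert(len >= 8);
    if (pos + len >= 64) {
        len = 64 - pos;
    }
    return (va >> pos) & (~UINT64_C(0) >> (64 - len));
}

/* Returns 1 if both expansions agree for every literal and byte mask. */
int extl_expansions_agree(uint64_t va)
{
    static const uint8_t masks[] = { 0x01, 0x03, 0x0f, 0xff };

    for (unsigned lit = 0; lit < 8; lit++) {
        for (unsigned i = 0; i < 4; i++) {
            if (extl_shift_zap(va, lit, masks[i])
                != extl_extract(va, lit, masks[i])) {
                return 0;
            }
        }
    }
    return 1;
}
```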
 target-alpha/translate.c | 67 ++++++++++++++++++++++++++++++------------------
 1 file changed, 42 insertions(+), 25 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index af717ca..a341729 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -953,7 +953,13 @@ static void gen_ext_h(DisasContext *ctx, TCGv vc, TCGv va, int rb, bool islit,
                       uint8_t lit, uint8_t byte_mask)
 {
     if (islit) {
-        tcg_gen_shli_i64(vc, va, (64 - lit * 8) & 0x3f);
+        int pos = (64 - lit * 8) & 0x3f;
+        int len = cto32(byte_mask) * 8;
+        if (pos < len) {
+            tcg_gen_deposit_i64(vc, load_zero(ctx), va, pos, len - pos);
+        } else {
+            tcg_gen_movi_i64(vc, 0);
+        }
     } else {
         TCGv tmp = tcg_temp_new();
         tcg_gen_shli_i64(tmp, load_gpr(ctx, rb), 3);
@@ -970,38 +976,44 @@ static void gen_ext_l(DisasContext *ctx, TCGv vc, TCGv va, int rb, bool islit,
                       uint8_t lit, uint8_t byte_mask)
 {
     if (islit) {
-        tcg_gen_shri_i64(vc, va, (lit & 7) * 8);
+        int pos = (lit & 7) * 8;
+        int len = cto32(byte_mask) * 8;
+        if (pos + len >= 64) {
+            len = 64 - pos;
+        }
+        tcg_gen_extract_i64(vc, va, pos, len);
     } else {
         TCGv tmp = tcg_temp_new();
         tcg_gen_andi_i64(tmp, load_gpr(ctx, rb), 7);
         tcg_gen_shli_i64(tmp, tmp, 3);
         tcg_gen_shr_i64(vc, va, tmp);
         tcg_temp_free(tmp);
+        gen_zapnoti(vc, vc, byte_mask);
     }
-    gen_zapnoti(vc, vc, byte_mask);
 }
 
 /* INSWH, INSLH, INSQH */
 static void gen_ins_h(DisasContext *ctx, TCGv vc, TCGv va, int rb, bool islit,
                       uint8_t lit, uint8_t byte_mask)
 {
-    TCGv tmp = tcg_temp_new();
-
-    /* The instruction description has us left-shift the byte mask and extract
-       bits <15:8> and apply that zap at the end.  This is equivalent to simply
-       performing the zap first and shifting afterward.  */
-    gen_zapnoti(tmp, va, byte_mask);
-
     if (islit) {
-        lit &= 7;
-        if (unlikely(lit == 0)) {
-            tcg_gen_movi_i64(vc, 0);
+        int pos = 64 - (lit & 7) * 8;
+        int len = cto32(byte_mask) * 8;
+        if (pos < len) {
+            tcg_gen_extract_i64(vc, va, pos, len - pos);
         } else {
-            tcg_gen_shri_i64(vc, tmp, 64 - lit * 8);
+            tcg_gen_movi_i64(vc, 0);
         }
     } else {
+        TCGv tmp = tcg_temp_new();
         TCGv shift = tcg_temp_new();
 
+        /* The instruction description has us left-shift the byte mask
+           and extract bits <15:8> and apply that zap at the end.  This
+           is equivalent to simply performing the zap first and shifting
+           afterward.  */
+        gen_zapnoti(tmp, va, byte_mask);
+
         /* If (B & 7) == 0, we need to shift by 64 and leave a zero.  Do this
            portably by splitting the shift into two parts: shift_count-1 and 1.
            Arrange for the -1 by using ones-complement instead of
@@ -1014,32 +1026,37 @@ static void gen_ins_h(DisasContext *ctx, TCGv vc, TCGv va, int rb, bool islit,
         tcg_gen_shr_i64(vc, tmp, shift);
         tcg_gen_shri_i64(vc, vc, 1);
         tcg_temp_free(shift);
+        tcg_temp_free(tmp);
     }
-    tcg_temp_free(tmp);
 }
 
 /* INSBL, INSWL, INSLL, INSQL */
 static void gen_ins_l(DisasContext *ctx, TCGv vc, TCGv va, int rb, bool islit,
                       uint8_t lit, uint8_t byte_mask)
 {
-    TCGv tmp = tcg_temp_new();
-
-    /* The instruction description has us left-shift the byte mask
-       the same number of byte slots as the data and apply the zap
-       at the end.  This is equivalent to simply performing the zap
-       first and shifting afterward.  */
-    gen_zapnoti(tmp, va, byte_mask);
-
     if (islit) {
-        tcg_gen_shli_i64(vc, tmp, (lit & 7) * 8);
+        int pos = (lit & 7) * 8;
+        int len = cto32(byte_mask) * 8;
+        if (pos + len > 64) {
+            len = 64 - pos;
+        }
+        tcg_gen_deposit_i64(vc, load_zero(ctx), va, pos, len);
     } else {
+        TCGv tmp = tcg_temp_new();
         TCGv shift = tcg_temp_new();
+
+        /* The instruction description has us left-shift the byte mask
+           the same number of byte slots as the data and apply the zap
+           at the end.  This is equivalent to simply performing the zap
+           first and shifting afterward.  */
+        gen_zapnoti(tmp, va, byte_mask);
+
         tcg_gen_andi_i64(shift, load_gpr(ctx, rb), 7);
         tcg_gen_shli_i64(shift, shift, 3);
         tcg_gen_shl_i64(vc, tmp, shift);
         tcg_temp_free(shift);
+        tcg_temp_free(tmp);
     }
-    tcg_temp_free(tmp);
 }
 
 /* MSKWH, MSKLH, MSKQH */
-- 
2.7.4


* [Qemu-devel] [PATCH 11/15] target-arm: Use tcg_gen_*extract
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
                   ` (9 preceding siblings ...)
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 10/15] target-alpha: Use deposit and extract ops Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 12/15] target-i386: Use tcg_gen_extract_tl Richard Henderson
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell

Use the new primitives for UBFX and SBFX.

Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
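Note for readers of the A64 hunk below: in the ri > si branch, SBFM
places Xn<si:0> at bit 64 - ri and fills everything above it with
the sign.  The patch gets this from one sextract followed by one
deposit of ri bits, instead of a deposit followed by a shift pair.
The sketch models only the 64-bit case and compares it against the
SBFIZ definition.  deposit64(), sextract64_bits() and the two
wrappers are names written for this note, not the QEMU helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Insert the low LEN bits of FIELD into BASE at bit POS. */
uint64_t deposit64(uint64_t base, unsigned pos, unsigned len, uint64_t field)
{
    uint64_t mask;

    assert(len >= 1 && pos + len <= 64);
    mask = (~UINT64_C(0) >> (64 - len)) << pos;
    return (base & ~mask) | ((field << pos) & mask);
}

/* Sign-extended field, returned as a two's complement bit pattern. */
uint64_t sextract64_bits(uint64_t v, unsigned ofs, unsigned len)
{
    uint64_t field, sign;

    assert(len >= 1 && ofs + len <= 64);
    field = (v >> ofs) & (~UINT64_C(0) >> (64 - len));
    sign = UINT64_C(1) << (len - 1);
    return (field ^ sign) - sign;
}

/* 64-bit SBFM with ri > si, following the sequence in the hunk. */
uint64_t sbfm_insert_form(uint64_t src, unsigned ri, unsigned si)
{
    unsigned pos, len;
    uint64_t tmp = src;

    assert(ri > si && ri < 64);
    pos = 64 - ri;
    len = si + 1;
    if (len < ri) {
        /* Widen the field with sign bits so one deposit covers them. */
        tmp = sextract64_bits(tmp, 0, len);
        len = ri;
    }
    return deposit64(0, pos, len, tmp);
}

/* Reference: SBFIZ Xd, Xn, #pos, #width. */
uint64_t sbfiz_reference(uint64_t src, unsigned pos, unsigned width)
{
    return sextract64_bits(src, 0, width) << pos;
}
```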
 target-arm/translate-a64.c | 48 ++++++++++++++++++++++------------------------
 target-arm/translate.c     | 37 ++++++++---------------------------
 2 files changed, 31 insertions(+), 54 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 2d5c1a2..7df4e84 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -3171,11 +3171,8 @@ static void disas_bitfield(DisasContext *s, uint32_t insn)
                 goto done;
             }
         }
-        if (si == 63 || (si == 31 && ri <= si)) { /* ASR */
-            if (si == 31) {
-                tcg_gen_ext32s_i64(tcg_tmp, tcg_tmp);
-            }
-            tcg_gen_sari_i64(tcg_rd, tcg_tmp, ri);
+        if (ri <= si) { /* ASR, SBFX */
+            tcg_gen_sextract_i64(tcg_rd, tcg_tmp, ri, (si - ri) + 1);
             goto done;
         }
     } else if (opc == 2) { /* UBFM */
@@ -3183,41 +3180,42 @@ static void disas_bitfield(DisasContext *s, uint32_t insn)
             tcg_gen_andi_i64(tcg_rd, tcg_tmp, bitmask64(si + 1));
             return;
         }
-        if (si == 63 || (si == 31 && ri <= si)) { /* LSR */
-            if (si == 31) {
-                tcg_gen_ext32u_i64(tcg_tmp, tcg_tmp);
-            }
-            tcg_gen_shri_i64(tcg_rd, tcg_tmp, ri);
-            return;
-        }
         if (si + 1 == ri && si != bitsize - 1) { /* LSL */
             int shift = bitsize - 1 - si;
             tcg_gen_shli_i64(tcg_rd, tcg_tmp, shift);
             goto done;
         }
+        if (ri <= si) { /* UBFX, LSR */
+            tcg_gen_extract_i64(tcg_rd, tcg_tmp, ri, (si - ri) + 1);
+            return;
+        }
     }
 
     if (opc != 1) { /* SBFM or UBFM */
         tcg_gen_movi_i64(tcg_rd, 0);
     }
 
-    /* do the bit move operation */
-    if (si >= ri) {
-        /* Wd<s-r:0> = Wn<s:r> */
-        tcg_gen_shri_i64(tcg_tmp, tcg_tmp, ri);
-        pos = 0;
-        len = (si - ri) + 1;
-    } else {
-        /* Wd<32+s-r,32-r> = Wn<s:0> */
-        pos = bitsize - ri;
-        len = si + 1;
+    /* Do the bit move operation.  Note that above we handled ri <= si,
+       Wd<s-r:0> = Wn<s:r>, via tcg_gen_*extract_i64.  Now we handle
+       the ri > si case, Wd<32+s-r,32-r> = Wn<s:0>, via deposit.  */
+    pos = bitsize - ri;
+    len = si + 1;
+
+    if (opc == 0 && len < ri) {
+        /* SBFM - sign extend the destination field from len to fill
+           the balance of the word.  Let the deposit below insert all
+           of those sign bits.  */
+        tcg_gen_sextract_i64(tcg_tmp, tcg_tmp, 0, len);
+        len = ri;
     }
 
     tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, pos, len);
 
-    if (opc == 0) { /* SBFM - sign extend the destination field */
-        tcg_gen_shli_i64(tcg_rd, tcg_rd, 64 - (pos + len));
-        tcg_gen_sari_i64(tcg_rd, tcg_rd, 64 - (pos + len));
+    if (opc != 1) {
+        /* SBFM or UBFM: We started with zero above, and we haven't
+           modified any bits outside bitsize, therefore the zero-extension
+           below is unneeded.  */
+        return;
     }
 
  done:
diff --git a/target-arm/translate.c b/target-arm/translate.c
index aaf6135..37ad61d 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -297,29 +297,6 @@ static void gen_revsh(TCGv_i32 var)
     tcg_gen_ext16s_i32(var, var);
 }
 
-/* Unsigned bitfield extract.  */
-static void gen_ubfx(TCGv_i32 var, int shift, uint32_t mask)
-{
-    if (shift)
-        tcg_gen_shri_i32(var, var, shift);
-    tcg_gen_andi_i32(var, var, mask);
-}
-
-/* Signed bitfield extract.  */
-static void gen_sbfx(TCGv_i32 var, int shift, int width)
-{
-    uint32_t signbit;
-
-    if (shift)
-        tcg_gen_sari_i32(var, var, shift);
-    if (shift + width < 32) {
-        signbit = 1u << (width - 1);
-        tcg_gen_andi_i32(var, var, (1u << width) - 1);
-        tcg_gen_xori_i32(var, var, signbit);
-        tcg_gen_subi_i32(var, var, signbit);
-    }
-}
-
 /* Return (b << 32) + a. Mark inputs as dead */
 static TCGv_i64 gen_addq_msw(TCGv_i64 a, TCGv_i32 b)
 {
@@ -9234,9 +9211,9 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                             goto illegal_op;
                         if (i < 32) {
                             if (op1 & 0x20) {
-                                gen_ubfx(tmp, shift, (1u << i) - 1);
+                                tcg_gen_extract_i32(tmp, tmp, shift, i);
                             } else {
-                                gen_sbfx(tmp, shift, i);
+                                tcg_gen_sextract_i32(tmp, tmp, shift, i);
                             }
                         }
                         store_reg(s, rd, tmp);
@@ -10551,15 +10528,17 @@ static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw
                         imm++;
                         if (shift + imm > 32)
                             goto illegal_op;
-                        if (imm < 32)
-                            gen_sbfx(tmp, shift, imm);
+                        if (imm < 32) {
+                            tcg_gen_sextract_i32(tmp, tmp, shift, imm);
+                        }
                         break;
                     case 6: /* Unsigned bitfield extract.  */
                         imm++;
                         if (shift + imm > 32)
                             goto illegal_op;
-                        if (imm < 32)
-                            gen_ubfx(tmp, shift, (1u << imm) - 1);
+                        if (imm < 32) {
+                            tcg_gen_extract_i32(tmp, tmp, shift, imm);
+                        }
                         break;
                     case 3: /* Bitfield insert/clear.  */
                         if (imm < shift)
-- 
2.7.4


* [Qemu-devel] [PATCH 12/15] target-i386: Use tcg_gen_extract_tl
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
                   ` (10 preceding siblings ...)
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 11/15] target-arm: Use tcg_gen_*extract Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 13/15] target-mips: Use tcg_gen_extract_* Richard Henderson
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC (permalink / raw)
  To: qemu-devel

Convert three places where it was easy to identify a right-shift
followed by a zero-extend or and-with-immediate.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
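A small host-side sketch of the patterns being folded here, under the
assumption that extract takes len bits starting at bit ofs and
zero-extends them (sextract sign-extends).  extract64 and sextract64
are plain-C models, and the *_old helpers are hypothetical stand-ins
for the two-op sequences being removed.  The flag positions used in
the usage note (bit 0 for carry, bit 11 for overflow) are assumed to
follow the x86 EFLAGS layout.

```c
#include <assert.h>
#include <stdint.h>

/* Model of an unsigned field extract on a 64-bit target word. */
static uint64_t extract64(uint64_t x, unsigned ofs, unsigned len)
{
    assert(len > 0 && len <= 64 && ofs + len <= 64);
    return (x >> ofs) & (~0ull >> (64 - len));
}

/* Model of a signed field extract on a 64-bit target word. */
static uint64_t sextract64(uint64_t x, unsigned ofs, unsigned len)
{
    uint64_t sign = 1ull << (len - 1);
    return (extract64(x, ofs, len) ^ sign) - sign;
}

/* Old high-byte register read (AH/CH/DH/BH): shift right by 8,
   then zero-extend from 8 bits. */
static uint64_t read_high_byte_old(uint64_t reg)
{
    uint64_t t = reg >> 8;
    return t & 0xff;
}

/* Old single-flag read: shift the flag down to bit 0, then and with 1. */
static uint64_t read_flag_old(uint64_t cc_src, unsigned bit)
{
    uint64_t t = cc_src >> bit;
    return t & 1;
}
```

With these, read_high_byte_old(reg) matches extract64(reg, 8, 8), and
read_flag_old(cc, bit) matches extract64(cc, bit, 1).  The movsx case
in the last hunk corresponds to sextract64(reg, 8, 8).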
 target-i386/translate.c | 45 +++++++++++++++++++++++----------------------
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/target-i386/translate.c b/target-i386/translate.c
index 2f60e9c..4a3014c 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -383,8 +383,7 @@ static void gen_op_mov_reg_v(TCGMemOp ot, int reg, TCGv t0)
 static inline void gen_op_mov_v_reg(TCGMemOp ot, TCGv t0, int reg)
 {
     if (ot == MO_8 && byte_reg_is_xH(reg)) {
-        tcg_gen_shri_tl(t0, cpu_regs[reg - 4], 8);
-        tcg_gen_ext8u_tl(t0, t0);
+        tcg_gen_extract_tl(t0, cpu_regs[reg - 4], 8, 8);
     } else {
         tcg_gen_mov_tl(t0, cpu_regs[reg]);
     }
@@ -3715,8 +3714,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 
                     /* Extract the LEN into a mask.  Lengths larger than
                        operand size get all ones.  */
-                    tcg_gen_shri_tl(cpu_A0, cpu_regs[s->vex_v], 8);
-                    tcg_gen_ext8u_tl(cpu_A0, cpu_A0);
+                    tcg_gen_extract_tl(cpu_A0, cpu_regs[s->vex_v], 8, 8);
                     tcg_gen_movcond_tl(TCG_COND_LEU, cpu_A0, cpu_A0, bound,
                                        cpu_A0, bound);
                     tcg_temp_free(bound);
@@ -3867,9 +3865,8 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                             gen_compute_eflags(s);
                         }
                         carry_in = cpu_tmp0;
-                        tcg_gen_shri_tl(carry_in, cpu_cc_src,
-                                        ctz32(b == 0x1f6 ? CC_C : CC_O));
-                        tcg_gen_andi_tl(carry_in, carry_in, 1);
+                        tcg_gen_extract_tl(carry_in, cpu_cc_src,
+                                           ctz32(b == 0x1f6 ? CC_C : CC_O), 1);
                     }
 
                     switch (ot) {
@@ -5340,21 +5337,25 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
             rm = (modrm & 7) | REX_B(s);
 
             if (mod == 3) {
-                gen_op_mov_v_reg(ot, cpu_T0, rm);
-                switch (s_ot) {
-                case MO_UB:
-                    tcg_gen_ext8u_tl(cpu_T0, cpu_T0);
-                    break;
-                case MO_SB:
-                    tcg_gen_ext8s_tl(cpu_T0, cpu_T0);
-                    break;
-                case MO_UW:
-                    tcg_gen_ext16u_tl(cpu_T0, cpu_T0);
-                    break;
-                default:
-                case MO_SW:
-                    tcg_gen_ext16s_tl(cpu_T0, cpu_T0);
-                    break;
+                if (s_ot == MO_SB && byte_reg_is_xH(rm)) {
+                    tcg_gen_sextract_tl(cpu_T0, cpu_regs[rm - 4], 8, 8);
+                } else {
+                    gen_op_mov_v_reg(ot, cpu_T0, rm);
+                    switch (s_ot) {
+                    case MO_UB:
+                        tcg_gen_ext8u_tl(cpu_T0, cpu_T0);
+                        break;
+                    case MO_SB:
+                        tcg_gen_ext8s_tl(cpu_T0, cpu_T0);
+                        break;
+                    case MO_UW:
+                        tcg_gen_ext16u_tl(cpu_T0, cpu_T0);
+                        break;
+                    default:
+                    case MO_SW:
+                        tcg_gen_ext16s_tl(cpu_T0, cpu_T0);
+                        break;
+                    }
                 }
                 gen_op_mov_reg_v(d_ot, reg, cpu_T0);
             } else {
-- 
2.7.4


* [Qemu-devel] [PATCH 13/15] target-mips: Use tcg_gen_extract_*
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
                   ` (11 preceding siblings ...)
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 12/15] target-i386: Use tcg_gen_extract_tl Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 14/15] target-ppc: " Richard Henderson
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: Yongbok Kim

Use the new primitives for EXT and DEXT.

Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
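A host-side sketch of the 32-bit EXT case on a 64-bit target, to make
the "msb == 31 implies lsb == 0" reasoning concrete.  mips_ext_model is
a hypothetical helper written for this note; msb is the encoded field
size minus one, as in gen_bitops.  For msb != 31 the zero-extended
field has bit 31 clear, so no separate sign-extension to 64 bits is
needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns false for the encodings gen_bitops rejects, otherwise
   stores the 64-bit register result in *out. */
static bool mips_ext_model(uint64_t rs, unsigned lsb, unsigned msb,
                           uint64_t *out)
{
    if (lsb + msb > 31) {
        return false;
    }
    if (msb != 31) {
        /* Field is at most 31 bits wide: plain unsigned extract. */
        *out = (rs >> lsb) & ((1ull << (msb + 1)) - 1);
    } else {
        /* lsb + 31 <= 31 forces lsb == 0, so the field is the whole
           low word; sign-extend it from 32 to 64 bits. */
        assert(lsb == 0);
        uint64_t lo = rs & 0xffffffffull;
        *out = (lo ^ 0x80000000ull) - 0x80000000ull;
    }
    return true;
}
```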
 target-mips/translate.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index d8dde7a..cf79aa4 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -4484,11 +4484,12 @@ static void gen_bitops (DisasContext *ctx, uint32_t opc, int rt,
         if (lsb + msb > 31) {
             goto fail;
         }
-        tcg_gen_shri_tl(t0, t1, lsb);
         if (msb != 31) {
-            tcg_gen_andi_tl(t0, t0, (1U << (msb + 1)) - 1);
+            tcg_gen_extract_tl(t0, t1, lsb, msb + 1);
         } else {
-            tcg_gen_ext32s_tl(t0, t0);
+            /* The two checks together imply that lsb == 0,
+               so this is a simple sign-extension.  */
+            tcg_gen_ext32s_tl(t0, t1);
         }
         break;
 #if defined(TARGET_MIPS64)
@@ -4503,10 +4504,7 @@ static void gen_bitops (DisasContext *ctx, uint32_t opc, int rt,
         if (lsb + msb > 63) {
             goto fail;
         }
-        tcg_gen_shri_tl(t0, t1, lsb);
-        if (msb != 63) {
-            tcg_gen_andi_tl(t0, t0, (1ULL << (msb + 1)) - 1);
-        }
+        tcg_gen_extract_tl(t0, t1, lsb, msb + 1);
         break;
 #endif
     case OPC_INS:
-- 
2.7.4


* [Qemu-devel] [PATCH 14/15] target-ppc: Use tcg_gen_extract_*
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
                   ` (12 preceding siblings ...)
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 13/15] target-mips: Use tcg_gen_extract_* Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-17  3:38   ` [Qemu-devel] [Qemu-ppc] " David Gibson
  2016-10-26  2:59   ` David Gibson
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 15/15] target-s390: Use tcg_gen_extract_i64 Richard Henderson
  2016-10-16  4:09 ` [Qemu-devel] [PATCH 00/15] tcg field extract primitives no-reply
  15 siblings, 2 replies; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc

Use the new primitives for RLWINM and RLDICL.

Cc: qemu-ppc@nongnu.org
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
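A host-side sketch of the rotate-then-mask semantics of rlwinm for the
me == 31 case, checked against a plain-C extract model.  All helper
names (rotl32, mask32, rlwinm, extract32, rlwinm_matches_extract) and
the locals rsh and len are introduced for illustration only.  In this
model a left rotate by sh followed by a low mask of len = 32 - mb bits
selects the field at bit offset rsh = (32 - sh) & 31 of the source,
and the field does not wrap when rsh + len <= 32.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* MASK(mb, me) in IBM bit numbering (bit 0 is the MSB), mb <= me. */
static uint32_t mask32(unsigned mb, unsigned me)
{
    uint32_t from_mb = 0xffffffffu >> mb;         /* bits mb..31 */
    uint32_t upto_me = 0xffffffffu << (31 - me);  /* bits 0..me */
    return from_mb & upto_me;
}

/* Architectural result of rlwinm: rotate left, then and with mask. */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32(rs, sh) & mask32(mb, me);
}

/* Model of an unsigned field extract: len bits at ofs, zero-extended. */
static uint32_t extract32(uint32_t x, unsigned ofs, unsigned len)
{
    assert(len > 0 && len <= 32 && ofs + len <= 32);
    return (x >> ofs) & (0xffffffffu >> (32 - len));
}

/* Returns 1 if, for every (sh, mb) whose field does not wrap,
   rlwinm with me == 31 equals an extract at offset rsh. */
static int rlwinm_matches_extract(uint32_t rs)
{
    for (unsigned sh = 0; sh < 32; sh++) {
        for (unsigned mb = 0; mb < 32; mb++) {
            unsigned len = 32 - mb;
            unsigned rsh = (32 - sh) & 31;
            if (rsh + len > 32) {
                continue;   /* field wraps around; not a plain extract */
            }
            if (rlwinm(rs, sh, mb, 31) != extract32(rs, rsh, len)) {
                return 0;
            }
        }
    }
    return 1;
}
```

For example, rlwinm(0x12345678, 8, 24, 31) rotates the top byte down to
the bottom and keeps it, which is the same as extract32(0x12345678, 24, 8).
The 64-bit rldicl case follows the same pattern with 64 in place of 32.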
 target-ppc/translate.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index bfc1301..724d95c 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -1977,9 +1977,8 @@ static void gen_rlwinm(DisasContext *ctx)
     if (mb == 0 && me == (31 - sh)) {
         tcg_gen_shli_tl(t_ra, t_rs, sh);
         tcg_gen_ext32u_tl(t_ra, t_ra);
-    } else if (sh != 0 && me == 31 && sh == (32 - mb)) {
-        tcg_gen_ext32u_tl(t_ra, t_rs);
-        tcg_gen_shri_tl(t_ra, t_ra, mb);
+    } else if (me == 31 && (me - mb + 1) + sh <= 32) {
+        tcg_gen_extract_tl(t_ra, t_rs, sh, me - mb + 1);
     } else {
         target_ulong mask;
 #if defined(TARGET_PPC64)
@@ -2094,8 +2093,8 @@ static void gen_rldinm(DisasContext *ctx, int mb, int me, int sh)
 
     if (sh != 0 && mb == 0 && me == (63 - sh)) {
         tcg_gen_shli_tl(t_ra, t_rs, sh);
-    } else if (sh != 0 && me == 63 && sh == (64 - mb)) {
-        tcg_gen_shri_tl(t_ra, t_rs, mb);
+    } else if (me == 63 && (me - mb + 1) + sh <= 64) {
+        tcg_gen_extract_tl(t_ra, t_rs, sh, me - mb + 1);
     } else {
         tcg_gen_rotli_tl(t_ra, t_rs, sh);
         tcg_gen_andi_tl(t_ra, t_ra, MASK(mb, me));
-- 
2.7.4


* [Qemu-devel] [PATCH 15/15] target-s390: Use tcg_gen_extract_i64
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
                   ` (13 preceding siblings ...)
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 14/15] target-ppc: " Richard Henderson
@ 2016-10-16  3:37 ` Richard Henderson
  2016-10-16  4:09 ` [Qemu-devel] [PATCH 00/15] tcg field extract primitives no-reply
  15 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2016-10-16  3:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf

Use the new primitive for RISBG.

Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-s390x/translate.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index 02bc705..477238d 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -3134,20 +3134,26 @@ static ExitStatus op_risbg(DisasContext *s, DisasOps *o)
         }
     }
 
-    /* In some cases we can implement this with deposit, which can be more
-       efficient on some hosts.  */
-    if (~mask == imask && i3 <= i4) {
+    len = i4 - i3 + 1;
+    pos = 63 - i4;
+    rot = i5 & 63;
+
+    /* In some cases we can implement this with extract.  */
+    if (imask == 0 && pos == 0 && len > 0 && rot + len <= 64) {
+        tcg_gen_extract_i64(o->out, o->in2, rot, len);
+        return NO_EXIT;
+    }
+
+    /* In some cases we can implement this with deposit.  */
+    if (~mask == imask && len > 0) {
         if (s->fields->op2 == 0x5d) {
-            i3 += 32, i4 += 32;
+            pos += 32;
         }
         /* Note that we rotate the bits to be inserted to the lsb, not to
            the position as described in the PoO.  */
-        len = i4 - i3 + 1;
-        pos = 63 - i4;
-        rot = (i5 - pos) & 63;
+        rot = (rot - pos) & 63;
     } else {
-        pos = len = -1;
-        rot = i5 & 63;
+        pos = -1;
     }
 
     /* Rotate the input as necessary.  */
-- 
2.7.4


* Re: [Qemu-devel] [PATCH 00/15] tcg field extract primitives
  2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
                   ` (14 preceding siblings ...)
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 15/15] target-s390: Use tcg_gen_extract_i64 Richard Henderson
@ 2016-10-16  4:09 ` no-reply
  15 siblings, 0 replies; 25+ messages in thread
From: no-reply @ 2016-10-16  4:09 UTC (permalink / raw)
  To: rth; +Cc: famz, qemu-devel

Hi,

Your series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH 00/15] tcg field extract primitives
Type: series
Message-id: 1476589070-5792-1-git-send-email-rth@twiddle.net

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git show --no-patch --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
4d07805 target-s390: Use tcg_gen_extract_i64
3c516d8 target-ppc: Use tcg_gen_extract_*
053b5eb target-mips: Use tcg_gen_extract_*
b32052a target-i386: Use tcg_gen_extract_tl
6387628 target-arm: Use tcg_gen_*extract
0214aa9 target-alpha: Use deposit and extract ops
74f3547 tcg/s390: Implement field extraction opcodes
500fe06 tcg/ppc: Implement field extraction opcodes
976f642 tcg/mips: Implement field extraction opcodes
5eaa14b tcg/i386: Implement field extraction opcodes
123a584 tcg/arm: Implement field extraction opcodes
4624eb9 tcg/arm: Move isa detection to tcg-target.h
7ef8856 tcg/aarch64: Implement field extraction opcodes
2e87ea6 tcg: Minor adjustments to deposit expanders
be676b7 tcg: Add field extraction primitives

=== OUTPUT BEGIN ===
Checking PATCH 1/15: tcg: Add field extraction primitives...
ERROR: spaces required around that ':' (ctx:VxE)
#105: FILE: tcg/optimize.c:881:
+        CASE_OP_32_64(extract):
                               ^

ERROR: spaces required around that ':' (ctx:VxE)
#111: FILE: tcg/optimize.c:887:
+        CASE_OP_32_64(sextract):
                                ^

ERROR: spaces required around that ':' (ctx:VxE)
#125: FILE: tcg/optimize.c:1064:
+        CASE_OP_32_64(extract):
                               ^

ERROR: spaces required around that ':' (ctx:VxE)
#133: FILE: tcg/optimize.c:1072:
+        CASE_OP_32_64(sextract):
                                ^

ERROR: space prohibited after that '&&' (ctx:ExW)
#232: FILE: tcg/tcg-op.c:587:
+        && TCG_TARGET_extract_i32_valid(ofs, len)) {
         ^

ERROR: space prohibited after that '&&' (ctx:ExW)
#286: FILE: tcg/tcg-op.c:641:
+        && TCG_TARGET_extract_i32_valid(ofs, len)) {
         ^

ERROR: space prohibited after that '&&' (ctx:ExW)
#384: FILE: tcg/tcg-op.c:1780:
+        && TCG_TARGET_extract_i64_valid(ofs, len)) {
         ^

ERROR: space prohibited after that '&&' (ctx:ExW)
#473: FILE: tcg/tcg-op.c:1869:
+        && TCG_TARGET_extract_i64_valid(ofs, len)) {
         ^

total: 8 errors, 0 warnings, 560 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 2/15: tcg: Minor adjustments to deposit expanders...
Checking PATCH 3/15: tcg/aarch64: Implement field extraction opcodes...
Checking PATCH 4/15: tcg/arm: Move isa detection to tcg-target.h...
WARNING: architecture specific defines should be avoided
#20: FILE: tcg/arm/tcg-target.h:30:
+#ifndef __ARM_ARCH

WARNING: architecture specific defines should be avoided
#21: FILE: tcg/arm/tcg-target.h:31:
+# if defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) \

WARNING: architecture specific defines should be avoided
#40: FILE: tcg/arm/tcg-target.h:50:
+#if defined(__ARM_ARCH_5T__) \

total: 0 errors, 3 warnings, 107 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 5/15: tcg/arm: Implement field extraction opcodes...
Checking PATCH 6/15: tcg/i386: Implement field extraction opcodes...
ERROR: spaces required around that ':' (ctx:VxE)
#51: FILE: tcg/i386/tcg-target.inc.c:2146:
+    OP_32_64(extract):
                      ^

total: 1 errors, 0 warnings, 75 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 7/15: tcg/mips: Implement field extraction opcodes...
Checking PATCH 8/15: tcg/ppc: Implement field extraction opcodes...
Checking PATCH 9/15: tcg/s390: Implement field extraction opcodes...
ERROR: space required after that ',' (ctx:VxV)
#40: FILE: tcg/s390/tcg-target.h:111:
+#define TCG_TARGET_deposit_i32_valid(o,l)  tcg_target_have_gen_inst()
                                       ^

ERROR: space required after that ',' (ctx:VxV)
#41: FILE: tcg/s390/tcg-target.h:112:
+#define TCG_TARGET_deposit_i64_valid(o,l)  tcg_target_have_gen_inst()
                                       ^

ERROR: space required after that ',' (ctx:VxV)
#42: FILE: tcg/s390/tcg-target.h:113:
+#define TCG_TARGET_extract_i32_valid(o,l)  tcg_target_have_gen_inst()
                                       ^

ERROR: space required after that ',' (ctx:VxV)
#43: FILE: tcg/s390/tcg-target.h:114:
+#define TCG_TARGET_extract_i64_valid(o,l)  tcg_target_have_gen_inst()
                                       ^

ERROR: spaces required around that ':' (ctx:VxE)
#77: FILE: tcg/s390/tcg-target.inc.c:2178:
+    OP_32_64(extract):
                      ^

total: 5 errors, 0 warnings, 73 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 10/15: target-alpha: Use deposit and extract ops...
Checking PATCH 11/15: target-arm: Use tcg_gen_*extract...
Checking PATCH 12/15: target-i386: Use tcg_gen_extract_tl...
Checking PATCH 13/15: target-mips: Use tcg_gen_extract_*...
Checking PATCH 14/15: target-ppc: Use tcg_gen_extract_*...
Checking PATCH 15/15: target-s390: Use tcg_gen_extract_i64...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org


* Re: [Qemu-devel] [Qemu-ppc] [PATCH 14/15] target-ppc: Use tcg_gen_extract_*
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 14/15] target-ppc: " Richard Henderson
@ 2016-10-17  3:38   ` David Gibson
  2016-10-17  4:35     ` David Gibson
  2016-10-26  2:59   ` David Gibson
  1 sibling, 1 reply; 25+ messages in thread
From: David Gibson @ 2016-10-17  3:38 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-ppc


On Sat, Oct 15, 2016 at 08:37:49PM -0700, Richard Henderson wrote:
> Use the new primitives for RLWINM and RLDICL.
> 
> Cc: qemu-ppc@nongnu.org
> Signed-off-by: Richard Henderson <rth@twiddle.net>

Applied to ppc-for-2.8, thanks.

> ---
>  target-ppc/translate.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index bfc1301..724d95c 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -1977,9 +1977,8 @@ static void gen_rlwinm(DisasContext *ctx)
>      if (mb == 0 && me == (31 - sh)) {
>          tcg_gen_shli_tl(t_ra, t_rs, sh);
>          tcg_gen_ext32u_tl(t_ra, t_ra);
> -    } else if (sh != 0 && me == 31 && sh == (32 - mb)) {
> -        tcg_gen_ext32u_tl(t_ra, t_rs);
> -        tcg_gen_shri_tl(t_ra, t_ra, mb);
> +    } else if (me == 31 && (me - mb + 1) + sh <= 32) {
> +        tcg_gen_extract_tl(t_ra, t_rs, sh, me - mb + 1);
>      } else {
>          target_ulong mask;
>  #if defined(TARGET_PPC64)
> @@ -2094,8 +2093,8 @@ static void gen_rldinm(DisasContext *ctx, int mb, int me, int sh)
>  
>      if (sh != 0 && mb == 0 && me == (63 - sh)) {
>          tcg_gen_shli_tl(t_ra, t_rs, sh);
> -    } else if (sh != 0 && me == 63 && sh == (64 - mb)) {
> -        tcg_gen_shri_tl(t_ra, t_rs, mb);
> +    } else if (me == 63 && (me - mb + 1) + sh <= 64) {
> +        tcg_gen_extract_tl(t_ra, t_rs, sh, me - mb + 1);
>      } else {
>          tcg_gen_rotli_tl(t_ra, t_rs, sh);
>          tcg_gen_andi_tl(t_ra, t_ra, MASK(mb, me));

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [Qemu-ppc] [PATCH 14/15] target-ppc: Use tcg_gen_extract_*
  2016-10-17  3:38   ` [Qemu-devel] [Qemu-ppc] " David Gibson
@ 2016-10-17  4:35     ` David Gibson
  0 siblings, 0 replies; 25+ messages in thread
From: David Gibson @ 2016-10-17  4:35 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-ppc


On Mon, Oct 17, 2016 at 02:38:06PM +1100, David Gibson wrote:
> On Sat, Oct 15, 2016 at 08:37:49PM -0700, Richard Henderson wrote:
> > Use the new primitives for RLWINM and RLDICL.
> > 
> > Cc: qemu-ppc@nongnu.org
> > Signed-off-by: Richard Henderson <rth@twiddle.net>
> 
> Applied to ppc-for-2.8, thanks.

Uh.. wait.. no, wasn't paying attention to the fact that it needs the
whole series.

> > ---
> >  target-ppc/translate.c | 9 ++++-----
> >  1 file changed, 4 insertions(+), 5 deletions(-)
> > 
> > diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> > index bfc1301..724d95c 100644
> > --- a/target-ppc/translate.c
> > +++ b/target-ppc/translate.c
> > @@ -1977,9 +1977,8 @@ static void gen_rlwinm(DisasContext *ctx)
> >      if (mb == 0 && me == (31 - sh)) {
> >          tcg_gen_shli_tl(t_ra, t_rs, sh);
> >          tcg_gen_ext32u_tl(t_ra, t_ra);
> > -    } else if (sh != 0 && me == 31 && sh == (32 - mb)) {
> > -        tcg_gen_ext32u_tl(t_ra, t_rs);
> > -        tcg_gen_shri_tl(t_ra, t_ra, mb);
> > +    } else if (me == 31 && (me - mb + 1) + sh <= 32) {
> > +        tcg_gen_extract_tl(t_ra, t_rs, sh, me - mb + 1);
> >      } else {
> >          target_ulong mask;
> >  #if defined(TARGET_PPC64)
> > @@ -2094,8 +2093,8 @@ static void gen_rldinm(DisasContext *ctx, int mb, int me, int sh)
> >  
> >      if (sh != 0 && mb == 0 && me == (63 - sh)) {
> >          tcg_gen_shli_tl(t_ra, t_rs, sh);
> > -    } else if (sh != 0 && me == 63 && sh == (64 - mb)) {
> > -        tcg_gen_shri_tl(t_ra, t_rs, mb);
> > +    } else if (me == 63 && (me - mb + 1) + sh <= 64) {
> > +        tcg_gen_extract_tl(t_ra, t_rs, sh, me - mb + 1);
> >      } else {
> >          tcg_gen_rotli_tl(t_ra, t_rs, sh);
> >          tcg_gen_andi_tl(t_ra, t_ra, MASK(mb, me));
> 



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH 03/15] tcg/aarch64: Implement field extraction opcodes
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 03/15] tcg/aarch64: Implement field extraction opcodes Richard Henderson
@ 2016-10-17 15:22   ` Claudio Fontana
  0 siblings, 0 replies; 25+ messages in thread
From: Claudio Fontana @ 2016-10-17 15:22 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 16.10.2016 05:37, Richard Henderson wrote:
> Cc: Claudio Fontana <claudio.fontana@huawei.com>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/aarch64/tcg-target.h     |  8 ++++----
>  tcg/aarch64/tcg-target.inc.c | 14 ++++++++++++++
>  2 files changed, 18 insertions(+), 4 deletions(-)

Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>

> 
> diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
> index 410c31b..4a74bd8 100644
> --- a/tcg/aarch64/tcg-target.h
> +++ b/tcg/aarch64/tcg-target.h
> @@ -63,8 +63,8 @@ typedef enum {
>  #define TCG_TARGET_HAS_nand_i32         0
>  #define TCG_TARGET_HAS_nor_i32          0
>  #define TCG_TARGET_HAS_deposit_i32      1
> -#define TCG_TARGET_HAS_extract_i32      0
> -#define TCG_TARGET_HAS_sextract_i32     0
> +#define TCG_TARGET_HAS_extract_i32      1
> +#define TCG_TARGET_HAS_sextract_i32     1
>  #define TCG_TARGET_HAS_movcond_i32      1
>  #define TCG_TARGET_HAS_add2_i32         1
>  #define TCG_TARGET_HAS_sub2_i32         1
> @@ -95,8 +95,8 @@ typedef enum {
>  #define TCG_TARGET_HAS_nand_i64         0
>  #define TCG_TARGET_HAS_nor_i64          0
>  #define TCG_TARGET_HAS_deposit_i64      1
> -#define TCG_TARGET_HAS_extract_i64      0
> -#define TCG_TARGET_HAS_sextract_i64     0
> +#define TCG_TARGET_HAS_extract_i64      1
> +#define TCG_TARGET_HAS_sextract_i64     1
>  #define TCG_TARGET_HAS_movcond_i64      1
>  #define TCG_TARGET_HAS_add2_i64         1
>  #define TCG_TARGET_HAS_sub2_i64         1
> diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
> index 1939d35..a496b3b 100644
> --- a/tcg/aarch64/tcg-target.inc.c
> +++ b/tcg/aarch64/tcg-target.inc.c
> @@ -1640,6 +1640,16 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
>          tcg_out_dep(s, ext, a0, REG0(2), args[3], args[4]);
>          break;
>  
> +    case INDEX_op_extract_i64:
> +    case INDEX_op_extract_i32:
> +        tcg_out_ubfm(s, ext, a0, a1, a2, args[3]);
> +        break;
> +
> +    case INDEX_op_sextract_i64:
> +    case INDEX_op_sextract_i32:
> +        tcg_out_sbfm(s, ext, a0, a1, a2, args[3]);
> +        break;
> +
>      case INDEX_op_add2_i32:
>          tcg_out_addsub2(s, TCG_TYPE_I32, a0, a1, REG0(2), REG0(3),
>                          (int32_t)args[4], args[5], const_args[4],
> @@ -1785,6 +1795,10 @@ static const TCGTargetOpDef aarch64_op_defs[] = {
>  
>      { INDEX_op_deposit_i32, { "r", "0", "rZ" } },
>      { INDEX_op_deposit_i64, { "r", "0", "rZ" } },
> +    { INDEX_op_extract_i32, { "r", "r" } },
> +    { INDEX_op_extract_i64, { "r", "r" } },
> +    { INDEX_op_sextract_i32, { "r", "r" } },
> +    { INDEX_op_sextract_i64, { "r", "r" } },
>  
>      { INDEX_op_add2_i32, { "r", "r", "rZ", "rZ", "rA", "rMZ" } },
>      { INDEX_op_add2_i64, { "r", "r", "rZ", "rZ", "rA", "rMZ" } },
> 


-- 
Claudio Fontana
Server Virtualization Architect
Huawei Technologies Duesseldorf GmbH
Riesstraße 25 - 80992 München


* Re: [Qemu-devel] [PATCH 02/15] tcg: Minor adjustments to deposit expanders
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 02/15] tcg: Minor adjustments to deposit expanders Richard Henderson
@ 2016-10-17 15:23   ` Claudio Fontana
  0 siblings, 0 replies; 25+ messages in thread
From: Claudio Fontana @ 2016-10-17 15:23 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 16.10.2016 05:37, Richard Henderson wrote:
> Assert that len is not 0.
> 
> Since we have asserted that ofs + len <= N, a later
> check for len == N implies that ofs == 0.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/tcg-op.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
> index cc4a331..39bab98 100644
> --- a/tcg/tcg-op.c
> +++ b/tcg/tcg-op.c
> @@ -543,10 +543,11 @@ void tcg_gen_deposit_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2,
>      TCGv_i32 t1;
>  
>      tcg_debug_assert(ofs < 32);
> +    tcg_debug_assert(len > 0);
>      tcg_debug_assert(len <= 32);
>      tcg_debug_assert(ofs + len <= 32);
>  
> -    if (ofs == 0 && len == 32) {
> +    if (len == 32) {
>          tcg_gen_mov_i32(ret, arg2);
>          return;
>      }
> @@ -1693,10 +1694,11 @@ void tcg_gen_deposit_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2,
>      TCGv_i64 t1;
>  
>      tcg_debug_assert(ofs < 64);
> +    tcg_debug_assert(len > 0);
>      tcg_debug_assert(len <= 64);
>      tcg_debug_assert(ofs + len <= 64);
>  
> -    if (ofs == 0 && len == 64) {
> +    if (len == 64) {
>          tcg_gen_mov_i64(ret, arg2);
>          return;
>      }
> 

Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
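
For readers following the argument in the commit message, here is a minimal
sketch of the same reasoning on plain integers. It is not the TCG expander
itself: deposit32_sketch is a hypothetical helper that mirrors the asserts in
the patch and shows why checking len == 32 alone is enough once
ofs + len <= 32 has been asserted.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical plain-C model of a 32-bit deposit: insert the low `len`
 * bits of arg2 into arg1 at bit offset `ofs` (LSB numbering).
 */
static uint32_t deposit32_sketch(uint32_t arg1, uint32_t arg2,
                                 unsigned ofs, unsigned len)
{
    /* Same preconditions as the patched expander. */
    assert(ofs < 32);
    assert(len > 0);
    assert(len <= 32);
    assert(ofs + len <= 32);

    if (len == 32) {
        /* ofs + len <= 32 with len == 32 forces ofs == 0: a plain move. */
        return arg2;
    }

    /* len is 1..31 here, so the shift below is well defined. */
    uint32_t mask = ((1u << len) - 1) << ofs;
    return (arg1 & ~mask) | ((arg2 << ofs) & mask);
}
```

The len > 0 assert matters for the same reason: with len == 0 the field would
be empty and the general path would build a zero mask for no useful purpose.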

* Re: [Qemu-devel] [PATCH 08/15] tcg/ppc: Implement field extraction opcodes
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 08/15] tcg/ppc: " Richard Henderson
@ 2016-10-26  1:48   ` David Gibson
  0 siblings, 0 replies; 25+ messages in thread
From: David Gibson @ 2016-10-26  1:48 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Sat, Oct 15, 2016 at 08:37:43PM -0700, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  tcg/ppc/tcg-target.h     |  4 ++--
>  tcg/ppc/tcg-target.inc.c | 10 ++++++++++
>  2 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
> index c765d3e..b42c57a 100644
> --- a/tcg/ppc/tcg-target.h
> +++ b/tcg/ppc/tcg-target.h
> @@ -69,7 +69,7 @@ typedef enum {
>  #define TCG_TARGET_HAS_nand_i32         1
>  #define TCG_TARGET_HAS_nor_i32          1
>  #define TCG_TARGET_HAS_deposit_i32      1
> -#define TCG_TARGET_HAS_extract_i32      0
> +#define TCG_TARGET_HAS_extract_i32      1
>  #define TCG_TARGET_HAS_sextract_i32     0
>  #define TCG_TARGET_HAS_movcond_i32      1
>  #define TCG_TARGET_HAS_mulu2_i32        0
> @@ -102,7 +102,7 @@ typedef enum {
>  #define TCG_TARGET_HAS_nand_i64         1
>  #define TCG_TARGET_HAS_nor_i64          1
>  #define TCG_TARGET_HAS_deposit_i64      1
> -#define TCG_TARGET_HAS_extract_i64      0
> +#define TCG_TARGET_HAS_extract_i64      1
>  #define TCG_TARGET_HAS_sextract_i64     0
>  #define TCG_TARGET_HAS_movcond_i64      1
>  #define TCG_TARGET_HAS_add2_i64         1
> diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c
> index a3262cf..7ec54a2 100644
> --- a/tcg/ppc/tcg-target.inc.c
> +++ b/tcg/ppc/tcg-target.inc.c
> @@ -2396,6 +2396,14 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
>          }
>          break;
>  
> +    case INDEX_op_extract_i32:
> +        tcg_out_rlw(s, RLWINM, args[0], args[1],
> +                    32 - args[2], 32 - args[3], 31);
> +        break;
> +    case INDEX_op_extract_i64:
> +        tcg_out_rld(s, RLDICL, args[0], args[1], 64 - args[2], 64 - args[3]);
> +        break;
> +
>      case INDEX_op_movcond_i32:
>          tcg_out_movcond(s, TCG_TYPE_I32, args[5], args[0], args[1], args[2],
>                          args[3], args[4], const_args[2]);
> @@ -2530,6 +2538,7 @@ static const TCGTargetOpDef ppc_op_defs[] = {
>      { INDEX_op_movcond_i32, { "r", "r", "ri", "rZ", "rZ" } },
>  
>      { INDEX_op_deposit_i32, { "r", "0", "rZ" } },
> +    { INDEX_op_extract_i32, { "r", "r" } },
>  
>      { INDEX_op_muluh_i32, { "r", "r", "r" } },
>      { INDEX_op_mulsh_i32, { "r", "r", "r" } },
> @@ -2585,6 +2594,7 @@ static const TCGTargetOpDef ppc_op_defs[] = {
>      { INDEX_op_movcond_i64, { "r", "r", "ri", "rZ", "rZ" } },
>  
>      { INDEX_op_deposit_i64, { "r", "0", "rZ" } },
> +    { INDEX_op_extract_i64, { "r", "r" } },
>  
>      { INDEX_op_mulsh_i64, { "r", "r", "r" } },
>      { INDEX_op_muluh_i64, { "r", "r", "r" } },

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 14/15] target-ppc: Use tcg_gen_extract_*
  2016-10-16  3:37 ` [Qemu-devel] [PATCH 14/15] target-ppc: " Richard Henderson
  2016-10-17  3:38   ` [Qemu-devel] [Qemu-ppc] " David Gibson
@ 2016-10-26  2:59   ` David Gibson
  2016-10-26 15:38     ` Richard Henderson
  1 sibling, 1 reply; 25+ messages in thread
From: David Gibson @ 2016-10-26  2:59 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-ppc

On Sat, Oct 15, 2016 at 08:37:49PM -0700, Richard Henderson wrote:
> Use the new primitives for RLWINM and RLDICL.
> 
> Cc: qemu-ppc@nongnu.org
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-ppc/translate.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index bfc1301..724d95c 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -1977,9 +1977,8 @@ static void gen_rlwinm(DisasContext *ctx)
>      if (mb == 0 && me == (31 - sh)) {
>          tcg_gen_shli_tl(t_ra, t_rs, sh);
>          tcg_gen_ext32u_tl(t_ra, t_ra);
> -    } else if (sh != 0 && me == 31 && sh == (32 - mb)) {
> -        tcg_gen_ext32u_tl(t_ra, t_rs);
> -        tcg_gen_shri_tl(t_ra, t_ra, mb);
> +    } else if (me == 31 && (me - mb + 1) + sh <= 32) {

I'm having trouble figuring out what the second part of this condition
is supposed to be checking for, and it seems like it's too
restrictive.

For example, everything except the LSB of a word would be:
	rlwinm rT,rA,31,1,31
which would fail the test, but it should be fine to implement that
with an extract op.

> +        tcg_gen_extract_tl(t_ra, t_rs, sh, me - mb + 1);
>      } else {
>          target_ulong mask;
>  #if defined(TARGET_PPC64)
> @@ -2094,8 +2093,8 @@ static void gen_rldinm(DisasContext *ctx, int mb, int me, int sh)
>  
>      if (sh != 0 && mb == 0 && me == (63 - sh)) {
>          tcg_gen_shli_tl(t_ra, t_rs, sh);
> -    } else if (sh != 0 && me == 63 && sh == (64 - mb)) {
> -        tcg_gen_shri_tl(t_ra, t_rs, mb);
> +    } else if (me == 63 && (me - mb + 1) + sh <= 64) {
> +        tcg_gen_extract_tl(t_ra, t_rs, sh, me - mb + 1);
>      } else {
>          tcg_gen_rotli_tl(t_ra, t_rs, sh);
>          tcg_gen_andi_tl(t_ra, t_ra, MASK(mb, me));

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 14/15] target-ppc: Use tcg_gen_extract_*
  2016-10-26  2:59   ` David Gibson
@ 2016-10-26 15:38     ` Richard Henderson
  2016-10-27  2:10       ` David Gibson
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2016-10-26 15:38 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel

On 10/25/2016 07:59 PM, David Gibson wrote:
> On Sat, Oct 15, 2016 at 08:37:49PM -0700, Richard Henderson wrote:
>> Use the new primitives for RLWINM and RLDICL.
>>
>> Cc: qemu-ppc@nongnu.org
>> Signed-off-by: Richard Henderson <rth@twiddle.net>
>> ---
>>  target-ppc/translate.c | 9 ++++-----
>>  1 file changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
>> index bfc1301..724d95c 100644
>> --- a/target-ppc/translate.c
>> +++ b/target-ppc/translate.c
>> @@ -1977,9 +1977,8 @@ static void gen_rlwinm(DisasContext *ctx)
>>      if (mb == 0 && me == (31 - sh)) {
>>          tcg_gen_shli_tl(t_ra, t_rs, sh);
>>          tcg_gen_ext32u_tl(t_ra, t_ra);
>> -    } else if (sh != 0 && me == 31 && sh == (32 - mb)) {
>> -        tcg_gen_ext32u_tl(t_ra, t_rs);
>> -        tcg_gen_shri_tl(t_ra, t_ra, mb);
>> +    } else if (me == 31 && (me - mb + 1) + sh <= 32) {
> 
> I'm having trouble figuring out what the second part of this condition
> is supposed to be checking for, and it seems like it's too
> restrictive.
> 
> For example, everything except the LSB of a word would be:
> 	rlwinm rT,rA,31,1,31
> which would fail the test, but it should be fine to implement that
> with an extract op.

It was confusing to me too, which is why I rearranged this in the v2 of this
patchset.  To which thread you also responded yesterday, so...

Anyway, in v2 this looks like

    if (sh != 0 && len > 0 && me == (31 - sh)) {
        tcg_gen_deposit_z_tl(t_ra, t_rs, sh, len);
    } else if (me == 31 && rsh + len <= 32) {
        tcg_gen_extract_tl(t_ra, t_rs, rsh, len);
    } else {

Basically, we're trying to match those combinations of rotate+mask that can be
implemented with shifts instead of real rotations.  That is, the mask doesn't
follow the rotate around the end of the word.
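
A small C sketch of that matching, on plain integers rather than TCG ops. It
assumes len = me - mb + 1 and rsh = (32 - sh) & 31, which the snippet above
does not spell out, so treat those definitions as an assumption. All helper
names here are made up for illustration and are not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by sh (0..31). */
static uint32_t rotl32(uint32_t x, unsigned sh)
{
    return sh ? (x << sh) | (x >> (32 - sh)) : x;
}

/* PowerPC-style mask with bits mb..me set, bit 0 being the MSB.
   Only the non-wrapping case mb <= me is modelled. */
static uint32_t ppc_mask32(unsigned mb, unsigned me)
{
    assert(mb <= me && me <= 31);
    return (0xffffffffu >> mb) & (0xffffffffu << (31 - me));
}

/* Reference semantics of rlwinm: rotate left, then mask. */
static uint32_t rlwinm_ref(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32(rs, sh) & ppc_mask32(mb, me);
}

/* Unsigned field extract: len bits starting at LSB-numbered bit ofs. */
static uint32_t extract32_sketch(uint32_t x, unsigned ofs, unsigned len)
{
    assert(len > 0 && ofs + len <= 32);
    return (x >> ofs) & (0xffffffffu >> (32 - len));
}

/* The v1 test from the patch: compares against the left rotate count. */
static bool v1_can_use_extract(int sh, int mb, int me)
{
    return me == 31 && (me - mb + 1) + sh <= 32;
}

/* The v2 test: compares against the equivalent right shift count. */
static bool can_use_extract(int sh, int mb, int me)
{
    int len = me - mb + 1;
    int rsh = (32 - sh) & 31;   /* assumed definition of rsh */
    return me == 31 && rsh + len <= 32;
}

/* rlwinm implemented as an extract, valid only when the v2 test holds. */
static uint32_t rlwinm_via_extract(uint32_t rs, int sh, int mb, int me)
{
    assert(can_use_extract(sh, mb, me));
    return extract32_sketch(rs, (32 - sh) & 31, me - mb + 1);
}
```

For rlwinm rT,rA,31,1,31 this gives len = 31 and rsh = 1, so the v2 test
accepts it and the result is a 31-bit extract starting at bit 1, while the v1
test computes 31 + 31 and rejects it. A case such as sh = 4, mb = 24, me = 31
is rejected by v2 because the 8-bit field would wrap around the end of the
word.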


r~

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 14/15] target-ppc: Use tcg_gen_extract_*
  2016-10-26 15:38     ` Richard Henderson
@ 2016-10-27  2:10       ` David Gibson
  0 siblings, 0 replies; 25+ messages in thread
From: David Gibson @ 2016-10-27  2:10 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-ppc, qemu-devel

On Wed, Oct 26, 2016 at 08:38:06AM -0700, Richard Henderson wrote:
> On 10/25/2016 07:59 PM, David Gibson wrote:
> > On Sat, Oct 15, 2016 at 08:37:49PM -0700, Richard Henderson wrote:
> >> Use the new primitives for RLWINM and RLDICL.
> >>
> >> Cc: qemu-ppc@nongnu.org
> >> Signed-off-by: Richard Henderson <rth@twiddle.net>
> >> ---
> >>  target-ppc/translate.c | 9 ++++-----
> >>  1 file changed, 4 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> >> index bfc1301..724d95c 100644
> >> --- a/target-ppc/translate.c
> >> +++ b/target-ppc/translate.c
> >> @@ -1977,9 +1977,8 @@ static void gen_rlwinm(DisasContext *ctx)
> >>      if (mb == 0 && me == (31 - sh)) {
> >>          tcg_gen_shli_tl(t_ra, t_rs, sh);
> >>          tcg_gen_ext32u_tl(t_ra, t_ra);
> >> -    } else if (sh != 0 && me == 31 && sh == (32 - mb)) {
> >> -        tcg_gen_ext32u_tl(t_ra, t_rs);
> >> -        tcg_gen_shri_tl(t_ra, t_ra, mb);
> >> +    } else if (me == 31 && (me - mb + 1) + sh <= 32) {
> > 
> > I'm having trouble figuring out what the second part of this condition
> > is supposed to be checking for, and it seems like it's too
> > restrictive.
> > 
> > For example, everything except the LSB of a word would be:
> > 	rlwinm rT,rA,31,1,31
> > which would fail the test, but it should be fine to implement that
> > with an extract op.
> 
> It was confusing to me too, which is why I rearranged this in the v2 of this
> patchset.  To which thread you also responded yesterday, so...

Ah, sorry.  Because I missed it originally, I only had your ping, not
the actual v2 series in my inbox.  When I went back through my archive
to find it, I accidentally picked up v1 instead of v2.

> 
> Anyway, in v2 this looks like
> 
>     if (sh != 0 && len > 0 && me == (31 - sh)) {
>         tcg_gen_deposit_z_tl(t_ra, t_rs, sh, len);
>     } else if (me == 31 && rsh + len <= 32) {
>         tcg_gen_extract_tl(t_ra, t_rs, rsh, len);
>     } else {
> 
> Basically, we're trying to match those combinations of rotate+mask that can be
> implemented with shifts instead of real rotations.  That is, the mask doesn't
> follow the rotate around the end of the word.

OK, that looks correct; the change from sh to rsh is the fix, I think.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

end of thread, other threads:[~2016-10-27  4:24 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-16  3:37 [Qemu-devel] [PATCH 00/15] tcg field extract primitives Richard Henderson
2016-10-16  3:37 ` [Qemu-devel] [PATCH 01/15] tcg: Add field extraction primitives Richard Henderson
2016-10-16  3:37 ` [Qemu-devel] [PATCH 02/15] tcg: Minor adjustments to deposit expanders Richard Henderson
2016-10-17 15:23   ` Claudio Fontana
2016-10-16  3:37 ` [Qemu-devel] [PATCH 03/15] tcg/aarch64: Implement field extraction opcodes Richard Henderson
2016-10-17 15:22   ` Claudio Fontana
2016-10-16  3:37 ` [Qemu-devel] [PATCH 04/15] tcg/arm: Move isa detection to tcg-target.h Richard Henderson
2016-10-16  3:37 ` [Qemu-devel] [PATCH 05/15] tcg/arm: Implement field extraction opcodes Richard Henderson
2016-10-16  3:37 ` [Qemu-devel] [PATCH 06/15] tcg/i386: " Richard Henderson
2016-10-16  3:37 ` [Qemu-devel] [PATCH 07/15] tcg/mips: " Richard Henderson
2016-10-16  3:37 ` [Qemu-devel] [PATCH 08/15] tcg/ppc: " Richard Henderson
2016-10-26  1:48   ` David Gibson
2016-10-16  3:37 ` [Qemu-devel] [PATCH 09/15] tcg/s390: " Richard Henderson
2016-10-16  3:37 ` [Qemu-devel] [PATCH 10/15] target-alpha: Use deposit and extract ops Richard Henderson
2016-10-16  3:37 ` [Qemu-devel] [PATCH 11/15] target-arm: Use tcg_gen_*extract Richard Henderson
2016-10-16  3:37 ` [Qemu-devel] [PATCH 12/15] target-i386: Use tcg_gen_extract_tl Richard Henderson
2016-10-16  3:37 ` [Qemu-devel] [PATCH 13/15] target-mips: Use tcg_gen_extract_* Richard Henderson
2016-10-16  3:37 ` [Qemu-devel] [PATCH 14/15] target-ppc: " Richard Henderson
2016-10-17  3:38   ` [Qemu-devel] [Qemu-ppc] " David Gibson
2016-10-17  4:35     ` David Gibson
2016-10-26  2:59   ` David Gibson
2016-10-26 15:38     ` Richard Henderson
2016-10-27  2:10       ` David Gibson
2016-10-16  3:37 ` [Qemu-devel] [PATCH 15/15] target-s390: Use tcg_gen_extract_i64 Richard Henderson
2016-10-16  4:09 ` [Qemu-devel] [PATCH 00/15] tcg field extract primitives no-reply
