* [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() functions using extract op
@ 2017-05-12 23:38 Philippe Mathieu-Daudé
  2017-05-12 23:38 ` [Qemu-devel] [RFC PATCH v4 1/6] coccinelle: add a script to optimize tcg op using tcg_gen_extract() Philippe Mathieu-Daudé
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-05-12 23:38 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, qemu-ppc, Richard Henderson,
	Alexander Graf, Artyom Tarasenko, Aurelien Jarno, David Gibson,
	Eduardo Habkost, Eric Blake, Laurent Vivier, Laurent Vivier,
	Mark Cave-Ayland, Markus Armbruster, Michael Tokarev,
	Nikunj A Dadhania, Paolo Bonzini, Peter Maydell
  Cc: Philippe Mathieu-Daudé, Markus Elfring, Julia Lawall, Nicolas Palix

* Changes from v3

Tried to fix the wrong previous attempt...
After getting some nice/fast pieces of advice from the Coccinelle folks, I tried
to improve the script (not much inline documentation yet, though).
- correctly check whether a candidate is optimizable
- document as Mersenne number instead of prime (Eric Blake)
- try to write Python code instead of BASIC (Markus Elfring's advice)
- try to reduce regex usage
- try to match the shri(); unrelated(); andi(); pattern to optimize; I was
  surprised to see the alpha diff Coccinelle found.

This is surely not the last version of this patchset, but I think the
generated patches are now correct, and I prefer reviewers to look at the fixed
ones instead of the wrong ones on the ML.
There is still a lot of work to do on the cocci script; it now seems to hang
trying to parse "target/arm/translate.c".
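
For reviewers who do not want to read the SmPL, the "is this optimizable?"
check can be sketched in plain Python as below. This is only an illustration
of the idea, not the script itself: low_bits_count() mirrors the helper in the
script, while mersenne() and extract_len() are names chosen here for the
example. A mask qualifies only if it is a run of contiguous low bits, i.e. a
Mersenne number 2^n - 1, and n then becomes the extract length.

```python
def low_bits_count(value):
    """Count the consecutive set bits starting at bit 0."""
    count = 0
    while value & (1 << count):
        count += 1
    return count


def mersenne(order):
    """Return 2**order - 1, i.e. 'order' contiguous low bits set."""
    return (1 << order) - 1


def extract_len(mask):
    """Return the extract length for 'mask', or None if it cannot be used."""
    length = low_bits_count(mask)
    if length == 0 or mersenne(length) != mask:
        return None  # bit 0 clear, or a hole somewhere in the mask
    return length


# Masks seen in this series, plus a few that must be rejected.
for mask in (0x1, 0xF, 0x7FF, 0x3FFFFFFF, 0xFF00, 0x5, 0x0):
    print(hex(mask), extract_len(mask))
```

Masks such as 0xFF00 or 0x5 are rejected, so a shri/andi pair using them is
left untouched.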

* [v3] (v2 was a resend of the cocci script):

In my first attempt I misunderstood tcg_gen_extract() intrinsics, and Richard
Henderson pointed that out.
In this patchset the cocci script is corrected and clarified; it also prints how
arguments are checked while running.
Also:
- incorrect patches have been removed. (Richard Henderson, Nikunj A Dadhania)
- Coccinelle script licensed GPLv2+ (Eric Blake)
- comment in each commit about how to apply the patch (Eric Blake)
- added Acked-by for m68k (Laurent Vivier)
- Cc: Coccinelle developers.

[v1]

While reviewing a commit from Aurelien Jarno where he optimized a TCG generator
for SH-4 [1] I found the same optimization done on PPC by Nikunj A Dadhania a
few months ago [2].
After asking on the ML about a cocci script [3] I thought it would be easier to
learn about Coccinelle.

citing Aurelien Jarno:
    This doesn't change the generated code on x86, but optimizes it on most
    RISC architectures and makes the code simpler to read.
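
To make the transformation concrete, here is a small Python model, offered as
a sketch under assumptions rather than as QEMU code. shift_and_mask() models
the shri + andi pair on a 64-bit value, and extract() models an unsigned
bitfield extract of "len" bits at offset "ofs", deliberately written as a
shift-left/shift-right pair so that the comparison is not trivial. Both
function names and the MASK64 constant are made up for this example.

```python
import random

MASK64 = (1 << 64) - 1  # model a 64-bit TCG value


def shift_and_mask(value, ofs, length):
    """Model of the shri + andi pair that the patches remove."""
    shifted = (value & MASK64) >> ofs        # shri(ret, arg, ofs)
    return shifted & ((1 << length) - 1)     # andi(ret, ret, 2**len - 1)


def extract(value, ofs, length):
    """Model of an unsigned extract(ret, arg, ofs, len), as shl + shr."""
    assert 0 < length and ofs + length <= 64
    top_aligned = (value << (64 - ofs - length)) & MASK64
    return top_aligned >> (64 - length)


def equivalent(ofs, length, samples=1000, seed=0):
    """Compare both models on pseudo-random 64-bit inputs."""
    rng = random.Random(seed)
    return all(
        shift_and_mask(v, ofs, length) == extract(v, ofs, length)
        for v in (rng.getrandbits(64) for _ in range(samples))
    )


# (ofs, len) pairs taken from the generated patches in this series
for ofs, length in [(8, 1), (28, 4), (52, 11), (48, 15), (16, 16), (29, 30)]:
    print(ofs, length, equivalent(ofs, length))
```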

I actually applied the script using the following command:

$ docker run -v `pwd`:`pwd` -w `pwd` petersenna/coccinelle \
    --sp-file scripts/coccinelle/tcg_gen_extract.cocci \
    --macro-file scripts/cocci-macro-file.h \
    --dir target \
    --in-place

Please review again! thanks.

[1] http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg01466.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05211.html
[3] http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg01499.html

Philippe Mathieu-Daudé (6):
  coccinelle: add a script to optimize tcg op using tcg_gen_extract()
  target/alpha: optimize cvtlq() using extract op
  target/arm: optimize rev16() using extract op
  target/m68k: optimize bcd_flags() using extract op
  target/ppc: optimize various functions using extract op
  target/sparc: optimize various functions using extract op

 scripts/coccinelle/tcg_gen_extract.cocci | 103 +++++++++++++++++++++++++++++++
 target/alpha/translate.c                 |   3 +-
 target/arm/translate-a64.c               |   6 +-
 target/m68k/translate.c                  |   3 +-
 target/ppc/translate.c                   |  21 +++----
 target/ppc/translate/vsx-impl.inc.c      |  24 +++----
 target/sparc/translate.c                 |  15 ++---
 7 files changed, 127 insertions(+), 48 deletions(-)
 create mode 100644 scripts/coccinelle/tcg_gen_extract.cocci

-- 
2.11.0

* [Qemu-devel] [RFC PATCH v4 1/6] coccinelle: add a script to optimize tcg op using tcg_gen_extract()
  2017-05-12 23:38 [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() functions using extract op Philippe Mathieu-Daudé
@ 2017-05-12 23:38 ` Philippe Mathieu-Daudé
  2017-05-15 14:04   ` Eric Blake
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 2/6] target/alpha: optimize cvtlq() using extract op Philippe Mathieu-Daudé
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-05-12 23:38 UTC (permalink / raw)
  To: qemu-devel, Aurelien Jarno, Richard Henderson, Nikunj A Dadhania,
	Eric Blake, Markus Armbruster, Laurent Vivier, Michael Tokarev,
	Eduardo Habkost, Paolo Bonzini
  Cc: Philippe Mathieu-Daudé, Markus Elfring, Julia Lawall, Nicolas Palix

If you have coccinelle installed you can apply this script using:

    $ spatch \
        --sp-file scripts/coccinelle/tcg_gen_extract.cocci \
        --macro-file scripts/cocci-macro-file.h \
        --dir target --in-place

You can also directly use Peter Senna Tschudin's docker image (easier):

    $ docker run -v `pwd`:`pwd` -w `pwd` petersenna/coccinelle \
        --sp-file scripts/coccinelle/tcg_gen_extract.cocci \
        --macro-file scripts/cocci-macro-file.h \
        --dir target --in-place

Then verified that no manual touchups are required.

The following thread was helpful while writing this script:

    https://github.com/coccinelle/coccinelle/issues/86

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
 scripts/coccinelle/tcg_gen_extract.cocci | 103 +++++++++++++++++++++++++++++++
 1 file changed, 103 insertions(+)
 create mode 100644 scripts/coccinelle/tcg_gen_extract.cocci

diff --git a/scripts/coccinelle/tcg_gen_extract.cocci b/scripts/coccinelle/tcg_gen_extract.cocci
new file mode 100644
index 0000000000..37546834ee
--- /dev/null
+++ b/scripts/coccinelle/tcg_gen_extract.cocci
@@ -0,0 +1,103 @@
+// optimize TCG using extract op
+//
+// Copyright: (C) 2017 Philippe Mathieu-Daudé. GPLv2+.
+// Confidence: High
+// Options: --macro-file scripts/cocci-macro-file.h
+//
+// Nikunj A Dadhania optimization:
+// http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05211.html
+// Aurelien Jarno optimization:
+// http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg01466.html
+// Coccinelle helpful issue:
+// https://github.com/coccinelle/coccinelle/issues/86
+
+@initialize:python@
+@@
+import sys
+fd = sys.stderr
+def debug(msg="", trailer="\n"):
+    fd.write("[DBG] " + msg + trailer)
+def low_bits_count(value):
+    bits_count = 0
+    while (value & (1 << bits_count)):
+        bits_count += 1
+    return bits_count
+def Mn(order): # Mersenne number
+    return (1 << order) - 1
+
+@match@ // depends on never match_and_check_reg_used@
+metavariable ret, arg;
+constant ofs, msk;
+expression tcg_arg;
+identifier tcg_func =~ "^tcg_gen_";
+position shr_p, and_p;
+@@
+(
+    tcg_gen_shri_i32@shr_p
+|
+    tcg_gen_shri_i64@shr_p
+|
+    tcg_gen_shri_tl@shr_p
+)(ret, arg, ofs);
+<...
+tcg_func(tcg_arg, ...);
+...>
+(
+    tcg_gen_andi_i32@and_p
+|
+    tcg_gen_andi_i64@and_p
+|
+    tcg_gen_andi_tl@and_p
+)(ret, ret, msk);
+
+@script:python verify_len depends on match@
+ret_s << match.ret;
+msk_s << match.msk;
+shr_p << match.shr_p;
+tcg_func << match.tcg_func;
+tcg_arg << match.tcg_arg;
+extract_len;
+@@
+is_optimizable = False
+debug("candidate at %s:%s" % (shr_p[0].file, shr_p[0].line))
+if tcg_arg == ret_s:
+    debug("  %s() modifies argument '%s'" % (tcg_func, ret_s))
+else:
+    debug("candidate at %s:%s" % (shr_p[0].file, shr_p[0].line))
+    try: # only eval integer, no #define like 'SR_M' (cpp did this, else some headers are missing).
+        msk_v = long(msk_s.strip("UL"), 0)
+        msk_b = low_bits_count(msk_v)
+        if msk_b == 0:
+            debug("  value: 0x%x low_bits: %d" % (msk_v, msk_b))
+        else:
+            debug("  value: 0x%x low_bits: %d [Mersenne number: 0x%x]" % (msk_v, msk_b, Mn(msk_b)))
+            is_optimizable = Mn(msk_b) == msk_v # check low_bits
+            coccinelle.extract_len = "%d" % msk_b
+        debug("  candidate %s optimizable" % ("IS" if is_optimizable else "is NOT"))
+    except:
+        debug("  ERROR (check included headers?)")
+cocci.include_match(is_optimizable)
+debug()
+
+@replacement depends on verify_len@
+metavariable match.ret, match.arg;
+constant match.ofs, match.msk;
+position match.shr_p, match.and_p;
+identifier verify_len.extract_len;
+@@
+(
+-tcg_gen_shri_i32@shr_p(ret, arg, ofs);
++tcg_gen_extract_i32(ret, arg, ofs, extract_len);
+...
+-tcg_gen_andi_i32@and_p(ret, ret, msk);
+|
+-tcg_gen_shri_i64@shr_p(ret, arg, ofs);
++tcg_gen_extract_i64(ret, arg, ofs, extract_len);
+...
+-tcg_gen_andi_i64@and_p(ret, ret, msk);
+|
+-tcg_gen_shri_tl@shr_p(ret, arg, ofs);
++tcg_gen_extract_tl(ret, arg, ofs, extract_len);
+...
+-tcg_gen_andi_tl@and_p(ret, ret, msk);
+)
-- 
2.11.0

* [Qemu-devel] [PATCH v4 2/6] target/alpha: optimize cvtlq() using extract op
  2017-05-12 23:38 [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() functions using extract op Philippe Mathieu-Daudé
  2017-05-12 23:38 ` [Qemu-devel] [RFC PATCH v4 1/6] coccinelle: add a script to optimize tcg op using tcg_gen_extract() Philippe Mathieu-Daudé
@ 2017-05-12 23:38 ` Philippe Mathieu-Daudé
  2017-05-13  0:04   ` Richard Henderson
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 3/6] target/arm: optimize rev16() " Philippe Mathieu-Daudé
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-05-12 23:38 UTC (permalink / raw)
  To: qemu-devel, Aurelien Jarno, Richard Henderson, Laurent Vivier
  Cc: Philippe Mathieu-Daudé

Patch created mechanically using Coccinelle script via:

    $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
        --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
 target/alpha/translate.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index df5d695344..531af4f5b8 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -747,9 +747,8 @@ static void gen_cvtlq(TCGv vc, TCGv vb)
     /* The arithmetic right shift here, plus the sign-extended mask below
        yields a sign-extended result without an explicit ext32s_i64.  */
     tcg_gen_sari_i64(tmp, vb, 32);
-    tcg_gen_shri_i64(vc, vb, 29);
+    tcg_gen_extract_i64(vc, vb, 29, 30);
     tcg_gen_andi_i64(tmp, tmp, (int32_t)0xc0000000);
-    tcg_gen_andi_i64(vc, vc, 0x3fffffff);
     tcg_gen_or_i64(vc, vc, tmp);
 
     tcg_temp_free(tmp);
-- 
2.11.0

* [Qemu-devel] [PATCH v4 3/6] target/arm: optimize rev16() using extract op
  2017-05-12 23:38 [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() functions using extract op Philippe Mathieu-Daudé
  2017-05-12 23:38 ` [Qemu-devel] [RFC PATCH v4 1/6] coccinelle: add a script to optimize tcg op using tcg_gen_extract() Philippe Mathieu-Daudé
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 2/6] target/alpha: optimize cvtlq() using extract op Philippe Mathieu-Daudé
@ 2017-05-12 23:38 ` Philippe Mathieu-Daudé
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 4/6] target/m68k: optimize bcd_flags() " Philippe Mathieu-Daudé
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-05-12 23:38 UTC (permalink / raw)
  To: qemu-devel, Peter Maydell, Aurelien Jarno, Richard Henderson, qemu-arm
  Cc: Philippe Mathieu-Daudé

Patch created mechanically using Coccinelle script via:

    $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
        --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
 target/arm/translate-a64.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 24de30d92c..759b2466ef 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -4038,14 +4038,12 @@ static void handle_rev16(DisasContext *s, unsigned int sf,
     tcg_gen_andi_i64(tcg_tmp, tcg_rn, 0xffff);
     tcg_gen_bswap16_i64(tcg_rd, tcg_tmp);
 
-    tcg_gen_shri_i64(tcg_tmp, tcg_rn, 16);
-    tcg_gen_andi_i64(tcg_tmp, tcg_tmp, 0xffff);
+    tcg_gen_extract_i64(tcg_tmp, tcg_rn, 16, 16);
     tcg_gen_bswap16_i64(tcg_tmp, tcg_tmp);
     tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, 16, 16);
 
     if (sf) {
-        tcg_gen_shri_i64(tcg_tmp, tcg_rn, 32);
-        tcg_gen_andi_i64(tcg_tmp, tcg_tmp, 0xffff);
+        tcg_gen_extract_i64(tcg_tmp, tcg_rn, 32, 16);
         tcg_gen_bswap16_i64(tcg_tmp, tcg_tmp);
         tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, 32, 16);
 
-- 
2.11.0

* [Qemu-devel] [PATCH v4 4/6] target/m68k: optimize bcd_flags() using extract op
  2017-05-12 23:38 [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() functions using extract op Philippe Mathieu-Daudé
                   ` (2 preceding siblings ...)
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 3/6] target/arm: optimize rev16() " Philippe Mathieu-Daudé
@ 2017-05-12 23:38 ` Philippe Mathieu-Daudé
  2017-05-13  0:05   ` Richard Henderson
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 5/6] target/ppc: optimize various functions " Philippe Mathieu-Daudé
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-05-12 23:38 UTC (permalink / raw)
  To: qemu-devel, Aurelien Jarno, Richard Henderson, Laurent Vivier
  Cc: Philippe Mathieu-Daudé

Patch created mechanically using Coccinelle script via:

    $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
        --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Laurent Vivier <laurent@vivier.eu>
---
 target/m68k/translate.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 9f60fbc0db..babb9e2c5b 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -1463,8 +1463,7 @@ static void bcd_flags(TCGv val)
     tcg_gen_andi_i32(QREG_CC_C, val, 0x0ff);
     tcg_gen_or_i32(QREG_CC_Z, QREG_CC_Z, QREG_CC_C);
 
-    tcg_gen_shri_i32(QREG_CC_C, val, 8);
-    tcg_gen_andi_i32(QREG_CC_C, QREG_CC_C, 1);
+    tcg_gen_extract_i32(QREG_CC_C, val, 8, 1);
 
     tcg_gen_mov_i32(QREG_CC_X, QREG_CC_C);
 }
-- 
2.11.0

* [Qemu-devel] [PATCH v4 5/6] target/ppc: optimize various functions using extract op
  2017-05-12 23:38 [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() functions using extract op Philippe Mathieu-Daudé
                   ` (3 preceding siblings ...)
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 4/6] target/m68k: optimize bcd_flags() " Philippe Mathieu-Daudé
@ 2017-05-12 23:38 ` Philippe Mathieu-Daudé
  2017-05-13  0:05   ` Richard Henderson
  2017-05-15  4:12   ` David Gibson
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 6/6] target/sparc: " Philippe Mathieu-Daudé
  2017-05-13  1:16 ` [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() " Julia Lawall
  6 siblings, 2 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-05-12 23:38 UTC (permalink / raw)
  To: qemu-devel, Aurelien Jarno, Richard Henderson, David Gibson,
	Alexander Graf, qemu-ppc
  Cc: Philippe Mathieu-Daudé

Patch created mechanically using Coccinelle script via:

    $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
        --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
 target/ppc/translate.c              | 21 +++++++--------------
 target/ppc/translate/vsx-impl.inc.c | 24 ++++++++----------------
 2 files changed, 15 insertions(+), 30 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index f40b5a1abf..6521365bfa 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -868,8 +868,7 @@ static inline void gen_op_arith_add(DisasContext *ctx, TCGv ret, TCGv arg1,
             }
             tcg_gen_xor_tl(cpu_ca, t0, t1);        /* bits changed w/ carry */
             tcg_temp_free(t1);
-            tcg_gen_shri_tl(cpu_ca, cpu_ca, 32);   /* extract bit 32 */
-            tcg_gen_andi_tl(cpu_ca, cpu_ca, 1);
+            tcg_gen_extract_tl(cpu_ca, cpu_ca, 32, 1);
             if (is_isa300(ctx)) {
                 tcg_gen_mov_tl(cpu_ca32, cpu_ca);
             }
@@ -1399,8 +1398,7 @@ static inline void gen_op_arith_subf(DisasContext *ctx, TCGv ret, TCGv arg1,
             tcg_temp_free(inv1);
             tcg_gen_xor_tl(cpu_ca, t0, t1);         /* bits changes w/ carry */
             tcg_temp_free(t1);
-            tcg_gen_shri_tl(cpu_ca, cpu_ca, 32);    /* extract bit 32 */
-            tcg_gen_andi_tl(cpu_ca, cpu_ca, 1);
+            tcg_gen_extract_tl(cpu_ca, cpu_ca, 32, 1);
             if (is_isa300(ctx)) {
                 tcg_gen_mov_tl(cpu_ca32, cpu_ca);
             }
@@ -4310,8 +4308,7 @@ static void gen_mfsrin(DisasContext *ctx)
 
     CHK_SV;
     t0 = tcg_temp_new();
-    tcg_gen_shri_tl(t0, cpu_gpr[rB(ctx->opcode)], 28);
-    tcg_gen_andi_tl(t0, t0, 0xF);
+    tcg_gen_extract_tl(t0, cpu_gpr[rB(ctx->opcode)], 28, 4);
     gen_helper_load_sr(cpu_gpr[rD(ctx->opcode)], cpu_env, t0);
     tcg_temp_free(t0);
 #endif /* defined(CONFIG_USER_ONLY) */
@@ -4342,8 +4339,7 @@ static void gen_mtsrin(DisasContext *ctx)
     CHK_SV;
 
     t0 = tcg_temp_new();
-    tcg_gen_shri_tl(t0, cpu_gpr[rB(ctx->opcode)], 28);
-    tcg_gen_andi_tl(t0, t0, 0xF);
+    tcg_gen_extract_tl(t0, cpu_gpr[rB(ctx->opcode)], 28, 4);
     gen_helper_store_sr(cpu_env, t0, cpu_gpr[rD(ctx->opcode)]);
     tcg_temp_free(t0);
 #endif /* defined(CONFIG_USER_ONLY) */
@@ -4377,8 +4373,7 @@ static void gen_mfsrin_64b(DisasContext *ctx)
 
     CHK_SV;
     t0 = tcg_temp_new();
-    tcg_gen_shri_tl(t0, cpu_gpr[rB(ctx->opcode)], 28);
-    tcg_gen_andi_tl(t0, t0, 0xF);
+    tcg_gen_extract_tl(t0, cpu_gpr[rB(ctx->opcode)], 28, 4);
     gen_helper_load_sr(cpu_gpr[rD(ctx->opcode)], cpu_env, t0);
     tcg_temp_free(t0);
 #endif /* defined(CONFIG_USER_ONLY) */
@@ -4409,8 +4404,7 @@ static void gen_mtsrin_64b(DisasContext *ctx)
 
     CHK_SV;
     t0 = tcg_temp_new();
-    tcg_gen_shri_tl(t0, cpu_gpr[rB(ctx->opcode)], 28);
-    tcg_gen_andi_tl(t0, t0, 0xF);
+    tcg_gen_extract_tl(t0, cpu_gpr[rB(ctx->opcode)], 28, 4);
     gen_helper_store_sr(cpu_env, t0, cpu_gpr[rS(ctx->opcode)]);
     tcg_temp_free(t0);
 #endif /* defined(CONFIG_USER_ONLY) */
@@ -5383,8 +5377,7 @@ static void gen_mfsri(DisasContext *ctx)
     CHK_SV;
     t0 = tcg_temp_new();
     gen_addr_reg_index(ctx, t0);
-    tcg_gen_shri_tl(t0, t0, 28);
-    tcg_gen_andi_tl(t0, t0, 0xF);
+    tcg_gen_extract_tl(t0, t0, 28, 4);
     gen_helper_load_sr(cpu_gpr[rd], cpu_env, t0);
     tcg_temp_free(t0);
     if (ra != 0 && ra != rd)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 7f12908029..85ed135d44 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1248,8 +1248,7 @@ static void gen_xsxexpdp(DisasContext *ctx)
         gen_exception(ctx, POWERPC_EXCP_VSXU);
         return;
     }
-    tcg_gen_shri_i64(rt, cpu_vsrh(xB(ctx->opcode)), 52);
-    tcg_gen_andi_i64(rt, rt, 0x7FF);
+    tcg_gen_extract_i64(rt, cpu_vsrh(xB(ctx->opcode)), 52, 11);
 }
 
 static void gen_xsxexpqp(DisasContext *ctx)
@@ -1262,8 +1261,7 @@ static void gen_xsxexpqp(DisasContext *ctx)
         gen_exception(ctx, POWERPC_EXCP_VSXU);
         return;
     }
-    tcg_gen_shri_i64(xth, xbh, 48);
-    tcg_gen_andi_i64(xth, xth, 0x7FFF);
+    tcg_gen_extract_i64(xth, xbh, 48, 15);
     tcg_gen_movi_i64(xtl, 0);
 }
 
@@ -1323,8 +1321,7 @@ static void gen_xsxsigdp(DisasContext *ctx)
     zr = tcg_const_i64(0);
     nan = tcg_const_i64(2047);
 
-    tcg_gen_shri_i64(exp, cpu_vsrh(xB(ctx->opcode)), 52);
-    tcg_gen_andi_i64(exp, exp, 0x7FF);
+    tcg_gen_extract_i64(exp, cpu_vsrh(xB(ctx->opcode)), 52, 11);
     tcg_gen_movi_i64(t0, 0x0010000000000000);
     tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
     tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);
@@ -1352,8 +1349,7 @@ static void gen_xsxsigqp(DisasContext *ctx)
     zr = tcg_const_i64(0);
     nan = tcg_const_i64(32767);
 
-    tcg_gen_shri_i64(exp, cpu_vsrh(rB(ctx->opcode) + 32), 48);
-    tcg_gen_andi_i64(exp, exp, 0x7FFF);
+    tcg_gen_extract_i64(exp, cpu_vsrh(rB(ctx->opcode) + 32), 48, 15);
     tcg_gen_movi_i64(t0, 0x0001000000000000);
     tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
     tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);
@@ -1448,10 +1444,8 @@ static void gen_xvxexpdp(DisasContext *ctx)
         gen_exception(ctx, POWERPC_EXCP_VSXU);
         return;
     }
-    tcg_gen_shri_i64(xth, xbh, 52);
-    tcg_gen_andi_i64(xth, xth, 0x7FF);
-    tcg_gen_shri_i64(xtl, xbl, 52);
-    tcg_gen_andi_i64(xtl, xtl, 0x7FF);
+    tcg_gen_extract_i64(xth, xbh, 52, 11);
+    tcg_gen_extract_i64(xtl, xbl, 52, 11);
 }
 
 GEN_VSX_HELPER_2(xvxsigsp, 0x00, 0x04, 0, PPC2_ISA300)
@@ -1474,16 +1468,14 @@ static void gen_xvxsigdp(DisasContext *ctx)
     zr = tcg_const_i64(0);
     nan = tcg_const_i64(2047);
 
-    tcg_gen_shri_i64(exp, xbh, 52);
-    tcg_gen_andi_i64(exp, exp, 0x7FF);
+    tcg_gen_extract_i64(exp, xbh, 52, 11);
     tcg_gen_movi_i64(t0, 0x0010000000000000);
     tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
     tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);
     tcg_gen_andi_i64(xth, xbh, 0x000FFFFFFFFFFFFF);
     tcg_gen_or_i64(xth, xth, t0);
 
-    tcg_gen_shri_i64(exp, xbl, 52);
-    tcg_gen_andi_i64(exp, exp, 0x7FF);
+    tcg_gen_extract_i64(exp, xbl, 52, 11);
     tcg_gen_movi_i64(t0, 0x0010000000000000);
     tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
     tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);
-- 
2.11.0

* [Qemu-devel] [PATCH v4 6/6] target/sparc: optimize various functions using extract op
  2017-05-12 23:38 [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() functions using extract op Philippe Mathieu-Daudé
                   ` (4 preceding siblings ...)
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 5/6] target/ppc: optimize various functions " Philippe Mathieu-Daudé
@ 2017-05-12 23:38 ` Philippe Mathieu-Daudé
  2017-05-13  0:08   ` Richard Henderson
  2017-05-13  1:16 ` [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() " Julia Lawall
  6 siblings, 1 reply; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-05-12 23:38 UTC (permalink / raw)
  To: qemu-devel, Aurelien Jarno, Richard Henderson, Mark Cave-Ayland,
	Artyom Tarasenko
  Cc: Philippe Mathieu-Daudé

Patch created mechanically using Coccinelle script via:

    $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
        --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
 target/sparc/translate.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index aa6734d54e..67a83b77cc 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -380,29 +380,25 @@ static inline void gen_goto_tb(DisasContext *s, int tb_num,
 static inline void gen_mov_reg_N(TCGv reg, TCGv_i32 src)
 {
     tcg_gen_extu_i32_tl(reg, src);
-    tcg_gen_shri_tl(reg, reg, PSR_NEG_SHIFT);
-    tcg_gen_andi_tl(reg, reg, 0x1);
+    tcg_gen_extract_tl(reg, reg, PSR_NEG_SHIFT, 1);
 }
 
 static inline void gen_mov_reg_Z(TCGv reg, TCGv_i32 src)
 {
     tcg_gen_extu_i32_tl(reg, src);
-    tcg_gen_shri_tl(reg, reg, PSR_ZERO_SHIFT);
-    tcg_gen_andi_tl(reg, reg, 0x1);
+    tcg_gen_extract_tl(reg, reg, PSR_ZERO_SHIFT, 1);
 }
 
 static inline void gen_mov_reg_V(TCGv reg, TCGv_i32 src)
 {
     tcg_gen_extu_i32_tl(reg, src);
-    tcg_gen_shri_tl(reg, reg, PSR_OVF_SHIFT);
-    tcg_gen_andi_tl(reg, reg, 0x1);
+    tcg_gen_extract_tl(reg, reg, PSR_OVF_SHIFT, 1);
 }
 
 static inline void gen_mov_reg_C(TCGv reg, TCGv_i32 src)
 {
     tcg_gen_extu_i32_tl(reg, src);
-    tcg_gen_shri_tl(reg, reg, PSR_CARRY_SHIFT);
-    tcg_gen_andi_tl(reg, reg, 0x1);
+    tcg_gen_extract_tl(reg, reg, PSR_CARRY_SHIFT, 1);
 }
 
 static inline void gen_op_add_cc(TCGv dst, TCGv src1, TCGv src2)
@@ -638,8 +634,7 @@ static inline void gen_op_mulscc(TCGv dst, TCGv src1, TCGv src2)
     // env->y = (b2 << 31) | (env->y >> 1);
     tcg_gen_andi_tl(r_temp, cpu_cc_src, 0x1);
     tcg_gen_shli_tl(r_temp, r_temp, 31);
-    tcg_gen_shri_tl(t0, cpu_y, 1);
-    tcg_gen_andi_tl(t0, t0, 0x7fffffff);
+    tcg_gen_extract_tl(t0, cpu_y, 1, 31);
     tcg_gen_or_tl(t0, t0, r_temp);
     tcg_gen_andi_tl(cpu_y, t0, 0xffffffff);
 
-- 
2.11.0

* Re: [Qemu-devel] [PATCH v4 2/6] target/alpha: optimize cvtlq() using extract op
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 2/6] target/alpha: optimize cvtlq() using extract op Philippe Mathieu-Daudé
@ 2017-05-13  0:04   ` Richard Henderson
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2017-05-13  0:04 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, qemu-devel, Aurelien Jarno, Laurent Vivier

On 05/12/2017 04:38 PM, Philippe Mathieu-Daudé wrote:
> Patch created mechanically using Coccinelle script via:
> 
>      $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
>          --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target
> 
> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
>   target/alpha/translate.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/target/alpha/translate.c b/target/alpha/translate.c
> index df5d695344..531af4f5b8 100644
> --- a/target/alpha/translate.c
> +++ b/target/alpha/translate.c
> @@ -747,9 +747,8 @@ static void gen_cvtlq(TCGv vc, TCGv vb)
>       /* The arithmetic right shift here, plus the sign-extended mask below
>          yields a sign-extended result without an explicit ext32s_i64.  */
>       tcg_gen_sari_i64(tmp, vb, 32);
> -    tcg_gen_shri_i64(vc, vb, 29);
> +    tcg_gen_extract_i64(vc, vb, 29, 30);
>       tcg_gen_andi_i64(tmp, tmp, (int32_t)0xc0000000);
> -    tcg_gen_andi_i64(vc, vc, 0x3fffffff);
>       tcg_gen_or_i64(vc, vc, tmp);

While this is accurate, looking at the broader context I think it would be 
better to use a deposit operation for this case.

   tcg_gen_shri_i64(tmp, vb, 29);
   tcg_gen_sari_i64(vc, vb, 32);
   tcg_gen_deposit_i64(vc, vc, tmp, 0, 30);
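
As a sanity check of the suggestion above, the two sequences can be compared
with a small Python model. This is a sketch under assumptions: shri(), sari()
and deposit() are illustrative stand-ins for the 64-bit TCG ops (deposit
replaces "len" bits of its first argument at "ofs" with the low bits of its
second), and cvtlq_original() / cvtlq_deposit() are names invented here.

```python
import random

MASK64 = (1 << 64) - 1


def shri(value, shift):
    """Logical right shift of a 64-bit value."""
    return (value & MASK64) >> shift


def sari(value, shift):
    """Arithmetic right shift of a 64-bit value."""
    value &= MASK64
    if value >> 63:
        value -= 1 << 64  # reinterpret as a negative number
    return (value >> shift) & MASK64


def deposit(dst, src, ofs, length):
    """Replace 'length' bits of dst at 'ofs' with the low bits of src."""
    mask = ((1 << length) - 1) << ofs
    return (dst & ~mask & MASK64) | ((src << ofs) & mask)


def cvtlq_original(vb):
    tmp = sari(vb, 32)
    vc = shri(vb, 29)
    tmp &= 0xFFFFFFFFC0000000  # (int32_t)0xc0000000 sign-extended to 64 bits
    vc &= 0x3FFFFFFF
    return vc | tmp


def cvtlq_deposit(vb):
    tmp = shri(vb, 29)
    vc = sari(vb, 32)
    return deposit(vc, tmp, 0, 30)


rng = random.Random(0)
values = [0, MASK64, 1 << 63] + [rng.getrandbits(64) for _ in range(1000)]
print(all(cvtlq_original(v) == cvtlq_deposit(v) for v in values))
```

In this model both versions agree on every tested input, which is consistent
with the remark that the extract form is accurate while the deposit form is
the tidier one.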


r~

* Re: [Qemu-devel] [PATCH v4 5/6] target/ppc: optimize various functions using extract op
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 5/6] target/ppc: optimize various functions " Philippe Mathieu-Daudé
@ 2017-05-13  0:05   ` Richard Henderson
  2017-05-15  4:12   ` David Gibson
  1 sibling, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2017-05-13  0:05 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé,
	qemu-devel, Aurelien Jarno, David Gibson, Alexander Graf,
	qemu-ppc

On 05/12/2017 04:38 PM, Philippe Mathieu-Daudé wrote:
> Patch created mechanically using Coccinelle script via:
> 
>      $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
>          --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target
> 
> Signed-off-by: Philippe Mathieu-Daudé<f4bug@amsat.org>
> ---
>   target/ppc/translate.c              | 21 +++++++--------------
>   target/ppc/translate/vsx-impl.inc.c | 24 ++++++++----------------
>   2 files changed, 15 insertions(+), 30 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH v4 4/6] target/m68k: optimize bcd_flags() using extract op
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 4/6] target/m68k: optimize bcd_flags() " Philippe Mathieu-Daudé
@ 2017-05-13  0:05   ` Richard Henderson
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2017-05-13  0:05 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, qemu-devel, Aurelien Jarno, Laurent Vivier

On 05/12/2017 04:38 PM, Philippe Mathieu-Daudé wrote:
> Patch created mechanically using Coccinelle script via:
> 
>      $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
>          --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target
> 
> Signed-off-by: Philippe Mathieu-Daudé<f4bug@amsat.org>
> Acked-by: Laurent Vivier<laurent@vivier.eu>
> ---
>   target/m68k/translate.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)


Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH v4 6/6] target/sparc: optimize various functions using extract op
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 6/6] target/sparc: " Philippe Mathieu-Daudé
@ 2017-05-13  0:08   ` Richard Henderson
  2017-07-18  3:18     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2017-05-13  0:08 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé,
	qemu-devel, Aurelien Jarno, Mark Cave-Ayland, Artyom Tarasenko

On 05/12/2017 04:38 PM, Philippe Mathieu-Daudé wrote:
> Patch created mechanically using Coccinelle script via:
> 
>      $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
>          --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target
> 
> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
>   target/sparc/translate.c | 15 +++++----------
>   1 file changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/target/sparc/translate.c b/target/sparc/translate.c
> index aa6734d54e..67a83b77cc 100644
> --- a/target/sparc/translate.c
> +++ b/target/sparc/translate.c
> @@ -380,29 +380,25 @@ static inline void gen_goto_tb(DisasContext *s, int tb_num,
>   static inline void gen_mov_reg_N(TCGv reg, TCGv_i32 src)
>   {
>       tcg_gen_extu_i32_tl(reg, src);
> -    tcg_gen_shri_tl(reg, reg, PSR_NEG_SHIFT);
> -    tcg_gen_andi_tl(reg, reg, 0x1);
> +    tcg_gen_extract_tl(reg, reg, PSR_NEG_SHIFT, 1);
>   }
>   
>   static inline void gen_mov_reg_Z(TCGv reg, TCGv_i32 src)
>   {
>       tcg_gen_extu_i32_tl(reg, src);
> -    tcg_gen_shri_tl(reg, reg, PSR_ZERO_SHIFT);
> -    tcg_gen_andi_tl(reg, reg, 0x1);
> +    tcg_gen_extract_tl(reg, reg, PSR_ZERO_SHIFT, 1);
>   }
>   
>   static inline void gen_mov_reg_V(TCGv reg, TCGv_i32 src)
>   {
>       tcg_gen_extu_i32_tl(reg, src);
> -    tcg_gen_shri_tl(reg, reg, PSR_OVF_SHIFT);
> -    tcg_gen_andi_tl(reg, reg, 0x1);
> +    tcg_gen_extract_tl(reg, reg, PSR_OVF_SHIFT, 1);
>   }
>   
>   static inline void gen_mov_reg_C(TCGv reg, TCGv_i32 src)
>   {
>       tcg_gen_extu_i32_tl(reg, src);
> -    tcg_gen_shri_tl(reg, reg, PSR_CARRY_SHIFT);
> -    tcg_gen_andi_tl(reg, reg, 0x1);
> +    tcg_gen_extract_tl(reg, reg, PSR_CARRY_SHIFT, 1);
>   }
>   

These ones get a

Reviewed-by: Richard Henderson <rth@twiddle.net>

>   static inline void gen_op_add_cc(TCGv dst, TCGv src1, TCGv src2)
> @@ -638,8 +634,7 @@ static inline void gen_op_mulscc(TCGv dst, TCGv src1, TCGv src2)
>       // env->y = (b2 << 31) | (env->y >> 1);
>       tcg_gen_andi_tl(r_temp, cpu_cc_src, 0x1);
>       tcg_gen_shli_tl(r_temp, r_temp, 31);
> -    tcg_gen_shri_tl(t0, cpu_y, 1);
> -    tcg_gen_andi_tl(t0, t0, 0x7fffffff);
> +    tcg_gen_extract_tl(t0, cpu_y, 1, 31);
>       tcg_gen_or_tl(t0, t0, r_temp);
>       tcg_gen_andi_tl(cpu_y, t0, 0xffffffff);

But this should use

   tcg_gen_extract_tl(cpu_y, cpu_y, 1, 31);
   tcg_gen_deposit_tl(cpu_y, cpu_y, cpu_cc_src, 31, 1);
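
The equivalence of the suggested pair with the original sequence can be checked with a small model. This is only a sketch in Python, not QEMU code: `extract` and `deposit` are hypothetical pure-integer stand-ins for what tcg_gen_extract_tl() and tcg_gen_deposit_tl() compute, and a 64-bit target_ulong is assumed.

```python
import random

MASK64 = (1 << 64) - 1  # assumption: 64-bit target_ulong (e.g. sparc64)

def extract(value, pos, length):
    """Model of an unsigned bitfield extract: bits [pos, pos+length) moved down to bit 0."""
    return (value >> pos) & ((1 << length) - 1)

def deposit(dest, src, pos, length):
    """Model of a bitfield deposit: low `length` bits of src written into dest at pos."""
    field = ((1 << length) - 1) << pos
    return (dest & ~field & MASK64) | ((src << pos) & field)

def mulscc_y_original(y, cc_src):
    """The andi/shli/shri/andi/or/andi sequence shown in the patch context."""
    r_temp = (cc_src & 0x1) << 31
    t0 = (y >> 1) & 0x7fffffff
    t0 |= r_temp
    return t0 & 0xffffffff

def mulscc_y_suggested(y, cc_src):
    """The extract + deposit pair proposed above."""
    y = extract(y, 1, 31)
    return deposit(y, cc_src, 31, 1)

# Compare both forms on random 64-bit inputs.
for _ in range(1000):
    y = random.getrandbits(64)
    cc = random.getrandbits(64)
    assert mulscc_y_original(y, cc) == mulscc_y_suggested(y, cc)
```

In this model both forms compute (b2 << 31) | (y >> 1) truncated to 32 bits, which matches the comment in the source.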


r~


* Re: [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() functions using extract op
  2017-05-12 23:38 [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() functions using extract op Philippe Mathieu-Daudé
                   ` (5 preceding siblings ...)
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 6/6] target/sparc: " Philippe Mathieu-Daudé
@ 2017-05-13  1:16 ` Julia Lawall
  6 siblings, 0 replies; 19+ messages in thread
From: Julia Lawall @ 2017-05-13  1:16 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: qemu-devel, qemu-arm, qemu-ppc, Richard Henderson,
	Alexander Graf, Artyom Tarasenko, Aurelien Jarno, David Gibson,
	Eduardo Habkost, Eric Blake, Laurent Vivier, Laurent Vivier,
	Mark Cave-Ayland, Markus Armbruster, Michael Tokarev,
	Nikunj A Dadhania, Paolo Bonzini, Peter Maydell, Markus Elfring,
	Julia Lawall, Nicolas Palix



On Fri, 12 May 2017, Philippe Mathieu-Daudé wrote:

> * Changes from v3
>
> Tried to fix wrong previous attempt...
> After getting some nice/fast pieces of advice from Coccinelle folks, I tried to
> improved the script (not much inline documentation yet although).
> - correctly check if this optimizable?
> - document as Mersenne number instead of prime (Eric Blake)
> - try to write Python code instead of BASIC (Markus Elfring advices)
> - try to reduce regex usage
> - try to match shri(); unrelated(); andi(); pattern to optimize, I was surprised
>   to see the alpha diff Coccinelle found.
>
> This is surely not the last version of this patchset, but I think now the
> generated patches are correct and I prefer reviewers to look at them fixed
> instead of wrong one in the ML.
> Still lot of work to do in the cocci script, now it seems to hang trying to
> parse "target/arm/translate.c".

Try using the arguments --debug and --show-trying.  This will help you see
what rule it is stuck on, and what function.  If the function is just very
complicated and the file is not important for transforming, you may just
want to give up on it by adding, e.g., --timeout 120.

julia


>
> * [v3] (v2 was a resend of the cocci script):
>
> In my first attempt I misunderstood tcg_gen_extract() intrinsics, and Richard
> Henderson pointed that out.
> In this patchset the cocci script is corrected and clarified, it also print how
> arguments are checked while running.
> Also:
> - incorrect patches have been removed. (Richard Henderson, Nikunj A Dadhania)
> - Coccinelle script licensed GPLv2+ (Eric Blake)
> - comment in each commit about how to apply the patch (Eric Blake)
> - added Acked-by for m68k (Laurent Vivier)
> - Cc: Coccinelle developers.
>
> [v1]
>
> While reviewing a commit from Aurelien Jarno where he optimized a TCG generator
> for SH-4 [1] I found the same optimization done on PPC by Nikunj A Dadhania few
> months ago [2].
> After asking on the ML about a cocci script [3] I thought it would be easier to
> learn about Coccinelle.
>
> citing Aurelien Jarno:
>     This doesn't change the generated code on x86, but optimizes it on most
>     RISC architectures and makes the code simpler to read.
>
> I actually applied the script using the following command:
>
> $ docker run -v `pwd`:`pwd` -w `pwd` petersenna/coccinelle \
>     --sp-file scripts/coccinelle/tcg_gen_extract.cocci \
>     --macro-file scripts/cocci-macro-file.h \
>     --dir target \
>     --in-place
>
> Please review again! thanks.
>
> [1] http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg01466.html
> [2] http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05211.html
> [3] http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg01499.html
>
> Philippe Mathieu-Daudé (6):
>   coccinelle: add a script to optimize tcg op using tcg_gen_extract()
>   target/alpha: optimize cvtlq() using extract op
>   target/arm: optimize rev16() using extract op
>   target/m68k: optimize bcd_flags() using extract op
>   target/ppc: optimize various functions using extract op
>   target/sparc: optimize various functions using extract op
>
>  scripts/coccinelle/tcg_gen_extract.cocci | 103 +++++++++++++++++++++++++++++++
>  target/alpha/translate.c                 |   3 +-
>  target/arm/translate-a64.c               |   6 +-
>  target/m68k/translate.c                  |   3 +-
>  target/ppc/translate.c                   |  21 +++----
>  target/ppc/translate/vsx-impl.inc.c      |  24 +++----
>  target/sparc/translate.c                 |  15 ++---
>  7 files changed, 127 insertions(+), 48 deletions(-)
>  create mode 100644 scripts/coccinelle/tcg_gen_extract.cocci
>
> --
> 2.11.0
>
>


* Re: [Qemu-devel] [PATCH v4 5/6] target/ppc: optimize various functions using extract op
  2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 5/6] target/ppc: optimize various functions " Philippe Mathieu-Daudé
  2017-05-13  0:05   ` Richard Henderson
@ 2017-05-15  4:12   ` David Gibson
  2017-05-16  0:02     ` Philippe Mathieu-Daudé
  1 sibling, 1 reply; 19+ messages in thread
From: David Gibson @ 2017-05-15  4:12 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: qemu-devel, Aurelien Jarno, Richard Henderson, Alexander Graf, qemu-ppc


On Fri, May 12, 2017 at 08:38:42PM -0300, Philippe Mathieu-Daudé wrote:
> Patch created mechanically using Coccinelle script via:
> 
>     $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
>         --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target
> 
> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

Acked-by: David Gibson <david@gibson.dropbear.id.au>

Do you want me to merge this via my ppc tree, or is the whole set
going in via some other path?

> ---
>  target/ppc/translate.c              | 21 +++++++--------------
>  target/ppc/translate/vsx-impl.inc.c | 24 ++++++++----------------
>  2 files changed, 15 insertions(+), 30 deletions(-)
> 
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index f40b5a1abf..6521365bfa 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -868,8 +868,7 @@ static inline void gen_op_arith_add(DisasContext *ctx, TCGv ret, TCGv arg1,
>              }
>              tcg_gen_xor_tl(cpu_ca, t0, t1);        /* bits changed w/ carry */
>              tcg_temp_free(t1);
> -            tcg_gen_shri_tl(cpu_ca, cpu_ca, 32);   /* extract bit 32 */
> -            tcg_gen_andi_tl(cpu_ca, cpu_ca, 1);
> +            tcg_gen_extract_tl(cpu_ca, cpu_ca, 32, 1);
>              if (is_isa300(ctx)) {
>                  tcg_gen_mov_tl(cpu_ca32, cpu_ca);
>              }
> @@ -1399,8 +1398,7 @@ static inline void gen_op_arith_subf(DisasContext *ctx, TCGv ret, TCGv arg1,
>              tcg_temp_free(inv1);
>              tcg_gen_xor_tl(cpu_ca, t0, t1);         /* bits changes w/ carry */
>              tcg_temp_free(t1);
> -            tcg_gen_shri_tl(cpu_ca, cpu_ca, 32);    /* extract bit 32 */
> -            tcg_gen_andi_tl(cpu_ca, cpu_ca, 1);
> +            tcg_gen_extract_tl(cpu_ca, cpu_ca, 32, 1);
>              if (is_isa300(ctx)) {
>                  tcg_gen_mov_tl(cpu_ca32, cpu_ca);
>              }
> @@ -4310,8 +4308,7 @@ static void gen_mfsrin(DisasContext *ctx)
>  
>      CHK_SV;
>      t0 = tcg_temp_new();
> -    tcg_gen_shri_tl(t0, cpu_gpr[rB(ctx->opcode)], 28);
> -    tcg_gen_andi_tl(t0, t0, 0xF);
> +    tcg_gen_extract_tl(t0, cpu_gpr[rB(ctx->opcode)], 28, 4);
>      gen_helper_load_sr(cpu_gpr[rD(ctx->opcode)], cpu_env, t0);
>      tcg_temp_free(t0);
>  #endif /* defined(CONFIG_USER_ONLY) */
> @@ -4342,8 +4339,7 @@ static void gen_mtsrin(DisasContext *ctx)
>      CHK_SV;
>  
>      t0 = tcg_temp_new();
> -    tcg_gen_shri_tl(t0, cpu_gpr[rB(ctx->opcode)], 28);
> -    tcg_gen_andi_tl(t0, t0, 0xF);
> +    tcg_gen_extract_tl(t0, cpu_gpr[rB(ctx->opcode)], 28, 4);
>      gen_helper_store_sr(cpu_env, t0, cpu_gpr[rD(ctx->opcode)]);
>      tcg_temp_free(t0);
>  #endif /* defined(CONFIG_USER_ONLY) */
> @@ -4377,8 +4373,7 @@ static void gen_mfsrin_64b(DisasContext *ctx)
>  
>      CHK_SV;
>      t0 = tcg_temp_new();
> -    tcg_gen_shri_tl(t0, cpu_gpr[rB(ctx->opcode)], 28);
> -    tcg_gen_andi_tl(t0, t0, 0xF);
> +    tcg_gen_extract_tl(t0, cpu_gpr[rB(ctx->opcode)], 28, 4);
>      gen_helper_load_sr(cpu_gpr[rD(ctx->opcode)], cpu_env, t0);
>      tcg_temp_free(t0);
>  #endif /* defined(CONFIG_USER_ONLY) */
> @@ -4409,8 +4404,7 @@ static void gen_mtsrin_64b(DisasContext *ctx)
>  
>      CHK_SV;
>      t0 = tcg_temp_new();
> -    tcg_gen_shri_tl(t0, cpu_gpr[rB(ctx->opcode)], 28);
> -    tcg_gen_andi_tl(t0, t0, 0xF);
> +    tcg_gen_extract_tl(t0, cpu_gpr[rB(ctx->opcode)], 28, 4);
>      gen_helper_store_sr(cpu_env, t0, cpu_gpr[rS(ctx->opcode)]);
>      tcg_temp_free(t0);
>  #endif /* defined(CONFIG_USER_ONLY) */
> @@ -5383,8 +5377,7 @@ static void gen_mfsri(DisasContext *ctx)
>      CHK_SV;
>      t0 = tcg_temp_new();
>      gen_addr_reg_index(ctx, t0);
> -    tcg_gen_shri_tl(t0, t0, 28);
> -    tcg_gen_andi_tl(t0, t0, 0xF);
> +    tcg_gen_extract_tl(t0, t0, 28, 4);
>      gen_helper_load_sr(cpu_gpr[rd], cpu_env, t0);
>      tcg_temp_free(t0);
>      if (ra != 0 && ra != rd)
> diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
> index 7f12908029..85ed135d44 100644
> --- a/target/ppc/translate/vsx-impl.inc.c
> +++ b/target/ppc/translate/vsx-impl.inc.c
> @@ -1248,8 +1248,7 @@ static void gen_xsxexpdp(DisasContext *ctx)
>          gen_exception(ctx, POWERPC_EXCP_VSXU);
>          return;
>      }
> -    tcg_gen_shri_i64(rt, cpu_vsrh(xB(ctx->opcode)), 52);
> -    tcg_gen_andi_i64(rt, rt, 0x7FF);
> +    tcg_gen_extract_i64(rt, cpu_vsrh(xB(ctx->opcode)), 52, 11);
>  }
>  
>  static void gen_xsxexpqp(DisasContext *ctx)
> @@ -1262,8 +1261,7 @@ static void gen_xsxexpqp(DisasContext *ctx)
>          gen_exception(ctx, POWERPC_EXCP_VSXU);
>          return;
>      }
> -    tcg_gen_shri_i64(xth, xbh, 48);
> -    tcg_gen_andi_i64(xth, xth, 0x7FFF);
> +    tcg_gen_extract_i64(xth, xbh, 48, 15);
>      tcg_gen_movi_i64(xtl, 0);
>  }
>  
> @@ -1323,8 +1321,7 @@ static void gen_xsxsigdp(DisasContext *ctx)
>      zr = tcg_const_i64(0);
>      nan = tcg_const_i64(2047);
>  
> -    tcg_gen_shri_i64(exp, cpu_vsrh(xB(ctx->opcode)), 52);
> -    tcg_gen_andi_i64(exp, exp, 0x7FF);
> +    tcg_gen_extract_i64(exp, cpu_vsrh(xB(ctx->opcode)), 52, 11);
>      tcg_gen_movi_i64(t0, 0x0010000000000000);
>      tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
>      tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);
> @@ -1352,8 +1349,7 @@ static void gen_xsxsigqp(DisasContext *ctx)
>      zr = tcg_const_i64(0);
>      nan = tcg_const_i64(32767);
>  
> -    tcg_gen_shri_i64(exp, cpu_vsrh(rB(ctx->opcode) + 32), 48);
> -    tcg_gen_andi_i64(exp, exp, 0x7FFF);
> +    tcg_gen_extract_i64(exp, cpu_vsrh(rB(ctx->opcode) + 32), 48, 15);
>      tcg_gen_movi_i64(t0, 0x0001000000000000);
>      tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
>      tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);
> @@ -1448,10 +1444,8 @@ static void gen_xvxexpdp(DisasContext *ctx)
>          gen_exception(ctx, POWERPC_EXCP_VSXU);
>          return;
>      }
> -    tcg_gen_shri_i64(xth, xbh, 52);
> -    tcg_gen_andi_i64(xth, xth, 0x7FF);
> -    tcg_gen_shri_i64(xtl, xbl, 52);
> -    tcg_gen_andi_i64(xtl, xtl, 0x7FF);
> +    tcg_gen_extract_i64(xth, xbh, 52, 11);
> +    tcg_gen_extract_i64(xtl, xbl, 52, 11);
>  }
>  
>  GEN_VSX_HELPER_2(xvxsigsp, 0x00, 0x04, 0, PPC2_ISA300)
> @@ -1474,16 +1468,14 @@ static void gen_xvxsigdp(DisasContext *ctx)
>      zr = tcg_const_i64(0);
>      nan = tcg_const_i64(2047);
>  
> -    tcg_gen_shri_i64(exp, xbh, 52);
> -    tcg_gen_andi_i64(exp, exp, 0x7FF);
> +    tcg_gen_extract_i64(exp, xbh, 52, 11);
>      tcg_gen_movi_i64(t0, 0x0010000000000000);
>      tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
>      tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);
>      tcg_gen_andi_i64(xth, xbh, 0x000FFFFFFFFFFFFF);
>      tcg_gen_or_i64(xth, xth, t0);
>  
> -    tcg_gen_shri_i64(exp, xbl, 52);
> -    tcg_gen_andi_i64(exp, exp, 0x7FF);
> +    tcg_gen_extract_i64(exp, xbl, 52, 11);
>      tcg_gen_movi_i64(t0, 0x0010000000000000);
>      tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
>      tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [Qemu-devel] [RFC PATCH v4 1/6] coccinelle: add a script to optimize tcg op using tcg_gen_extract()
  2017-05-12 23:38 ` [Qemu-devel] [RFC PATCH v4 1/6] coccinelle: add a script to optimize tcg op using tcg_gen_extract() Philippe Mathieu-Daudé
@ 2017-05-15 14:04   ` Eric Blake
  2017-05-15 14:06     ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Blake @ 2017-05-15 14:04 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé,
	qemu-devel, Aurelien Jarno, Richard Henderson, Nikunj A Dadhania,
	Markus Armbruster, Laurent Vivier, Michael Tokarev,
	Eduardo Habkost, Paolo Bonzini
  Cc: Markus Elfring, Julia Lawall, Nicolas Palix


On 05/12/2017 06:38 PM, Philippe Mathieu-Daudé wrote:
> If you have coccinelle installed you can apply this script using:
> 
>     $ spatch \
>         --macro-file scripts/cocci-macro-file.h \
>         --dir target --in-place
> 
> You can also use directly Peter Senna Tschudin docker image (easier):
> 
>     $ docker run -v `pwd`:`pwd` -w `pwd` petersenna/coccinelle \
>         --sp-file scripts/coccinelle/tcg_gen_extract.cocci \
>         --macro-file scripts/cocci-macro-file.h \
>         --dir target --in-place
> 
> Then verified that no manual touchups are required.
> 
> The following thread was helpful while writing this script:
> 
>     https://github.com/coccinelle/coccinelle/issues/86
> 
> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
>  scripts/coccinelle/tcg_gen_extract.cocci | 103 +++++++++++++++++++++++++++++++
>  1 file changed, 103 insertions(+)
>  create mode 100644 scripts/coccinelle/tcg_gen_extract.cocci

It's still not obvious to me whether we want this script in the tree (as
something we plan to rerun regularly to check for regressions), or just
in the commit message (useful for the one-time location of spots to
optimize, but something we don't anticipate repeating).


> +@@
> +import sys
> +fd = sys.stderr
> +def debug(msg="", trailer="\n"):
> +    fd.write("[DBG] " + msg + trailer)
> +def low_bits_count(value):
> +    bits_count = 0
> +    while (value & (1 << bits_count)):
> +        bits_count += 1

Surely python has a faster method than this (after all, we have ctz and
friends in C code)?  But my python is limited enough that I don't know
of one off-hand.
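
One loop-free possibility, offered as a sketch rather than as what the script should necessarily use: int.bit_length() (available since Python 2.7 / 3.1) can count the low set bits directly. The name `low_bits_count_loop` is introduced here only to keep the quoted loop around for comparison, and non-negative inputs are assumed.

```python
def low_bits_count_loop(value):
    """The loop from the script: count contiguous set bits starting at bit 0."""
    bits_count = 0
    while value & (1 << bits_count):
        bits_count += 1
    return bits_count

def low_bits_count(value):
    """Same result without a loop, for non-negative integers."""
    # value ^ (value + 1) has ones in exactly the trailing-ones positions
    # plus the first zero bit above them, so its bit length is count + 1.
    return (value ^ (value + 1)).bit_length() - 1

# Cross-check the two versions on a range of small values.
for v in range(4096):
    assert low_bits_count(v) == low_bits_count_loop(v)
```

For the mask sizes the script sees, the speed difference is unlikely to matter; the second form is mostly shorter.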

> +    return bits_count
> +def Mn(order): # Mersenne number
> +    return (1 << order) - 1

Correct name...


> +else:
> +    debug("candidate at %s:%s" % (shr_p[0].file, shr_p[0].line))
> +    try: # only eval integer, no #define like 'SR_M' (cpp did this, else some headers are missing).
> +        msk_v = long(msk_s.strip("UL"), 0)
> +        msk_b = low_bits_count(msk_v)
> +        if msk_b == 0:
> +            debug("  value: 0x%x low_bits: %d" % (msk_v, msk_b))
> +        else:
> +            debug("  value: 0x%x low_bits: %d [Mersenne prime: 0x%x]" % (msk_v, msk_b, Mn(msk_b)))

...but this name is still wrong.
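
To make the naming point concrete: a mask of the form 2**n - 1 is a Mersenne number whether or not it is prime, and only that form matters for the shri + andi to extract rewrite. The sketch below uses hypothetical helper names (`is_mersenne_number`, `extract_length`, `is_prime`) that are not taken from the script.

```python
def is_mersenne_number(mask):
    """True when mask == 2**n - 1 for some n >= 1, i.e. a solid run of low set bits."""
    return mask > 0 and (mask & (mask + 1)) == 0

def extract_length(mask):
    """Field length n for a mask 2**n - 1, or None when the mask has another shape."""
    return mask.bit_length() if is_mersenne_number(mask) else None

def is_prime(n):
    """Plain trial division, good enough for the small masks used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

# Masks that appear in this series, with the extract length they map to.
for mask in (0x1, 0xF, 0x7FF, 0x7FFF, 0x7fffffff):
    print("0x%x -> len %d, prime: %s" % (mask, extract_length(mask), is_prime(mask)))
```

0xF (15), 0x7FF (2047 = 23 * 89) and 0x7FFF (32767 = 7 * 31 * 151) are all Mersenne numbers but not primes, so "Mersenne prime" in the debug message would mislabel most of the masks the script accepts.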

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




* Re: [Qemu-devel] [RFC PATCH v4 1/6] coccinelle: add a script to optimize tcg op using tcg_gen_extract()
  2017-05-15 14:04   ` Eric Blake
@ 2017-05-15 14:06     ` Paolo Bonzini
  2017-05-15 14:10       ` Laurent Vivier
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2017-05-15 14:06 UTC (permalink / raw)
  To: Eric Blake, Philippe Mathieu-Daudé,
	qemu-devel, Aurelien Jarno, Richard Henderson, Nikunj A Dadhania,
	Markus Armbruster, Laurent Vivier, Michael Tokarev,
	Eduardo Habkost
  Cc: Markus Elfring, Julia Lawall, Nicolas Palix



On 15/05/2017 16:04, Eric Blake wrote:
> On 05/12/2017 06:38 PM, Philippe Mathieu-Daudé wrote:
>> If you have coccinelle installed you can apply this script using:
>>
>>     $ spatch \
>>         --macro-file scripts/cocci-macro-file.h \
>>         --dir target --in-place
>>
>> You can also use directly Peter Senna Tschudin docker image (easier):
>>
>>     $ docker run -v `pwd`:`pwd` -w `pwd` petersenna/coccinelle \
>>         --sp-file scripts/coccinelle/tcg_gen_extract.cocci \
>>         --macro-file scripts/cocci-macro-file.h \
>>         --dir target --in-place
>>
>> Then verified that no manual touchups are required.
>>
>> The following thread was helpful while writing this script:
>>
>>     https://github.com/coccinelle/coccinelle/issues/86
>>
>> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>> ---
>>  scripts/coccinelle/tcg_gen_extract.cocci | 103 +++++++++++++++++++++++++++++++
>>  1 file changed, 103 insertions(+)
>>  create mode 100644 scripts/coccinelle/tcg_gen_extract.cocci
> 
> It's still not obvious to me whether we want this script in the tree (as
> something we plan to rerun regularly to check for regressions), or just
> in the commit message (useful for the one-time location of spots to
> optimize, but something we don't anticipate repeating).

I think it's useful.  New backends can have this issue, plus it shows
some advanced Coccinelle techniques.

Paolo

> 
> 
>> +@@
>> +import sys
>> +fd = sys.stderr
>> +def debug(msg="", trailer="\n"):
>> +    fd.write("[DBG] " + msg + trailer)
>> +def low_bits_count(value):
>> +    bits_count = 0
>> +    while (value & (1 << bits_count)):
>> +        bits_count += 1
> 
> Surely python has a faster method than this (after all, we have ctz and
> friends in C code)?  But my python is limited enough that I don't know
> of one off-hand.
> 
>> +    return bits_count
>> +def Mn(order): # Mersenne number
>> +    return (1 << order) - 1
> 
> Correct name...
> 
> 
>> +else:
>> +    debug("candidate at %s:%s" % (shr_p[0].file, shr_p[0].line))
>> +    try: # only eval integer, no #define like 'SR_M' (cpp did this, else some headers are missing).
>> +        msk_v = long(msk_s.strip("UL"), 0)
>> +        msk_b = low_bits_count(msk_v)
>> +        if msk_b == 0:
>> +            debug("  value: 0x%x low_bits: %d" % (msk_v, msk_b))
>> +        else:
>> +            debug("  value: 0x%x low_bits: %d [Mersenne prime: 0x%x]" % (msk_v, msk_b, Mn(msk_b)))
> 
> ...but this name is still wrong.
> 


* Re: [Qemu-devel] [RFC PATCH v4 1/6] coccinelle: add a script to optimize tcg op using tcg_gen_extract()
  2017-05-15 14:06     ` Paolo Bonzini
@ 2017-05-15 14:10       ` Laurent Vivier
  0 siblings, 0 replies; 19+ messages in thread
From: Laurent Vivier @ 2017-05-15 14:10 UTC (permalink / raw)
  To: Paolo Bonzini, Eric Blake, Philippe Mathieu-Daudé,
	qemu-devel, Aurelien Jarno, Richard Henderson, Nikunj A Dadhania,
	Markus Armbruster, Michael Tokarev, Eduardo Habkost
  Cc: Markus Elfring, Julia Lawall, Nicolas Palix

On 15/05/2017 16:06, Paolo Bonzini wrote:
> 
> 
> On 15/05/2017 16:04, Eric Blake wrote:
>> On 05/12/2017 06:38 PM, Philippe Mathieu-Daudé wrote:
>>> If you have coccinelle installed you can apply this script using:
>>>
>>>     $ spatch \
>>>         --macro-file scripts/cocci-macro-file.h \
>>>         --dir target --in-place
>>>
>>> You can also use directly Peter Senna Tschudin docker image (easier):
>>>
>>>     $ docker run -v `pwd`:`pwd` -w `pwd` petersenna/coccinelle \
>>>         --sp-file scripts/coccinelle/tcg_gen_extract.cocci \
>>>         --macro-file scripts/cocci-macro-file.h \
>>>         --dir target --in-place
>>>
>>> Then verified that no manual touchups are required.
>>>
>>> The following thread was helpful while writing this script:
>>>
>>>     https://github.com/coccinelle/coccinelle/issues/86
>>>
>>> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>>> ---
>>>  scripts/coccinelle/tcg_gen_extract.cocci | 103 +++++++++++++++++++++++++++++++
>>>  1 file changed, 103 insertions(+)
>>>  create mode 100644 scripts/coccinelle/tcg_gen_extract.cocci
>>
>> It's still not obvious to me whether we want this script in the tree (as
>> something we plan to rerun regularly to check for regressions), or just
>> in the commit message (useful for the one-time location of spots to
>> optimize, but something we don't anticipate repeating).
> 
> I think it's useful.  New backends can have this issue, plus it shows
> some advanced Coccinelle techniques.
> 

I agree: I think it's a good idea to have a place in the QEMU tree with
all past Coccinelle scripts, to help with writing new ones.

Laurent


* Re: [Qemu-devel] [PATCH v4 5/6] target/ppc: optimize various functions using extract op
  2017-05-15  4:12   ` David Gibson
@ 2017-05-16  0:02     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-05-16  0:02 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, Aurelien Jarno, Richard Henderson, Alexander Graf, qemu-ppc

Hi David,

On 05/15/2017 01:12 AM, David Gibson wrote:
> On Fri, May 12, 2017 at 08:38:42PM -0300, Philippe Mathieu-Daudé wrote:
>> Patch created mechanically using Coccinelle script via:
>>
>>     $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
>>         --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target
>>
>> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>
> Acked-by: David Gibson <david@gibson.dropbear.id.au>
>
> Do you want me to merge this via my ppc tree, or is the whole set
> going in via some other path?

Thanks for the review!

As you wish. I think it makes sense for this series to go in together via
Richard's tree, once I finish correcting a few details.

Regards,

Phil.


* Re: [Qemu-devel] [PATCH v4 6/6] target/sparc: optimize various functions using extract op
  2017-05-13  0:08   ` Richard Henderson
@ 2017-07-18  3:18     ` Philippe Mathieu-Daudé
  2017-07-18  3:44       ` Richard Henderson
  0 siblings, 1 reply; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-07-18  3:18 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, Aurelien Jarno, Mark Cave-Ayland,
	Artyom Tarasenko

On 05/12/2017 09:08 PM, Richard Henderson wrote:
> On 05/12/2017 04:38 PM, Philippe Mathieu-Daudé wrote:
[...]
>>   static inline void gen_op_add_cc(TCGv dst, TCGv src1, TCGv src2)
>> @@ -638,8 +634,7 @@ static inline void gen_op_mulscc(TCGv dst, TCGv src1, TCGv src2)
>>       // env->y = (b2 << 31) | (env->y >> 1);
>>       tcg_gen_andi_tl(r_temp, cpu_cc_src, 0x1);
>>       tcg_gen_shli_tl(r_temp, r_temp, 31);
>> -    tcg_gen_shri_tl(t0, cpu_y, 1);
>> -    tcg_gen_andi_tl(t0, t0, 0x7fffffff);
>> +    tcg_gen_extract_tl(t0, cpu_y, 1, 31);
>>       tcg_gen_or_tl(t0, t0, r_temp);
>>       tcg_gen_andi_tl(cpu_y, t0, 0xffffffff);

So this 0xffffffff mask is incorrect and should be 0x7fffffff?

> 
> But this should use
> 
>    tcg_gen_extract_tl(cpu_y, cpu_y, 1, 31);
>    tcg_gen_deposit_tl(cpu_y, cpu_y, cpu_cc_src, 31, 1);
> 
> 
> r~


* Re: [Qemu-devel] [PATCH v4 6/6] target/sparc: optimize various functions using extract op
  2017-07-18  3:18     ` Philippe Mathieu-Daudé
@ 2017-07-18  3:44       ` Richard Henderson
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2017-07-18  3:44 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé,
	qemu-devel, Aurelien Jarno, Mark Cave-Ayland, Artyom Tarasenko

On 07/17/2017 05:18 PM, Philippe Mathieu-Daudé wrote:
> On 05/12/2017 09:08 PM, Richard Henderson wrote:
>> On 05/12/2017 04:38 PM, Philippe Mathieu-Daudé wrote:
> [...]
>>>   static inline void gen_op_add_cc(TCGv dst, TCGv src1, TCGv src2)
>>> @@ -638,8 +634,7 @@ static inline void gen_op_mulscc(TCGv dst, TCGv src1, TCGv src2)
>>>       // env->y = (b2 << 31) | (env->y >> 1);
>>>       tcg_gen_andi_tl(r_temp, cpu_cc_src, 0x1);
>>>       tcg_gen_shli_tl(r_temp, r_temp, 31);
>>> -    tcg_gen_shri_tl(t0, cpu_y, 1);
>>> -    tcg_gen_andi_tl(t0, t0, 0x7fffffff);
>>> +    tcg_gen_extract_tl(t0, cpu_y, 1, 31);
>>>       tcg_gen_or_tl(t0, t0, r_temp);
>>>       tcg_gen_andi_tl(cpu_y, t0, 0xffffffff);
> 
> So this 0xffffffff mask is incorrect and should be 0x7fffffff?

No, this has nothing to do with the second andi.
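
One way to read this, sketched in Python as an integer model rather than QEMU code: the 0x7fffffff mask belongs to the shri + andi pair that the extract replaces, while the trailing 0xffffffff mask runs after the or and has to keep bit 31, where b2 was placed. `new_y` and its `final_mask` parameter are hypothetical names used only for this illustration.

```python
def new_y(y, cc_src, final_mask):
    """Model of the quoted sequence, with the last andi mask made a parameter."""
    r_temp = (cc_src & 0x1) << 31   # b2 << 31
    t0 = (y >> 1) & 0x7fffffff      # the shri + andi pair that extract(…, 1, 31) replaces
    t0 |= r_temp
    return t0 & final_mask          # the second andi, untouched by the patch

# With the existing mask, b2 lands in bit 31 as the comment in the source describes.
assert new_y(0x0, 0x1, 0xffffffff) == 0x80000000
# Narrowing the final mask to 0x7fffffff would discard b2 again.
assert new_y(0x0, 0x1, 0x7fffffff) == 0x0
```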


r~


Thread overview: 19+ messages
2017-05-12 23:38 [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() functions using extract op Philippe Mathieu-Daudé
2017-05-12 23:38 ` [Qemu-devel] [RFC PATCH v4 1/6] coccinelle: add a script to optimize tcg op using tcg_gen_extract() Philippe Mathieu-Daudé
2017-05-15 14:04   ` Eric Blake
2017-05-15 14:06     ` Paolo Bonzini
2017-05-15 14:10       ` Laurent Vivier
2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 2/6] target/alpha: optimize cvtlq() using extract op Philippe Mathieu-Daudé
2017-05-13  0:04   ` Richard Henderson
2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 3/6] target/arm: optimize rev16() " Philippe Mathieu-Daudé
2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 4/6] target/m68k: optimize bcd_flags() " Philippe Mathieu-Daudé
2017-05-13  0:05   ` Richard Henderson
2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 5/6] target/ppc: optimize various functions " Philippe Mathieu-Daudé
2017-05-13  0:05   ` Richard Henderson
2017-05-15  4:12   ` David Gibson
2017-05-16  0:02     ` Philippe Mathieu-Daudé
2017-05-12 23:38 ` [Qemu-devel] [PATCH v4 6/6] target/sparc: " Philippe Mathieu-Daudé
2017-05-13  0:08   ` Richard Henderson
2017-07-18  3:18     ` Philippe Mathieu-Daudé
2017-07-18  3:44       ` Richard Henderson
2017-05-13  1:16 ` [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() " Julia Lawall
