* [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements
@ 2013-09-03  6:27 Aurelien Jarno
  2013-09-03  6:27 ` [Qemu-devel] [PATCH 1/4] tcg/optimize: fix know-zero bits optimization Aurelien Jarno
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Aurelien Jarno @ 2013-09-03  6:27 UTC
  To: qemu-devel; +Cc: Paolo Bonzini, Aurelien Jarno, Richard Henderson

This patchset first fixes known-zero bits optimization so that it is
actually used, and does some further optimizations for 32-bit ops and
unsigned loads.

Aurelien Jarno (4):
  tcg/optimize: fix know-zero bits optimization
  tcg/optimize: fix known-zero bits for right shift ops
  tcg/optimize: improve known-zero bits for 32-bit ops
  tcg/optimize: add known-zero bits compute for load ops

 tcg/optimize.c |   48 +++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 43 insertions(+), 5 deletions(-)

-- 
1.7.10.4

* [Qemu-devel] [PATCH 1/4] tcg/optimize: fix know-zero bits optimization
  2013-09-03  6:27 [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements Aurelien Jarno
@ 2013-09-03  6:27 ` Aurelien Jarno
  2013-09-03  9:50   ` Andreas Färber
  2013-09-03  6:27 ` [Qemu-devel] [PATCH 2/4] tcg/optimize: fix known-zero bits for right shift ops Aurelien Jarno
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 10+ messages in thread
From: Aurelien Jarno @ 2013-09-03  6:27 UTC
  To: qemu-devel; +Cc: Paolo Bonzini, Aurelien Jarno, Richard Henderson

Known-zero bits optimization is a great idea that helps to generate more
optimized code. However, the current implementation is basically useless
as the computed mask is not saved.

Fix this so that it actually works.

Cc: Richard Henderson <rth@twiddle.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 tcg/optimize.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index b29bf25..41f2906 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -695,7 +695,8 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             break;
         }
 
-        /* Simplify using known-zero bits */
+        /* Simplify using known-zero bits. Currently only ops with a single
+           output argument is supported. */
         mask = -1;
         affected = -1;
         switch (op) {
@@ -1144,6 +1145,11 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             } else {
                 for (i = 0; i < def->nb_oargs; i++) {
                     reset_temp(args[i]);
+                    /* Save the corresponding known-zero bits mask for the
+                       first output argument (only one supported so far). */
+                    if (i == 0) {
+                        temps[args[i]].mask = mask;
+                    }
                 }
             }
             for (i = 0; i < def->nb_args; i++) {
-- 
1.7.10.4
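
To illustrate why the computed mask has to be saved, here is a minimal
standalone sketch in C. It is not the code from tcg/optimize.c: TempInfo,
record_output() and and_is_redundant() are hypothetical names, and the
convention assumed is that a 0 bit in the mask means "known to be zero".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified per-temp state: a 0 bit in mask means the
   corresponding bit of the value is known to be zero. */
typedef struct {
    uint64_t mask;
} TempInfo;

/* Forget everything about a temp: any bit may be set. */
static void temp_reset(TempInfo *t)
{
    assert(t);
    t->mask = UINT64_MAX;
}

/* Known-zero mask of "dst = src & constant". */
static uint64_t mask_for_and_const(const TempInfo *src, uint64_t constant)
{
    return src->mask & constant;
}

/* Record the output of an op. The reset is always done; the computed
   mask is only kept when save_mask is non-zero. */
static void record_output(TempInfo *dst, uint64_t mask, int save_mask)
{
    temp_reset(dst);
    if (save_mask) {
        dst->mask = mask;
    }
}

/* "x & constant" changes nothing if every bit it would clear is
   already known to be zero in x. */
static int and_is_redundant(const TempInfo *x, uint64_t constant)
{
    return (x->mask & ~constant) == 0;
}
```

With save_mask set to 0 the destination keeps the all-ones mask left by
the reset, so a later "and" with 0xff is never recognized as redundant.
With save_mask set to 1 the same check succeeds.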

* [Qemu-devel] [PATCH 2/4] tcg/optimize: fix known-zero bits for right shift ops
  2013-09-03  6:27 [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements Aurelien Jarno
  2013-09-03  6:27 ` [Qemu-devel] [PATCH 1/4] tcg/optimize: fix know-zero bits optimization Aurelien Jarno
@ 2013-09-03  6:27 ` Aurelien Jarno
  2013-09-03  6:27 ` [Qemu-devel] [PATCH 3/4] tcg/optimize: improve known-zero bits for 32-bit ops Aurelien Jarno
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Aurelien Jarno @ 2013-09-03  6:27 UTC
  To: qemu-devel; +Cc: Paolo Bonzini, Aurelien Jarno, Richard Henderson

32-bit versions of sar and shr ops should not propagate known-zero bits
from the unused 32 high bits. For sar it could even lead to wrong code
being generated.

Cc: Richard Henderson <rth@twiddle.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 tcg/optimize.c |   21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 41f2906..0ed8983 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -731,16 +731,29 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             mask = temps[args[1]].mask & mask;
             break;
 
-        CASE_OP_32_64(sar):
+        case INDEX_op_sar_i32:
+            if (temps[args[2]].state == TCG_TEMP_CONST) {
+                mask = ((int32_t)temps[args[1]].mask
+                        >> temps[args[2]].val);
+            }
+            break;
+        case INDEX_op_sar_i64:
             if (temps[args[2]].state == TCG_TEMP_CONST) {
-                mask = ((tcg_target_long)temps[args[1]].mask
+                mask = ((int64_t)temps[args[1]].mask
                         >> temps[args[2]].val);
             }
             break;
 
-        CASE_OP_32_64(shr):
+        case INDEX_op_shr_i32:
             if (temps[args[2]].state == TCG_TEMP_CONST) {
-                mask = temps[args[1]].mask >> temps[args[2]].val;
+                mask = ((uint32_t)temps[args[1]].mask
+                        >> temps[args[2]].val);
+            }
+            break;
+        case INDEX_op_shr_i64:
+            if (temps[args[2]].state == TCG_TEMP_CONST) {
+                mask = ((uint64_t)temps[args[1]].mask
+                        >> temps[args[2]].val);
             }
             break;
 
-- 
1.7.10.4
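
The sar problem can be reproduced outside of QEMU with a small sketch in
C. The function names are hypothetical, and the sketch assumes a 64-bit
mask whose high half happens to be clear for a 32-bit temp. It also
relies on two things that are implementation-defined in C but behave as
expected on common compilers: an out-of-range conversion to int32_t
wraps, and right-shifting a negative signed value is an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Old computation: the mask of a 32-bit temp is shifted as a 64-bit
   signed value, so bit 31 is not treated as the sign bit. */
static uint64_t sar_mask_wide(uint64_t mask, unsigned shift)
{
    assert(shift < 64);
    return (uint64_t)((int64_t)mask >> shift);
}

/* Patched computation for sar_i32: truncate to 32 bits first, so that
   an unknown bit 31 is replicated into the bits shifted in. */
static uint64_t sar_mask_i32(uint64_t mask, unsigned shift)
{
    assert(shift < 32);
    return (uint64_t)((int32_t)mask >> shift);
}

/* What a 32-bit arithmetic shift right produces at run time. */
static uint32_t sar_i32(uint32_t value, unsigned shift)
{
    assert(shift < 32);
    return (uint32_t)((int32_t)value >> shift);
}
```

For a mask of 0x80000000 (only bit 31 may be set) and a shift of 4, the
wide computation yields 0x08000000, which claims bits 31..28 are zero.
The value 0x80000000 shifted by sar_i32 is 0xf8000000, so that claim is
wrong. The 32-bit computation keeps those bits marked as possibly set.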

* [Qemu-devel] [PATCH 3/4] tcg/optimize: improve known-zero bits for 32-bit ops
  2013-09-03  6:27 [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements Aurelien Jarno
  2013-09-03  6:27 ` [Qemu-devel] [PATCH 1/4] tcg/optimize: fix know-zero bits optimization Aurelien Jarno
  2013-09-03  6:27 ` [Qemu-devel] [PATCH 2/4] tcg/optimize: fix known-zero bits for right shift ops Aurelien Jarno
@ 2013-09-03  6:27 ` Aurelien Jarno
  2013-09-03  6:28 ` [Qemu-devel] [PATCH 4/4] tcg/optimize: add known-zero bits compute for load ops Aurelien Jarno
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Aurelien Jarno @ 2013-09-03  6:27 UTC
  To: qemu-devel; +Cc: Paolo Bonzini, Aurelien Jarno, Richard Henderson

The shl_i32 op might set some bits in the unused 32 high bits of the
mask. Fix that by clearing the unused 32 high bits for all 32-bit ops
except load/store, which operate on tl values.

Cc: Richard Henderson <rth@twiddle.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 tcg/optimize.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 0ed8983..b1f736b 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -791,6 +791,12 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             break;
         }
 
+        /* 32-bit ops (non 64-bit ops and non load/store ops) generate 32-bit
+           results */
+        if (!(tcg_op_defs[op].flags & (TCG_OPF_CALL_CLOBBER | TCG_OPF_64BIT))) {
+            mask &= 0xffffffffu;
+        }
+
         if (mask == 0) {
             assert(def->nb_oargs == 1);
             s->gen_opc_buf[op_index] = op_to_movi(op);
-- 
1.7.10.4
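
A small standalone sketch in C of the effect described in the commit
message. The helper names are hypothetical and the mask is assumed to be
64 bits wide; only the "mask &= 0xffffffffu" step and the "mask == 0"
test are taken from the patch context.

```c
#include <assert.h>
#include <stdint.h>

/* Mask after "dst = src << shift", computed on a 64-bit mask. */
static uint64_t shl_mask(uint64_t mask, unsigned shift)
{
    assert(shift < 64);
    return mask << shift;
}

/* A 32-bit op can only produce 32 meaningful bits. */
static uint64_t clamp_mask_to_32(uint64_t mask)
{
    return mask & 0xffffffffu;
}

/* Mirrors the "mask == 0" test: if no bit can be set, the result is 0. */
static int result_is_known_zero(uint64_t mask)
{
    return mask == 0;
}
```

Shifting a mask of 0xff000000 left by 8 gives 0xff00000000 in 64 bits, so
the result does not look like a known zero. After clamping to 32 bits the
mask is 0, and a 32-bit shift of such a value can be seen to be constant.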

* [Qemu-devel] [PATCH 4/4] tcg/optimize: add known-zero bits compute for load ops
  2013-09-03  6:27 [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements Aurelien Jarno
                   ` (2 preceding siblings ...)
  2013-09-03  6:27 ` [Qemu-devel] [PATCH 3/4] tcg/optimize: improve known-zero bits for 32-bit ops Aurelien Jarno
@ 2013-09-03  6:28 ` Aurelien Jarno
  2013-09-03  7:21 ` [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements Paolo Bonzini
  2013-09-03 15:55 ` Richard Henderson
  5 siblings, 0 replies; 10+ messages in thread
From: Aurelien Jarno @ 2013-09-03  6:28 UTC
  To: qemu-devel; +Cc: Paolo Bonzini, Aurelien Jarno, Richard Henderson

Cc: Richard Henderson <rth@twiddle.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 tcg/optimize.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index b1f736b..044f456 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -787,6 +787,19 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             mask = temps[args[3]].mask | temps[args[4]].mask;
             break;
 
+        CASE_OP_32_64(ld8u):
+        case INDEX_op_qemu_ld8u:
+            mask = 0xff;
+            break;
+        CASE_OP_32_64(ld16u):
+        case INDEX_op_qemu_ld16u:
+            mask = 0xffff;
+            break;
+        case INDEX_op_ld32u_i64:
+        case INDEX_op_qemu_ld32u:
+            mask = 0xffffffffu;
+            break;
+
         default:
             break;
         }
-- 
1.7.10.4

* Re: [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements
  2013-09-03  6:27 [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements Aurelien Jarno
                   ` (3 preceding siblings ...)
  2013-09-03  6:28 ` [Qemu-devel] [PATCH 4/4] tcg/optimize: add known-zero bits compute for load ops Aurelien Jarno
@ 2013-09-03  7:21 ` Paolo Bonzini
  2013-09-09 17:04   ` Aurelien Jarno
  2013-09-03 15:55 ` Richard Henderson
  5 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2013-09-03  7:21 UTC
  To: Aurelien Jarno; +Cc: qemu-devel, Richard Henderson

On 03/09/2013 08:27, Aurelien Jarno wrote:
> This patchset first fixes known-zero bits optimization so that it is
> actually used, and does some further optimizations for 32-bit ops and
> unsigned loads.
> 
> Aurelien Jarno (4):
>   tcg/optimize: fix know-zero bits optimization
>   tcg/optimize: fix known-zero bits for right shift ops
>   tcg/optimize: improve known-zero bits for 32-bit ops
>   tcg/optimize: add known-zero bits compute for load ops
> 
>  tcg/optimize.c |   48 +++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 43 insertions(+), 5 deletions(-)
> 

Commit message 1 is a bit misleading, because the optimization still
works for quite a few cases involving constant and copy propagation.
However, I had the same patch in my queue, so I can't deny that there is
a problem. :)

Two questions:

1) should patch 2 be CCed to qemu-stable?

2) should patches 1 and 2 be inverted to avoid triggering bugs?

Paolo

* Re: [Qemu-devel] [PATCH 1/4] tcg/optimize: fix know-zero bits optimization
  2013-09-03  6:27 ` [Qemu-devel] [PATCH 1/4] tcg/optimize: fix know-zero bits optimization Aurelien Jarno
@ 2013-09-03  9:50   ` Andreas Färber
  0 siblings, 0 replies; 10+ messages in thread
From: Andreas Färber @ 2013-09-03  9:50 UTC
  To: Aurelien Jarno; +Cc: Paolo Bonzini, qemu-devel, Richard Henderson

FWIW $subject has a typo. While at it...

On 03.09.2013 08:27, Aurelien Jarno wrote:
> Known-zero bits optimization is a great idea that helps to generate more
> optimized code. However, the current implementation is basically useless
> as the computed mask is not saved.
> 
> Fix this so that it actually works.
> 
> Cc: Richard Henderson <rth@twiddle.net>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  tcg/optimize.c |    8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/tcg/optimize.c b/tcg/optimize.c
> index b29bf25..41f2906 100644
> --- a/tcg/optimize.c
> +++ b/tcg/optimize.c
> @@ -695,7 +695,8 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
>              break;
>          }
>  
> -        /* Simplify using known-zero bits */
> +        /* Simplify using known-zero bits. Currently only ops with a single
> +           output argument is supported. */

"ops ... are"?

Cheers,
Andreas

>          mask = -1;
>          affected = -1;
>          switch (op) {
> @@ -1144,6 +1145,11 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
>              } else {
>                  for (i = 0; i < def->nb_oargs; i++) {
>                      reset_temp(args[i]);
> +                    /* Save the corresponding known-zero bits mask for the
> +                       first output argument (only one supported so far). */
> +                    if (i == 0) {
> +                        temps[args[i]].mask = mask;
> +                    }
>                  }
>              }
>              for (i = 0; i < def->nb_args; i++) {
> 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

* Re: [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements
  2013-09-03  6:27 [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements Aurelien Jarno
                   ` (4 preceding siblings ...)
  2013-09-03  7:21 ` [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements Paolo Bonzini
@ 2013-09-03 15:55 ` Richard Henderson
  5 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2013-09-03 15:55 UTC
  To: Aurelien Jarno; +Cc: Paolo Bonzini, qemu-devel

On 09/02/2013 11:27 PM, Aurelien Jarno wrote:
> This patchset first fixes known-zero bits optimization so that it is
> actually used, and does some further optimizations for 32-bit ops and
> unsigned loads.
> 
> Aurelien Jarno (4):
>   tcg/optimize: fix know-zero bits optimization
>   tcg/optimize: fix known-zero bits for right shift ops
>   tcg/optimize: improve known-zero bits for 32-bit ops
>   tcg/optimize: add known-zero bits compute for load ops
> 
>  tcg/optimize.c |   48 +++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 43 insertions(+), 5 deletions(-)
> 

Reviewed-by: Richard Henderson <rth@twiddle.net>

Although I agree with Paolo about swapping patches 1 and 2.


r~

* Re: [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements
  2013-09-03  7:21 ` [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements Paolo Bonzini
@ 2013-09-09 17:04   ` Aurelien Jarno
  2013-09-09 17:14     ` Paolo Bonzini
  0 siblings, 1 reply; 10+ messages in thread
From: Aurelien Jarno @ 2013-09-09 17:04 UTC
  To: Paolo Bonzini; +Cc: qemu-devel, Richard Henderson

On Tue, Sep 03, 2013 at 09:21:06AM +0200, Paolo Bonzini wrote:
> On 03/09/2013 08:27, Aurelien Jarno wrote:
> > This patchset first fixes known-zero bits optimization so that it is
> > actually used, and does some further optimizations for 32-bit ops and
> > unsigned loads.
> > 
> > Aurelien Jarno (4):
> >   tcg/optimize: fix know-zero bits optimization
> >   tcg/optimize: fix known-zero bits for right shift ops
> >   tcg/optimize: improve known-zero bits for 32-bit ops
> >   tcg/optimize: add known-zero bits compute for load ops
> > 
> >  tcg/optimize.c |   48 +++++++++++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 43 insertions(+), 5 deletions(-)
> > 
> 
> Commit message 1 is a bit misleading, because the optimization still
> works for quite a few cases involving constant and copy propagation.
> However, I had the same patch in my queue, so I can't deny that there is
> a problem. :)

I have just checked, and it does indeed work for a few cases involving
constants. That said, it doesn't change the resulting TCG code, as these
cases were already handled by some other optimizations.

That leads me to ask why the bit propagation was added in the middle of
the other optimizations, and not, for example, immediately after
swapping commutative ops or just before the constant folding.

> Two questions:
> 
> 1) should patch 2 be CCed to qemu-stable?

I did not consider it a problem, since I thought the optimization was
basically disabled. I'll fix that in the new version.

> 2) should patches 1 and 2 be inverted to avoid triggering bugs?

Good idea.

Thanks for the review, I'll send an updated version.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH 0/4] tcg/optimize: fixes and improvements
  2013-09-09 17:04   ` Aurelien Jarno
@ 2013-09-09 17:14     ` Paolo Bonzini
  0 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2013-09-09 17:14 UTC
  To: Aurelien Jarno; +Cc: qemu-devel, Richard Henderson

On 09/09/2013 19:04, Aurelien Jarno wrote:
> I have just checked, and it does indeed work for a few cases involving
> constants. That said, it doesn't change the resulting TCG code, as these
> cases were already handled by some other optimizations.
> 
> That leads me to ask why the bit propagation was added in the middle of
> the other optimizations, and not, for example, immediately after
> swapping commutative ops or just before the constant folding.

I think it was just an artifact of /me rebasing the patch after other
optimizations were introduced.

Paolo
