* [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG
@ 2011-05-20 12:39 Kirill Batuzov
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 1/6] Add TCG optimizations stub Kirill Batuzov
                   ` (7 more replies)
  0 siblings, 8 replies; 34+ messages in thread
From: Kirill Batuzov @ 2011-05-20 12:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: mj.mccormack, zhur

This series implements some basic machine-independent optimizations.  They
simplify the code and allow liveness analysis to do its work better.

Suppose we have the following ARM code:

 movw    r12, #0xb6db
 movt    r12, #0xdb6d

In TCG before optimizations we'll have:

 movi_i32 tmp8,$0xb6db
 mov_i32 r12,tmp8
 mov_i32 tmp8,r12
 ext16u_i32 tmp8,tmp8
 movi_i32 tmp9,$0xdb6d0000
 or_i32 tmp8,tmp8,tmp9
 mov_i32 r12,tmp8

And after optimizations we'll have this:

 movi_i32 r12,$0xdb6db6db

Here are performance evaluation results for the SPEC CPU2000 integer tests in
user-mode emulation on an x86_64 host.  There were 5 runs of each test on the
reference data set.  The tables below show the runtime in seconds for all
these runs.

ARM guest without optimizations:
Test name       #1       #2       #3       #4       #5    Median
164.gzip    1403.612 1403.499 1403.52  1208.55  1403.583 1403.52
175.vpr     1237.729 1238.008 1238.019 1176.852 1237.902 1237.902
176.gcc      929.511  928.867  929.048  928.927  928.792  928.927
181.mcf      196.371  196.335  196.172  197.057  196.196  196.335
186.crafty  1547.101 1547.293 1547.133 1547.037 1547.044 1547.101
197.parser  3804.336 3804.429 3804.412 3804.45  3804.301 3804.412
252.eon     2760.414 2760.45  2473.608 2760.606 2760.216 2760.414
253.perlbmk 2557.966 2558.971 2559.731 2479.299 2556.835 2557.966
256.bzip2   1296.412 1296.215 1296.63  1296.489 1296.092 1296.412
300.twolf   2919.496 2919.444 2919.529 2919.384 2919.404 2919.444
      
ARM guest with optimizations:
Test name       #1       #2       #3       #4       #5    Median    Gain
164.gzip    1345.416 1401.741 1377.022 1401.737 1401.773 1401.737   0.13%
175.vpr     1116.75  1243.213 1243.32  1243.316 1243.144 1243.213  -0.43%
176.gcc      897.045  909.568  850.1    909.65   909.57   909.568   2.08%
181.mcf      199.058  198.717  198.28   198.866  197.955  198.717  -1.21%
186.crafty  1525.667 1526.663 1525.981 1525.995 1526.164 1525.995   1.36%
197.parser  3749.453 3749.522 3749.413 3749.5   3749.484 3749.484   1.44%
252.eon     2730.593 2746.525 2746.495 2746.493 2746.62  2746.495   0.50%
253.perlbmk 2577.341 2521.057 2578.461 2578.721 2581.313 2578.461  -0.80%
256.bzip2   1184.498 1190.116 1294.352 1294.554 1294.637 1294.352   0.16%
300.twolf   2894.264 2894.133 2894.398 2894.103 2894.146 2894.146   0.87%


x86_64 guest without optimizations:
Test name       #1       #2       #3       #4       #5    Median
164.gzip     858.118  858.151  858.09   858.139  858.122  858.122
175.vpr      956.361  956.465  956.521  956.438  956.705  956.465
176.gcc      647.275  647.465  647.186  647.294  647.268  647.275
181.mcf      219.239  221.964  220.244  220.74   220.559  220.559
186.crafty  1128.027 1128.071 1128.028 1128.115 1128.123 1128.071
197.parser  1815.669 1815.651 1815.669 1815.711 1815.759 1815.669
253.perlbmk 1777.143 1777.749 1667.508 1777.051 1778.391 1777.143
254.gap     1062.808 1062.758 1062.801 1063.099 1062.859 1062.808
255.vortex  1930.693 1930.706 1930.579 1930.7   1930.566 1930.693
256.bzip2   1014.566 1014.702 1014.6   1014.274 1014.421 1014.566
300.twolf   1342.653 1342.759 1344.092 1342.641 1342.794 1342.759
     
x86_64 guest with optimizations:
Test name       #1       #2       #3       #4       #5    Median    Gain
164.gzip     857.485  857.457  857.475  857.509  857.507  857.485   0.07%
175.vpr      963.255  962.972  963.27   963.124  963.686  963.255  -0.71%
176.gcc      644.123  644.055  644.145  643.818  635.773  644.055   0.50%
181.mcf      216.215  217.549  218.744  216.437  217.83   217.549   1.36%
186.crafty  1128.873 1128.792 1128.871 1128.816 1128.823 1128.823  -0.07%
197.parser  1814.626 1814.503 1814.552 1814.602 1814.748 1814.602   0.06%
253.perlbmk 1758.056 1751.963 1753.267 1765.27  1759.828 1758.056   1.07%
254.gap     1064.702 1064.712 1064.629 1064.657 1064.645 1064.657  -0.17%
255.vortex  1760.638 1936.387 1937.871 1937.471 1760.496 1936.387  -0.29%
256.bzip2   1007.658 1007.682 1007.316 1007.982 1007.747 1007.682   0.68%
300.twolf   1334.139 1333.791 1333.795 1334.147 1333.732 1333.795   0.67%

The ARM guest versions of 254.gap and 255.vortex and the x86_64 guest version
of 252.eon do not work under QEMU for some unrelated reason.

Kirill Batuzov (6):
  Add TCG optimizations stub
  Add copy and constant propagation.
  Do constant folding for basic arithmetic operations.
  Do constant folding for boolean operations.
  Do constant folding for shift operations.
  Do constant folding for unary operations.

 Makefile.target |    2 +-
 tcg/optimize.c  |  539 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tcg/tcg.c       |    6 +
 tcg/tcg.h       |    3 +
 4 files changed, 549 insertions(+), 1 deletions(-)
 create mode 100644 tcg/optimize.c

-- 
1.7.4.1


* [Qemu-devel] [PATCH 1/6] Add TCG optimizations stub
  2011-05-20 12:39 [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG Kirill Batuzov
@ 2011-05-20 12:39 ` Kirill Batuzov
  2011-05-20 18:12   ` Richard Henderson
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 2/6] Add copy and constant propagation Kirill Batuzov
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 34+ messages in thread
From: Kirill Batuzov @ 2011-05-20 12:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: mj.mccormack, zhur

Add the file tcg/optimize.c to hold TCG optimizations.  The function
tcg_optimize is called from tcg_gen_code_common; it in turn calls other
functions performing specific optimizations.  A stub for constant folding
is added.

Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
---
 Makefile.target |    2 +-
 tcg/optimize.c  |   87 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tcg/tcg.c       |    6 ++++
 tcg/tcg.h       |    3 ++
 4 files changed, 97 insertions(+), 1 deletions(-)
 create mode 100644 tcg/optimize.c

diff --git a/Makefile.target b/Makefile.target
index 21f864a..5a61778 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -70,7 +70,7 @@ all: $(PROGS) stap
 #########################################################
 # cpu emulator library
 libobj-y = exec.o translate-all.o cpu-exec.o translate.o
-libobj-y += tcg/tcg.o
+libobj-y += tcg/tcg.o tcg/optimize.o
 libobj-$(CONFIG_SOFTFLOAT) += fpu/softfloat.o
 libobj-$(CONFIG_NOSOFTFLOAT) += fpu/softfloat-native.o
 libobj-y += op_helper.o helper.o
diff --git a/tcg/optimize.c b/tcg/optimize.c
new file mode 100644
index 0000000..cf31d18
--- /dev/null
+++ b/tcg/optimize.c
@@ -0,0 +1,87 @@
+/*
+ * Optimizations for Tiny Code Generator for QEMU
+ *
+ * Copyright (c) 2010 Samsung Electronics.
+ * Contributed by Kirill Batuzov <batuzovk@ispras.ru>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "config.h"
+
+#include <stdlib.h>
+#include <stdio.h>
+
+#include "qemu-common.h"
+#include "tcg-op.h"
+
+static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
+                                    TCGArg *args, TCGOpDef *tcg_op_defs)
+{
+    int i, nb_ops, op_index, op, nb_temps, nb_globals;
+    const TCGOpDef *def;
+    TCGArg *gen_args;
+
+    nb_temps = s->nb_temps;
+    nb_globals = s->nb_globals;
+
+    nb_ops = tcg_opc_ptr - gen_opc_buf;
+    gen_args = args;
+    for (op_index = 0; op_index < nb_ops; op_index++) {
+        op = gen_opc_buf[op_index];
+        def = &tcg_op_defs[op];
+        switch (op) {
+        case INDEX_op_call:
+        case INDEX_op_jmp:
+        case INDEX_op_br:
+        case INDEX_op_brcond_i32:
+        case INDEX_op_set_label:
+#if TCG_TARGET_REG_BITS == 64
+        case INDEX_op_brcond_i64:
+#endif
+            i = (op == INDEX_op_call) ?
+                (args[0] >> 16) + (args[0] & 0xffff) + 3 :
+                def->nb_args;
+            while (i) {
+                *gen_args = *args;
+                args++;
+                gen_args++;
+                i--;
+            }
+            break;
+        default:
+            for (i = 0; i < def->nb_args; i++) {
+                gen_args[i] = args[i];
+            }
+            args += def->nb_args;
+            gen_args += def->nb_args;
+            break;
+        }
+    }
+
+    return gen_args;
+}
+
+TCGArg *tcg_optimize(TCGContext *s, uint16_t *tcg_opc_ptr,
+        TCGArg *args, TCGOpDef *tcg_op_defs)
+{
+    TCGArg *res;
+    res = tcg_constant_folding(s, tcg_opc_ptr, args, tcg_op_defs);
+    return res;
+}
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 8748c05..6fb4dd6 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -24,6 +24,7 @@
 
 /* define it to use liveness analysis (better code) */
 #define USE_LIVENESS_ANALYSIS
+#define USE_TCG_OPTIMIZATIONS
 
 #include "config.h"
 
@@ -2018,6 +2019,11 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
     }
 #endif
 
+#ifdef USE_TCG_OPTIMIZATIONS
+    gen_opparam_ptr =
+        tcg_optimize(s, gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
+#endif
+
 #ifdef CONFIG_PROFILER
     s->la_time -= profile_getclock();
 #endif
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 3fab8d6..a85a8d7 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -486,6 +486,9 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
 void tcg_gen_shifti_i64(TCGv_i64 ret, TCGv_i64 arg1,
                         int c, int right, int arith);
 
+TCGArg *tcg_optimize(TCGContext *s, uint16_t *tcg_opc_ptr, TCGArg *args,
+                     TCGOpDef *tcg_op_def);
+
 /* only used for debugging purposes */
 void tcg_register_helper(void *func, const char *name);
 const char *tcg_helper_get_name(TCGContext *s, void *func);
-- 
1.7.4.1


* [Qemu-devel] [PATCH 2/6] Add copy and constant propagation.
  2011-05-20 12:39 [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG Kirill Batuzov
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 1/6] Add TCG optimizations stub Kirill Batuzov
@ 2011-05-20 12:39 ` Kirill Batuzov
  2011-05-20 18:22   ` Richard Henderson
  2011-05-20 19:41   ` Aurelien Jarno
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 3/6] Do constant folding for basic arithmetic operations Kirill Batuzov
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 34+ messages in thread
From: Kirill Batuzov @ 2011-05-20 12:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: mj.mccormack, zhur

Make tcg_constant_folding do copy and constant propagation.  This is
preparatory work before the actual constant folding.

Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
---
 tcg/optimize.c |  123 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 123 insertions(+), 0 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index cf31d18..a761c51 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -31,22 +31,139 @@
 #include "qemu-common.h"
 #include "tcg-op.h"
 
+typedef enum {
+    TCG_TEMP_UNDEF = 0,
+    TCG_TEMP_CONST,
+    TCG_TEMP_COPY,
+    TCG_TEMP_ANY
+} tcg_temp_state;
+
+static int mov_to_movi(int op)
+{
+    switch (op) {
+    case INDEX_op_mov_i32: return INDEX_op_movi_i32;
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_mov_i64: return INDEX_op_movi_i64;
+#endif
+    default:
+        fprintf(stderr, "Unrecognized operation %d in mov_to_movi.\n", op);
+        tcg_abort();
+    }
+}
+
+/* Reset TEMP's state to TCG_TEMP_ANY.  If TEMP was a representative of some
+   class of equivalent temps, a new representative should be chosen in this
+   class. */
+static void reset_temp(tcg_temp_state *state, tcg_target_ulong *vals,
+                       TCGArg temp, int nb_temps, int nb_globals)
+{
+    int i;
+    TCGArg new_base;
+    new_base = (TCGArg)-1;
+    for (i = nb_globals; i < nb_temps; i++) {
+        if (state[i] == TCG_TEMP_COPY && vals[i] == temp) {
+            if (new_base == ((TCGArg)-1)) {
+                new_base = (TCGArg)i;
+                state[i] = TCG_TEMP_ANY;
+            } else {
+                vals[i] = new_base;
+            }
+        }
+    }
+    for (i = 0; i < nb_globals; i++) {
+        if (state[i] == TCG_TEMP_COPY && vals[i] == temp) {
+            if (new_base == ((TCGArg)-1)) {
+                state[i] = TCG_TEMP_ANY;
+            } else {
+                vals[i] = new_base;
+            }
+        }
+    }
+    state[temp] = TCG_TEMP_ANY;
+}
+
+/* Propagate constants and copies, fold constant expressions. */
 static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                                     TCGArg *args, TCGOpDef *tcg_op_defs)
 {
     int i, nb_ops, op_index, op, nb_temps, nb_globals;
     const TCGOpDef *def;
     TCGArg *gen_args;
+    /* Array VALS has an element for each temp.
+       If this temp holds a constant then its value is kept in VALS' element.
+       If this temp is a copy of other ones then this equivalence class'
+       representative is kept in VALS' element.
+       If this temp is neither copy nor constant then corresponding VALS'
+       element is unused. */
+    static tcg_target_ulong vals[TCG_MAX_TEMPS];
+    static tcg_temp_state state[TCG_MAX_TEMPS];
 
     nb_temps = s->nb_temps;
     nb_globals = s->nb_globals;
+    memset(state, 0, nb_temps * sizeof(tcg_temp_state));
 
     nb_ops = tcg_opc_ptr - gen_opc_buf;
     gen_args = args;
     for (op_index = 0; op_index < nb_ops; op_index++) {
         op = gen_opc_buf[op_index];
         def = &tcg_op_defs[op];
+        /* Do copy propagation */
+        if (op != INDEX_op_call) {
+            for (i = def->nb_oargs; i < def->nb_oargs + def->nb_iargs; i++) {
+                if (state[args[i]] == TCG_TEMP_COPY
+                    && !(def->args_ct[i].ct & TCG_CT_IALIAS)
+                    && (def->args_ct[i].ct & TCG_CT_REG)) {
+                    args[i] = vals[args[i]];
+                }
+            }
+        }
+
+        /* Propagate constants through copy operations and do constant
+           folding.  Constants will be substituted into arguments by the
+           register allocator where needed and possible.  Also detect copies. */
         switch (op) {
+        case INDEX_op_mov_i32:
+#if TCG_TARGET_REG_BITS == 64
+        case INDEX_op_mov_i64:
+#endif
+            if ((state[args[1]] == TCG_TEMP_COPY
+                && vals[args[1]] == args[0])
+                || args[0] == args[1]) {
+                args += 2;
+                gen_opc_buf[op_index] = INDEX_op_nop;
+                break;
+            }
+            if (state[args[1]] != TCG_TEMP_CONST) {
+                reset_temp(state, vals, args[0], nb_temps, nb_globals);
+                if (args[1] >= s->nb_globals) {
+                    state[args[0]] = TCG_TEMP_COPY;
+                    vals[args[0]] = args[1];
+                }
+                gen_args[0] = args[0];
+                gen_args[1] = args[1];
+                gen_args += 2;
+                args += 2;
+                break;
+            } else {
+                /* Source argument is constant.  Rewrite the operation and
+                   let movi case handle it. */
+                op = mov_to_movi(op);
+                gen_opc_buf[op_index] = op;
+                args[1] = vals[args[1]];
+                /* fallthrough */
+            }
+        case INDEX_op_movi_i32:
+#if TCG_TARGET_REG_BITS == 64
+        case INDEX_op_movi_i64:
+#endif
+            reset_temp(state, vals, args[0], nb_temps, nb_globals);
+            state[args[0]] = TCG_TEMP_CONST;
+            vals[args[0]] = args[1];
+            gen_args[0] = args[0];
+            gen_args[1] = args[1];
+            gen_args += 2;
+            args += 2;
+            break;
         case INDEX_op_call:
         case INDEX_op_jmp:
         case INDEX_op_br:
@@ -55,6 +172,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
 #if TCG_TARGET_REG_BITS == 64
         case INDEX_op_brcond_i64:
 #endif
+            memset(state, 0, nb_temps * sizeof(tcg_temp_state));
             i = (op == INDEX_op_call) ?
                 (args[0] >> 16) + (args[0] & 0xffff) + 3 :
                 def->nb_args;
@@ -66,6 +184,11 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             }
             break;
         default:
+            /* Default case: we know nothing about the operation, so no
+               propagation is done.  We only trash the output args.  */
+            for (i = 0; i < def->nb_oargs; i++) {
+                reset_temp(state, vals, args[i], nb_temps, nb_globals);
+            }
             for (i = 0; i < def->nb_args; i++) {
                 gen_args[i] = args[i];
             }
-- 
1.7.4.1


* [Qemu-devel] [PATCH 3/6] Do constant folding for basic arithmetic operations.
  2011-05-20 12:39 [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG Kirill Batuzov
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 1/6] Add TCG optimizations stub Kirill Batuzov
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 2/6] Add copy and constant propagation Kirill Batuzov
@ 2011-05-20 12:39 ` Kirill Batuzov
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 4/6] Do constant folding for boolean operations Kirill Batuzov
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 34+ messages in thread
From: Kirill Batuzov @ 2011-05-20 12:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: mj.mccormack, zhur

Perform actual constant folding for ADD, SUB and MUL operations.

Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
---
 tcg/optimize.c |  102 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 102 insertions(+), 0 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index a761c51..4073f05 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -82,6 +82,79 @@ static void reset_temp(tcg_temp_state *state, tcg_target_ulong *vals,
     state[temp] = TCG_TEMP_ANY;
 }
 
+static int op_bits(int op)
+{
+    switch (op) {
+    case INDEX_op_mov_i32:
+    case INDEX_op_add_i32:
+    case INDEX_op_sub_i32:
+    case INDEX_op_mul_i32:
+        return 32;
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_mov_i64:
+    case INDEX_op_add_i64:
+    case INDEX_op_sub_i64:
+    case INDEX_op_mul_i64:
+        return 64;
+#endif
+    default:
+        fprintf(stderr, "Unrecognized operation %d in op_bits.\n", op);
+        tcg_abort();
+    }
+}
+
+static int op_to_movi(int op)
+{
+    if (op_bits(op) == 32) {
+        return INDEX_op_movi_i32;
+    }
+#if TCG_TARGET_REG_BITS == 64
+    if (op_bits(op) == 64) {
+        return INDEX_op_movi_i64;
+    }
+#endif
+    tcg_abort();
+}
+
+static TCGArg do_constant_folding_2(int op, TCGArg x, TCGArg y)
+{
+    switch (op) {
+    case INDEX_op_add_i32:
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_add_i64:
+#endif
+        return x + y;
+
+    case INDEX_op_sub_i32:
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_sub_i64:
+#endif
+        return x - y;
+
+    case INDEX_op_mul_i32:
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_mul_i64:
+#endif
+        return x * y;
+
+    default:
+        fprintf(stderr,
+                "Unrecognized operation %d in do_constant_folding.\n", op);
+        tcg_abort();
+    }
+}
+
+static TCGArg do_constant_folding(int op, TCGArg x, TCGArg y)
+{
+    TCGArg res = do_constant_folding_2(op, x, y);
+#if TCG_TARGET_REG_BITS == 64
+    if (op_bits(op) == 32) {
+        res &= 0xffffffff;
+    }
+#endif
+    return res;
+}
+
 /* Propagate constants and copies, fold constant expressions. */
 static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                                     TCGArg *args, TCGOpDef *tcg_op_defs)
@@ -164,6 +237,35 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             gen_args += 2;
             args += 2;
             break;
+        case INDEX_op_add_i32:
+        case INDEX_op_sub_i32:
+        case INDEX_op_mul_i32:
+#if TCG_TARGET_REG_BITS == 64
+        case INDEX_op_add_i64:
+        case INDEX_op_sub_i64:
+        case INDEX_op_mul_i64:
+#endif
+            if (state[args[1]] == TCG_TEMP_CONST
+                && state[args[2]] == TCG_TEMP_CONST) {
+                gen_opc_buf[op_index] = op_to_movi(op);
+                gen_args[0] = args[0];
+                gen_args[1] =
+                    do_constant_folding(op, vals[args[1]], vals[args[2]]);
+                reset_temp(state, vals, gen_args[0], nb_temps, nb_globals);
+                state[gen_args[0]] = TCG_TEMP_CONST;
+                vals[gen_args[0]] = gen_args[1];
+                gen_args += 2;
+                args += 3;
+                break;
+            } else {
+                reset_temp(state, vals, args[0], nb_temps, nb_globals);
+                gen_args[0] = args[0];
+                gen_args[1] = args[1];
+                gen_args[2] = args[2];
+                gen_args += 3;
+                args += 3;
+                break;
+            }
         case INDEX_op_call:
         case INDEX_op_jmp:
         case INDEX_op_br:
-- 
1.7.4.1


* [Qemu-devel] [PATCH 4/6] Do constant folding for boolean operations.
  2011-05-20 12:39 [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG Kirill Batuzov
                   ` (2 preceding siblings ...)
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 3/6] Do constant folding for basic arithmetic operations Kirill Batuzov
@ 2011-05-20 12:39 ` Kirill Batuzov
  2011-05-20 18:45   ` Richard Henderson
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations Kirill Batuzov
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 34+ messages in thread
From: Kirill Batuzov @ 2011-05-20 12:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: mj.mccormack, zhur

Perform constant folding for AND, OR, XOR operations.

Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
---
 tcg/optimize.c |   58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 58 insertions(+), 0 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 4073f05..a02d5c1 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -38,6 +38,13 @@ typedef enum {
     TCG_TEMP_ANY
 } tcg_temp_state;
 
+const int mov_opc[] = {
+    INDEX_op_mov_i32,
+#if TCG_TARGET_REG_BITS == 64
+    INDEX_op_mov_i64,
+#endif
+};
+
 static int mov_to_movi(int op)
 {
     switch (op) {
@@ -89,12 +96,18 @@ static int op_bits(int op)
     case INDEX_op_add_i32:
     case INDEX_op_sub_i32:
     case INDEX_op_mul_i32:
+    case INDEX_op_and_i32:
+    case INDEX_op_or_i32:
+    case INDEX_op_xor_i32:
         return 32;
 #if TCG_TARGET_REG_BITS == 64
     case INDEX_op_mov_i64:
     case INDEX_op_add_i64:
     case INDEX_op_sub_i64:
     case INDEX_op_mul_i64:
+    case INDEX_op_and_i64:
+    case INDEX_op_or_i64:
+    case INDEX_op_xor_i64:
         return 64;
 #endif
     default:
@@ -137,6 +150,24 @@ static TCGArg do_constant_folding_2(int op, TCGArg x, TCGArg y)
 #endif
         return x * y;
 
+    case INDEX_op_and_i32:
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_and_i64:
+#endif
+        return x & y;
+
+    case INDEX_op_or_i32:
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_or_i64:
+#endif
+        return x | y;
+
+    case INDEX_op_xor_i32:
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_xor_i64:
+#endif
+        return x ^ y;
+
     default:
         fprintf(stderr,
                 "Unrecognized operation %d in do_constant_folding.\n", op);
@@ -237,10 +268,37 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             gen_args += 2;
             args += 2;
             break;
+        case INDEX_op_or_i32:
+        case INDEX_op_and_i32:
+#if TCG_TARGET_REG_BITS == 64
+        case INDEX_op_and_i64:
+        case INDEX_op_or_i64:
+#endif
+            if (args[1] == args[2]) {
+                if (args[1] == args[0]) {
+                    args += 3;
+                    gen_opc_buf[op_index] = INDEX_op_nop;
+                } else {
+                    reset_temp(state, vals, args[0], nb_temps, nb_globals);
+                    if (args[1] >= s->nb_globals) {
+                        state[args[0]] = TCG_TEMP_COPY;
+                        vals[args[0]] = args[1];
+                    }
+                    gen_opc_buf[op_index] = mov_opc[op_bits(op) / 32 - 1];
+                    gen_args[0] = args[0];
+                    gen_args[1] = args[1];
+                    gen_args += 2;
+                    args += 3;
+                }
+                break;
+            }
+            /* Proceed with default binary operation handling */
+        case INDEX_op_xor_i32:
         case INDEX_op_add_i32:
         case INDEX_op_sub_i32:
         case INDEX_op_mul_i32:
 #if TCG_TARGET_REG_BITS == 64
+        case INDEX_op_xor_i64:
         case INDEX_op_add_i64:
         case INDEX_op_sub_i64:
         case INDEX_op_mul_i64:
-- 
1.7.4.1


* [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations.
  2011-05-20 12:39 [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG Kirill Batuzov
                   ` (3 preceding siblings ...)
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 4/6] Do constant folding for boolean operations Kirill Batuzov
@ 2011-05-20 12:39 ` Kirill Batuzov
  2011-05-20 18:37   ` Richard Henderson
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 6/6] Do constant folding for unary operations Kirill Batuzov
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 34+ messages in thread
From: Kirill Batuzov @ 2011-05-20 12:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: mj.mccormack, zhur

Perform constant folding for SHR, SHL, SAR, ROTR and ROTL operations.

Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
---
 tcg/optimize.c |   87 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 87 insertions(+), 0 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index a02d5c1..b6b0dc4 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -99,6 +99,11 @@ static int op_bits(int op)
     case INDEX_op_and_i32:
     case INDEX_op_or_i32:
     case INDEX_op_xor_i32:
+    case INDEX_op_shl_i32:
+    case INDEX_op_shr_i32:
+    case INDEX_op_sar_i32:
+    case INDEX_op_rotl_i32:
+    case INDEX_op_rotr_i32:
         return 32;
 #if TCG_TARGET_REG_BITS == 64
     case INDEX_op_mov_i64:
@@ -108,6 +113,11 @@ static int op_bits(int op)
     case INDEX_op_and_i64:
     case INDEX_op_or_i64:
     case INDEX_op_xor_i64:
+    case INDEX_op_shl_i64:
+    case INDEX_op_shr_i64:
+    case INDEX_op_sar_i64:
+    case INDEX_op_rotl_i64:
+    case INDEX_op_rotr_i64:
         return 64;
 #endif
     default:
@@ -131,6 +141,7 @@ static int op_to_movi(int op)
 
 static TCGArg do_constant_folding_2(int op, TCGArg x, TCGArg y)
 {
+    TCGArg r;
     switch (op) {
     case INDEX_op_add_i32:
 #if TCG_TARGET_REG_BITS == 64
@@ -168,6 +179,72 @@ static TCGArg do_constant_folding_2(int op, TCGArg x, TCGArg y)
 #endif
         return x ^ y;
 
+    case INDEX_op_shl_i32:
+#if TCG_TARGET_REG_BITS == 64
+        y &= 0xffffffff;
+    case INDEX_op_shl_i64:
+#endif
+        return x << y;
+
+    case INDEX_op_shr_i32:
+#if TCG_TARGET_REG_BITS == 64
+        x &= 0xffffffff;
+        y &= 0xffffffff;
+    case INDEX_op_shr_i64:
+#endif
+        /* Assuming TCGArg to be unsigned */
+        return x >> y;
+
+    case INDEX_op_sar_i32:
+#if TCG_TARGET_REG_BITS == 64
+        x &= 0xffffffff;
+        y &= 0xffffffff;
+#endif
+        r = x & 0x80000000;
+        x &= ~0x80000000;
+        x >>= y;
+        r |= r - (r >> y);
+        x |= r;
+        return x;
+
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_sar_i64:
+        r = x & 0x8000000000000000ULL;
+        x &= ~0x8000000000000000ULL;
+        x >>= y;
+        r |= r - (r >> y);
+        x |= r;
+        return x;
+#endif
+
+    case INDEX_op_rotr_i32:
+#if TCG_TARGET_REG_BITS == 64
+        x &= 0xffffffff;
+        y &= 0xffffffff;
+#endif
+        x = (x << (32 - y)) | (x >> y);
+        return x;
+
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_rotr_i64:
+        x = (x << (64 - y)) | (x >> y);
+        return x;
+#endif
+
+    case INDEX_op_rotl_i32:
+#if TCG_TARGET_REG_BITS == 64
+        x &= 0xffffffff;
+        y &= 0xffffffff;
+#endif
+        x = (x << y) | (x >> (32 - y));
+        return x;
+
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_rotl_i64:
+        x = (x << y) | (x >> (64 - y));
+        return x;
+#endif
+
     default:
         fprintf(stderr,
                 "Unrecognized operation %d in do_constant_folding.\n", op);
@@ -297,11 +374,21 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
         case INDEX_op_add_i32:
         case INDEX_op_sub_i32:
         case INDEX_op_mul_i32:
+        case INDEX_op_shl_i32:
+        case INDEX_op_shr_i32:
+        case INDEX_op_sar_i32:
+        case INDEX_op_rotl_i32:
+        case INDEX_op_rotr_i32:
 #if TCG_TARGET_REG_BITS == 64
         case INDEX_op_xor_i64:
         case INDEX_op_add_i64:
         case INDEX_op_sub_i64:
         case INDEX_op_mul_i64:
+        case INDEX_op_shl_i64:
+        case INDEX_op_shr_i64:
+        case INDEX_op_sar_i64:
+        case INDEX_op_rotl_i64:
+        case INDEX_op_rotr_i64:
 #endif
             if (state[args[1]] == TCG_TEMP_CONST
                 && state[args[2]] == TCG_TEMP_CONST) {
-- 
1.7.4.1


* [Qemu-devel] [PATCH 6/6] Do constant folding for unary operations.
  2011-05-20 12:39 [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG Kirill Batuzov
                   ` (4 preceding siblings ...)
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations Kirill Batuzov
@ 2011-05-20 12:39 ` Kirill Batuzov
  2011-05-20 18:39   ` Richard Henderson
  2011-05-20 17:50 ` [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG Richard Henderson
  2011-05-20 19:35 ` Aurelien Jarno
  7 siblings, 1 reply; 34+ messages in thread
From: Kirill Batuzov @ 2011-05-20 12:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: mj.mccormack, zhur

Perform constant folding for NOT and EXT{8,16,32}{S,U} operations.

Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
---
 tcg/optimize.c |   82 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 82 insertions(+), 0 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index b6b0dc4..bda469a 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -104,6 +104,11 @@ static int op_bits(int op)
     case INDEX_op_sar_i32:
     case INDEX_op_rotl_i32:
     case INDEX_op_rotr_i32:
+    case INDEX_op_not_i32:
+    case INDEX_op_ext8s_i32:
+    case INDEX_op_ext16s_i32:
+    case INDEX_op_ext8u_i32:
+    case INDEX_op_ext16u_i32:
         return 32;
 #if TCG_TARGET_REG_BITS == 64
     case INDEX_op_mov_i64:
@@ -118,6 +123,13 @@ static int op_bits(int op)
     case INDEX_op_sar_i64:
     case INDEX_op_rotl_i64:
     case INDEX_op_rotr_i64:
+    case INDEX_op_not_i64:
+    case INDEX_op_ext8s_i64:
+    case INDEX_op_ext16s_i64:
+    case INDEX_op_ext32s_i64:
+    case INDEX_op_ext8u_i64:
+    case INDEX_op_ext16u_i64:
+    case INDEX_op_ext32u_i64:
         return 64;
 #endif
     default:
@@ -245,6 +257,44 @@ static TCGArg do_constant_folding_2(int op, TCGArg x, TCGArg y)
         return x;
 #endif
 
+    case INDEX_op_not_i32:
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_not_i64:
+#endif
+        return ~x;
+
+    case INDEX_op_ext8s_i32:
+        return x & (1 << 7) ? x | ~0xff : x & 0xff;
+
+    case INDEX_op_ext16s_i32:
+        return x & (1 << 15) ? x | ~0xffff : x & 0xffff;
+
+    case INDEX_op_ext8u_i32:
+        return x & 0xff;
+
+    case INDEX_op_ext16u_i32:
+        return x & 0xffff;
+
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_ext8s_i64:
+        return x & (1 << 7) ? x | ~0xffULL : x & 0xff;
+
+    case INDEX_op_ext16s_i64:
+        return x & (1 << 15) ? x | ~0xffffULL : x & 0xffff;
+
+    case INDEX_op_ext32s_i64:
+        return x & (1U << 31) ? x | ~0xffffffffULL : x & 0xffffffff;
+
+    case INDEX_op_ext8u_i64:
+        return x & 0xff;
+
+    case INDEX_op_ext16u_i64:
+        return x & 0xffff;
+
+    case INDEX_op_ext32u_i64:
+        return x & 0xffffffff;
+#endif
+
     default:
         fprintf(stderr,
                 "Unrecognized operation %d in do_constant_folding.\n", op);
@@ -345,6 +395,38 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             gen_args += 2;
             args += 2;
             break;
+        case INDEX_op_not_i32:
+        case INDEX_op_ext8s_i32:
+        case INDEX_op_ext16s_i32:
+        case INDEX_op_ext8u_i32:
+        case INDEX_op_ext16u_i32:
+#if TCG_TARGET_REG_BITS == 64
+        case INDEX_op_not_i64:
+        case INDEX_op_ext8s_i64:
+        case INDEX_op_ext16s_i64:
+        case INDEX_op_ext32s_i64:
+        case INDEX_op_ext8u_i64:
+        case INDEX_op_ext16u_i64:
+        case INDEX_op_ext32u_i64:
+#endif
+            if (state[args[1]] == TCG_TEMP_CONST) {
+                gen_opc_buf[op_index] = op_to_movi(op);
+                gen_args[0] = args[0];
+                gen_args[1] = do_constant_folding(op, vals[args[1]], 0);
+                reset_temp(state, vals, gen_args[0], nb_temps, nb_globals);
+                state[gen_args[0]] = TCG_TEMP_CONST;
+                vals[gen_args[0]] = gen_args[1];
+                gen_args += 2;
+                args += 2;
+                break;
+            } else {
+                reset_temp(state, vals, args[0], nb_temps, nb_globals);
+                gen_args[0] = args[0];
+                gen_args[1] = args[1];
+                gen_args += 2;
+                args += 2;
+                break;
+            }
         case INDEX_op_or_i32:
         case INDEX_op_and_i32:
 #if TCG_TARGET_REG_BITS == 64
-- 
1.7.4.1


* Re: [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG
  2011-05-20 12:39 [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG Kirill Batuzov
                   ` (5 preceding siblings ...)
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 6/6] Do constant folding for unary operations Kirill Batuzov
@ 2011-05-20 17:50 ` Richard Henderson
  2011-05-20 19:37   ` Aurelien Jarno
  2011-05-20 19:35 ` Aurelien Jarno
  7 siblings, 1 reply; 34+ messages in thread
From: Richard Henderson @ 2011-05-20 17:50 UTC (permalink / raw)
  To: Kirill Batuzov; +Cc: mj.mccormack, qemu-devel, zhur

On 05/20/2011 05:39 AM, Kirill Batuzov wrote:
> This series implements some basic machine-independent optimizations.  They
> simplify the code and allow liveness analysis to do its work better.
> 
> Suppose we have the following ARM code:
> 
>  movw    r12, #0xb6db
>  movt    r12, #0xdb6d
> 
> In TCG before optimizations we'll have:
> 
>  movi_i32 tmp8,$0xb6db
>  mov_i32 r12,tmp8
>  mov_i32 tmp8,r12
>  ext16u_i32 tmp8,tmp8
>  movi_i32 tmp9,$0xdb6d0000
>  or_i32 tmp8,tmp8,tmp9
>  mov_i32 r12,tmp8
> 
> And after optimizations we'll have this:
> 
>  movi_i32 r12,$0xdb6db6db
> 
> Here are performance evaluation results for the SPEC CPU2000 integer tests in
> user-mode emulation on an x86_64 host.  There were 5 runs of each test on the
> reference data set.  The tables below show the runtime in seconds for all
> these runs.

I totally agree that this sort of optimization is needed in TCG.  Essentially
all RISC guests have the same problem.  When emulating one RISC upon another,
the problem may be exacerbated.  E.g. Sparc on PPC -- sparc will use a 21/11
bit split of the constant, ppc will use a 16/16 split of the constant, which
results in 3 insns in the generated code where 2 would do.
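
For illustration, a minimal sketch of the two splits (the field widths are
taken from the sentence above; the helper names are made up for this
example):

  #include <stdint.h>

  /* SPARC-style 21/11 split: high part set first, low part OR'ed in. */
  static void split_21_11(uint32_t c, uint32_t *hi, uint32_t *lo)
  {
      *lo = c & 0x7ff;      /* low 11 bits */
      *hi = c & ~0x7ffu;    /* high 21 bits, already in position */
  }

  /* PPC-style 16/16 split, e.g. for a lis/ori pair. */
  static void split_16_16(uint32_t c, uint32_t *hi, uint32_t *lo)
  {
      *lo = c & 0xffff;     /* low halfword */
      *hi = c >> 16;        /* high halfword */
  }

Because the two schemes cut the constant at different bit positions, the
pieces the guest front end produces do not map 1:1 onto what the host back
end can encode, hence the extra instruction.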

You should be aware of prior work in this area by Aurelien Jarno:

  git://git.aurel32.net/qemu.git tcg-optimizations

Given that's now 2 years old, and doesn't seem to be progressing, I hope your
patch series can get things going again...

Further optimizations that are enabled by this constant propagation include
propagating the address of an absolute memory read into the TLB lookup:

  git://repo.or.cz/git/qemu/rth.git tcg-const-addr-1

Also, enabling the tcg backend to store a constant to memory directly, rather
than loading the constant to a register first.  Sorry, I forget which branch
this is in, but it's one of Aurelien's.  This improves generated code density a
bit.  While it's only i386 that can store arbitrary constants directly to
memory, most of the hosts have a zero register that can be used.

Specific comments as followups to specific patches.


r~


* Re: [Qemu-devel] [PATCH 1/6] Add TCG optimizations stub
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 1/6] Add TCG optimizations stub Kirill Batuzov
@ 2011-05-20 18:12   ` Richard Henderson
  2011-05-20 18:33     ` Richard Henderson
  0 siblings, 1 reply; 34+ messages in thread
From: Richard Henderson @ 2011-05-20 18:12 UTC (permalink / raw)
  To: Kirill Batuzov; +Cc: mj.mccormack, qemu-devel, zhur

On 05/20/2011 05:39 AM, Kirill Batuzov wrote:
> +            i = (op == INDEX_op_call) ?
> +                (args[0] >> 16) + (args[0] & 0xffff) + 3 :
> +                def->nb_args;
> +            while (i) {
> +                *gen_args = *args;
> +                args++;
> +                gen_args++;
> +                i--;
> +            }

If you use the correct NOP, i.e. nop vs nop[123n], then 
I don't believe you need to compact the arguments like this.


r~


* Re: [Qemu-devel] [PATCH 2/6] Add copy and constant propagation.
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 2/6] Add copy and constant propagation Kirill Batuzov
@ 2011-05-20 18:22   ` Richard Henderson
  2011-05-20 18:46     ` Paolo Bonzini
  2011-05-20 19:41   ` Aurelien Jarno
  1 sibling, 1 reply; 34+ messages in thread
From: Richard Henderson @ 2011-05-20 18:22 UTC (permalink / raw)
  To: Kirill Batuzov; +Cc: mj.mccormack, qemu-devel, zhur

On 05/20/2011 05:39 AM, Kirill Batuzov wrote:
> +    static tcg_target_ulong vals[TCG_MAX_TEMPS];
> +    static tcg_temp_state state[TCG_MAX_TEMPS];

Any particular reason to make these static?  It's 4-6k on the host
stack depending on sizeof tcg_target_ulong.  Large, but not excessive.
I think it's just confusing to have them static, myself.

I think it might be clearer to have these two in a structure, rather
than two separate arrays.  That does waste a bit of memory in padding
for 64-bit target, but I think it might clean up the code a bit.

>      nb_temps = s->nb_temps;
>      nb_globals = s->nb_globals;
> +    memset(state, 0, nb_temps * sizeof(tcg_temp_state));

Of course, instead of allocating static structures, you could alloca
the memory in the appropriate size...
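
Something like this, say (just a sketch; alloca.h is where glibc declares
alloca, and the memset of the live portion stays as in the patch):

  #include <alloca.h>

  tcg_target_ulong *vals = alloca(nb_temps * sizeof(*vals));
  tcg_temp_state *state = alloca(nb_temps * sizeof(*state));
  memset(state, 0, nb_temps * sizeof(*state));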

> +        case INDEX_op_mov_i32:
> +#if TCG_TARGET_REG_BITS == 64
> +        case INDEX_op_mov_i64:
> +#endif
> +            if ((state[args[1]] == TCG_TEMP_COPY
> +                && vals[args[1]] == args[0])
> +                || args[0] == args[1]) {
> +                args += 2;
> +                gen_opc_buf[op_index] = INDEX_op_nop;
> +                break;

Here, for example, INDEX_op_nop2 would be more appropriate,
obviating the need for the arg copy loop from patch 1.

> +            if (state[args[1]] != TCG_TEMP_CONST) {
> +            } else {
> +                /* Source argument is constant.  Rewrite the operation and
> +                   let movi case handle it. */
> +            }

FWIW, I think writing positive tests is clearer than writing
negative tests.  I.e. reverse the condition and the if/else bodies.

>          case INDEX_op_brcond_i64:
>  #endif
> +            memset(state, 0, nb_temps * sizeof(tcg_temp_state));

Why are you resetting at the branch, rather than at the label?
Seems reasonable enough to handle the extended basic block when
possible...
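
Roughly, against the code in this patch (a sketch only; the argument
copying in each case stays as posted):

  case INDEX_op_set_label:
      /* Other paths may jump here, so forget everything we know. */
      memset(state, 0, nb_temps * sizeof(tcg_temp_state));
      /* ... copy the args through, as in the posted patch ... */
      break;

  case INDEX_op_brcond_i32:
      /* No flush: on the fall-through path the extended basic block
         continues and the tracked constants and copies stay valid. */
      /* ... copy the args through, as in the posted patch ... */
      break;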


r~


* Re: [Qemu-devel] [PATCH 1/6] Add TCG optimizations stub
  2011-05-20 18:12   ` Richard Henderson
@ 2011-05-20 18:33     ` Richard Henderson
  0 siblings, 0 replies; 34+ messages in thread
From: Richard Henderson @ 2011-05-20 18:33 UTC (permalink / raw)
  To: Kirill Batuzov; +Cc: mj.mccormack, qemu-devel, zhur

On 05/20/2011 11:12 AM, Richard Henderson wrote:
> On 05/20/2011 05:39 AM, Kirill Batuzov wrote:
>> +            i = (op == INDEX_op_call) ?
>> +                (args[0] >> 16) + (args[0] & 0xffff) + 3 :
>> +                def->nb_args;
>> +            while (i) {
>> +                *gen_args = *args;
>> +                args++;
>> +                gen_args++;
>> +                i--;
>> +            }
> 
> If you use the correct NOP, i.e. nop vs nop[123n], then 
> I don't believe you need to compact the arguments like this.

Bah, nevermind.  I forgot that we'd have to do something else
odd to replace n-operand operation opcodes with 2-operand movi.

I went back and saw Aurelien did this with a memmove in his
patch; it's probably more efficient to move pieces at a time
to fill in holes, as you do here.


r~


* Re: [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations.
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations Kirill Batuzov
@ 2011-05-20 18:37   ` Richard Henderson
  2011-05-26 12:36     ` Kirill Batuzov
  0 siblings, 1 reply; 34+ messages in thread
From: Richard Henderson @ 2011-05-20 18:37 UTC (permalink / raw)
  To: Kirill Batuzov; +Cc: mj.mccormack, qemu-devel, zhur

On 05/20/2011 05:39 AM, Kirill Batuzov wrote:
> +    case INDEX_op_sar_i32:
> +#if TCG_TARGET_REG_BITS == 64
> +        x &= 0xffffffff;
> +        y &= 0xffffffff;
> +#endif
> +        r = x & 0x80000000;
> +        x &= ~0x80000000;
> +        x >>= y;
> +        r |= r - (r >> y);
> +        x |= r;
> +        return x;
> +

Any reason you're emulating the 32-bit shift by
hand, rather than letting the compiler do it?  I.e.

  x = (int32_t)x >> (int32_t)y;
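
Spelled out for both widths, assuming TCGArg is unsigned as the patch's own
comment notes (a sketch):

  case INDEX_op_sar_i32:
      return (uint32_t)((int32_t)x >> (int32_t)y);

  #if TCG_TARGET_REG_BITS == 64
  case INDEX_op_sar_i64:
      return (int64_t)x >> (int64_t)y;
  #endif

The casts get the compiler to emit the arithmetic shift, and the manual
sign-bit juggling disappears.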


r~


* Re: [Qemu-devel] [PATCH 6/6] Do constant folding for unary operations.
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 6/6] Do constant folding for unary operations Kirill Batuzov
@ 2011-05-20 18:39   ` Richard Henderson
  0 siblings, 0 replies; 34+ messages in thread
From: Richard Henderson @ 2011-05-20 18:39 UTC (permalink / raw)
  To: Kirill Batuzov; +Cc: mj.mccormack, qemu-devel, zhur

On 05/20/2011 05:39 AM, Kirill Batuzov wrote:
> +    case INDEX_op_ext8s_i64:
> +        return x & (1 << 7) ? x | ~0xffULL : x & 0xff;
> +
> +    case INDEX_op_ext16s_i64:
> +        return x & (1 << 15) ? x | ~0xffffULL : x & 0xffff;
> +
> +    case INDEX_op_ext32s_i64:
> +        return x & (1U << 31) ? x | ~0xffffffffULL : x & 0xffffffff;

Likewise for letting the compiler help with appropriate casts.
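
I.e. something along these lines (a sketch; the sign bits propagate back
out through the integer conversion to the unsigned TCGArg return type):

  case INDEX_op_ext8s_i64:
      return (int8_t)x;

  case INDEX_op_ext16s_i64:
      return (int16_t)x;

  case INDEX_op_ext32s_i64:
      return (int32_t)x;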


r~


* Re: [Qemu-devel] [PATCH 4/6] Do constant folding for boolean operations.
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 4/6] Do constant folding for boolean operations Kirill Batuzov
@ 2011-05-20 18:45   ` Richard Henderson
  0 siblings, 0 replies; 34+ messages in thread
From: Richard Henderson @ 2011-05-20 18:45 UTC (permalink / raw)
  To: Kirill Batuzov; +Cc: mj.mccormack, qemu-devel, zhur

On 05/20/2011 05:39 AM, Kirill Batuzov wrote:
> +        case INDEX_op_or_i32:
> +        case INDEX_op_and_i32:
> +#if TCG_TARGET_REG_BITS == 64
> +        case INDEX_op_and_i64:
> +        case INDEX_op_or_i64:
> +#endif
> +            if (args[1] == args[2]) {
> +                if (args[1] == args[0]) {
> +                    args += 3;
> +                    gen_opc_buf[op_index] = INDEX_op_nop;
> +                } else {

I do wonder if it would be better to split this sort of optimization
out into a different function.  You're applying identity sorts of
functions here, where you're not doing it for other operations, 
such as x + 0.

Indeed, I'll argue that 0+x is more likely to happen than x|x, given
that the 0 value could have been relocation filled in by the linker.
Consider @hi16 and @lo16 relocation pairs when the symbol happens to
be linked into the low 64k of the address space.
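
For concreteness, such an identity rule could look roughly like this,
reusing the helpers from this series (a hypothetical sketch; only add is
shown, and the copy-state bookkeeping from the and/or case is elided):

  case INDEX_op_add_i32:
      /* add x, y, 0 degenerates into a copy of y into x. */
      if (state[args[2]] == TCG_TEMP_CONST && vals[args[2]] == 0) {
          gen_opc_buf[op_index] = mov_opc[op_bits(op) / 32 - 1];
          reset_temp(state, vals, args[0], nb_temps, nb_globals);
          gen_args[0] = args[0];
          gen_args[1] = args[1];
          gen_args += 2;
          args += 3;
          break;
      }
      /* the symmetric 0 + y check on args[1] would go here */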


r~


* Re: [Qemu-devel] [PATCH 2/6] Add copy and constant propagation.
  2011-05-20 18:22   ` Richard Henderson
@ 2011-05-20 18:46     ` Paolo Bonzini
  0 siblings, 0 replies; 34+ messages in thread
From: Paolo Bonzini @ 2011-05-20 18:46 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mj.mccormack, qemu-devel, zhur, Kirill Batuzov

On 05/20/2011 08:22 PM, Richard Henderson wrote:
> I think it might be clearer to have these two in a structure, rather
> than two separate arrays.  That does waste a bit of memory in padding
> for 64-bit target, but I think it might clean up the code a bit.

You can use the padding to implement a generation count.  O(1) clearing 
of the array might help performance somewhat.  At this point making the 
arrays static makes sense again, because it lets you bump the 
generation count at the beginning of tcg_constant_folding (if you put it 
on the stack you have to zero the state).

It could also help to add a num_copies count tracking how many registers 
are copies of this one.  If there are none, you can avoid processing the 
output registers in the default case.

That would be something like

struct {
     uint16_t state;
     uint16_t num_copies;
     uint32_t gen;
     tcg_target_ulong val;
};
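
A sketch of how the generation count then buys the O(1) clear (assuming the
struct above is named temp_info and kept in a static array):

  static struct temp_info info[TCG_MAX_TEMPS];
  static uint32_t cur_gen;

  /* "Clearing" the whole array becomes a single increment... */
  static void reset_all_temps(void)
  {
      cur_gen++;
  }

  /* ...and each access lazily treats stale entries as undefined. */
  static uint16_t temp_state(TCGArg t)
  {
      if (info[t].gen != cur_gen) {
          info[t].gen = cur_gen;
          info[t].state = TCG_TEMP_UNDEF;
          info[t].num_copies = 0;
      }
      return info[t].state;
  }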

Paolo


* Re: [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG
  2011-05-20 12:39 [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG Kirill Batuzov
                   ` (6 preceding siblings ...)
  2011-05-20 17:50 ` [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG Richard Henderson
@ 2011-05-20 19:35 ` Aurelien Jarno
  2011-05-21 12:47   ` Dmitry Zhurikhin
  2011-05-21 12:48   ` Aurelien Jarno
  7 siblings, 2 replies; 34+ messages in thread
From: Aurelien Jarno @ 2011-05-20 19:35 UTC (permalink / raw)
  To: Kirill Batuzov; +Cc: mj.mccormack, qemu-devel, zhur

On Fri, May 20, 2011 at 04:39:27PM +0400, Kirill Batuzov wrote:
> This series implements some basic machine-independent optimizations.  They
> simplify the code and allow liveness analysis to do its work better.
> 
> Suppose we have the following ARM code:
> 
>  movw    r12, #0xb6db
>  movt    r12, #0xdb6d
> 
> In TCG before optimizations we'll have:
> 
>  movi_i32 tmp8,$0xb6db
>  mov_i32 r12,tmp8
>  mov_i32 tmp8,r12
>  ext16u_i32 tmp8,tmp8
>  movi_i32 tmp9,$0xdb6d0000
>  or_i32 tmp8,tmp8,tmp9
>  mov_i32 r12,tmp8
> 
> And after optimizations we'll have this:
> 
>  movi_i32 r12,$0xdb6db6db
> 
> Here are performance evaluation results for the SPEC CPU2000 integer tests in
> user-mode emulation on an x86_64 host.  There were 5 runs of each test on the
> reference data set.  The tables below show the runtime in seconds for all
> these runs.

How are the tests done? Are they done with linux-user, or running the
executables in qemu-system-xxx?

> ARM guest without optimizations:
> Test name       #1       #2       #3       #4       #5    Median
> 164.gzip    1403.612 1403.499 1403.52  1208.55  1403.583 1403.52
> 175.vpr     1237.729 1238.008 1238.019 1176.852 1237.902 1237.902
> 176.gcc      929.511  928.867  929.048  928.927  928.792  928.927
> 181.mcf      196.371  196.335  196.172  197.057  196.196  196.335
> 186.crafty  1547.101 1547.293 1547.133 1547.037 1547.044 1547.101
> 197.parser  3804.336 3804.429 3804.412 3804.45  3804.301 3804.412
> 252.eon     2760.414 2760.45  2473.608 2760.606 2760.216 2760.414
> 253.perlbmk 2557.966 2558.971 2559.731 2479.299 2556.835 2557.966
> 256.bzip2   1296.412 1296.215 1296.63  1296.489 1296.092 1296.412
> 300.twolf   2919.496 2919.444 2919.529 2919.384 2919.404 2919.444
>       
> ARM guest with optimizations:
> Test name       #1       #2       #3       #4       #5    Median    Gain
> 164.gzip    1345.416 1401.741 1377.022 1401.737 1401.773 1401.737   0.13%
> 175.vpr     1116.75  1243.213 1243.32  1243.316 1243.144 1243.213  -0.43%
> 176.gcc      897.045  909.568  850.1    909.65   909.57   909.568   2.08%
> 181.mcf      199.058  198.717  198.28   198.866  197.955  198.717  -1.21%
> 186.crafty  1525.667 1526.663 1525.981 1525.995 1526.164 1525.995   1.36%
> 197.parser  3749.453 3749.522 3749.413 3749.5   3749.484 3749.484   1.44%
> 252.eon     2730.593 2746.525 2746.495 2746.493 2746.62  2746.495   0.50%
> 253.perlbmk 2577.341 2521.057 2578.461 2578.721 2581.313 2578.461  -0.80%
> 256.bzip2   1184.498 1190.116 1294.352 1294.554 1294.637 1294.352   0.16%
> 300.twolf   2894.264 2894.133 2894.398 2894.103 2894.146 2894.146   0.87%
> 
> 
> x86_64 guest without optimizations:
> Test name       #1       #2       #3       #4       #5    Median
> 164.gzip     858.118  858.151  858.09   858.139  858.122  858.122
> 175.vpr      956.361  956.465  956.521  956.438  956.705  956.465
> 176.gcc      647.275  647.465  647.186  647.294  647.268  647.275
> 181.mcf      219.239  221.964  220.244  220.74   220.559  220.559
> 186.crafty  1128.027 1128.071 1128.028 1128.115 1128.123 1128.071
> 197.parser  1815.669 1815.651 1815.669 1815.711 1815.759 1815.669
> 253.perlbmk 1777.143 1777.749 1667.508 1777.051 1778.391 1777.143
> 254.gap     1062.808 1062.758 1062.801 1063.099 1062.859 1062.808
> 255.vortex  1930.693 1930.706 1930.579 1930.7   1930.566 1930.693
> 256.bzip2   1014.566 1014.702 1014.6   1014.274 1014.421 1014.566
> 300.twolf   1342.653 1342.759 1344.092 1342.641 1342.794 1342.759
>      
> x86_64 guest with optimizations:
> Test name       #1       #2       #3       #4       #5    Median    Gain
> 164.gzip     857.485  857.457  857.475  857.509  857.507  857.485   0.07%
> 175.vpr      963.255  962.972  963.27   963.124  963.686  963.255  -0.71%
> 176.gcc      644.123  644.055  644.145  643.818  635.773  644.055   0.50%
> 181.mcf      216.215  217.549  218.744  216.437  217.83   217.549   1.36%
> 186.crafty  1128.873 1128.792 1128.871 1128.816 1128.823 1128.823  -0.07%
> 197.parser  1814.626 1814.503 1814.552 1814.602 1814.748 1814.602   0.06%
> 253.perlbmk 1758.056 1751.963 1753.267 1765.27  1759.828 1758.056   1.07%
> 254.gap     1064.702 1064.712 1064.629 1064.657 1064.645 1064.657  -0.17%
> 255.vortex  1760.638 1936.387 1937.871 1937.471 1760.496 1936.387  -0.29%
> 256.bzip2   1007.658 1007.682 1007.316 1007.982 1007.747 1007.682   0.68%
> 300.twolf   1334.139 1333.791 1333.795 1334.147 1333.732 1333.795   0.67%
> 
> The ARM guest versions of 254.gap and 255.vortex and the x86_64 guest version
> of 252.eon do not work under QEMU for some unrelated reason.
> 
> Kirill Batuzov (6):
>   Add TCG optimizations stub
>   Add copy and constant propagation.
>   Do constant folding for basic arithmetic operations.
>   Do constant folding for boolean operations.
>   Do constant folding for shift operations.
>   Do constant folding for unary operations.
> 
>  Makefile.target |    2 +-
>  tcg/optimize.c  |  539 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tcg/tcg.c       |    6 +
>  tcg/tcg.h       |    3 +
>  4 files changed, 549 insertions(+), 1 deletions(-)
>  create mode 100644 tcg/optimize.c
> 
> -- 
> 1.7.4.1
> 
> 
> 

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG
  2011-05-20 17:50 ` [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG Richard Henderson
@ 2011-05-20 19:37   ` Aurelien Jarno
  2011-05-20 23:31     ` Andreas Färber
  0 siblings, 1 reply; 34+ messages in thread
From: Aurelien Jarno @ 2011-05-20 19:37 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mj.mccormack, qemu-devel, zhur, Kirill Batuzov

On Fri, May 20, 2011 at 10:50:49AM -0700, Richard Henderson wrote:
> On 05/20/2011 05:39 AM, Kirill Batuzov wrote:
> > This series implements some basic machine-independent optimizations.  They
> > simplify the code and allow liveness analysis to do its work better.
> > 
> > Suppose we have the following ARM code:
> > 
> >  movw    r12, #0xb6db
> >  movt    r12, #0xdb6d
> > 
> > In TCG before optimizations we'll have:
> > 
> >  movi_i32 tmp8,$0xb6db
> >  mov_i32 r12,tmp8
> >  mov_i32 tmp8,r12
> >  ext16u_i32 tmp8,tmp8
> >  movi_i32 tmp9,$0xdb6d0000
> >  or_i32 tmp8,tmp8,tmp9
> >  mov_i32 r12,tmp8
> > 
> > And after optimizations we'll have this:
> > 
> >  movi_i32 r12,$0xdb6db6db
> > 
> > Here are performance evaluation results for the SPEC CPU2000 integer tests in
> > user-mode emulation on an x86_64 host.  There were 5 runs of each test on the
> > reference data set.  The tables below show the runtime in seconds for all
> > these runs.
> 
> I totally agree that this sort of optimization is needed in TCG.  Essentially
> all RISC guests have the same problem.  When emulating one RISC upon another,
> the problem may be exacerbated.  E.g. Sparc on PPC -- sparc will use a 21/11
> bit split of the constant, ppc will use a 16/16 split of the constant, which
> results in 3 insns in the generated code where 2 would do.
> 
> You should be aware of prior work in this area by Aurelien Jarno:
> 
>   git://git.aurel32.net/qemu.git tcg-optimizations
> 
> Given that's now 2 years old, and doesn't seem to be progressing, I hope your
> patch series can get things going again...

I basically stopped working on constant propagation, as while the TCG 
code looked nicer, the resulting code was always slower.

Since the discussion about TCG_AREG0, I have started to work again on
the register allocation (see the first patch series I sent about that);
I hope to have something ready by the end of the weekend.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [Qemu-devel] [PATCH 2/6] Add copy and constant propagation.
  2011-05-20 12:39 ` [Qemu-devel] [PATCH 2/6] Add copy and constant propagation Kirill Batuzov
  2011-05-20 18:22   ` Richard Henderson
@ 2011-05-20 19:41   ` Aurelien Jarno
  1 sibling, 0 replies; 34+ messages in thread
From: Aurelien Jarno @ 2011-05-20 19:41 UTC (permalink / raw)
  To: Kirill Batuzov; +Cc: mj.mccormack, qemu-devel, zhur

On Fri, May 20, 2011 at 04:39:29PM +0400, Kirill Batuzov wrote:
> Make tcg_constant_folding do copy and constant propagation.  This is
> preparatory work before the actual constant folding.
> 
> Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
> ---
>  tcg/optimize.c |  123 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 123 insertions(+), 0 deletions(-)
> 
> diff --git a/tcg/optimize.c b/tcg/optimize.c
> index cf31d18..a761c51 100644
> --- a/tcg/optimize.c
> +++ b/tcg/optimize.c
> @@ -31,22 +31,139 @@
>  #include "qemu-common.h"
>  #include "tcg-op.h"
>  
> +typedef enum {
> +    TCG_TEMP_UNDEF = 0,
> +    TCG_TEMP_CONST,
> +    TCG_TEMP_COPY,
> +    TCG_TEMP_ANY
> +} tcg_temp_state;
> +
> +static int mov_to_movi(int op)
> +{
> +    switch (op) {
> +    case INDEX_op_mov_i32: return INDEX_op_movi_i32;
> +#if TCG_TARGET_REG_BITS == 64
> +    case INDEX_op_mov_i64: return INDEX_op_movi_i64;
> +#endif
> +    default:
> +        fprintf(stderr, "Unrecognized operation %d in mov_to_movi.\n", op);
> +        tcg_abort();
> +    }
> +}
> +
> +/* Reset TEMP's state to TCG_TEMP_ANY.  If TEMP was the representative of some
> +   class of equivalent temps, a new representative should be chosen for that
> +   class. */
> +static void reset_temp(tcg_temp_state *state, tcg_target_ulong *vals,
> +                       TCGArg temp, int nb_temps, int nb_globals)
> +{
> +    int i;
> +    TCGArg new_base;
> +    new_base = (TCGArg)-1;
> +    for (i = nb_globals; i < nb_temps; i++) {
> +        if (state[i] == TCG_TEMP_COPY && vals[i] == temp) {
> +            if (new_base == ((TCGArg)-1)) {
> +                new_base = (TCGArg)i;
> +                state[i] = TCG_TEMP_ANY;
> +            } else {
> +                vals[i] = new_base;
> +            }
> +        }
> +    }
> +    for (i = 0; i < nb_globals; i++) {
> +        if (state[i] == TCG_TEMP_COPY && vals[i] == temp) {
> +            if (new_base == ((TCGArg)-1)) {
> +                state[i] = TCG_TEMP_ANY;
> +            } else {
> +                vals[i] = new_base;
> +            }
> +        }
> +    }
> +    state[temp] = TCG_TEMP_ANY;
> +}
> +
> +/* Propagate constants and copies, fold constant expressions. */
>  static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
>                                      TCGArg *args, TCGOpDef *tcg_op_defs)
>  {
>      int i, nb_ops, op_index, op, nb_temps, nb_globals;
>      const TCGOpDef *def;
>      TCGArg *gen_args;
> +    /* Array VALS has an element for each temp.
> +       If the temp holds a constant, its value is kept in the VALS element.
> +       If the temp is a copy of others, the representative of its
> +       equivalence class is kept in the VALS element.
> +       If the temp is neither a copy nor a constant, the corresponding
> +       VALS element is unused. */
> +    static tcg_target_ulong vals[TCG_MAX_TEMPS];
> +    static tcg_temp_state state[TCG_MAX_TEMPS];
>  
>      nb_temps = s->nb_temps;
>      nb_globals = s->nb_globals;
> +    memset(state, 0, nb_temps * sizeof(tcg_temp_state));
>  
>      nb_ops = tcg_opc_ptr - gen_opc_buf;
>      gen_args = args;
>      for (op_index = 0; op_index < nb_ops; op_index++) {
>          op = gen_opc_buf[op_index];
>          def = &tcg_op_defs[op];
> +        /* Do copy propagation */
> +        if (op != INDEX_op_call) {
> +            for (i = def->nb_oargs; i < def->nb_oargs + def->nb_iargs; i++) {
> +                if (state[args[i]] == TCG_TEMP_COPY
> +                    && !(def->args_ct[i].ct & TCG_CT_IALIAS)
> +                    && (def->args_ct[i].ct & TCG_CT_REG)) {
> +                    args[i] = vals[args[i]];
> +                }
> +            }
> +        }
> +
> +        /* Propagate constants through copy operations and do constant
> +           folding.  Constants will be substituted for arguments by the register
> +           allocator where needed and possible.  Also detect copies. */
>          switch (op) {
> +        case INDEX_op_mov_i32:
> +#if TCG_TARGET_REG_BITS == 64
> +        case INDEX_op_mov_i64:
> +#endif
> +            if ((state[args[1]] == TCG_TEMP_COPY
> +                && vals[args[1]] == args[0])
> +                || args[0] == args[1]) {
> +                args += 2;
> +                gen_opc_buf[op_index] = INDEX_op_nop;
> +                break;
> +            }
> +            if (state[args[1]] != TCG_TEMP_CONST) {
> +                reset_temp(state, vals, args[0], nb_temps, nb_globals);
> +                if (args[1] >= s->nb_globals) {
> +                    state[args[0]] = TCG_TEMP_COPY;
> +                    vals[args[0]] = args[1];
> +                }
> +                gen_args[0] = args[0];
> +                gen_args[1] = args[1];
> +                gen_args += 2;
> +                args += 2;
> +                break;
> +            } else {
> +                /* Source argument is constant.  Rewrite the operation and
> +                   let movi case handle it. */
> +                op = mov_to_movi(op);
> +                gen_opc_buf[op_index] = op;
> +                args[1] = vals[args[1]];
> +                /* fallthrough */
> +            }
> +        case INDEX_op_movi_i32:
> +#if TCG_TARGET_REG_BITS == 64
> +        case INDEX_op_movi_i64:
> +#endif
> +            reset_temp(state, vals, args[0], nb_temps, nb_globals);
> +            state[args[0]] = TCG_TEMP_CONST;
> +            vals[args[0]] = args[1];
> +            gen_args[0] = args[0];
> +            gen_args[1] = args[1];
> +            gen_args += 2;
> +            args += 2;
> +            break;
>          case INDEX_op_call:
>          case INDEX_op_jmp:
>          case INDEX_op_br:
> @@ -55,6 +172,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
>  #if TCG_TARGET_REG_BITS == 64
>          case INDEX_op_brcond_i64:
>  #endif
> +            memset(state, 0, nb_temps * sizeof(tcg_temp_state));
>              i = (op == INDEX_op_call) ?
>                  (args[0] >> 16) + (args[0] & 0xffff) + 3 :
>                  def->nb_args;
> @@ -66,6 +184,11 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
>              }
>              break;
>          default:
> +            /* Default case: we know nothing about the operation, so no
> +               propagation is done.  We only trash the output args.  */
> +            for (i = 0; i < def->nb_oargs; i++) {
> +                reset_temp(state, vals, args[i], nb_temps, nb_globals);
> +            }
>              for (i = 0; i < def->nb_args; i++) {
>                  gen_args[i] = args[i];
>              }

It's not possible to do any optimization across certain ops that have
side effects, or across helpers which access globals directly. For helpers
this can easily be detected by looking at the TCG_CALL_PURE and
TCG_CALL_CONST flags. For ops, you should look at the TCG_OPF_BB_END,
TCG_OPF_CALL_CLOBBER and TCG_OPF_SIDE_EFFECTS flags.
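
A rough sketch of the kind of guard this implies, assuming the flag
names above (which are real tcg.h flags) but a made-up helper; where
exactly the call-flags word lives in the args array is not shown here,
so it is left to the caller:

    /* Hypothetical predicate: must the optimizer forget everything it
       knows about temps before this op?  Illustration only. */
    static int op_resets_all_temps(int op, const TCGOpDef *def,
                                   TCGArg call_flags)
    {
        if (op == INDEX_op_call) {
            /* Pure/const helpers cannot read or clobber globals
               behind the optimizer's back. */
            return !(call_flags & (TCG_CALL_PURE | TCG_CALL_CONST));
        }
        /* Ops that end a basic block, clobber call registers, or have
           side effects invalidate recorded constants and copies. */
        return (def->flags & (TCG_OPF_BB_END | TCG_OPF_CALL_CLOBBER
                              | TCG_OPF_SIDE_EFFECTS)) != 0;
    }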

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG
  2011-05-20 19:37   ` Aurelien Jarno
@ 2011-05-20 23:31     ` Andreas Färber
  2011-05-21  9:37       ` Laurent Desnogues
  0 siblings, 1 reply; 34+ messages in thread
From: Andreas Färber @ 2011-05-20 23:31 UTC (permalink / raw)
  To: Aurelien Jarno
  Cc: mj.mccormack, Kirill Batuzov, qemu-devel, zhur, Richard Henderson

Am 20.05.2011 um 21:37 schrieb Aurelien Jarno:

> On Fri, May 20, 2011 at 10:50:49AM -0700, Richard Henderson wrote:
>> On 05/20/2011 05:39 AM, Kirill Batuzov wrote:
>>> This series implements some basic machine-independent  
>>> optimizations.  They
>>> simplify code and allow liveness analysis to do its work better.
>>>
>>> Suppose we have the following ARM code:
>>>
>>> movw    r12, #0xb6db
>>> movt    r12, #0xdb6d
>>>
>>> In TCG before optimizations we'll have:
>>>
>>> movi_i32 tmp8,$0xb6db
>>> mov_i32 r12,tmp8
>>> mov_i32 tmp8,r12
>>> ext16u_i32 tmp8,tmp8
>>> movi_i32 tmp9,$0xdb6d0000
>>> or_i32 tmp8,tmp8,tmp9
>>> mov_i32 r12,tmp8
>>>
>>> And after optimizations we'll have this:
>>>
>>> movi_i32 r12,$0xdb6db6db
>>>
>>> Here are performance evaluation results on SPEC CPU2000 integer tests in
>>> user-mode emulation on an x86_64 host.  There were 5 runs of each test on
>>> the reference data set.  The tables below show runtime in seconds for all
>>> these runs.
>>
>> I totally agree that this sort of optimization is needed in TCG.
>
> I basically stopped working on constant propagation, as while the TCG
> code looked nicer, the resulting code was always slower.

Has anyone evaluated reusing LLVM optimization passes for TCG? Or
maybe GIMPLE if there's an equivalent?

Andreas

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG
  2011-05-20 23:31     ` Andreas Färber
@ 2011-05-21  9:37       ` Laurent Desnogues
  2011-05-21 10:46         ` Aurelien Jarno
  0 siblings, 1 reply; 34+ messages in thread
From: Laurent Desnogues @ 2011-05-21  9:37 UTC (permalink / raw)
  To: Andreas Färber
  Cc: mj.mccormack, qemu-devel, zhur, Kirill Batuzov, Aurelien Jarno,
	Richard Henderson

On Sat, May 21, 2011 at 1:31 AM, Andreas Färber <andreas.faerber@web.de> wrote:
[...]
> Has anyone evaluated reusing LLVM optimization passes for TCG? Or maybe
> GIMPLE if there's an equivalent?

IMHO the qemu_ld/st semantics and the size of TBs will always limit
the usefulness of optimizations more involved than what already
exists in QEMU.

Basically qemu_ld/st ops prevent any code change across
them, which makes optimization opportunities rather scarce.
Considering they always take the fast path would surely
change the situation, but then the slow path would have to
be able to rebuild information;  certainly possible, but
quite involved too :-)

Once this qemu_ld/st change is done, you'll then need to
tackle the problem of too-short TBs; a way to do this is
to generate traces of frequently executed blocks and
consider them as a single block [1]; that would open the
door to many more code generation improvements.

This certainly doesn't mean Kirill's and Aurélien's changes
are not welcome, just that I think, in the current state,
such changes will always have a limited impact.


Laurent

[1] See for instance section 2.3 of Derek Bruening's
    PhD thesis about DynamoRIO
    http://www.burningcutlery.com/derek/phd.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG
  2011-05-21  9:37       ` Laurent Desnogues
@ 2011-05-21 10:46         ` Aurelien Jarno
  2011-05-21 17:53           ` Kirill Batuzov
  0 siblings, 1 reply; 34+ messages in thread
From: Aurelien Jarno @ 2011-05-21 10:46 UTC (permalink / raw)
  To: Laurent Desnogues
  Cc: mj.mccormack, qemu-devel, zhur, Kirill Batuzov,
	Andreas Färber, Richard Henderson

On Sat, May 21, 2011 at 11:37:49AM +0200, Laurent Desnogues wrote:
> On Sat, May 21, 2011 at 1:31 AM, Andreas Färber <andreas.faerber@web.de> wrote:
> [...]
> > Has anyone evaluated reusing LLVM optimization passes for TCG? Or maybe
> > GIMPL if there's an equivalent?
> 
> IMHO the qemu_ld/st semantics and the size of TBs will always limit
> the usefulness of optimizations more involved than what already
> exists in QEMU.
> 
> Basically qemu_ld/st ops prevent any code change across
> them, which makes optimization opportunities rather scarce.
> Considering they always take the fast path would surely
> change the situation, but then the slow path would have to
> be able to rebuild information;  certainly possible, but
> quite involved too :-)

Actually one way to improve this is to consider that the values in TCG
registers should be in sync with memory. This is the only thing needed
to get the CPU state correct in case the slow path is taken and
an exception is triggered.

This is not difficult to do, and it already allows a few more
optimizations across the qemu_ld/st ops, though not as many as we want.
However, with the current register allocator, qemu_ld/st ops have the
useful effect of saving globals back to their canonical location, and
thus freeing some registers. If you only sync them, all globals are
freed much later even when they are not used, which means plenty of
suboptimal register spills.
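
To make the sync/save distinction concrete, here is a rough sketch in
the style of tcg.c; the field and function names mirror the allocator's
conventions, but treat this as an illustration, not the actual code:

    /* Sync: store a global back to its canonical memory slot but keep
       the register binding, so later uses need no reload. */
    static void temp_sync(TCGContext *s, TCGTemp *ts)
    {
        if (!ts->mem_coherent) {
            tcg_out_st(s, ts->type, ts->reg, ts->mem_reg, ts->mem_offset);
            ts->mem_coherent = 1;
        }
    }

    /* Save: sync and additionally free the register, which is what
       qemu_ld/st effectively forces for every global today. */
    static void temp_save(TCGContext *s, TCGTemp *ts)
    {
        temp_sync(s, ts);
        s->reg_to_temp[ts->reg] = -1;
        ts->val_type = TEMP_VAL_MEM;
    }

The suggestion above amounts to calling only temp_sync() at a
qemu_ld/st and deferring temp_save() until the global's last use in
the TB.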

We definitely need to rewrite/improve the register allocator to save a
global back to memory when it is not used later in the TB. I am
currently working on that; I hope to have something ready soon.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG
  2011-05-20 19:35 ` Aurelien Jarno
@ 2011-05-21 12:47   ` Dmitry Zhurikhin
  2011-05-21 12:48   ` Aurelien Jarno
  1 sibling, 0 replies; 34+ messages in thread
From: Dmitry Zhurikhin @ 2011-05-21 12:47 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: mj.mccormack, qemu-devel, Kirill Batuzov

On 05/20/2011 11:35 PM, Aurelien Jarno wrote:
> 
> On Fri, May 20, 2011 at 04:39:27PM +0400, Kirill Batuzov wrote:
>> This series implements some basic machine-independent optimizations.  They
>> simplify code and allow liveness analysis to do its work better.
>>
>> Suppose we have the following ARM code:
>>
>>  movw    r12, #0xb6db
>>  movt    r12, #0xdb6d
>>
>> In TCG before optimizations we'll have:
>>
>>  movi_i32 tmp8,$0xb6db
>>  mov_i32 r12,tmp8
>>  mov_i32 tmp8,r12
>>  ext16u_i32 tmp8,tmp8
>>  movi_i32 tmp9,$0xdb6d0000
>>  or_i32 tmp8,tmp8,tmp9
>>  mov_i32 r12,tmp8
>>
>> And after optimizations we'll have this:
>>
>>  movi_i32 r12,$0xdb6db6db
>>
>> Here are performance evaluation results on SPEC CPU2000 integer tests in
>> user-mode emulation on an x86_64 host.  There were 5 runs of each test on
>> the reference data set.  The tables below show runtime in seconds for all these
>> runs.
> 
> How are the tests done? Are they done with linux-user, or running the
> executables in qemu-system-xxx?
They were run in user mode on a dedicated machine not doing anything
else.  We found system emulation too volatile for measuring anything.
Anyway, even in user mode on a dedicated machine there are some weird
performance jumps we can't explain, but overall SPEC seems stable
enough to show the influence of such changes in code generation.

> 
>> ...

	Dmitry

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG
  2011-05-20 19:35 ` Aurelien Jarno
  2011-05-21 12:47   ` Dmitry Zhurikhin
@ 2011-05-21 12:48   ` Aurelien Jarno
  1 sibling, 0 replies; 34+ messages in thread
From: Aurelien Jarno @ 2011-05-21 12:48 UTC (permalink / raw)
  To: Kirill Batuzov; +Cc: mj.mccormack, qemu-devel, zhur

On Fri, May 20, 2011 at 09:35:08PM +0200, Aurelien Jarno wrote:
> On Fri, May 20, 2011 at 04:39:27PM +0400, Kirill Batuzov wrote:
> > This series implements some basic machine-independent optimizations.  They
> > simplify code and allow liveness analysis to do its work better.
> > 
> > Suppose we have the following ARM code:
> > 
> >  movw    r12, #0xb6db
> >  movt    r12, #0xdb6d
> > 
> > In TCG before optimizations we'll have:
> > 
> >  movi_i32 tmp8,$0xb6db
> >  mov_i32 r12,tmp8
> >  mov_i32 tmp8,r12
> >  ext16u_i32 tmp8,tmp8
> >  movi_i32 tmp9,$0xdb6d0000
> >  or_i32 tmp8,tmp8,tmp9
> >  mov_i32 r12,tmp8
> > 
> > And after optimizations we'll have this:
> > 
> >  movi_i32 r12,$0xdb6db6db
> > 
> > Here are performance evaluation results on SPEC CPU2000 integer tests in
> > user-mode emulation on an x86_64 host.  There were 5 runs of each test on
> > the reference data set.  The tables below show runtime in seconds for all these
> > runs.
> 
> How are the tests done? Are they done with linux-user, or running the
> executables in qemu-system-xxx?
> 

Another point I forgot: the current TCG code already does a very simple
copy and constant propagation, so it might be a good idea to end the
series by cleaning up this code. There is no point in doing such an
optimisation twice, and it most probably takes a non-negligible amount
of time. This way you can provide benchmarks on the whole set.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG
  2011-05-21 10:46         ` Aurelien Jarno
@ 2011-05-21 17:53           ` Kirill Batuzov
  0 siblings, 0 replies; 34+ messages in thread
From: Kirill Batuzov @ 2011-05-21 17:53 UTC (permalink / raw)
  To: Aurelien Jarno
  Cc: mj.mccormack, qemu-devel, zhur, Andreas Färber,
	Laurent Desnogues, Richard Henderson

On 21.05.2011 14:46, Aurelien Jarno wrote:
> We definitely need to rewrite/improve the register allocator to save a
> global back to memory when it is not used later in the TB. I am
> currently working on that, I hope to have something ready soon.

I think I have this done already. The patches need cleaning up and
performance testing, but I'll post them (with an RFC tag) on Monday
nevertheless, to avoid duplicated work.

----
   Kirill.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations.
  2011-05-20 18:37   ` Richard Henderson
@ 2011-05-26 12:36     ` Kirill Batuzov
  2011-05-26 13:56       ` Richard Henderson
  0 siblings, 1 reply; 34+ messages in thread
From: Kirill Batuzov @ 2011-05-26 12:36 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mj.mccormack, qemu-devel, zhur



On Fri, 20 May 2011, Richard Henderson wrote:

> 
> On 05/20/2011 05:39 AM, Kirill Batuzov wrote:
> > +    case INDEX_op_sar_i32:
> > +#if TCG_TARGET_REG_BITS == 64
> > +        x &= 0xffffffff;
> > +        y &= 0xffffffff;
> > +#endif
> > +        r = x & 0x80000000;
> > +        x &= ~0x80000000;
> > +        x >>= y;
> > +        r |= r - (r >> y);
> > +        x |= r;
> > +        return x;
> > +
> 
> Any reason you're emulating the 32-bit shift by
> hand, rather than letting the compiler do it?  I.e.
> 
>   x = (int32_t)x >> (int32_t)y;
>
This expression has implementation-defined behavior according to
C99 6.5.7, so we decided to emulate signed shifts by hand.
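
For reference, the open-coded sequence in the patch amounts to the
helper below (a sketch, not the series' code; it assumes y < 32):

    #include <stdint.h>

    /* Arithmetic shift right on 32 bits without relying on the
       implementation-defined behaviour of >> on negative values. */
    static uint32_t sar32(uint32_t x, uint32_t y)
    {
        uint32_t sign = x & 0x80000000u;  /* isolate the sign bit */
        x = (x & ~0x80000000u) >> y;      /* logical shift of the rest */
        sign |= sign - (sign >> y);       /* mask of replicated sign bits */
        return x | sign;
    }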

----
  Kirill.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations.
  2011-05-26 12:36     ` Kirill Batuzov
@ 2011-05-26 13:56       ` Richard Henderson
  2011-05-26 19:14         ` Blue Swirl
  0 siblings, 1 reply; 34+ messages in thread
From: Richard Henderson @ 2011-05-26 13:56 UTC (permalink / raw)
  To: Kirill Batuzov; +Cc: mj.mccormack, qemu-devel, zhur

On 05/26/2011 05:36 AM, Kirill Batuzov wrote:
>>   x = (int32_t)x >> (int32_t)y;
>>
> This expression has implementation-defined behavior according to
> C99 6.5.7, so we decided to emulate signed shifts by hand.

Technically, yes.  In practice, no.  GCC, ICC, LLVM, MSVC all know
what the user wants here and will implement it "properly".


r~

  

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations.
  2011-05-26 13:56       ` Richard Henderson
@ 2011-05-26 19:14         ` Blue Swirl
  2011-05-26 20:10           ` Richard Henderson
  2011-05-27  7:09           ` Paolo Bonzini
  0 siblings, 2 replies; 34+ messages in thread
From: Blue Swirl @ 2011-05-26 19:14 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mj.mccormack, qemu-devel, zhur, Kirill Batuzov

On Thu, May 26, 2011 at 4:56 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 05/26/2011 05:36 AM, Kirill Batuzov wrote:
>>>   x = (int32_t)x >> (int32_t)y;
>>>
>> This expression has implementation-defined behavior according to
>> C99 6.5.7, so we decided to emulate signed shifts by hand.
>
> Technically, yes.  In practice, no.  GCC, ICC, LLVM, MSVC all know
> what the user wants here and will implement it "properly".

Can't this be probed by configure? Then a wrapper could be introduced
for signed shifts.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations.
  2011-05-26 19:14         ` Blue Swirl
@ 2011-05-26 20:10           ` Richard Henderson
  2011-05-26 20:25             ` Blue Swirl
  2011-05-27  7:09           ` Paolo Bonzini
  1 sibling, 1 reply; 34+ messages in thread
From: Richard Henderson @ 2011-05-26 20:10 UTC (permalink / raw)
  To: Blue Swirl; +Cc: mj.mccormack, qemu-devel, zhur, Kirill Batuzov

On 05/26/2011 12:14 PM, Blue Swirl wrote:
> On Thu, May 26, 2011 at 4:56 PM, Richard Henderson <rth@twiddle.net> wrote:
>> On 05/26/2011 05:36 AM, Kirill Batuzov wrote:
>>>>   x = (int32_t)x >> (int32_t)y;
>>>>
>>> This expression has implementation-defined behavior according to
>>> C99 6.5.7, so we decided to emulate signed shifts by hand.
>>
>> Technically, yes.  In practice, no.  GCC, ICC, LLVM, MSVC all know
>> what the user wants here and will implement it "properly".
> 
> Can't this be probed by configure? Then a wrapper could be introduced
> for signed shifts.

I don't see the point.  The C99 implementation-defined escape hatch
exists for weird CPUs, which we won't be supporting as a QEMU host.


r~

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations.
  2011-05-26 20:10           ` Richard Henderson
@ 2011-05-26 20:25             ` Blue Swirl
  2011-05-26 21:14               ` Richard Henderson
  0 siblings, 1 reply; 34+ messages in thread
From: Blue Swirl @ 2011-05-26 20:25 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mj.mccormack, qemu-devel, zhur, Kirill Batuzov

On Thu, May 26, 2011 at 11:10 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 05/26/2011 12:14 PM, Blue Swirl wrote:
>> On Thu, May 26, 2011 at 4:56 PM, Richard Henderson <rth@twiddle.net> wrote:
>>> On 05/26/2011 05:36 AM, Kirill Batuzov wrote:
>>>>>   x = (int32_t)x >> (int32_t)y;
>>>>>
>>>> This expression has implementation-defined behavior according to
>>>> C99 6.5.7, so we decided to emulate signed shifts by hand.
>>>
>>> Technically, yes.  In practice, no.  GCC, ICC, LLVM, MSVC all know
>>> what the user wants here and will implement it "properly".
>>
>> Can't this be probed by configure? Then a wrapper could be introduced
>> for signed shifts.
>
> I don't see the point.  The C99 implementation-defined escape hatch
> exists for weird CPUs, which we won't be supporting as a QEMU host.

Maybe not, but a compiler with this property could arrive. For
example, GCC developers could decide that since this weirdness is
allowed by the standard, it may be implemented as well.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations.
  2011-05-26 20:25             ` Blue Swirl
@ 2011-05-26 21:14               ` Richard Henderson
  2011-05-27 15:41                 ` Jamie Lokier
  2011-05-27 17:07                 ` Blue Swirl
  0 siblings, 2 replies; 34+ messages in thread
From: Richard Henderson @ 2011-05-26 21:14 UTC (permalink / raw)
  To: Blue Swirl; +Cc: mj.mccormack, qemu-devel, zhur, Kirill Batuzov

On 05/26/2011 01:25 PM, Blue Swirl wrote:
>> I don't see the point.  The C99 implementation-defined escape hatch
>> exists for weird CPUs, which we won't be supporting as a QEMU host.
> 
> Maybe not, but a compiler with this property could arrive. For
> example, GCC developers could decide that since this weirdness is
> allowed by the standard, it may be implemented as well.

If you like, you can write a configure test for it.  But, honestly,
essentially every place in qemu that uses shifts on signed types
would have to be audited.  Really.

The C99 hook exists to efficiently support targets that don't have
arithmetic shift operations.  Honestly.


r~

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations.
  2011-05-26 19:14         ` Blue Swirl
  2011-05-26 20:10           ` Richard Henderson
@ 2011-05-27  7:09           ` Paolo Bonzini
  1 sibling, 0 replies; 34+ messages in thread
From: Paolo Bonzini @ 2011-05-27  7:09 UTC (permalink / raw)
  To: Blue Swirl
  Cc: mj.mccormack, Kirill Batuzov, qemu-devel, zhur, Richard Henderson

On 05/26/2011 09:14 PM, Blue Swirl wrote:
>>>>   x = (int32_t)x >> (int32_t)y;
>>>>
>>> This expression has implementation-defined behavior according to
>>> C99 6.5.7, so we decided to emulate signed shifts by hand.
>>
>> Technically, yes.  In practice, no.  GCC, ICC, LLVM, MSVC all know
>> what the user wants here and will implement it "properly".
>
> Can't this be probed by configure? Then a wrapper could be introduced
> for signed shifts.

The reason for implementation-defined behavior is basically to allow for
non-two's-complement machines, which aren't really practical to support.

Paolo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations.
  2011-05-26 21:14               ` Richard Henderson
@ 2011-05-27 15:41                 ` Jamie Lokier
  2011-05-27 17:07                 ` Blue Swirl
  1 sibling, 0 replies; 34+ messages in thread
From: Jamie Lokier @ 2011-05-27 15:41 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Blue Swirl, mj.mccormack, qemu-devel, zhur, Kirill Batuzov

Richard Henderson wrote:
> On 05/26/2011 01:25 PM, Blue Swirl wrote:
> >> I don't see the point.  The C99 implementation-defined escape hatch
> >> exists for weird CPUs, which we won't be supporting as a QEMU host.
> > 
> > Maybe not, but a compiler with this property could arrive. For
> > example, GCC developers could decide that since this weirdness is
> > allowed by the standard, it may be implemented as well.
> 
> If you like, you can write a configure test for it.  But, honestly,
> essentially every place in qemu that uses shifts on signed types
> would have to be audited.  Really.

I agree, the chance of QEMU ever working, or needing to work, on a
non-two's-complement machine is pretty remote!

> The C99 hook exists to efficiently support targets that don't have
> arithmetic shift operations.  Honestly.

If you care, this should be portable without a configure test, as
constant folding should have the same behaviour:

    (((int32_t)-3 >> 1 == (int32_t)-2)
     ? (int32_t)x >> (int32_t)y
     : long_winded_portable_shift_right(x, y))
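
Wrapped up, that could look like this (a sketch; SAR32 is a made-up
macro name, long_winded_portable_shift_right stays as a placeholder
for a hand-rolled arithmetic shift, and int32_t/uint32_t come from
<stdint.h>):

    /* The probe is an integer constant expression, so the compiler
       folds the selection at compile time; no configure test needed. */
    #define SAR32(x, y)                                 \
        (((int32_t)-3 >> 1 == (int32_t)-2)              \
         ? (uint32_t)((int32_t)(x) >> (y))              \
         : long_winded_portable_shift_right((x), (y)))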

-- Jamie

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations.
  2011-05-26 21:14               ` Richard Henderson
  2011-05-27 15:41                 ` Jamie Lokier
@ 2011-05-27 17:07                 ` Blue Swirl
  2011-05-27 19:54                   ` Richard Henderson
  1 sibling, 1 reply; 34+ messages in thread
From: Blue Swirl @ 2011-05-27 17:07 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mj.mccormack, qemu-devel, zhur, Kirill Batuzov

On Fri, May 27, 2011 at 12:14 AM, Richard Henderson <rth@twiddle.net> wrote:
> On 05/26/2011 01:25 PM, Blue Swirl wrote:
>>> I don't see the point.  The C99 implementation-defined escape hatch
>>> exists for weird CPUs, which we won't be supporting as a QEMU host.
>>
>> Maybe not, but a compiler with this property could arrive. For
>> example, GCC developers could decide that since this weirdness is
>> allowed by the standard, it may be implemented as well.
>
> If you like, you can write a configure test for it.  But, honestly,
> essentially every place in qemu that uses shifts on signed types
> would have to be audited.  Really.

OK.

> The C99 hook exists to efficiently support targets that don't have
> arithmetic shift operations.  Honestly.

So it would be impossible for a compiler developer to change the logic
for shifts on some supported two's-complement CPUs (like x86)
just because it's legal?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations.
  2011-05-27 17:07                 ` Blue Swirl
@ 2011-05-27 19:54                   ` Richard Henderson
  0 siblings, 0 replies; 34+ messages in thread
From: Richard Henderson @ 2011-05-27 19:54 UTC (permalink / raw)
  To: Blue Swirl; +Cc: mj.mccormack, qemu-devel, zhur, Kirill Batuzov

On 05/27/2011 10:07 AM, Blue Swirl wrote:
>> The C99 hook exists to efficiently support targets that don't have
>> arithmetic shift operations.  Honestly.
> 
> So it would be impossible for a compiler developer to change the logic
> for shifts for some supported two's-complement logic CPUs (like x86)
> just because it's legal?

Not without being lynched, no.


r~

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2011-05-27 19:54 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-20 12:39 [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG Kirill Batuzov
2011-05-20 12:39 ` [Qemu-devel] [PATCH 1/6] Add TCG optimizations stub Kirill Batuzov
2011-05-20 18:12   ` Richard Henderson
2011-05-20 18:33     ` Richard Henderson
2011-05-20 12:39 ` [Qemu-devel] [PATCH 2/6] Add copy and constant propagation Kirill Batuzov
2011-05-20 18:22   ` Richard Henderson
2011-05-20 18:46     ` Paolo Bonzini
2011-05-20 19:41   ` Aurelien Jarno
2011-05-20 12:39 ` [Qemu-devel] [PATCH 3/6] Do constant folding for basic arithmetic operations Kirill Batuzov
2011-05-20 12:39 ` [Qemu-devel] [PATCH 4/6] Do constant folding for boolean operations Kirill Batuzov
2011-05-20 18:45   ` Richard Henderson
2011-05-20 12:39 ` [Qemu-devel] [PATCH 5/6] Do constant folding for shift operations Kirill Batuzov
2011-05-20 18:37   ` Richard Henderson
2011-05-26 12:36     ` Kirill Batuzov
2011-05-26 13:56       ` Richard Henderson
2011-05-26 19:14         ` Blue Swirl
2011-05-26 20:10           ` Richard Henderson
2011-05-26 20:25             ` Blue Swirl
2011-05-26 21:14               ` Richard Henderson
2011-05-27 15:41                 ` Jamie Lokier
2011-05-27 17:07                 ` Blue Swirl
2011-05-27 19:54                   ` Richard Henderson
2011-05-27  7:09           ` Paolo Bonzini
2011-05-20 12:39 ` [Qemu-devel] [PATCH 6/6] Do constant folding for unary operations Kirill Batuzov
2011-05-20 18:39   ` Richard Henderson
2011-05-20 17:50 ` [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG Richard Henderson
2011-05-20 19:37   ` Aurelien Jarno
2011-05-20 23:31     ` Andreas Färber
2011-05-21  9:37       ` Laurent Desnogues
2011-05-21 10:46         ` Aurelien Jarno
2011-05-21 17:53           ` Kirill Batuzov
2011-05-20 19:35 ` Aurelien Jarno
2011-05-21 12:47   ` Dmitry Zhurikhin
2011-05-21 12:48   ` Aurelien Jarno
