* X11 performance regressions
@ 2011-05-08 18:22 Knut Petersen
2011-05-09 16:53 ` Adam Jackson
2011-05-09 21:43 ` Chris Wilson
0 siblings, 2 replies; 14+ messages in thread
From: Knut Petersen @ 2011-05-08 18:22 UTC (permalink / raw)
To: intel-gfx
I compared the performance of X11 on two otherwise idle machines.
Hardware
========
Both have
identical mainboards (Aopen i915GMm-hfs),
identical memory and BIOS setup.
Both cpus are Intel Pentium M mobile (Dothan).
One runs at 1.86 GHz, the other at 2.00 GHz.
Software
========
1.86 GHz system:
opensuse 11.2
X.Org X Server 1.6.5
Release Date: 2009-10-11
kernel 2.6.38.5
2.00 GHz system:
opensuse 11.4
X.Org X Server 1.10.99
git-tree, 2011-may-7
kernel 2.6.39-rc4-drm-intel-staging
x11perf results
===============
In each pair, the first line is the result of the 2.00 GHz system with the current Xorg,
and the second line is the result of the 1.86 GHz system with Xorg 1.6.5. Here are a
few representative examples:
10000000 trep @ 0.0032 msec (309000.0/sec): Dot
40000000 trep @ 0.0006 msec (1650000.0/sec): Dot
45000 trep @ 0.5973 msec ( 1670.0/sec): 500x500 rectangle
100000 trep @ 0.4282 msec ( 2340.0/sec): 500x500 rectangle
2000000 reps @ 0.0034 msec (296000.0/sec): 1x1 stippled rectangle (8x8 stipple)
8000000 reps @ 0.0007 msec (1420000.0/sec): 1x1 stippled rectangle (8x8 stipple)
1500 trep @ 22.4602 msec ( 44.5/sec): 500x500 stippled rectangle (8x8 stipple)
3000 trep @ 9.2680 msec ( 108.0/sec): 500x500 stippled rectangle (8x8 stipple)
100000 trep @ 0.4043 msec ( 2470.0/sec): Fill 10x10 trapezoid
1000000 trep @ 0.0336 msec ( 29700.0/sec): Fill 10x10 trapezoid
The old X on the PC with the slower CPU is always significantly faster than the current git code,
very often more than 5 times as fast, and a number of tests show 1.6.5 to be more than 12 times
faster than 1.10.99.
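The ratios claimed above can be checked directly from the x11perf summary lines. The following is a minimal sketch, not part of the original report: it parses the "(N/sec)" field of each line and divides the old server's rate by the new server's rate. The helper names (parse_rate, speedups) are made up for illustration, and the sample lines are the ones quoted in this message.

```python
import re

# Matches x11perf summary lines such as
#   "10000000 trep @ 0.0032 msec (309000.0/sec): Dot"
LINE_RE = re.compile(
    r"^\s*\d+\s+\w+\s+@\s+[\d.]+\s+msec\s+\(\s*([\d.]+)/sec\):\s+(.*)$"
)


def parse_rate(line):
    """Return (test name, operations per second) for one x11perf line."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("not an x11perf summary line: %r" % line)
    return match.group(2).strip(), float(match.group(1))


def speedups(pairs):
    """Map test name to old_rate / new_rate for (new_line, old_line) pairs."""
    result = {}
    for new_line, old_line in pairs:
        name, new_rate = parse_rate(new_line)
        _, old_rate = parse_rate(old_line)
        result[name] = old_rate / new_rate
    return result


# First line of each pair: current Xorg; second line: Xorg 1.6.5.
SAMPLES = [
    ("10000000 trep @ 0.0032 msec (309000.0/sec): Dot",
     "40000000 trep @ 0.0006 msec (1650000.0/sec): Dot"),
    ("45000 trep @ 0.5973 msec ( 1670.0/sec): 500x500 rectangle",
     "100000 trep @ 0.4282 msec ( 2340.0/sec): 500x500 rectangle"),
    ("100000 trep @ 0.4043 msec ( 2470.0/sec): Fill 10x10 trapezoid",
     "1000000 trep @ 0.0336 msec ( 29700.0/sec): Fill 10x10 trapezoid"),
]

if __name__ == "__main__":
    for name, ratio in speedups(SAMPLES).items():
        print("%-25s old server is %.2fx as fast" % (name, ratio))
```

For the three samples this gives a factor a little above 5 for Dot, about 1.4 for the 500x500 rectangle, and about 12 for the 10x10 trapezoid, so the size of the gap varies a lot from test to test.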
I did not use any special configuration options at compile time.
1.10.99 was built using the following commands:
export PREFIX=/home/knut/local
export PKG_CONFIG_PATH=$PREFIX/lib/pkgconfig
export PATH=$PREFIX/bin:$PATH
export ACLOCAL="aclocal -I $PREFIX/share/aclocal"
export LD_LIBRARY_PATH=$PREFIX/lib
export PYTHONPATH=$PREFIX/lib/python2.7/site-packages
util/modular/build.sh -g $PREFIX
Could anybody please explain why the old server is so much faster?
Are there any compile time or runtime options that could/should be used?
cu,
Knut
* Re: X11 performance regressions
2011-05-08 18:22 X11 performance regressions Knut Petersen
@ 2011-05-09 16:53 ` Adam Jackson
2011-05-09 21:43 ` Chris Wilson
1 sibling, 0 replies; 14+ messages in thread
From: Adam Jackson @ 2011-05-09 16:53 UTC (permalink / raw)
To: Knut Petersen; +Cc: intel-gfx
On 5/8/11 2:22 PM, Knut Petersen wrote:
> Software
> =======
> 1.86 GHz system:
> opensuse 11.2
> X.Org X Server 1.6.5
> Release Date: 2009-10-11
> kernel 2.6.38.5
>
> 2.00 GHz system:
> opensuse 11.4
> X.Org X Server 1.10.99
> git-tree, 2011-may-7
> kernel 2.6.39-rc4-drm-intel-staging
I'd start by suspecting differences in .config for the kernel between
the two, particularly since...
> 10000000 trep @ 0.0032 msec (309000.0/sec): Dot
> 40000000 trep @ 0.0006 msec (1650000.0/sec): Dot
Dot dispatch is so completely CPU-dominated that I suspect you're simply
measuring CPU overhead somewhere else. For example, if one of those
kernels is built with spinlock debugging and the other isn't.
- ajax
* Re: X11 performance regressions
2011-05-08 18:22 X11 performance regressions Knut Petersen
2011-05-09 16:53 ` Adam Jackson
@ 2011-05-09 21:43 ` Chris Wilson
2011-05-11 14:46 ` Knut Petersen
1 sibling, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2011-05-09 21:43 UTC (permalink / raw)
To: Knut Petersen, intel-gfx
As a point of comparison, here are the similar results with master of all
the various trees on my 1.6GHz N450 (Atom+PineView) [so not strictly an
apples-to-apples comparison, your CPU is about 4-5x faster, but PNV is
about 3-4x faster than the 915GM (clock-for-clock)]:
On Sun, 08 May 2011 20:22:21 +0200, Knut Petersen <Knut_Petersen@t-online.de> wrote:
> 10000000 trep @ 0.0032 msec (309000.0/sec): Dot
> 40000000 trep @ 0.0006 msec (1650000.0/sec): Dot
50000000 trep @ 0.0005 msec (1830000.0/sec): Dot
*100000000 trep @ 0.0003 msec (2900000.0/sec): Dot
> 45000 trep @ 0.5973 msec ( 1670.0/sec): 500x500 rectangle
> 100000 trep @ 0.4282 msec ( 2340.0/sec): 500x500 rectangle
100000 trep @ 0.3210 msec ( 3120.0/sec): 500x500 rectangle
> 2000000 reps @ 0.0034 msec (296000.0/sec): 1x1 stippled rectangle (8x8 stipple)
> 8000000 reps @ 0.0007 msec (1420000.0/sec): 1x1 stippled rectangle (8x8 stipple)
25000000 trep @ 0.0011 msec (902000.0/sec): 1x1 stippled rectangle (8x8 stipple)
*30000000 trep @ 0.0008 msec (1180000.0/sec): 1x1 stippled rectangle (8x8 stipple)
> 1500 trep @ 22.4602 msec ( 44.5/sec): 500x500 stippled rectangle (8x8 stipple)
> 3000 trep @ 9.2680 msec ( 108.0/sec): 500x500 stippled rectangle (8x8 stipple)
4000 trep @ 6.8986 msec ( 145.0/sec): 500x500 stippled rectangle (8x8 stipple)
*3500 trep @ 7.0786 msec ( 141.0/sec): 500x500 stippled rectangle (8x8 stipple)
> 100000 trep @ 0.4043 msec ( 2470.0/sec): Fill 10x10 trapezoid
> 1000000 trep @ 0.0336 msec ( 29700.0/sec): Fill 10x10 trapezoid
2000000 trep @ 0.0152 msec ( 65700.0/sec): Fill 10x10 trapezoid
*4000000 trep @ 0.0064 msec (156000.0/sec): Fill 10x10 trapezoid
Hmm. My suspicion was that this was GEM-related regressions (the overhead
of dynamic buffer manager and relocations) along with various
optimizations for the common cases affecting the software fallback
dominated benchmarks selected above. And whilst there may be some element of
that behind the regression you're observing, I don't think that is the
whole story, and Adam is right to suggest checking that the systems are
indeed configured identically (wrt debug and optimisation options).
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: X11 performance regressions
2011-05-09 21:43 ` Chris Wilson
@ 2011-05-11 14:46 ` Knut Petersen
2011-05-11 17:52 ` Chris Wilson
2011-05-11 19:49 ` Adam Jackson
0 siblings, 2 replies; 14+ messages in thread
From: Knut Petersen @ 2011-05-11 14:46 UTC (permalink / raw)
To: intel-gfx
Yes, I made some mistakes during my first measurements.
Below find better results. They are made on the same machine,
with the same kernel, at the same speed, with the same x11perf
program, absolutely nothing changed.
I used x11perfcomp -ro and sorted the output, worst results for
the current git code first.
I think the numbers below are quite interesting ...
-Knut
System
======
AOpen i915GMm-hfs
Pentium M 2.00 GHz (Dothan) running at a fixed 2 GHz frequency, no thermal throttling
2GB RAM
1: Xorg of openSuSE 11.2 (absolute numbers)
===========================================
X.Org X Server 1.6.5
Release Date: 2009-10-11
X Protocol Version 11, Revision 0
Build Operating System: openSUSE SUSE LINUX
Current Operating System: Linux linux-iffr 2.6.38.5-kape #10 PREEMPT Fri May 6 17:41:06 CEST 2011 i686
Build Date: 23 September 2010 03:43:55PM
Binaries, as distributed by openSuSE
2: Xorg, fresh from git 10 May 2011 (relative performance)
==========================================================
X.Org X Server 1.10.99.1
Release Date: unreleased
X Protocol Version 11, Revision 0
Build Operating System: Linux 2.6.39-rc4-drm-intel-staging+ i686
Current Operating System: Linux linux-iffr 2.6.38.5-kape #10 PREEMPT Fri May 6 17:41:06 CEST 2011 i686
Kernel command line: root=/dev/hda2 acpi_enforce_resources=lax drm.debug=0x0
Build Date: 10 May 2011 04:43:21PM
Compiled without any special options using build.sh
1 2 Operation
-------- ------ ---------
965000.0 0.016 10x10 wide rectangle outline
164000.0 0.033 Fill 1x1 equivalent triangle
152000.0 0.034 Fill 1x1 trapezoid
175000.0 0.061 Fill 1x1 stippled trapezoid (161x145 stipple)
174000.0 0.062 Fill 1x1 opaque stippled trapezoid (161x145 stipple)
173000.0 0.062 Fill 1x1 opaque stippled trapezoid (17x15 stipple)
173000.0 0.062 Fill 1x1 opaque stippled trapezoid (8x8 stipple)
173000.0 0.062 Fill 1x1 stippled trapezoid (17x15 stipple)
173000.0 0.062 Fill 1x1 stippled trapezoid (8x8 stipple)
138000.0 0.073 Fill 1x1 tiled trapezoid (17x15 tile)
136000.0 0.074 Fill 1x1 tiled trapezoid (161x145 tile)
136000.0 0.074 Fill 1x1 tiled trapezoid (216x208 tile)
137000.0 0.074 Fill 1x1 tiled trapezoid (4x4 tile)
2670.0 0.088 100-pixel double-dashed ellipse
4170.0 0.092 100-pixel dashed ellipse
85300.0 0.11 Fill 10x10 opaque stippled trapezoid (161x145 stipple)
85800.0 0.11 Fill 10x10 stippled trapezoid (161x145 stipple)
76400.0 0.12 Fill 10x10 opaque stippled trapezoid (17x15 stipple)
74800.0 0.12 Fill 10x10 stippled trapezoid (17x15 stipple)
68800.0 0.13 Fill 10x10 opaque stippled trapezoid (8x8 stipple)
67200.0 0.13 Fill 10x10 stippled trapezoid (8x8 stipple)
34800000.0 0.14 1-pixel solid circle
42300.0 0.15 Fill 10x10 tiled trapezoid (161x145 tile)
41900.0 0.15 Fill 10x10 tiled trapezoid (216x208 tile)
4080.0 0.16 100-pixel wide double-dashed ellipse
26800.0 0.16 500x500 rectangle outline
38100.0 0.16 Fill 10x10 tiled trapezoid (17x15 tile)
36700.0 0.16 Fill 10x10 tiled trapezoid (4x4 tile)
24700000.0 0.17 1-pixel line
22200000.0 0.17 1-pixel line segment
27500.0 0.18 Fill 10x10 equivalent triangle
28300.0 0.18 Fill 10x10 trapezoid
190000.0 0.20 100x100 wide rectangle outline
5910.0 0.23 Fill 300x300 trapezoid
553000.0 0.24 Copy 10x10 from pixmap to pixmap
292000.0 0.25 100-pixel line segment (3 kids)
54600.0 0.25 10x10 rectangle outline
281000.0 0.26 100-pixel line segment (2 kids)
4670000.0 0.26 10-pixel horizontal line segment
114000.0 0.27 Fill 1x1 aa trap
198000.0 0.27 ShmPutImage 10x10 square
265000.0 0.28 100-pixel line segment (1 kid)
2980000.0 0.28 10-pixel dashed line
2220000.0 0.28 10-pixel dashed segment
2840000.0 0.28 10-pixel line
2010000.0 0.28 10-pixel line segment
21400.0 0.28 500-pixel circle
763.0 0.28 Fill 100x100 tiled trapezoid (161x145 tile)
632.0 0.28 Fill 100x100 tiled trapezoid (17x15 tile)
572.0 0.28 Fill 100x100 tiled trapezoid (4x4 tile)
15300.0 0.28 Fill 100x100 trapezoid
3960000.0 0.29 100-pixel horizontal line segment
299000.0 0.30 100-pixel dashed line
273000.0 0.30 100-pixel dashed segment
247000.0 0.30 100-pixel double-dashed segment
274000.0 0.30 100-pixel line
248000.0 0.30 100-pixel line segment
820000.0 0.30 1-pixel circle
5410.0 0.30 500-pixel filled ellipse
2840.0 0.30 500-pixel solid circle
272000.0 0.31 100-pixel double-dashed line
130000.0 0.31 10-pixel partial ellipse
154000.0 0.31 PutImage 10x10 square
1090000.0 0.32 10x10 tiled rectangle (161x145 tile)
1120000.0 0.32 10x10 tiled rectangle (216x208 tile)
12400.0 0.32 Fill 100x100 equivalent triangle
1220000.0 0.33 1x1 tiled rectangle (161x145 tile)
1220000.0 0.33 1x1 tiled rectangle (17x15 tile)
1220000.0 0.33 1x1 tiled rectangle (216x208 tile)
1220000.0 0.33 1x1 tiled rectangle (4x4 tile)
3540.0 0.33 500-pixel wide ellipse
792.0 0.33 Fill 100x100 tiled trapezoid (216x208 tile)
87200.0 0.33 Fill 2x1 aa trap
552000.0 0.34 10x10 tiled rectangle (17x15 tile)
263000.0 0.34 Fill 1x1 aa trap with 1 bit alpha
88.4 0.34 Fill 300x300 tiled trapezoid (161x145 tile)
125000.0 0.36 10-pixel ellipse
71.5 0.38 Fill 300x300 tiled trapezoid (17x15 tile)
1680000.0 0.39 100-pixel vertical line segment
54200.0 0.39 100x100 rectangle outline
65.0 0.39 Fill 300x300 tiled trapezoid (4x4 tile)
33900.0 0.40 100-pixel circle
147000.0 0.40 10x10 tiled rectangle (4x4 tile)
103.0 0.41 500x500 tiled rectangle (4x4 tile)
35300.0 0.42 100-pixel partial circle
1780.0 0.42 100-pixel wide dashed ellipse
3520.0 0.42 100x100 tiled rectangle (4x4 tile)
56200.0 0.42 500-pixel line
5200.0 0.42 500x500 wide rectangle outline
11300.0 0.42 GetImage 10x10 square
90.5 0.44 Fill 300x300 tiled trapezoid (216x208 tile)
12900.0 0.45 100-pixel wide ellipse
50800.0 0.45 500-pixel line segment
1820000.0 0.46 10x10 rectangle
1450000.0 0.46 1x1 opaque stippled rectangle (8x8 stipple)
1570.0 0.46 ShmPutImage 500x500 square
23800.0 0.47 100x100 tiled rectangle (17x15 tile)
5730.0 0.47 Fill 100x100 opaque stippled trapezoid (161x145 stipple)
5210.0 0.47 Fill 100x100 stippled trapezoid (161x145 stipple)
122000.0 0.48 10-pixel partial circle
78600.0 0.49 100x100 rectangle
1860000.0 0.49 10-pixel vertical line segment
54300.0 0.49 10x1 wide horizontal line segment
54400.0 0.49 10x1 wide vertical line segment
1420000.0 0.50 1x1 opaque stippled rectangle (17x15 stipple)
1440000.0 0.50 1x1 stippled rectangle (17x15 stipple)
1450000.0 0.50 1x1 stippled rectangle (8x8 stipple)
1420000.0 0.51 1x1 stippled rectangle (161x145 stipple)
691.0 0.51 500x500 tiled rectangle (17x15 tile)
3330.0 0.52 100-pixel dashed circle
1400000.0 0.52 1x1 opaque stippled rectangle (161x145 stipple)
1830000.0 0.52 1x1 rectangle
2330000.0 0.52 500-pixel horizontal line segment
4020.0 0.52 Fill 100x100 opaque stippled trapezoid (17x15 stipple)
2190.0 0.53 100-pixel double-dashed circle
2300000.0 0.53 500-pixel vertical line segment
2540.0 0.53 500-pixel wide circle
1810000.0 0.53 Dot
15300.0 0.54 100-pixel partial ellipse
26100.0 0.54 10-pixel wide partial ellipse
182.0 0.54 500x500 opaque stippled rectangle (17x15 stipple)
3060.0 0.54 Fill 100x100 stippled trapezoid (17x15 stipple)
15400.0 0.54 GetProperty
15500.0 0.54 QueryPointer
4150.0 0.56 100-pixel wide double-dashed circle
105000.0 0.56 10-pixel circle
10200.0 0.60 100-pixel ellipse
10200.0 0.60 500x50 wide vertical line segment
705.0 0.60 Fill 300x300 stippled trapezoid (161x145 stipple)
1480000.0 0.60 Unmap window via parent (50 kids)
10300.0 0.61 500x50 wide horizontal line segment
2530.0 0.61 Fill 100x100 opaque stippled trapezoid (8x8 stipple)
848.0 0.61 Fill 300x300 opaque stippled trapezoid (161x145 stipple)
21700.0 0.61 ShmPutImage 100x100 square
2240.0 0.62 Fill 100x100 stippled trapezoid (8x8 stipple)
386.0 0.62 Fill 300x300 stippled trapezoid (17x15 stipple)
551.0 0.63 Fill 300x300 opaque stippled trapezoid (17x15 stipple)
130000.0 0.64 Fill 10x10 aa trap with 1 bit alpha
4080.0 0.65 100x100 opaque stippled rectangle (17x15 stipple)
296.0 0.65 500x500 stippled rectangle (161x145 stipple)
2200.0 0.66 500-pixel ellipse
341.0 0.67 500x500 opaque stippled rectangle (161x145 stipple)
4610.0 0.67 500x50 wide line
15200.0 0.68 Fill 1x1 aa trap with 4 bit alpha
325.0 0.69 Fill 300x300 opaque stippled trapezoid (8x8 stipple)
1650000.0 0.70 Unmap window via parent (200 kids)
6750.0 0.71 100x100 opaque stippled rectangle (161x145 stipple)
54800.0 0.71 10-pixel fill chord partial ellipse
6290.0 0.73 100x100 stippled rectangle (161x145 stipple)
175.0 0.74 500x500 stippled rectangle (17x15 stipple)
12700.0 0.74 Fill 10x10 aa trap
275.0 0.75 Fill 300x300 stippled trapezoid (8x8 stipple)
109.0 0.76 500x500 opaque stippled rectangle (8x8 stipple)
1130000.0 0.76 Circulate Unmapped window (200 kids)
14500.0 0.78 Fill 10x10 aa trapezoid
15300.0 0.78 Fill 1x1 aa trapezoid
10200.0 0.78 Fill 2x10 aa trap
9180.0 0.78 PutImage 100x100 square
15100.0 0.79 100-pixel solid circle
48000.0 0.80 10-pixel fill slice partial ellipse
33400.0 0.80 10x1 wide line
2350.0 0.80 500x500 rectangle
0.5 0.80 PutImage XY 500x500 square
0.5 0.80 ShmPutImage XY 500x500 square
18400.0 0.81 100-pixel filled ellipse
2590.0 0.82 100x100 opaque stippled rectangle (8x8 stipple)
3900.0 0.82 100x100 stippled rectangle (17x15 stipple)
6930.0 0.82 Fill 10x10 aa trap with 4 bit alpha
927.0 0.83 500x500 tiled rectangle (216x208 tile)
219000.0 0.86 10x10 opaque stippled rectangle (161x145 stipple)
140000.0 0.86 Copy 10x10 from window to pixmap
16300.0 0.87 100-pixel fill slice partial circle
28300.0 0.87 100x100 tiled rectangle (216x208 tile)
30900.0 0.87 10-pixel wide ellipse
23400.0 0.87 10-pixel wide partial circle
859.0 0.87 500x500 tiled rectangle (161x145 tile)
462.0 0.87 PutImage 500x500 square
17600.0 0.88 100-pixel fill chord partial circle
145000.0 0.88 10x10 opaque stippled rectangle (8x8 stipple)
143000.0 0.88 Copy 10x10 from pixmap to window
6530.0 0.89 100-pixel wide partial circle
28100.0 0.89 100x100 tiled rectangle (161x145 tile)
138000.0 0.89 Composite 10x10 from pixmap to window
1470.0 0.90 Fill 100x100 aa trap
1350.0 0.90 Fill 100x100 aa trap with 4 bit alpha
1930.0 0.90 GetImage XY 10x10 square
14200.0 0.92 100x10 wide vertical line segment
4460.0 0.92 Fill 100x100 aa trapezoid
41700.0 0.93 10-pixel filled ellipse
463.0 0.93 Fill 300x300 aa trap with 4 bit alpha
1350000.0 0.93 Move window via parent (200 kids)
14300.0 0.94 100x10 wide horizontal line segment
4810.0 0.94 Fill 300x300 aa pre-added trapezoid
476.0 0.94 Fill 300x300 aa trap
110.0 0.95 500x500 stippled rectangle (8x8 stipple)
16800.0 0.96 Fill 100x100 aa pre-added trapezoid
1140.0 0.96 PutImage XY 10x10 square
1570000.0 0.96 Resize unmapped window (4 kids)
22100.0 0.97 100-pixel fill chord partial ellipse
155000.0 0.97 Fill 10x10 aa pre-added trapezoid
1040.0 0.97 Fill 2x100 aa trap
1660000.0 0.97 Moved unmapped window (16 kids)
1660000.0 0.97 Moved unmapped window (25 kids)
1190000.0 0.97 Move window via parent (100 kids)
11.8 0.97 PutImage XY 100x100 square
11.4 0.97 ShmPutImage XY 100x100 square
1670000.0 0.97 Unmap window via parent (100 kids)
173000.0 0.98 10x10 opaque stippled rectangle (17x15 stipple)
926000.0 0.98 Fill 1x1 aa pre-added trapezoid
346.0 0.98 Fill 2x300 aa trap
57600.0 0.98 Hide/expose window via popup (4 kids)
1630000.0 0.98 Moved unmapped window (100 kids)
1630000.0 0.98 Moved unmapped window (200 kids)
1650000.0 0.98 Moved unmapped window (4 kids)
1640000.0 0.98 Moved unmapped window (50 kids)
1640000.0 0.98 Moved unmapped window (75 kids)
1210.0 0.98 Scroll 500x500 pixels
574.0 0.99 Copy 100x100 n-bit deep plane
867.0 0.99 Copy 500x500 from pixmap to pixmap
23.3 0.99 Copy 500x500 n-bit deep plane
24.7 0.99 GetImage XY 100x100 square
16600.0 0.99 Move window (200 kids)
1560000.0 0.99 Resize unmapped window (16 kids)
1530000.0 0.99 Resize unmapped window (200 kids)
1560000.0 0.99 Resize unmapped window (25 kids)
1550000.0 0.99 Resize unmapped window (50 kids)
1050.0 0.99 ShmPutImage XY 10x10 square
266000.0 1.00 Char in 30-char aa line (Charter 24)
265000.0 1.00 Char in 30-char a line (Charter 24)
508000.0 1.00 Char in 30-char image line (TR 24)
869.0 1.00 Composite 500x500 from window to window
870.0 1.00 Copy 500x500 from window to window
1.0 1.00 GetImage XY 500x500 square
1530000.0 1.00 Resize unmapped window (100 kids)
20000.0 1.01 100-pixel fill slice partial ellipse
231000.0 1.01 10x10 stippled rectangle (161x145 stipple)
19900.0 1.01 Composite 100x100 from pixmap to window
19600.0 1.01 Composite 100x100 from window to window
851.0 1.01 Composite 500x500 from pixmap to window
19800.0 1.01 Copy 100x100 from pixmap to pixmap
19900.0 1.01 Copy 100x100 from pixmap to window
20000.0 1.01 Copy 100x100 from window to pixmap
19600.0 1.01 Copy 100x100 from window to window
851.0 1.01 Copy 500x500 from pixmap to window
1530000.0 1.01 Resize unmapped window (75 kids)
10300.0 1.02 100-pixel wide circle
108000.0 1.02 Char in 80-char rgb core line (Charter 10)
849.0 1.02 Copy 500x500 from window to pixmap
169000.0 1.03 10x10 stippled rectangle (17x15 stipple)
20700.0 1.03 Move window (100 kids)
26900.0 1.03 Scroll 100x100 pixels
255000.0 1.04 Char16 in 23-char line (k24)
1720000.0 1.04 Char in 80-char image line (TR 10)
37200.0 1.04 Circulate window (4 kids)
25200.0 1.04 Move window (25 kids)
23200.0 1.04 Move window (50 kids)
21900.0 1.04 Move window (75 kids)
2540.0 1.05 100x100 stippled rectangle (8x8 stipple)
41700.0 1.05 Copy 10x10 n-bit deep plane
25800.0 1.05 Move window (16 kids)
119000.0 1.06 Char in 80-char a core line (Charter 10)
2010000.0 1.07 Circulate Unmapped window (100 kids)
2250000.0 1.07 Circulate Unmapped window (75 kids)
27600.0 1.07 Move window (4 kids)
377000.0 1.07 Move window via parent (16 kids)
1050000.0 1.07 Move window via parent (75 kids)
626000.0 1.08 Char16 in 40-char line (k14)
118000.0 1.08 Char in 80-char aa core line (Charter 10)
534000.0 1.08 Move window via parent (25 kids)
108000.0 1.08 Move window via parent (4 kids)
471000.0 1.09 Char16 in 40-char image line (k14)
23700.0 1.09 Resize window (200 kids)
1030000.0 1.10 Char in 60-char image line (9x15)
1400000.0 1.10 Char in 80-char image line (6x13)
2610000.0 1.10 Circulate Unmapped window (50 kids)
756.0 1.10 Fill 300x300 aa trapezoid
30000.0 1.10 Resize window (100 kids)
134000.0 1.11 10x10 stippled rectangle (8x8 stipple)
833000.0 1.11 Char in 30-char line (TR 24)
34800.0 1.11 Resize window (50 kids)
32100.0 1.11 Resize window (75 kids)
308000.0 1.11 Unmap window via parent (4 kids)
176000.0 1.12 Char16 in 23-char image line (k24)
1200000.0 1.12 Char in 70-char image line (8x13)
634000.0 1.12 Char in 80-char rgb line (Courier 12)
902000.0 1.12 Unmap window via parent (16 kids)
5790.0 1.13 100-pixel wide partial ellipse
314000.0 1.13 Char16 in 7/14/7 line (k14, k24)
1570000.0 1.13 Char in 60-char line (9x15)
39800.0 1.13 Resize window (16 kids)
21100.0 1.14 Char in 30-char rgb core line (Charter 24)
1940000.0 1.14 Char in 80-char line (6x13)
3010000.0 1.14 Circulate Unmapped window (25 kids)
37900.0 1.14 Resize window (25 kids)
80700.0 1.15 Char in 80-char rgb core line (Courier 12)
74000.0 1.15 Map window via parent (4 kids)
42800.0 1.15 Resize window (4 kids)
1140000.0 1.15 Unmap window via parent (25 kids)
1610000.0 1.15 Unmap window via parent (75 kids)
10700.0 1.16 100x10 wide line
1750000.0 1.16 Char in 70-char line (8x13)
822000.0 1.16 Move window via parent (50 kids)
38100.0 1.17 10-pixel fill chord partial circle
87400.0 1.18 Char in 80-char aa core line (Courier 12)
2160000.0 1.18 Char in 80-char line (TR 10)
19500.0 1.18 Circulate window (200 kids)
22600.0 1.19 Char in 30-char aa core line (Charter 24)
22600.0 1.19 Char in 30-char a core line (Charter 24)
172000.0 1.19 Char in 30-char rgb line (Charter 24)
185000.0 1.19 Destroy window via parent (4 kids)
95200.0 1.19 Hide/expose window via popup (16 kids)
35300.0 1.20 10-pixel fill slice partial circle
21900.0 1.20 Circulate window (100 kids)
25000.0 1.20 Circulate window (16 kids)
87500.0 1.21 Char in 80-char a core line (Courier 12)
3190000.0 1.21 Circulate Unmapped window (16 kids)
24100.0 1.21 Circulate window (25 kids)
8150000.0 1.21 X protocol NoOperation
411000.0 1.22 Destroy window via parent (25 kids)
1620000.0 1.23 Char in 20/40/20 line (6x13, TR 10)
23100.0 1.23 Circulate window (50 kids)
22500.0 1.23 Circulate window (75 kids)
3430000.0 1.25 Circulate Unmapped window (4 kids)
465000.0 1.25 Destroy window via parent (50 kids)
3550.0 1.26 100x10 wide double-dashed line
13500.0 1.27 Fill 100x100 64-gon (Convex)
759.0 1.27 GetImage 100x100 square
374000.0 1.28 Destroy window via parent (16 kids)
774000.0 1.30 Char in 80-char a line (Courier 12)
771000.0 1.32 Char in 80-char aa line (Courier 12)
513000.0 1.32 Destroy window via parent (100 kids)
108000.0 1.33 Hide/expose window via popup (50 kids)
109000.0 1.33 Map window via parent (16 kids)
761000.0 1.34 Char in 80-char rgb line (Charter 10)
97400.0 1.34 Hide/expose window via popup (25 kids)
1380.0 1.36 100-pixel wide dashed circle
494000.0 1.38 Destroy window via parent (200 kids)
490000.0 1.38 Destroy window via parent (75 kids)
115000.0 1.38 Hide/expose window via popup (100 kids)
112000.0 1.38 Hide/expose window via popup (75 kids)
3030.0 1.39 100x10 wide dashed line
12100.0 1.41 Fill 100x100 64-gon (Complex)
117000.0 1.42 Hide/expose window via popup (200 kids)
103000.0 1.43 Create and map subwindows (100 kids)
80900.0 1.43 Create and map subwindows (4 kids)
101000.0 1.47 Create and map subwindows (16 kids)
9970.0 1.48 Fill 100x100 equivalent complex polygons
120000.0 1.48 Map window via parent (50 kids)
105000.0 1.49 Create and map subwindows (75 kids)
29700.0 1.50 10-pixel solid circle
139000.0 1.50 Change graphics context
515000.0 1.50 Create unmapped window (200 kids)
509000.0 1.50 Create unmapped window (50 kids)
514000.0 1.50 Create unmapped window (75 kids)
507000.0 1.51 Create unmapped window (16 kids)
126000.0 1.51 Map window via parent (100 kids)
110000.0 1.51 Map window via parent (25 kids)
81700.0 1.52 Copy 10x10 from window to window
102000.0 1.52 Create and map subwindows (25 kids)
103000.0 1.52 Create and map subwindows (50 kids)
81000.0 1.53 Composite 10x10 from window to window
101000.0 1.53 Create and map subwindows (200 kids)
126000.0 1.53 Map window via parent (200 kids)
81300.0 1.53 Scroll 10x10 pixels
27200.0 1.54 10-pixel wide circle
122000.0 1.56 Map window via parent (75 kids)
28300.0 1.57 Fill 100x100 aa trap with 1 bit alpha
515000.0 1.59 Create unmapped window (100 kids)
488000.0 1.66 Create unmapped window (25 kids)
773000.0 1.75 Char in 80-char a line (Charter 10)
766000.0 1.76 Char in 80-char aa line (Charter 10)
29.4 1.89 GetImage 500x500 square
413000.0 1.91 Create unmapped window (4 kids)
107000.0 2.20 Copy 10x10 1-bit deep plane
392.0 2.48 Copy 500x500 1-bit deep plane
7040.0 2.59 Copy 100x100 1-bit deep plane
25500.0 3.22 Fill 10x10 64-gon (Complex)
25900.0 3.39 Fill 10x10 64-gon (Convex)
26100.0 3.87 Fill 10x10 equivalent complex polygon
5040.0 4.82 Fill 300x300 aa trap with 1 bit alpha
* Re: X11 performance regressions
2011-05-11 14:46 ` Knut Petersen
@ 2011-05-11 17:52 ` Chris Wilson
2011-05-12 7:19 ` Knut Petersen
2011-05-11 19:49 ` Adam Jackson
1 sibling, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2011-05-11 17:52 UTC (permalink / raw)
To: Knut Petersen, intel-gfx
On Wed, 11 May 2011 16:46:12 +0200, Knut Petersen <Knut_Petersen@t-online.de> wrote:
> Yes, I made some mistakes during my first measurements.
>
> Below find better results. They are made on the same machine,
> with the same kernel, at the same speed, with the same x11perf
> program, absolutely nothing changed.
>
> I used x11perfcomp -ro and sorted the output, worst results for
> the current git code first.
>
> I think the numbers below are quite interesting ...
> 1 2 Operation
> -------- ------ ---------
> 965000.0 0.016 10x10 wide rectangle outline
Something is still not quite right here. This should be mostly CPU bound,
and even my Atom gets 734k.
Can you check that (a) it is CPU bound and (b) the worst offenders
according to the system profiler of your choice (e.g. perf)?
Thanks for doing this investigation.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: X11 performance regressions
2011-05-11 14:46 ` Knut Petersen
2011-05-11 17:52 ` Chris Wilson
@ 2011-05-11 19:49 ` Adam Jackson
2011-05-11 21:22 ` Knut Petersen
1 sibling, 1 reply; 14+ messages in thread
From: Adam Jackson @ 2011-05-11 19:49 UTC (permalink / raw)
To: Knut Petersen; +Cc: intel-gfx
On Wed, 2011-05-11 at 16:46 +0200, Knut Petersen wrote:
> Yes, I made some mistakes during my first measurements.
>
> Below find better results. They are made on the same machine,
> with the same kernel, at the same speed, with the same x11perf
> program, absolutely nothing changed.
You don't mention whether the 2d driver varies; I assume it does at
least to the extent of rebuilding for new ABI. Or libdrm, although
that's really a 1% kind of thing.
> I think the numbers below are quite interesting ...
I still wager they're more about the environment than about the driver
proper, there's just too many weird things going on in your results. For
example:
> 198000.0 0.27 ShmPutImage 10x10 square
> 1570.0 0.46 ShmPutImage 500x500 square
> 21700.0 0.61 ShmPutImage 100x100 square
This is essentially a memcpy benchmark. Something has to be very wrong
for that much variation to happen, and my guess would be something like
failing to inline memcpy or pick sufficiently macho optimized versions.
I'd be interested to see what your CFLAGS from build.sh ended up being,
relative to what opensuse gives for 'rpm --eval "%{optflags}"'.
One cool thing you can do from memcpy benchmarks like this is
extrapolate a bandwidth number. Your fast numbers are (small tests to
big) 75.5, 828, and 1497 MB/s. Normally one expects some growth in those
numbers for bigger tests, but typically the jump from 10x10 to 100x100
is a bit larger than the jump from 100x100 to 500x500.
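The bandwidth figures above can be reproduced from the quoted ShmPutImage rates. This is a sketch under two assumptions that are not stated in the message: 4 bytes per pixel (a 32-bit visual) and "MB" meaning 2^20 bytes. Those choices are used here only because they reproduce the 75.5, 828 and 1497 figures; the function name is made up for illustration.

```python
BYTES_PER_PIXEL = 4   # assumption: 32 bits per pixel
MIB = 1024 * 1024     # assumption: "MB" in the message means 2**20 bytes


def bandwidth_mib_per_s(ops_per_sec, side, bytes_per_pixel=BYTES_PER_PIXEL):
    """Bytes moved per second for a side x side image, in MiB/s."""
    return ops_per_sec * side * side * bytes_per_pixel / MIB


# ShmPutImage rates of the old server (column 1 of the table), keyed by side.
OLD_RATES = {10: 198000.0, 100: 21700.0, 500: 1570.0}

if __name__ == "__main__":
    previous = None
    for side in sorted(OLD_RATES):
        bw = bandwidth_mib_per_s(OLD_RATES[side], side)
        line = "%3dx%-3d %8.1f MiB/s" % (side, side, bw)
        if previous is not None:
            line += "  (%.1fx the previous size)" % (bw / previous)
        print(line)
        previous = bw
```

Under these assumptions the step from 10x10 to 100x100 is roughly an 11x gain in bandwidth, while the step from 100x100 to 500x500 is less than 2x.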
So that hints that small-work tests are being choked somehow. Recall
that x11perf does a 1-pixel GetImage periodically in order to guarantee
that results actually hit the framebuffer instead of just being queued
in the command stream, so round-trip performance with the X server does
actually matter. More than that, small-work requests (which take less
time) would be more strongly dominated by round-trip speed than
large-work requests. Given that:
> 15400.0 0.54 GetProperty
> 15500.0 0.54 QueryPointer
is very telling. Those requests do essentially no work, but they are
round-trips, and their throughput is thus bounded mostly by how long it
takes the scheduler to ping-pong between x11perf and the server. A
factor of ~2 drop would lead me to suspect something like one kernel
scheduling the processes on different cores, and the other both on the
same core; two processes splitting one CPU's time with maybe a little cache
warmth between them would intuitively be about half as fast as two
processes each with their own CPU.
Empirical evidence: On the Ironlake laptop on my desk (kernel
2.6.38.3-18.fc15), if I use taskset to bind the X server to CPU0,
running "x11perf -prop -pointer" bound to CPU0 gives:
300000 trep @ 0.0322 msec ( 31100.0/sec): QueryPointer
300000 trep @ 0.0321 msec ( 31200.0/sec): GetProperty
x11perf bound to CPU3 gives:
600000 trep @ 0.0193 msec ( 51900.0/sec): QueryPointer
600000 trep @ 0.0192 msec ( 52200.0/sec): GetProperty
And running it unbound (letting the scheduler decide) gives:
600000 trep @ 0.0198 msec ( 50600.0/sec): QueryPointer
600000 trep @ 0.0208 msec ( 48000.0/sec): GetProperty
I'd be curious to see how you fare with experimenting with taskset.
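For anyone who wants to see the scheduling effect without X involved, here is a rough sketch of the same kind of experiment. It is not x11perf and not the X protocol: a child process that echoes single bytes over a pipe stands in for the server, and the parent counts round trips per second. The function name and the choice of pipes are assumptions made for illustration; os.sched_setaffinity is Linux-only and plays the role of taskset.

```python
import os
import time


def pingpong_rate(parent_cpu, child_cpu, round_trips=20000):
    """Round trips per second between two processes pinned to given CPUs."""
    to_child_r, to_child_w = os.pipe()
    to_parent_r, to_parent_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: echo each byte back, standing in for the server side.
        try:
            os.sched_setaffinity(0, {child_cpu})
            os.close(to_child_w)
            os.close(to_parent_r)
            while True:
                data = os.read(to_child_r, 1)
                if not data:          # parent closed its end
                    break
                os.write(to_parent_w, data)
        finally:
            os._exit(0)
    # Parent: the "client" issuing requests that each need a reply.
    os.close(to_child_r)
    os.close(to_parent_w)
    saved = os.sched_getaffinity(0)
    try:
        os.sched_setaffinity(0, {parent_cpu})
        start = time.perf_counter()
        for _ in range(round_trips):
            os.write(to_child_w, b"x")
            os.read(to_parent_r, 1)
        elapsed = time.perf_counter() - start
    finally:
        os.sched_setaffinity(0, saved)
        os.close(to_child_w)
        os.close(to_parent_r)
        os.waitpid(pid, 0)
    return round_trips / elapsed


if __name__ == "__main__":
    cpus = sorted(os.sched_getaffinity(0))
    print("same CPU:      %.0f round trips/sec" % pingpong_rate(cpus[0], cpus[0]))
    if len(cpus) > 1:
        print("different CPU: %.0f round trips/sec" % pingpong_rate(cpus[0], cpus[-1]))
```

The absolute numbers depend entirely on the machine and kernel, so only the comparison between the two placements is meaningful, and on a single-CPU machine only the first case can be run at all.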
One set of results that's a little confusing, and thus probably in the
end most enlightening:
> 553000.0 0.24 Copy 10x10 from pixmap to pixmap
> 140000.0 0.86 Copy 10x10 from window to pixmap
> 143000.0 0.88 Copy 10x10 from pixmap to window
> 867.0 0.99 Copy 500x500 from pixmap to pixmap
> 870.0 1.00 Copy 500x500 from window to window
> 19800.0 1.01 Copy 100x100 from pixmap to pixmap
> 19900.0 1.01 Copy 100x100 from pixmap to window
> 20000.0 1.01 Copy 100x100 from window to pixmap
> 19600.0 1.01 Copy 100x100 from window to window
> 851.0 1.01 Copy 500x500 from pixmap to window
> 849.0 1.02 Copy 500x500 from window to pixmap
> 81700.0 1.52 Copy 10x10 from window to window
This _mostly_ makes sense. These are all just varying calls to
XCopyArea, which does not have a reply. The medium and large ops are
approximately identical before and after. The 0.8x results make sense in
the context of scheduling funniness for small-work requests. But the two
outliers are perplexing. I would guess that copywinwin10 got faster due
to some optimization surrounding buffer reuse or flush reduction (you're
always working on the same buffer, so you can do less work), and that
copypixpix10 is operating wholly in host memory for some reason and
therefore hitting the same kind of memcpy issue as in your ShmPutImage
results.
I'll also note that the paths where you're losing hardest are, in the
majority, things that the driver makes no attempt to accelerate
(anything with the word "tiled" or "stippled" involved, for example). I
would tend to chalk that up to something like gcc -O0 before anything
else since you're primarily measuring the efficiency of the software
renderer. I'm actually pretty pleased with the results you've shown, 10%
or better speedup for basically all text ops, about half of window
management ops, and almost all window exposure ops.
- ajax
* Re: X11 performance regressions
2011-05-11 19:49 ` Adam Jackson
@ 2011-05-11 21:22 ` Knut Petersen
2011-05-12 13:42 ` Adam Jackson
0 siblings, 1 reply; 14+ messages in thread
From: Knut Petersen @ 2011-05-11 21:22 UTC (permalink / raw)
To: Adam Jackson; +Cc: intel-gfx
As I have only a few minutes right now, just a few comments:
1. Complete trees are compared; all modules/libraries are either old or new. No debug versions.
2. Speculating about cores is definitely wrong -- the Pentium M Dothan is a single-core CPU.
3. There often is a "choked most" (1) -- "choked least" (10) -- "choked a bit more again" (100,500)
result:
1450000.0 0.50 1x1 stippled rectangle (8x8 stipple)
134000.0 1.11 10x10 stippled rectangle (8x8 stipple)
2540.0 1.05 100x100 stippled rectangle (8x8 stipple)
110.0 0.95 500x500 stippled rectangle (8x8 stipple)
Is there a heavy per-call impact of some factor A on those small requests, and a light impact of some factor B that grows with request size?
A = compiler / library overhead?
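One way to look at the "factor A / factor B" question is to convert the rates into time per request and time per pixel. This is only a sketch under stated assumptions: the rate column is taken to be requests per second for the first run, and the ratio column is taken to be (rate of the second run) / (rate of the first run). The function names are illustrative.

```python
# Rows from the message above: (edge length in pixels, rate in requests/sec, ratio).
STIPPLED_8X8 = [
    (1, 1450000.0, 0.50),
    (10, 134000.0, 1.11),
    (100, 2540.0, 1.05),
    (500, 110.0, 0.95),
]


def per_request_us(rate):
    """Average time per request in microseconds for a rate in requests/sec."""
    return 1e6 / rate


def breakdown(rows):
    """Per-request times for both runs plus the first run's cost per pixel."""
    result = []
    for edge, rate, ratio in rows:
        t1 = per_request_us(rate)
        # Assumption: ratio = rate(run 2) / rate(run 1), so rate(run 2) = rate * ratio.
        t2 = per_request_us(rate * ratio)
        result.append({
            "edge": edge,
            "t1_us": t1,
            "t2_us": t2,
            "delta_us": t2 - t1,  # positive: run 2 needs more time per request
            "t1_ns_per_pixel": 1e3 * t1 / (edge * edge),
        })
    return result


if __name__ == "__main__":
    for row in breakdown(STIPPLED_8X8):
        print(f"{row['edge']:>3}x{row['edge']:<3} "
              f"run1={row['t1_us']:.2f}us run2={row['t2_us']:.2f}us "
              f"delta={row['delta_us']:+.2f}us "
              f"run1/pixel={row['t1_ns_per_pixel']:.2f}ns")
```

A constant per-call overhead would show up as a roughly constant delta across sizes, while a per-pixel cost would show up as a delta that grows with the area.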
Yes, there is
> 15400.0 0.54 GetProperty
> 15500.0 0.54 QueryPointer
but we also see
8150000.0 1.21 X protocol NoOperation
4. No, it's not the kernel. I did
a) boot
b) x11perf on old X
c) x11perf on new X
d) reboot
e) x11perf on new X
f) x11perf on old X
and saw only very marginal differences between those two runs.
5. Yes, I do agree with that:
> I'm actually pretty pleased with the results you've shown, 10%
> or better speedup for basically all text ops, about half of window
> management ops, and almost all window exposure ops.
6. More later.
cu,
Knut
* Re: X11 performance regressions
2011-05-11 17:52 ` Chris Wilson
@ 2011-05-12 7:19 ` Knut Petersen
2011-05-12 7:38 ` Chris Wilson
0 siblings, 1 reply; 14+ messages in thread
From: Knut Petersen @ 2011-05-12 7:19 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
>> 1 2 Operation
>> -------- ------ ---------
>> 965000.0 0.016 10x10 wide rectangle outline
> Something is still not quite right here. This should be mostly CPU bound,
> and even my Atom gets 734k.
>
> Can you check that (a) it is CPU bound and (b) the worst offenders
> according to the system profiler of your choice (e.g. perf)?
>
734k would be nice ;-)
With current git Xorg it's 10300 reps at 800 MHz and 16300 reps at 2000 MHz.
Increasing the CPU clock by a factor of 2.5 increases reps by a factor of 1.58.
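The scaling figures can be checked with a few lines of arithmetic. The sketch below assumes the rep counts are per second. The second function goes one step further than the message and fits a two-part model, t = a / f + b, to the two measurements; that model is an assumption for illustration, not something measured in this thread.

```python
def scaling(freq_lo_mhz, freq_hi_mhz, rate_lo, rate_hi):
    """Return (clock factor, rate factor, rate factor / clock factor)."""
    clock_factor = freq_hi_mhz / freq_lo_mhz
    rate_factor = rate_hi / rate_lo
    return clock_factor, rate_factor, rate_factor / clock_factor


def split_time(freq_lo_mhz, freq_hi_mhz, rate_lo, rate_hi):
    """Fit t = a / f + b to two points; return (a, b) with t in microseconds.

    Assumption (not from the thread): time per rep consists of a part that
    shrinks with clock frequency (a / f) and a part that does not (b).
    """
    t_lo = 1e6 / rate_lo
    t_hi = 1e6 / rate_hi
    a = (t_lo - t_hi) / (1.0 / freq_lo_mhz - 1.0 / freq_hi_mhz)
    b = t_hi - a / freq_hi_mhz
    return a, b


if __name__ == "__main__":
    clock, rate, efficiency = scaling(800, 2000, 10300, 16300)
    print(f"clock x{clock:.2f}, rate x{rate:.2f}, efficiency {efficiency:.2f}")
    _a, b = split_time(800, 2000, 10300, 16300)
    print(f"clock-independent part under this model: about {b:.1f} us per rep")
```

If the test were limited only by core clock, the rate factor would be expected to sit close to the clock factor; a noticeably smaller value suggests that part of the time per rep does not shrink with frequency.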
cu,
knut
* Re: X11 performance regressions
2011-05-12 7:19 ` Knut Petersen
@ 2011-05-12 7:38 ` Chris Wilson
2011-05-12 8:24 ` Knut Petersen
0 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2011-05-12 7:38 UTC (permalink / raw)
To: Knut Petersen; +Cc: intel-gfx
On Thu, 12 May 2011 09:19:39 +0200, Knut Petersen <Knut_Petersen@t-online.de> wrote:
>
> >> 1 2 Operation
> >> -------- ------ ---------
> >> 965000.0 0.016 10x10 wide rectangle outline
> > Something is still not quite right here. This should be mostly CPU bound,
> > and even my Atom gets 734k.
> >
> > Can you check that (a) it is CPU bound and (b) the worst offenders
> > according to the system profiler of your choice (e.g. perf)?
> >
>
> 734k would be nice ;-)
>
> With current git Xorg it's 10300 reps at 800 MHz and 16300 reps at 2000 MHz.
> Increasing the CPU clock by a factor of 2.5 increases reps by a factor of 1.58.
Please do something like 'perf record -f -g -a x11perf -d :0 -worect10;
perf report | head -150' and paste the output.
-Chris
>
> cu,
> knut
--
Chris Wilson, Intel Open Source Technology Centre
* Re: X11 performance regressions
2011-05-12 7:38 ` Chris Wilson
@ 2011-05-12 8:24 ` Knut Petersen
2011-05-12 8:55 ` Chris Wilson
0 siblings, 1 reply; 14+ messages in thread
From: Knut Petersen @ 2011-05-12 8:24 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
[-- Attachment #1: Type: text/plain, Size: 168 bytes --]
> Please do something like 'perf record -f -g -a x11perf -d :0 -worect10;
> perf report | head -150' and paste the output.
> -Chris
>
Attached find the perf log
Knut
[-- Attachment #2: perflog --]
[-- Type: text/plain, Size: 10526 bytes --]
# Events: 19K cycles
#
# Overhead Command Shared Object Symbol
# ........ ............... ............................... .............................................................................................................................................................................................................................................................
#
32.09% Xorg libpixman-1.so.0.23.1 [.] pixman_op
|
--- pixman_op
|
|--99.80%-- pixman_region_union
| |
| |--99.95%-- damageRegionAppend
| | damageDamageBox
| | damagePolyRectangle
| | ProcPolyRectangle
| | Dispatch
| | main
| | __libc_start_main
| --0.05%-- [...]
--0.20%-- [...]
5.98% Xorg libc-2.11.3.so [.] __GI_memmove
|
--- __GI_memmove
|
|--93.46%-- pixman_region_union
| damageRegionAppend
| damageDamageBox
| damagePolyRectangle
| ProcPolyRectangle
| Dispatch
| main
| __libc_start_main
|
|--5.14%-- Dispatch
| main
| __libc_start_main
|
|--1.22%-- WriteEventsToClient
| DamageExtNotify
| .L312
| damageRegionProcessPending
| damagePolyRectangle
| ProcPolyRectangle
| Dispatch
| main
| __libc_start_main
--0.18%-- [...]
3.25% Xorg [kernel.kallsyms] [k] __lock_acquire
|
--- __lock_acquire
|
|--98.72%-- lock_acquire
| |
| |--48.51%-- _raw_spin_lock_irqsave
| | |
| | |--45.74%-- add_wait_queue
| | | __pollwait
| | | |
| | | |--89.24%-- unix_poll
| | | | sock_poll
| | | | do_select
| | | | core_sys_select
| | | | sys_select
| | | | sysenter_do_call
| | | | 0xb76ed424
| | | | Dispatch
| | | | main
| | | | __libc_start_main
| | | |
| | | |--4.40%-- n_tty_poll
| | | | tty_poll
| | | | do_select
| | | | core_sys_select
| | | | sys_select
| | | | sysenter_do_call
| | | | 0xb76ed424
| | | | Dispatch
| | | | main
| | | | __libc_start_main
| | | |
| | | |--3.56%-- datagram_poll
| | | | sock_poll
| | | | do_select
| | | | core_sys_select
| | | | sys_select
| | | | sysenter_do_call
| | | | 0xb76ed424
| | | | Dispatch
| | | | main
| | | | __libc_start_main
| | | |
| | | --2.81%-- drm_poll
| | | do_select
| | | core_sys_select
| | | sys_select
| | | sysenter_do_call
| | | 0xb76ed424
| | | Dispatch
| | | main
| | | __libc_start_main
| | |
| | |--31.44%-- remove_wait_queue
| | | poll_freewait
| | | do_select
| | | core_sys_select
| | | sys_select
| | | sysenter_do_call
| | | 0xb76ed424
| | | Dispatch
| | | main
| | | __libc_start_main
| | |
| | |--6.96%-- skb_dequeue
| | | unix_stream_recvmsg
| | | sock_aio_read
| | | do_sync_read
| | | vfs_read
| | | sys_read
| | | sysenter_do_call
| | | 0xb76ed424
| | | _XSERVTransRead
| | | ReadRequestFromClient
| | | Dispatch
| | | main
| | | __libc_start_main
| | |
| | |--6.55%-- __wake_up_sync_key
| | | |
| | | |--79.99%-- unix_write_space
| | | | sock_wfree
| | | | unix_destruct_scm
| | | | skb_release_head_state
| | | | __kfree_skb
| | | | consume_skb
| | | | unix_stream_recvmsg
| | | | sock_aio_read
| | | | do_sync_read
| | | | vfs_read
| | | | sys_read
| | | | sysenter_do_call
| | | | 0xb76ed424
| | | | _XSERVTransRead
| | | | ReadRequestFromClient
| | | | Dispatch
| | | | main
| | | | __libc_start_main
| | | |
| | | --20.01%-- sock_def_readable
* Re: X11 performance regressions
2011-05-12 8:24 ` Knut Petersen
@ 2011-05-12 8:55 ` Chris Wilson
2011-05-12 9:34 ` Knut Petersen
2011-05-13 9:24 ` Knut Petersen
0 siblings, 2 replies; 14+ messages in thread
From: Chris Wilson @ 2011-05-12 8:55 UTC (permalink / raw)
To: Knut Petersen; +Cc: intel-gfx
On Thu, 12 May 2011 10:24:00 +0200, Knut Petersen <Knut_Petersen@t-online.de> wrote:
>
> > Please do something like 'perf record -f -g -a x11perf -d :0 -worect10;
> > perf report | head -150' and paste the output.
> > -Chris
> >
> Attached find the perf log
Oh, damage. A compositing WM? If you turn off compositing, do you see
similar performance levels to xorg-1.6?
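For reference, the damage share can be estimated mechanically from a text report of the kind attached above. This is a rough sketch that assumes the layout seen in that log (a top-level line with overhead, command, shared object, and symbol, followed by call-chain lines); the sample is an abridged excerpt of the attachment, and the function names are illustrative.

```python
import re

# Top-level entry, e.g. "32.09% Xorg libpixman-1.so.0.23.1 [.] pixman_op".
ENTRY_RE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+(\S+)\s+\[(.)\]\s+(\S+)\s*$"
)

# Abridged excerpt of the attached log.
SAMPLE = """\
# Events: 19K cycles
32.09% Xorg libpixman-1.so.0.23.1 [.] pixman_op
|
--- pixman_op
|
|--99.80%-- pixman_region_union
| |
| |--99.95%-- damageRegionAppend
| | damageDamageBox
| | damagePolyRectangle
| | ProcPolyRectangle
5.98% Xorg libc-2.11.3.so [.] __GI_memmove
|
--- __GI_memmove
|
|--93.46%-- pixman_region_union
| damageRegionAppend
3.25% Xorg [kernel.kallsyms] [k] __lock_acquire
|
--- __lock_acquire
|
|--98.72%-- lock_acquire
"""


def top_entries(report):
    """Return top-level entries, each with the symbols seen in its call chain."""
    entries = []
    for line in report.splitlines():
        if line.startswith("#"):
            continue  # comment/header line
        match = ENTRY_RE.match(line)
        if match:
            entries.append({
                "overhead": float(match.group(1)),
                "command": match.group(2),
                "object": match.group(3),
                "symbol": match.group(5),
                "chain": [],
            })
        elif entries:
            tokens = line.replace("|", " ").split()
            if tokens:
                entries[-1]["chain"].append(tokens[-1])
    return entries


def damage_share(entries):
    """Sum the overhead of entries whose chain mentions a damage* symbol.

    This is an upper bound: the whole entry is counted even if only part
    of its call chain passes through the damage layer.
    """
    return sum(
        e["overhead"]
        for e in entries
        if any(sym.startswith("damage") for sym in e["chain"])
    )


if __name__ == "__main__":
    entries = top_entries(SAMPLE)
    for e in entries:
        print(f"{e['overhead']:6.2f}% {e['object']} {e['symbol']}")
    print(f"entries reaching damage*: {damage_share(entries):.2f}%")
```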
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: X11 performance regressions
2011-05-12 8:55 ` Chris Wilson
@ 2011-05-12 9:34 ` Knut Petersen
2011-05-13 9:24 ` Knut Petersen
1 sibling, 0 replies; 14+ messages in thread
From: Knut Petersen @ 2011-05-12 9:34 UTC (permalink / raw)
To: intel-gfx
> Oh, damage. A compositing WM? If you turn off compositing, do you see
> similar performance levels to xorg-1.6?
> -Chris
>
That makes a difference ... 16300 reps speed up to 1280000 reps ... 78.5 times faster.
I think I will rerun the tests.
cu,
Knut
* Re: X11 performance regressions
2011-05-11 21:22 ` Knut Petersen
@ 2011-05-12 13:42 ` Adam Jackson
0 siblings, 0 replies; 14+ messages in thread
From: Adam Jackson @ 2011-05-12 13:42 UTC (permalink / raw)
To: Knut Petersen; +Cc: intel-gfx
[-- Attachment #1.1: Type: text/plain, Size: 417 bytes --]
On Wed, 2011-05-11 at 23:22 +0200, Knut Petersen wrote:
> Yes, there is
> > 15400.0 0.54 GetProperty
> > 15500.0 0.54 QueryPointer
> but we also see
>
> 8150000.0 1.21 X protocol NoOperation
NoOp isn't a round trip; it does not generate a reply. That test
measures how fast the X server can zip around its own main loop, not how
fast it can interact with clients.
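The distinction between a one-way request and a round trip can be modelled without X at all. The sketch below is a toy model on a local socket pair, not X server or Xlib code: the 4-byte request and 1-byte reply are hypothetical, and it only counts how often the client has to stop and wait.

```python
import socket
import threading

REQUEST = b"NOOP"  # hypothetical 4-byte request, not the real X11 wire format


def serve(conn, count, send_reply):
    """Consume `count` requests; answer each one only if send_reply is set."""
    for _ in range(count):
        received = b""
        while len(received) < len(REQUEST):
            chunk = conn.recv(len(REQUEST) - len(received))
            if not chunk:
                return  # peer closed the connection early
            received += chunk
        if send_reply:
            conn.sendall(b"R")


def run_client(count, wait_for_reply):
    """Send `count` requests; return how often the client blocked on a reply."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve, args=(server, count, wait_for_reply))
    worker.start()
    blocked = 0
    for _ in range(count):
        client.sendall(REQUEST)
        if wait_for_reply:
            client.recv(1)  # round trip: nothing more is sent until the reply arrives
            blocked += 1
    worker.join()
    client.close()
    server.close()
    return blocked


if __name__ == "__main__":
    print("one-way requests, client waits:", run_client(1000, False))
    print("round trips, client waits:", run_client(1000, True))
```

In the one-way case the client can queue requests back to back, so the measured rate reflects how fast the server drains them; in the round-trip case every request also pays for a wake-up and a reply on both sides.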
- ajax
[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
* Re: X11 performance regressions
2011-05-12 8:55 ` Chris Wilson
2011-05-12 9:34 ` Knut Petersen
@ 2011-05-13 9:24 ` Knut Petersen
1 sibling, 0 replies; 14+ messages in thread
From: Knut Petersen @ 2011-05-13 9:24 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
[-- Attachment #1: Type: text/plain, Size: 326 bytes --]
> Oh, damage. A compositing WM? If you turn off compositing, do you see
> similar performance levels to xorg-1.6?
> -Chris
>
If "Composite" is disabled, the current X scores much better than the 1.6.5 server
in most cases. But there are a few exceptions ... for the worst of those cases, I
also attached a perf log.
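The exceptions can be pulled out of an x11perfcomp listing with a short filter. This is a minimal sketch assuming the three-column layout of the attached comparison (rate for run 1, ratio, operation) and reading ratios below 1.0 as cases where run 2 is slower; the sample rows are a small selection from the attachment, and the function names and the 0.5 threshold are illustrative.

```python
import re

# Data rows have two decimal numbers followed by the operation name;
# header lines such as "1 2 Operation" do not match.
ROW_RE = re.compile(r"^\s*(\d+\.\d+)\s+(\d+\.\d+)\s+(.+?)\s*$")

# A few rows selected from the attached comparison.
SAMPLE = """\
1 2 Operation
-------- ------ ---------
2630.0 0.12 100-pixel double-dashed ellipse
4180.0 0.14 100-pixel dashed ellipse
575000.0 0.23 Copy 10x10 from pixmap to pixmap
12600.0 0.63 QueryPointer
2440.0 1.00 500x500 rectangle
8160000.0 1.21 X protocol NoOperation
5270.0 11.33 Fill 300x300 aa trap with 1 bit alpha
"""


def parse_x11perfcomp(text):
    """Return (rate, ratio, operation) tuples for every data row."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if match:
            rows.append((float(match.group(1)), float(match.group(2)), match.group(3)))
    return rows


def slower_than(rows, threshold=0.5):
    """Rows whose ratio is below `threshold`, worst first."""
    return sorted((r for r in rows if r[1] < threshold), key=lambda r: r[1])


if __name__ == "__main__":
    for rate, ratio, operation in slower_than(parse_x11perfcomp(SAMPLE)):
        print(f"{ratio:5.2f}  {operation}")
```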
- Knut
[-- Attachment #2: x11perfcomp --]
[-- Type: text/plain, Size: 20937 bytes --]
1: x11perf-10605000-nocomposite
2: x11perf-11099001-nocomposite
1 2 Operation
-------- ------ ---------
2630.0 0.12 100-pixel double-dashed ellipse
4180.0 0.14 100-pixel dashed ellipse
575000.0 0.23 Copy 10x10 from pixmap to pixmap
5850.0 0.34 500-pixel filled ellipse
2970.0 0.35 500-pixel solid circle
6250.0 0.35 Fill 300x300 trapezoid
149000.0 0.41 PutImage 10x10 square
3930.0 0.44 100-pixel wide double-dashed ellipse
189000.0 0.44 ShmPutImage 10x10 square
1570.0 0.46 ShmPutImage 500x500 square
9610.0 0.49 GetImage 10x10 square
21700.0 0.51 ShmPutImage 100x100 square
12600.0 0.63 QueryPointer
12600.0 0.65 GetProperty
220000.0 0.67 100x100 wide rectangle outline
83400.0 0.68 100x100 rectangle
477.0 0.69 PutImage 500x500 square
9100.0 0.71 PutImage 100x100 square
28700.0 0.73 500x500 rectangle outline
5570.0 0.75 500x500 wide rectangle outline
2140.0 0.79 100-pixel double-dashed circle
2550.0 0.81 500-pixel wide circle
1690000.0 0.82 100-pixel vertical line segment
3500.0 0.82 500-pixel wide ellipse
3430.0 0.85 100-pixel dashed circle
163000.0 0.85 Fill 1x1 equivalent triangle
152000.0 0.86 Fill 1x1 trapezoid
139000.0 0.88 Copy 10x10 from window to pixmap
137000.0 0.91 Composite 10x10 from pixmap to window
1930.0 0.91 GetImage XY 10x10 square
21300.0 0.93 500-pixel circle
138000.0 0.93 Copy 10x10 from pixmap to window
1300000.0 0.93 Move window via parent (100 kids)
1370000.0 0.93 Move window via parent (200 kids)
130000.0 0.95 10-pixel partial ellipse
107000.0 0.95 Char in 80-char rgb core line (Charter 10)
831000.0 0.96 1-pixel circle
16700.0 0.96 Fill 100x100 aa pre-added trapezoid
1590.0 0.96 Fill 100x100 aa trap
1460.0 0.96 Fill 100x100 aa trap with 4 bit alpha
513.0 0.96 Fill 300x300 aa trap
499.0 0.96 Fill 300x300 aa trap with 4 bit alpha
74.9 0.96 Fill 300x300 tiled trapezoid (17x15 tile)
153000.0 0.97 Fill 10x10 aa pre-added trapezoid
4780.0 0.97 Fill 300x300 aa pre-added trapezoid
12.0 0.97 ShmPutImage XY 100x100 square
783.0 0.98 Fill 100x100 tiled trapezoid (161x145 tile)
927000.0 0.98 Fill 1x1 aa pre-added trapezoid
12.4 0.98 PutImage XY 100x100 square
1230.0 0.98 PutImage XY 10x10 square
1110.0 0.98 ShmPutImage XY 10x10 square
29000.0 0.99 100x100 tiled rectangle (161x145 tile)
23900.0 0.99 100x100 tiled rectangle (17x15 tile)
30100.0 0.99 100x100 tiled rectangle (216x208 tile)
34800000.0 0.99 1-pixel solid circle
885.0 0.99 500x500 tiled rectangle (161x145 tile)
691.0 0.99 500x500 tiled rectangle (17x15 tile)
960.0 0.99 500x500 tiled rectangle (216x208 tile)
274000.0 0.99 Char in 30-char aa line (Charter 24)
275000.0 0.99 Char in 30-char a line (Charter 24)
20400.0 0.99 Copy 100x100 from pixmap to pixmap
599.0 0.99 Copy 100x100 n-bit deep plane
870.0 0.99 Copy 500x500 from pixmap to pixmap
24.3 0.99 Copy 500x500 n-bit deep plane
120000.0 0.99 Fill 1x1 aa trap
1090.0 0.99 Fill 2x100 aa trap
10700.0 0.99 Fill 2x10 aa trap
91200.0 0.99 Fill 2x1 aa trap
322000.0 1.00 100-pixel dashed line
307000.0 1.00 100-pixel double-dashed line
275000.0 1.00 100-pixel double-dashed segment
307000.0 1.00 100-pixel line
277000.0 1.00 100-pixel line segment
309000.0 1.00 100-pixel line segment (2 kids)
24600000.0 1.00 1-pixel line
2430000.0 1.00 500-pixel horizontal line segment
56500.0 1.00 500-pixel line segment
2400000.0 1.00 500-pixel vertical line segment
2440.0 1.00 500x500 rectangle
20400.0 1.00 Composite 100x100 from pixmap to window
20100.0 1.00 Composite 100x100 from window to window
866.0 1.00 Composite 500x500 from pixmap to window
875.0 1.00 Composite 500x500 from window to window
20500.0 1.00 Copy 100x100 from pixmap to window
20600.0 1.00 Copy 100x100 from window to pixmap
20100.0 1.00 Copy 100x100 from window to window
866.0 1.00 Copy 500x500 from pixmap to window
872.0 1.00 Copy 500x500 from window to pixmap
875.0 1.00 Copy 500x500 from window to window
661.0 1.00 Fill 100x100 tiled trapezoid (17x15 tile)
67.1 1.00 Fill 300x300 tiled trapezoid (4x4 tile)
25.9 1.00 GetImage XY 100x100 square
1.0 1.00 GetImage XY 500x500 square
0.5 1.00 PutImage XY 500x500 square
1240.0 1.00 Scroll 500x500 pixels
0.5 1.00 ShmPutImage XY 500x500 square
289000.0 1.01 100-pixel dashed segment
292000.0 1.01 100-pixel line segment (1 kid)
13400.0 1.01 Fill 10x10 aa trap
28500.0 1.01 Scroll 100x100 pixels
3230000.0 1.02 10-pixel line
193000.0 1.02 Char16 in 23-char image line (k24)
271000.0 1.02 Char16 in 23-char line (k24)
7270.0 1.02 Fill 10x10 aa trap with 4 bit alpha
2350000.0 1.03 10-pixel dashed segment
2200000.0 1.03 10-pixel line segment
61800.0 1.03 500-pixel line
497000.0 1.03 Char16 in 40-char image line (k14)
118000.0 1.03 Char in 80-char aa core line (Charter 10)
118000.0 1.03 Char in 80-char a core line (Charter 10)
353.0 1.03 Fill 2x300 aa trap
1130000.0 1.03 Move window via parent (75 kids)
320000.0 1.04 100-pixel line segment (3 kids)
649000.0 1.04 Char16 in 40-char line (k14)
507000.0 1.04 Char in 30-char image line (TR 24)
1480000.0 1.04 Char in 80-char image line (6x13)
568.0 1.04 Fill 100x100 tiled trapezoid (4x4 tile)
15400.0 1.04 Fill 1x1 aa trap with 4 bit alpha
1090000.0 1.05 Char in 60-char image line (9x15)
1290000.0 1.05 Char in 70-char image line (8x13)
1720000.0 1.05 Char in 80-char image line (TR 10)
58900.0 1.05 Hide/expose window via popup (4 kids)
1810000.0 1.05 Moved unmapped window (100 kids)
1700000.0 1.05 Resize unmapped window (200 kids)
327000.0 1.06 Char16 in 7/14/7 line (k14, k24)
43200.0 1.06 Copy 10x10 n-bit deep plane
3160000.0 1.07 10-pixel dashed line
2050000.0 1.07 Char in 80-char line (6x13)
270000.0 1.07 Fill 1x1 aa trap with 1 bit alpha
1830000.0 1.07 Moved unmapped window (50 kids)
21600000.0 1.08 1-pixel line segment
21200.0 1.08 Char in 30-char rgb core line (Charter 24)
1850000.0 1.08 Moved unmapped window (16 kids)
1830000.0 1.08 Moved unmapped window (4 kids)
1820000.0 1.08 Moved unmapped window (75 kids)
839000.0 1.09 Char in 30-char line (TR 24)
1620000.0 1.09 Char in 60-char line (9x15)
1870000.0 1.09 Char in 70-char line (8x13)
2340000.0 1.09 Char in 80-char line (TR 10)
1810000.0 1.09 Moved unmapped window (200 kids)
25600.0 1.09 Move window (25 kids)
1720000.0 1.09 Resize unmapped window (16 kids)
1720000.0 1.09 Resize unmapped window (25 kids)
1730000.0 1.09 Resize unmapped window (4 kids)
1220000.0 1.10 1x1 tiled rectangle (161x145 tile)
1220000.0 1.10 1x1 tiled rectangle (17x15 tile)
1210000.0 1.10 1x1 tiled rectangle (4x4 tile)
725.0 1.10 Fill 300x300 aa trapezoid
26200.0 1.10 Move window (16 kids)
16600.0 1.10 Move window (200 kids)
1700000.0 1.10 Resize unmapped window (75 kids)
1970000.0 1.10 Unmap window via parent (200 kids)
1750000.0 1.10 Unmap window via parent (50 kids)
1120000.0 1.11 10x10 tiled rectangle (216x208 tile)
973000.0 1.11 Circulate Unmapped window (200 kids)
1780000.0 1.11 Moved unmapped window (25 kids)
20900.0 1.11 Move window (100 kids)
22100.0 1.11 Move window (75 kids)
80500.0 1.12 Char in 80-char rgb core line (Courier 12)
502000.0 1.12 Destroy window via parent (200 kids)
1650000.0 1.12 Resize unmapped window (100 kids)
125000.0 1.13 10-pixel ellipse
4780000.0 1.13 10-pixel horizontal line segment
1070000.0 1.13 10x10 tiled rectangle (161x145 tile)
1180000.0 1.13 1x1 tiled rectangle (216x208 tile)
23500.0 1.13 Move window (50 kids)
1670000.0 1.13 Resize unmapped window (50 kids)
547000.0 1.14 10x10 tiled rectangle (17x15 tile)
37600.0 1.14 Circulate window (4 kids)
22600.0 1.15 Char in 30-char aa core line (Charter 24)
2270000.0 1.15 Circulate Unmapped window (75 kids)
37400.0 1.15 Fill 10x10 tiled trapezoid (4x4 tile)
28100.0 1.15 Move window (4 kids)
87600.0 1.16 Char in 80-char aa core line (Courier 12)
1970000.0 1.16 Circulate Unmapped window (100 kids)
3180000.0 1.16 Circulate Unmapped window (25 kids)
2710000.0 1.16 Circulate Unmapped window (50 kids)
42800.0 1.16 Fill 10x10 tiled trapezoid (216x208 tile)
1980000.0 1.16 Unmap window via parent (100 kids)
3490.0 1.17 100x100 tiled rectangle (4x4 tile)
144000.0 1.17 10x10 tiled rectangle (4x4 tile)
10200.0 1.17 500x50 wide vertical line segment
711000.0 1.17 Char in 80-char rgb line (Courier 12)
90.5 1.17 Fill 300x300 tiled trapezoid (161x145 tile)
557000.0 1.17 Move window via parent (25 kids)
110000.0 1.17 Move window via parent (4 kids)
904000.0 1.17 Move window via parent (50 kids)
102.0 1.18 500x500 tiled rectangle (4x4 tile)
21900.0 1.18 Char in 30-char a core line (Charter 24)
87400.0 1.18 Char in 80-char a core line (Courier 12)
4350.0 1.18 Fill 100x100 aa trapezoid
42200.0 1.18 Fill 10x10 tiled trapezoid (161x145 tile)
39500.0 1.18 Fill 10x10 tiled trapezoid (17x15 tile)
388000.0 1.18 Move window via parent (16 kids)
23400.0 1.18 Resize window (200 kids)
1910000.0 1.18 Unmap window via parent (75 kids)
33800.0 1.19 100-pixel circle
807.0 1.19 Fill 100x100 tiled trapezoid (216x208 tile)
15100.0 1.19 Fill 100x100 trapezoid
76000.0 1.19 Map window via parent (4 kids)
4140.0 1.20 100-pixel wide double-dashed circle
2200.0 1.20 500-pixel ellipse
3370000.0 1.20 Circulate Unmapped window (16 kids)
1690000.0 1.21 Char in 20/40/20 line (6x13, TR 10)
8160000.0 1.21 X protocol NoOperation
1750.0 1.22 100-pixel wide dashed ellipse
191000.0 1.22 Char in 30-char rgb line (Charter 24)
32400.0 1.22 Resize window (75 kids)
10200.0 1.23 100-pixel ellipse
15300.0 1.23 100-pixel partial ellipse
4630.0 1.23 500x50 wide line
30100.0 1.23 Resize window (100 kids)
92.0 1.24 Fill 300x300 tiled trapezoid (216x208 tile)
777.0 1.25 GetImage 100x100 square
96800.0 1.25 Hide/expose window via popup (16 kids)
714000.0 1.26 Create unmapped window (200 kids)
34900.0 1.26 Resize window (50 kids)
40100.0 1.27 Resize window (16 kids)
1290000.0 1.27 Unmap window via parent (25 kids)
10000.0 1.29 500x50 wide horizontal line segment
38100.0 1.29 Resize window (25 kids)
4030000.0 1.30 100-pixel horizontal line segment
3530000.0 1.30 Circulate Unmapped window (4 kids)
25300.0 1.30 Circulate window (16 kids)
981000.0 1.30 Unmap window via parent (16 kids)
22200.0 1.31 Circulate window (100 kids)
19100.0 1.31 Circulate window (200 kids)
43700.0 1.31 Resize window (4 kids)
972000.0 1.32 10x10 wide rectangle outline
23200.0 1.32 Circulate window (50 kids)
22600.0 1.32 Circulate window (75 kids)
13500.0 1.32 Fill 100x100 64-gon (Convex)
24200.0 1.33 Circulate window (25 kids)
365000.0 1.33 Destroy window via parent (16 kids)
12200.0 1.34 Fill 100x100 equivalent triangle
775000.0 1.35 Char in 80-char aa line (Courier 12)
777000.0 1.35 Char in 80-char a line (Courier 12)
26100.0 1.36 10-pixel wide partial ellipse
1840000.0 1.36 10x10 rectangle
711000.0 1.36 Create unmapped window (100 kids)
698000.0 1.37 Create unmapped window (50 kids)
35300.0 1.38 100-pixel partial circle
112000.0 1.38 Map window via parent (16 kids)
689000.0 1.39 Create unmapped window (25 kids)
500000.0 1.39 Destroy window via parent (75 kids)
12700.0 1.40 Fill 100x100 64-gon (Complex)
98700.0 1.40 Hide/expose window via popup (25 kids)
109000.0 1.40 Hide/expose window via popup (50 kids)
114000.0 1.41 Hide/expose window via popup (75 kids)
15600.0 1.43 100-pixel solid circle
112000.0 1.44 Create and map subwindows (200 kids)
113000.0 1.44 Create and map subwindows (50 kids)
114000.0 1.44 Create and map subwindows (75 kids)
671000.0 1.44 Create unmapped window (75 kids)
123000.0 1.45 10-pixel partial circle
112000.0 1.45 Create and map subwindows (25 kids)
182000.0 1.45 Destroy window via parent (4 kids)
113000.0 1.45 Hide/expose window via popup (100 kids)
118000.0 1.45 Hide/expose window via popup (200 kids)
322000.0 1.45 Unmap window via parent (4 kids)
114000.0 1.46 Create and map subwindows (100 kids)
109000.0 1.46 Create and map subwindows (16 kids)
476000.0 1.46 Destroy window via parent (50 kids)
655000.0 1.47 Create unmapped window (16 kids)
763000.0 1.48 Char in 80-char rgb line (Charter 10)
127000.0 1.48 Map window via parent (75 kids)
140000.0 1.51 Change graphics context
54600.0 1.52 10x1 wide vertical line segment
128000.0 1.52 Map window via parent (100 kids)
81500.0 1.53 Copy 10x10 from window to window
481000.0 1.53 Destroy window via parent (100 kids)
81600.0 1.53 Scroll 10x10 pixels
86200.0 1.54 Create and map subwindows (4 kids)
12500.0 1.55 100-pixel wide ellipse
80800.0 1.55 Composite 10x10 from window to window
121000.0 1.55 Map window via parent (50 kids)
394000.0 1.56 Destroy window via parent (25 kids)
138000.0 1.56 Fill 1x1 tiled trapezoid (17x15 tile)
111000.0 1.56 Map window via parent (25 kids)
137000.0 1.57 Fill 1x1 tiled trapezoid (4x4 tile)
130000.0 1.57 Map window via parent (200 kids)
551000.0 1.58 Create unmapped window (4 kids)
136000.0 1.58 Fill 1x1 tiled trapezoid (161x145 tile)
9850.0 1.60 Fill 100x100 equivalent complex polygons
53400.0 1.61 10x1 wide horizontal line segment
132000.0 1.63 Fill 1x1 tiled trapezoid (216x208 tile)
23500.0 1.69 10-pixel wide partial circle
105000.0 1.71 10-pixel circle
1470000.0 1.72 1x1 stippled rectangle (8x8 stipple)
1420000.0 1.73 1x1 opaque stippled rectangle (161x145 stipple)
53400.0 1.76 100x100 rectangle outline
1430000.0 1.77 1x1 stippled rectangle (161x145 stipple)
1430000.0 1.77 1x1 stippled rectangle (17x15 stipple)
1420000.0 1.78 1x1 opaque stippled rectangle (17x15 stipple)
773000.0 1.80 Char in 80-char a line (Charter 10)
768000.0 1.81 Char in 80-char aa line (Charter 10)
1400000.0 1.82 1x1 opaque stippled rectangle (8x8 stipple)
185.0 1.83 500x500 opaque stippled rectangle (17x15 stipple)
14000.0 1.86 Fill 10x10 aa trapezoid
174000.0 1.90 Fill 1x1 stippled trapezoid (17x15 stipple)
173000.0 1.92 Fill 1x1 opaque stippled trapezoid (8x8 stipple)
173000.0 1.93 Fill 1x1 opaque stippled trapezoid (161x145 stipple)
173000.0 1.94 Fill 1x1 opaque stippled trapezoid (17x15 stipple)
4140.0 1.95 100x100 opaque stippled rectangle (17x15 stipple)
134000.0 1.96 Fill 10x10 aa trap with 1 bit alpha
172000.0 1.96 Fill 1x1 stippled trapezoid (8x8 stipple)
1840000.0 1.97 10-pixel vertical line segment
173000.0 1.97 Fill 1x1 stippled trapezoid (161x145 stipple)
5780.0 1.99 100-pixel wide partial ellipse
1830000.0 1.99 1x1 rectangle
86400.0 1.99 Fill 10x10 stippled trapezoid (161x145 stipple)
86000.0 2.01 Fill 10x10 opaque stippled trapezoid (161x145 stipple)
14700.0 2.03 Fill 1x1 aa trapezoid
1840000.0 2.04 Dot
67700.0 2.04 Fill 10x10 stippled trapezoid (8x8 stipple)
77700.0 2.05 Fill 10x10 opaque stippled trapezoid (17x15 stipple)
74600.0 2.05 Fill 10x10 stippled trapezoid (17x15 stipple)
5250.0 2.11 Fill 100x100 stippled trapezoid (161x145 stipple)
28.1 2.11 GetImage 500x500 square
5760.0 2.12 Fill 100x100 opaque stippled trapezoid (161x145 stipple)
570.0 2.12 Fill 300x300 opaque stippled trapezoid (17x15 stipple)
69000.0 2.13 Fill 10x10 opaque stippled trapezoid (8x8 stipple)
18500.0 2.16 100-pixel filled ellipse
4040.0 2.18 Fill 100x100 opaque stippled trapezoid (17x15 stipple)
17600.0 2.19 100-pixel fill chord partial circle
708.0 2.19 Fill 300x300 stippled trapezoid (161x145 stipple)
2980.0 2.20 Fill 100x100 stippled trapezoid (17x15 stipple)
384.0 2.21 Fill 300x300 stippled trapezoid (17x15 stipple)
53800.0 2.23 10x10 rectangle outline
301.0 2.24 500x500 stippled rectangle (161x145 stipple)
869.0 2.24 Fill 300x300 opaque stippled trapezoid (161x145 stipple)
16300.0 2.30 100-pixel fill slice partial circle
7100.0 2.31 100x100 opaque stippled rectangle (161x145 stipple)
114.0 2.31 500x500 opaque stippled rectangle (8x8 stipple)
6560.0 2.32 100x100 stippled rectangle (161x145 stipple)
345.0 2.33 500x500 opaque stippled rectangle (161x145 stipple)
106000.0 2.33 Copy 10x10 1-bit deep plane
2260.0 2.33 Fill 100x100 stippled trapezoid (8x8 stipple)
404.0 2.35 Copy 500x500 1-bit deep plane
2570.0 2.44 Fill 100x100 opaque stippled trapezoid (8x8 stipple)
1370.0 2.45 100-pixel wide dashed circle
328.0 2.47 Fill 300x300 opaque stippled trapezoid (8x8 stipple)
280.0 2.51 Fill 300x300 stippled trapezoid (8x8 stipple)
53800.0 2.57 10-pixel fill chord partial ellipse
33300.0 2.57 10x1 wide line
7010.0 2.61 Copy 100x100 1-bit deep plane
2540.0 2.63 100x100 opaque stippled rectangle (8x8 stipple)
172.0 2.63 500x500 stippled rectangle (17x15 stipple)
6520.0 2.64 100-pixel wide partial circle
48400.0 2.64 10-pixel fill slice partial ellipse
3910.0 2.79 100x100 stippled rectangle (17x15 stipple)
224000.0 2.89 10x10 opaque stippled rectangle (161x145 stipple)
30900.0 2.92 10-pixel wide ellipse
9940.0 2.98 100-pixel wide circle
113.0 3.03 500x500 stippled rectangle (8x8 stipple)
14200.0 3.06 100x10 wide vertical line segment
25600.0 3.11 Fill 10x10 64-gon (Complex)
14200.0 3.15 100x10 wide horizontal line segment
41600.0 3.22 10-pixel filled ellipse
2560.0 3.25 100x100 stippled rectangle (8x8 stipple)
176000.0 3.30 10x10 opaque stippled rectangle (17x15 stipple)
35100.0 3.36 10-pixel fill slice partial circle
38100.0 3.39 10-pixel fill chord partial circle
25400.0 3.44 Fill 10x10 64-gon (Convex)
143000.0 3.47 10x10 opaque stippled rectangle (8x8 stipple)
22100.0 3.49 100-pixel fill chord partial ellipse
231000.0 3.56 10x10 stippled rectangle (161x145 stipple)
20100.0 3.68 100-pixel fill slice partial ellipse
3550.0 3.80 100x10 wide double-dashed line
156000.0 3.83 10x10 stippled rectangle (17x15 stipple)
26100.0 3.87 Fill 10x10 equivalent complex polygon
10700.0 3.93 100x10 wide line
3040.0 4.01 100x10 wide dashed line
27200.0 4.15 Fill 10x10 equivalent triangle
28200.0 4.22 Fill 10x10 trapezoid
27100.0 4.24 10-pixel wide circle
29500.0 4.44 10-pixel solid circle
114000.0 4.49 10x10 stippled rectangle (8x8 stipple)
29300.0 4.78 Fill 100x100 aa trap with 1 bit alpha
5270.0 11.33 Fill 300x300 aa trap with 1 bit alpha
[-- Attachment #3: perflog-ddellipse100 --]
[-- Type: text/plain, Size: 10526 bytes --]
# Events: 19K cycles
#
# Overhead Command Shared Object Symbol
# ........ ............... ............................... .............................................................................................................................................................................................................................................................
#
32.09% Xorg libpixman-1.so.0.23.1 [.] pixman_op
|
--- pixman_op
|
|--99.80%-- pixman_region_union
| |
| |--99.95%-- damageRegionAppend
| | damageDamageBox
| | damagePolyRectangle
| | ProcPolyRectangle
| | Dispatch
| | main
| | __libc_start_main
| --0.05%-- [...]
--0.20%-- [...]
5.98% Xorg libc-2.11.3.so [.] __GI_memmove
|
--- __GI_memmove
|
|--93.46%-- pixman_region_union
| damageRegionAppend
| damageDamageBox
| damagePolyRectangle
| ProcPolyRectangle
| Dispatch
| main
| __libc_start_main
|
|--5.14%-- Dispatch
| main
| __libc_start_main
|
|--1.22%-- WriteEventsToClient
| DamageExtNotify
| .L312
| damageRegionProcessPending
| damagePolyRectangle
| ProcPolyRectangle
| Dispatch
| main
| __libc_start_main
--0.18%-- [...]
3.25% Xorg [kernel.kallsyms] [k] __lock_acquire
|
--- __lock_acquire
|
|--98.72%-- lock_acquire
| |
| |--48.51%-- _raw_spin_lock_irqsave
| | |
| | |--45.74%-- add_wait_queue
| | | __pollwait
| | | |
| | | |--89.24%-- unix_poll
| | | | sock_poll
| | | | do_select
| | | | core_sys_select
| | | | sys_select
| | | | sysenter_do_call
| | | | 0xb76ed424
| | | | Dispatch
| | | | main
| | | | __libc_start_main
| | | |
| | | |--4.40%-- n_tty_poll
| | | | tty_poll
| | | | do_select
| | | | core_sys_select
| | | | sys_select
| | | | sysenter_do_call
| | | | 0xb76ed424
| | | | Dispatch
| | | | main
| | | | __libc_start_main
| | | |
| | | |--3.56%-- datagram_poll
| | | | sock_poll
| | | | do_select
| | | | core_sys_select
| | | | sys_select
| | | | sysenter_do_call
| | | | 0xb76ed424
| | | | Dispatch
| | | | main
| | | | __libc_start_main
| | | |
| | | --2.81%-- drm_poll
| | | do_select
| | | core_sys_select
| | | sys_select
| | | sysenter_do_call
| | | 0xb76ed424
| | | Dispatch
| | | main
| | | __libc_start_main
| | |
| | |--31.44%-- remove_wait_queue
| | | poll_freewait
| | | do_select
| | | core_sys_select
| | | sys_select
| | | sysenter_do_call
| | | 0xb76ed424
| | | Dispatch
| | | main
| | | __libc_start_main
| | |
| | |--6.96%-- skb_dequeue
| | | unix_stream_recvmsg
| | | sock_aio_read
| | | do_sync_read
| | | vfs_read
| | | sys_read
| | | sysenter_do_call
| | | 0xb76ed424
| | | _XSERVTransRead
| | | ReadRequestFromClient
| | | Dispatch
| | | main
| | | __libc_start_main
| | |
| | |--6.55%-- __wake_up_sync_key
| | | |
| | | |--79.99%-- unix_write_space
| | | | sock_wfree
| | | | unix_destruct_scm
| | | | skb_release_head_state
| | | | __kfree_skb
| | | | consume_skb
| | | | unix_stream_recvmsg
| | | | sock_aio_read
| | | | do_sync_read
| | | | vfs_read
| | | | sys_read
| | | | sysenter_do_call
| | | | 0xb76ed424
| | | | _XSERVTransRead
| | | | ReadRequestFromClient
| | | | Dispatch
| | | | main
| | | | __libc_start_main
| | | |
| | | --20.01%-- sock_def_readable
Thread overview: 14+ messages
2011-05-08 18:22 X11 performance regressions Knut Petersen
2011-05-09 16:53 ` Adam Jackson
2011-05-09 21:43 ` Chris Wilson
2011-05-11 14:46 ` Knut Petersen
2011-05-11 17:52 ` Chris Wilson
2011-05-12 7:19 ` Knut Petersen
2011-05-12 7:38 ` Chris Wilson
2011-05-12 8:24 ` Knut Petersen
2011-05-12 8:55 ` Chris Wilson
2011-05-12 9:34 ` Knut Petersen
2011-05-13 9:24 ` Knut Petersen
2011-05-11 19:49 ` Adam Jackson
2011-05-11 21:22 ` Knut Petersen
2011-05-12 13:42 ` Adam Jackson