* Re: [REPORT] Nightly Performance Tests - Sunday, September 13, 2020
From: Philippe Mathieu-Daudé @ 2020-09-14  6:46 UTC (permalink / raw)
  To: Ahmed Karaman, qemu-devel, Richard Henderson, Alex Bennée,
	Laurent Vivier, Thomas Huth

Hi Ahmed,

On 9/14/20 12:07 AM, Ahmed Karaman wrote:
> Host CPU         : Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
> Host Memory      : 15.49 GB
> 
> Start Time (UTC) : 2020-09-13 21:35:01
> End Time (UTC)   : 2020-09-13 22:07:44
> Execution Time   : 0:32:42.230467
> 
> Status           : SUCCESS
> 
> Note:
> Changes denoted by '-----' are less than 0.01%.
> 
> --------------------------------------------------------
>             SUMMARY REPORT - COMMIT f00f57f3
> --------------------------------------------------------

(Maybe this was already discussed earlier and I missed it.)

What change had such a significant impact on the m68k target?
At a glance, I mostly see softfloat changes:

$ git log --oneline v5.1.0..f00f57f3 tcg target/m68k fpu
fe4b0b5bfa9 tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem
6a17646176e tcg: Eliminate one store for in-place 128-bit dup_mem
e7e8f33fb60 tcg: Fix tcg gen for vectorized absolute value
5ebf5f4be66 softfloat: Define misc operations for bfloat16
34f0c0a98a5 softfloat: Define convert operations for bfloat16
8282310d853 softfloat: Define operations for bfloat16
0d93d8ec632 softfloat: Add fp16 and uint8/int8 conversion functions
fbcc38e4cb1 softfloat: add xtensa specialization for pickNaNMulAdd
913602e3ffe softfloat: pass float_status pointer to pickNaN
cc43c692511 softfloat: make NO_SIGNALING_NANS runtime property
73ebe95e8e5 target/ppc: add vmulld to INDEX_op_mul_vec case

> --------------------------------------------------------
> --------------------------------------------------------
> Test Program: matmult_double
> --------------------------------------------------------
> Target              Instructions      Latest      v5.1.0
> ----------  --------------------  ----------  ----------
> aarch64            1 412 412 599       -----     +0.311%
> alpha              3 233 957 639       -----     +7.472%
> arm                8 545 302 995       -----      +1.09%
> hppa               3 483 527 330       -----     +4.466%
> m68k               3 919 110 506       -----    +18.433%
> mips               2 344 641 840       -----     +4.085%
> mipsel             3 329 912 425       -----     +5.177%
> mips64             2 359 024 910       -----     +4.075%
> mips64el           3 343 650 686       -----     +5.166%
> ppc                3 209 505 701       -----     +3.248%
> ppc64              3 287 495 266       -----     +3.173%
> ppc64le            3 287 135 580       -----     +3.171%
> riscv64            1 221 617 903       -----     +0.278%
> s390x              2 874 160 417       -----     +5.826%
> sh4                3 544 094 841       -----      +6.42%
> sparc64            3 426 094 848       -----     +7.138%
> x86_64             1 249 076 697       -----     +0.335%
> --------------------------------------------------------
...
> --------------------------------------------------------
> Test Program: qsort_double
> --------------------------------------------------------
> Target              Instructions      Latest      v5.1.0
> ----------  --------------------  ----------  ----------
> aarch64            2 709 839 947       -----     +2.423%
> alpha              1 969 432 086       -----     +3.679%
> arm                8 323 168 267       -----     +2.589%
> hppa               3 188 316 726       -----       +2.9%
> m68k               4 953 947 225       -----    +15.153%
> mips               2 123 789 120       -----     +3.049%
> mipsel             2 124 235 492       -----     +3.049%
> mips64             1 999 025 951       -----     +3.404%
> mips64el           1 996 433 190       -----     +3.409%
> ppc                2 819 299 843       -----     +5.436%
> ppc64              2 768 177 037       -----     +5.512%
> ppc64le            2 724 766 044       -----     +5.602%
> riscv64            1 638 324 190       -----     +4.021%
> s390x              2 519 117 806       -----     +3.364%
> sh4                2 595 696 102       -----       +3.0%
> sparc64            3 988 892 763       -----     +2.744%
> x86_64             2 033 624 062       -----     +3.242%
> --------------------------------------------------------
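
To make the m68k outlier in the matmult_double table above easier to see, the rows can be parsed and ranked by their change against v5.1.0. This is only a minimal sketch: the row data is copied from the report, while the helper names `to_percent`, `parse_row`, and `rank_by_regression` are made up for illustration and are not part of the nightly test scripts.

```python
# Rows copied verbatim from the matmult_double table in the report.
REPORT = """\
aarch64            1 412 412 599       -----     +0.311%
alpha              3 233 957 639       -----     +7.472%
arm                8 545 302 995       -----      +1.09%
hppa               3 483 527 330       -----     +4.466%
m68k               3 919 110 506       -----    +18.433%
mips               2 344 641 840       -----     +4.085%
mipsel             3 329 912 425       -----     +5.177%
mips64             2 359 024 910       -----     +4.075%
mips64el           3 343 650 686       -----     +5.166%
ppc                3 209 505 701       -----     +3.248%
ppc64              3 287 495 266       -----     +3.173%
ppc64le            3 287 135 580       -----     +3.171%
riscv64            1 221 617 903       -----     +0.278%
s390x              2 874 160 417       -----     +5.826%
sh4                3 544 094 841       -----      +6.42%
sparc64            3 426 094 848       -----     +7.138%
x86_64             1 249 076 697       -----     +0.335%
"""


def to_percent(field):
    """Convert a report field such as '+18.433%' to a float.

    The report prints '-----' for changes below 0.01%; this sketch
    treats such a field as 0.0.
    """
    if set(field) == {"-"}:
        return 0.0
    return float(field.rstrip("%"))


def parse_row(line):
    """Split one row into (target, instructions, latest, v5_1_0)."""
    parts = line.split()
    target = parts[0]
    # The instruction count is printed in space-separated groups of digits.
    instructions = int("".join(parts[1:-2]))
    return target, instructions, to_percent(parts[-2]), to_percent(parts[-1])


def rank_by_regression(report):
    """Return the parsed rows, largest change against v5.1.0 first."""
    rows = [parse_row(line) for line in report.splitlines() if line.strip()]
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    for target, instructions, _latest, baseline in rank_by_regression(REPORT):
        print(f"{target:<10} {instructions:>15,d} {baseline:+8.3f}%")
```

Sorting this way puts m68k first, well ahead of alpha and sparc64, which is the gap the question above is about.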



* Re: [REPORT] Nightly Performance Tests - Sunday, September 13, 2020
From: Ahmed Karaman @ 2020-09-14 10:50 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Thomas Huth, Laurent Vivier, Alex Bennée, QEMU Developers,
	Richard Henderson

On Mon, Sep 14, 2020 at 8:46 AM Philippe Mathieu-Daudé <f4bug@amsat.org> wrote:
>
> Hi Ahmed,
>
> (Maybe this was already discussed earlier and I missed it.)
>
> What change had such a significant impact on the m68k target?
> At a glance, I mostly see softfloat changes:
>
> $ git log --oneline v5.1.0..f00f57f3 tcg target/m68k fpu
> fe4b0b5bfa9 tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem
> 6a17646176e tcg: Eliminate one store for in-place 128-bit dup_mem
> e7e8f33fb60 tcg: Fix tcg gen for vectorized absolute value
> 5ebf5f4be66 softfloat: Define misc operations for bfloat16
> 34f0c0a98a5 softfloat: Define convert operations for bfloat16
> 8282310d853 softfloat: Define operations for bfloat16
> 0d93d8ec632 softfloat: Add fp16 and uint8/int8 conversion functions
> fbcc38e4cb1 softfloat: add xtensa specialization for pickNaNMulAdd
> 913602e3ffe softfloat: pass float_status pointer to pickNaN
> cc43c692511 softfloat: make NO_SIGNALING_NANS runtime property
> 73ebe95e8e5 target/ppc: add vmulld to INDEX_op_mul_vec case

Hi Mr. Philippe,
The performance degradation from v5.1.0 of all targets, and especially
m68k, was introduced between the two nightly tests below:

[REPORT] Nightly Performance Tests - Thursday, August 20, 2020:
https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg04923.html

[REPORT] Nightly Performance Tests - Saturday, August 22, 2020
https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg05537.html

It looks like the new build system is the culprit.

The "bisect.py" script introduced during the "TCG Continuous
Benchmarking" GSoC project can be very handy in these cases. I wrote
about the tool and how to use it in the report below:
https://ahmedkrmn.github.io/TCG-Continuous-Benchmarking/Finding-Commits-Affecting-QEMU-Performance/

Best regards,
Ahmed Karaman
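
For readers unfamiliar with the approach, the idea behind bisecting on performance can be sketched as a binary search over an ordered commit list. This is not the actual bisect.py implementation: the function name, the `measure` callback, the 1% threshold, and the commit names and instruction counts in the demo are all invented for illustration. It also assumes a single step change, so that every commit after the culprit stays slow.

```python
def find_first_slow_commit(commits, measure, threshold_percent=1.0):
    """Return the first commit whose cost exceeds the baseline by more
    than threshold_percent, or None if the newest commit is not slow.

    commits: list ordered oldest to newest; commits[0] is the baseline.
    measure: callable returning a cost (e.g. an instruction count).
    """
    baseline = measure(commits[0])

    def is_slow(commit):
        change = (measure(commit) - baseline) / baseline * 100
        return change > threshold_percent

    if not is_slow(commits[-1]):
        return None

    # Invariant: commits[good] is not slow, commits[bad] is slow.
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_slow(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


if __name__ == "__main__":
    # Hypothetical data: the cost jumps at commit "c5".
    fake_counts = {f"c{i}": 1000 if i < 5 else 1180 for i in range(8)}
    culprit = find_first_slow_commit(list(fake_counts), fake_counts.get)
    print("first slow commit:", culprit)
```

In a real run, `measure` would have to build QEMU at the given commit and count instructions for the test program, which is why every commit in the range needs to be buildable in the same way.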



* Re: [REPORT] Nightly Performance Tests - Sunday, September 13, 2020
From: Philippe Mathieu-Daudé @ 2020-09-14 11:17 UTC (permalink / raw)
  To: Ahmed Karaman
  Cc: Thomas Huth, Richard Henderson, Alex Bennée, Laurent Vivier,
	QEMU Developers

On 9/14/20 12:50 PM, Ahmed Karaman wrote:
> Hi Mr. Philippe,
> The performance degradation from v5.1.0 of all targets, and especially
> m68k, was introduced between the two nightly tests below:
> 
> [REPORT] Nightly Performance Tests - Thursday, August 20, 2020:
> https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg04923.html
> 
> [REPORT] Nightly Performance Tests - Saturday, August 22, 2020
> https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg05537.html
> 
> It looks like the new build system is the culprit.

Maybe we lost a build flag in that 1d806cef..66e01f1c range?
(or added a new one unconditionally).

> 
> The "bisect.py" script introduced during the "TCG Continuous
> Benchmarking" GSoC project can be very handy in these cases. I wrote
> about the tool and how to use it in the report below:
> https://ahmedkrmn.github.io/TCG-Continuous-Benchmarking/Finding-Commits-Affecting-QEMU-Performance/

Yeah, looks like the ideal tool for that.

> 
> Best regards,
> Ahmed Karaman
> 



* Re: [REPORT] Nightly Performance Tests - Sunday, September 13, 2020
From: Aleksandar Markovic @ 2020-09-14 12:43 UTC (permalink / raw)
  To: Ahmed Karaman
  Cc: Laurent Vivier, Thomas Huth, QEMU Developers,
	Philippe Mathieu-Daudé,
	Alex Bennée, Richard Henderson


On Mon, Sep 14, 2020 at 12:52 PM Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
wrote:

> Hi Mr. Philippe,
> The performance degradation from v5.1.0 of all targets, and especially
> m68k, was introduced between the two nightly tests below:
>
> [REPORT] Nightly Performance Tests - Thursday, August 20, 2020:
> https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg04923.html
>
> [REPORT] Nightly Performance Tests - Saturday, August 22, 2020
> https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg05537.html
>
> It looks like the new build system is the culprit.
>
> The "bisect.py" script introduced during the "TCG Continuous
> Benchmarking" GSoC project can be very handy in these cases. I wrote
> about the tool and how to use it in the report below:
>
> https://ahmedkrmn.github.io/TCG-Continuous-Benchmarking/Finding-Commits-Affecting-QEMU-Performance/
>
>
Hi, Ahmed.

I think the bisect.py script will work only if both the "start" and "end"
commits are before the build system change, or if both of them are after
it.

In other words, the script is unlikely to work if "start" is before and
"end" is after the build system change.

This means that, most probably, one should resort to manual analysis of
the origins of the performance degradation on Aug 22nd.

One area that might well be the culprit is the difference in CFLAGS
before and after the change.

Yours,
Aleksandar
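
A quick way to compare the compiler flags of two builds is to split both flag strings and report what appears on only one side. The sketch below is just that, a sketch: `diff_flags` is a made-up helper, and the two flag strings are invented examples, not the flags QEMU actually used before or after the build system change.

```python
import shlex


def diff_flags(before, after):
    """Return (removed, added) flags between two CFLAGS-style strings.

    Flags are compared as individual shell tokens, so an option that
    takes a separate argument shows up as two tokens.
    """
    old = shlex.split(before)
    new = shlex.split(after)
    removed = [flag for flag in old if flag not in new]
    added = [flag for flag in new if flag not in old]
    return removed, added


if __name__ == "__main__":
    # Hypothetical flag sets, for illustration only.
    before = "-O2 -g -fno-strict-aliasing -fPIE"
    after = "-O2 -g -fPIE -fstack-protector-strong"
    removed, added = diff_flags(before, after)
    print("removed:", removed)
    print("added:  ", added)
```

The real inputs would be the flag strings recorded for a build just before and a build just after the change.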


> Best regards,
> Ahmed Karaman
>
>



* Re: [REPORT] Nightly Performance Tests - Sunday, September 13, 2020
From: Philippe Mathieu-Daudé @ 2020-09-14 13:05 UTC (permalink / raw)
  To: Aleksandar Markovic, Ahmed Karaman
  Cc: Thomas Huth, Richard Henderson, Alex Bennée, Laurent Vivier,
	QEMU Developers

On 9/14/20 2:43 PM, Aleksandar Markovic wrote:
> On Mon, Sep 14, 2020 at 12:52 PM Ahmed Karaman
> <ahmedkhaledkaraman@gmail.com> wrote:
> 
>     Hi Mr. Philippe,
>     The performance degradation from v5.1.0 of all targets, and especially
>     m68k, was introduced between the two nightly tests below:
> 
>     [REPORT] Nightly Performance Tests - Thursday, August 20, 2020:
>     https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg04923.html
> 
>     [REPORT] Nightly Performance Tests - Saturday, August 22, 2020
>     https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg05537.html
> 
>     It looks like the new build system is the culprit.
> 
>     The "bisect.py" script introduced during the "TCG Continuous
>     Benchmarking" GSoC project can be very handy in these cases. I wrote
>     about the tool and how to use it in the report below:
>     https://ahmedkrmn.github.io/TCG-Continuous-Benchmarking/Finding-Commits-Affecting-QEMU-Performance/
> 
> 
> Hi, Ahmed.
> 
> I think the bisect.py script will work only if both the "start" and "end"
> commits are before the build system change, or if both of them are after
> it.
> 
> In other words, the script is unlikely to work if "start" is before and
> "end" is after the build system change.

Good point.

> This means that, most probably, one should resort to manual analysis of
> the origins of the performance degradation on Aug 22nd.

It would be useful to have a report for the build system change itself
(commit 7fd51e68c34) and then, as Aleksandar suggested, to resume normal
bisection after it (range 7fd51e68c34..66e01f1cdc9).
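
One way to keep each bisection on a single side of the build system change is to partition the candidate commits first. The sketch below assumes that `git merge-base --is-ancestor` is run inside a QEMU checkout; the function names and the `contains_change` callback are made up for illustration, and only the commit ID comes from this thread.

```python
import subprocess

BUILD_SYSTEM_CHANGE = "7fd51e68c34"  # commit ID quoted in this thread


def has_build_system_change(commit, repo="."):
    """True if BUILD_SYSTEM_CHANGE is an ancestor of `commit`.

    `git merge-base --is-ancestor A B` exits with 0 when A is an
    ancestor of B and with 1 when it is not.
    """
    result = subprocess.run(
        ["git", "merge-base", "--is-ancestor", BUILD_SYSTEM_CHANGE, commit],
        cwd=repo,
    )
    if result.returncode not in (0, 1):
        raise RuntimeError(f"git merge-base failed for {commit}")
    return result.returncode == 0


def split_at_build_change(commits, contains_change=has_build_system_change):
    """Partition commits into (before, after) the build system change."""
    before = [c for c in commits if not contains_change(c)]
    after = [c for c in commits if contains_change(c)]
    return before, after


if __name__ == "__main__":
    # Stand-in predicate so the demo runs without a git checkout.
    order = ["a", "b", BUILD_SYSTEM_CHANGE, "c"]
    pivot = order.index(BUILD_SYSTEM_CHANGE)
    before, after = split_at_build_change(
        order, contains_change=lambda c: order.index(c) >= pivot
    )
    print("bisect separately:", before, after)
```

Each half can then be bisected on its own with the build procedure that matches it.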

> 
> One area that might well be the culprit is the difference in CFLAGS
> before and after the change.
> 
> Yours,
> Aleksandar
>  
> 
>     Best regards,
>     Ahmed Karaman
> 


