* Performance drop on Baytrail with 4.5-rc2
@ 2016-02-01 20:34 Thomas Voegtle
  2016-02-02 10:33 ` Longepe, Philippe
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Voegtle @ 2016-02-01 20:34 UTC (permalink / raw)
  To: philippe.longepe; +Cc: linux-kernel, stephane.gasparini, rafael.j.wysocki


Hi,


I have a Celeron J1900 system (Asrock Q1900B-ITX) always running the
latest kernel.
Since running the 4.5-rc2 kernel I have seen a performance drop in an
ffmpeg benchmark compared to 4.4. The benchmark converts a piece of MPEG
to MP4 (written to /dev/null).

4.1 finished the benchmark in 74s, 4.4 in 75s, and 4.5-rc2 now takes 80s.

As the benchmark results are very stable, I could easily bisect it down
to the commit:

commit e70eed2b64545ab5c9d2f4d43372d79762f1b985
Author: Philippe Longepe <philippe.longepe@intel.com>
Date:   Fri Dec 4 17:40:32 2015 +0100

     cpufreq: intel_pstate: Account for non C0 time


I can revert this commit on top of 4.5-rc2 (4 of 5 hunks), and then the
benchmark is down to 76s.

Was this intended to decrease energy consumption?


     Thomas


* RE: Performance drop on Baytrail with 4.5-rc2
  2016-02-01 20:34 Performance drop on Baytrail with 4.5-rc2 Thomas Voegtle
@ 2016-02-02 10:33 ` Longepe, Philippe
  2016-02-02 17:56   ` Thomas Voegtle
  0 siblings, 1 reply; 6+ messages in thread
From: Longepe, Philippe @ 2016-02-02 10:33 UTC (permalink / raw)
  To: Thomas Voegtle; +Cc: linux-kernel, Gasparini, Stephane, Wysocki, Rafael J

Hi Thomas,

Yes, this new algorithm is intended to improve the performance versus power efficiency.

Can you please provide us with the exact instructions to reproduce your test?

Best Regards,

Philippe,



---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris, 
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


* RE: Performance drop on Baytrail with 4.5-rc2
  2016-02-02 10:33 ` Longepe, Philippe
@ 2016-02-02 17:56   ` Thomas Voegtle
       [not found]     ` <8B729BF8A98FF048A53185F813A469D935B1A189@HASMSX105.ger.corp.intel.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Voegtle @ 2016-02-02 17:56 UTC (permalink / raw)
  To: Longepe, Philippe; +Cc: linux-kernel, Gasparini, Stephane, Wysocki, Rafael J

On Tue, 2 Feb 2016, Longepe, Philippe wrote:

> Hi Thomas,
>
> Yes, this new algorithm is intended to improve the performance versus power efficiency.
>
> Can you please provide us with the exact instructions to reproduce your test?


Hi,

The benchmark I used is a little bit weird, so I tried to strip it
down a little.

The sample I used: https://32h.de/tv/test.vdr

I downloaded a statically built version from
http://johnvansickle.com/ffmpeg/

I used the 64bit v2.8.6 binary.
   (md5 of ffmpeg binary: 2989d50b4b13cb1e549955522fd7d311)

And then I did:
time -p ./ffmpeg -v 0 -y -i test.vdr -preset veryfast -vf scale=320:208 \
    -strict -2 -f mp4 /dev/null


Very interesting: the difference between 4.5-rc2-with-revert and 4.5-rc2
only shows up with the downscaling. When you remove "-vf scale=320:208",
you get the same times on 4.5-rc2-with-revert and the unmodified 4.5-rc2.
See below:


with downscale
==============
4.5-rc2-with-revert
real 93.16
real 92.96
real 93.24

4.5-rc2
real 100.00
real 99.31
real 99.12


without "-vf scale=320:208"
===========================
4.5-rc2-with-revert
real 157.59
real 157.27
real 157.58

4.5-rc2
real 157.49
real 157.68
real 157.53
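
To put a number on the figures above, the three runs per kernel can be averaged and compared. This is only a minimal sketch: the helper names mean_real and slowdown_pct are made up for illustration, and the input is assumed to be the "real" lines printed by time -p. With the downscale numbers it reports a slowdown of about 6.8% for the unmodified 4.5-rc2.

```shell
# mean_real: read `time -p` style output on stdin and print the mean of
# the "real" lines, rounded to two decimals. (Hypothetical helper.)
mean_real() {
    awk '$1 == "real" { sum += $2; n++ }
         END { if (n > 0) printf "%.2f\n", sum / n }'
}

# slowdown_pct BASE NEW: percentage by which NEW is slower than BASE,
# rounded to one decimal. (Hypothetical helper.)
slowdown_pct() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new / base - 1) * 100 }'
}

# Figures copied from the "with downscale" runs above.
reverted=$(printf 'real 93.16\nreal 92.96\nreal 93.24\n' | mean_real)
stock=$(printf 'real 100.00\nreal 99.31\nreal 99.12\n' | mean_real)
echo "revert: ${reverted}s stock: ${stock}s slowdown: $(slowdown_pct "$reverted" "$stock")%"
```

Feeding the numbers from the runs without "-vf scale=320:208" through the same helpers gives a difference well below 1%, which matches the observation that both kernels behave the same there.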


I'm confused, but I hope this helps? Do you need anything else?

Thanks in advance.


    Thomas


* RE: Performance drop on Baytrail with 4.5-rc2
       [not found]     ` <8B729BF8A98FF048A53185F813A469D935B1A189@HASMSX105.ger.corp.intel.com>
@ 2016-02-03 19:18       ` Thomas Voegtle
  2016-03-13 15:54       ` Thomas Voegtle
  1 sibling, 0 replies; 6+ messages in thread
From: Thomas Voegtle @ 2016-02-03 19:18 UTC (permalink / raw)
  To: Longepe, Philippe; +Cc: linux-kernel, Gasparini, Stephane, Wysocki, Rafael J

On Wed, 3 Feb 2016, Longepe, Philippe wrote:

> Thank you for sharing this test. I just did a quick test and yes, this is
> really interesting!
>
> Without the scale option, the load is close to 100% on each CPU (so
> the P-states increase up to the turbo frequency), but with
> scale=320:208 the load oscillates (close to 50% on average), so the
> requested frequencies are lower (the power is also reduced).
>
> I have a patch (not yet submitted) that reduces the gap for such a use case.

That's nice, glad the benchmark helped.
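
The bisected commit is titled "Account for non C0 time", and the quoted explanation talks about the load per CPU. A minimal sketch of what such a load figure can look like, assuming (this is not spelled out in the thread) that it is the share of a sampling interval the core spent in C0, taken as the MPERF delta divided by the TSC delta. MPERF only advances while the core is in C0. The function name c0_load_pct is hypothetical, and this is not the driver's code.

```shell
# c0_load_pct MPERF_DELTA TSC_DELTA
# Prints the integer percentage of the sampling interval spent in C0.
# Assumption: both deltas were taken over the same interval.
c0_load_pct() {
    echo $(( 100 * $1 / $2 ))
}

c0_load_pct 1000000 1000000   # core busy for the whole interval
c0_load_pct 500000 1000000    # core idle for about half of the interval
```

Under this assumption a workload that keeps every core busy reports a load near 100, while one that leaves the cores idle half of the time reports about 50, which would lead a load-based algorithm to request lower frequencies.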

> As a temporary solution, you can also switch to performance with:
>
> sudo su
> echo performance > /sys/devices/system/cpu/cpu*/scaling_governor

Yes, ok, thanks.
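
A note on the quoted command: a shell redirection takes a single target, so a glob that matches several CPUs generally does not work as written, and on typical systems the file lives under cpuN/cpufreq/. A minimal sketch of a loop that writes the governor for every CPU, assuming the usual sysfs layout; the function name set_governor and its optional second argument are made up for illustration.

```shell
# set_governor GOVERNOR [CPU_ROOT]
# Writes GOVERNOR into every cpu*/cpufreq/scaling_governor below CPU_ROOT.
# CPU_ROOT defaults to the usual sysfs location (writing there needs root).
set_governor() {
    gov=$1
    root=${2:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue    # glob matched nothing, or no cpufreq here
        printf '%s\n' "$gov" > "$f"
    done
}
```

Run as root, "set_governor performance" switches all CPUs; "set_governor powersave" switches them back afterwards.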

> However, I still don't know why there is 50% idle time with the scale
> option (is it using a hardware accelerator?).


Hm, I don't know. I always assumed that the scaling is not as easy to
spread across threads as the MPEG-to-MP4 conversion, or something like that?


Thanks,

   Thomas


* RE: Performance drop on Baytrail with 4.5-rc2
       [not found]     ` <8B729BF8A98FF048A53185F813A469D935B1A189@HASMSX105.ger.corp.intel.com>
  2016-02-03 19:18       ` Thomas Voegtle
@ 2016-03-13 15:54       ` Thomas Voegtle
  2016-03-15  8:52         ` Longepe, Philippe
  1 sibling, 1 reply; 6+ messages in thread
From: Thomas Voegtle @ 2016-03-13 15:54 UTC (permalink / raw)
  To: Longepe, Philippe; +Cc: linux-kernel, Gasparini, Stephane, Wysocki, Rafael J

On Wed, 3 Feb 2016, Longepe, Philippe wrote:

> Hi Thomas,
>
> Thank you for sharing this test. I just did a quick test and yes, this is
> really interesting!
>
> Without the scale option, the load is close to 100% on each CPU (so
> the P-states increase up to the turbo frequency), but with
> scale=320:208 the load oscillates (close to 50% on average), so the
> requested frequencies are lower (the power is also reduced).
>
> I have a patch (not yet submitted) that reduces the gap for such a use
> case.

Hi,

may I ask what the status of this patch is? I have not seen anything in
Linus' tree that fixes this performance drop.


Thanks,

   Thomas


* RE: Performance drop on Baytrail with 4.5-rc2
  2016-03-13 15:54       ` Thomas Voegtle
@ 2016-03-15  8:52         ` Longepe, Philippe
  0 siblings, 0 replies; 6+ messages in thread
From: Longepe, Philippe @ 2016-03-15  8:52 UTC (permalink / raw)
  To: Thomas Voegtle
  Cc: linux-kernel, Gasparini, Stephane, Wysocki, Rafael J, Pandruvada,
	Srinivas

Hi Thomas,

The patches that improve the ffmpeg case had to be tested for 2 months on Android before submission.

That's why I resubmitted them very recently for review on our internal list.

They will be submitted to the linux-pm list as soon as our internal review is finished.

Srinivas, Rafael, did you get a chance to look at my patches (Perf/Power improvements for load-based algorithm)?

In the meantime I'll do some additional tests with the Phoronix Test Suite.

Regards,

Philippe,



