From: Oliver Sang <oliver.sang@intel.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>, Ben Segall <bsegall@google.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, Mike Galbraith <efault@gmx.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, OTC LSE PnP <otc.lse.pnp@intel.com>,
	ying.huang@intel.com
Subject: Re: [sched/fair] 0b0695f2b3: phoronix-test-suite.compress-gzip.0.seconds 19.8% regression
Date: Thu, 21 May 2020 16:38:15 +0800	[thread overview]
Message-ID: <20200521083815.GA19280@xsang-OptiPlex-9020> (raw)
In-Reply-To: <CAKfTPtCnnCcoN8m+qcPZNhO_RjkwRwiPT4Qq1qYRqTPn8Z_prQ@mail.gmail.com>

On Wed, May 20, 2020 at 03:04:48PM +0200, Vincent Guittot wrote:
> On Thu, 14 May 2020 at 19:09, Vincent Guittot
> <vincent.guittot@linaro.org> wrote:
> >
> > Hi Oliver,
> >
> > On Thu, 14 May 2020 at 16:05, kernel test robot <oliver.sang@intel.com> wrote:
> > >
> > > Hi Vincent Guittot,
> > >
> > > Below report FYI.
> > > Last year we actually reported an improvement, "[sched/fair] 0b0695f2b3:
> > > vm-scalability.median 3.1% improvement", on link [1],
> > > but now we found a regression on pts.compress-gzip.
> > > This seems to align with "[v4,00/10] sched/fair: rework the CFS
> > > load balance" (link [2]), which showed the reworked load balance could have
> > > both positive and negative effects on different test suites.
> >
> > We tried to run all possible use cases, but it is impossible to
> > cover them all, so there was always a possibility that an uncovered
> > case would regress.
> >
> > > And also from link [3], the patch set risks regressions.
> > >
> > > We also confirmed this regression on another platform
> > > (Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz with 8 GB memory);
> > > below is the data (in seconds, lower is better).
> > > v5.4                                        4.10
> > > fcf0553db6f4c79387864f6e4ab4a891601f395e    4.01
> > > 0b0695f2b34a4afa3f6e9aa1ff0e5336d8dad912    4.89
> > > v5.5                                        5.18
> > > v5.6                                        4.62
> > > v5.7-rc2                                    4.53
> > > v5.7-rc3                                    4.59
> > >
> > > It seems the latest kernels recover some of the loss, but not all of it.
> > > We were wondering whether you could shed some light on the further
> > > load-balance work after patch set [2] that could explain the
> > > performance change.
> > > And do you plan to refine the load-balance algorithm further?
> >
> > I'm going to have a look at your regression to understand what is
> > going wrong and how it can be fixed.
> 
> I have run the benchmark on my local setups to try to reproduce the
> regression, but I don't see it. My setups are different from yours,
> though, so it might be a problem specific to your configuration.

Hi Vincent, which OS are you using? We found the regression on Clear Linux,
but cannot reproduce it on Debian.
On https://www.phoronix.com/scan.php?page=article&item=mac-win-linux2018&num=5
it was mentioned that:
"Gzip compression is much faster out-of-the-box on Clear Linux due to it
exploiting multi-threading capabilities compared to the other operating
systems' Gzip support."
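For reference, that Clear Linux behavior can be approximated on other
distributions with pigz, which parallelizes deflate across worker threads.
A rough sketch (the 4 MiB payload is illustrative; the actual PTS
compress-gzip profile uses a much larger input):

```shell
# Create a small stand-in payload for the benchmark input.
head -c 4194304 /dev/urandom > payload.bin

# Stock gzip: single-threaded compression, one busy CPU.
gzip -k -f -6 payload.bin

# Verify the archive round-trips.
gzip -t payload.bin.gz && echo "archive OK"

# On Clear Linux the default gzip is effectively multi-threaded; elsewhere
# pigz gives the same effect (uncomment if installed):
#   pigz -k -f -6 payload.bin
```

The scheduling pattern the two variants present is quite different: one
long-running thread versus several shorter-lived workers, which is exactly
the kind of difference a load-balance rework can expose.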

> 
> After analysing the benchmark, it doesn't overload the system and is
> mainly driven by one main gzip thread, with a few others waking up and
> going back to sleep around it.
> 
> I thought the scheduler might be too aggressive when trying to balance
> the threads on your system, which could generate more task migrations
> and hurt performance. But that doesn't seem to be the case, because
> perf-stat.i.cpu-migrations is down 8%. On the other hand, context
> switches are up 16% and, more interestingly, usage of the C1E and C6
> idle states increases by more than 50%. I don't know whether we can
> rely on this value, but I wonder whether the threads are now spread
> across more CPUs, which creates idle time on the busy CPUs, and the
> added cost of entering/leaving these idle states hurts performance.
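The context-switch delta is easy to cross-check without the full perf-stat
machinery: /proc/stat exposes a cumulative "ctxt" counter that can be
sampled around the workload. A sketch (the gzip pipeline is a stand-in for
the real benchmark, and the delta includes background noise):

```shell
# System-wide context switches since boot, before the run.
before=$(awk '/^ctxt/ {print $2}' /proc/stat)

# Stand-in workload: compress 4 MiB of random data.
head -c 4194304 /dev/urandom | gzip -6 > /dev/null

# ... and after; the difference approximates what perf-stat's
# context-switches counter would report for this window.
after=$(awk '/^ctxt/ {print $2}' /proc/stat)
echo "context switches during run: $((after - before))"
```

Idle-state usage can be inspected the same way, from the cumulative
counters in /sys/devices/system/cpu/cpu*/cpuidle/state*/usage.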
> 
> Could you capture traces on both kernels? Tracing sched events
> should be enough to understand the behavior.
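A minimal way to capture such a trace, assuming trace-cmd (the ftrace
front-end) is available; the events are the standard sched tracepoints,
and the traced gzip pipeline is a stand-in for the benchmark command:

```shell
# Record wakeups, context switches and migrations around a stand-in
# workload. Needs root and trace-cmd, so the sketch is guarded to degrade
# gracefully where either is missing.
if command -v trace-cmd >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
    trace-cmd record -e sched:sched_switch -e sched:sched_wakeup \
                     -e sched:sched_migrate_task \
                     sh -c 'head -c 4194304 /dev/urandom | gzip -6 > /dev/null'
    # The resulting trace.dat can be inspected with trace-cmd report.
    trace-cmd report | head -n 20
else
    echo "trace-cmd or root privileges missing; skipping trace"
fi
```

`perf sched record -- <cmd>` followed by `perf sched latency` gives a
comparable view where perf is preferred over trace-cmd.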
> 
> Regards,
> Vincent
> 
> >
> > Thanks
> > Vincent
> >
> > > thanks
> > >
> > > [1] https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/SANC7QLYZKUNMM6O7UNR3OAQAKS5BESE/
> > > [2] https://lore.kernel.org/patchwork/cover/1141687/
> > > [3] https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.5-Scheduler

Thread overview: 26+ messages
2020-05-14 14:15 [sched/fair] 0b0695f2b3: phoronix-test-suite.compress-gzip.0.seconds 19.8% regression kernel test robot
2020-05-14 17:09 ` Vincent Guittot
2020-05-15  1:43   ` Oliver Sang
     [not found]   ` <20200515141226.17700-1-hdanton@sina.com>
2020-05-18  7:00     ` Oliver Sang
2020-05-20 13:04   ` Vincent Guittot
2020-05-21  8:38     ` Oliver Sang [this message]
2020-05-25  8:02       ` Vincent Guittot
2020-05-29 17:26         ` Vincent Guittot
2020-06-02  5:23           ` Oliver Sang
2020-06-02 14:23             ` Oliver Sang
2020-06-03 17:06               ` Vincent Guittot
2020-06-04  8:56                 ` Mel Gorman
2020-06-05  7:06                   ` Vincent Guittot