From: Tejun Heo
Date: Thu, 4 May 2017 13:39:01 -0400
To: Dietmar Eggemann
Cc: Peter Zijlstra, Ingo Molnar, linux-kernel@vger.kernel.org,
 Linus Torvalds, Vincent Guittot, Mike Galbraith, Paul Turner,
 Chris Mason, kernel-team@fb.com
Subject: Re: [PATCH v2 1/2] sched/fair: Fix how load gets propagated from cfs_rq to its sched_entity
Message-ID: <20170504173901.GB7288@htj.duckdns.org>
In-Reply-To: <02c93bb0-ad3c-9a01-0351-e4ee6e56bf1b@arm.com>
References: <20170424201344.GA14169@wtj.duckdns.org>
 <20170424201415.GB14169@wtj.duckdns.org>
 <20170424213324.GA23619@wtj.duckdns.org>
 <20170503180028.ejf73et3pc4meqji@hirez.programming.kicks-ass.net>
 <20170503214546.GA7451@htj.duckdns.org>
 <20170504055129.c7f7whdqpcqyvnrz@hirez.programming.kicks-ass.net>
 <20170504062109.d5o2dz6t4lqnkp5n@hirez.programming.kicks-ass.net>
 <02c93bb0-ad3c-9a01-0351-e4ee6e56bf1b@arm.com>

Hello, Dietmar.

On Thu, May 04, 2017 at 10:49:51AM +0100, Dietmar Eggemann wrote:
> On 04/05/17 07:21, Peter Zijlstra wrote:
> > On Thu, May 04, 2017 at 07:51:29AM +0200, Peter Zijlstra wrote:
> >
> >> Urgh, and my numbers were so pretty :/
> >
> > Just to clarify on how to run schbench, I limited it to a single socket
> > (as that is what you have) and set -t to the number of cores in the
> > socket (not the number of threads).
> >
> > Furthermore, my machine is _idle_; if I don't do anything, it doesn't
> > do _anything_.
>
> I can't recreate this problem running 'numactl -N 0 ./schbench -m 2 -t
> 10 -s 10000 -c 15000 -r 30' on my E5-2690 v2 (IVB-EP, 2 sockets, 10
> cores / socket, 2 threads / core).
>
> I tried tip/sched/core, comparing running in 'cpu:/' and 'cpu:/foo', and
> using your patch on top with all the combinations of {NO_}FUDGE,
> {NO_}FUDGE2 with prop_type=shares_avg or prop_type_runnable.
>
> Were you able to see the issue on tip/sched/core w/o your patch on your
> machine?
>
> The workload of n 60% periodic tasks on n logical cpus always creates a
> very stable task distribution for me.

It depends heavily on what else is going on in the system.  On the test
systems that I'm using, there's always something not-too-heavy going on.
The pattern over time isn't too varied, the latency results are usually
stable, and the grouping of the results is very clear: whether the load
balancer is working properly or not shows up as up to an order of
magnitude difference in p99 latencies.

For these differences to matter, you need to push the machine so that
it's right at the point of saturation - e.g. increase the duty cycle
until p99 starts to deteriorate w/o cgroup.

Thanks.

-- 
tejun
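
For reference, the duty cycle of the schbench invocation quoted above
follows from the per-period spin time (-c) and sleep time (-s):
duty = c / (c + s), so -c 15000 -s 10000 gives 15000 / 25000 = 60%,
matching the "n 60% periodic tasks" description in the thread.  A
minimal sketch of pushing the duty cycle toward saturation, assuming
those -c/-s semantics; the 75% values below are illustrative and not
taken from the thread:

    # 60% duty cycle, as used in the thread: 15000 / (15000 + 10000) = 0.6
    numactl -N 0 ./schbench -m 2 -t 10 -s 10000 -c 15000 -r 30

    # ~75% duty cycle: shorten the sleep time while keeping the spin time,
    # repeating until p99 starts to deteriorate without cgroups
    numactl -N 0 ./schbench -m 2 -t 10 -s 5000 -c 15000 -r 30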