Date: Mon, 23 Mar 2015 17:47:02 +0100
From: Peter Zijlstra
To: Morten Rasmussen
Cc: Sai Gurrappadi, "mingo@redhat.com", "vincent.guittot@linaro.org",
	Dietmar Eggemann, "yuyang.du@intel.com",
	"preeti@linux.vnet.ibm.com", "mturquette@linaro.org",
	"nico@linaro.org", "rjw@rjwysocki.net", Juri Lelli,
	"linux-kernel@vger.kernel.org", Peter Boonstoppel
Subject: Re: [RFCv3 PATCH 30/48] sched: Calculate energy consumption of sched_group
Message-ID: <20150323164702.GL23123@twins.programming.kicks-ass.net>
References: <1423074685-6336-1-git-send-email-morten.rasmussen@arm.com>
	<1423074685-6336-31-git-send-email-morten.rasmussen@arm.com>
	<55036AA1.7000801@nvidia.com>
	<20150316141546.GQ4081@e105550-lin.cambridge.arm.com>
In-Reply-To: <20150316141546.GQ4081@e105550-lin.cambridge.arm.com>

On Mon, Mar 16, 2015 at 02:15:46PM +0000, Morten Rasmussen wrote:
> You are absolutely right. The current code is broken for system
> topologies where all cpus share the same clock source. To be honest,
> it is actually worse than that, and you have already pointed out the
> reason: we have no way of representing top-level contributions to
> power consumption in RFCv3, since we don't have a sched_group
> spanning all cpus in a single-cluster system. For example, we can't
> represent L2 cache and interconnect power consumption on such
> systems.
>
> In RFCv2 we had a system-wide sched_group dangling by itself for
> that purpose. We chose to remove it in this rewrite because it led
> to messy code. In my opinion, a more elegant solution is to
> introduce an additional sched_domain above the current top level,
> with a single sched_group spanning all cpus in the system. That
> should fix the SD_SHARE_CAP_STATES problem and allow us to attach
> power data to the top level.

Maybe remind us why this needs to be tied to sched_groups? Why can't
we attach the energy information to the domains?

There is an additional problem with groups that you've not yet
discovered: overlapping groups. Certain NUMA topologies result in
these; there, the sum of cpus over a domain's groups is greater than
the number of cpus spanned by the domain itself.
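
Concretely, take four NUMA nodes in a line, 0-1-2-3. The 2-hop domain
of cpu 1 spans {0,1,2,3}, but its groups are built from the 1-hop
child spans, e.g. {0,1,2} and {2,3}: node 2 gets counted twice, so
summing group weights gives 5 against a domain span of 4. A quick
sketch of that accounting (sum_group_weights() is made up for
illustration; the rest are the kernel-internal helpers from
kernel/sched/sched.h as of this series):

static unsigned long sum_group_weights(struct sched_domain *sd)
{
	struct sched_group *sg = sd->groups;
	unsigned long sum = 0;

	/* A domain's groups form a circular list; walk it once. */
	do {
		sum += cpumask_weight(sched_group_cpus(sg));
		sg = sg->next;
	} while (sg != sd->groups);

	/*
	 * With SD_OVERLAP the group spans intersect, so this sum can
	 * exceed cpumask_weight(sched_domain_span(sd)); naively
	 * summing a per-group energy term over the groups would
	 * over-count.
	 */
	return sum;
}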
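
And for reference on the data itself: RFCv3 hangs roughly the
following off each sched_group (a sketch from memory of the RFC
patches; exact fields may differ). The question above is whether this
couldn't live in struct sched_domain instead:

struct capacity_state {
	unsigned long cap;	/* compute capacity at this P-state */
	unsigned long power;	/* busy power at this P-state */
};

struct idle_state {
	unsigned long power;	/* power consumed in this idle state */
};

struct sched_group_energy {
	unsigned int nr_idle_states;		/* number of idle states */
	struct idle_state *idle_states;		/* idle state array */
	unsigned int nr_cap_states;		/* number of capacity states */
	struct capacity_state *cap_states;	/* capacity state array */
};

A top-level group (or domain) spanning all cpus would then carry the
L2/interconnect contributions mentioned above.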