From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
    aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-5.3 required=3.0 tests=BAYES_00,
    HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,
    SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
    by smtp.lore.kernel.org (Postfix) with ESMTP id B4502C07E9B
    for ; Wed, 7 Jul 2021 09:54:24 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
    by mail.kernel.org (Postfix) with ESMTP id 90EEB613E1
    for ; Wed, 7 Jul 2021 09:54:24 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S231176AbhGGJ5D (ORCPT ); Wed, 7 Jul 2021 05:57:03 -0400
Received: from foss.arm.com ([217.140.110.172]:33236 "EHLO foss.arm.com"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S229949AbhGGJ5C (ORCPT ); Wed, 7 Jul 2021 05:57:02 -0400
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
    by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id CDC10ED1;
    Wed, 7 Jul 2021 02:54:21 -0700 (PDT)
Received: from [10.57.1.129] (unknown [10.57.1.129])
    by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 068573F694;
    Wed, 7 Jul 2021 02:54:18 -0700 (PDT)
Subject: Re: [PATCH 1/3] sched/fair: Prepare variables for increased
    precision of EAS estimated energy
To: Dietmar Eggemann
Cc: Vincent Guittot, linux-kernel, Chris Redpath, Morten Rasmussen,
    Quentin Perret, "open list:THERMAL", Peter Zijlstra,
    "Rafael J. Wysocki", Viresh Kumar, Ingo Molnar, Juri Lelli,
    Steven Rostedt, segall@google.com, Mel Gorman,
    Daniel Bristot de Oliveira, CCj.Yeh@mediatek.com
References: <20210625152603.25960-1-lukasz.luba@arm.com>
    <20210625152603.25960-2-lukasz.luba@arm.com>
    <2f43b211-da86-9d48-4e41-1c63359865bb@arm.com>
    <9b0ea7bc-934a-43bd-7dd8-9fe33dec97bc@arm.com>
From: Lukasz Luba
Message-ID: <730a57b2-e36f-6b69-5e9d-c27e8a3003bb@arm.com>
Date: Wed, 7 Jul 2021 10:54:16 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
    Thunderbird/60.9.0
MIME-Version: 1.0
In-Reply-To: <9b0ea7bc-934a-43bd-7dd8-9fe33dec97bc@arm.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/7/21 10:45 AM, Dietmar Eggemann wrote:
> On 07/07/2021 10:23, Lukasz Luba wrote:
>>
>> On 7/7/21 9:00 AM, Vincent Guittot wrote:
>>> On Wed, 7 Jul 2021 at 09:49, Lukasz Luba wrote:
>>>>
>>>>
>>>>
>>>> On 7/7/21 8:07 AM, Vincent Guittot wrote:
>>>>> On Fri, 25 Jun 2021 at 17:26, Lukasz Luba wrote:
>
> [...]
>
>>>>> Could you explain why 32bits results are not enough and you need to
>>>>> move to 64bits ?
>>>>>
>>>>> Right now the result is in the range [0..2^32[ mW. If you need more
>>>>> precision and you want to return uW instead, you will have a result in
>>>>> the range [0..4kW[ which seems to be still enough
>>>>>
>>>>
>>>> Currently we have the max value limit for 'power' in EM which is
>>>> EM_MAX_POWER 0xffff (64k - 1). We allow to register such big power
>>>> values ~64k mW (~64Watts) for an OPP. Then based on 'power' we
>>>> pre-calculate 'cost' fields:
>>>> cost[i] = power[i] * freq_max / freq[i]
>>>> So, for max freq the cost == power. Let's use that in the example.
>>>>
>>>> Then the em_cpu_energy() calculates as follow:
>>>> cost * sum_util / scale_cpu
>>>> We are interested in the first part - the value of multiplication.
>>>
>>> But all these are internal computations of the energy model. At the
>>> end, the computed energy that is returned by compute_energy() and
>>> em_cpu_energy(), fits in a long
>>
>> Let's take a look at existing *10000 precision for x CPUs:
>> cost * sum_util / scale_cpu =
>> (64k *10000) * (x * 800) / 1024
>> which is:
>> x * ~500mln
>>
>> So to be close to overflowing u32 the 'x' has to be > (?=) 8
>> (depends on sum_util).
>
> I assume the worst case is `x * 1024` (max return value of
> effective_cpu_util = effective_cpu_util()) so x ~ 6.7.
>
> I'm not aware of any arm32 b.L. systems with > 4 CPUs in a PD.
>

True, arm32 never supported more than 4 CPUs in a cluster, so we would
be safe there. But I don't want this assumption to break some other
32-bit platform from another vendor, which might build a 32-bit cluster
with 16 cores.

If Peter, Vincent and you are OK with relying on this assumption about
the maximum safe number of CPUs, then we can drop patch 1/3. The
temporary u64 division must stay, though, because there is an arm32
platform which needs it. So returning u64 as well does no great harm
and looks more consistent.
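
To make the arithmetic in this thread easier to check, here is a minimal user-space sketch in C. It is not kernel code and not the patch itself: the helper names est_energy() and first_overflowing_cpus() are hypothetical, the constants are taken from the numbers quoted above (EM_MAX_POWER 0xffff, the *10000 precision, a CPU scale of 1024), and cost is assumed equal to power at the maximum frequency, as in the example above.

```c
#include <stdint.h>

/* Values quoted in the thread; the macro names other than EM_MAX_POWER
 * are assumptions made for this sketch. */
#define EM_MAX_POWER 0xffffULL  /* max 'power' for an OPP, in mW */
#define PRECISION    10000ULL   /* the "*10000" scaling discussed above */
#define SCALE_CPU    1024ULL    /* capacity scale of the CPU */

/* Hypothetical helper: cost * sum_util / scale_cpu for nr_cpus CPUs,
 * each with utilization util_per_cpu, computed entirely in 64 bits. */
static uint64_t est_energy(uint64_t nr_cpus, uint64_t util_per_cpu)
{
    uint64_t cost = EM_MAX_POWER * PRECISION;   /* cost == power at max freq */
    uint64_t sum_util = nr_cpus * util_per_cpu;

    return cost * sum_util / SCALE_CPU;
}

/* Hypothetical helper: smallest number of CPUs for which the final
 * result no longer fits in a u32. */
static unsigned int first_overflowing_cpus(uint64_t util_per_cpu)
{
    unsigned int x = 1;

    while (est_energy(x, util_per_cpu) <= UINT32_MAX)
        x++;

    return x;
}
```

Under these assumptions, first_overflowing_cpus(800) gives 9 and first_overflowing_cpus(1024) gives 7, which lines up with the "> 8" and "~6.7" estimates quoted above. Note also that the intermediate product cost * sum_util is already larger than a u32 for a single CPU at utilization 800, which is why the sketch keeps the temporary value in 64 bits before dividing.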