From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.2 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C43BC4361A for ; Thu, 3 Dec 2020 16:10:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1A61D207A5 for ; Thu, 3 Dec 2020 16:10:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731139AbgLCQKd (ORCPT ); Thu, 3 Dec 2020 11:10:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55778 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727395AbgLCQKa (ORCPT ); Thu, 3 Dec 2020 11:10:30 -0500 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 63CEEC061A51 for ; Thu, 3 Dec 2020 08:09:50 -0800 (PST) Received: by mail-wr1-x441.google.com with SMTP id k14so2441514wrn.1 for ; Thu, 03 Dec 2020 08:09:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=Egp56+bWfac+3k12p/dmisOXz1k0Ww/MYP5K1gwdQ58=; b=IbKIdizWv0uaVk1vhtYurj8fNhCBBnAN4q2j0WS9KRqrF2U6oz3IrfaUHX1Qtrfx3i ogaKErJH4njzq3nwtbFQAWMx40OAREHKSdnaxxaEDqG2eqlKRM9i4+OYUz8q0pdywmzJ Y+u4VUxK5hAvJpJOhlVNgAbvhzDCmnJAWDGkOUq5QOCsK5EAzxttQ+a5nGdZch+09os6 py9ikEmpyXVadNtvdOOw/LGKP7EbI0dnf75bFCMvS+ePPdg7udWoXco1Ux1VHxt9CuOW v5phtZ8cA4Ye1+BWlVYKx+U7T0+l9yZAAsPVaAEbXQiHo6AFEmiNfFlwvufw2ewYAap8 Z71A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=Egp56+bWfac+3k12p/dmisOXz1k0Ww/MYP5K1gwdQ58=; b=Oi3mC8WSoo0N9voZAfcaCSy+nAFRWL80zxsrZn6ulD47/6FW7BwxbnrGvipWKpGaf2 cYFcJNnLxFTaVDFnCdJkAiNgCUDRg8QD008wzI/uOgagtfR+8UQSMtWZBvAbg+UxF3w0 Hp+ehGMA1seh8lnrRS4waRO3Jv1EGGL8H9WAqi6G0PuSQIm4qjtmjYzO142K2ge+pe5o 2QQVEN2PaIhzkMDx4zUaSmaOGO6KfvNd8hW+dN9GL/6ltVwftISFOGo03f/OasSrL7pL tFEYyYamBhD/NG2aoo+UHpP7XaNo5FMUPN9jY014qoqW9JIOxbClB98EwLRaEumfSMK8 eapA== X-Gm-Message-State: AOAM530vNGU12a3rib4iql0GaA4ic+1EBL3aYy3ENdwyIBL/FQHF2jfb ArRa7H24vXfJYNdlmafU97TQfg== X-Google-Smtp-Source: ABdhPJx+gI/bhjY3qusZeY90Fqj3HltB+Pt0HmNnq6N7kRujX0WWTNyi+4zTbYCRxzLDMVulGuZCyA== X-Received: by 2002:a5d:5689:: with SMTP id f9mr4547592wrv.181.1607011788834; Thu, 03 Dec 2020 08:09:48 -0800 (PST) Received: from ?IPv6:2a01:e34:ed2f:f020:9cff:9584:adb2:6288? ([2a01:e34:ed2f:f020:9cff:9584:adb2:6288]) by smtp.googlemail.com with ESMTPSA id o83sm2009235wme.21.2020.12.03.08.09.47 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 03 Dec 2020 08:09:48 -0800 (PST) Subject: Re: [PATCH v2 2/5] thermal: devfreq_cooling: get a copy of device status To: Lukasz Luba Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, dri-devel@lists.freedesktop.org, rui.zhang@intel.com, amit.kucheria@verdurent.com, orjan.eide@arm.com, robh@kernel.org, alyssa.rosenzweig@collabora.com, steven.price@arm.com, airlied@linux.ie, daniel@ffwll.ch, ionela.voinescu@arm.com References: <20201118120358.17150-1-lukasz.luba@arm.com> <20201118120358.17150-3-lukasz.luba@arm.com> <5d4743b9-5b2f-8494-8d10-6a5fd2c0fdfd@linaro.org> From: Daniel Lezcano Message-ID: <224c6b9b-977a-d553-f22b-2056223a84bc@linaro.org> Date: Thu, 3 Dec 2020 17:09:46 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 03/12/2020 16:38, Lukasz Luba wrote: > > > On 12/3/20 1:09 PM, Daniel Lezcano wrote: >> On 18/11/2020 13:03, Lukasz Luba wrote: >>> Devfreq cooling needs to now the correct status of the device in order >>> to operate. Do not rely on Devfreq last_status which might be a stale >>> data >>> and get more up-to-date values of the load. >>> >>> Devfreq framework can change the device status in the background. To >>> mitigate this situation make a copy of the status structure and use it >>> for internal calculations. >>> >>> In addition this patch adds normalization function, which also makes >>> sure >>> that whatever data comes from the device, it is in a sane range. >>> >>> Signed-off-by: Lukasz Luba >>> --- >>>   drivers/thermal/devfreq_cooling.c | 52 +++++++++++++++++++++++++------ >>>   1 file changed, 43 insertions(+), 9 deletions(-) >>> >>> diff --git a/drivers/thermal/devfreq_cooling.c >>> b/drivers/thermal/devfreq_cooling.c >>> index 659c0143c9f0..925523694462 100644 >>> --- a/drivers/thermal/devfreq_cooling.c >>> +++ b/drivers/thermal/devfreq_cooling.c >>> @@ -227,20 +227,46 @@ static inline unsigned long >>> get_total_power(struct devfreq_cooling_device *dfc, >>>                                      voltage); >>>   } >>>   +static void _normalize_load(struct devfreq_dev_status *status) >>> +{ >>> +    /* Make some space if needed */ >>> +    if (status->busy_time > 0xffff) { >>> +        status->busy_time >>= 10; >>> +        status->total_time >>= 10; >>> +    } >>> + >>> +    if (status->busy_time > status->total_time) >>> +        status->busy_time = status->total_time; >> >> How the condition above is possible? > > They should, be checked by the driver, but I cannot trust > and have to check for all corner cases: (div by 0, overflow > one of them, etc). The busy_time and total_time are unsigned long, > which means 4B on 32bit machines. > If these values are coming from device counters, which count every > busy cycle and total cycles of a clock of a device running at e.g. > 1GHz they would overflow every ~4s. I don't think it is up to this routine to check the driver is correctly implemented, especially at every call to get_requested_power. If the normalization ends up by doing this kind of thing, there is certainly something wrong in the 'status' computation to be fixed before submitting this series. > Normally IPA polling are 1s and 100ms, it's platform specific. But there > are also 'empty' periods when IPA sees temperature very low and does not > even call the .get_requested_power() callbacks for the cooling devices, > just grants max freq to all. This is problematic. I am investigating it > and will propose a solution for IPA soon. > > I would avoid all of this if devfreq core would have default for all > devices a reliable polling timer... Let me check some possibilities also > for this case. Ok, may be create an API to compute the 'idle,busy,total times' to be used by the different the devfreq drivers and then fix the overflow in this common place. >>> +    status->busy_time *= 100; >>> +    status->busy_time /= status->total_time ? : 1; >>> + >>> +    /* Avoid division by 0 */ >>> +    status->busy_time = status->busy_time ? : 1; >>> +    status->total_time = 100; >> >> Why not base the normalization on 1024? and use an intermediate u64. > > You are the 2nd reviewer who is asking this. I tried to keep 'load' as > in range [0, 100] since we also have 'load' in cpufreq cooling in this > range. Maybe I should switch to 1024 (Ionela was also asking for this). Well it is common practice to compute normalization with 1024 because the division is a bit shift and the compiler optimize the code very well with that value. -- Linaro.org │ Open source software for ARM SoCs Follow Linaro: Facebook | Twitter | Blog