Date: Tue, 13 Nov 2018 07:11:27 -0800
From: Patrick Bellasi
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ingo Molnar,
    Tejun Heo, "Rafael J. Wysocki", Vincent Guittot, Viresh Kumar,
    Paul Turner, Quentin Perret, Dietmar Eggemann, Morten Rasmussen,
    Juri Lelli, Todd Kjos, Joel Fernandes, Steve Muckle,
    Suren Baghdasaryan
Subject: Re: [PATCH v5 04/15] sched/core: uclamp: add CPU's clamp groups refcounting
Message-ID: <20181113151127.GA7681@darkstar>
References: <20181029183311.29175-1-patrick.bellasi@arm.com>
 <20181029183311.29175-6-patrick.bellasi@arm.com>
 <20181111164754.GA3038@worktop>
In-Reply-To: <20181111164754.GA3038@worktop>
User-Agent: Mutt/1.9.4 (2018-02-28)

On 11-Nov 17:47, Peter Zijlstra wrote:
> On Mon, Oct 29, 2018 at 06:32:59PM +0000, Patrick Bellasi wrote:
> > +static inline void uclamp_cpu_update(struct rq *rq, unsigned int clamp_id)
> > +{
> > +        unsigned int group_id;
> > +        int max_value = 0;
> > +
> > +        for (group_id = 0; group_id < UCLAMP_GROUPS; ++group_id) {
> > +                if (!rq->uclamp.group[clamp_id][group_id].tasks)
> > +                        continue;
> > +                /* Both min and max clamps are MAX aggregated */
> > +                if (max_value < rq->uclamp.group[clamp_id][group_id].value)
> > +                        max_value = rq->uclamp.group[clamp_id][group_id].value;
>
>         max_value = max(max_value, rq->uclamp.group[clamp_id][group_id].value);

Right, I've got used to this pattern to avoid write instructions. I
guess that here, since it's just a function-local variable, we don't
really care much...
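For reference, a standalone sketch of the two equivalent forms for a
function-local accumulator (not the kernel code: aggregate_max(), the
values array and the max() macro below are local stand-ins, not the
kernel's):

/*
 * Standalone sketch of the MAX aggregation discussed above; the values
 * array and the max() macro are local stand-ins, not kernel code.
 */
#include <stdio.h>

#define max(a, b) ((a) > (b) ? (a) : (b))

static unsigned int aggregate_max(const unsigned int *values, unsigned int n)
{
        unsigned int max_value = 0;
        unsigned int i;

        for (i = 0; i < n; i++) {
                /*
                 * Conditional form, stores only when the maximum grows:
                 *        if (max_value < values[i])
                 *                max_value = values[i];
                 * Unconditional form, as suggested in the review:
                 */
                max_value = max(max_value, values[i]);
        }

        return max_value;
}

int main(void)
{
        unsigned int v[] = { 200, 512, 341 };

        printf("%u\n", aggregate_max(v, 3));        /* prints 512 */
        return 0;
}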
> > +                if (max_value >= SCHED_CAPACITY_SCALE)
> > +                        break;
> > +        }
> > +        rq->uclamp.value[clamp_id] = max_value;
> > +}
> > +
> > +/**
> > + * uclamp_cpu_get_id(): increase reference count for a clamp group on a CPU
> > + * @p: the task being enqueued on a CPU
> > + * @rq: the CPU's rq where the clamp group has to be reference counted
> > + * @clamp_id: the clamp index to update
> > + *
> > + * Once a task is enqueued on a CPU's rq, the clamp group currently defined by
> > + * the task's uclamp::group_id is reference counted on that CPU.
> > + */
> > +static inline void uclamp_cpu_get_id(struct task_struct *p, struct rq *rq,
> > +                                     unsigned int clamp_id)
> > +{
> > +        unsigned int group_id;
> > +
> > +        if (unlikely(!p->uclamp[clamp_id].mapped))
> > +                return;
> > +
> > +        group_id = p->uclamp[clamp_id].group_id;
> > +        p->uclamp[clamp_id].active = true;
> > +
> > +        rq->uclamp.group[clamp_id][group_id].tasks += 1;
>
>                 ++
>
> > +
> > +        if (rq->uclamp.value[clamp_id] < p->uclamp[clamp_id].value)
> > +                rq->uclamp.value[clamp_id] = p->uclamp[clamp_id].value;
>
>         rq->uclamp.value[clamp_id] = max(rq->uclamp.value[clamp_id],
>                                          p->uclamp[clamp_id].value);

In this case instead, since we are updating a variable visible from
other CPUs, shouldn't we prefer to avoid the assignment when it's not
required? Is the compiler smart enough to optimize the code above?
... will check.
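To make the question concrete, here is a minimal sketch (not the patch
code; track_max_conditional() and track_max_unconditional() are
hypothetical helpers) of the two patterns for a location other CPUs may
read; whether the compiler can turn the second into a conditional store
is exactly the open point above:

/*
 * Minimal sketch, not the patch code: two ways to track a maximum in a
 * location that other CPUs may read. The first dirties the cache line
 * only when the tracked maximum actually grows; the second always
 * performs the store, unless the compiler proves it can be skipped.
 */
static inline void track_max_conditional(unsigned int *shared,
                                         unsigned int value)
{
        if (*shared < value)
                *shared = value;
}

static inline void track_max_unconditional(unsigned int *shared,
                                           unsigned int value)
{
        *shared = (*shared > value) ? *shared : value;
}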
> > +}
> > +
> > +/**
> > + * uclamp_cpu_put_id(): decrease reference count for a clamp group on a CPU
> > + * @p: the task being dequeued from a CPU
> > + * @rq: the CPU's rq from where the clamp group has to be released
> > + * @clamp_id: the clamp index to update
> > + *
> > + * When a task is dequeued from a CPU's rq, the CPU's clamp group reference
> > + * counted by the task is released.
> > + * If this was the last task reference counting the current max clamp group,
> > + * then the CPU clamping is updated to find the new max for the specified
> > + * clamp index.
> > + */
> > +static inline void uclamp_cpu_put_id(struct task_struct *p, struct rq *rq,
> > +                                     unsigned int clamp_id)
> > +{
> > +        unsigned int clamp_value;
> > +        unsigned int group_id;
> > +
> > +        if (unlikely(!p->uclamp[clamp_id].mapped))
> > +                return;
> > +
> > +        group_id = p->uclamp[clamp_id].group_id;
> > +        p->uclamp[clamp_id].active = false;
> > +
>
>         SCHED_WARN_ON(!rq->uclamp.group[clamp_id][group_id].tasks);
>
> > +        if (likely(rq->uclamp.group[clamp_id][group_id].tasks))
> > +                rq->uclamp.group[clamp_id][group_id].tasks -= 1;
>
>                 --
>
> > +#ifdef CONFIG_SCHED_DEBUG
> > +        else {
> > +                WARN(1, "invalid CPU[%d] clamp group [%u:%u] refcount\n",
> > +                     cpu_of(rq), clamp_id, group_id);
> > +        }
> > +#endif
>
> > +
> > +        if (likely(rq->uclamp.group[clamp_id][group_id].tasks))
> > +                return;
> > +
> > +        clamp_value = rq->uclamp.group[clamp_id][group_id].value;
> > +#ifdef CONFIG_SCHED_DEBUG
> > +        if (unlikely(clamp_value > rq->uclamp.value[clamp_id])) {
> > +                WARN(1, "invalid CPU[%d] clamp group [%u:%u] value\n",
> > +                     cpu_of(rq), clamp_id, group_id);
> > +        }
> > +#endif
>
>         SCHED_WARN_ON(clamp_value > rq->uclamp.value[clamp_id]);
>
> > +        if (clamp_value >= rq->uclamp.value[clamp_id])
> > +                uclamp_cpu_update(rq, clamp_id);
> > +}
>
> > @@ -866,6 +1020,28 @@ static void uclamp_group_get(struct uclamp_se *uc_se, unsigned int clamp_id,
> >          if (res != uc_map_old.data)
> >                  goto retry;
> >
> > +        /* Ensure each CPU tracks the correct value for this clamp group */
> > +        if (likely(uc_map_new.se_count > 1))
> > +                goto done;
> > +        for_each_possible_cpu(cpu) {
>
> yuck yuck yuck..

Why!? When a clamp group is released, i.e. no more SEs refcount it, that
group can be mapped in the future to a different clamp value. When that
happens, a different clamp value is assigned to the clamp group and we
need to update the value tracked in the CPU-side data structures, since
that is the value actually used to figure out the min/max clamps at
enqueue/dequeue time.

However, since here we are in the slow path, this should not be an
issue, should it?

> > +                struct uclamp_cpu *uc_cpu = &cpu_rq(cpu)->uclamp;
> > +
> > +                /* Refcounting is expected to be always 0 for free groups */
> > +                if (unlikely(uc_cpu->group[clamp_id][group_id].tasks)) {
> > +                        uc_cpu->group[clamp_id][group_id].tasks = 0;
> > +#ifdef CONFIG_SCHED_DEBUG
> > +                        WARN(1, "invalid CPU[%d] clamp group [%u:%u] refcount\n",
> > +                             cpu, clamp_id, group_id);
> > +#endif
>
>                 SCHED_WARN_ON();
>
> > +                }
> > +
> > +                if (uc_cpu->group[clamp_id][group_id].value == clamp_value)
> > +                        continue;
> > +                uc_cpu->group[clamp_id][group_id].value = clamp_value;
> > +        }
> > +
> > +done:
> > +
> >          /* Update SE's clamp values and attach it to new clamp group */
> >          uc_se->value = clamp_value;
> >          uc_se->group_id = group_id;

--
#include <best/regards.h>

Patrick Bellasi