From: Suren Baghdasaryan <surenb@google.com> To: Patrick Bellasi <patrick.bellasi@arm.com> Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ingo Molnar <mingo@redhat.com>, Peter Zijlstra <peterz@infradead.org>, Tejun Heo <tj@kernel.org>, "Rafael J . Wysocki" <rafael.j.wysocki@intel.com>, Viresh Kumar <viresh.kumar@linaro.org>, Vincent Guittot <vincent.guittot@linaro.org>, Paul Turner <pjt@google.com>, Dietmar Eggemann <dietmar.eggemann@arm.com>, Morten Rasmussen <morten.rasmussen@arm.com>, Juri Lelli <juri.lelli@redhat.com>, Todd Kjos <tkjos@google.com>, Joel Fernandes <joelaf@google.com>, Steve Muckle <smuckle@google.com> Subject: Re: [PATCH v2 02/12] sched/core: uclamp: map TASK's clamp values into CPU's clamp groups Date: Fri, 20 Jul 2018 17:25:31 -0700 [thread overview] Message-ID: <CAJuCfpGvpX5U9rED+vX=wW624GLZo8f=-2kOLGrY1UJSEXRh6g@mail.gmail.com> (raw) In-Reply-To: <20180720151156.GA31421@e110439-lin> Hi Patrick, On Fri, Jul 20, 2018 at 8:11 AM, Patrick Bellasi <patrick.bellasi@arm.com> wrote: > Hi Suren, > thanks for the review, all good point... some more comments follow > inline. > > On 19-Jul 16:51, Suren Baghdasaryan wrote: >> On Mon, Jul 16, 2018 at 1:28 AM, Patrick Bellasi >> <patrick.bellasi@arm.com> wrote: > > [...] > >> > +/** >> > + * uclamp_group_available: checks if a clamp group is available >> > + * @clamp_id: the utilization clamp index (i.e. min or max clamp) >> > + * @group_id: the group index in the given clamp_id >> > + * >> > + * A clamp group is not free if there is at least one SE which is sing a clamp >> >> Did you mean to say "single clamp"? > > No, it's "...at least one SE which is USING a clamp value..." > >> > + * value mapped on the specified clamp_id. These SEs are reference counted by >> > + * the se_count of a uclamp_map entry. >> > + * >> > + * Return: true if there are no SE's mapped on the specified clamp >> > + * index and group >> > + */ >> > +static inline bool uclamp_group_available(int clamp_id, int group_id) >> > +{ >> > + struct uclamp_map *uc_map = &uclamp_maps[clamp_id][0]; >> > + >> > + return (uc_map[group_id].value == UCLAMP_NONE); >> >> The usage of UCLAMP_NONE is very confusing to me. It was not used at >> all in the patch where it was introduced [1/12], here it's used as a >> clamp value and in uclamp_group_find() it's used as group_id. Please >> clarify the usage. > > Yes, it's meant to represent a "clamp not valid" condition, whatever > it's a "clamp group" or a "clamp value"... perhaps the name can be > improved. > >> I also feel UCLAMP_NONE does not really belong to >> the uclamp_id enum because other elements there are indexes in >> uclamp_maps and this one is a special value. > > Right, it looks a bit misplaced, I agree. I think I tried to set it > using a #define but there was some issues I don't remember now... > Anyway, I'll give it another go... > > >> IMHO if both *group_id* >> and *value* need a special value (-1) to represent >> unused/uninitialized entry it would be better to use different >> constants. Maybe UCLAMP_VAL_NONE and UCLAMP_GROUP_NONE? > > Yes, maybe we can use a > > #define UCLAMP_NOT_VALID -1 > > and get rid the confusing enum entry. > > Will update it on v3. > Sounds good to me. >> > +} > > [...] > >> > +/** >> > + * uclamp_group_find: finds the group index of a utilization clamp group >> > + * @clamp_id: the utilization clamp index (i.e. min or max clamping) >> > + * @clamp_value: the utilization clamping value lookup for >> > + * >> > + * Verify if a group has been assigned to a certain clamp value and return >> > + * its index to be used for accounting. >> > + * >> > + * Since only a limited number of utilization clamp groups are allowed, if no >> > + * groups have been assigned for the specified value, a new group is assigned >> > + * if possible. Otherwise an error is returned, meaning that an additional clamp >> > + * value is not (currently) supported. >> > + */ >> > +static int >> > +uclamp_group_find(int clamp_id, unsigned int clamp_value) >> > +{ >> > + struct uclamp_map *uc_map = &uclamp_maps[clamp_id][0]; >> > + int free_group_id = UCLAMP_NONE; >> > + unsigned int group_id = 0; >> > + >> > + for ( ; group_id <= CONFIG_UCLAMP_GROUPS_COUNT; ++group_id) { >> > + /* Keep track of first free clamp group */ >> > + if (uclamp_group_available(clamp_id, group_id)) { >> > + if (free_group_id == UCLAMP_NONE) >> > + free_group_id = group_id; >> > + continue; >> > + } >> > + /* Return index of first group with same clamp value */ >> > + if (uc_map[group_id].value == clamp_value) >> > + return group_id; >> > + } >> > + /* Default to first free clamp group */ >> > + if (group_id > CONFIG_UCLAMP_GROUPS_COUNT) >> >> Is the condition above needed? I think it's always true if you got here. >> Also AFAIKT after the for loop you can just do: >> >> return (free_group_id != UCLAMP_NONE) ? free_group_id : -ENOSPC; > > Yes, you right... the code above can be simplified! > >> >> > + group_id = free_group_id; >> > + /* All clamp group already track different clamp values */ >> > + if (group_id == UCLAMP_NONE) >> > + return -ENOSPC; >> > + return group_id; >> > +} > > [...] > >> > +static inline void uclamp_group_put(int clamp_id, int group_id) >> > +{ >> > + struct uclamp_map *uc_map = &uclamp_maps[clamp_id][0]; >> > + unsigned long flags; >> > + >> > + /* Ignore SE's not yet attached */ >> > + if (group_id == UCLAMP_NONE) >> > + return; >> > + >> > + /* Remove SE from this clamp group */ >> > + raw_spin_lock_irqsave(&uc_map[group_id].se_lock, flags); >> > + uc_map[group_id].se_count -= 1; >> >> If uc_map[group_id].se_count was 0 before decrement you end up with >> se_count == -1 and no reset for the element. > > Well... this should never happen, otherwise the refcounting is not > working as expected. > > Maybe we can add (at least) a debug check and warning, something like: > > #ifdef SCHED_DEBUG > if (unlikely(uc_map[group_id].se_count == 0)) { > WARN(1, "invalid clamp group [%d:%d] refcount\n", > clamp_id, group_id); > uc_map[group_id].se_count = 1; > } > #endif > >> > + if (uc_map[group_id].se_count == 0) >> > + uclamp_group_reset(clamp_id, group_id); >> > + raw_spin_unlock_irqrestore(&uc_map[group_id].se_lock, flags); >> > +} >> > + > > [...] > >> > static inline int __setscheduler_uclamp(struct task_struct *p, >> > const struct sched_attr *attr) >> > { >> > + struct uclamp_se *uc_se; >> > + int retval = 0; >> > + >> > if (attr->sched_util_min > attr->sched_util_max) >> > return -EINVAL; >> > if (attr->sched_util_max > SCHED_CAPACITY_SCALE) >> > return -EINVAL; >> > >> > - p->uclamp[UCLAMP_MIN] = attr->sched_util_min; >> > - p->uclamp[UCLAMP_MAX] = attr->sched_util_max; >> > + mutex_lock(&uclamp_mutex); >> > + >> > + /* Update min utilization clamp */ >> > + uc_se = &p->uclamp[UCLAMP_MIN]; >> > + retval |= uclamp_group_get(p, UCLAMP_MIN, uc_se, >> > + attr->sched_util_min); >> > + >> > + /* Update max utilization clamp */ >> > + uc_se = &p->uclamp[UCLAMP_MAX]; >> > + retval |= uclamp_group_get(p, UCLAMP_MAX, uc_se, >> > + attr->sched_util_max); >> > + >> > + mutex_unlock(&uclamp_mutex); >> > + >> > + /* >> > + * If one of the two clamp values should fail, >> > + * let the userspace know. >> > + */ >> > + if (retval) >> > + return -ENOSPC; >> >> Maybe a minor issue but this failure is ambiguous. It might mean: >> 1. no clamp value was updated >> 2. UCLAMP_MIN was updated but UCLAMP_MAX was not >> 3. UCLAMP_MAX was updated but UCLAMP_MIN was not > > That's right, I put a bit of thought on that me too but at the end > I've been convinced that the possibility to use a single syscall to > set both clamps at the same time has some benefits for user-space. > > Maybe the current solution can be improved by supporting an (optional) > strict semantic with an in-kernel roll-back in case one of the two > uclamp_group_get should fail. > > The strict semantic with roll-back could be controller via an > additional flag, e.g. SCHED_FLAG_UTIL_CLAMP_STRICT. > > When the flag is set, either we are able to set both the attributes or > we roll-back. Otherwise, when the flag is not set, we keep the current > behavior. i.e. we set what we can and report an error to notify > userspace that one constraints was not enforced. > > The following snippet should implement this semantics: > > ---8<--- > > /* Uclamp flags */ > #define SCHED_FLAG_UTIL_CLAMP_STRICT 0x11 /* Roll-back on failure */ > #define SCHED_FLAG_UTIL_CLAMP_MIN 0x12 /* Update util_min */ > #define SCHED_FLAG_UTIL_CLAMP_MAX 0x14 /* Update util_max */ > #define SCHED_FLAG_UTIL_CLAMP ( \ > SCHED_FLAG_UTIL_CLAMP_MIN | SCHED_FLAG_UTIL_CLAMP_MAX) > Having ability to update only min or only max this way might be indeed very useful. Instead of rolling back on failure I would suggest to check both inputs first to make sure there won't be any error before updating. This would remove the need for SCHED_FLAG_UTIL_CLAMP_STRICT (which I think any user would want to set to 1 anyway). Looks like uclamp_group_get() can fail only if uclamp_group_find() fails to find a slot for uclamp_value or a free slot. So one way to do this search before update is to call uclamp_group_find() for both UCLAMP_MIN and UCLAMP_MAX beforehand and if they succeed then pass obtained next_group_ids into uclamp_group_get() to avoid doing the same search twice. This requires some refactoring of uclamp_group_get() but I think the end result would be a cleaner and more predictable solution. > static inline int __setscheduler_uclamp(struct task_struct *p, > const struct sched_attr *attr) > { > unsigned int uclamp_value_old = 0; > unsigned int uclamp_value; > struct uclamp_se *uc_se; > int retval = 0; > > if (attr->sched_util_min > attr->sched_util_max) > return -EINVAL; > if (attr->sched_util_max > 100) > return -EINVAL; > > mutex_lock(&uclamp_mutex); > > if (!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)) > goto set_util_max; > > uc_se = &p->uclamp[UCLAMP_MIN]; > uclamp_value = scale_from_percent(attr->sched_util_min); > if (uc_se->value == uclamp_value) > goto set_util_max; > > /* Update min utilization clamp */ > uclamp_value_old = uc_se->value; > retval |= uclamp_group_get(p, NULL, UCLAMP_MIN, uc_se, uclamp_value); > if (retval && > attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_STRICT) > goto exit_failure; > > set_util_max: > > if (!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)) > goto exit_done; > > uc_se = &p->uclamp[UCLAMP_MAX]; > uclamp_value = scale_from_percent(attr->sched_util_max); > if (uc_se->value == uclamp_value) > goto exit_done; > > /* Update max utilization clamp */ > if (uclamp_group_get(p, NULL, UCLAMP_MAX, > uc_se, uclamp_value)) > goto exit_rollback; > > exit_done: > mutex_unlock(&uclamp_mutex); > return retval; > > exit_rollback: > if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN && > attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_STRICT) { > uclamp_group_get(p, NULL, UCLAMP_MIN, > uc_se, uclamp_value_old); > } > exit_failure: > mutex_unlock(&uclamp_mutex); > > return -ENOSPC; > } > > ---8<--- > > > Could that work better? > > The code is maybe a bit more convoluted... but perhaps it can be > improved by encoding it in a loop. > > > -- > #include <best/regards.h> > > Patrick Bellasi
next prev parent reply other threads:[~2018-07-21 0:25 UTC|newest] Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-07-16 8:28 [PATCH v2 00/12] Add utilization clamping support Patrick Bellasi 2018-07-16 8:28 ` [PATCH v2 01/12] sched/core: uclamp: extend sched_setattr to support utilization clamping Patrick Bellasi 2018-07-17 17:50 ` Joel Fernandes 2018-07-18 8:42 ` Patrick Bellasi 2018-07-18 17:02 ` Joel Fernandes 2018-07-17 18:04 ` Joel Fernandes 2018-07-16 8:28 ` [PATCH v2 02/12] sched/core: uclamp: map TASK's clamp values into CPU's clamp groups Patrick Bellasi 2018-07-19 23:51 ` Suren Baghdasaryan 2018-07-20 15:11 ` Patrick Bellasi 2018-07-21 0:25 ` Suren Baghdasaryan [this message] 2018-07-23 13:36 ` Patrick Bellasi 2018-07-16 8:28 ` [PATCH v2 03/12] sched/core: uclamp: add CPU's clamp groups accounting Patrick Bellasi 2018-07-20 20:25 ` Suren Baghdasaryan 2018-07-16 8:28 ` [PATCH v2 04/12] sched/core: uclamp: update CPU's refcount on clamp changes Patrick Bellasi 2018-07-16 8:28 ` [PATCH v2 05/12] sched/cpufreq: uclamp: add utilization clamping for FAIR tasks Patrick Bellasi 2018-07-16 8:29 ` [PATCH v2 06/12] sched/cpufreq: uclamp: add utilization clamping for RT tasks Patrick Bellasi 2018-07-16 8:29 ` [PATCH v2 07/12] sched/core: uclamp: enforce last task UCLAMP_MAX Patrick Bellasi 2018-07-21 1:23 ` Suren Baghdasaryan 2018-07-23 15:02 ` Patrick Bellasi 2018-07-23 16:40 ` Suren Baghdasaryan 2018-07-16 8:29 ` [PATCH v2 08/12] sched/core: uclamp: extend cpu's cgroup controller Patrick Bellasi 2018-07-21 2:37 ` Suren Baghdasaryan 2018-07-21 3:16 ` Suren Baghdasaryan 2018-07-23 15:17 ` Patrick Bellasi 2018-07-23 15:30 ` Tejun Heo 2018-07-23 17:22 ` Patrick Bellasi 2018-07-24 13:29 ` Tejun Heo 2018-07-24 15:39 ` Patrick Bellasi 2018-07-27 0:39 ` Joel Fernandes 2018-07-27 8:09 ` Quentin Perret 2018-07-16 8:29 ` [PATCH v2 09/12] sched/core: uclamp: map TG's clamp values into CPU's clamp groups Patrick Bellasi 2018-07-16 8:29 ` [PATCH v2 10/12] sched/core: uclamp: use TG's clamps to restrict Task's clamps Patrick Bellasi 2018-07-22 3:05 ` Suren Baghdasaryan 2018-07-23 15:40 ` Patrick Bellasi 2018-07-23 17:11 ` Suren Baghdasaryan 2018-07-24 9:56 ` Patrick Bellasi 2018-07-24 15:28 ` Suren Baghdasaryan 2018-07-24 15:49 ` Patrick Bellasi 2018-07-16 8:29 ` [PATCH v2 11/12] sched/core: uclamp: update CPU's refcount on TG's clamp changes Patrick Bellasi 2018-07-22 3:17 ` Suren Baghdasaryan 2018-07-16 8:29 ` [PATCH v2 12/12] sched/core: uclamp: use percentage clamp values Patrick Bellasi 2018-07-22 4:04 ` Suren Baghdasaryan 2018-07-24 16:43 ` Patrick Bellasi 2018-07-24 17:11 ` Suren Baghdasaryan 2018-07-24 17:17 ` Patrick Bellasi 2018-07-17 13:03 ` [PATCH v2 00/12] Add utilization clamping support Joel Fernandes 2018-07-17 13:41 ` Patrick Bellasi
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to='CAJuCfpGvpX5U9rED+vX=wW624GLZo8f=-2kOLGrY1UJSEXRh6g@mail.gmail.com' \ --to=surenb@google.com \ --cc=dietmar.eggemann@arm.com \ --cc=joelaf@google.com \ --cc=juri.lelli@redhat.com \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-pm@vger.kernel.org \ --cc=mingo@redhat.com \ --cc=morten.rasmussen@arm.com \ --cc=patrick.bellasi@arm.com \ --cc=peterz@infradead.org \ --cc=pjt@google.com \ --cc=rafael.j.wysocki@intel.com \ --cc=smuckle@google.com \ --cc=tj@kernel.org \ --cc=tkjos@google.com \ --cc=vincent.guittot@linaro.org \ --cc=viresh.kumar@linaro.org \ --subject='Re: [PATCH v2 02/12] sched/core: uclamp: map TASK'\''s clamp values into CPU'\''s clamp groups' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).