From: Suren Baghdasaryan
Date: Sun, 27 Dec 2020 20:29:13 -0800
Subject: Re: [PATCH v5 10/15] sched: Introduce force_compatible_cpus_allowed_ptr() to limit CPU affinity
In-Reply-To: <20201208132835.6151-11-will@kernel.org>
References: <20201208132835.6151-1-will@kernel.org> <20201208132835.6151-11-will@kernel.org>
To: Will Deacon
Cc: linux-arm-kernel@lists.infradead.org, linux-arch@vger.kernel.org, LKML,
	Catalin Marinas, Marc Zyngier, Greg Kroah-Hartman, Peter Zijlstra,
	Morten Rasmussen, Qais Yousef, Quentin Perret, Tejun Heo, Li Zefan,
	Johannes Weiner, Ingo Molnar, Juri Lelli, Vincent Guittot, kernel-team

Just a couple minor nits.

On Tue, Dec 8, 2020 at 5:29 AM Will Deacon wrote:
>
> Asymmetric systems may not offer the same level of userspace ISA support
> across all CPUs, meaning that some applications cannot be executed by
> some CPUs. As a concrete example, upcoming arm64 big.LITTLE designs do
> not feature support for 32-bit applications on both clusters.
>
> Although userspace can carefully manage the affinity masks for such
> tasks, one place where it is particularly problematic is execve()
> because the CPU on which the execve() is occurring may be incompatible
> with the new application image. In such a situation, it is desirable to
> restrict the affinity mask of the task and ensure that the new image is
> entered on a compatible CPU. From userspace's point of view, this looks
> the same as if the incompatible CPUs have been hotplugged off in the
> task's affinity mask.
>
> In preparation for restricting the affinity mask for compat tasks on
> arm64 systems without uniform support for 32-bit applications, introduce
> force_compatible_cpus_allowed_ptr(), which restricts the affinity mask
> for a task to contain only compatible CPUs.
>
> Reviewed-by: Quentin Perret
> Signed-off-by: Will Deacon
> ---
>  include/linux/sched.h |   1 +
>  kernel/sched/core.c   | 100 +++++++++++++++++++++++++++++++++++-------
>  2 files changed, 86 insertions(+), 15 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 76cd21fa5501..e42dd0fb85c5 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1653,6 +1653,7 @@ extern int task_can_attach(struct task_struct *p, const struct cpumask *cs_cpus_
>  #ifdef CONFIG_SMP
>  extern void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask);
>  extern int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask);
> +extern void force_compatible_cpus_allowed_ptr(struct task_struct *p);
>  #else
>  static inline void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
>  {
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 92ac3e53f50a..1cfc94be18a9 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1863,25 +1863,19 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
>  }
>
>  /*
> - * Change a given task's CPU affinity. Migrate the thread to a
> - * proper CPU and schedule it away if the CPU it's executing on
> - * is removed from the allowed bitmask.
> - *
> - * NOTE: the caller must have a valid reference to the task, the
> - * task must not exit() & deallocate itself prematurely. The
> - * call is not atomic; no spinlocks may be held.
> + * Called with both p->pi_lock and rq->lock held; drops both before returning.

Maybe annotate with __releases()?
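Something like the below is what I had in mind; just a sketch with the body
elided, reusing the annotations task_rq_unlock() already carries, so sparse's
context tracking can check that both locks are dropped on every return path:

	static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
						 const struct cpumask *new_mask,
						 bool check,
						 struct rq *rq,
						 struct rq_flags *rf)
		__releases(rq->lock)
		__releases(p->pi_lock)
	{
		...
	}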
>  */
> -static int __set_cpus_allowed_ptr(struct task_struct *p,
> -				  const struct cpumask *new_mask, bool check)
> +static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
> +					 const struct cpumask *new_mask,
> +					 bool check,
> +					 struct rq *rq,
> +					 struct rq_flags *rf)
>  {
>  	const struct cpumask *cpu_valid_mask = cpu_active_mask;
>  	const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
>  	unsigned int dest_cpu;
> -	struct rq_flags rf;
> -	struct rq *rq;
>  	int ret = 0;
>
> -	rq = task_rq_lock(p, &rf);
>  	update_rq_clock(rq);
>
>  	if (p->flags & PF_KTHREAD) {
> @@ -1936,7 +1930,7 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>  	if (task_running(rq, p) || p->state == TASK_WAKING) {
>  		struct migration_arg arg = { p, dest_cpu };
>  		/* Need help from migration thread: drop lock and wait. */
> -		task_rq_unlock(rq, p, &rf);
> +		task_rq_unlock(rq, p, rf);
>  		stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
>  		return 0;
>  	} else if (task_on_rq_queued(p)) {
> @@ -1944,20 +1938,96 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>  		 * OK, since we're going to drop the lock immediately
>  		 * afterwards anyway.
>  		 */
> -		rq = move_queued_task(rq, &rf, p, dest_cpu);
> +		rq = move_queued_task(rq, rf, p, dest_cpu);
>  	}
>  out:
> -	task_rq_unlock(rq, p, &rf);
> +	task_rq_unlock(rq, p, rf);
>
>  	return ret;
>  }
>
> +/*
> + * Change a given task's CPU affinity. Migrate the thread to a
> + * proper CPU and schedule it away if the CPU it's executing on
> + * is removed from the allowed bitmask.
> + *
> + * NOTE: the caller must have a valid reference to the task, the
> + * task must not exit() & deallocate itself prematurely. The
> + * call is not atomic; no spinlocks may be held.
> + */
> +static int __set_cpus_allowed_ptr(struct task_struct *p,
> +				  const struct cpumask *new_mask, bool check)
> +{
> +	struct rq_flags rf;
> +	struct rq *rq;
> +
> +	rq = task_rq_lock(p, &rf);
> +	return __set_cpus_allowed_ptr_locked(p, new_mask, check, rq, &rf);
> +}
> +
>  int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
>  {
>  	return __set_cpus_allowed_ptr(p, new_mask, false);
>  }
>  EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
>
> +/*
> + * Change a given task's CPU affinity to the intersection of its current
> + * affinity mask and @subset_mask, writing the resulting mask to @new_mask.
> + * If the resulting mask is empty, leave the affinity unchanged and return
> + * -EINVAL.
> + */
> +static int restrict_cpus_allowed_ptr(struct task_struct *p,
> +				     struct cpumask *new_mask,
> +				     const struct cpumask *subset_mask)
> +{
> +	struct rq_flags rf;
> +	struct rq *rq;
> +
> +	rq = task_rq_lock(p, &rf);
> +	if (!cpumask_and(new_mask, &p->cpus_mask, subset_mask)) {
> +		task_rq_unlock(rq, p, &rf);
> +		return -EINVAL;
> +	}
> +
> +	return __set_cpus_allowed_ptr_locked(p, new_mask, false, rq, &rf);
> +}
> +
> +/*
> + * Restrict a given task's CPU affinity so that it is a subset of
> + * task_cpu_possible_mask(). If the resulting mask is empty, we warn and
> + * walk up the cpuset hierarchy until we find a suitable mask.

Technically there is also the case where the new_mask allocation fails, in
which case task_cpu_possible_mask() is forced instead. Maybe add a
clarifying comment at the "goto out_set_mask"?
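Something like this, for example (sketch only; the comment wording is just a
suggestion):

	if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
		goto out_set_mask; /* fall back to task_cpu_possible_mask(p) */

so it's obvious that the allocation-failure path deliberately forces the
wider mask instead of consulting the cpuset hierarchy.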
> + */
> +void force_compatible_cpus_allowed_ptr(struct task_struct *p)
> +{
> +	cpumask_var_t new_mask;
> +	const struct cpumask *override_mask = task_cpu_possible_mask(p);
> +
> +	if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
> +		goto out_set_mask;
> +
> +	if (!restrict_cpus_allowed_ptr(p, new_mask, override_mask))
> +		goto out_free_mask;
> +
> +	/*
> +	 * We failed to find a valid subset of the affinity mask for the
> +	 * task, so override it based on its cpuset hierarchy.
> +	 */
> +	cpuset_cpus_allowed(p, new_mask);
> +	override_mask = new_mask;
> +
> +out_set_mask:
> +	if (printk_ratelimit()) {
> +		printk_deferred("Overriding affinity for process %d (%s) to CPUs %*pbl\n",
> +				task_pid_nr(p), p->comm,
> +				cpumask_pr_args(override_mask));
> +	}
> +
> +	set_cpus_allowed_ptr(p, override_mask);
> +out_free_mask:
> +	free_cpumask_var(new_mask);
> +}
> +
>  void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
>  {
>  #ifdef CONFIG_SCHED_DEBUG
> --
> 2.29.2.576.ga3fc446d84-goog
>