From: Will Deacon
To: linux-arm-kernel@lists.infradead.org
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
    Will Deacon, Catalin Marinas, Marc Zyngier, Greg Kroah-Hartman,
    Peter Zijlstra, Morten Rasmussen, Qais Yousef, Suren Baghdasaryan,
    Quentin Perret, Tejun Heo, Li Zefan, Johannes Weiner, Ingo Molnar,
    Juri Lelli, Vincent Guittot, kernel-team@android.com
Subject: [PATCH v5 10/15] sched: Introduce force_compatible_cpus_allowed_ptr() to limit CPU affinity
Date: Tue, 8 Dec 2020 13:28:30 +0000
Message-Id: <20201208132835.6151-11-will@kernel.org>
In-Reply-To: <20201208132835.6151-1-will@kernel.org>
References: <20201208132835.6151-1-will@kernel.org>
X-Mailer: git-send-email 2.20.1

Asymmetric systems may not offer the same level of userspace ISA
support across all CPUs, meaning that some applications cannot be
executed by some CPUs. As a concrete example, upcoming arm64 big.LITTLE
designs do not feature support for 32-bit applications on both
clusters.

Although userspace can carefully manage the affinity masks for such
tasks, one place where it is particularly problematic is execve()
because the CPU on which the execve() is occurring may be incompatible
with the new application image. In such a situation, it is desirable to
restrict the affinity mask of the task and ensure that the new image is
entered on a compatible CPU. From userspace's point of view, this looks
the same as if the incompatible CPUs have been hotplugged off in the
task's affinity mask.

In preparation for restricting the affinity mask for compat tasks on
arm64 systems without uniform support for 32-bit applications,
introduce force_compatible_cpus_allowed_ptr(), which restricts the
affinity mask for a task to contain only compatible CPUs.
Reviewed-by: Quentin Perret
Signed-off-by: Will Deacon
---
 include/linux/sched.h |   1 +
 kernel/sched/core.c   | 100 +++++++++++++++++++++++++++++++++++-------
 2 files changed, 86 insertions(+), 15 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 76cd21fa5501..e42dd0fb85c5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1653,6 +1653,7 @@ extern int task_can_attach(struct task_struct *p, const struct cpumask *cs_cpus_
 #ifdef CONFIG_SMP
 extern void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask);
 extern int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask);
+extern void force_compatible_cpus_allowed_ptr(struct task_struct *p);
 #else
 static inline void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
 {
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 92ac3e53f50a..1cfc94be18a9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1863,25 +1863,19 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
 }
 
 /*
- * Change a given task's CPU affinity. Migrate the thread to a
- * proper CPU and schedule it away if the CPU it's executing on
- * is removed from the allowed bitmask.
- *
- * NOTE: the caller must have a valid reference to the task, the
- * task must not exit() & deallocate itself prematurely. The
- * call is not atomic; no spinlocks may be held.
+ * Called with both p->pi_lock and rq->lock held; drops both before returning.
  */
-static int __set_cpus_allowed_ptr(struct task_struct *p,
-                                  const struct cpumask *new_mask, bool check)
+static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
+                                         const struct cpumask *new_mask,
+                                         bool check,
+                                         struct rq *rq,
+                                         struct rq_flags *rf)
 {
         const struct cpumask *cpu_valid_mask = cpu_active_mask;
         const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
         unsigned int dest_cpu;
-        struct rq_flags rf;
-        struct rq *rq;
         int ret = 0;
 
-        rq = task_rq_lock(p, &rf);
         update_rq_clock(rq);
 
         if (p->flags & PF_KTHREAD) {
@@ -1936,7 +1930,7 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
         if (task_running(rq, p) || p->state == TASK_WAKING) {
                 struct migration_arg arg = { p, dest_cpu };
                 /* Need help from migration thread: drop lock and wait. */
-                task_rq_unlock(rq, p, &rf);
+                task_rq_unlock(rq, p, rf);
                 stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
                 return 0;
         } else if (task_on_rq_queued(p)) {
@@ -1944,20 +1938,96 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
                  * OK, since we're going to drop the lock immediately
                  * afterwards anyway.
                  */
-                rq = move_queued_task(rq, &rf, p, dest_cpu);
+                rq = move_queued_task(rq, rf, p, dest_cpu);
         }
 out:
-        task_rq_unlock(rq, p, &rf);
+        task_rq_unlock(rq, p, rf);
 
         return ret;
 }
 
+/*
+ * Change a given task's CPU affinity. Migrate the thread to a
+ * proper CPU and schedule it away if the CPU it's executing on
+ * is removed from the allowed bitmask.
+ *
+ * NOTE: the caller must have a valid reference to the task, the
+ * task must not exit() & deallocate itself prematurely. The
+ * call is not atomic; no spinlocks may be held.
+ */
+static int __set_cpus_allowed_ptr(struct task_struct *p,
+                                  const struct cpumask *new_mask, bool check)
+{
+        struct rq_flags rf;
+        struct rq *rq;
+
+        rq = task_rq_lock(p, &rf);
+        return __set_cpus_allowed_ptr_locked(p, new_mask, check, rq, &rf);
+}
+
 int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
 {
         return __set_cpus_allowed_ptr(p, new_mask, false);
 }
 EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
 
+/*
+ * Change a given task's CPU affinity to the intersection of its current
+ * affinity mask and @subset_mask, writing the resulting mask to @new_mask.
+ * If the resulting mask is empty, leave the affinity unchanged and return
+ * -EINVAL.
+ */
+static int restrict_cpus_allowed_ptr(struct task_struct *p,
+                                     struct cpumask *new_mask,
+                                     const struct cpumask *subset_mask)
+{
+        struct rq_flags rf;
+        struct rq *rq;
+
+        rq = task_rq_lock(p, &rf);
+        if (!cpumask_and(new_mask, &p->cpus_mask, subset_mask)) {
+                task_rq_unlock(rq, p, &rf);
+                return -EINVAL;
+        }
+
+        return __set_cpus_allowed_ptr_locked(p, new_mask, false, rq, &rf);
+}
+
+/*
+ * Restrict a given task's CPU affinity so that it is a subset of
+ * task_cpu_possible_mask(). If the resulting mask is empty, we warn and
+ * walk up the cpuset hierarchy until we find a suitable mask.
+ */
+void force_compatible_cpus_allowed_ptr(struct task_struct *p)
+{
+        cpumask_var_t new_mask;
+        const struct cpumask *override_mask = task_cpu_possible_mask(p);
+
+        if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
+                goto out_set_mask;
+
+        if (!restrict_cpus_allowed_ptr(p, new_mask, override_mask))
+                goto out_free_mask;
+
+        /*
+         * We failed to find a valid subset of the affinity mask for the
+         * task, so override it based on its cpuset hierarchy.
+         */
+        cpuset_cpus_allowed(p, new_mask);
+        override_mask = new_mask;
+
+out_set_mask:
+        if (printk_ratelimit()) {
+                printk_deferred("Overriding affinity for process %d (%s) to CPUs %*pbl\n",
+                                task_pid_nr(p), p->comm,
+                                cpumask_pr_args(override_mask));
+        }
+
+        set_cpus_allowed_ptr(p, override_mask);
+out_free_mask:
+        free_cpumask_var(new_mask);
+}
+
 void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
 {
 #ifdef CONFIG_SCHED_DEBUG
-- 
2.29.2.576.ga3fc446d84-goog
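
To make the intended use concrete, below is a minimal sketch of one
plausible call site. It is illustrative only and not part of this
patch: the arm64 wiring arrives later in the series, and the use of
arch_setup_new_exec() as the hook is an assumption here, although both
arch_setup_new_exec() and is_compat_task() are existing kernel
interfaces.

/*
 * Illustrative sketch (assumed call site, NOT part of this patch): an
 * architecture could restrict a 32-bit task's affinity at execve()
 * time so that the new image resumes on a CPU able to run it.
 */
void arch_setup_new_exec(void)
{
        if (is_compat_task())
                force_compatible_cpus_allowed_ptr(current);
}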
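
From userspace, the override is observable as an ordinary shrink of
the affinity mask, exactly as if the incompatible CPUs had been
hotplugged off. A hypothetical test program, assuming it is built and
exec'd as a 32-bit binary on such an asymmetric system:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/*
 * Print the number of CPUs in this task's affinity mask. Exec'd as a
 * 32-bit image on an asymmetric system, it should report only the
 * 32-bit-capable CPUs once the affinity override has taken effect.
 */
int main(void)
{
        cpu_set_t set;

        if (sched_getaffinity(0, sizeof(set), &set) != 0)
                return 1;

        printf("%d CPUs in affinity mask\n", CPU_COUNT(&set));
        return 0;
}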