Date: Thu, 22 Oct 2020 12:16:24 +0200
From: Morten Rasmussen
To: Qais Yousef
Cc: linux-arch@vger.kernel.org, Marc Zyngier, "Peter Zijlstra (Intel)",
	Catalin Marinas, Linus Torvalds, James Morse, Greg Kroah-Hartman,
	Will Deacon, linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC PATCH v2 4/4] arm64: Export id_aar64fpr0 via sysfs
Message-ID: <20201022101624.GI8004@e123083-lin>
References: <20201021104611.2744565-1-qais.yousef@arm.com>
	<20201021104611.2744565-5-qais.yousef@arm.com>
	<63fead90e91e08a1b173792b06995765@kernel.org>
	<20201021121559.GB3976@gaia>
	<20201021133316.GF8004@e123083-lin>
	<20201021143153.7ef7n7gdd42l4rbc@e107158-lin>
In-Reply-To: <20201021143153.7ef7n7gdd42l4rbc@e107158-lin>
X-Mailing-List: linux-arch@vger.kernel.org

On Wed, Oct 21, 2020 at 03:31:53PM +0100, Qais Yousef wrote:
> On 10/21/20 15:33, Morten Rasmussen wrote:
> > On Wed, Oct 21, 2020 at 01:15:59PM +0100, Catalin Marinas wrote:
> > > one, though not as easy as automatic task placement by the scheduler (my
> > > first preference, followed by the id_* regs and the aarch32 mask, though
> > > not a strong preference for any).
> >
> > Automatic task placement by the scheduler would mean giving up the
> > requirement that the user-space affinity mask must always be honoured.
> > Is that on the table?
> >
> > Killing aarch32 tasks with an empty intersection between the user-space
> > mask and aarch32_mask is not really "automatic" and would require the
> > aarch32 capability to be exposed anyway.
>
> I just noticed this nasty corner case too.
>
>
> Documentation/admin-guide/cgroup-v1/cpusets.rst: Section 1.9
>
> "If such a task had been bound to some subset of its cpuset using the
> sched_setaffinity() call, the task will be allowed to run on any CPU allowed in
> its new cpuset, negating the effect of the prior sched_setaffinity() call."
>
> So user space must put the tasks into a valid cpuset to fix the problem. Or
> make the scheduler behave like the affinity is associated with a cpuset.
>
> Can user space put the task into the correct cpuset without a race? Clone3
> syscall learnt to specify a cgroup to attach to when forking. Should we do the
> same for execve()?

Putting a task in a cpuset overrides any affinity mask applied by a
previous cpuset or sched_setaffinity() call. I wouldn't call it a corner
case though.
Android user-space is exploiting it all the time on some devices through
the foreground, background, and top-app cgroups.

If a task calls fork(), the child task will belong to the same cgroup
automatically. If you execve(), you retain the previous affinity mask and
cgroup. So putting a parent task that is about to execve() an aarch32
binary into a cpuset with only aarch32-capable CPUs should be enough to
never have the task or any of its child tasks SIGKILLed. A few simple
experiments with fork() and execve() seem to confirm that.

I don't see any changes needed in the kernel. Changing cgroup through
clone could of course fail if user-space specifies an unsuitable cgroup.
Fixing that would be part of fixing the cpuset setup in user-space, which
is required anyway.

Morten
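
For anyone who wants to repeat the kind of fork()/execve() experiment
mentioned above, here is a minimal user-space sketch. It is not the test
that was actually run, and it only covers the affinity-mask half of the
claim (not the cgroup half, which needs write access to a cpuset
hierarchy). The helper name mask_seen_by_exec_child() is made up for
illustration. It narrows the caller's affinity mask, spawns a child that
execs a fresh interpreter, and reports the mask the child sees after the
exec. Linux only.

```python
import os
import subprocess
import sys


def mask_seen_by_exec_child(cpus):
    """Pin this process to `cpus`, spawn and exec a child, return the child's mask.

    Hypothetical helper for illustration only. The caller's original mask is
    restored before returning.
    """
    original = os.sched_getaffinity(0)
    os.sched_setaffinity(0, cpus)
    try:
        # The child is a new process that execs a fresh interpreter, so the
        # mask it prints is what survived process creation plus execve().
        out = subprocess.run(
            [sys.executable, "-c",
             "import os; print(' '.join(map(str, sorted(os.sched_getaffinity(0)))))"],
            check=True, capture_output=True, text=True,
        ).stdout
    finally:
        os.sched_setaffinity(0, original)
    return {int(c) for c in out.split()}


if __name__ == "__main__":
    # Pick one CPU that the current mask already allows.
    one_cpu = {min(os.sched_getaffinity(0))}
    print("requested:", one_cpu, "seen after exec:", mask_seen_by_exec_child(one_cpu))
```

If the child reports the same single-CPU mask that the parent set, the
mask was inherited at process creation and retained across execve(),
which is the behaviour the cpuset setup described above relies on.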