Date: Tue, 16 Feb 2021 18:29:25 +0000
From: Qais Yousef
To: Alexey Klimov
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 peterz@infradead.org, yury.norov@gmail.com, daniel.m.jordan@oracle.com,
 tglx@linutronix.de, jobaker@redhat.com, audralmitchel@gmail.com,
 arnd@arndb.de, gregkh@linuxfoundation.org, rafael@kernel.org,
 tj@kernel.org, hannes@cmpxchg.org, klimov.linux@gmail.com
Subject: Re: [PATCH v2] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining
Message-ID: <20210216182925.v2fe3foe6dgmiaj6@e107158-lin.cambridge.arm.com>
References:
 <20210212003032.2037750-1-aklimov@redhat.com>
In-Reply-To: <20210212003032.2037750-1-aklimov@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/12/21 00:30, Alexey Klimov wrote:
> When a CPU offlined and onlined via device_offline() and device_online()
> the userspace gets uevent notification. If, after receiving "online" uevent,
> userspace executes sched_setaffinity() on some task trying to move it
> to a recently onlined CPU, then it often fails with -EINVAL. Userspace needs
> to wait around 5..30 ms before sched_setaffinity() will succeed for recently
> onlined CPU after receiving uevent.
>
> If in_mask argument for sched_setaffinity() has only recently onlined CPU,
> it often fails with such flow:
>
>   sched_setaffinity()
>     cpuset_cpus_allowed()
>       guarantee_online_cpus()   <-- cs->effective_cpus mask does not
>                                     contain recently onlined cpu
>     cpumask_and()               <-- final new_mask is empty
>     __set_cpus_allowed_ptr()
>       cpumask_any_and_distribute() <-- returns dest_cpu equal to nr_cpu_ids
>       returns -EINVAL
>
> Cpusets used in guarantee_online_cpus() are updated using workqueue from
> cpuset_update_active_cpus() which in its turn is called from cpu hotplug callback
> sched_cpu_activate() hence it may not be observable by sched_setaffinity() if
> it is called immediately after uevent.

nit: newline

> Out of line uevent can be avoided if we will ensure that cpuset_hotplug_work
> has run to completion using cpuset_wait_for_hotplug() after onlining the
> cpu in cpu_device_up() and in cpuhp_smt_enable().
>
> Co-analyzed-by: Joshua Baker
> Signed-off-by: Alexey Klimov
> ---

This looks good to me.

Reviewed-by: Qais Yousef

Thanks

--
Qais Yousef