From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 27B4AC433FE for ; Wed, 19 Jan 2022 08:48:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1352533AbiASIsU (ORCPT ); Wed, 19 Jan 2022 03:48:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51026 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237729AbiASIsQ (ORCPT ); Wed, 19 Jan 2022 03:48:16 -0500 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE4A5C061574; Wed, 19 Jan 2022 00:48:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=id804dbsfqejPK5oRefHchyYSiupvM/8xrLRLyc6bqE=; b=DrCxgNBqn8qYFuoVPtXZH+3ZEQ v1EmQwU7CA2xtAsN/J1fhnoyuGpLjlug3F6wyfSoIiXsc5T33lhJBVuMYABLjCUS0w+uu+x23Ccj4 NiyO+h5WA2nHTlI9HSgyyKsAMFuSQT3Kr8z+cQ1nx6l3tfCgIWiAQCQD18WkZ9YWu+ECRDkdD2/Ls z7G8BfnFY8d5LtNwL4JPxhBXQqzVmVSlLILtVBpz/IqHLvpYauR5MrvjbmW3+8SoRasbN2XtgcNiq JoxYpX5uy0lwGVMSbeMd5pi8it9pA6NHyoFjZtyCL5ZRV0ms4wi3JAG3fpkzihFp1ZFthlqTWfpsu IypY5rdg==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1nA6cz-0020O1-T1; Wed, 19 Jan 2022 08:47:38 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 05513300140; Wed, 19 Jan 2022 09:47:33 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 1000) id CD1F8203C6487; Wed, 19 Jan 2022 09:47:33 +0100 (CET) Date: Wed, 19 Jan 2022 09:47:33 +0100 From: Peter Zijlstra To: Peter Oskolkov Cc: Peter Oskolkov , mingo@redhat.com, tglx@linutronix.de, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-api@vger.kernel.org, x86@kernel.org, pjt@google.com, avagin@google.com, jannh@google.com, tdelisle@uwaterloo.ca Subject: Re: [RFC][PATCH 3/3] sched: User Mode Concurency Groups Message-ID: References: <20211214204445.665580974@infradead.org> <20211214205358.701701555@infradead.org> <20211221171900.GA580323@dev-hv> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jan 18, 2022 at 10:19:21AM -0800, Peter Oskolkov wrote: > ============= worker-to-worker context switches > > One example: absl::Mutex (https://abseil.io/about/design/mutex) has > google-internal extensions that are "fiber aware". More specifically, > consider this situation: > > - worker W1 acqured the mutex and is doing its work > - worker W2 calls mutex::lock() > mutex::lock(), being aware of workers, understands that W2 is going to sleep; > so instead of just doing so, waking the server, and letting > the server figure out what to run in place of the sleeping worker, > mutex::lock() > calls into the userspace scheduler in the context of W2 running, and the > userspace scheduler then picks W3 to run and does W2->W3 context switch. > > The optimization above replaces W2->Server and Server->W3 context switches > with a single W2->W3 context switch, which is a material performance gain. Yes, I've also already reconsidered. Things like pipelines and other fixed order scheduling policies will greatly benefit from worker-to-worker switching. But I think all of them are explicit. That is, we can limit the ::next_tid usage to sys_umcg_wait() and never look at it for implicit blocks. > In addition, when W1 calls mutex::unlock(), the scheduling code determines > that W2 is waiting on the mutex, and thus calls W2::wake() from the context of > running W1 (you asked earlier why do we need "WAKE_ONLY"). This I'm not at all convinced on. That sounds like it will violate the 1:1 thing.