From: Thomas Gleixner
To: Christoph Lameter, Marcelo Tosatti
Cc: Matthew Wilcox, linux-mm@kvack.org, Andrew Morton, Alex Belits, Phil Auld, Frederic Weisbecker, Peter Zijlstra
Subject: Re: [PATCH] mm: introduce sysctl file to flush per-cpu vmstat statistics
References: <20201117162805.GA274911@fuller.cnet> <20201117180356.GT29991@casper.infradead.org> <20201117202317.GA282679@fuller.cnet> <20201127154845.GA9100@fuller.cnet>
Date: Wed, 02 Dec 2020 16:57:31 +0100
Message-ID: <87h7p4dwus.fsf@nanos.tec.linutronix.de>

On Mon, Nov 30 2020 at 09:31, Christoph Lameter wrote:
> On Fri, 27 Nov 2020, Marcelo Tosatti wrote:
>
>> Decided to switch to prctl interface, and then it starts
>> to become similar to "task mode isolation" patchset API.
>
> Right I think that was a good approach. prctl() is the right thing to do.

>> In addition to quiescing pending activities on the CPU, it would
>> also be useful to assign a per-task attribute (which is then assigned
>> to a per-CPU attribute), indicating whether that CPU is running
>> an isolated task or not.
>
> Sounds good but what would this do? Give a warning like the isolation
> patchset?

This all needs a lot more thought about the overall picture. We already
have too many knobs and ad hoc hooks which fiddle with isolation.

The current CPU isolation is a best effort approach and I agree that for
more strict isolation modes we need to be able to enforce that and hunt
down offenders and think about them one by one.

>> To be called before real time loop, one would have:

Can we please agree in the first place, that "real time" is absolutely
the wrong term here?

It's about running undisturbed CPU bound computations whatever nature
they are. It does not matter whether that loop does busy polling ala
DPDK, whether it runs a huge math computation on a data set or whatever
people come up with.

>>     prctl(PR_SET_TASK_ISOLATION, ISOLATION_ENABLE)           [1]
>>             real time loop
>>     prctl(PR_SET_TASK_ISOLATION, ISOLATION_DISABLE)
>>
>> (with the attribute also being cleared on task exit).
>>
>> The general description would be:
>>
>> "Set task isolated mode for a given task, returning an error
>> if the task is not pinned to a single CPU.

Plus returning an error if the task has no permissions to request
this. This should not be an unprivileged prctl ever.

>> In this mode, the kernel will avoid interruptions to isolated
>> CPUs when possible."
>>
>> Any objections against such an interface ?
>
> Maybe do both like in the isolation patchset?

We really want to define the scopes first. And here you go:

> Often code can tolerate a few interruptions (in some code branches
> regular syscalls may be needed) but one wants the thread to be
> as quiet as possible.

So you say some code can tolerate a few interrupts, then comes Alex and
says 'no disturbance' at all.

The point is that all of this shares the mechanisms to quiesce certain
parts of the kernel, so this wants to build common infrastructure. The
prctl(ISOLATION, MODE) mode argument defines the scope of isolation
which the task asks for, and the infrastructure decides whether it can be
granted and, if so, orchestrates the operation and provides a common
infrastructure for instrumentation, violation monitoring etc.

We really need to stop looking at particular workloads and defining
ad hoc solutions tailored to their particular itch if we don't want to
end up with an uncoordinated and unmaintainable zoo of interfaces, hooks
and knobs.

Just looking at the problem at hand as an example. NOHZ already issues
quiet_vmstat(), but it does not cancel already scheduled work. Now
Marcelo wants a new mechanism which is supposed to cancel the work and
then Alex wants to prevent it from being rescheduled. If that's not
properly coordinated this goes down the drain very fast.

So can we please come up with a central place to handle this prctl()
with a future proof argument list so the various isolation needs can be
expressed as required?

That allows Marcelo to start tackling the vmstat side and Alex can
utilize that and build the other parts into it piece by piece.

Thanks,

        tglx
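
To make the request for a central entry point with a scope argument more
concrete, here is a minimal userspace model in C of the checks discussed
in this thread. It is only a sketch, not kernel code and not an existing
API: the names isolation_request(), struct task_model and the
ISOL_MODE_* bits are hypothetical, and the errno values chosen for each
refusal are assumptions. It models three rules from the discussion: the
request is refused for unprivileged tasks, refused unless the task is
pinned to a single CPU, and the mode argument is a set of scope bits
where unknown bits are rejected so they stay free for later use.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical isolation scopes; names and bit values are illustrative only. */
#define ISOL_MODE_QUIESCE_VMSTAT  (1u << 0)  /* quiesce per-CPU vmstat work */
#define ISOL_MODE_NO_DISTURBANCE  (1u << 1)  /* strict scope: no interruptions */
#define ISOL_MODE_KNOWN \
	(ISOL_MODE_QUIESCE_VMSTAT | ISOL_MODE_NO_DISTURBANCE)

/* Minimal stand-in for the task state the checks below need. */
struct task_model {
	bool privileged;              /* stands in for a permission check */
	unsigned int nr_allowed_cpus; /* number of CPUs in the affinity mask */
	unsigned int isolation;       /* scope bits currently granted */
};

/*
 * Single entry point for all isolation requests in this model.
 * Returns 0 if the requested scope is granted, or a negative errno
 * value (assumed here, not taken from any patch) if it is refused.
 */
int isolation_request(struct task_model *t, unsigned int mode)
{
	/* Reject unknown scope bits so they remain available later. */
	if (mode & ~ISOL_MODE_KNOWN)
		return -EINVAL;

	/* Never grant isolation to an unprivileged task. */
	if (!t->privileged)
		return -EPERM;

	/* Enabling any scope requires pinning to exactly one CPU. */
	if (mode != 0 && t->nr_allowed_cpus != 1)
		return -EINVAL;

	/* A mode of 0 clears isolation; otherwise record the granted scope. */
	t->isolation = mode;
	return 0;
}
```

In this model a task pinned to one CPU would call
isolation_request(&task, ISOL_MODE_QUIESCE_VMSTAT) before its loop and
isolation_request(&task, 0) afterwards. Because every scope goes through
the same function, instrumentation and violation monitoring would have
one place to hook into rather than one per knob.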