linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Christoph Lameter <cl@linux.com>, Marcelo Tosatti <mtosatti@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Alex Belits <abelits@marvell.com>, Phil Auld <pauld@redhat.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH] mm: introduce sysctl file to flush per-cpu vmstat statistics
Date: Wed, 02 Dec 2020 16:57:31 +0100	[thread overview]
Message-ID: <87h7p4dwus.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <alpine.DEB.2.22.394.2011300927120.337729@www.lameter.com>

On Mon, Nov 30 2020 at 09:31, Christoph Lameter wrote:
> On Fri, 27 Nov 2020, Marcelo Tosatti wrote:
>
>> Decided to switch to prctl interface, and then it starts
>> to become similar to "task mode isolation" patchset API.
>
> Right I think that was a good approach.

prctl() is the right thing to do.

>> In addition to quiescing pending activities on the CPU, it would
>> also be useful to assign a per-task attribute (which is then assigned
>> to a per-CPU attribute), indicating whether that CPU is running
>> an isolated task or not.
>
> Sounds good but what would this do? Give a warning like the isolation
> patchset?

This all needs a lot more thought about the overall picture. We already
have too many knobs and ad hoc hooks which fiddle with isolation.

The current CPU isolation is a best effort approach and I agree that for
more strict isolation modes we need to be able to enforce that and hunt
down offenders and think about them one by one.

>> To be called before real time loop, one would have:

Can we please agree in the first place, that "real time" is absolutely
the wrong term here?

It's about running undisturbed CPU bound computations whatever nature
they are. It does not matter whether that loop does busy polling ala
DPDK, whether it runs a huge math computation on a data set or
whatever people come up with.

>> 	prctl(PR_SET_TASK_ISOLATION, ISOLATION_ENABLE) [1]
>> 	real time loop
>> 	prctl(PR_SET_TASK_ISOLATION, ISOLATION_DISABLE)
>>
>> (with the attribute also being cleared on task exit).
>>
>> The general description would be:
>>
>> "Set task isolated mode for a given task, returning an error
>> if the task is not pinned to a single CPU.

Plus returning an error if the task has no permissions to request
this. This should not be an unprivileged prctl ever.

>> In this mode, the kernel will avoid interruptions to isolated
>> CPUs when possible."
>>
>> Any objections against such an interface ?
>
> Maybe do both like in the isolation patchset?

We really want to define the scopes first. And here you go:

> Often code can tolerate a few interruptions (in some code branches
> regular syscalls may be needed) but one wants the thread to be
> as quiet as possible.

So you say some code can tolerate a few interrupts, then comes Alex and
says 'no disturbance' at all.

The point is that all of this shares the mechanisms to quiesce certain
parts of the kernel so this wants to build common infrastructure and the
prctl(ISOLATION, MODE) mode argument defines the scope of isolation
which the task asks for and the infrastructure decides whether it can be
granted and if so orchestrates the operation and provides a common
infrastructure for instrumentation, violation monitoring etc.

We really need to stop to look at particular workloads and defining
adhoc solutions tailored to their particular itch if we don't want to
end up with an uncoordinated and unmaintainable zoo of interfaces, hooks
and knobs.

Just looking at the problem at hand as an example. NOHZ already issues
quiet_vmstat(), but it does not cancel already scheduled work. Now
Marcelo wants a new mechanism which is supposed to cancel the work and
then Alex want's to prevent it from being rescheduled. If that's not
properly coordinated this goes down the drain very fast.

So can we please come up with a central place to handle this prctl()
with a future proof argument list so the various isolation needs can be
expressed as required?

That allows Marcelo to start tackling the vmstat side and Alex can
utilize that and build the other parts into it piece by piece.

Thanks,

        tglx


  parent reply	other threads:[~2020-12-02 15:57 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-17 16:28 Marcelo Tosatti
2020-11-17 18:03 ` Matthew Wilcox
2020-11-17 19:06   ` Christopher Lameter
2020-11-17 19:09     ` Matthew Wilcox
2020-11-20 18:04       ` Christopher Lameter
2020-11-17 20:23     ` Marcelo Tosatti
2020-11-20 18:02       ` Marcelo Tosatti
2020-11-20 18:20       ` Christopher Lameter
2020-11-23 18:02         ` Marcelo Tosatti
2020-11-24 17:12         ` Vlastimil Babka
2020-11-24 19:52           ` Marcelo Tosatti
2020-11-27 15:48         ` Marcelo Tosatti
2020-11-28  3:49           ` [EXT] " Alex Belits
2020-11-30 18:18             ` Marcelo Tosatti
2020-11-30 18:29               ` Marcelo Tosatti
2020-12-03 22:47                 ` Alex Belits
2020-12-03 22:21               ` Alex Belits
2020-11-30  9:31           ` Christoph Lameter
2020-12-02 12:43             ` Marcelo Tosatti
2020-12-02 15:57             ` Thomas Gleixner [this message]
2020-12-02 17:43               ` Christoph Lameter
2020-12-03  3:17                 ` Thomas Gleixner
2020-12-07  8:08                   ` Christoph Lameter
2020-12-07 16:09                     ` Thomas Gleixner
2020-12-07 19:01                       ` Thomas Gleixner
2020-12-02 18:38               ` Marcelo Tosatti
2020-12-04  0:20                 ` Frederic Weisbecker
2020-12-04 13:31                   ` Marcelo Tosatti
2020-12-04  1:43               ` [EXT] " Alex Belits
2021-01-13 12:15                 ` [RFC] tentative prctl task isolation interface Marcelo Tosatti
2021-01-14  9:22                   ` Christoph Lameter
2021-01-14 19:34                     ` Marcelo Tosatti
2021-01-15 13:24                       ` Christoph Lameter
2021-01-15 18:35                         ` Alex Belits
2021-01-21 15:51                           ` Marcelo Tosatti
2021-01-21 16:20                             ` Marcelo Tosatti
2021-01-22 13:05                               ` Marcelo Tosatti
2021-02-01 10:48                             ` Christoph Lameter
2021-02-01 12:47                               ` Alex Belits
2021-02-01 18:20                               ` Marcelo Tosatti
2021-01-18 15:18                         ` Marcelo Tosatti
2020-11-24  5:02 ` [mm] e655d17ffa: BUG:using_smp_processor_id()in_preemptible kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h7p4dwus.fsf@nanos.tec.linutronix.de \
    --to=tglx@linutronix.de \
    --cc=abelits@marvell.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=frederic@kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mtosatti@redhat.com \
    --cc=pauld@redhat.com \
    --cc=peterz@infradead.org \
    --cc=willy@infradead.org \
    --subject='Re: [PATCH] mm: introduce sysctl file to flush per-cpu vmstat statistics' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
on how to clone and mirror all data and code used for this inbox