All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: nsaenzju@redhat.com, linux-kernel@vger.kernel.org,
	Nitesh Lal <nilal@redhat.com>, Christoph Lameter <cl@linux.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Alex Belits <abelits@marvell.com>, Peter Xu <peterx@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [patch 1/4] add basic task isolation prctl interface
Date: Wed, 28 Jul 2021 06:37:07 -0300	[thread overview]
Message-ID: <20210728093707.GA3242@fuller.cnet> (raw)
In-Reply-To: <20210727234539.GH283787@lothringen>

On Wed, Jul 28, 2021 at 01:45:39AM +0200, Frederic Weisbecker wrote:
> On Tue, Jul 27, 2021 at 11:52:09AM -0300, Marcelo Tosatti wrote:
> > The meaning of isolated is specified as follows:
> > 
> > Isolation features
> > ==================
> > 
> > - prctl(PR_ISOL_GET, ISOL_SUP_FEATURES, 0, 0, 0) returns the supported
> > features as a return value.
> > 
> > - prctl(PR_ISOL_SET, ISOL_FEATURES, bitmask, 0, 0) enables the features in
> > the bitmask.
> > 
> > - prctl(PR_ISOL_GET, ISOL_FEATURES, 0, 0, 0) returns the currently
> > enabled features.
> 
> So what are the ISOL_FEATURES here? A mode that we enter such as flush
> vmstat _everytime_ we resume to userpace after (and including) this prctl() ?

ISOL_FEATURES is just the "command" type (which you can get and set).

The bitmask would include ISOL_F_QUIESCE_ON_URET, so:

- bitmask = ISOL_F_QUIESCE_ON_URET;
- prctl(PR_ISOL_SET, ISOL_FEATURES, bitmask, 0, 0) enables the features in
the bitmask.

- quiesce_bitmap = prctl(PR_ISOL_GET, PR_ISOL_SUP_QUIESCE_CFG, 0, 0, 0)
  (1)

  (returns the supported actions to be quiesced).

- prctl(PR_ISOL_SET, PR_ISOL_QUIESCE_CFG, quiesce_bitmask, 0, 0) _sets_
the actions to be quiesced (2)

If an application does not modify "quiesce_bitmask" between 
points (1) and (2) above, it will enable quiescing of all
"features" the kernel supports.

Application can, however, modify quiesce_bitmap to its preference.

Flushing vmstat _everytime_ you resume to userspace is enabled only
_after_ prctl(PR_ISOL_ENTER, 0, 0, 0, 0) is performed (which happens 
only when isolation is fully configured with the PR_ISOL_SET calls).
OK, will better document that.

> If so I'd rather call that ISOL_MODE because feature is too general.

Well, in the first patchset, there was one "mode" implemented (but
it was possible to implement different modes in the future).

This would allow for example easier integration of "full task isolation"
patchset type of functionality, disallowing syscalls.

I think we'd like to keep that, so i'll keep the previous distinct modes
(but allow configuration of individual features on the bitmap).

> > 
> > The supported features are:
> > 
> > ISOL_F_QUIESCE_ON_URET: quiesce deferred actions on return to userspace.
> > ----------------------
> > 
> > Quiescing of different actions can be performed on return to userspace.
> > 
> > - prctl(PR_ISOL_GET, PR_ISOL_SUP_QUIESCE_CFG, 0, 0, 0) returns
> > the supported actions to be quiesced.
> > 
> > - prctl(PR_ISOL_SET, PR_ISOL_QUIESCE_CFG, quiesce_bitmask, 0, 0) returns
							
							s/returns/sets/

> > the currently supported actions to be quiesced.
> > 
> > - prctl(PR_ISOL_GET, PR_ISOL_QUIESCE_CFG, 0, 0, 0) returns
> > the currently enabled actions to be quiesced.
> > 
> > #define ISOL_F_QUIESCE_VMSTAT_SYNC      (1<<0)
> > #define ISOL_F_QUIESCE_NOHZ_FULL        (1<<1)
> > #define ISOL_F_QUIESCE_DEFER_TLB_FLUSH  (1<<2)
> 
> And then PR_ISOL_QUIESCE_CFG is a oneshot operation that applies only upon
> return to this ctrl, right? If so perhaps this should be called just
> ISOL_QUIESCE or ISOL_QUIESCE_ONCE or ISOL_REQ ?

There was no one-shot operation implemented in the first patchset. What
application would do to achieve that is:

1. Configure isolation with PR_ISOL_SET (say configure mode which 
allows system calls, and when a system call happens, flush all deferred
actions on return to userspace).

2. prctl(PR_ISOL_ENTER, 0, 0, 0, 0) (this actually enables the flushing, 
and tags the task_struct as isolated). Here we can transfer this information
from per-task to per-CPU data, for example, to be able to implement 
other features such as deferred TLB flushing.

On return from this prctl(), deferrable actions are flushed.

3. latency sensitive loop, with no system calls.

4. some event which requires system calls is noticed:
   prctl(PR_ISOL_EXIT, 0, 0, 0, 0)
   (this would untag task_struct as isolated).

5. perform system calls A, B, C, D (with no flushing of vmstat, 
for example).

6. jmp to 2.

So there is a problem with this logic, which is that one would like
certain isolation functionality to remain enabled between points 4
and 6 (for example, blocking CPU hotplug or other blockable activities
that would cause interruptions).

One way to achieve this would be to replace PR_ISOL_ENTER/PR_ISOL_EXIT
with PR_ISOL_ENABLE, which accepts a bitmask:

1. Configure isolation with PR_ISOL_SET (say configure mode which 
allows system calls, and when a system call happens, flush all deferred
actions on return to userspace).

2. enabled_bitmask = ISOL_F_QUIESCE_ON_URET|ISOL_F_BLOCK_INTERRUPTORS;
   prctl(PR_ISOL_ENABLE, enabled_bitmask, 0, 0, 0)

On return from this prctl(), deferrable actions are flushed.

3. latency sensitive loop, with no system calls.

4. some event which requires system calls is noticed:

   prctl(PR_ISOL_ENABLE, ISOL_F_BLOCK_INTERRUPTORS, 0, 0, 0)
   (this would clear ISOL_F_QUIESCE_ON_URET, so no flushing
    is performed on return from system calls).

5. perform system calls A, B, C, D (with no flushing of vmstat).

6. jmp to 2.

...

On exit: prctl(PR_ISOL_ENABLE, 0, 0, 0, 0)

IOW: the one-shot operation does not allow the application
to inform the kernel when the latency sensitive loop has 
begun or has ended.

> 
> But that's just naming debate because otherwise that prctl layout looks good
> to me.
> 
> Thanks!

Thank you for the input!


  reply	other threads:[~2021-07-28  9:56 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-27 10:38 [patch 0/4] prctl task isolation interface and vmstat sync Marcelo Tosatti
2021-07-27 10:38 ` [patch 1/4] add basic task isolation prctl interface Marcelo Tosatti
2021-07-27 10:48   ` nsaenzju
2021-07-27 11:00     ` Marcelo Tosatti
2021-07-27 12:38       ` nsaenzju
2021-07-27 13:06         ` Marcelo Tosatti
2021-07-27 13:08           ` Marcelo Tosatti
2021-07-27 13:09         ` Frederic Weisbecker
2021-07-27 14:52           ` Marcelo Tosatti
2021-07-27 23:45             ` Frederic Weisbecker
2021-07-28  9:37               ` Marcelo Tosatti [this message]
2021-07-28 11:45                 ` Frederic Weisbecker
2021-07-28 13:21                   ` Marcelo Tosatti
2021-07-28 21:22                     ` Frederic Weisbecker
2021-07-28 11:55                 ` nsaenzju
2021-07-28 13:16                   ` Marcelo Tosatti
     [not found]                     ` <CAFki+LkQwoqVTKmgnwLQQM8ua-ixbLp8i+jUT6xF15k6X=89mw@mail.gmail.com>
2021-07-28 16:21                       ` Marcelo Tosatti
2021-07-28 17:08                     ` nsaenzju
     [not found]                 ` <CAFki+LmHeXmSFze8YEHFNbYA5hLEtnZyk37Yjf-eyOuKa8Os4w@mail.gmail.com>
2021-07-28 16:17                   ` Marcelo Tosatti
2021-07-27 10:38 ` [patch 2/4] task isolation: sync vmstats on return to userspace Marcelo Tosatti
2021-07-27 10:38 ` [patch 3/4] mm: vmstat: move need_update Marcelo Tosatti
2021-07-27 10:38 ` [patch 4/4] mm: vmstat_refresh: avoid queueing work item if cpu stats are clean Marcelo Tosatti
2021-07-27 12:29 [patch 1/4] add basic task isolation prctl interface kernel test robot
2021-07-30 20:18 [patch 0/4] extensible prctl task isolation interface and vmstat sync (v2) Marcelo Tosatti
2021-07-30 20:18 ` [patch 1/4] add basic task isolation prctl interface Marcelo Tosatti
     [not found]   ` <CAFki+Lnf0cs62Se0aPubzYxP9wh7xjMXn7RXEPvrmtBdYBrsow@mail.gmail.com>
2021-07-31  0:49     ` Marcelo Tosatti
2021-07-31  7:47   ` kernel test robot
2021-07-31  7:47     ` kernel test robot
     [not found]   ` <CAFki+LkQVQOe+5aNEKWDvLdnjWjxzKWOiqOvBZzeuPWX+G=XgA@mail.gmail.com>
2021-08-02 14:16     ` Marcelo Tosatti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210728093707.GA3242@fuller.cnet \
    --to=mtosatti@redhat.com \
    --cc=abelits@marvell.com \
    --cc=cl@linux.com \
    --cc=frederic@kernel.org \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nilal@redhat.com \
    --cc=nsaenzju@redhat.com \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.