All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch 0/4] prctl task isolation interface and vmstat sync
@ 2021-07-27 10:38 Marcelo Tosatti
  2021-07-27 10:38 ` [patch 1/4] add basic task isolation prctl interface Marcelo Tosatti
                   ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Marcelo Tosatti @ 2021-07-27 10:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Nitesh Lal, Nicolas Saenz Julienne, Frederic Weisbecker,
	Christoph Lameter, Juri Lelli, Peter Zijlstra, Alex Belits,
	Peter Xu

The logic to disable vmstat worker thread, when entering
nohz full, does not cover all scenarios. For example, it is possible
for the following to happen:

1) enter nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
2) app runs mlock, which increases counters for mlock'ed pages.
3) start -RT loop

Since refresh_cpu_vm_stats from nohz_full logic can happen _before_
the mlock, vmstat shepherd can restart vmstat worker thread on
the CPU in question.

To fix this, add task isolation prctl interface to quiesce
deferred actions when returning to userspace.

=============================
Task isolation prctl interface
=============================

Set thread isolation mode and parameters, which allows
informing the kernel that application is
executing latency sensitive code (where interruptions
are undesired).

Its composed of 4 prctl commands (passed as arg1 to
prctl):

PR_ISOL_SET:   set isolation parameters for the task

PR_ISOL_GET:   get isolation parameters for the task

PR_ISOL_ENTER: indicate that task should be considered
               isolated from this point on

PR_ISOL_EXIT: indicate that task should not be considered
              isolated from this point on

The isolation parameters and mode are not inherited by
children created by fork(2) and clone(2). The setting is
preserved across execve(2).

The meaning of isolated is specified as follows, when setting arg2 to
PR_ISOL_SET or PR_ISOL_GET, with the following arguments passed as arg3.

Isolation mode (PR_ISOL_MODE):
------------------------------

- PR_ISOL_MODE_NONE (arg4): no per-task isolation (default mode).
  PR_ISOL_EXIT sets mode to PR_ISOL_MODE_NONE.

- PR_ISOL_MODE_NORMAL (arg4): applications can perform system calls normally,
  and in case of interruption events, the notifications can be collected
  by BPF programs.
  In this mode, if system calls are performed, deferred actions initiated
  by the system call will be executed before return to userspace.

Other modes, which for example send signals upon interruptions events,
can be implemented.

Example
=======

The ``samples/task_isolation/`` directory contains a sample
application.



^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [patch 1/4] add basic task isolation prctl interface
@ 2021-07-27 12:29 kernel test robot
  0 siblings, 0 replies; 28+ messages in thread
From: kernel test robot @ 2021-07-27 12:29 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 3191 bytes --]

CC: kbuild-all(a)lists.01.org
In-Reply-To: <20210727104119.551607458@fuller.cnet>
References: <20210727104119.551607458@fuller.cnet>
TO: Marcelo Tosatti <mtosatti@redhat.com>
TO: linux-kernel(a)vger.kernel.org
CC: Nitesh Lal <nilal@redhat.com>
CC: Nicolas Saenz Julienne <nsaenzju@redhat.com>
CC: Frederic Weisbecker <frederic@kernel.org>
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Juri Lelli <juri.lelli@redhat.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Alex Belits <abelits@marvell.com>
CC: Peter Xu <peterx@redhat.com>
CC: Marcelo Tosatti <mtosatti@redhat.com>

Hi Marcelo,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.14-rc3]
[cannot apply to hnaz-linux-mm/master linux/master tip/sched/core tip/core/entry next-20210726]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Marcelo-Tosatti/prctl-task-isolation-interface-and-vmstat-sync/20210727-184330
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ff1176468d368232b684f75e82563369208bc371
:::::: branch date: 2 hours ago
:::::: commit date: 2 hours ago
compiler: gcc-10 (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0
reproduce:
cd tools/perf && ./check-headers.sh

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


perfheadercheck warnings: (new ones prefixed by >>)
>> Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h':  270> /* Task isolation control */
>> Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h':  271> #define PR_ISOL_SET			62
>> Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h':  272> #define PR_ISOL_GET			63
>> Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h':  273> #define PR_ISOL_ENTER			64
>> Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h':  274> #define PR_ISOL_EXIT			65
>> Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h':  275> # define PR_ISOL_MODE			1
   Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h':  276> 
>> Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h':  277> # define PR_ISOL_MODE_NONE		0
>> Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h':  278> # define PR_ISOL_MODE_NORMAL		1

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

^ permalink raw reply	[flat|nested] 28+ messages in thread
* [patch 0/4] extensible prctl task isolation interface and vmstat sync (v2)
@ 2021-07-30 20:18 Marcelo Tosatti
  2021-07-30 20:18 ` [patch 1/4] add basic task isolation prctl interface Marcelo Tosatti
  0 siblings, 1 reply; 28+ messages in thread
From: Marcelo Tosatti @ 2021-07-30 20:18 UTC (permalink / raw)
  To: linux-kernel
  Cc: Nitesh Lal, Nicolas Saenz Julienne, Frederic Weisbecker,
	Christoph Lameter, Juri Lelli, Peter Zijlstra, Alex Belits,
	Peter Xu

The logic to disable vmstat worker thread, when entering
nohz full, does not cover all scenarios. For example, it is possible
for the following to happen:

1) enter nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
2) app runs mlock, which increases counters for mlock'ed pages.
3) start -RT loop

Since refresh_cpu_vm_stats from nohz_full logic can happen _before_
the mlock, vmstat shepherd can restart vmstat worker thread on
the CPU in question.

To fix this, add task isolation prctl interface to quiesce
deferred actions when returning to userspace.

The patchset is based on ideas and code from the
task isolation patchset from Alex Belits:
https://lwn.net/Articles/816298/

Please refer to Documentation/userspace-api/task_isolation.rst
(patch 1) for details.

---------

v2:

- Finer-grained control of quiescing (Frederic Weisbecker / Nicolas Saenz).

- Avoid potential regressions by allowing applications
  to use ISOL_F_QUIESCE_DEFMASK (whose default value
  is configurable in /sys/).         (Nitesh Lal / Nicolas Saenz).


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2021-08-02 14:17 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-27 10:38 [patch 0/4] prctl task isolation interface and vmstat sync Marcelo Tosatti
2021-07-27 10:38 ` [patch 1/4] add basic task isolation prctl interface Marcelo Tosatti
2021-07-27 10:48   ` nsaenzju
2021-07-27 11:00     ` Marcelo Tosatti
2021-07-27 12:38       ` nsaenzju
2021-07-27 13:06         ` Marcelo Tosatti
2021-07-27 13:08           ` Marcelo Tosatti
2021-07-27 13:09         ` Frederic Weisbecker
2021-07-27 14:52           ` Marcelo Tosatti
2021-07-27 23:45             ` Frederic Weisbecker
2021-07-28  9:37               ` Marcelo Tosatti
2021-07-28 11:45                 ` Frederic Weisbecker
2021-07-28 13:21                   ` Marcelo Tosatti
2021-07-28 21:22                     ` Frederic Weisbecker
2021-07-28 11:55                 ` nsaenzju
2021-07-28 13:16                   ` Marcelo Tosatti
     [not found]                     ` <CAFki+LkQwoqVTKmgnwLQQM8ua-ixbLp8i+jUT6xF15k6X=89mw@mail.gmail.com>
2021-07-28 16:21                       ` Marcelo Tosatti
2021-07-28 17:08                     ` nsaenzju
     [not found]                 ` <CAFki+LmHeXmSFze8YEHFNbYA5hLEtnZyk37Yjf-eyOuKa8Os4w@mail.gmail.com>
2021-07-28 16:17                   ` Marcelo Tosatti
2021-07-27 10:38 ` [patch 2/4] task isolation: sync vmstats on return to userspace Marcelo Tosatti
2021-07-27 10:38 ` [patch 3/4] mm: vmstat: move need_update Marcelo Tosatti
2021-07-27 10:38 ` [patch 4/4] mm: vmstat_refresh: avoid queueing work item if cpu stats are clean Marcelo Tosatti
2021-07-27 12:29 [patch 1/4] add basic task isolation prctl interface kernel test robot
2021-07-30 20:18 [patch 0/4] extensible prctl task isolation interface and vmstat sync (v2) Marcelo Tosatti
2021-07-30 20:18 ` [patch 1/4] add basic task isolation prctl interface Marcelo Tosatti
     [not found]   ` <CAFki+Lnf0cs62Se0aPubzYxP9wh7xjMXn7RXEPvrmtBdYBrsow@mail.gmail.com>
2021-07-31  0:49     ` Marcelo Tosatti
2021-07-31  7:47   ` kernel test robot
2021-07-31  7:47     ` kernel test robot
     [not found]   ` <CAFki+LkQVQOe+5aNEKWDvLdnjWjxzKWOiqOvBZzeuPWX+G=XgA@mail.gmail.com>
2021-08-02 14:16     ` Marcelo Tosatti

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.