From: Steven Rostedt <rostedt@goodmis.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ankur Arora <ankur.a.arora@oracle.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
	akpm@linux-foundation.org, luto@kernel.org, bp@alien8.de,
	dave.hansen@linux.intel.com, hpa@zytor.com, mingo@redhat.com,
	juri.lelli@redhat.com, vincent.guittot@linaro.org,
	willy@infradead.org, mgorman@suse.de, jon.grimm@amd.com,
	bharata@amd.com, raghavendra.kt@amd.com,
	boris.ostrovsky@oracle.com, konrad.wilk@oracle.com,
	jgross@suse.com, andrew.cooper3@citrix.com,
	Joel Fernandes <joel@joelfernandes.org>,
	Youssef Esmat <youssefesmat@chromium.org>,
	Vineeth Pillai <vineethrp@google.com>,
	Suleiman Souhlal <suleiman@google.com>
Subject: Re: [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED
Date: Tue, 24 Oct 2023 10:34:26 -0400
Message-ID: <20231024103426.4074d319@gandalf.local.home>
In-Reply-To: <87cyyfxd4k.ffs@tglx>

On Tue, 19 Sep 2023 01:42:03 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

>    2) When the scheduler wants to set NEED_RESCHED due it sets
>       NEED_RESCHED_LAZY instead which is only evaluated in the return to
>       user space preemption points.
> 
>       As NEED_RESCHED_LAZY is not folded into the preemption count the
>       preemption count won't become zero, so the task can continue until
>       it hits return to user space.
> 
>       That preserves the existing behaviour.

I'm looking into extending this concept to user space and to VMs.

I'm calling this the "extended scheduler time slice" (ESTS, pronounced "estis").

The idea is this. Have VMs/user space share a memory region with the
kernel that is per thread/vCPU. This would be registered via a syscall or
an ioctl on some defined file or whatever. Then, when entering user space /
VM, if NEED_RESCHED_LAZY (or whatever it's eventually called) is set, the
kernel checks if the thread has this memory region and a special bit in it
is set, and if it is, it does not schedule. It will treat it like a long
kernel system call.

The kernel will then set another bit in the shared memory region that will
tell user space / VM that the kernel wanted to schedule, but is allowing it
to finish its critical section. When user space / VM is done with the
critical section, it will check the bit that may be set by the kernel and
if it is set, it should do a sched_yield() or VMEXIT so that the kernel can
now schedule it.
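
A minimal sketch of what the user space side of this handshake could look
like. Nothing here is an existing interface: the region layout, the bit
names (ESTS_IN_CRITICAL, ESTS_RESCHED_WANTED) and the helpers ests_enter()
and ests_exit() are all hypothetical, and how the region gets registered
with the kernel is left out because it is not defined yet.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bits in the per-thread region shared with the kernel. */
#define ESTS_IN_CRITICAL    (1u << 0) /* set by user space: "give me more time" */
#define ESTS_RESCHED_WANTED (1u << 1) /* set by kernel: "I deferred a schedule" */

/* Hypothetical layout of the shared region (one per thread). */
struct ests_region {
	_Atomic unsigned int flags;
};

/* Mark the start of a critical section, e.g. right before taking a lock. */
static inline void ests_enter(struct ests_region *r)
{
	assert(r);
	atomic_fetch_or_explicit(&r->flags, ESTS_IN_CRITICAL,
				 memory_order_acquire);
}

/*
 * Mark the end of a critical section. Both bits are cleared in a single
 * atomic operation so that a request posted by the kernel cannot be lost.
 * Returns true if the kernel had deferred a reschedule and we yielded.
 */
static inline bool ests_exit(struct ests_region *r)
{
	unsigned int old;

	assert(r);
	old = atomic_fetch_and_explicit(&r->flags,
			~(ESTS_IN_CRITICAL | ESTS_RESCHED_WANTED),
			memory_order_release);
	if (old & ESTS_RESCHED_WANTED) {
		sched_yield();	/* hand the CPU back as asked */
		return true;
	}
	return false;
}
```

A lock implementation would call ests_enter() before acquiring the lock and
ests_exit() after releasing it. A vCPU would do a VMEXIT where this sketch
calls sched_yield().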

What about DoS, you say? It's no different than running a long system call.
No task can run forever. It's not a "preempt disable", it's just "give me
some more time". A "NEED_RESCHED" will always schedule, just like a kernel
system call that takes a long time. The goal is to allow user space to get
out of critical sections that we know can cause problems if they get
preempted. Usually that's a user space / VM lock being held, or maybe a VM
interrupt handler that needs to wake up a task on another vCPU.
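
The decision on the kernel side can be modelled as a small pure function.
This is only a user space model of the rules described above, not kernel
code: the flag values, the ESTS_* bit names and ests_should_schedule() are
made up for illustration, and a real implementation would have to access
the shared region with the proper user-access primitives.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values, not the kernel's actual TIF bit layout. */
#define NEED_RESCHED        (1u << 0)
#define NEED_RESCHED_LAZY   (1u << 1)

/* Hypothetical bits in the per-thread shared region. */
#define ESTS_IN_CRITICAL    (1u << 0) /* set by user space / VM */
#define ESTS_RESCHED_WANTED (1u << 1) /* set by the kernel */

/*
 * Model of the check done when entering user space / VM.
 * region_flags is NULL if the thread never registered a region.
 * Returns true if the kernel should schedule now.
 */
static bool ests_should_schedule(unsigned int tif_flags,
				 unsigned int *region_flags)
{
	/* A real NEED_RESCHED always schedules, no extension possible. */
	if (tif_flags & NEED_RESCHED)
		return true;

	/* Nothing pending at all. */
	if (!(tif_flags & NEED_RESCHED_LAZY))
		return false;

	/* Lazy request and the task asked for more time: defer and tell it. */
	if (region_flags != NULL && (*region_flags & ESTS_IN_CRITICAL)) {
		*region_flags |= ESTS_RESCHED_WANTED;
		return false;
	}

	/* Lazy request without an extension: schedule as usual. */
	return true;
}
```

The first check is what bounds the extension: once the scheduler escalates
to NEED_RESCHED, the bit in the shared region no longer matters.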

If we are worried about abuse, we could even punish tasks that don't call
sched_yield() by the time their extended time slice is used up. Even without
that punishment, if we have EEVDF, this extension will make the task less
eligible the next time around.

The goal is to prevent a thread / vCPU being preempted while holding a lock
or resource that other threads / vCPUs will want. That is, prevent
contention, as that's usually the biggest issue with performance in user
space and VMs.

I'm going to work on a POC, and see if I can get some benchmarks on how
much this could help tasks like databases and VMs in general.

-- Steve
