All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ankur Arora <ankur.a.arora@oracle.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
	akpm@linux-foundation.org, luto@kernel.org, bp@alien8.de,
	dave.hansen@linux.intel.com, hpa@zytor.com, mingo@redhat.com,
	juri.lelli@redhat.com, vincent.guittot@linaro.org,
	willy@infradead.org, mgorman@suse.de, rostedt@goodmis.org,
	jon.grimm@amd.com, bharata@amd.com, raghavendra.kt@amd.com,
	boris.ostrovsky@oracle.com, konrad.wilk@oracle.com,
	jgross@suse.com, andrew.cooper3@citrix.com
Subject: Re: [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED
Date: Sun, 24 Sep 2023 00:50:43 +0200	[thread overview]
Message-ID: <87h6nkh5bw.ffs@tglx> (raw)
In-Reply-To: <87led2wdj0.ffs@tglx>

On Tue, Sep 19 2023 at 14:30, Thomas Gleixner wrote:
> On Mon, Sep 18 2023 at 18:57, Linus Torvalds wrote:
>> Then the question becomes whether we'd want to introduce a *new*
>> concept, which is a "if you are going to schedule, do it now rather
>> than later, because I'm taking a lock, and while it's a preemptible
>> lock, I'd rather not sleep while holding this resource".
>>
>> I suspect we want to avoid that for now, on the assumption that it's
>> hopefully not a problem in practice (the recently addressed problem
>> with might_sleep() was that it actively *moved* the scheduling point
>> to a bad place, not that scheduling could happen there, so instead of
>> optimizing scheduling, it actively pessimized it). But I thought I'd
>> mention it.
>
> I think we want to avoid that completely and if this becomes an issue,
> we rather be smart about it at the core level.
>
> It's trivial enough to have a per task counter which tells whether a
> preemtible lock is held (or about to be acquired) or not. Then the
> scheduler can take that hint into account and decide to grant a
> timeslice extension once in the expectation that the task leaves the
> lock held section soonish and either returns to user space or schedules
> out. It still can enforce it later on.
>
> We really want to let the scheduler decide and rather give it proper
> hints at the conceptual level instead of letting developers make random
> decisions which might work well for a particular use case and completely
> suck for the rest. I think we wasted enough time already on those.

Finally I realized why cond_resched() & et al. are so disgusting. They
are scope-less and just a random spot which someone decided to be a good
place to reschedule.

But in fact the really relevant measure is scope. Full preemption is
scope based:

      preempt_disable();
      do_stuff();
      preempt_enable();

which also nests properly:

      preempt_disable();
      do_stuff()
        preempt_disable();
        do_other_stuff();
        preempt_enable();
      preempt_enable();

cond_resched() cannot nest and is obviously scope-less.

The TIF_ALLOW_RESCHED mechanism, which sparked this discussion only
pretends to be scoped.

As Peter pointed out it does not properly nest with other mechanisms and
it cannot even nest in itself because it is boolean.

The worst thing about it is that it is semantically reverse to the
established model of preempt_disable()/enable(),
i.e. allow_resched()/disallow_resched().

So instead of giving the scheduler a hint about 'this might be a good
place to preempt', providing proper scope would make way more sense:

      preempt_lazy_disable();
      do_stuff();
      preempt_lazy_enable();

That would be the obvious and semantically consistent counterpart to the
existing preemption control primitives with proper nesting support.

might_sleep(), which is in all the lock acquire functions or your
variant of hint (resched better now before I take the lock) are the
wrong place.

     hint();
     lock();
     do_stuff();
     unlock();

hint() might schedule and when the task comes back schedule immediately
again because the lock is contended. hint() does again not have scope
and might be meaningless or even counterproductive if called in a deeper
callchain.

Proper scope based hints avoid that.

      preempt_lazy_disable();
      lock();
      do_stuff();
      unlock();
      preempt_lazy_enable();
      
That's way better because it describes the scope and the task will
either schedule out in lock() on contention or provide a sensible lazy
preemption point in preempt_lazy_enable(). It also nests properly:

      preempt_lazy_disable();
      lock(A);
      do_stuff()
        preempt_lazy_disable();
        lock(B);
        do_other_stuff();
        unlock(B);
        preempt_lazy_enable();
      unlock(A);
      preempt_lazy_enable();

So in this case it does not matter wheter do_stuff() is invoked from a
lock held section or not. The scope which defines the throughput
relevant hint to the scheduler is correct in any case.

Contrary to preempt_disable() the lazy variant does neither prevent
scheduling nor preemption, but its a understandable properly nestable
mechanism.

I seriously hope to avoid it alltogether :)

Thanks,

        tglx

  parent reply	other threads:[~2023-09-23 22:51 UTC|newest]

Thread overview: 214+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-30 18:49 [PATCH v2 0/9] x86/clear_huge_page: multi-page clearing Ankur Arora
2023-08-30 18:49 ` [PATCH v2 1/9] mm/clear_huge_page: allow arch override for clear_huge_page() Ankur Arora
2023-08-30 18:49 ` [PATCH v2 2/9] mm/huge_page: separate clear_huge_page() and copy_huge_page() Ankur Arora
2023-08-30 18:49 ` [PATCH v2 3/9] mm/huge_page: cleanup clear_/copy_subpage() Ankur Arora
2023-09-08 13:09   ` Matthew Wilcox
2023-09-11 17:22     ` Ankur Arora
2023-08-30 18:49 ` [PATCH v2 4/9] x86/clear_page: extend clear_page*() for multi-page clearing Ankur Arora
2023-09-08 13:11   ` Matthew Wilcox
2023-08-30 18:49 ` [PATCH v2 5/9] x86/clear_page: add clear_pages() Ankur Arora
2023-08-30 18:49 ` [PATCH v2 6/9] x86/clear_huge_page: multi-page clearing Ankur Arora
2023-08-31 18:26   ` kernel test robot
2023-09-08 12:38   ` Peter Zijlstra
2023-09-13  6:43   ` Raghavendra K T
2023-08-30 18:49 ` [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED Ankur Arora
2023-09-08  7:02   ` Peter Zijlstra
2023-09-08 17:15     ` Linus Torvalds
2023-09-08 22:50       ` Peter Zijlstra
2023-09-09  5:15         ` Linus Torvalds
2023-09-09  6:39           ` Ankur Arora
2023-09-09  9:11             ` Peter Zijlstra
2023-09-09 20:04               ` Ankur Arora
2023-09-09  5:30       ` Ankur Arora
2023-09-09  9:12         ` Peter Zijlstra
2023-09-09 20:15     ` Ankur Arora
2023-09-09 21:16       ` Linus Torvalds
2023-09-10  3:48         ` Ankur Arora
2023-09-10  4:35           ` Linus Torvalds
2023-09-10 10:01             ` Ankur Arora
2023-09-10 18:32               ` Linus Torvalds
2023-09-11 15:04                 ` Peter Zijlstra
2023-09-11 16:29                   ` andrew.cooper3
2023-09-11 17:04                   ` Ankur Arora
2023-09-12  8:26                     ` Peter Zijlstra
2023-09-12 12:24                       ` Phil Auld
2023-09-12 12:33                       ` Matthew Wilcox
2023-09-18 23:42                       ` Thomas Gleixner
2023-09-19  1:57                         ` Linus Torvalds
2023-09-19  8:03                           ` Ingo Molnar
2023-09-19  8:43                             ` Ingo Molnar
2023-09-19 13:43                               ` Thomas Gleixner
2023-09-19 13:25                             ` Thomas Gleixner
2023-09-19 12:30                           ` Thomas Gleixner
2023-09-19 13:00                             ` Arches that don't support PREEMPT Matthew Wilcox
2023-09-19 13:00                               ` Matthew Wilcox
2023-09-19 13:00                               ` Matthew Wilcox
2023-09-19 13:34                               ` Geert Uytterhoeven
2023-09-19 13:34                                 ` Geert Uytterhoeven
2023-09-19 13:34                                 ` Geert Uytterhoeven
2023-09-19 13:37                               ` John Paul Adrian Glaubitz
2023-09-19 13:37                                 ` John Paul Adrian Glaubitz
2023-09-19 13:37                                 ` John Paul Adrian Glaubitz
2023-09-19 13:42                                 ` Peter Zijlstra
2023-09-19 13:42                                   ` Peter Zijlstra
2023-09-19 13:42                                   ` Peter Zijlstra
2023-09-19 13:48                                   ` John Paul Adrian Glaubitz
2023-09-19 13:48                                     ` John Paul Adrian Glaubitz
2023-09-19 13:48                                     ` John Paul Adrian Glaubitz
2023-09-19 14:16                                     ` Peter Zijlstra
2023-09-19 14:16                                       ` Peter Zijlstra
2023-09-19 14:16                                       ` Peter Zijlstra
2023-09-19 14:24                                       ` John Paul Adrian Glaubitz
2023-09-19 14:24                                         ` John Paul Adrian Glaubitz
2023-09-19 14:24                                         ` John Paul Adrian Glaubitz
2023-09-19 14:32                                         ` Matthew Wilcox
2023-09-19 14:32                                           ` Matthew Wilcox
2023-09-19 14:32                                           ` Matthew Wilcox
2023-09-19 15:31                                           ` Steven Rostedt
2023-09-19 15:31                                             ` Steven Rostedt
2023-09-19 15:31                                             ` Steven Rostedt
2023-09-20 14:38                                       ` Anton Ivanov
2023-09-20 14:38                                         ` Anton Ivanov
2023-09-20 14:38                                         ` Anton Ivanov
2023-09-21 12:20                                       ` Arnd Bergmann
2023-09-21 12:20                                         ` Arnd Bergmann
2023-09-21 12:20                                         ` Arnd Bergmann
2023-09-19 14:17                                     ` Thomas Gleixner
2023-09-19 14:17                                       ` Thomas Gleixner
2023-09-19 14:17                                       ` Thomas Gleixner
2023-09-19 14:50                                       ` H. Peter Anvin
2023-09-19 14:50                                         ` H. Peter Anvin
2023-09-19 14:50                                         ` H. Peter Anvin
2023-09-19 14:57                                         ` Matt Turner
2023-09-19 14:57                                           ` Matt Turner
2023-09-19 14:57                                           ` Matt Turner
2023-09-19 17:09                                         ` Ulrich Teichert
2023-09-19 17:09                                           ` Ulrich Teichert
2023-09-19 17:25                                     ` Linus Torvalds
2023-09-19 17:25                                       ` Linus Torvalds
2023-09-19 17:25                                       ` Linus Torvalds
2023-09-19 17:58                                       ` John Paul Adrian Glaubitz
2023-09-19 17:58                                         ` John Paul Adrian Glaubitz
2023-09-19 17:58                                         ` John Paul Adrian Glaubitz
2023-09-19 18:31                                       ` Thomas Gleixner
2023-09-19 18:31                                         ` Thomas Gleixner
2023-09-19 18:31                                         ` Thomas Gleixner
2023-09-19 18:38                                         ` Steven Rostedt
2023-09-19 18:38                                           ` Steven Rostedt
2023-09-19 18:38                                           ` Steven Rostedt
2023-09-19 18:52                                           ` Linus Torvalds
2023-09-19 18:52                                             ` Linus Torvalds
2023-09-19 18:52                                             ` Linus Torvalds
2023-09-19 19:53                                             ` Thomas Gleixner
2023-09-19 19:53                                               ` Thomas Gleixner
2023-09-19 19:53                                               ` Thomas Gleixner
2023-09-20  7:32                                           ` Ingo Molnar
2023-09-20  7:32                                             ` Ingo Molnar
2023-09-20  7:32                                             ` Ingo Molnar
2023-09-20  7:29                                         ` Ingo Molnar
2023-09-20  7:29                                           ` Ingo Molnar
2023-09-20  7:29                                           ` Ingo Molnar
2023-09-20  8:26                                       ` Thomas Gleixner
2023-09-20  8:26                                         ` Thomas Gleixner
2023-09-20  8:26                                         ` Thomas Gleixner
2023-09-20 10:37                                       ` David Laight
2023-09-20 10:37                                         ` David Laight
2023-09-20 10:37                                         ` David Laight
2023-09-19 14:21                                   ` Anton Ivanov
2023-09-19 14:21                                     ` Anton Ivanov
2023-09-19 14:21                                     ` Anton Ivanov
2023-09-19 15:17                                     ` Thomas Gleixner
2023-09-19 15:17                                       ` Thomas Gleixner
2023-09-19 15:17                                       ` Thomas Gleixner
2023-09-19 15:21                                       ` Anton Ivanov
2023-09-19 15:21                                         ` Anton Ivanov
2023-09-19 15:21                                         ` Anton Ivanov
2023-09-19 16:22                                         ` Richard Weinberger
2023-09-19 16:22                                           ` Richard Weinberger
2023-09-19 16:22                                           ` Richard Weinberger
2023-09-19 16:41                                           ` Anton Ivanov
2023-09-19 16:41                                             ` Anton Ivanov
2023-09-19 16:41                                             ` Anton Ivanov
2023-09-19 17:33                                             ` Thomas Gleixner
2023-09-19 17:33                                               ` Thomas Gleixner
2023-09-19 17:33                                               ` Thomas Gleixner
2023-10-06 14:51                               ` Geert Uytterhoeven
2023-10-06 14:51                                 ` Geert Uytterhoeven
2023-09-20 14:22                             ` [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED Ankur Arora
2023-09-20 20:51                               ` Thomas Gleixner
2023-09-21  0:14                                 ` Thomas Gleixner
2023-09-21  0:58                                 ` Ankur Arora
2023-09-21  2:12                                   ` Thomas Gleixner
2023-09-20 23:58                             ` Thomas Gleixner
2023-09-21  0:57                               ` Ankur Arora
2023-09-21  2:02                                 ` Thomas Gleixner
2023-09-21  4:16                                   ` Ankur Arora
2023-09-21 13:59                                     ` Steven Rostedt
2023-09-21 16:00                               ` Linus Torvalds
2023-09-21 22:55                                 ` Thomas Gleixner
2023-09-23  1:11                                   ` Thomas Gleixner
2023-10-02 14:15                                     ` Steven Rostedt
2023-10-02 16:13                                       ` Thomas Gleixner
2023-10-18  1:03                                     ` Paul E. McKenney
2023-10-18 12:09                                       ` Ankur Arora
2023-10-18 17:51                                         ` Paul E. McKenney
2023-10-18 22:53                                           ` Thomas Gleixner
2023-10-18 23:25                                             ` Paul E. McKenney
2023-10-18 13:16                                       ` Thomas Gleixner
2023-10-18 14:31                                         ` Steven Rostedt
2023-10-18 17:55                                           ` Paul E. McKenney
2023-10-18 18:00                                             ` Steven Rostedt
2023-10-18 18:13                                               ` Paul E. McKenney
2023-10-19 12:37                                                 ` Daniel Bristot de Oliveira
2023-10-19 17:08                                                   ` Paul E. McKenney
2023-10-18 17:19                                         ` Paul E. McKenney
2023-10-18 17:41                                           ` Steven Rostedt
2023-10-18 17:59                                             ` Paul E. McKenney
2023-10-18 20:15                                           ` Ankur Arora
2023-10-18 20:42                                             ` Paul E. McKenney
2023-10-19  0:21                                           ` Thomas Gleixner
2023-10-19 19:13                                             ` Paul E. McKenney
2023-10-20 21:59                                               ` Paul E. McKenney
2023-10-20 22:56                                               ` Ankur Arora
2023-10-20 23:36                                                 ` Paul E. McKenney
2023-10-21  1:05                                                   ` Ankur Arora
2023-10-21  2:08                                                     ` Paul E. McKenney
2023-10-24 12:15                                               ` Thomas Gleixner
2023-10-24 18:59                                                 ` Paul E. McKenney
2023-09-23 22:50                             ` Thomas Gleixner [this message]
2023-09-24  0:10                               ` Thomas Gleixner
2023-09-24  7:19                               ` Matthew Wilcox
2023-09-24  7:55                                 ` Thomas Gleixner
2023-09-24 10:29                                   ` Matthew Wilcox
2023-09-25  0:13                               ` Ankur Arora
2023-10-06 13:01                             ` Geert Uytterhoeven
2023-09-19  7:21                         ` Ingo Molnar
2023-09-19 19:05                         ` Ankur Arora
2023-10-24 14:34                         ` Steven Rostedt
2023-10-25  1:49                           ` Steven Rostedt
2023-10-26  7:50                           ` Sergey Senozhatsky
2023-10-26 12:48                             ` Steven Rostedt
2023-09-11 16:48             ` Steven Rostedt
2023-09-11 20:50               ` Linus Torvalds
2023-09-11 21:16                 ` Linus Torvalds
2023-09-12  7:20                   ` Peter Zijlstra
2023-09-12  7:38                     ` Ingo Molnar
2023-09-11 22:20                 ` Steven Rostedt
2023-09-11 23:10                   ` Ankur Arora
2023-09-11 23:16                     ` Steven Rostedt
2023-09-12 16:30                   ` Linus Torvalds
2023-09-12  3:27                 ` Matthew Wilcox
2023-09-12 16:20                   ` Linus Torvalds
2023-09-19  3:21   ` Andy Lutomirski
2023-09-19  9:20     ` Thomas Gleixner
2023-09-19  9:49       ` Ingo Molnar
2023-08-30 18:49 ` [PATCH v2 8/9] irqentry: define irqentry_exit_allow_resched() Ankur Arora
2023-09-08 12:42   ` Peter Zijlstra
2023-09-11 17:24     ` Ankur Arora
2023-08-30 18:49 ` [PATCH v2 9/9] x86/clear_huge_page: make clear_contig_region() preemptible Ankur Arora
2023-09-08 12:45   ` Peter Zijlstra
2023-09-03  8:14 ` [PATCH v2 0/9] x86/clear_huge_page: multi-page clearing Mateusz Guzik
2023-09-05 22:14   ` Ankur Arora
2023-09-08  2:18   ` Raghavendra K T
2023-09-05  1:06 ` Raghavendra K T
2023-09-05 19:36   ` Ankur Arora

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h6nkh5bw.ffs@tglx \
    --to=tglx@linutronix.de \
    --cc=akpm@linux-foundation.org \
    --cc=andrew.cooper3@citrix.com \
    --cc=ankur.a.arora@oracle.com \
    --cc=bharata@amd.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=jon.grimm@amd.com \
    --cc=juri.lelli@redhat.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=raghavendra.kt@amd.com \
    --cc=rostedt@goodmis.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vincent.guittot@linaro.org \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.