All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ankur Arora <ankur.a.arora@oracle.com>
To: paulmck@kernel.org
Cc: Ankur Arora <ankur.a.arora@oracle.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	linux-kernel@vger.kernel.org, tglx@linutronix.de,
	peterz@infradead.org, torvalds@linux-foundation.org,
	akpm@linux-foundation.org, luto@kernel.org, bp@alien8.de,
	dave.hansen@linux.intel.com, hpa@zytor.com, mingo@redhat.com,
	juri.lelli@redhat.com, vincent.guittot@linaro.org,
	willy@infradead.org, mgorman@suse.de, jpoimboe@kernel.org,
	mark.rutland@arm.com, jgross@suse.com, andrew.cooper3@citrix.com,
	bristot@kernel.org, mathieu.desnoyers@efficios.com,
	geert@linux-m68k.org, glaubitz@physik.fu-berlin.de,
	anton.ivanov@cambridgegreys.com, mattst88@gmail.com,
	krypton@ulrich-teichert.org, rostedt@goodmis.org,
	David.Laight@aculab.com, richard@nod.at, mjguzik@gmail.com,
	jon.grimm@amd.com, bharata@amd.com, raghavendra.kt@amd.com,
	boris.ostrovsky@oracle.com, konrad.wilk@oracle.com
Subject: Re: [PATCH 26/30] sched: handle preempt=voluntary under PREEMPT_AUTO
Date: Mon, 11 Mar 2024 13:09:54 -0700	[thread overview]
Message-ID: <87cys0il7x.fsf@oracle.com> (raw)
In-Reply-To: <36eef8c5-8ecd-4c90-8851-1c2ab342e2bb@paulmck-laptop>


Paul E. McKenney <paulmck@kernel.org> writes:

> On Sun, Mar 10, 2024 at 09:50:33PM -0700, Ankur Arora wrote:
>>
>> Paul E. McKenney <paulmck@kernel.org> writes:
>>
>> > On Thu, Mar 07, 2024 at 08:22:30PM -0800, Ankur Arora wrote:
>> >>
>> >> Paul E. McKenney <paulmck@kernel.org> writes:
>> >>
>> >> > On Thu, Mar 07, 2024 at 07:15:35PM -0500, Joel Fernandes wrote:
>> >> >>
>> >> >>
>> >> >> On 3/7/2024 2:01 PM, Paul E. McKenney wrote:
>> >> >> > On Wed, Mar 06, 2024 at 03:42:10PM -0500, Joel Fernandes wrote:
>> >> >> >> Hi Ankur,
>> >> >> >>
>> >> >> >> On 3/5/2024 3:11 AM, Ankur Arora wrote:
>> >> >> >>>
>> >> >> >>> Joel Fernandes <joel@joelfernandes.org> writes:
>> >> >> >>>
>> >> >> >> [..]
>> >> >> >>>> IMO, just kill 'voluntary' if PREEMPT_AUTO is enabled. There is no
>> >> >> >>>> 'voluntary' business because
>> >> >> >>>> 1. The behavior vs =none is to allow higher scheduling class to preempt, it
>> >> >> >>>> is not about the old voluntary.
>> >> >> >>>
>> >> >> >>> What do you think about folding the higher scheduling class preemption logic
>> >> >> >>> into preempt=none? As Juri pointed out, prioritization of at least the leftmost
>> >> >> >>> deadline task needs to be done for correctness.
>> >> >> >>>
>> >> >> >>> (That'll get rid of the current preempt=voluntary model, at least until
>> >> >> >>> there's a separate use for it.)
>> >> >> >>
>> >> >> >> Yes I am all in support for that. Its less confusing for the user as well, and
>> >> >> >> scheduling higher priority class at the next tick for preempt=none sounds good
>> >> >> >> to me. That is still an improvement for folks using SCHED_DEADLINE for whatever
>> >> >> >> reason, with a vanilla CONFIG_PREEMPT_NONE=y kernel. :-P. If we want a new mode
>> >> >> >> that is more aggressive, it could be added in the future.
>> >> >> >
>> >> >> > This would be something that happens only after removing cond_resched()
>> >> >> > might_sleep() functionality from might_sleep(), correct?
>> >> >>
>> >> >> Firstly, Maybe I misunderstood Ankur completely. Re-reading his comments above,
>> >> >> he seems to be suggesting preempting instantly for higher scheduling CLASSES
>> >> >> even for preempt=none mode, without having to wait till the next
>> >> >> scheduling-clock interrupt. Not sure if that makes sense to me, I was asking not
>> >> >> to treat "higher class" any differently than "higher priority" for preempt=none.
>> >> >>
>> >> >> And if SCHED_DEADLINE has a problem with that, then it already happens so with
>> >> >> CONFIG_PREEMPT_NONE=y kernels, so no need special treatment for higher class any
>> >> >> more than the treatment given to higher priority within same class. Ankur/Juri?
>> >> >>
>> >> >> Re: cond_resched(), I did not follow you Paul, why does removing the proposed
>> >> >> preempt=voluntary mode (i.e. dropping this patch) have to happen only after
>> >> >> cond_resched()/might_sleep() modifications?
>> >> >
>> >> > Because right now, one large difference between CONFIG_PREEMPT_NONE
>> >> > an CONFIG_PREEMPT_VOLUNTARY is that for the latter might_sleep() is a
>> >> > preemption point, but not for the former.
>> >>
>> >> True. But, there is no difference between either of those with
>> >> PREEMPT_AUTO=y (at least right now).
>> >>
>> >> For (PREEMPT_AUTO=y, PREEMPT_VOLUNTARY=y, DEBUG_ATOMIC_SLEEP=y),
>> >> might_sleep() is:
>> >>
>> >> # define might_resched() do { } while (0)
>> >> # define might_sleep() \
>> >>         do { __might_sleep(__FILE__, __LINE__); might_resched(); } while (0)
>> >>
>> >> And, cond_resched() for (PREEMPT_AUTO=y, PREEMPT_VOLUNTARY=y,
>> >> DEBUG_ATOMIC_SLEEP=y):
>> >>
>> >> static inline int _cond_resched(void)
>> >> {
>> >>         klp_sched_try_switch();
>> >>         return 0;
>> >> }
>> >> #define cond_resched() ({                       \
>> >>         __might_resched(__FILE__, __LINE__, 0); \
>> >>         _cond_resched();                        \
>> >> })
>> >>
>> >> And, no change for (PREEMPT_AUTO=y, PREEMPT_NONE=y, DEBUG_ATOMIC_SLEEP=y).
>> >
>> > As long as it is easy to restore the prior cond_resched() functionality
>> > for testing in the meantime, I should be OK.  For example, it would
>> > be great to have the commit removing the old functionality from
>> > cond_resched() at the end of the series,
>>
>> I would, of course, be happy to make any changes that helps testing,
>> but I think I'm missing something that you are saying wrt
>> cond_resched()/might_sleep().
>>
>> There's no commit explicitly removing the core cond_reshed()
>> functionality: PREEMPT_AUTO explicitly selects PREEMPT_BUILD and selects
>> out PREEMPTION_{NONE,VOLUNTARY}_BUILD.
>> (That's patch-1 "preempt: introduce CONFIG_PREEMPT_AUTO".)
>>
>> For the rest it just piggybacks on the CONFIG_PREEMPT_DYNAMIC work
>> and just piggybacks on (!CONFIG_PREEMPT_DYNAMIC && CONFIG_PREEMPTION):
>>
>> #if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
>> 	/* ... */
>> #if defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
>> 	/* ... */
>> #elif defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
>> 	/* ... */
>> #else /* !CONFIG_PREEMPTION */
>> 	/* ... */
>> #endif /* PREEMPT_DYNAMIC && CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
>>
>> #else /* CONFIG_PREEMPTION && !CONFIG_PREEMPT_DYNAMIC */
>> static inline int _cond_resched(void)
>> {
>> 	klp_sched_try_switch();
>> 	return 0;
>> }
>> #endif /* !CONFIG_PREEMPTION || CONFIG_PREEMPT_DYNAMIC */
>>
>> Same for might_sleep() (which really amounts to might_resched()):
>>
>> #ifdef CONFIG_PREEMPT_VOLUNTARY_BUILD
>>        /* ... */
>> #elif defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
>>       /* ... */
>> #elif defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
>>       /* ... */
>> #else
>> # define might_resched() do { } while (0)
>> #endif /* CONFIG_PREEMPT_* */
>>
>> But, I doubt that I'm telling you anything new. So, what am I missing?
>
> It is really a choice at your end.
>
> Suppose we enable CONFIG_PREEMPT_AUTO on our fleet, and find that there
> was some small set of cond_resched() calls that provided sub-jiffy
> preemption that matter to some of our workloads.  At that point, what
> are our options?
>
> 1.	Revert CONFIG_PREEMPT_AUTO.
>
> 2.	Revert only the part that disables the voluntary preemption
> 	semantics of cond_resched().  Which, as you point out, ends up
> 	being the same as #1 above.
>
> 3.	Hotwire a voluntary preemption into the required locations.
> 	Which we would avoid doing due to upstream-acceptance concerns.
>
> So, how easy would you like to make it for us to use as much of
> CONFIG_PREEMPT_AUTO=y under various possible problem scenarios?

Ah, I see your point. Basically, keep the lazy semantics but -- in
addition -- also provide the ability to dynamically toggle
cond_resched(), might_reshed() as a feature to help move this along
further.

So, as I mentioned earlier, the callsites are already present, and
removing them needs work (with livepatch and more generally to ensure
PREEMPT_AUTO is good enough for the current PREEMPT_* scenarios so
we can ditch cond_resched()).

I honestly don't see any reason not to do this -- I would prefer
this be a temporary thing to help beat PREEMPT_AUTO into shape. And,
this provides an insurance policy for using PREEMPT_AUTO.

That said, I would like Thomas' opinion on this.

> 3.	Hotwire a voluntary preemption into the required locations.
> 	Which we would avoid doing due to upstream-acceptance concerns.

Apropos of this, how would you determine which are the locations
where we specifically need voluntary preemption?

> Yes, in a perfect world, we would have tested this already, but I
> am still chasing down problems induced by simple rcutorture testing.
> Cowardly of us, isn't it?  ;-)

Cowards are us :).

--
ankur

  reply	other threads:[~2024-03-11 20:10 UTC|newest]

Thread overview: 155+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-13  5:55 [PATCH 00/30] PREEMPT_AUTO: support lazy rescheduling Ankur Arora
2024-02-13  5:55 ` [PATCH 01/30] preempt: introduce CONFIG_PREEMPT_AUTO Ankur Arora
2024-02-13  5:55 ` [PATCH 02/30] thread_info: selector for TIF_NEED_RESCHED[_LAZY] Ankur Arora
2024-02-19 15:16   ` Thomas Gleixner
2024-02-20 22:50     ` Ankur Arora
2024-02-21 17:05       ` Thomas Gleixner
2024-02-21 18:26   ` Steven Rostedt
2024-02-21 20:03     ` Thomas Gleixner
2024-02-13  5:55 ` [PATCH 03/30] thread_info: tif_need_resched() now takes resched_t as param Ankur Arora
2024-02-14  3:17   ` kernel test robot
2024-02-14 14:08   ` Mark Rutland
2024-02-15  4:08     ` Ankur Arora
2024-02-19 12:30       ` Mark Rutland
2024-02-20 22:09         ` Ankur Arora
2024-02-19 15:21     ` Thomas Gleixner
2024-02-20 22:21       ` Ankur Arora
2024-02-21 17:07         ` Thomas Gleixner
2024-02-21 21:22           ` Ankur Arora
2024-02-13  5:55 ` [PATCH 04/30] sched: make test_*_tsk_thread_flag() return bool Ankur Arora
2024-02-14 14:12   ` Mark Rutland
2024-02-15  2:04     ` Ankur Arora
2024-02-13  5:55 ` [PATCH 05/30] sched: *_tsk_need_resched() now takes resched_t as param Ankur Arora
2024-02-19 15:26   ` Thomas Gleixner
2024-02-20 22:37     ` Ankur Arora
2024-02-21 17:10       ` Thomas Gleixner
2024-02-13  5:55 ` [PATCH 06/30] entry: handle lazy rescheduling at user-exit Ankur Arora
2024-02-19 15:29   ` Thomas Gleixner
2024-02-20 22:38     ` Ankur Arora
2024-02-13  5:55 ` [PATCH 07/30] entry/kvm: handle lazy rescheduling at guest-entry Ankur Arora
2024-02-13  5:55 ` [PATCH 08/30] entry: irqentry_exit only preempts for TIF_NEED_RESCHED Ankur Arora
2024-02-13  5:55 ` [PATCH 09/30] sched: __schedule_loop() doesn't need to check for need_resched_lazy() Ankur Arora
2024-02-13  5:55 ` [PATCH 10/30] sched: separate PREEMPT_DYNAMIC config logic Ankur Arora
2024-02-13  5:55 ` [PATCH 11/30] sched: runtime preemption config under PREEMPT_AUTO Ankur Arora
2024-02-13  5:55 ` [PATCH 12/30] rcu: limit PREEMPT_RCU to full preemption " Ankur Arora
2024-02-13  5:55 ` [PATCH 13/30] rcu: fix header guard for rcu_all_qs() Ankur Arora
2024-02-13  5:55 ` [PATCH 14/30] preempt,rcu: warn on PREEMPT_RCU=n, preempt=full Ankur Arora
2024-02-13  5:55 ` [PATCH 15/30] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y Ankur Arora
2024-03-10 10:03   ` Joel Fernandes
2024-03-10 18:56     ` Paul E. McKenney
2024-03-11  0:48       ` Joel Fernandes
2024-03-11  3:56         ` Paul E. McKenney
2024-03-11 15:01           ` Joel Fernandes
2024-03-11 20:51             ` Ankur Arora
2024-03-11 22:12               ` Thomas Gleixner
2024-03-11  5:18         ` Ankur Arora
2024-03-11 15:25           ` Joel Fernandes
2024-03-11 19:12             ` Thomas Gleixner
2024-03-11 19:53               ` Paul E. McKenney
2024-03-11 20:29                 ` Thomas Gleixner
2024-03-12  0:01                   ` Paul E. McKenney
2024-03-12  0:08               ` Joel Fernandes
2024-03-12  3:16                 ` Ankur Arora
2024-03-12  3:24                   ` Joel Fernandes
2024-03-12  5:23                     ` Ankur Arora
2024-02-13  5:55 ` [PATCH 16/30] rcu: force context-switch " Ankur Arora
2024-02-13  5:55 ` [PATCH 17/30] x86/thread_info: define TIF_NEED_RESCHED_LAZY Ankur Arora
2024-02-14 13:25   ` Mark Rutland
2024-02-14 20:31     ` Ankur Arora
2024-02-19 12:32       ` Mark Rutland
2024-02-13  5:55 ` [PATCH 18/30] sched: prepare for lazy rescheduling in resched_curr() Ankur Arora
2024-02-13  5:55 ` [PATCH 19/30] sched: default preemption policy for PREEMPT_AUTO Ankur Arora
2024-02-13  5:55 ` [PATCH 20/30] sched: handle idle preemption " Ankur Arora
2024-02-13  5:55 ` [PATCH 21/30] sched: schedule eagerly in resched_cpu() Ankur Arora
2024-02-13  5:55 ` [PATCH 22/30] sched/fair: refactor update_curr(), entity_tick() Ankur Arora
2024-02-13  5:55 ` [PATCH 23/30] sched/fair: handle tick expiry under lazy preemption Ankur Arora
2024-02-21 21:38   ` Steven Rostedt
2024-02-28 13:47   ` Juri Lelli
2024-02-29  6:43     ` Ankur Arora
2024-02-29  9:33       ` Juri Lelli
2024-02-29 23:54         ` Ankur Arora
2024-03-01  0:28           ` Paul E. McKenney
2024-02-13  5:55 ` [PATCH 24/30] sched: support preempt=none under PREEMPT_AUTO Ankur Arora
2024-02-13  5:55 ` [PATCH 25/30] sched: support preempt=full " Ankur Arora
2024-02-13  5:55 ` [PATCH 26/30] sched: handle preempt=voluntary " Ankur Arora
2024-03-03  1:08   ` Joel Fernandes
2024-03-05  8:11     ` Ankur Arora
2024-03-06 20:42       ` Joel Fernandes
2024-03-07 19:01         ` Paul E. McKenney
2024-03-08  0:15           ` Joel Fernandes
2024-03-08  0:42             ` Paul E. McKenney
2024-03-08  4:22               ` Ankur Arora
2024-03-08 21:33                 ` Paul E. McKenney
2024-03-11  4:50                   ` Ankur Arora
2024-03-11 19:26                     ` Paul E. McKenney
2024-03-11 20:09                       ` Ankur Arora [this message]
2024-03-11 20:23                         ` Linus Torvalds
2024-03-11 21:03                           ` Ankur Arora
2024-03-12  0:03                           ` Paul E. McKenney
2024-03-12 12:14                             ` Thomas Gleixner
2024-03-12 19:40                               ` Paul E. McKenney
2024-03-08  3:49             ` Ankur Arora
2024-03-08  5:29               ` Joel Fernandes
2024-03-08  6:54               ` Juri Lelli
2024-03-11  5:34                 ` Ankur Arora
2024-02-13  5:55 ` [PATCH 27/30] sched: latency warn for TIF_NEED_RESCHED_LAZY Ankur Arora
2024-02-13  5:55 ` [PATCH 28/30] tracing: support lazy resched Ankur Arora
2024-02-13  5:55 ` [PATCH 29/30] Documentation: tracing: add TIF_NEED_RESCHED_LAZY Ankur Arora
2024-02-21 21:43   ` Steven Rostedt
2024-02-21 23:22     ` Ankur Arora
2024-02-21 23:53       ` Steven Rostedt
2024-03-01 23:33     ` Joel Fernandes
2024-03-02  3:09       ` Ankur Arora
2024-03-03 19:32         ` Joel Fernandes
2024-02-13  5:55 ` [PATCH 30/30] osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y Ankur Arora
2024-02-13  9:47 ` [PATCH 00/30] PREEMPT_AUTO: support lazy rescheduling Geert Uytterhoeven
2024-02-13 21:46   ` Ankur Arora
2024-02-14 23:57 ` Paul E. McKenney
2024-02-15  2:03   ` Ankur Arora
2024-02-15  3:45     ` Paul E. McKenney
2024-02-15 19:28       ` Paul E. McKenney
2024-02-15 20:04         ` Thomas Gleixner
2024-02-15 20:54           ` Paul E. McKenney
2024-02-15 20:53         ` Ankur Arora
2024-02-15 20:55           ` Paul E. McKenney
2024-02-15 21:24         ` Ankur Arora
2024-02-15 22:54           ` Paul E. McKenney
2024-02-15 22:56             ` Paul E. McKenney
2024-02-16  0:45             ` Ankur Arora
2024-02-16  2:59               ` Paul E. McKenney
2024-02-17  0:55                 ` Paul E. McKenney
2024-02-17  3:59                   ` Ankur Arora
2024-02-18 18:17                     ` Paul E. McKenney
2024-02-19 16:48                       ` Paul E. McKenney
2024-02-21 18:19                         ` Steven Rostedt
2024-02-21 19:41                           ` Paul E. McKenney
2024-02-21 20:11                             ` Steven Rostedt
2024-02-21 20:22                               ` Paul E. McKenney
2024-02-22 15:50                                 ` Mark Rutland
2024-02-22 19:11                                   ` Paul E. McKenney
2024-02-23 11:05                                     ` Mark Rutland
2024-02-23 15:31                                       ` Paul E. McKenney
2024-03-02  1:16                                         ` Paul E. McKenney
2024-03-19 11:45                                           ` Tasks RCU, ftrace, and trampolines (was: Re: [PATCH 00/30] PREEMPT_AUTO: support lazy rescheduling) Mark Rutland
2024-03-19 23:33                                             ` Paul E. McKenney
2024-02-21  6:48                   ` [PATCH 00/30] PREEMPT_AUTO: support lazy rescheduling Ankur Arora
2024-02-21 17:44                     ` Paul E. McKenney
2024-02-16  0:45             ` Ankur Arora
2024-02-21 12:23 ` Raghavendra K T
2024-02-21 17:15   ` Thomas Gleixner
2024-02-21 17:27     ` Raghavendra K T
2024-02-21 21:16       ` Ankur Arora
2024-02-22  4:05         ` Raghavendra K T
2024-02-22 21:23       ` Thomas Gleixner
2024-02-23  3:14         ` Ankur Arora
2024-02-23  6:28           ` Raghavendra K T
2024-02-24  3:15             ` Raghavendra K T
2024-02-27 17:45               ` Ankur Arora
2024-02-22 13:04     ` Raghavendra K T
2024-04-23 15:21 ` Shrikanth Hegde
2024-04-23 16:13   ` Linus Torvalds
2024-04-26  7:46     ` Shrikanth Hegde
2024-04-26 19:00       ` Ankur Arora
2024-05-07 11:16         ` Shrikanth Hegde
2024-05-08  5:18           ` Ankur Arora
2024-05-15 14:31             ` Shrikanth Hegde

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87cys0il7x.fsf@oracle.com \
    --to=ankur.a.arora@oracle.com \
    --cc=David.Laight@aculab.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrew.cooper3@citrix.com \
    --cc=anton.ivanov@cambridgegreys.com \
    --cc=bharata@amd.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=bristot@kernel.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=geert@linux-m68k.org \
    --cc=glaubitz@physik.fu-berlin.de \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=joel@joelfernandes.org \
    --cc=jon.grimm@amd.com \
    --cc=jpoimboe@kernel.org \
    --cc=juri.lelli@redhat.com \
    --cc=konrad.wilk@oracle.com \
    --cc=krypton@ulrich-teichert.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mattst88@gmail.com \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=mjguzik@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=raghavendra.kt@amd.com \
    --cc=richard@nod.at \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=vincent.guittot@linaro.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.