From: Ankur Arora <ankur.a.arora@oracle.com>
To: paulmck@kernel.org
Cc: Ankur Arora <ankur.a.arora@oracle.com>,
linux-kernel@vger.kernel.org, tglx@linutronix.de,
peterz@infradead.org, torvalds@linux-foundation.org,
akpm@linux-foundation.org, luto@kernel.org, bp@alien8.de,
dave.hansen@linux.intel.com, hpa@zytor.com, mingo@redhat.com,
juri.lelli@redhat.com, vincent.guittot@linaro.org,
willy@infradead.org, mgorman@suse.de, jpoimboe@kernel.org,
mark.rutland@arm.com, jgross@suse.com, andrew.cooper3@citrix.com,
bristot@kernel.org, mathieu.desnoyers@efficios.com,
glaubitz@physik.fu-berlin.de, anton.ivanov@cambridgegreys.com,
mattst88@gmail.com, krypton@ulrich-teichert.org,
rostedt@goodmis.org, David.Laight@aculab.com, richard@nod.at,
jon.grimm@amd.com, bharata@amd.com, boris.ostrovsky@oracle.com,
konrad.wilk@oracle.com
Subject: Re: [PATCH 00/30] PREEMPT_AUTO: support lazy rescheduling
Date: Thu, 15 Feb 2024 16:45:17 -0800
Message-ID: <87le7lkzj6.fsf@oracle.com>
In-Reply-To: <9916c73f-510c-47a6-a9b4-ea6b438e82c0@paulmck-laptop>

Paul E. McKenney <paulmck@kernel.org> writes:

> On Thu, Feb 15, 2024 at 01:24:59PM -0800, Ankur Arora wrote:
>>
>> Paul E. McKenney <paulmck@kernel.org> writes:
>>
>> > On Wed, Feb 14, 2024 at 07:45:18PM -0800, Paul E. McKenney wrote:
>> >> On Wed, Feb 14, 2024 at 06:03:28PM -0800, Ankur Arora wrote:
>> >> >
>> >> > Paul E. McKenney <paulmck@kernel.org> writes:
>> >> >
>> >> > > On Mon, Feb 12, 2024 at 09:55:24PM -0800, Ankur Arora wrote:
>> >> > >> Hi,
>> >> > >>
>> >> > >> This series adds a new scheduling model PREEMPT_AUTO, which like
>> >> > >> PREEMPT_DYNAMIC allows dynamic switching between a none/voluntary/full
>> >> > >> preemption model. However, unlike PREEMPT_DYNAMIC, it doesn't depend
>> >> > >> on explicit preemption points for the voluntary models.
>> >> > >>
>> >> > >> The series is based on Thomas' original proposal which he outlined
>> >> > >> in [1], [2] and in his PoC [3].
>> >> > >>
>> >> > >> An earlier RFC version is at [4].
>> >> > >
>> >> > > This uncovered a couple of latent bugs in RCU due to its having been
>> >> > > a good long time since anyone built a !SMP preemptible kernel with
>> >> > > non-preemptible RCU. I have a couple of fixes queued on -rcu [1], most
>> >> > > likely for the merge window after next, but let me know if you need
>> >> > > them sooner.
>> >> >
>> >> > Thanks. As you can probably tell, I skipped out on !SMP in my testing.
>> >> > But, the attached diff should tide me over until the fixes are in.
>> >>
>> >> That was indeed my guess. ;-)
>> >>
>> >> > > I am also seeing OOM conditions during rcutorture testing of callback
>> >> > > flooding, but I am still looking into this.
>> >> >
>> >> > That's on the PREEMPT_AUTO && PREEMPT_VOLUNTARY configuration?
>> >>
>> >> On two of the PREEMPT_AUTO && PREEMPT_NONE configurations, but only on
>> >> two of them thus far. I am running a longer test to see if this might
>> >> be just luck. If not, I will look to see what rcutorture scenarios TREE10
>> >> and TRACE01 have in common.
>> >
>> > And still TRACE01 and TREE10 are hitting OOMs, still not seeing what
>> > sets them apart. I also hit a grace-period hang in TREE04, which does
>> > CONFIG_PREEMPT_VOLUNTARY=y along with CONFIG_PREEMPT_AUTO=y. Something
>> > to dig into more.
>>
>> So, the only PREEMPT_VOLUNTARY=y configuration is TREE04. I wonder
>> if you would continue to hit the TREE04 hang with CONFIG_PREEMPT_NONE=y
>> as well?
>> (Just in the interest of minimizing configurations.)
>
> I would be happy to, but in the spirit of full disclosure...
>
> First, I have seen that failure only once, which is not enough to
> conclude that it has much to do with TREE04. It might simply be low
> probability, so that TREE04 simply was unlucky enough to hit it first.
> In contrast, I have sufficient data to be reasonably confident that the
> callback-flooding OOMs really do have something to do with the TRACE01 and
> TREE10 scenarios, even though I am not yet seeing what these two scenarios
> have in common that they don't also have in common with other scenarios.
> But what is life without a bit of mystery? ;-)

:).

> Second, please see the attached tarball, which contains .csv files showing
> Kconfig options and kernel boot parameters for the various torture tests.
> The portions of the filenames preceding the "config.csv" correspond to
> the directories in tools/testing/selftests/rcutorture/configs.

So, at least some of the HZ_FULL=y tests don't run into problems.

> Third, there are additional scenarios hand-crafted by the script at
> tools/testing/selftests/rcutorture/bin/torture.sh. Thus far, none of
> them have triggered, other than via the newly increased difficulty
> of configuring a tracing-free kernel with which to test, but they
> can still be useful in ruling out particular Kconfig options or kernel
> boot parameters being related to a given issue.
>
> But please do take a look at the .csv files and let me know what
> adjustments would be appropriate given the failure information.

Nothing stands out just yet. Let me start a run here and see if
that gives me some ideas.

I'm guessing the splats don't give any useful information or
you would have attached them ;).

Thanks for testing, btw.

--
ankur