linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/6] sched: Fix affine_move_task() wreckage
@ 2021-02-24 12:24 Peter Zijlstra
  2021-02-24 12:24 ` [PATCH 1/6] sched: Fix migration_cpu_stop() requeueing Peter Zijlstra
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Peter Zijlstra @ 2021-02-24 12:24 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Valentin Schneider, Vincent Guittot, Mel Gorman,
	Dietmar Eggemann, linux-kernel, peterz, Andi Kleen

Hi!

The long and short of it is that commit 6d337eab041d ("sched: Fix
migrate_disable() vs set_cpus_allowed_ptr()") is utterly wrecked and it is a
miracle it doesn't insta explode for anybody (else).

The longer story is that after some initial confusion and tracing I found the
first problem and sent (patch #1):

  https://lkml.kernel.org/r/YCfLHxpL+L0BYEyG@hirez.programming.kicks-ass.net

and was hoping that was the end of it (ha!). Obviously the one machine that did
manage to trigger this instantly found the next problem, now addressed in patch
#5.

The even longer story is that Monday last I sat down with a large piece of
(virtual) paper, basically threw the entire affine_move_task() /
migration_cpu_stop() logic out and while doodling re-implemented it all.

The difficult machine was happy on the second try after that.

Of course, at that point I had a single huge rewrite of commit 6d337eab041d, and
I pondered sending it like that. However I figured that for review and
posterity it might be easier/better to do smaller steps. So today I reverse
engineered a possible logical path between the two states.

I'm hoping nothing got wrecked while doing the cleanups :-)

Patches also in:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/urgent




end of thread, other threads:[~2021-03-06 11:43 UTC | newest]

Thread overview: 27+ messages
2021-02-24 12:24 [PATCH 0/6] sched: Fix affine_move_task() wreckage Peter Zijlstra
2021-02-24 12:24 ` [PATCH 1/6] sched: Fix migration_cpu_stop() requeueing Peter Zijlstra
2021-03-01 10:16   ` [tip: sched/urgent] " tip-bot2 for Peter Zijlstra
2021-03-06 11:42   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2021-02-24 12:24 ` [PATCH 2/6] sched: Simplify migration_cpu_stop() Peter Zijlstra
2021-02-24 15:34   ` Valentin Schneider
2021-02-25  8:45     ` Peter Zijlstra
2021-02-25 11:10       ` Valentin Schneider
2021-03-01 10:16   ` [tip: sched/urgent] " tip-bot2 for Peter Zijlstra
2021-03-06 11:42   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2021-02-24 12:24 ` [PATCH 3/6] sched: Collate affine_move_task() stoppers Peter Zijlstra
2021-03-01 10:16   ` [tip: sched/urgent] " tip-bot2 for Peter Zijlstra
2021-03-06 11:42   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2021-02-24 12:24 ` [PATCH 4/6] sched: Optimize migration_cpu_stop() Peter Zijlstra
2021-03-01 10:16   ` [tip: sched/urgent] " tip-bot2 for Peter Zijlstra
2021-03-06 11:42   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2021-02-24 12:24 ` [PATCH 5/6] sched: Fix affine_move_task() self-concurrency Peter Zijlstra
2021-03-01 10:16   ` [tip: sched/urgent] " tip-bot2 for Peter Zijlstra
2021-03-06 11:42   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2021-02-24 12:24 ` [PATCH 6/6] sched: Simplify set_affinity_pending refcounts Peter Zijlstra
2021-02-24 15:34   ` Valentin Schneider
2021-02-24 15:34   ` Peter Zijlstra
2021-02-24 17:59     ` Valentin Schneider
2021-02-25  9:27       ` Peter Zijlstra
2021-02-25 11:11         ` Valentin Schneider
2021-03-01 10:16   ` [tip: sched/urgent] " tip-bot2 for Peter Zijlstra
2021-03-06 11:42   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
