* [PATCH 0/6] sched: Fix affine_move_task() wreckage
@ 2021-02-24 12:24 Peter Zijlstra
  2021-02-24 12:24 ` [PATCH 1/6] sched: Fix migration_cpu_stop() requeueing Peter Zijlstra
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Peter Zijlstra @ 2021-02-24 12:24 UTC
  To: Ingo Molnar, Thomas Gleixner
  Cc: Valentin Schneider, Vincent Guittot, Mel Gorman,
	Dietmar Eggemann, linux-kernel, peterz, Andi Kleen

Hi!

The long and short of it is that commit 6d337eab041d ("sched: Fix
migrate_disable() vs set_cpus_allowed_ptr()") is utterly wrecked and it is a
miracle it doesn't insta explode for anybody (else).

The longer story is that after some initial confusion and tracing I found the
first problem and sent a fix (patch #1):

  https://lkml.kernel.org/r/YCfLHxpL+L0BYEyG@hirez.programming.kicks-ass.net

and was hoping that was the end of it (ha!). Obviously the one machine that did
manage to trigger this instantly found the next problem, now addressed in patch
#5.

The even longer story is that Monday last I sat down with a large piece of
(virtual) paper, basically threw the entire affine_move_task() /
migration_cpu_stop() logic out and while doodling re-implemented it all.

The difficult machine was happy on the second try after that.

Of course, at that point I had a single huge rewrite of commit 6d337eab041d, and
I pondered sending it like that. However I figured that for review and
posterity it might be easier/better to do smaller steps. So today I reverse
engineered a possible logical path between the two states.

I'm hoping nothing got wrecked while doing the cleanups :-)

Patches also in:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/urgent





Thread overview: 27+ messages
2021-02-24 12:24 [PATCH 0/6] sched: Fix affine_move_task() wreckage Peter Zijlstra
2021-02-24 12:24 ` [PATCH 1/6] sched: Fix migration_cpu_stop() requeueing Peter Zijlstra
2021-03-01 10:16   ` [tip: sched/urgent] " tip-bot2 for Peter Zijlstra
2021-03-06 11:42   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2021-02-24 12:24 ` [PATCH 2/6] sched: Simplify migration_cpu_stop() Peter Zijlstra
2021-02-24 15:34   ` Valentin Schneider
2021-02-25  8:45     ` Peter Zijlstra
2021-02-25 11:10       ` Valentin Schneider
2021-03-01 10:16   ` [tip: sched/urgent] " tip-bot2 for Peter Zijlstra
2021-03-06 11:42   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2021-02-24 12:24 ` [PATCH 3/6] sched: Collate affine_move_task() stoppers Peter Zijlstra
2021-03-01 10:16   ` [tip: sched/urgent] " tip-bot2 for Peter Zijlstra
2021-03-06 11:42   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2021-02-24 12:24 ` [PATCH 4/6] sched: Optimize migration_cpu_stop() Peter Zijlstra
2021-03-01 10:16   ` [tip: sched/urgent] " tip-bot2 for Peter Zijlstra
2021-03-06 11:42   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2021-02-24 12:24 ` [PATCH 5/6] sched: Fix affine_move_task() self-concurrency Peter Zijlstra
2021-03-01 10:16   ` [tip: sched/urgent] " tip-bot2 for Peter Zijlstra
2021-03-06 11:42   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2021-02-24 12:24 ` [PATCH 6/6] sched: Simplify set_affinity_pending refcounts Peter Zijlstra
2021-02-24 15:34   ` Valentin Schneider
2021-02-24 15:34   ` Peter Zijlstra
2021-02-24 17:59     ` Valentin Schneider
2021-02-25  9:27       ` Peter Zijlstra
2021-02-25 11:11         ` Valentin Schneider
2021-03-01 10:16   ` [tip: sched/urgent] " tip-bot2 for Peter Zijlstra
2021-03-06 11:42   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
