* [3.14] core onlining/hotplug regression
From: Daniel J Blueman @ 2014-07-25  7:50 UTC (permalink / raw)
  To: Oleg Nesterov, Thomas Gleixner, Peter Zijlstra, Hillf Danton
  Cc: Borislav Petkov, Ingo Molnar, Igor Mammedov, Steffen Persvold, LKML

Hi Thomas et al,

On a large x86 system with 1728 cores, 3.15(.6) consistently hits the
td->cpu != smp_processor_id() assertion in smpboot_thread_fn() after
~1500 cores are online.

Reverting the only directly related changes I could find [1,2] doesn't
help. Debugging points to a race in which the newly created thread is
quickly migrated to core 0, since smp_processor_id() returns 0 whenever
the assertion fires. Thomas introduced a parked thread state to fix
related issues about a year ago. Linux 3.14(.13) boots just fine.
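
To make the failing check concrete, here is a rough userspace analogue,
not the kernel code. It assumes Linux with glibc: sched_getcpu() stands
in for smp_processor_id(), and the helper names (current_cpu,
run_bound_worker) are made up for illustration. A worker intended for
one CPU records where it actually runs; the assertion in the report
corresponds to "expected" and "actual" disagreeing.

```python
import ctypes
import os
import threading

# Userspace analogue only: glibc's sched_getcpu() stands in for the
# kernel's smp_processor_id(). Linux with glibc is assumed.
_libc = ctypes.CDLL(None, use_errno=True)


def current_cpu() -> int:
    """Return the CPU the calling thread is running on right now."""
    cpu = _libc.sched_getcpu()
    if cpu < 0:
        raise OSError(ctypes.get_errno(), "sched_getcpu failed")
    return cpu


def run_bound_worker(target_cpu: int, pin: bool = True) -> dict:
    """Start a worker meant for target_cpu and report where it really ran.

    Hypothetical helper: 'expected' plays the role of td->cpu and
    'actual' the role of smp_processor_id() in the reported check.
    """
    result = {}

    def worker():
        if pin:
            # On Linux, pid 0 applies the mask to the calling thread.
            os.sched_setaffinity(0, {target_cpu})
        result["expected"] = target_cpu
        result["actual"] = current_cpu()
        # The kernel check fires when these two values differ.
        result["ok"] = result["expected"] == result["actual"]

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result


if __name__ == "__main__":
    # Pick a CPU this process is allowed to run on.
    cpu = min(os.sched_getaffinity(0))
    print(run_bound_worker(cpu))
```

Calling run_bound_worker(cpu, pin=False) skips the affinity step, so the
worker may land on any allowed CPU. That loosely mirrors a per-CPU
thread running somewhere other than the CPU it was created for, which is
the situation the report describes.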

Full boot output is at:
https://resources.numascale.com/linux-315-thread-mig.txt

Any theories so far? I'll start bisecting when I have full access to the
system again in a week; until then I'll do some more debugging with the
intermittent access I have.

Thanks,
   Daniel

-- [1]

commit 81c98869faa5f3a9457c93efef908ef476326b31
Author: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Date:   Thu Apr 3 14:46:25 2014 -0700

     kthread: ensure locality of task_struct allocations

-- [2]

commit 89f898c1e195fa6235c869bb457e500b7b3ac49d
Author: Igor Mammedov <imammedo@redhat.com>
Date:   Thu Jun 5 15:42:43 2014 +0200

     x86: Fix list/memory corruption on CPU hotplug
-- 
Daniel J Blueman
Principal Software Engineer, Numascale
