* [PATCH RFC] panic: Avoid extra noisy messages due to stopped cpus
@ 2018-10-11  7:17 Feng Tang
From: Feng Tang @ 2018-10-11  7:17 UTC
  To: Thomas Gleixner, Ingo Molnar, H Peter Anvin, Borislav Petkov,
	Peter Zijlstra, Andrew Morton, linux-kernel
  Cc: Feng Tang

Sometimes when debugging a kernel panic, we see many extra noisy error
messages after the expected end:

[   35.743249] ---[ end Kernel panic - not syncing: Fatal exception
[   35.749975] ------------[ cut here ]------------

These messages may overflow the screen (framebuffer) and make debugging
much more difficult.

This hack patch is just a quick way to suppress these noisy messages, and
I would really like to get some comments and suggestions.

I have also tried other approaches, such as adding a panic notifier block
inside the tick/sched code to cancel the tick_sched timer in the panic
case, which also works.

These extra messages are of 2 kinds:
a)
	 WARNING: CPU: 1 PID: 280 at kernel/sched/core.c:1198 set_task_cpu+0x183/0x190
	 Call Trace:
	  <IRQ>
	  try_to_wake_up+0x157/0x430
	  default_wake_function+0xd/0x10
	  autoremove_wake_function+0x11/0x60
	  __wake_up_common+0x8a/0x160
	  __wake_up_common_lock+0x6c/0x90
	  __wake_up+0xe/0x10
	  wake_up_klogd_work_func+0x3b/0x60
	  irq_work_run_list+0x4e/0x80
	  irq_work_tick+0x40/0x50
	  update_process_times+0x3d/0x50
	  tick_sched_timer+0x38/0x80
	  __hrtimer_run_queues+0xce/0x200
	  hrtimer_interrupt+0xac/0x1f0
	  smp_apic_timer_interrupt+0x6e/0x140
	  apic_timer_interrupt+0x8e/0xa0

b)
	sched: Unexpected reschedule of offline CPU#0!
	 ------------[ cut here ]------------
	 WARNING: CPU: 1 PID: 300 at arch/x86/kernel/smp.c:141 native_smp_send_reschedule+0x3d/0x50
	  trigger_load_balance+0x125/0x230
	  scheduler_tick+0xa2/0xd0
	  update_process_times+0x42/0x50
	  tick_sched_handle.isra.5+0x21/0x60
	  tick_sched_timer+0x38/0x80
	  __hrtimer_run_queues+0xce/0x200
	  hrtimer_interrupt+0xac/0x1f0
	  smp_apic_timer_interrupt+0x6e/0x140
	  apic_timer_interrupt+0x8e/0xa0

Signed-off-by: Feng Tang <feng.tang@intel.com>
---
 arch/x86/kernel/process.c | 1 +
 kernel/sched/fair.c       | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index c93fcfd..b703862 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -520,6 +520,7 @@ void stop_this_cpu(void *dummy)
 	 * Remove this CPU:
 	 */
 	set_cpu_online(smp_processor_id(), false);
+	set_cpu_active(smp_processor_id(), false);
 	disable_local_APIC();
 	mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7fc4a37..cf41b7b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9034,7 +9034,7 @@ static inline int find_new_ilb(void)
 {
 	int ilb = cpumask_first(nohz.idle_cpus_mask);
 
-	if (ilb < nr_cpu_ids && idle_cpu(ilb))
+	if (ilb < nr_cpu_ids && idle_cpu(ilb) && cpu_online(ilb))
 		return ilb;
 
 	return nr_cpu_ids;
-- 
2.7.4
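
To illustrate the effect of the find_new_ilb() hunk, here is a minimal
userspace sketch, not kernel code. It assumes plain 32-bit bitmasks as
stand-ins for nohz.idle_cpus_mask, the idle_cpu() result and the online
mask; struct cpu_model, mask_first(), mask_test() and NR_CPU_IDS are
hypothetical names introduced only for this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPU_IDS 8u /* hypothetical CPU count for this model */

/* Hypothetical userspace stand-ins for the kernel's per-CPU state. */
struct cpu_model {
	uint32_t nohz_idle_mask; /* models nohz.idle_cpus_mask */
	uint32_t idle_mask;      /* CPUs for which idle_cpu() would be true */
	uint32_t online_mask;    /* CPUs for which cpu_online() would be true */
};

/* Lowest set bit in mask, or NR_CPU_IDS if none (like cpumask_first()). */
unsigned int mask_first(uint32_t mask)
{
	for (unsigned int cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (mask & (UINT32_C(1) << cpu))
			return cpu;
	return NR_CPU_IDS;
}

/* True if the bit for cpu is set in mask. */
bool mask_test(uint32_t mask, unsigned int cpu)
{
	assert(cpu < NR_CPU_IDS);
	return (mask >> cpu) & 1u;
}

/* Check before the patch: the candidate only has to be idle. */
unsigned int find_new_ilb_old(const struct cpu_model *m)
{
	unsigned int ilb = mask_first(m->nohz_idle_mask);

	if (ilb < NR_CPU_IDS && mask_test(m->idle_mask, ilb))
		return ilb;
	return NR_CPU_IDS;
}

/* Check after the patch: the candidate must also be online. */
unsigned int find_new_ilb_patched(const struct cpu_model *m)
{
	unsigned int ilb = mask_first(m->nohz_idle_mask);

	if (ilb < NR_CPU_IDS && mask_test(m->idle_mask, ilb) &&
	    mask_test(m->online_mask, ilb))
		return ilb;
	return NR_CPU_IDS;
}
```

In this model, a stopped CPU 0 that is still set in the nohz idle mask
but cleared from the online mask is returned by find_new_ilb_old() and
rejected by find_new_ilb_patched(), which corresponds to the "Unexpected
reschedule of offline CPU#0" case in (b).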



Thread overview: 7+ messages
2018-10-11  7:17 [PATCH RFC] panic: Avoid extra noisy messages due to stopped cpus Feng Tang
2018-10-11  9:35 ` Peter Zijlstra
2018-10-11  9:59   ` Feng Tang
2018-10-22  9:55     ` Feng Tang
2018-11-08 13:05       ` [PATCH v2] panic: Avoid the extra noise dmesg Feng Tang
2018-11-26 23:59         ` Andrew Morton
2018-11-27  3:32           ` Feng Tang
