All of lore.kernel.org
 help / color / mirror / Atom feed
* hrtimer_interrupt::pr_warn_once() causes BUG: Invalid wait context
@ 2021-01-14 12:45 Qais Yousef
  2021-01-20 13:52 ` Qais Yousef
  0 siblings, 1 reply; 2+ messages in thread
From: Qais Yousef @ 2021-01-14 12:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: Thomas Gleixner, Peter Zijlstra (Intel)

Hi

I hit this splat today

	# [67628.388606] hrtimer: interrupt took 304720 ns
	[67628.393546]
	[67628.393550] =============================
	[67628.393554] [ BUG: Invalid wait context ]
	[67628.393557] 5.11.0-rc3-00019-g86be331946f7 #37 Not tainted
	[67628.393560] -----------------------------
	[67628.393563] sugov:0/192 is trying to lock:
	[67628.393566] ffff000800b1d898 (&port_lock_key){-.-.}-{3:3}, at: pl011_console_write+0x138/0x218
	[67628.393581] other info that might help us debug this:
	[67628.393584] context-{2:2}
	[67628.393586] 4 locks held by sugov:0/192:
	[67628.393589]  #0: ffff0008059cb720 (&sg_policy->work_lock){+.+.}-{4:4}, at: sugov_work+0x58/0x88
	[67628.393603]  #1: ffff800015446f20 (prepare_lock){+.+.}-{4:4}, at: clk_prepare_lock+0x34/0xb0
	[67628.393618]  #2: ffff8000152aaa60 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x12c/0x310
	[67628.393632]  #3: ffff8000152aab88 (console_owner){-.-.}-{0:0}, at: console_unlock+0x190/0x6d8
	[67628.393646] stack backtrace:
	[67628.393649] CPU: 0 PID: 192 Comm: sugov:0 Not tainted 5.11.0-rc3-00019-g86be331946f7 #37
	[67628.393653] Hardware name: ARM Juno development board (r2) (DT)
	[67628.393656] Call trace:
	[67628.393659]  dump_backtrace+0x0/0x1b0
	[67628.393661]  show_stack+0x20/0x70
	[67628.393664]  dump_stack+0xf8/0x168
	[67628.393666]  __lock_acquire+0x964/0x1c88
	[67628.393669]  lock_acquire+0x26c/0x500
	[67628.393671]  _raw_spin_lock+0x64/0x88
	[67628.393674]  pl011_console_write+0x138/0x218
	[67628.393677]  console_unlock+0x2c4/0x6d8
	[67628.393680]  vprintk_emit+0x134/0x310
	[67628.393682]  vprintk_default+0x40/0x50
	[67628.393685]  vprintk_func+0xfc/0x2b0
	[67628.393687]  printk+0x68/0x90
	[67628.393690]  hrtimer_interrupt+0x24c/0x250
	[67628.393693]  arch_timer_handler_phys+0x3c/0x50
	[67628.393695]  handle_percpu_devid_irq+0xd8/0x460
	[67628.393698]  generic_handle_irq+0x38/0x50
	[67628.393701]  __handle_domain_irq+0x6c/0xc8
	[67628.393704]  gic_handle_irq+0xb0/0xf0
	[67628.393706]  el1_irq+0xb4/0x180
	[67628.393709]  _raw_spin_unlock_irqrestore+0x58/0xa8
	[67628.393712]  hrtimer_start_range_ns+0x1b0/0x420
	[67628.393715]  msg_submit+0x100/0x108
	[67628.393717]  mbox_send_message+0x84/0x128
	[67628.393720]  scpi_send_message+0x11c/0x2a8
	[67628.393723]  scpi_dvfs_set_idx+0x48/0x70
	[67628.393726]  scpi_dvfs_set_rate+0x60/0x78
	[67628.393728]  clk_change_rate+0x184/0x8a8
	[67628.393731]  clk_core_set_rate_nolock+0x1c0/0x280
	[67628.393734]  clk_set_rate+0x40/0x98
	[67628.393736]  scpi_cpufreq_set_target+0x40/0x68
	[67628.393739]  __cpufreq_driver_target+0x1a0/0x5d0

AFAICT the call to pr_warn_once() is how we end up here. Not sure if there's an
appropriate or easy fix for that.

But for the sake of documenting at least, sending this report to LKML.

It was a random occurrence and not something I can reproduce.

Thanks

--
Qais Yousef

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: hrtimer_interrupt::pr_warn_once() causes BUG: Invalid wait context
  2021-01-14 12:45 hrtimer_interrupt::pr_warn_once() causes BUG: Invalid wait context Qais Yousef
@ 2021-01-20 13:52 ` Qais Yousef
  0 siblings, 0 replies; 2+ messages in thread
From: Qais Yousef @ 2021-01-20 13:52 UTC (permalink / raw)
  To: linux-kernel; +Cc: Thomas Gleixner, Peter Zijlstra (Intel)

On 01/14/21 12:45, Qais Yousef wrote:
> Hi
> 
> I hit this splat today
> 
> 	# [67628.388606] hrtimer: interrupt took 304720 ns
> 	[67628.393546]
> 	[67628.393550] =============================
> 	[67628.393554] [ BUG: Invalid wait context ]
> 	[67628.393557] 5.11.0-rc3-00019-g86be331946f7 #37 Not tainted
> 	[67628.393560] -----------------------------
> 	[67628.393563] sugov:0/192 is trying to lock:
> 	[67628.393566] ffff000800b1d898 (&port_lock_key){-.-.}-{3:3}, at: pl011_console_write+0x138/0x218
> 	[67628.393581] other info that might help us debug this:
> 	[67628.393584] context-{2:2}
> 	[67628.393586] 4 locks held by sugov:0/192:
> 	[67628.393589]  #0: ffff0008059cb720 (&sg_policy->work_lock){+.+.}-{4:4}, at: sugov_work+0x58/0x88
> 	[67628.393603]  #1: ffff800015446f20 (prepare_lock){+.+.}-{4:4}, at: clk_prepare_lock+0x34/0xb0
> 	[67628.393618]  #2: ffff8000152aaa60 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x12c/0x310
> 	[67628.393632]  #3: ffff8000152aab88 (console_owner){-.-.}-{0:0}, at: console_unlock+0x190/0x6d8
> 	[67628.393646] stack backtrace:
> 	[67628.393649] CPU: 0 PID: 192 Comm: sugov:0 Not tainted 5.11.0-rc3-00019-g86be331946f7 #37
> 	[67628.393653] Hardware name: ARM Juno development board (r2) (DT)
> 	[67628.393656] Call trace:
> 	[67628.393659]  dump_backtrace+0x0/0x1b0
> 	[67628.393661]  show_stack+0x20/0x70
> 	[67628.393664]  dump_stack+0xf8/0x168
> 	[67628.393666]  __lock_acquire+0x964/0x1c88
> 	[67628.393669]  lock_acquire+0x26c/0x500
> 	[67628.393671]  _raw_spin_lock+0x64/0x88
> 	[67628.393674]  pl011_console_write+0x138/0x218
> 	[67628.393677]  console_unlock+0x2c4/0x6d8
> 	[67628.393680]  vprintk_emit+0x134/0x310
> 	[67628.393682]  vprintk_default+0x40/0x50
> 	[67628.393685]  vprintk_func+0xfc/0x2b0
> 	[67628.393687]  printk+0x68/0x90
> 	[67628.393690]  hrtimer_interrupt+0x24c/0x250
> 	[67628.393693]  arch_timer_handler_phys+0x3c/0x50
> 	[67628.393695]  handle_percpu_devid_irq+0xd8/0x460
> 	[67628.393698]  generic_handle_irq+0x38/0x50
> 	[67628.393701]  __handle_domain_irq+0x6c/0xc8
> 	[67628.393704]  gic_handle_irq+0xb0/0xf0
> 	[67628.393706]  el1_irq+0xb4/0x180
> 	[67628.393709]  _raw_spin_unlock_irqrestore+0x58/0xa8
> 	[67628.393712]  hrtimer_start_range_ns+0x1b0/0x420
> 	[67628.393715]  msg_submit+0x100/0x108
> 	[67628.393717]  mbox_send_message+0x84/0x128
> 	[67628.393720]  scpi_send_message+0x11c/0x2a8
> 	[67628.393723]  scpi_dvfs_set_idx+0x48/0x70
> 	[67628.393726]  scpi_dvfs_set_rate+0x60/0x78
> 	[67628.393728]  clk_change_rate+0x184/0x8a8
> 	[67628.393731]  clk_core_set_rate_nolock+0x1c0/0x280
> 	[67628.393734]  clk_set_rate+0x40/0x98
> 	[67628.393736]  scpi_cpufreq_set_target+0x40/0x68
> 	[67628.393739]  __cpufreq_driver_target+0x1a0/0x5d0
> 
> AFAICT the call to pr_warn_once() is how we end up here. Not sure if there's an
> appropriate or easy fix for that.

Oh I just accidentally came across printk_deferred_once() which looks like the
right thing to do. I can't reproduce but I'll send a patch over the weekend to
see if anyone will complain why it shouldn't work.

Thanks

--
Qais Yousef

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-01-20 21:09 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-14 12:45 hrtimer_interrupt::pr_warn_once() causes BUG: Invalid wait context Qais Yousef
2021-01-20 13:52 ` Qais Yousef

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.