linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT
@ 2020-12-14  2:22 Ming Lei
  2020-12-14 15:24 ` Steven Rostedt
  0 siblings, 1 reply; 4+ messages in thread
From: Ming Lei @ 2020-12-14  2:22 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Ming Lei, Christoph Hellwig, Steven Rostedt,
	Ingo Molnar, linux-kernel

trace_note_tsk() is called by __blk_add_trace(), which is covered by RCU read lock.
So in case of PREEMPT_RT, warning of 'BUG: sleeping function called from invalid context'
will be triggered because spin lock is converted to rtmutex.

Fix the issue by converting running_trace_lock into raw_spin_lock().

Cc: Christoph Hellwig <hch@lst.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 kernel/trace/blktrace.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 2c5b3c5317c2..53dc876d669d 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -34,7 +34,7 @@ static struct trace_array *blk_tr;
 static bool blk_tracer_enabled __read_mostly;
 
 static LIST_HEAD(running_trace_list);
-static __cacheline_aligned_in_smp DEFINE_SPINLOCK(running_trace_lock);
+static __cacheline_aligned_in_smp DEFINE_RAW_SPINLOCK(running_trace_lock);
 
 /* Select an alternative, minimalistic output than the original one */
 #define TRACE_BLK_OPT_CLASSIC	0x1
@@ -121,12 +121,12 @@ static void trace_note_tsk(struct task_struct *tsk)
 	struct blk_trace *bt;
 
 	tsk->btrace_seq = blktrace_seq;
-	spin_lock_irqsave(&running_trace_lock, flags);
+	raw_spin_lock_irqsave(&running_trace_lock, flags);
 	list_for_each_entry(bt, &running_trace_list, running_list) {
 		trace_note(bt, tsk->pid, BLK_TN_PROCESS, tsk->comm,
 			   sizeof(tsk->comm), 0);
 	}
-	spin_unlock_irqrestore(&running_trace_lock, flags);
+	raw_spin_unlock_irqrestore(&running_trace_lock, flags);
 }
 
 static void trace_note_time(struct blk_trace *bt)
@@ -669,9 +669,9 @@ static int __blk_trace_startstop(struct request_queue *q, int start)
 			blktrace_seq++;
 			smp_mb();
 			bt->trace_state = Blktrace_running;
-			spin_lock_irq(&running_trace_lock);
+			raw_spin_lock_irq(&running_trace_lock);
 			list_add(&bt->running_list, &running_trace_list);
-			spin_unlock_irq(&running_trace_lock);
+			raw_spin_unlock_irq(&running_trace_lock);
 
 			trace_note_time(bt);
 			ret = 0;
@@ -679,9 +679,9 @@ static int __blk_trace_startstop(struct request_queue *q, int start)
 	} else {
 		if (bt->trace_state == Blktrace_running) {
 			bt->trace_state = Blktrace_stopped;
-			spin_lock_irq(&running_trace_lock);
+			raw_spin_lock_irq(&running_trace_lock);
 			list_del_init(&bt->running_list);
-			spin_unlock_irq(&running_trace_lock);
+			raw_spin_unlock_irq(&running_trace_lock);
 			relay_flush(bt->rchan);
 			ret = 0;
 		}
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT
  2020-12-14  2:22 [PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT Ming Lei
@ 2020-12-14 15:24 ` Steven Rostedt
  2020-12-15 12:06   ` Ming Lei
  0 siblings, 1 reply; 4+ messages in thread
From: Steven Rostedt @ 2020-12-14 15:24 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Ingo Molnar,
	linux-kernel, linux-rt-users

On Mon, 14 Dec 2020 10:22:17 +0800
Ming Lei <ming.lei@redhat.com> wrote:

> trace_note_tsk() is called by __blk_add_trace(), which is covered by RCU read lock.
> So in case of PREEMPT_RT, warning of 'BUG: sleeping function called from invalid context'
> will be triggered because spin lock is converted to rtmutex.

The RCU read_lock() can not be the cause of this issue, because under
PREEMPT_RT, rcu_read_lock() can be preempted.

What was the full back trace of this problem?

> 
> Fix the issue by converting running_trace_lock into raw_spin_lock().
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  kernel/trace/blktrace.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 2c5b3c5317c2..53dc876d669d 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -34,7 +34,7 @@ static struct trace_array *blk_tr;
>  static bool blk_tracer_enabled __read_mostly;
>  
>  static LIST_HEAD(running_trace_list);
> -static __cacheline_aligned_in_smp DEFINE_SPINLOCK(running_trace_lock);
> +static __cacheline_aligned_in_smp DEFINE_RAW_SPINLOCK(running_trace_lock);
>  
>  /* Select an alternative, minimalistic output than the original one */
>  #define TRACE_BLK_OPT_CLASSIC	0x1
> @@ -121,12 +121,12 @@ static void trace_note_tsk(struct task_struct *tsk)
>  	struct blk_trace *bt;
>  
>  	tsk->btrace_seq = blktrace_seq;
> -	spin_lock_irqsave(&running_trace_lock, flags);
> +	raw_spin_lock_irqsave(&running_trace_lock, flags);
>  	list_for_each_entry(bt, &running_trace_list, running_list) {
>  		trace_note(bt, tsk->pid, BLK_TN_PROCESS, tsk->comm,
>  			   sizeof(tsk->comm), 0);
>  	}

How big is this running_trace_list? May not be something we want raw locks
around.

Please understand that converting locks to raw should be the last resort.
One should always look at the reason for a spin lock in a preempt disabled
area and see if there's other means of solving it before simply switch a
lock to raw, as each raw spinlock makes PREEMPT_RT less real time.

-- Steve


> -	spin_unlock_irqrestore(&running_trace_lock, flags);
> +	raw_spin_unlock_irqrestore(&running_trace_lock, flags);
>  }
>  
>  static void trace_note_time(struct blk_trace *bt)
> @@ -669,9 +669,9 @@ static int __blk_trace_startstop(struct request_queue *q, int start)
>  			blktrace_seq++;
>  			smp_mb();
>  			bt->trace_state = Blktrace_running;
> -			spin_lock_irq(&running_trace_lock);
> +			raw_spin_lock_irq(&running_trace_lock);
>  			list_add(&bt->running_list, &running_trace_list);
> -			spin_unlock_irq(&running_trace_lock);
> +			raw_spin_unlock_irq(&running_trace_lock);
>  
>  			trace_note_time(bt);
>  			ret = 0;
> @@ -679,9 +679,9 @@ static int __blk_trace_startstop(struct request_queue *q, int start)
>  	} else {
>  		if (bt->trace_state == Blktrace_running) {
>  			bt->trace_state = Blktrace_stopped;
> -			spin_lock_irq(&running_trace_lock);
> +			raw_spin_lock_irq(&running_trace_lock);
>  			list_del_init(&bt->running_list);
> -			spin_unlock_irq(&running_trace_lock);
> +			raw_spin_unlock_irq(&running_trace_lock);
>  			relay_flush(bt->rchan);
>  			ret = 0;
>  		}


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT
  2020-12-14 15:24 ` Steven Rostedt
@ 2020-12-15 12:06   ` Ming Lei
  2020-12-15 16:07     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 4+ messages in thread
From: Ming Lei @ 2020-12-15 12:06 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Ingo Molnar,
	linux-kernel, linux-rt-users

On Mon, Dec 14, 2020 at 10:24:22AM -0500, Steven Rostedt wrote:
> On Mon, 14 Dec 2020 10:22:17 +0800
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > trace_note_tsk() is called by __blk_add_trace(), which is covered by RCU read lock.
> > So in case of PREEMPT_RT, warning of 'BUG: sleeping function called from invalid context'
> > will be triggered because spin lock is converted to rtmutex.
> 
> The RCU read_lock() can not be the cause of this issue, because under
> PREEMPT_RT, rcu_read_lock() can be preempted.

OK, got it.

> 
> What was the full back trace of this problem?

[  284.527619] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:968
[  284.527626] in_atomic(): 1, irqs_disabled(): 0, pid: 6705, name: mandb
[  284.527631] 4 locks held by mandb/6705:
[  284.527634]  #0: ffff88853f4f2eb0 (&f->f_pos_lock){+.+.}-{0:0}, at: __fdget_pos+0xaf/0xe0
[  284.527662]  #1: ffff8885eb814048 (&sb->s_type->i_mutex_key#13){++++}-{0:0}, at: xfs_ilock+0x17d/0x590 [xfs]
[  284.527838]  #2: ffffffffa24d22c0 (rcu_read_lock){....}-{1:2}, at: blk_add_trace_bio+0x0/0x2f0
[  284.527859]  #3: ffffffffa22177a0 (running_trace_lock){+.+.}-{2:2}, at: __blk_add_trace+0x9ea/0xdf0
[  284.527877] Preemption disabled at:
[  284.527886] [<ffffffff9f936743>] get_lock_stats+0x13/0x120
[  284.527897] CPU: 53 PID: 6705 Comm: mandb Kdump: loaded Not tainted 4.18.0-259.rt7.24.el8.x86_64+debug #1
[  284.527901] Hardware name: Lenovo ThinkSystem SR630 -[7X02CTO1WW]-/-[7X02CTO1WW]-, BIOS -[IVE136T-2.10]- 03/22/2019
[  284.527904] Call Trace:
[  284.527918]  dump_stack+0x8e/0xd0
[  284.527934]  ? get_lock_stats+0x13/0x120
[  284.527944]  ___might_sleep.cold.58+0x54/0x77
[  284.527964]  rt_spin_lock+0x3e/0x120
[  284.527969]  ? __blk_add_trace+0x9ea/0xdf0
[  284.527980]  __blk_add_trace+0x9ea/0xdf0
[  284.528025]  ? rcu_read_lock_sched_held+0xe0/0xe0
[  284.528048]  blk_add_trace_bio+0x25d/0x2f0
[  284.528074]  ? recalibrate_cpu_khz+0x10/0x10
[  284.528094]  ? blk_add_trace_bio_frontmerge+0x20/0x20
[  284.528107]  generic_make_request_checks+0x1471/0x1cc0
[  284.528143]  ? blk_update_request+0x1610/0x1610
[  284.528172]  ? iomap_readpages_actor+0x4f3/0x900
[  284.528208]  generic_make_request+0x8e/0xa40
[  284.528255]  ? direct_make_request+0x2b0/0x2b0
[  284.528260]  ? iomap_readpage_actor+0xd40/0xd40
[  284.528267]  ? iomap_readpages+0x20b/0x950
[  284.528277]  ? bad_range+0x237/0x410
[  284.528288]  ? trace_event_raw_event_iomap_apply+0x430/0x430
[  284.528329]  ? submit_bio+0xcb/0x440
[  284.528338]  submit_bio+0xcb/0x440
[  284.528358]  ? generic_make_request+0xa40/0xa40
[  284.528387]  ? rcu_read_lock_bh_held+0x100/0x100
[  284.528412]  ? iomap_readpage_actor+0xd40/0xd40
[  284.528422]  iomap_readpages+0x22f/0x950
[  284.528442]  ? iomap_writepage+0xa0/0xa0
[  284.528505]  read_pages+0xf8/0x600
[  284.528535]  ? __alloc_pages_nodemask+0x80d/0xa20
[  284.528554]  ? read_cache_pages+0x6d0/0x6d0
[  284.528560]  ? __alloc_pages_slowpath+0x2a10/0x2a10
[  284.528621]  __do_page_cache_readahead+0x311/0x3c0
[  284.528663]  ? read_pages+0x600/0x600
[  284.528712]  ondemand_readahead+0x443/0xc00
[  284.528717]  ? page_cache_sync_readahead+0x18a/0x3f0
[  284.528757]  generic_file_buffered_read+0xd65/0x1ee0
[  284.528850]  ? read_cache_page_gfp+0xa0/0xa0
[  284.529003]  ? xfs_ilock+0x17d/0x590 [xfs]
[  284.529009]  ? get_lock_stats+0x18/0x120
[  284.529018]  ? put_lock_stats.isra.19+0xb/0xa0
[  284.529045]  ? lock_contended+0xa70/0xa70
[  284.529069]  ? __down_read_trylock+0x53/0x80
[  284.529222]  ? xfs_file_buffered_aio_read+0x146/0x4f0 [xfs]
[  284.529381]  xfs_file_buffered_aio_read+0x151/0x4f0 [xfs]
[  284.529534]  xfs_file_read_iter+0x2bf/0x740 [xfs]
[  284.529545]  ? rt_mutex_slowlock_locked+0x262/0x790
[  284.529572]  new_sync_read+0x39c/0x560
[  284.529582]  ? mark_held_locks+0xb7/0x120
[  284.529594]  ? do_iter_readv_writev+0x6e0/0x6e0
[  284.529660]  ? fsnotify_first_mark+0x150/0x150
[  284.529717]  vfs_read+0x147/0x350
[  284.529745]  ksys_read+0xb8/0x170
[  284.529759]  ? kernel_write+0x130/0x130
[  284.529809]  do_syscall_64+0x13c/0x670
[  284.529830]  entry_SYSCALL_64_after_hwframe+0x6a/0xdf

Thanks,
Ming


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT
  2020-12-15 12:06   ` Ming Lei
@ 2020-12-15 16:07     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-15 16:07 UTC (permalink / raw)
  To: Ming Lei
  Cc: Steven Rostedt, Jens Axboe, linux-block, Christoph Hellwig,
	Ingo Molnar, linux-kernel, linux-rt-users

On 2020-12-15 20:06:04 [+0800], Ming Lei wrote:
> 
> [  284.527619] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:968
> [  284.527626] in_atomic(): 1, irqs_disabled(): 0, pid: 6705, name: mandb
> [  284.527631] 4 locks held by mandb/6705:
> [  284.527634]  #0: ffff88853f4f2eb0 (&f->f_pos_lock){+.+.}-{0:0}, at: __fdget_pos+0xaf/0xe0
> [  284.527662]  #1: ffff8885eb814048 (&sb->s_type->i_mutex_key#13){++++}-{0:0}, at: xfs_ilock+0x17d/0x590 [xfs]
> [  284.527838]  #2: ffffffffa24d22c0 (rcu_read_lock){....}-{1:2}, at: blk_add_trace_bio+0x0/0x2f0
> [  284.527859]  #3: ffffffffa22177a0 (running_trace_lock){+.+.}-{2:2}, at: __blk_add_trace+0x9ea/0xdf0
> [  284.527877] Preemption disabled at:
> [  284.527886] [<ffffffff9f936743>] get_lock_stats+0x13/0x120
> [  284.527897] CPU: 53 PID: 6705 Comm: mandb Kdump: loaded Not tainted 4.18.0-259.rt7.24.el8.x86_64+debug #1

4.18?
I just gave it a try with 5.10-RT and I don't see a splat. Please retest
with v5.10-RT.

> 
> Thanks,
> Ming

Sebastian

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-12-15 16:09 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-14  2:22 [PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT Ming Lei
2020-12-14 15:24 ` Steven Rostedt
2020-12-15 12:06   ` Ming Lei
2020-12-15 16:07     ` Sebastian Andrzej Siewior

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).