* [PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT
@ 2020-12-14 2:22 Ming Lei
2020-12-14 15:24 ` Steven Rostedt
0 siblings, 1 reply; 4+ messages in thread
From: Ming Lei @ 2020-12-14 2:22 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Ming Lei, Christoph Hellwig, Steven Rostedt,
Ingo Molnar, linux-kernel
trace_note_tsk() is called by __blk_add_trace(), which is covered by RCU read lock.
So in case of PREEMPT_RT, warning of 'BUG: sleeping function called from invalid context'
will be triggered because spin lock is converted to rtmutex.
Fix the issue by converting running_trace_lock into raw_spin_lock().
Cc: Christoph Hellwig <hch@lst.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
kernel/trace/blktrace.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 2c5b3c5317c2..53dc876d669d 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -34,7 +34,7 @@ static struct trace_array *blk_tr;
static bool blk_tracer_enabled __read_mostly;
static LIST_HEAD(running_trace_list);
-static __cacheline_aligned_in_smp DEFINE_SPINLOCK(running_trace_lock);
+static __cacheline_aligned_in_smp DEFINE_RAW_SPINLOCK(running_trace_lock);
/* Select an alternative, minimalistic output than the original one */
#define TRACE_BLK_OPT_CLASSIC 0x1
@@ -121,12 +121,12 @@ static void trace_note_tsk(struct task_struct *tsk)
struct blk_trace *bt;
tsk->btrace_seq = blktrace_seq;
- spin_lock_irqsave(&running_trace_lock, flags);
+ raw_spin_lock_irqsave(&running_trace_lock, flags);
list_for_each_entry(bt, &running_trace_list, running_list) {
trace_note(bt, tsk->pid, BLK_TN_PROCESS, tsk->comm,
sizeof(tsk->comm), 0);
}
- spin_unlock_irqrestore(&running_trace_lock, flags);
+ raw_spin_unlock_irqrestore(&running_trace_lock, flags);
}
static void trace_note_time(struct blk_trace *bt)
@@ -669,9 +669,9 @@ static int __blk_trace_startstop(struct request_queue *q, int start)
blktrace_seq++;
smp_mb();
bt->trace_state = Blktrace_running;
- spin_lock_irq(&running_trace_lock);
+ raw_spin_lock_irq(&running_trace_lock);
list_add(&bt->running_list, &running_trace_list);
- spin_unlock_irq(&running_trace_lock);
+ raw_spin_unlock_irq(&running_trace_lock);
trace_note_time(bt);
ret = 0;
@@ -679,9 +679,9 @@ static int __blk_trace_startstop(struct request_queue *q, int start)
} else {
if (bt->trace_state == Blktrace_running) {
bt->trace_state = Blktrace_stopped;
- spin_lock_irq(&running_trace_lock);
+ raw_spin_lock_irq(&running_trace_lock);
list_del_init(&bt->running_list);
- spin_unlock_irq(&running_trace_lock);
+ raw_spin_unlock_irq(&running_trace_lock);
relay_flush(bt->rchan);
ret = 0;
}
--
2.28.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT
2020-12-14 2:22 [PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT Ming Lei
@ 2020-12-14 15:24 ` Steven Rostedt
2020-12-15 12:06 ` Ming Lei
0 siblings, 1 reply; 4+ messages in thread
From: Steven Rostedt @ 2020-12-14 15:24 UTC (permalink / raw)
To: Ming Lei
Cc: Jens Axboe, linux-block, Christoph Hellwig, Ingo Molnar,
linux-kernel, linux-rt-users
On Mon, 14 Dec 2020 10:22:17 +0800
Ming Lei <ming.lei@redhat.com> wrote:
> trace_note_tsk() is called by __blk_add_trace(), which is covered by RCU read lock.
> So in case of PREEMPT_RT, warning of 'BUG: sleeping function called from invalid context'
> will be triggered because spin lock is converted to rtmutex.
The RCU read_lock() can not be the cause of this issue, because under
PREEMPT_RT, rcu_read_lock() can be preempted.
What was the full back trace of this problem?
>
> Fix the issue by converting running_trace_lock into raw_spin_lock().
>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
> kernel/trace/blktrace.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 2c5b3c5317c2..53dc876d669d 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -34,7 +34,7 @@ static struct trace_array *blk_tr;
> static bool blk_tracer_enabled __read_mostly;
>
> static LIST_HEAD(running_trace_list);
> -static __cacheline_aligned_in_smp DEFINE_SPINLOCK(running_trace_lock);
> +static __cacheline_aligned_in_smp DEFINE_RAW_SPINLOCK(running_trace_lock);
>
> /* Select an alternative, minimalistic output than the original one */
> #define TRACE_BLK_OPT_CLASSIC 0x1
> @@ -121,12 +121,12 @@ static void trace_note_tsk(struct task_struct *tsk)
> struct blk_trace *bt;
>
> tsk->btrace_seq = blktrace_seq;
> - spin_lock_irqsave(&running_trace_lock, flags);
> + raw_spin_lock_irqsave(&running_trace_lock, flags);
> list_for_each_entry(bt, &running_trace_list, running_list) {
> trace_note(bt, tsk->pid, BLK_TN_PROCESS, tsk->comm,
> sizeof(tsk->comm), 0);
> }
How big is this running_trace_list? May not be something we want raw locks
around.
Please understand that converting locks to raw should be the last resort.
One should always look at the reason for a spin lock in a preempt disabled
area and see if there's other means of solving it before simply switch a
lock to raw, as each raw spinlock makes PREEMPT_RT less real time.
-- Steve
> - spin_unlock_irqrestore(&running_trace_lock, flags);
> + raw_spin_unlock_irqrestore(&running_trace_lock, flags);
> }
>
> static void trace_note_time(struct blk_trace *bt)
> @@ -669,9 +669,9 @@ static int __blk_trace_startstop(struct request_queue *q, int start)
> blktrace_seq++;
> smp_mb();
> bt->trace_state = Blktrace_running;
> - spin_lock_irq(&running_trace_lock);
> + raw_spin_lock_irq(&running_trace_lock);
> list_add(&bt->running_list, &running_trace_list);
> - spin_unlock_irq(&running_trace_lock);
> + raw_spin_unlock_irq(&running_trace_lock);
>
> trace_note_time(bt);
> ret = 0;
> @@ -679,9 +679,9 @@ static int __blk_trace_startstop(struct request_queue *q, int start)
> } else {
> if (bt->trace_state == Blktrace_running) {
> bt->trace_state = Blktrace_stopped;
> - spin_lock_irq(&running_trace_lock);
> + raw_spin_lock_irq(&running_trace_lock);
> list_del_init(&bt->running_list);
> - spin_unlock_irq(&running_trace_lock);
> + raw_spin_unlock_irq(&running_trace_lock);
> relay_flush(bt->rchan);
> ret = 0;
> }
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT
2020-12-14 15:24 ` Steven Rostedt
@ 2020-12-15 12:06 ` Ming Lei
2020-12-15 16:07 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 4+ messages in thread
From: Ming Lei @ 2020-12-15 12:06 UTC (permalink / raw)
To: Steven Rostedt
Cc: Jens Axboe, linux-block, Christoph Hellwig, Ingo Molnar,
linux-kernel, linux-rt-users
On Mon, Dec 14, 2020 at 10:24:22AM -0500, Steven Rostedt wrote:
> On Mon, 14 Dec 2020 10:22:17 +0800
> Ming Lei <ming.lei@redhat.com> wrote:
>
> > trace_note_tsk() is called by __blk_add_trace(), which is covered by RCU read lock.
> > So in case of PREEMPT_RT, warning of 'BUG: sleeping function called from invalid context'
> > will be triggered because spin lock is converted to rtmutex.
>
> The RCU read_lock() can not be the cause of this issue, because under
> PREEMPT_RT, rcu_read_lock() can be preempted.
OK, got it.
>
> What was the full back trace of this problem?
[ 284.527619] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:968
[ 284.527626] in_atomic(): 1, irqs_disabled(): 0, pid: 6705, name: mandb
[ 284.527631] 4 locks held by mandb/6705:
[ 284.527634] #0: ffff88853f4f2eb0 (&f->f_pos_lock){+.+.}-{0:0}, at: __fdget_pos+0xaf/0xe0
[ 284.527662] #1: ffff8885eb814048 (&sb->s_type->i_mutex_key#13){++++}-{0:0}, at: xfs_ilock+0x17d/0x590 [xfs]
[ 284.527838] #2: ffffffffa24d22c0 (rcu_read_lock){....}-{1:2}, at: blk_add_trace_bio+0x0/0x2f0
[ 284.527859] #3: ffffffffa22177a0 (running_trace_lock){+.+.}-{2:2}, at: __blk_add_trace+0x9ea/0xdf0
[ 284.527877] Preemption disabled at:
[ 284.527886] [<ffffffff9f936743>] get_lock_stats+0x13/0x120
[ 284.527897] CPU: 53 PID: 6705 Comm: mandb Kdump: loaded Not tainted 4.18.0-259.rt7.24.el8.x86_64+debug #1
[ 284.527901] Hardware name: Lenovo ThinkSystem SR630 -[7X02CTO1WW]-/-[7X02CTO1WW]-, BIOS -[IVE136T-2.10]- 03/22/2019
[ 284.527904] Call Trace:
[ 284.527918] dump_stack+0x8e/0xd0
[ 284.527934] ? get_lock_stats+0x13/0x120
[ 284.527944] ___might_sleep.cold.58+0x54/0x77
[ 284.527964] rt_spin_lock+0x3e/0x120
[ 284.527969] ? __blk_add_trace+0x9ea/0xdf0
[ 284.527980] __blk_add_trace+0x9ea/0xdf0
[ 284.528025] ? rcu_read_lock_sched_held+0xe0/0xe0
[ 284.528048] blk_add_trace_bio+0x25d/0x2f0
[ 284.528074] ? recalibrate_cpu_khz+0x10/0x10
[ 284.528094] ? blk_add_trace_bio_frontmerge+0x20/0x20
[ 284.528107] generic_make_request_checks+0x1471/0x1cc0
[ 284.528143] ? blk_update_request+0x1610/0x1610
[ 284.528172] ? iomap_readpages_actor+0x4f3/0x900
[ 284.528208] generic_make_request+0x8e/0xa40
[ 284.528255] ? direct_make_request+0x2b0/0x2b0
[ 284.528260] ? iomap_readpage_actor+0xd40/0xd40
[ 284.528267] ? iomap_readpages+0x20b/0x950
[ 284.528277] ? bad_range+0x237/0x410
[ 284.528288] ? trace_event_raw_event_iomap_apply+0x430/0x430
[ 284.528329] ? submit_bio+0xcb/0x440
[ 284.528338] submit_bio+0xcb/0x440
[ 284.528358] ? generic_make_request+0xa40/0xa40
[ 284.528387] ? rcu_read_lock_bh_held+0x100/0x100
[ 284.528412] ? iomap_readpage_actor+0xd40/0xd40
[ 284.528422] iomap_readpages+0x22f/0x950
[ 284.528442] ? iomap_writepage+0xa0/0xa0
[ 284.528505] read_pages+0xf8/0x600
[ 284.528535] ? __alloc_pages_nodemask+0x80d/0xa20
[ 284.528554] ? read_cache_pages+0x6d0/0x6d0
[ 284.528560] ? __alloc_pages_slowpath+0x2a10/0x2a10
[ 284.528621] __do_page_cache_readahead+0x311/0x3c0
[ 284.528663] ? read_pages+0x600/0x600
[ 284.528712] ondemand_readahead+0x443/0xc00
[ 284.528717] ? page_cache_sync_readahead+0x18a/0x3f0
[ 284.528757] generic_file_buffered_read+0xd65/0x1ee0
[ 284.528850] ? read_cache_page_gfp+0xa0/0xa0
[ 284.529003] ? xfs_ilock+0x17d/0x590 [xfs]
[ 284.529009] ? get_lock_stats+0x18/0x120
[ 284.529018] ? put_lock_stats.isra.19+0xb/0xa0
[ 284.529045] ? lock_contended+0xa70/0xa70
[ 284.529069] ? __down_read_trylock+0x53/0x80
[ 284.529222] ? xfs_file_buffered_aio_read+0x146/0x4f0 [xfs]
[ 284.529381] xfs_file_buffered_aio_read+0x151/0x4f0 [xfs]
[ 284.529534] xfs_file_read_iter+0x2bf/0x740 [xfs]
[ 284.529545] ? rt_mutex_slowlock_locked+0x262/0x790
[ 284.529572] new_sync_read+0x39c/0x560
[ 284.529582] ? mark_held_locks+0xb7/0x120
[ 284.529594] ? do_iter_readv_writev+0x6e0/0x6e0
[ 284.529660] ? fsnotify_first_mark+0x150/0x150
[ 284.529717] vfs_read+0x147/0x350
[ 284.529745] ksys_read+0xb8/0x170
[ 284.529759] ? kernel_write+0x130/0x130
[ 284.529809] do_syscall_64+0x13c/0x670
[ 284.529830] entry_SYSCALL_64_after_hwframe+0x6a/0xdf
Thanks,
Ming
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT
2020-12-15 12:06 ` Ming Lei
@ 2020-12-15 16:07 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-15 16:07 UTC (permalink / raw)
To: Ming Lei
Cc: Steven Rostedt, Jens Axboe, linux-block, Christoph Hellwig,
Ingo Molnar, linux-kernel, linux-rt-users
On 2020-12-15 20:06:04 [+0800], Ming Lei wrote:
>
> [ 284.527619] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:968
> [ 284.527626] in_atomic(): 1, irqs_disabled(): 0, pid: 6705, name: mandb
> [ 284.527631] 4 locks held by mandb/6705:
> [ 284.527634] #0: ffff88853f4f2eb0 (&f->f_pos_lock){+.+.}-{0:0}, at: __fdget_pos+0xaf/0xe0
> [ 284.527662] #1: ffff8885eb814048 (&sb->s_type->i_mutex_key#13){++++}-{0:0}, at: xfs_ilock+0x17d/0x590 [xfs]
> [ 284.527838] #2: ffffffffa24d22c0 (rcu_read_lock){....}-{1:2}, at: blk_add_trace_bio+0x0/0x2f0
> [ 284.527859] #3: ffffffffa22177a0 (running_trace_lock){+.+.}-{2:2}, at: __blk_add_trace+0x9ea/0xdf0
> [ 284.527877] Preemption disabled at:
> [ 284.527886] [<ffffffff9f936743>] get_lock_stats+0x13/0x120
> [ 284.527897] CPU: 53 PID: 6705 Comm: mandb Kdump: loaded Not tainted 4.18.0-259.rt7.24.el8.x86_64+debug #1
4.18?
I just gave it a try with 5.10-RT and I don't see a splat. Please retest
with v5.10-RT.
>
> Thanks,
> Ming
Sebastian
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2020-12-15 16:09 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-14 2:22 [PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT Ming Lei
2020-12-14 15:24 ` Steven Rostedt
2020-12-15 12:06 ` Ming Lei
2020-12-15 16:07 ` Sebastian Andrzej Siewior
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).