* [PATCH 0/2] locking, tracing: Fix incorrect use of arch_spin_lock()
@ 2022-09-21 13:21 Waiman Long
  2022-09-21 13:21 ` [PATCH 1/2] locking: Provide a low overhead do_arch_spin_lock() API Waiman Long
  2022-09-21 13:21 ` [PATCH 2/2] tracing: Use proper " Waiman Long
  0 siblings, 2 replies; 11+ messages in thread
From: Waiman Long @ 2022-09-21 13:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Steven Rostedt
  Cc: linux-kernel, Waiman Long

It was found that running the LTP read_all_proc test can cause the
following warning to show up:

[12512.905036] BUG: using smp_processor_id() in preemptible [00000000] code: read_all/133711
[12512.913499] caller is __pv_queued_spin_lock_slowpath+0x7f/0xd30
[12512.921163] CPU: 3 PID: 133711 Comm: read_all Kdump: loaded Tainted: G OE --------- --- 5.14.0-163.el9.x86_64+debug #1
[12512.936652] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[12512.944213] Call Trace:
[12512.950660]  dump_stack_lvl+0x57/0x81
[12512.957400]  check_preemption_disabled+0xcc/0xd0
[12512.964487]  __pv_queued_spin_lock_slowpath+0x7f/0xd30
[12512.971552]  ? pv_hash+0x110/0x110
[12512.978119]  ? __lock_acquire+0xb72/0x1870
[12512.984683]  tracing_saved_cmdlines_size_read+0x177/0x190
[12512.991655]  ? saved_cmdlines_start+0x2c0/0x2c0
[12512.998355]  ? inode_security+0x54/0xf0
[12513.004548]  ? selinux_file_permission+0x324/0x420
[12513.011185]  ? lock_downgrade+0x130/0x130
[12513.017423]  ? fsnotify_perm.part.0+0x14a/0x4c0
[12513.023715]  vfs_read+0x126/0x4d0
[12513.029432]  ksys_read+0xf9/0x1d0
[12513.035131]  ? __ia32_sys_pwrite64+0x1e0/0x1e0
[12513.041028]  ? ktime_get_coarse_real_ts64+0x130/0x170
[12513.047167]  do_syscall_64+0x59/0x90
[12513.052656]  ? lockdep_hardirqs_on+0x79/0x100
[12513.058268]  ? do_syscall_64+0x69/0x90
[12513.063593]  ? lockdep_hardirqs_on+0x79/0x100
[12513.069022]  ? do_syscall_64+0x69/0x90
[12513.074137]  ? lockdep_hardirqs_on+0x79/0x100
[12513.079533]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[12513.085015] RIP: 0033:0x7f93f09d38c2

So tracing_saved_cmdlines_size_read() does call arch_spin_lock() with
preemption enabled. Looking at the other arch_spin_lock() call sites in
kernel/trace/trace.c, several of them may have the same problem. The
remaining arch_spin_lock() callers under kernel/ look OK, as irqs have
been disabled before arch_spin_lock() is called.

Add a do_arch_spin_lock() helper that disables preemption, and make
kernel/trace/trace.c use it wherever it is not obvious that either
preemption or irqs have been disabled.

Waiman Long (2):
  locking: Provide a low overhead do_arch_spin_lock() API
  tracing: Use proper do_arch_spin_lock() API

 include/linux/spinlock.h | 27 +++++++++++++++++++++
 kernel/trace/trace.c     | 52 +++++++++++++++++++---------------------
 2 files changed, 51 insertions(+), 28 deletions(-)

--
2.31.1

^ permalink raw reply	[flat|nested] 11+ messages in thread
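To make the failure mode concrete, the unsafe pattern and the intended fix
look roughly like this. This is a minimal sketch, not code from the series;
some_arch_lock is a made-up name standing in for any arch_spinlock_t taken
from otherwise preemptible context:

	/*
	 * Broken: arch_spin_lock() does not touch the preempt count, so a
	 * preemptible caller can be migrated to another CPU while spinning
	 * -- hence the smp_processor_id() warning above.
	 */
	arch_spin_lock(&some_arch_lock);
	/* ... critical section ... */
	arch_spin_unlock(&some_arch_lock);

	/*
	 * Safe: pin the task to its CPU (or disable irqs) around the raw
	 * lock.
	 */
	preempt_disable();
	arch_spin_lock(&some_arch_lock);
	/* ... critical section ... */
	arch_spin_unlock(&some_arch_lock);
	preempt_enable();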
* [PATCH 1/2] locking: Provide a low overhead do_arch_spin_lock() API
  2022-09-21 13:21 [PATCH 0/2] locking, tracing: Fix incorrect use of arch_spin_lock() Waiman Long
@ 2022-09-21 13:21 ` Waiman Long
  2022-09-21 22:06   ` Steven Rostedt
  2022-09-21 13:21 ` [PATCH 2/2] tracing: Use proper " Waiman Long
  1 sibling, 1 reply; 11+ messages in thread
From: Waiman Long @ 2022-09-21 13:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Steven Rostedt
  Cc: linux-kernel, Waiman Long

There are some code paths in the kernel, like tracing or RCU, that want
to use a spinlock without the lock debugging overhead (lockdep, etc.).
Provide a do_arch_spin_lock() API with proper preemption disabling and
enabling, and without any debugging or tracing overhead.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/spinlock.h | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index 5c0c5174155d..535ef0d5bb80 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -119,6 +119,33 @@ do { \
 #define raw_spin_is_contended(lock)	(((void)(lock), 0))
 #endif /*arch_spin_is_contended*/
 
+/*
+ * Provide a set of do_arch_spin*() APIs to make use of the arch_spinlock_t
+ * with proper preemption disabling & enabling without any debugging and
+ * tracing overhead. Any user of arch_spinlock_t should use this set of
+ * APIs unless it is certain that either preemption or irqs have been disabled.
+ */
+static __always_inline void do_arch_spin_lock(arch_spinlock_t *lock)
+{
+	preempt_disable_notrace();
+	arch_spin_lock(lock);
+}
+
+static __always_inline int do_arch_spin_trylock(arch_spinlock_t *lock)
+{
+	preempt_disable_notrace();
+	if (arch_spin_trylock(lock))
+		return 1;
+	preempt_enable_notrace();
+	return 0;
+}
+
+static __always_inline void do_arch_spin_unlock(arch_spinlock_t *lock)
+{
+	arch_spin_unlock(lock);
+	preempt_enable_notrace();
+}
+
 /*
  * smp_mb__after_spinlock() provides the equivalent of a full memory barrier
  * between program-order earlier lock acquisitions and program-order later

--
2.31.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread
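As a usage sketch (hypothetical caller; example_lock is a made-up name, and
the do_arch_spin_*() helpers are the ones proposed above, not an existing
mainline API):

	static arch_spinlock_t example_lock = __ARCH_SPIN_LOCK_UNLOCKED;

	do_arch_spin_lock(&example_lock);	/* preempt_disable_notrace() + arch_spin_lock() */
	/* ... critical section runs with preemption off ... */
	do_arch_spin_unlock(&example_lock);	/* arch_spin_unlock() + preempt_enable_notrace() */

	if (do_arch_spin_trylock(&example_lock)) {
		/* lock taken, preemption off; a failed trylock re-enables preemption */
		do_arch_spin_unlock(&example_lock);
	}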
* Re: [PATCH 1/2] locking: Provide a low overhead do_arch_spin_lock() API
  2022-09-21 13:21 ` [PATCH 1/2] locking: Provide a low overhead do_arch_spin_lock() API Waiman Long
@ 2022-09-21 22:06   ` Steven Rostedt
  2022-09-22  0:51     ` Waiman Long
  0 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2022-09-21 22:06 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

On Wed, 21 Sep 2022 09:21:51 -0400
Waiman Long <longman@redhat.com> wrote:

> There are some code paths in the kernel, like tracing or RCU, that want
> to use a spinlock without the lock debugging overhead (lockdep, etc.).
> Provide a do_arch_spin_lock() API with proper preemption disabling and
> enabling, and without any debugging or tracing overhead.
>
> Signed-off-by: Waiman Long <longman@redhat.com>

Please don't do this. arch_spin_lock() is very special, and if tracing did
it wrong, then it needs to be fixed in the tracing code.

Let's not add handlers to make it easier to use arch_spin_lock(). Tracing
is special and arch_spin_lock() helps keep it from tracing itself.

-- Steve

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] locking: Provide a low overhead do_arch_spin_lock() API
  2022-09-21 22:06   ` Steven Rostedt
@ 2022-09-22  0:51     ` Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2022-09-22  0:51 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

On 9/21/22 18:06, Steven Rostedt wrote:
> On Wed, 21 Sep 2022 09:21:51 -0400
> Waiman Long <longman@redhat.com> wrote:
>
>> There are some code paths in the kernel, like tracing or RCU, that want
>> to use a spinlock without the lock debugging overhead (lockdep, etc.).
>> Provide a do_arch_spin_lock() API with proper preemption disabling and
>> enabling, and without any debugging or tracing overhead.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
> Please don't do this. arch_spin_lock() is very special, and if tracing did
> it wrong, then it needs to be fixed in the tracing code.
>
> Let's not add handlers to make it easier to use arch_spin_lock(). Tracing
> is special and arch_spin_lock() helps keep it from tracing itself.

That makes sense to me. Will remove this patch.

Thanks,
Longman

^ permalink raw reply	[flat|nested] 11+ messages in thread
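The "tracing itself" argument can be made concrete with a sketch. This is a
hypothetical call chain, not taken from the kernel source: a normal spinlock
is instrumented (lockdep and the lock tracepoints), so if the tracer used one
to protect its own state, recording an event could re-enter the tracer:

	/*
	 * Hypothetical recursion if tracing used an instrumented lock:
	 *
	 *   ring buffer write
	 *     -> spin_lock()              // lockdep-tracked, traceable
	 *       -> lock_acquire event
	 *         -> ring buffer write    // recursion into the tracer
	 *
	 * arch_spin_lock() carries no instrumentation, so the chain stops.
	 */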
* [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
  2022-09-21 13:21 [PATCH 0/2] locking, tracing: Fix incorrect use of arch_spin_lock() Waiman Long
  2022-09-21 13:21 ` [PATCH 1/2] locking: Provide a low overhead do_arch_spin_lock() API Waiman Long
@ 2022-09-21 13:21 ` Waiman Long
  2022-09-21 22:17   ` Steven Rostedt
  1 sibling, 1 reply; 11+ messages in thread
From: Waiman Long @ 2022-09-21 13:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Steven Rostedt
  Cc: linux-kernel, Waiman Long

It was found that some tracing functions acquire an arch_spinlock_t with
preemption and irqs enabled. That can be problematic if preemption
happens after the lock is acquired. Use the proper do_arch_spin_lock()
API, which disables preemption, to make sure that this cannot happen,
except where it is obvious that either preemption or irqs have already
been disabled.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/trace/trace.c | 52 ++++++++++++++++++++------------------------
 1 file changed, 24 insertions(+), 28 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index d3005279165d..cbb8520842ad 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1193,12 +1193,12 @@ void *tracing_cond_snapshot_data(struct trace_array *tr)
 {
 	void *cond_data = NULL;
 
-	arch_spin_lock(&tr->max_lock);
+	do_arch_spin_lock(&tr->max_lock);
 
 	if (tr->cond_snapshot)
 		cond_data = tr->cond_snapshot->cond_data;
 
-	arch_spin_unlock(&tr->max_lock);
+	do_arch_spin_unlock(&tr->max_lock);
 
 	return cond_data;
 }
@@ -1334,9 +1334,9 @@ int tracing_snapshot_cond_enable(struct trace_array *tr, void *cond_data,
 		goto fail_unlock;
 	}
 
-	arch_spin_lock(&tr->max_lock);
+	do_arch_spin_lock(&tr->max_lock);
 	tr->cond_snapshot = cond_snapshot;
-	arch_spin_unlock(&tr->max_lock);
+	do_arch_spin_unlock(&tr->max_lock);
 
 	mutex_unlock(&trace_types_lock);
 
@@ -1363,7 +1363,7 @@ int tracing_snapshot_cond_disable(struct trace_array *tr)
 {
 	int ret = 0;
 
-	arch_spin_lock(&tr->max_lock);
+	do_arch_spin_lock(&tr->max_lock);
 
 	if (!tr->cond_snapshot)
 		ret = -EINVAL;
@@ -1372,7 +1372,7 @@ int tracing_snapshot_cond_disable(struct trace_array *tr)
 		tr->cond_snapshot = NULL;
 	}
 
-	arch_spin_unlock(&tr->max_lock);
+	do_arch_spin_unlock(&tr->max_lock);
 
 	return ret;
 }
@@ -1819,7 +1819,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
 		return;
 	}
 
-	arch_spin_lock(&tr->max_lock);
+	do_arch_spin_lock(&tr->max_lock);
 
 	/* Inherit the recordable setting from array_buffer */
 	if (ring_buffer_record_is_set_on(tr->array_buffer.buffer))
@@ -1836,7 +1836,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
 	__update_max_tr(tr, tsk, cpu);
 
  out_unlock:
-	arch_spin_unlock(&tr->max_lock);
+	do_arch_spin_unlock(&tr->max_lock);
 }
 
 /**
@@ -1862,7 +1862,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
 		return;
 	}
 
-	arch_spin_lock(&tr->max_lock);
+	do_arch_spin_lock(&tr->max_lock);
 
 	ret = ring_buffer_swap_cpu(tr->max_buffer.buffer, tr->array_buffer.buffer, cpu);
 
@@ -1880,7 +1880,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
 	WARN_ON_ONCE(ret && ret != -EAGAIN && ret != -EBUSY);
 
 	__update_max_tr(tr, tsk, cpu);
-	arch_spin_unlock(&tr->max_lock);
+	do_arch_spin_unlock(&tr->max_lock);
 }
 #endif /* CONFIG_TRACER_MAX_TRACE */
 
@@ -2413,7 +2413,7 @@ static int trace_save_cmdline(struct task_struct *tsk)
 	 * nor do we want to disable interrupts,
 	 * so if we miss here, then better luck next time.
 	 */
-	if (!arch_spin_trylock(&trace_cmdline_lock))
+	if (!do_arch_spin_trylock(&trace_cmdline_lock))
 		return 0;
 
 	idx = savedcmd->map_pid_to_cmdline[tpid];
@@ -2427,7 +2427,7 @@ static int trace_save_cmdline(struct task_struct *tsk)
 	savedcmd->map_cmdline_to_pid[idx] = tsk->pid;
 	set_cmdline(idx, tsk->comm);
 
-	arch_spin_unlock(&trace_cmdline_lock);
+	do_arch_spin_unlock(&trace_cmdline_lock);
 
 	return 1;
 }
@@ -2461,13 +2461,11 @@ static void __trace_find_cmdline(int pid, char comm[])
 
 void trace_find_cmdline(int pid, char comm[])
 {
-	preempt_disable();
-	arch_spin_lock(&trace_cmdline_lock);
+	do_arch_spin_lock(&trace_cmdline_lock);
 
 	__trace_find_cmdline(pid, comm);
 
-	arch_spin_unlock(&trace_cmdline_lock);
-	preempt_enable();
+	do_arch_spin_unlock(&trace_cmdline_lock);
 }
 
 static int *trace_find_tgid_ptr(int pid)
@@ -5829,8 +5827,7 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
 	void *v;
 	loff_t l = 0;
 
-	preempt_disable();
-	arch_spin_lock(&trace_cmdline_lock);
+	do_arch_spin_lock(&trace_cmdline_lock);
 
 	v = &savedcmd->map_cmdline_to_pid[0];
 	while (l <= *pos) {
@@ -5844,8 +5841,7 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
 
 static void saved_cmdlines_stop(struct seq_file *m, void *v)
 {
-	arch_spin_unlock(&trace_cmdline_lock);
-	preempt_enable();
+	do_arch_spin_unlock(&trace_cmdline_lock);
 }
 
 static int saved_cmdlines_show(struct seq_file *m, void *v)
@@ -5890,9 +5886,9 @@ tracing_saved_cmdlines_size_read(struct file *filp, char __user *ubuf,
 	char buf[64];
 	int r;
 
-	arch_spin_lock(&trace_cmdline_lock);
+	do_arch_spin_lock(&trace_cmdline_lock);
 	r = scnprintf(buf, sizeof(buf), "%u\n", savedcmd->cmdline_num);
-	arch_spin_unlock(&trace_cmdline_lock);
+	do_arch_spin_unlock(&trace_cmdline_lock);
 
 	return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
 }
@@ -5917,10 +5913,10 @@ static int tracing_resize_saved_cmdlines(unsigned int val)
 		return -ENOMEM;
 	}
 
-	arch_spin_lock(&trace_cmdline_lock);
+	do_arch_spin_lock(&trace_cmdline_lock);
 	savedcmd_temp = savedcmd;
 	savedcmd = s;
-	arch_spin_unlock(&trace_cmdline_lock);
+	do_arch_spin_unlock(&trace_cmdline_lock);
 	free_saved_cmdlines_buffer(savedcmd_temp);
 
 	return 0;
@@ -6373,10 +6369,10 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
 
 #ifdef CONFIG_TRACER_SNAPSHOT
 	if (t->use_max_tr) {
-		arch_spin_lock(&tr->max_lock);
+		do_arch_spin_lock(&tr->max_lock);
 		if (tr->cond_snapshot)
 			ret = -EBUSY;
-		arch_spin_unlock(&tr->max_lock);
+		do_arch_spin_unlock(&tr->max_lock);
 		if (ret)
 			goto out;
 	}
@@ -7436,10 +7432,10 @@ tracing_snapshot_write(struct file *filp, const char __user *ubuf, size_t cnt,
 		goto out;
 	}
 
-	arch_spin_lock(&tr->max_lock);
+	do_arch_spin_lock(&tr->max_lock);
 	if (tr->cond_snapshot)
 		ret = -EBUSY;
-	arch_spin_unlock(&tr->max_lock);
+	do_arch_spin_unlock(&tr->max_lock);
 	if (ret)
 		goto out;
 
--
2.31.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
  2022-09-21 13:21 ` [PATCH 2/2] tracing: Use proper " Waiman Long
@ 2022-09-21 22:17   ` Steven Rostedt
  2022-09-22  0:50     ` Waiman Long
                        ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Steven Rostedt @ 2022-09-21 22:17 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

On Wed, 21 Sep 2022 09:21:52 -0400
Waiman Long <longman@redhat.com> wrote:

> It was found that some tracing functions acquire an arch_spinlock_t with
> preemption and irqs enabled. That can be problematic if preemption
> happens after the lock is acquired. Use the proper do_arch_spin_lock()
> API, which disables preemption, to make sure that this cannot happen,
> except where it is obvious that either preemption or irqs have already
> been disabled.
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  kernel/trace/trace.c | 52 ++++++++++++++++++++------------------------
>  1 file changed, 24 insertions(+), 28 deletions(-)
>
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index d3005279165d..cbb8520842ad 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -1193,12 +1193,12 @@ void *tracing_cond_snapshot_data(struct trace_array *tr)
>  {
>  	void *cond_data = NULL;
>
> -	arch_spin_lock(&tr->max_lock);
> +	do_arch_spin_lock(&tr->max_lock);

This should actually disable interrupts and not preemption.

>
>  	if (tr->cond_snapshot)
>  		cond_data = tr->cond_snapshot->cond_data;
>
> -	arch_spin_unlock(&tr->max_lock);
> +	do_arch_spin_unlock(&tr->max_lock);
>
>  	return cond_data;
>  }
> @@ -1334,9 +1334,9 @@ int tracing_snapshot_cond_enable(struct trace_array *tr, void *cond_data,
>  		goto fail_unlock;
>  	}
>
> -	arch_spin_lock(&tr->max_lock);
> +	do_arch_spin_lock(&tr->max_lock);

Same here.

>  	tr->cond_snapshot = cond_snapshot;
> -	arch_spin_unlock(&tr->max_lock);
> +	do_arch_spin_unlock(&tr->max_lock);
>
>  	mutex_unlock(&trace_types_lock);
>
> @@ -1363,7 +1363,7 @@ int tracing_snapshot_cond_disable(struct trace_array *tr)
>  {
>  	int ret = 0;
>
> -	arch_spin_lock(&tr->max_lock);
> +	do_arch_spin_lock(&tr->max_lock);

And here.

>
>  	if (!tr->cond_snapshot)
>  		ret = -EINVAL;
> @@ -1372,7 +1372,7 @@ int tracing_snapshot_cond_disable(struct trace_array *tr)
>  		tr->cond_snapshot = NULL;
>  	}
>
> -	arch_spin_unlock(&tr->max_lock);
> +	do_arch_spin_unlock(&tr->max_lock);
>
>  	return ret;
>  }
> @@ -1819,7 +1819,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
>  		return;
>  	}
>
> -	arch_spin_lock(&tr->max_lock);
> +	do_arch_spin_lock(&tr->max_lock);

Nothing here is needed, as interrupts had better be disabled when this
function is called. And there's already a:

	WARN_ON_ONCE(!irqs_disabled());

>
>  	/* Inherit the recordable setting from array_buffer */
>  	if (ring_buffer_record_is_set_on(tr->array_buffer.buffer))
> @@ -1836,7 +1836,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
>  	__update_max_tr(tr, tsk, cpu);
>
>   out_unlock:
> -	arch_spin_unlock(&tr->max_lock);
> +	do_arch_spin_unlock(&tr->max_lock);

Nothing needs to be done here.

>  }
>
>  /**
> @@ -1862,7 +1862,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
>  		return;
>  	}
>
> -	arch_spin_lock(&tr->max_lock);
> +	do_arch_spin_lock(&tr->max_lock);

Same here. Interrupts had better be disabled in this function.

>
>  	ret = ring_buffer_swap_cpu(tr->max_buffer.buffer, tr->array_buffer.buffer, cpu);
>
> @@ -1880,7 +1880,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
>  	WARN_ON_ONCE(ret && ret != -EAGAIN && ret != -EBUSY);
>
>  	__update_max_tr(tr, tsk, cpu);
> -	arch_spin_unlock(&tr->max_lock);
> +	do_arch_spin_unlock(&tr->max_lock);

Nothing to do here.

>  }
>  #endif /* CONFIG_TRACER_MAX_TRACE */
>
> @@ -2413,7 +2413,7 @@ static int trace_save_cmdline(struct task_struct *tsk)
>  	 * nor do we want to disable interrupts,
>  	 * so if we miss here, then better luck next time.
>  	 */
> -	if (!arch_spin_trylock(&trace_cmdline_lock))
> +	if (!do_arch_spin_trylock(&trace_cmdline_lock))

This is called within the scheduler and wake up, so interrupts had better
be disabled (run queue lock is held).

>  		return 0;
>
>  	idx = savedcmd->map_pid_to_cmdline[tpid];
> @@ -2427,7 +2427,7 @@ static int trace_save_cmdline(struct task_struct *tsk)
>  	savedcmd->map_cmdline_to_pid[idx] = tsk->pid;
>  	set_cmdline(idx, tsk->comm);
>
> -	arch_spin_unlock(&trace_cmdline_lock);
> +	do_arch_spin_unlock(&trace_cmdline_lock);

Nothing to do here.

>
>  	return 1;
>  }
> @@ -2461,13 +2461,11 @@ static void __trace_find_cmdline(int pid, char comm[])
>
>  void trace_find_cmdline(int pid, char comm[])
>  {
> -	preempt_disable();
> -	arch_spin_lock(&trace_cmdline_lock);
> +	do_arch_spin_lock(&trace_cmdline_lock);

Keep this as is (with the open coded preempt_disable()).

>
>  	__trace_find_cmdline(pid, comm);
>
> -	arch_spin_unlock(&trace_cmdline_lock);
> -	preempt_enable();
> +	do_arch_spin_unlock(&trace_cmdline_lock);
>  }
>
>  static int *trace_find_tgid_ptr(int pid)
> @@ -5829,8 +5827,7 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
>  	void *v;
>  	loff_t l = 0;
>
> -	preempt_disable();
> -	arch_spin_lock(&trace_cmdline_lock);
> +	do_arch_spin_lock(&trace_cmdline_lock);

This too.

>
>  	v = &savedcmd->map_cmdline_to_pid[0];
>  	while (l <= *pos) {
> @@ -5844,8 +5841,7 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
>
>  static void saved_cmdlines_stop(struct seq_file *m, void *v)
>  {
> -	arch_spin_unlock(&trace_cmdline_lock);
> -	preempt_enable();
> +	do_arch_spin_unlock(&trace_cmdline_lock);

And this.

>  }
>
>  static int saved_cmdlines_show(struct seq_file *m, void *v)
> @@ -5890,9 +5886,9 @@ tracing_saved_cmdlines_size_read(struct file *filp, char __user *ubuf,
>  	char buf[64];
>  	int r;
>
> -	arch_spin_lock(&trace_cmdline_lock);
> +	do_arch_spin_lock(&trace_cmdline_lock);

Yeah, we should add preempt_disable() here.

>  	r = scnprintf(buf, sizeof(buf), "%u\n", savedcmd->cmdline_num);
> -	arch_spin_unlock(&trace_cmdline_lock);
> +	do_arch_spin_unlock(&trace_cmdline_lock);
>
>  	return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
>  }
> @@ -5917,10 +5913,10 @@ static int tracing_resize_saved_cmdlines(unsigned int val)
>  		return -ENOMEM;
>  	}
>
> -	arch_spin_lock(&trace_cmdline_lock);
> +	do_arch_spin_lock(&trace_cmdline_lock);

And here.

>  	savedcmd_temp = savedcmd;
>  	savedcmd = s;
> -	arch_spin_unlock(&trace_cmdline_lock);
> +	do_arch_spin_unlock(&trace_cmdline_lock);
>  	free_saved_cmdlines_buffer(savedcmd_temp);
>
>  	return 0;
> @@ -6373,10 +6369,10 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
>
>  #ifdef CONFIG_TRACER_SNAPSHOT
>  	if (t->use_max_tr) {
> -		arch_spin_lock(&tr->max_lock);
> +		do_arch_spin_lock(&tr->max_lock);

Add preemption disabling.

>  		if (tr->cond_snapshot)
>  			ret = -EBUSY;
> -		arch_spin_unlock(&tr->max_lock);
> +		do_arch_spin_unlock(&tr->max_lock);
>  		if (ret)
>  			goto out;
>  	}
> @@ -7436,10 +7432,10 @@ tracing_snapshot_write(struct file *filp, const char __user *ubuf, size_t cnt,
>  		goto out;
>  	}
>
> -	arch_spin_lock(&tr->max_lock);
> +	do_arch_spin_lock(&tr->max_lock);

And this should disable interrupts first.

-- Steve

>  	if (tr->cond_snapshot)
>  		ret = -EBUSY;
> -	arch_spin_unlock(&tr->max_lock);
> +	do_arch_spin_unlock(&tr->max_lock);
>  	if (ret)
>  		goto out;
>

^ permalink raw reply	[flat|nested] 11+ messages in thread
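Spelled out, the open-coded fixes being asked for come in two flavors,
depending on the lock. A rough sketch, not code from the thread (the exact
call sites vary):

	/* max_lock call sites not already running with irqs off: */
	unsigned long flags;

	local_irq_save(flags);
	arch_spin_lock(&tr->max_lock);
	/* ... */
	arch_spin_unlock(&tr->max_lock);
	local_irq_restore(flags);

	/* trace_cmdline_lock call sites: preemption off is sufficient: */
	preempt_disable();
	arch_spin_lock(&trace_cmdline_lock);
	/* ... */
	arch_spin_unlock(&trace_cmdline_lock);
	preempt_enable();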
* Re: [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
  2022-09-21 22:17   ` Steven Rostedt
@ 2022-09-22  0:50     ` Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2022-09-22  0:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

On 9/21/22 18:17, Steven Rostedt wrote:
> On Wed, 21 Sep 2022 09:21:52 -0400
> Waiman Long <longman@redhat.com> wrote:
>
>> It was found that some tracing functions acquire an arch_spinlock_t with
>> preemption and irqs enabled. That can be problematic if preemption
>> happens after the lock is acquired. Use the proper do_arch_spin_lock()
>> API, which disables preemption, to make sure that this cannot happen,
>> except where it is obvious that either preemption or irqs have already
>> been disabled.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>  kernel/trace/trace.c | 52 ++++++++++++++++++++------------------------
>>  1 file changed, 24 insertions(+), 28 deletions(-)
>>
>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
>> index d3005279165d..cbb8520842ad 100644
>> --- a/kernel/trace/trace.c
>> +++ b/kernel/trace/trace.c
>> @@ -1193,12 +1193,12 @@ void *tracing_cond_snapshot_data(struct trace_array *tr)
>>  {
>>  	void *cond_data = NULL;
>>
>> -	arch_spin_lock(&tr->max_lock);
>> +	do_arch_spin_lock(&tr->max_lock);
> This should actually disable interrupts and not preemption.
>
>>
>>  	if (tr->cond_snapshot)
>>  		cond_data = tr->cond_snapshot->cond_data;
>>
>> -	arch_spin_unlock(&tr->max_lock);
>> +	do_arch_spin_unlock(&tr->max_lock);
>>
>>  	return cond_data;
>>  }
>> @@ -1334,9 +1334,9 @@ int tracing_snapshot_cond_enable(struct trace_array *tr, void *cond_data,
>>  		goto fail_unlock;
>>  	}
>>
>> -	arch_spin_lock(&tr->max_lock);
>> +	do_arch_spin_lock(&tr->max_lock);
> Same here.
>
>>  	tr->cond_snapshot = cond_snapshot;
>> -	arch_spin_unlock(&tr->max_lock);
>> +	do_arch_spin_unlock(&tr->max_lock);
>>
>>  	mutex_unlock(&trace_types_lock);
>>
>> @@ -1363,7 +1363,7 @@ int tracing_snapshot_cond_disable(struct trace_array *tr)
>>  {
>>  	int ret = 0;
>>
>> -	arch_spin_lock(&tr->max_lock);
>> +	do_arch_spin_lock(&tr->max_lock);
> And here.
>
>>
>>  	if (!tr->cond_snapshot)
>>  		ret = -EINVAL;
>> @@ -1372,7 +1372,7 @@ int tracing_snapshot_cond_disable(struct trace_array *tr)
>>  		tr->cond_snapshot = NULL;
>>  	}
>>
>> -	arch_spin_unlock(&tr->max_lock);
>> +	do_arch_spin_unlock(&tr->max_lock);
>>
>>  	return ret;
>>  }
>> @@ -1819,7 +1819,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
>>  		return;
>>  	}
>>
>> -	arch_spin_lock(&tr->max_lock);
>> +	do_arch_spin_lock(&tr->max_lock);
> Nothing here is needed, as interrupts had better be disabled when this
> function is called. And there's already a:
>
> 	WARN_ON_ONCE(!irqs_disabled());

Sorry, I missed that.

>
>>
>>  	/* Inherit the recordable setting from array_buffer */
>>  	if (ring_buffer_record_is_set_on(tr->array_buffer.buffer))
>> @@ -1836,7 +1836,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
>>  	__update_max_tr(tr, tsk, cpu);
>>
>>   out_unlock:
>> -	arch_spin_unlock(&tr->max_lock);
>> +	do_arch_spin_unlock(&tr->max_lock);
> Nothing needs to be done here.

Right.

>
>>  }
>>
>>  /**
>> @@ -1862,7 +1862,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
>>  		return;
>>  	}
>>
>> -	arch_spin_lock(&tr->max_lock);
>> +	do_arch_spin_lock(&tr->max_lock);
> Same here. Interrupts had better be disabled in this function.
>
>>
>>  	ret = ring_buffer_swap_cpu(tr->max_buffer.buffer, tr->array_buffer.buffer, cpu);
>>
>> @@ -1880,7 +1880,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
>>  	WARN_ON_ONCE(ret && ret != -EAGAIN && ret != -EBUSY);
>>
>>  	__update_max_tr(tr, tsk, cpu);
>> -	arch_spin_unlock(&tr->max_lock);
>> +	do_arch_spin_unlock(&tr->max_lock);
> Nothing to do here.
>
>>  }
>>  #endif /* CONFIG_TRACER_MAX_TRACE */
>>
>> @@ -2413,7 +2413,7 @@ static int trace_save_cmdline(struct task_struct *tsk)
>>  	 * nor do we want to disable interrupts,
>>  	 * so if we miss here, then better luck next time.
>>  	 */
>> -	if (!arch_spin_trylock(&trace_cmdline_lock))
>> +	if (!do_arch_spin_trylock(&trace_cmdline_lock))
> This is called within the scheduler and wake up, so interrupts had better
> be disabled (run queue lock is held).
>
>>  		return 0;
>>
>>  	idx = savedcmd->map_pid_to_cmdline[tpid];
>> @@ -2427,7 +2427,7 @@ static int trace_save_cmdline(struct task_struct *tsk)
>>  	savedcmd->map_cmdline_to_pid[idx] = tsk->pid;
>>  	set_cmdline(idx, tsk->comm);
>>
>> -	arch_spin_unlock(&trace_cmdline_lock);
>> +	do_arch_spin_unlock(&trace_cmdline_lock);
> Nothing to do here.
>
>>
>>  	return 1;
>>  }
>> @@ -2461,13 +2461,11 @@ static void __trace_find_cmdline(int pid, char comm[])
>>
>>  void trace_find_cmdline(int pid, char comm[])
>>  {
>> -	preempt_disable();
>> -	arch_spin_lock(&trace_cmdline_lock);
>> +	do_arch_spin_lock(&trace_cmdline_lock);
> Keep this as is (with the open coded preempt_disable()).

OK.

>
>>
>>  	__trace_find_cmdline(pid, comm);
>>
>> -	arch_spin_unlock(&trace_cmdline_lock);
>> -	preempt_enable();
>> +	do_arch_spin_unlock(&trace_cmdline_lock);
>>  }
>>
>>  static int *trace_find_tgid_ptr(int pid)
>> @@ -5829,8 +5827,7 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
>>  	void *v;
>>  	loff_t l = 0;
>>
>> -	preempt_disable();
>> -	arch_spin_lock(&trace_cmdline_lock);
>> +	do_arch_spin_lock(&trace_cmdline_lock);
> This too.
>
>>
>>  	v = &savedcmd->map_cmdline_to_pid[0];
>>  	while (l <= *pos) {
>> @@ -5844,8 +5841,7 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
>>
>>  static void saved_cmdlines_stop(struct seq_file *m, void *v)
>>  {
>> -	arch_spin_unlock(&trace_cmdline_lock);
>> -	preempt_enable();
>> +	do_arch_spin_unlock(&trace_cmdline_lock);
> And this.
>
>>  }
>>
>>  static int saved_cmdlines_show(struct seq_file *m, void *v)
>> @@ -5890,9 +5886,9 @@ tracing_saved_cmdlines_size_read(struct file *filp, char __user *ubuf,
>>  	char buf[64];
>>  	int r;
>>
>> -	arch_spin_lock(&trace_cmdline_lock);
>> +	do_arch_spin_lock(&trace_cmdline_lock);
> Yeah, we should add preempt_disable() here.
>
>>  	r = scnprintf(buf, sizeof(buf), "%u\n", savedcmd->cmdline_num);
>> -	arch_spin_unlock(&trace_cmdline_lock);
>> +	do_arch_spin_unlock(&trace_cmdline_lock);
>>
>>  	return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
>>  }
>> @@ -5917,10 +5913,10 @@ static int tracing_resize_saved_cmdlines(unsigned int val)
>>  		return -ENOMEM;
>>  	}
>>
>> -	arch_spin_lock(&trace_cmdline_lock);
>> +	do_arch_spin_lock(&trace_cmdline_lock);
> And here.
>
>>  	savedcmd_temp = savedcmd;
>>  	savedcmd = s;
>> -	arch_spin_unlock(&trace_cmdline_lock);
>> +	do_arch_spin_unlock(&trace_cmdline_lock);
>>  	free_saved_cmdlines_buffer(savedcmd_temp);
>>
>>  	return 0;
>> @@ -6373,10 +6369,10 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
>>
>>  #ifdef CONFIG_TRACER_SNAPSHOT
>>  	if (t->use_max_tr) {
>> -		arch_spin_lock(&tr->max_lock);
>> +		do_arch_spin_lock(&tr->max_lock);
> Add preemption disabling.
>
>>  		if (tr->cond_snapshot)
>>  			ret = -EBUSY;
>> -		arch_spin_unlock(&tr->max_lock);
>> +		do_arch_spin_unlock(&tr->max_lock);
>>  		if (ret)
>>  			goto out;
>>  	}
>> @@ -7436,10 +7432,10 @@ tracing_snapshot_write(struct file *filp, const char __user *ubuf, size_t cnt,
>>  		goto out;
>>  	}
>>
>> -	arch_spin_lock(&tr->max_lock);
>> +	do_arch_spin_lock(&tr->max_lock);
> And this should disable interrupts first.

Thanks for the comments, will modify the patch as suggested.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 11+ messages in thread
[parent not found: <3789bf4d-d699-63d4-c97b-c8d9524cbc2f@redhat.com>]
* Re: [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
       [not found]     ` <3789bf4d-d699-63d4-c97b-c8d9524cbc2f@redhat.com>
@ 2022-09-22  2:08       ` Steven Rostedt
  0 siblings, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2022-09-22  2:08 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

On Wed, 21 Sep 2022 21:48:41 -0400
Waiman Long <longman@redhat.com> wrote:

> On 9/21/22 18:17, Steven Rostedt wrote:
> >> 	savedcmd_temp = savedcmd;
> >> 	savedcmd = s;
> >> -	arch_spin_unlock(&trace_cmdline_lock);
> >> +	do_arch_spin_unlock(&trace_cmdline_lock);
> >> 	free_saved_cmdlines_buffer(savedcmd_temp);
> >>
> >> 	return 0;
> >> @@ -6373,10 +6369,10 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
> >>
> >> #ifdef CONFIG_TRACER_SNAPSHOT
> >> 	if (t->use_max_tr) {
> >> -		arch_spin_lock(&tr->max_lock);
> >> +		do_arch_spin_lock(&tr->max_lock);
> > Add preemption disabling.
> >
> The pattern that I have seen so far is to disable preemption for
> trace_cmdline_lock, but interrupts for max_lock. So should we also
> disable interrupts here instead of preemption?
>

Ah yes.

Thanks,

-- Steve

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
  2022-09-21 22:17   ` Steven Rostedt
  2022-09-22  0:50     ` Waiman Long
       [not found]     ` <3789bf4d-d699-63d4-c97b-c8d9524cbc2f@redhat.com>
@ 2022-09-22  7:55     ` Peter Zijlstra
  2022-09-22 12:25       ` Waiman Long
  2022-09-22 12:40       ` Steven Rostedt
  2 siblings, 2 replies; 11+ messages in thread
From: Peter Zijlstra @ 2022-09-22  7:55 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Waiman Long, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

I'm with Steve on not adding wrappers for this; people that use
arch_spinlock_* get to keep the pieces :-)

On Wed, Sep 21, 2022 at 06:17:21PM -0400, Steven Rostedt wrote:

> > @@ -1819,7 +1819,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
> >  		return;
> >  	}
> >
> > -	arch_spin_lock(&tr->max_lock);
> > +	do_arch_spin_lock(&tr->max_lock);
>
> Nothing here is needed, as interrupts had better be disabled when this
> function is called. And there's already a:
>
> 	WARN_ON_ONCE(!irqs_disabled());

You can write that as lockdep_assert_irqs_disabled();

^ permalink raw reply	[flat|nested] 11+ messages in thread
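For reference, the suggested substitution would be (a sketch; both helpers
exist in mainline):

	-	WARN_ON_ONCE(!irqs_disabled());
	+	lockdep_assert_irqs_disabled();

Unlike the open-coded WARN_ON_ONCE(), lockdep_assert_irqs_disabled() ties
the check into lockdep's irq-state tracking and compiles to a no-op when
CONFIG_PROVE_LOCKING is disabled, so it costs nothing on production builds.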
* Re: [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
  2022-09-22  7:55     ` Peter Zijlstra
@ 2022-09-22 12:25       ` Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2022-09-22 12:25 UTC (permalink / raw)
  To: Peter Zijlstra, Steven Rostedt
  Cc: Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

On 9/22/22 03:55, Peter Zijlstra wrote:
> I'm with Steve on not adding wrappers for this; people that use
> arch_spinlock_* get to keep the pieces :-)

Yes, I am going to drop patch 1 and open-code all the changes in a new
patch.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
  2022-09-22  7:55     ` Peter Zijlstra
  2022-09-22 12:25       ` Waiman Long
@ 2022-09-22 12:40       ` Steven Rostedt
  1 sibling, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2022-09-22 12:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

On Thu, 22 Sep 2022 09:55:29 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> > Nothing here is needed, as interrupts had better be disabled when this
> > function is called. And there's already a:
> >
> > 	WARN_ON_ONCE(!irqs_disabled());
>
> You can write that as lockdep_assert_irqs_disabled();

Of course we can ;-) But this was written before that existed.

-- Steve

^ permalink raw reply	[flat|nested] 11+ messages in thread
end of thread, other threads:[~2022-09-22 12:39 UTC | newest]

Thread overview: 11+ messages
2022-09-21 13:21 [PATCH 0/2] locking, tracing: Fix incorrect use of arch_spin_lock() Waiman Long
2022-09-21 13:21 ` [PATCH 1/2] locking: Provide a low overhead do_arch_spin_lock() API Waiman Long
2022-09-21 22:06   ` Steven Rostedt
2022-09-22  0:51     ` Waiman Long
2022-09-21 13:21 ` [PATCH 2/2] tracing: Use proper " Waiman Long
2022-09-21 22:17   ` Steven Rostedt
2022-09-22  0:50     ` Waiman Long
     [not found]     ` <3789bf4d-d699-63d4-c97b-c8d9524cbc2f@redhat.com>
2022-09-22  2:08       ` Steven Rostedt
2022-09-22  7:55     ` Peter Zijlstra
2022-09-22 12:25       ` Waiman Long
2022-09-22 12:40       ` Steven Rostedt