* [PATCH 0/2] locking, tracing: Fix incorrect use of arch_spin_lock()
@ 2022-09-21 13:21 Waiman Long
  2022-09-21 13:21 ` [PATCH 1/2] locking: Provide a low overhead do_arch_spin_lock() API Waiman Long
  2022-09-21 13:21 ` [PATCH 2/2] tracing: Use proper " Waiman Long
  0 siblings, 2 replies; 11+ messages in thread
From: Waiman Long @ 2022-09-21 13:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Steven Rostedt
  Cc: linux-kernel, Waiman Long

It was found that running the LTP read_all_proc test can trigger the
following warning:

[12512.905036] BUG: using smp_processor_id() in preemptible [00000000] code: read_all/133711
[12512.913499] caller is __pv_queued_spin_lock_slowpath+0x7f/0xd30
[12512.921163] CPU: 3 PID: 133711 Comm: read_all Kdump: loaded Tainted: G           OE    --------- ---  5.14.0-163.el9.x86_64+debug #1
[12512.936652] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[12512.944213] Call Trace:
[12512.950660]  dump_stack_lvl+0x57/0x81
[12512.957400]  check_preemption_disabled+0xcc/0xd0
[12512.964487]  __pv_queued_spin_lock_slowpath+0x7f/0xd30
[12512.971552]  ? pv_hash+0x110/0x110
[12512.978119]  ? __lock_acquire+0xb72/0x1870
[12512.984683]  tracing_saved_cmdlines_size_read+0x177/0x190
[12512.991655]  ? saved_cmdlines_start+0x2c0/0x2c0
[12512.998355]  ? inode_security+0x54/0xf0
[12513.004548]  ? selinux_file_permission+0x324/0x420
[12513.011185]  ? lock_downgrade+0x130/0x130
[12513.017423]  ? fsnotify_perm.part.0+0x14a/0x4c0
[12513.023715]  vfs_read+0x126/0x4d0
[12513.029432]  ksys_read+0xf9/0x1d0
[12513.035131]  ? __ia32_sys_pwrite64+0x1e0/0x1e0
[12513.041028]  ? ktime_get_coarse_real_ts64+0x130/0x170
[12513.047167]  do_syscall_64+0x59/0x90
[12513.052656]  ? lockdep_hardirqs_on+0x79/0x100
[12513.058268]  ? do_syscall_64+0x69/0x90
[12513.063593]  ? lockdep_hardirqs_on+0x79/0x100
[12513.069022]  ? do_syscall_64+0x69/0x90
[12513.074137]  ? lockdep_hardirqs_on+0x79/0x100
[12513.079533]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[12513.085015] RIP: 0033:0x7f93f09d38c2

So tracing_saved_cmdlines_size_read() does call arch_spin_lock() with
preemption enabled. Looking at the other arch_spin_lock() call sites in
kernel/trace/trace.c, several of them may have the same problem. The
remaining arch_spin_lock() callers under kernel/ look OK, as irqs have
been disabled before arch_spin_lock() is called.

Add a do_arch_spin_lock() helper that disables preemption and make
kernel/trace/trace.c use it wherever it is not obvious that either
preemption or interrupts have already been disabled.
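
For illustration, the problematic pattern and the intended replacement
look roughly like this (a minimal sketch, not the exact trace.c code):

	/* current code: lock taken with preemption still enabled */
	arch_spin_lock(&trace_cmdline_lock);
	/* ... critical section ... */
	arch_spin_unlock(&trace_cmdline_lock);

	/* with the helper: preemption disabled across the critical section */
	do_arch_spin_lock(&trace_cmdline_lock);
	/* ... critical section ... */
	do_arch_spin_unlock(&trace_cmdline_lock);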

Waiman Long (2):
  locking: Provide a low overhead do_arch_spin_lock() API
  tracing: Use proper do_arch_spin_lock() API

 include/linux/spinlock.h | 27 +++++++++++++++++++++
 kernel/trace/trace.c     | 52 +++++++++++++++++++---------------------
 2 files changed, 51 insertions(+), 28 deletions(-)

-- 
2.31.1



* [PATCH 1/2] locking: Provide a low overhead do_arch_spin_lock() API
  2022-09-21 13:21 [PATCH 0/2] locking, tracing: Fix incorrect use of arch_spin_lock() Waiman Long
@ 2022-09-21 13:21 ` Waiman Long
  2022-09-21 22:06   ` Steven Rostedt
  2022-09-21 13:21 ` [PATCH 2/2] tracing: Use proper " Waiman Long
  1 sibling, 1 reply; 11+ messages in thread
From: Waiman Long @ 2022-09-21 13:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Steven Rostedt
  Cc: linux-kernel, Waiman Long

There are some code paths in the kernel, such as tracing and RCU, that
want to use a spinlock without the lock debugging overhead (lockdep,
etc.). Provide a do_arch_spin_lock() API that does the proper preemption
disabling and enabling without any debugging or tracing overhead.
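
A caller would use it roughly like this (minimal usage sketch; my_lock
is just an illustrative name):

	static arch_spinlock_t my_lock = __ARCH_SPIN_LOCK_UNLOCKED;

	do_arch_spin_lock(&my_lock);
	/* critical section: preemption disabled, no lockdep/tracing hooks */
	do_arch_spin_unlock(&my_lock);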

Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/spinlock.h | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index 5c0c5174155d..535ef0d5bb80 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -119,6 +119,33 @@ do {									\
 #define raw_spin_is_contended(lock)	(((void)(lock), 0))
 #endif /*arch_spin_is_contended*/
 
+/*
+ * Provide a set of do_arch_spin*() APIs to make use of arch_spinlock_t
+ * with proper preemption disabling & enabling without any debugging or
+ * tracing overhead. Users of arch_spinlock_t should use this set of
+ * APIs unless it is certain that either preemption or irqs have been disabled.
+ */
+static __always_inline void do_arch_spin_lock(arch_spinlock_t *lock)
+{
+	preempt_disable_notrace();
+	arch_spin_lock(lock);
+}
+
+static __always_inline int do_arch_spin_trylock(arch_spinlock_t *lock)
+{
+	preempt_disable_notrace();
+	if (arch_spin_trylock(lock))
+		return 1;
+	preempt_enable_notrace();
+	return 0;
+}
+
+static __always_inline void do_arch_spin_unlock(arch_spinlock_t *lock)
+{
+	arch_spin_unlock(lock);
+	preempt_enable_notrace();
+}
+
 /*
  * smp_mb__after_spinlock() provides the equivalent of a full memory barrier
  * between program-order earlier lock acquisitions and program-order later
-- 
2.31.1



* [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
  2022-09-21 13:21 [PATCH 0/2] locking, tracing: Fix incorrect use of arch_spin_lock() Waiman Long
  2022-09-21 13:21 ` [PATCH 1/2] locking: Provide a low overhead do_arch_spin_lock() API Waiman Long
@ 2022-09-21 13:21 ` Waiman Long
  2022-09-21 22:17   ` Steven Rostedt
  1 sibling, 1 reply; 11+ messages in thread
From: Waiman Long @ 2022-09-21 13:21 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Steven Rostedt
  Cc: linux-kernel, Waiman Long

It was found that some tracing functions acquire an arch_spinlock_t with
both preemption and irqs enabled. That can be problematic if preemption
happens after acquiring the lock. Use the proper do_arch_spin_lock()
API, which disables preemption, to make sure that this cannot happen,
except where it is obvious that either preemption or irqs have already
been disabled.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/trace/trace.c | 52 ++++++++++++++++++++------------------------
 1 file changed, 24 insertions(+), 28 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index d3005279165d..cbb8520842ad 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1193,12 +1193,12 @@ void *tracing_cond_snapshot_data(struct trace_array *tr)
 {
 	void *cond_data = NULL;
 
-	arch_spin_lock(&tr->max_lock);
+	do_arch_spin_lock(&tr->max_lock);
 
 	if (tr->cond_snapshot)
 		cond_data = tr->cond_snapshot->cond_data;
 
-	arch_spin_unlock(&tr->max_lock);
+	do_arch_spin_unlock(&tr->max_lock);
 
 	return cond_data;
 }
@@ -1334,9 +1334,9 @@ int tracing_snapshot_cond_enable(struct trace_array *tr, void *cond_data,
 		goto fail_unlock;
 	}
 
-	arch_spin_lock(&tr->max_lock);
+	do_arch_spin_lock(&tr->max_lock);
 	tr->cond_snapshot = cond_snapshot;
-	arch_spin_unlock(&tr->max_lock);
+	do_arch_spin_unlock(&tr->max_lock);
 
 	mutex_unlock(&trace_types_lock);
 
@@ -1363,7 +1363,7 @@ int tracing_snapshot_cond_disable(struct trace_array *tr)
 {
 	int ret = 0;
 
-	arch_spin_lock(&tr->max_lock);
+	do_arch_spin_lock(&tr->max_lock);
 
 	if (!tr->cond_snapshot)
 		ret = -EINVAL;
@@ -1372,7 +1372,7 @@ int tracing_snapshot_cond_disable(struct trace_array *tr)
 		tr->cond_snapshot = NULL;
 	}
 
-	arch_spin_unlock(&tr->max_lock);
+	do_arch_spin_unlock(&tr->max_lock);
 
 	return ret;
 }
@@ -1819,7 +1819,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
 		return;
 	}
 
-	arch_spin_lock(&tr->max_lock);
+	do_arch_spin_lock(&tr->max_lock);
 
 	/* Inherit the recordable setting from array_buffer */
 	if (ring_buffer_record_is_set_on(tr->array_buffer.buffer))
@@ -1836,7 +1836,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
 	__update_max_tr(tr, tsk, cpu);
 
  out_unlock:
-	arch_spin_unlock(&tr->max_lock);
+	do_arch_spin_unlock(&tr->max_lock);
 }
 
 /**
@@ -1862,7 +1862,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
 		return;
 	}
 
-	arch_spin_lock(&tr->max_lock);
+	do_arch_spin_lock(&tr->max_lock);
 
 	ret = ring_buffer_swap_cpu(tr->max_buffer.buffer, tr->array_buffer.buffer, cpu);
 
@@ -1880,7 +1880,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
 	WARN_ON_ONCE(ret && ret != -EAGAIN && ret != -EBUSY);
 
 	__update_max_tr(tr, tsk, cpu);
-	arch_spin_unlock(&tr->max_lock);
+	do_arch_spin_unlock(&tr->max_lock);
 }
 #endif /* CONFIG_TRACER_MAX_TRACE */
 
@@ -2413,7 +2413,7 @@ static int trace_save_cmdline(struct task_struct *tsk)
 	 * nor do we want to disable interrupts,
 	 * so if we miss here, then better luck next time.
 	 */
-	if (!arch_spin_trylock(&trace_cmdline_lock))
+	if (!do_arch_spin_trylock(&trace_cmdline_lock))
 		return 0;
 
 	idx = savedcmd->map_pid_to_cmdline[tpid];
@@ -2427,7 +2427,7 @@ static int trace_save_cmdline(struct task_struct *tsk)
 	savedcmd->map_cmdline_to_pid[idx] = tsk->pid;
 	set_cmdline(idx, tsk->comm);
 
-	arch_spin_unlock(&trace_cmdline_lock);
+	do_arch_spin_unlock(&trace_cmdline_lock);
 
 	return 1;
 }
@@ -2461,13 +2461,11 @@ static void __trace_find_cmdline(int pid, char comm[])
 
 void trace_find_cmdline(int pid, char comm[])
 {
-	preempt_disable();
-	arch_spin_lock(&trace_cmdline_lock);
+	do_arch_spin_lock(&trace_cmdline_lock);
 
 	__trace_find_cmdline(pid, comm);
 
-	arch_spin_unlock(&trace_cmdline_lock);
-	preempt_enable();
+	do_arch_spin_unlock(&trace_cmdline_lock);
 }
 
 static int *trace_find_tgid_ptr(int pid)
@@ -5829,8 +5827,7 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
 	void *v;
 	loff_t l = 0;
 
-	preempt_disable();
-	arch_spin_lock(&trace_cmdline_lock);
+	do_arch_spin_lock(&trace_cmdline_lock);
 
 	v = &savedcmd->map_cmdline_to_pid[0];
 	while (l <= *pos) {
@@ -5844,8 +5841,7 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
 
 static void saved_cmdlines_stop(struct seq_file *m, void *v)
 {
-	arch_spin_unlock(&trace_cmdline_lock);
-	preempt_enable();
+	do_arch_spin_unlock(&trace_cmdline_lock);
 }
 
 static int saved_cmdlines_show(struct seq_file *m, void *v)
@@ -5890,9 +5886,9 @@ tracing_saved_cmdlines_size_read(struct file *filp, char __user *ubuf,
 	char buf[64];
 	int r;
 
-	arch_spin_lock(&trace_cmdline_lock);
+	do_arch_spin_lock(&trace_cmdline_lock);
 	r = scnprintf(buf, sizeof(buf), "%u\n", savedcmd->cmdline_num);
-	arch_spin_unlock(&trace_cmdline_lock);
+	do_arch_spin_unlock(&trace_cmdline_lock);
 
 	return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
 }
@@ -5917,10 +5913,10 @@ static int tracing_resize_saved_cmdlines(unsigned int val)
 		return -ENOMEM;
 	}
 
-	arch_spin_lock(&trace_cmdline_lock);
+	do_arch_spin_lock(&trace_cmdline_lock);
 	savedcmd_temp = savedcmd;
 	savedcmd = s;
-	arch_spin_unlock(&trace_cmdline_lock);
+	do_arch_spin_unlock(&trace_cmdline_lock);
 	free_saved_cmdlines_buffer(savedcmd_temp);
 
 	return 0;
@@ -6373,10 +6369,10 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
 
 #ifdef CONFIG_TRACER_SNAPSHOT
 	if (t->use_max_tr) {
-		arch_spin_lock(&tr->max_lock);
+		do_arch_spin_lock(&tr->max_lock);
 		if (tr->cond_snapshot)
 			ret = -EBUSY;
-		arch_spin_unlock(&tr->max_lock);
+		do_arch_spin_unlock(&tr->max_lock);
 		if (ret)
 			goto out;
 	}
@@ -7436,10 +7432,10 @@ tracing_snapshot_write(struct file *filp, const char __user *ubuf, size_t cnt,
 		goto out;
 	}
 
-	arch_spin_lock(&tr->max_lock);
+	do_arch_spin_lock(&tr->max_lock);
 	if (tr->cond_snapshot)
 		ret = -EBUSY;
-	arch_spin_unlock(&tr->max_lock);
+	do_arch_spin_unlock(&tr->max_lock);
 	if (ret)
 		goto out;
 
-- 
2.31.1



* Re: [PATCH 1/2] locking: Provide a low overhead do_arch_spin_lock() API
  2022-09-21 13:21 ` [PATCH 1/2] locking: Provide a low overhead do_arch_spin_lock() API Waiman Long
@ 2022-09-21 22:06   ` Steven Rostedt
  2022-09-22  0:51     ` Waiman Long
  0 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2022-09-21 22:06 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

On Wed, 21 Sep 2022 09:21:51 -0400
Waiman Long <longman@redhat.com> wrote:

> There are some code paths in the kernel, such as tracing and RCU, that
> want to use a spinlock without the lock debugging overhead (lockdep,
> etc.). Provide a do_arch_spin_lock() API that does the proper preemption
> disabling and enabling without any debugging or tracing overhead.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>

Please don't do this. arch_spin_lock() is very special, and if tracing did
it wrong, then it needs to be fixed in the tracing code.

Let's not add handlers to make it easier to use arch_spin_lock(). Tracing
is special and arch_spin_lock() helps keep it from tracing itself.

-- Steve


* Re: [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
  2022-09-21 13:21 ` [PATCH 2/2] tracing: Use proper " Waiman Long
@ 2022-09-21 22:17   ` Steven Rostedt
  2022-09-22  0:50     ` Waiman Long
                       ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Steven Rostedt @ 2022-09-21 22:17 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

On Wed, 21 Sep 2022 09:21:52 -0400
Waiman Long <longman@redhat.com> wrote:

> It was found that some tracing functions acquire an arch_spinlock_t with
> both preemption and irqs enabled. That can be problematic if preemption
> happens after acquiring the lock. Use the proper do_arch_spin_lock()
> API, which disables preemption, to make sure that this cannot happen,
> except where it is obvious that either preemption or irqs have already
> been disabled.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  kernel/trace/trace.c | 52 ++++++++++++++++++++------------------------
>  1 file changed, 24 insertions(+), 28 deletions(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index d3005279165d..cbb8520842ad 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -1193,12 +1193,12 @@ void *tracing_cond_snapshot_data(struct trace_array *tr)
>  {
>  	void *cond_data = NULL;
>  
> -	arch_spin_lock(&tr->max_lock);
> +	do_arch_spin_lock(&tr->max_lock);

This should actually disable interrupts and not preemption.
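
That is, something along these lines (untested sketch):

	unsigned long flags;

	local_irq_save(flags);
	arch_spin_lock(&tr->max_lock);
	/* ... critical section ... */
	arch_spin_unlock(&tr->max_lock);
	local_irq_restore(flags);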

>  
>  	if (tr->cond_snapshot)
>  		cond_data = tr->cond_snapshot->cond_data;
>  
> -	arch_spin_unlock(&tr->max_lock);
> +	do_arch_spin_unlock(&tr->max_lock);
>  
>  	return cond_data;
>  }
> @@ -1334,9 +1334,9 @@ int tracing_snapshot_cond_enable(struct trace_array *tr, void *cond_data,
>  		goto fail_unlock;
>  	}
>  
> -	arch_spin_lock(&tr->max_lock);
> +	do_arch_spin_lock(&tr->max_lock);

Same here.

>  	tr->cond_snapshot = cond_snapshot;
> -	arch_spin_unlock(&tr->max_lock);
> +	do_arch_spin_unlock(&tr->max_lock);
>  
>  	mutex_unlock(&trace_types_lock);
>  
> @@ -1363,7 +1363,7 @@ int tracing_snapshot_cond_disable(struct trace_array *tr)
>  {
>  	int ret = 0;
>  
> -	arch_spin_lock(&tr->max_lock);
> +	do_arch_spin_lock(&tr->max_lock);

And here.

>  
>  	if (!tr->cond_snapshot)
>  		ret = -EINVAL;
> @@ -1372,7 +1372,7 @@ int tracing_snapshot_cond_disable(struct trace_array *tr)
>  		tr->cond_snapshot = NULL;
>  	}
>  
> -	arch_spin_unlock(&tr->max_lock);
> +	do_arch_spin_unlock(&tr->max_lock);
>  
>  	return ret;
>  }
> @@ -1819,7 +1819,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
>  		return;
>  	}
>  
> -	arch_spin_lock(&tr->max_lock);
> +	do_arch_spin_lock(&tr->max_lock);

Nothing here is needed, as interrupts had better be disabled when this
function is called. And there's already a:

	WARN_ON_ONCE(!irqs_disabled());

>  
>  	/* Inherit the recordable setting from array_buffer */
>  	if (ring_buffer_record_is_set_on(tr->array_buffer.buffer))
> @@ -1836,7 +1836,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
>  	__update_max_tr(tr, tsk, cpu);
>  
>   out_unlock:
> -	arch_spin_unlock(&tr->max_lock);
> +	do_arch_spin_unlock(&tr->max_lock);

Nothing needs to be done here.

>  }
>  
>  /**
> @@ -1862,7 +1862,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
>  		return;
>  	}
>  
> -	arch_spin_lock(&tr->max_lock);
> +	do_arch_spin_lock(&tr->max_lock);

Same here. Interrupts had better be disabled in this function.

>  
>  	ret = ring_buffer_swap_cpu(tr->max_buffer.buffer, tr->array_buffer.buffer, cpu);
>  
> @@ -1880,7 +1880,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
>  	WARN_ON_ONCE(ret && ret != -EAGAIN && ret != -EBUSY);
>  
>  	__update_max_tr(tr, tsk, cpu);
> -	arch_spin_unlock(&tr->max_lock);
> +	do_arch_spin_unlock(&tr->max_lock);

Nothing to do here.

>  }
>  #endif /* CONFIG_TRACER_MAX_TRACE */
>  
> @@ -2413,7 +2413,7 @@ static int trace_save_cmdline(struct task_struct *tsk)
>  	 * nor do we want to disable interrupts,
>  	 * so if we miss here, then better luck next time.
>  	 */
> -	if (!arch_spin_trylock(&trace_cmdline_lock))
> +	if (!do_arch_spin_trylock(&trace_cmdline_lock))

This is called within the scheduler and wake up, so interrupts had better
be disabled (run queue lock is held).

>  		return 0;
>  
>  	idx = savedcmd->map_pid_to_cmdline[tpid];
> @@ -2427,7 +2427,7 @@ static int trace_save_cmdline(struct task_struct *tsk)
>  	savedcmd->map_cmdline_to_pid[idx] = tsk->pid;
>  	set_cmdline(idx, tsk->comm);
>  
> -	arch_spin_unlock(&trace_cmdline_lock);
> +	do_arch_spin_unlock(&trace_cmdline_lock);

Nothing to do here.

>  
>  	return 1;
>  }
> @@ -2461,13 +2461,11 @@ static void __trace_find_cmdline(int pid, char comm[])
>  
>  void trace_find_cmdline(int pid, char comm[])
>  {
> -	preempt_disable();
> -	arch_spin_lock(&trace_cmdline_lock);
> +	do_arch_spin_lock(&trace_cmdline_lock);

Keep this as is (with the open coded preempt_disable()).

>  
>  	__trace_find_cmdline(pid, comm);
>  
> -	arch_spin_unlock(&trace_cmdline_lock);
> -	preempt_enable();
> +	do_arch_spin_unlock(&trace_cmdline_lock);
>  }
>  
>  static int *trace_find_tgid_ptr(int pid)
> @@ -5829,8 +5827,7 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
>  	void *v;
>  	loff_t l = 0;
>  
> -	preempt_disable();
> -	arch_spin_lock(&trace_cmdline_lock);
> +	do_arch_spin_lock(&trace_cmdline_lock);

This too.

>  
>  	v = &savedcmd->map_cmdline_to_pid[0];
>  	while (l <= *pos) {
> @@ -5844,8 +5841,7 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
>  
>  static void saved_cmdlines_stop(struct seq_file *m, void *v)
>  {
> -	arch_spin_unlock(&trace_cmdline_lock);
> -	preempt_enable();
> +	do_arch_spin_unlock(&trace_cmdline_lock);

And this.

>  }
>  
>  static int saved_cmdlines_show(struct seq_file *m, void *v)
> @@ -5890,9 +5886,9 @@ tracing_saved_cmdlines_size_read(struct file *filp, char __user *ubuf,
>  	char buf[64];
>  	int r;
>  
> -	arch_spin_lock(&trace_cmdline_lock);
> +	do_arch_spin_lock(&trace_cmdline_lock);

Yeah, we should add preempt_disable() here.
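
Something like this (untested sketch):

	preempt_disable();
	arch_spin_lock(&trace_cmdline_lock);
	r = scnprintf(buf, sizeof(buf), "%u\n", savedcmd->cmdline_num);
	arch_spin_unlock(&trace_cmdline_lock);
	preempt_enable();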

>  	r = scnprintf(buf, sizeof(buf), "%u\n", savedcmd->cmdline_num);
> -	arch_spin_unlock(&trace_cmdline_lock);
> +	do_arch_spin_unlock(&trace_cmdline_lock);
>  
>  	return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
>  }
> @@ -5917,10 +5913,10 @@ static int tracing_resize_saved_cmdlines(unsigned int val)
>  		return -ENOMEM;
>  	}
>  
> -	arch_spin_lock(&trace_cmdline_lock);
> +	do_arch_spin_lock(&trace_cmdline_lock);

And here.

>  	savedcmd_temp = savedcmd;
>  	savedcmd = s;
> -	arch_spin_unlock(&trace_cmdline_lock);
> +	do_arch_spin_unlock(&trace_cmdline_lock);
>  	free_saved_cmdlines_buffer(savedcmd_temp);
>  
>  	return 0;
> @@ -6373,10 +6369,10 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
>  
>  #ifdef CONFIG_TRACER_SNAPSHOT
>  	if (t->use_max_tr) {
> -		arch_spin_lock(&tr->max_lock);
> +		do_arch_spin_lock(&tr->max_lock);

Add preemption disabling.

>  		if (tr->cond_snapshot)
>  			ret = -EBUSY;
> -		arch_spin_unlock(&tr->max_lock);
> +		do_arch_spin_unlock(&tr->max_lock);
>  		if (ret)
>  			goto out;
>  	}
> @@ -7436,10 +7432,10 @@ tracing_snapshot_write(struct file *filp, const char __user *ubuf, size_t cnt,
>  		goto out;
>  	}
>  
> -	arch_spin_lock(&tr->max_lock);
> +	do_arch_spin_lock(&tr->max_lock);

And this should disable interrupts first.

-- Steve

>  	if (tr->cond_snapshot)
>  		ret = -EBUSY;
> -	arch_spin_unlock(&tr->max_lock);
> +	do_arch_spin_unlock(&tr->max_lock);
>  	if (ret)
>  		goto out;
>  



* Re: [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
  2022-09-21 22:17   ` Steven Rostedt
@ 2022-09-22  0:50     ` Waiman Long
       [not found]     ` <3789bf4d-d699-63d4-c97b-c8d9524cbc2f@redhat.com>
  2022-09-22  7:55     ` Peter Zijlstra
  2 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2022-09-22  0:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

On 9/21/22 18:17, Steven Rostedt wrote:
> On Wed, 21 Sep 2022 09:21:52 -0400
> Waiman Long <longman@redhat.com> wrote:
>
>> It was found that some tracing functions acquire an arch_spinlock_t with
>> both preemption and irqs enabled. That can be problematic if preemption
>> happens after acquiring the lock. Use the proper do_arch_spin_lock()
>> API, which disables preemption, to make sure that this cannot happen,
>> except where it is obvious that either preemption or irqs have already
>> been disabled.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   kernel/trace/trace.c | 52 ++++++++++++++++++++------------------------
>>   1 file changed, 24 insertions(+), 28 deletions(-)
>>
>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
>> index d3005279165d..cbb8520842ad 100644
>> --- a/kernel/trace/trace.c
>> +++ b/kernel/trace/trace.c
>> @@ -1193,12 +1193,12 @@ void *tracing_cond_snapshot_data(struct trace_array *tr)
>>   {
>>   	void *cond_data = NULL;
>>   
>> -	arch_spin_lock(&tr->max_lock);
>> +	do_arch_spin_lock(&tr->max_lock);
> This should actually disable interrupts and not preemption.
>
>>   
>>   	if (tr->cond_snapshot)
>>   		cond_data = tr->cond_snapshot->cond_data;
>>   
>> -	arch_spin_unlock(&tr->max_lock);
>> +	do_arch_spin_unlock(&tr->max_lock);
>>   
>>   	return cond_data;
>>   }
>> @@ -1334,9 +1334,9 @@ int tracing_snapshot_cond_enable(struct trace_array *tr, void *cond_data,
>>   		goto fail_unlock;
>>   	}
>>   
>> -	arch_spin_lock(&tr->max_lock);
>> +	do_arch_spin_lock(&tr->max_lock);
> Same here.
>
>>   	tr->cond_snapshot = cond_snapshot;
>> -	arch_spin_unlock(&tr->max_lock);
>> +	do_arch_spin_unlock(&tr->max_lock);
>>   
>>   	mutex_unlock(&trace_types_lock);
>>   
>> @@ -1363,7 +1363,7 @@ int tracing_snapshot_cond_disable(struct trace_array *tr)
>>   {
>>   	int ret = 0;
>>   
>> -	arch_spin_lock(&tr->max_lock);
>> +	do_arch_spin_lock(&tr->max_lock);
> And here.
>
>>   
>>   	if (!tr->cond_snapshot)
>>   		ret = -EINVAL;
>> @@ -1372,7 +1372,7 @@ int tracing_snapshot_cond_disable(struct trace_array *tr)
>>   		tr->cond_snapshot = NULL;
>>   	}
>>   
>> -	arch_spin_unlock(&tr->max_lock);
>> +	do_arch_spin_unlock(&tr->max_lock);
>>   
>>   	return ret;
>>   }
>> @@ -1819,7 +1819,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
>>   		return;
>>   	}
>>   
>> -	arch_spin_lock(&tr->max_lock);
>> +	do_arch_spin_lock(&tr->max_lock);
> Nothing here is needed, as interrupts had better be disabled when this
> function is called. And there's already a:
>
> 	WARN_ON_ONCE(!irqs_disabled());

Sorry, I missed that.


>
>>   
>>   	/* Inherit the recordable setting from array_buffer */
>>   	if (ring_buffer_record_is_set_on(tr->array_buffer.buffer))
>> @@ -1836,7 +1836,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
>>   	__update_max_tr(tr, tsk, cpu);
>>   
>>    out_unlock:
>> -	arch_spin_unlock(&tr->max_lock);
>> +	do_arch_spin_unlock(&tr->max_lock);
> Nothing needs to be done here.
Right.
>
>>   }
>>   
>>   /**
>> @@ -1862,7 +1862,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
>>   		return;
>>   	}
>>   
>> -	arch_spin_lock(&tr->max_lock);
>> +	do_arch_spin_lock(&tr->max_lock);
> Same here. Interrupts had better be disabled in this function.
>
>>   
>>   	ret = ring_buffer_swap_cpu(tr->max_buffer.buffer, tr->array_buffer.buffer, cpu);
>>   
>> @@ -1880,7 +1880,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
>>   	WARN_ON_ONCE(ret && ret != -EAGAIN && ret != -EBUSY);
>>   
>>   	__update_max_tr(tr, tsk, cpu);
>> -	arch_spin_unlock(&tr->max_lock);
>> +	do_arch_spin_unlock(&tr->max_lock);
> Nothing to do here.
>
>>   }
>>   #endif /* CONFIG_TRACER_MAX_TRACE */
>>   
>> @@ -2413,7 +2413,7 @@ static int trace_save_cmdline(struct task_struct *tsk)
>>   	 * nor do we want to disable interrupts,
>>   	 * so if we miss here, then better luck next time.
>>   	 */
>> -	if (!arch_spin_trylock(&trace_cmdline_lock))
>> +	if (!do_arch_spin_trylock(&trace_cmdline_lock))
> This is called within the scheduler and wake up, so interrupts had better
> be disabled (run queue lock is held).
>
>>   		return 0;
>>   
>>   	idx = savedcmd->map_pid_to_cmdline[tpid];
>> @@ -2427,7 +2427,7 @@ static int trace_save_cmdline(struct task_struct *tsk)
>>   	savedcmd->map_cmdline_to_pid[idx] = tsk->pid;
>>   	set_cmdline(idx, tsk->comm);
>>   
>> -	arch_spin_unlock(&trace_cmdline_lock);
>> +	do_arch_spin_unlock(&trace_cmdline_lock);
> Nothing to do here.
>
>>   
>>   	return 1;
>>   }
>> @@ -2461,13 +2461,11 @@ static void __trace_find_cmdline(int pid, char comm[])
>>   
>>   void trace_find_cmdline(int pid, char comm[])
>>   {
>> -	preempt_disable();
>> -	arch_spin_lock(&trace_cmdline_lock);
>> +	do_arch_spin_lock(&trace_cmdline_lock);
> Keep this as is (with the open coded preempt_disable()).
OK.
>
>>   
>>   	__trace_find_cmdline(pid, comm);
>>   
>> -	arch_spin_unlock(&trace_cmdline_lock);
>> -	preempt_enable();
>> +	do_arch_spin_unlock(&trace_cmdline_lock);
>>   }
>>   
>>   static int *trace_find_tgid_ptr(int pid)
>> @@ -5829,8 +5827,7 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
>>   	void *v;
>>   	loff_t l = 0;
>>   
>> -	preempt_disable();
>> -	arch_spin_lock(&trace_cmdline_lock);
>> +	do_arch_spin_lock(&trace_cmdline_lock);
> This too.
>
>>   
>>   	v = &savedcmd->map_cmdline_to_pid[0];
>>   	while (l <= *pos) {
>> @@ -5844,8 +5841,7 @@ static void *saved_cmdlines_start(struct seq_file *m, loff_t *pos)
>>   
>>   static void saved_cmdlines_stop(struct seq_file *m, void *v)
>>   {
>> -	arch_spin_unlock(&trace_cmdline_lock);
>> -	preempt_enable();
>> +	do_arch_spin_unlock(&trace_cmdline_lock);
> And this.
>
>>   }
>>   
>>   static int saved_cmdlines_show(struct seq_file *m, void *v)
>> @@ -5890,9 +5886,9 @@ tracing_saved_cmdlines_size_read(struct file *filp, char __user *ubuf,
>>   	char buf[64];
>>   	int r;
>>   
>> -	arch_spin_lock(&trace_cmdline_lock);
>> +	do_arch_spin_lock(&trace_cmdline_lock);
> Yeah, we should add preempt_disable() here.
>
>>   	r = scnprintf(buf, sizeof(buf), "%u\n", savedcmd->cmdline_num);
>> -	arch_spin_unlock(&trace_cmdline_lock);
>> +	do_arch_spin_unlock(&trace_cmdline_lock);
>>   
>>   	return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
>>   }
>> @@ -5917,10 +5913,10 @@ static int tracing_resize_saved_cmdlines(unsigned int val)
>>   		return -ENOMEM;
>>   	}
>>   
>> -	arch_spin_lock(&trace_cmdline_lock);
>> +	do_arch_spin_lock(&trace_cmdline_lock);
> And here.
>
>>   	savedcmd_temp = savedcmd;
>>   	savedcmd = s;
>> -	arch_spin_unlock(&trace_cmdline_lock);
>> +	do_arch_spin_unlock(&trace_cmdline_lock);
>>   	free_saved_cmdlines_buffer(savedcmd_temp);
>>   
>>   	return 0;
>> @@ -6373,10 +6369,10 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
>>   
>>   #ifdef CONFIG_TRACER_SNAPSHOT
>>   	if (t->use_max_tr) {
>> -		arch_spin_lock(&tr->max_lock);
>> +		do_arch_spin_lock(&tr->max_lock);
> Add preemption disabling.
>
>>   		if (tr->cond_snapshot)
>>   			ret = -EBUSY;
>> -		arch_spin_unlock(&tr->max_lock);
>> +		do_arch_spin_unlock(&tr->max_lock);
>>   		if (ret)
>>   			goto out;
>>   	}
>> @@ -7436,10 +7432,10 @@ tracing_snapshot_write(struct file *filp, const char __user *ubuf, size_t cnt,
>>   		goto out;
>>   	}
>>   
>> -	arch_spin_lock(&tr->max_lock);
>> +	do_arch_spin_lock(&tr->max_lock);
> And this should disable interrupts first.
>
> -- Steve

Thanks for the comments, will modify the patch as suggested.

Cheers,
Longman



* Re: [PATCH 1/2] locking: Provide a low overhead do_arch_spin_lock() API
  2022-09-21 22:06   ` Steven Rostedt
@ 2022-09-22  0:51     ` Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2022-09-22  0:51 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel


On 9/21/22 18:06, Steven Rostedt wrote:
> On Wed, 21 Sep 2022 09:21:51 -0400
> Waiman Long <longman@redhat.com> wrote:
>
>> There are some code paths in the kernel, such as tracing and RCU, that
>> want to use a spinlock without the lock debugging overhead (lockdep,
>> etc.). Provide a do_arch_spin_lock() API that does the proper preemption
>> disabling and enabling without any debugging or tracing overhead.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
> Please don't do this. arch_spin_lock() is very special, and if tracing did
> it wrong, then it needs to be fixed in the tracing code.
>
> Let's not add handlers to make it easier to use arch_spin_lock(). Tracing
> is special and arch_spin_lock() helps keep it from tracing itself.

That makes sense to me. Will remove this patch.

Thanks,
Longman



* Re: [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
       [not found]     ` <3789bf4d-d699-63d4-c97b-c8d9524cbc2f@redhat.com>
@ 2022-09-22  2:08       ` Steven Rostedt
  0 siblings, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2022-09-22  2:08 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

On Wed, 21 Sep 2022 21:48:41 -0400
Waiman Long <longman@redhat.com> wrote:

> On 9/21/22 18:17, Steven Rostedt wrote:
> >>   	savedcmd_temp = savedcmd;
> >>   	savedcmd = s;
> >> -	arch_spin_unlock(&trace_cmdline_lock);
> >> +	do_arch_spin_unlock(&trace_cmdline_lock);
> >>   	free_saved_cmdlines_buffer(savedcmd_temp);
> >>   
> >>   	return 0;
> >> @@ -6373,10 +6369,10 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
> >>   
> >>   #ifdef CONFIG_TRACER_SNAPSHOT
> >>   	if (t->use_max_tr) {
> >> -		arch_spin_lock(&tr->max_lock);
> >> +		do_arch_spin_lock(&tr->max_lock);  
> > Add preemption disabling.
> >  
> The pattern that I have seen so far is to disable preemption for 
> trace_cmdline_lock, but interrupts for max_lock. So should we also 
> disable interrupts here instead of preemption?
> 

Ah yes.

Thanks,

-- Steve


* Re: [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
  2022-09-21 22:17   ` Steven Rostedt
  2022-09-22  0:50     ` Waiman Long
       [not found]     ` <3789bf4d-d699-63d4-c97b-c8d9524cbc2f@redhat.com>
@ 2022-09-22  7:55     ` Peter Zijlstra
  2022-09-22 12:25       ` Waiman Long
  2022-09-22 12:40       ` Steven Rostedt
  2 siblings, 2 replies; 11+ messages in thread
From: Peter Zijlstra @ 2022-09-22  7:55 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Waiman Long, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel


I'm with Steve on not adding wrappers for this; people that use
arch_spinlock_* get to keep the pieces :-)

On Wed, Sep 21, 2022 at 06:17:21PM -0400, Steven Rostedt wrote:

> > @@ -1819,7 +1819,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
> >  		return;
> >  	}
> >  
> > -	arch_spin_lock(&tr->max_lock);
> > +	do_arch_spin_lock(&tr->max_lock);
> 
> Nothing here is needed, as interrupts had better be disabled when this
> function is called. And there's already a:
> 
> 	WARN_ON_ONCE(!irqs_disabled());

You can write that as lockdep_assert_irqs_disabled();


* Re: [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
  2022-09-22  7:55     ` Peter Zijlstra
@ 2022-09-22 12:25       ` Waiman Long
  2022-09-22 12:40       ` Steven Rostedt
  1 sibling, 0 replies; 11+ messages in thread
From: Waiman Long @ 2022-09-22 12:25 UTC (permalink / raw)
  To: Peter Zijlstra, Steven Rostedt
  Cc: Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

On 9/22/22 03:55, Peter Zijlstra wrote:
> I'm with Steve on not adding wrappers for this; people that use
> arch_spinlock_* get to keep the pieces :-)

Yes, I am going to drop patch 1 and open-code all the changes in a new 
patch.

Cheers,
Longman



* Re: [PATCH 2/2] tracing: Use proper do_arch_spin_lock() API
  2022-09-22  7:55     ` Peter Zijlstra
  2022-09-22 12:25       ` Waiman Long
@ 2022-09-22 12:40       ` Steven Rostedt
  1 sibling, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2022-09-22 12:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Ingo Molnar, Will Deacon, Boqun Feng, linux-kernel

On Thu, 22 Sep 2022 09:55:29 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> > Nothing here is needed, as interrupts had better be disabled when this
> > function is called. And there's already a:
> > 
> > 	WARN_ON_ONCE(!irqs_disabled());  
> 
> You can write that as lockdep_assert_irqs_disabled();

Of course we can ;-)  But this was written before that existed.

-- Steve

