All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] wchan: Fix wchan support
@ 2021-10-08 11:15 Peter Zijlstra
  2021-10-08 11:15 ` [PATCH 1/7] Revert "proc/wchan: use printk format instead of lookup_symbol_name()" Peter Zijlstra
                   ` (7 more replies)
  0 siblings, 8 replies; 29+ messages in thread
From: Peter Zijlstra @ 2021-10-08 11:15 UTC (permalink / raw)
  To: keescook, jannh
  Cc: linux-kernel, peterz, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, mark.rutland, axboe, metze, laijs, luto, dave.hansen,
	ebiederm, ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe,
	linux-hardening, linux-arch, vgupta, linux, will, guoren, bcain,
	monstr, tsbogend, nickhu, jonas, mpe, paul.walmsley, hca, ysato,
	davem, chris

Hi,

This fixes up wchan which is various degrees of broken across the
architectures.

Patch 4 fixes wchan for x86, which has been returning 0 for the past many
releases.

Patch 5 fixes the fundamental race against scheduling.

Patch 6 deletes a lot and makes STACKTRACE unconditional

patch 7 fixes up a few STACKTRACE arch oddities

0day says all builds are good, so it must be perfect :-) I'm planning on
queueing up at least the first 5 patches, but I'm hoping the last two patches
can be too.

Also available here:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/wchan

---
 arch/alpha/include/asm/processor.h      |  2 +-
 arch/alpha/kernel/process.c             |  5 ++-
 arch/arc/include/asm/processor.h        |  2 --
 arch/arc/kernel/stacktrace.c            | 19 +---------
 arch/arm/include/asm/processor.h        |  2 --
 arch/arm/kernel/process.c               | 24 -------------
 arch/arm64/include/asm/processor.h      |  2 --
 arch/arm64/kernel/process.c             | 28 ---------------
 arch/csky/include/asm/processor.h       |  2 --
 arch/csky/kernel/stacktrace.c           | 26 ++++----------
 arch/h8300/include/asm/processor.h      |  2 +-
 arch/h8300/kernel/process.c             |  5 +--
 arch/hexagon/include/asm/processor.h    |  3 --
 arch/hexagon/kernel/process.c           | 28 ---------------
 arch/ia64/include/asm/processor.h       |  3 --
 arch/ia64/kernel/process.c              | 31 -----------------
 arch/m68k/include/asm/processor.h       |  2 +-
 arch/m68k/kernel/process.c              |  4 +--
 arch/microblaze/include/asm/processor.h |  2 --
 arch/microblaze/kernel/process.c        |  6 ----
 arch/mips/include/asm/processor.h       |  2 --
 arch/mips/kernel/process.c              | 31 +----------------
 arch/mips/kernel/stacktrace.c           | 27 ++++++++------
 arch/nds32/include/asm/processor.h      |  2 --
 arch/nds32/kernel/process.c             | 28 ---------------
 arch/nds32/kernel/stacktrace.c          | 21 +++++------
 arch/nios2/include/asm/processor.h      |  2 +-
 arch/nios2/kernel/process.c             |  5 +--
 arch/openrisc/include/asm/processor.h   |  1 -
 arch/openrisc/kernel/process.c          |  6 ----
 arch/parisc/include/asm/processor.h     |  2 --
 arch/parisc/kernel/process.c            | 27 --------------
 arch/powerpc/include/asm/processor.h    |  2 --
 arch/powerpc/kernel/process.c           | 40 ---------------------
 arch/riscv/include/asm/processor.h      |  3 --
 arch/riscv/kernel/stacktrace.c          | 23 ------------
 arch/s390/include/asm/processor.h       |  1 -
 arch/s390/kernel/process.c              | 29 ---------------
 arch/sh/include/asm/processor_32.h      |  2 --
 arch/sh/kernel/process_32.c             | 22 ------------
 arch/sparc/include/asm/processor_32.h   |  2 +-
 arch/sparc/include/asm/processor_64.h   |  2 --
 arch/sparc/kernel/process_32.c          |  5 +--
 arch/sparc/kernel/process_64.c          | 31 -----------------
 arch/um/include/asm/processor-generic.h |  1 -
 arch/um/kernel/process.c                | 35 -------------------
 arch/x86/include/asm/processor.h        |  2 --
 arch/x86/kernel/process.c               | 62 ---------------------------------
 arch/xtensa/include/asm/processor.h     |  2 --
 arch/xtensa/kernel/process.c            | 32 -----------------
 fs/proc/array.c                         |  7 ++--
 fs/proc/base.c                          | 19 +++++-----
 include/linux/sched.h                   |  1 +
 kernel/sched/core.c                     | 34 ++++++++++++++++++
 lib/Kconfig.debug                       |  7 +---
 scripts/leaking_addresses.pl            |  3 +-
 56 files changed, 97 insertions(+), 622 deletions(-)


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/7] Revert "proc/wchan: use printk format instead of lookup_symbol_name()"
  2021-10-08 11:15 [PATCH 0/7] wchan: Fix wchan support Peter Zijlstra
@ 2021-10-08 11:15 ` Peter Zijlstra
  2021-10-15  9:45   ` [tip: sched/core] " tip-bot2 for Kees Cook
  2021-10-08 11:15 ` [PATCH 2/7] leaking_addresses: Always print a trailing newline Peter Zijlstra
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2021-10-08 11:15 UTC (permalink / raw)
  To: keescook, jannh
  Cc: linux-kernel, peterz, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, mark.rutland, axboe, metze, laijs, luto, dave.hansen,
	ebiederm, ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe,
	linux-hardening, linux-arch, vgupta, linux, will, guoren, bcain,
	monstr, tsbogend, nickhu, jonas, mpe, paul.walmsley, hca, ysato,
	davem, chris, kernel test robot, stable

From: Kees Cook <keescook@chromium.org>

This reverts commit 152c432b128cb043fc107e8f211195fe94b2159c.

When a kernel address couldn't be symbolized for /proc/$pid/wchan, it
would leak the raw value, a potential information exposure. This is a
regression compared to the safer pre-v5.12 behavior.

Reported-by: kernel test robot <oliver.sang@intel.com>
Reported-by: Vito Caputo <vcaputo@pengaru.com>
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
---
 fs/proc/base.c |   21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -67,6 +67,7 @@
 #include <linux/mm.h>
 #include <linux/swap.h>
 #include <linux/rcupdate.h>
+#include <linux/kallsyms.h>
 #include <linux/stacktrace.h>
 #include <linux/resource.h>
 #include <linux/module.h>
@@ -386,17 +387,19 @@ static int proc_pid_wchan(struct seq_fil
 			  struct pid *pid, struct task_struct *task)
 {
 	unsigned long wchan;
+	char symname[KSYM_NAME_LEN];
 
-	if (ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS))
-		wchan = get_wchan(task);
-	else
-		wchan = 0;
-
-	if (wchan)
-		seq_printf(m, "%ps", (void *) wchan);
-	else
-		seq_putc(m, '0');
+	if (!ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS))
+		goto print0;
 
+	wchan = get_wchan(task);
+	if (wchan && !lookup_symbol_name(wchan, symname)) {
+		seq_puts(m, symname);
+		return 0;
+	}
+
+print0:
+	seq_putc(m, '0');
 	return 0;
 }
 #endif /* CONFIG_KALLSYMS */



^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 2/7] leaking_addresses: Always print a trailing newline
  2021-10-08 11:15 [PATCH 0/7] wchan: Fix wchan support Peter Zijlstra
  2021-10-08 11:15 ` [PATCH 1/7] Revert "proc/wchan: use printk format instead of lookup_symbol_name()" Peter Zijlstra
@ 2021-10-08 11:15 ` Peter Zijlstra
  2021-10-15  9:45   ` [tip: sched/core] " tip-bot2 for Kees Cook
  2021-10-08 11:15 ` [PATCH 3/7] proc: Use task_is_running() for wchan in /proc/$pid/stat Peter Zijlstra
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2021-10-08 11:15 UTC (permalink / raw)
  To: keescook, jannh
  Cc: linux-kernel, peterz, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, mark.rutland, axboe, metze, laijs, luto, dave.hansen,
	ebiederm, ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe,
	linux-hardening, linux-arch, vgupta, linux, will, guoren, bcain,
	monstr, tsbogend, nickhu, jonas, mpe, paul.walmsley, hca, ysato,
	davem, chris

From: Kees Cook <keescook@chromium.org>

For files that lack trailing newlines and match a leaking address (e.g.
wchan[1]), the leaking_addresses.pl report would run together with the
next line, making things look corrupted.

Unconditionally remove the newline on input, and write it back out on
output.

[1] https://lore.kernel.org/all/20210103142726.GC30643@xsang-OptiPlex-9020/

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 scripts/leaking_addresses.pl |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/scripts/leaking_addresses.pl
+++ b/scripts/leaking_addresses.pl
@@ -455,8 +455,9 @@ sub parse_file
 
 	open my $fh, "<", $file or return;
 	while ( <$fh> ) {
+		chomp;
 		if (may_leak_address($_)) {
-			print $file . ': ' . $_;
+			printf("$file: $_\n");
 		}
 	}
 	close $fh;



^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 3/7] proc: Use task_is_running() for wchan in /proc/$pid/stat
  2021-10-08 11:15 [PATCH 0/7] wchan: Fix wchan support Peter Zijlstra
  2021-10-08 11:15 ` [PATCH 1/7] Revert "proc/wchan: use printk format instead of lookup_symbol_name()" Peter Zijlstra
  2021-10-08 11:15 ` [PATCH 2/7] leaking_addresses: Always print a trailing newline Peter Zijlstra
@ 2021-10-08 11:15 ` Peter Zijlstra
  2021-10-15  9:45   ` [tip: sched/core] " tip-bot2 for Kees Cook
  2021-10-08 11:15 ` [PATCH 4/7] x86: Fix get_wchan() to support the ORC unwinder Peter Zijlstra
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2021-10-08 11:15 UTC (permalink / raw)
  To: keescook, jannh
  Cc: linux-kernel, peterz, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, mark.rutland, axboe, metze, laijs, luto, dave.hansen,
	ebiederm, ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe,
	linux-hardening, linux-arch, vgupta, linux, will, guoren, bcain,
	monstr, tsbogend, nickhu, jonas, mpe, paul.walmsley, hca, ysato,
	davem, chris

From: Kees Cook <keescook@chromium.org>

The implementations of get_wchan() can be expensive. The only information
imparted here is whether or not a process is currently blocked in the
scheduler (and even this doesn't need to be exact). Avoid doing the
heavy lifting of stack walking and just report that information by using
task_is_running().

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 fs/proc/array.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -541,7 +541,7 @@ static int do_task_stat(struct seq_file
 	}
 
 	if (permitted && (!whole || num_threads < 2))
-		wchan = get_wchan(task);
+		wchan = !task_is_running(task);
 	if (!whole) {
 		min_flt = task->min_flt;
 		maj_flt = task->maj_flt;
@@ -606,10 +606,7 @@ static int do_task_stat(struct seq_file
 	 *
 	 * This works with older implementations of procps as well.
 	 */
-	if (wchan)
-		seq_puts(m, " 1");
-	else
-		seq_puts(m, " 0");
+	seq_put_decimal_ull(m, " ", wchan);
 
 	seq_put_decimal_ull(m, " ", 0);
 	seq_put_decimal_ull(m, " ", 0);



^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 4/7] x86: Fix get_wchan() to support the ORC unwinder
  2021-10-08 11:15 [PATCH 0/7] wchan: Fix wchan support Peter Zijlstra
                   ` (2 preceding siblings ...)
  2021-10-08 11:15 ` [PATCH 3/7] proc: Use task_is_running() for wchan in /proc/$pid/stat Peter Zijlstra
@ 2021-10-08 11:15 ` Peter Zijlstra
  2021-10-15  9:45   ` [tip: sched/core] " tip-bot2 for Qi Zheng
  2021-10-08 11:15 ` [PATCH 5/7] sched: Add wrapper for get_wchan() to keep task blocked Peter Zijlstra
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2021-10-08 11:15 UTC (permalink / raw)
  To: keescook, jannh
  Cc: linux-kernel, peterz, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, mark.rutland, axboe, metze, laijs, luto, dave.hansen,
	ebiederm, ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe,
	linux-hardening, linux-arch, vgupta, linux, will, guoren, bcain,
	monstr, tsbogend, nickhu, jonas, mpe, paul.walmsley, hca, ysato,
	davem, chris

From: Qi Zheng <zhengqi.arch@bytedance.com>

Currently, the kernel CONFIG_UNWINDER_ORC option is enabled by default
on x86, but the implementation of get_wchan() is still based on the frame
pointer unwinder, so the /proc/<pid>/wchan usually returned 0 regardless
of whether the task <pid> is running.

Reimplement get_wchan() by calling stack_trace_save_tsk(), which is
adapted to the ORC and frame pointer unwinders.

Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder")
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/process.c |   51 ++--------------------------------------------
 1 file changed, 3 insertions(+), 48 deletions(-)

--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -944,58 +944,13 @@ unsigned long arch_randomize_brk(struct
  */
 unsigned long get_wchan(struct task_struct *p)
 {
-	unsigned long start, bottom, top, sp, fp, ip, ret = 0;
-	int count = 0;
+	unsigned long entry = 0;
 
 	if (p == current || task_is_running(p))
 		return 0;
 
-	if (!try_get_task_stack(p))
-		return 0;
-
-	start = (unsigned long)task_stack_page(p);
-	if (!start)
-		goto out;
-
-	/*
-	 * Layout of the stack page:
-	 *
-	 * ----------- topmax = start + THREAD_SIZE - sizeof(unsigned long)
-	 * PADDING
-	 * ----------- top = topmax - TOP_OF_KERNEL_STACK_PADDING
-	 * stack
-	 * ----------- bottom = start
-	 *
-	 * The tasks stack pointer points at the location where the
-	 * framepointer is stored. The data on the stack is:
-	 * ... IP FP ... IP FP
-	 *
-	 * We need to read FP and IP, so we need to adjust the upper
-	 * bound by another unsigned long.
-	 */
-	top = start + THREAD_SIZE - TOP_OF_KERNEL_STACK_PADDING;
-	top -= 2 * sizeof(unsigned long);
-	bottom = start;
-
-	sp = READ_ONCE(p->thread.sp);
-	if (sp < bottom || sp > top)
-		goto out;
-
-	fp = READ_ONCE_NOCHECK(((struct inactive_task_frame *)sp)->bp);
-	do {
-		if (fp < bottom || fp > top)
-			goto out;
-		ip = READ_ONCE_NOCHECK(*(unsigned long *)(fp + sizeof(unsigned long)));
-		if (!in_sched_functions(ip)) {
-			ret = ip;
-			goto out;
-		}
-		fp = READ_ONCE_NOCHECK(*(unsigned long *)fp);
-	} while (count++ < 16 && !task_is_running(p));
-
-out:
-	put_task_stack(p);
-	return ret;
+	stack_trace_save_tsk(p, &entry, 1, 0);
+	return entry;
 }
 
 long do_arch_prctl_common(struct task_struct *task, int option,



^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 5/7] sched: Add wrapper for get_wchan() to keep task blocked
  2021-10-08 11:15 [PATCH 0/7] wchan: Fix wchan support Peter Zijlstra
                   ` (3 preceding siblings ...)
  2021-10-08 11:15 ` [PATCH 4/7] x86: Fix get_wchan() to support the ORC unwinder Peter Zijlstra
@ 2021-10-08 11:15 ` Peter Zijlstra
  2021-10-08 11:26   ` Geert Uytterhoeven
                     ` (3 more replies)
  2021-10-08 11:15 ` [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT Peter Zijlstra
                   ` (2 subsequent siblings)
  7 siblings, 4 replies; 29+ messages in thread
From: Peter Zijlstra @ 2021-10-08 11:15 UTC (permalink / raw)
  To: keescook, jannh
  Cc: linux-kernel, peterz, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, mark.rutland, axboe, metze, laijs, luto, dave.hansen,
	ebiederm, ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe,
	linux-hardening, linux-arch, vgupta, linux, will, guoren, bcain,
	monstr, tsbogend, nickhu, jonas, mpe, paul.walmsley, hca, ysato,
	davem, chris

From: Kees Cook <keescook@chromium.org>

Having a stable wchan means the process must be blocked and for it to
stay that way while performing stack unwinding.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/alpha/include/asm/processor.h      |    2 +-
 arch/alpha/kernel/process.c             |    5 ++---
 arch/arc/include/asm/processor.h        |    2 +-
 arch/arc/kernel/stacktrace.c            |    4 ++--
 arch/arm/include/asm/processor.h        |    2 +-
 arch/arm/kernel/process.c               |    4 +---
 arch/arm64/include/asm/processor.h      |    2 +-
 arch/arm64/kernel/process.c             |    4 +---
 arch/csky/include/asm/processor.h       |    2 +-
 arch/csky/kernel/stacktrace.c           |    5 ++---
 arch/h8300/include/asm/processor.h      |    2 +-
 arch/h8300/kernel/process.c             |    5 +----
 arch/hexagon/include/asm/processor.h    |    2 +-
 arch/hexagon/kernel/process.c           |    4 +---
 arch/ia64/include/asm/processor.h       |    2 +-
 arch/ia64/kernel/process.c              |    5 +----
 arch/m68k/include/asm/processor.h       |    2 +-
 arch/m68k/kernel/process.c              |    4 +---
 arch/microblaze/include/asm/processor.h |    2 +-
 arch/microblaze/kernel/process.c        |    2 +-
 arch/mips/include/asm/processor.h       |    2 +-
 arch/mips/kernel/process.c              |    8 +++-----
 arch/nds32/include/asm/processor.h      |    2 +-
 arch/nds32/kernel/process.c             |    7 +------
 arch/nios2/include/asm/processor.h      |    2 +-
 arch/nios2/kernel/process.c             |    5 +----
 arch/openrisc/include/asm/processor.h   |    2 +-
 arch/openrisc/kernel/process.c          |    2 +-
 arch/parisc/include/asm/processor.h     |    2 +-
 arch/parisc/kernel/process.c            |    5 +----
 arch/powerpc/include/asm/processor.h    |    2 +-
 arch/powerpc/kernel/process.c           |    9 +++------
 arch/riscv/include/asm/processor.h      |    2 +-
 arch/riscv/kernel/stacktrace.c          |   12 +++++-------
 arch/s390/include/asm/processor.h       |    2 +-
 arch/s390/kernel/process.c              |    4 ++--
 arch/sh/include/asm/processor_32.h      |    2 +-
 arch/sh/kernel/process_32.c             |    5 +----
 arch/sparc/include/asm/processor_32.h   |    2 +-
 arch/sparc/include/asm/processor_64.h   |    2 +-
 arch/sparc/kernel/process_32.c          |    5 +----
 arch/sparc/kernel/process_64.c          |    5 +----
 arch/um/include/asm/processor-generic.h |    2 +-
 arch/um/kernel/process.c                |    5 +----
 arch/x86/include/asm/processor.h        |    2 +-
 arch/x86/kernel/process.c               |    5 +----
 arch/xtensa/include/asm/processor.h     |    2 +-
 arch/xtensa/kernel/process.c            |    5 +----
 include/linux/sched.h                   |    1 +
 kernel/sched/core.c                     |   19 +++++++++++++++++++
 50 files changed, 80 insertions(+), 112 deletions(-)

--- a/arch/alpha/include/asm/processor.h
+++ b/arch/alpha/include/asm/processor.h
@@ -42,7 +42,7 @@ extern void start_thread(struct pt_regs
 struct task_struct;
 extern void release_thread(struct task_struct *);
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define KSTK_EIP(tsk) (task_pt_regs(tsk)->pc)
 
--- a/arch/alpha/kernel/process.c
+++ b/arch/alpha/kernel/process.c
@@ -376,12 +376,11 @@ thread_saved_pc(struct task_struct *t)
 }
 
 unsigned long
-get_wchan(struct task_struct *p)
+__get_wchan(struct task_struct *p)
 {
 	unsigned long schedule_frame;
 	unsigned long pc;
-	if (!p || p == current || task_is_running(p))
-		return 0;
+
 	/*
 	 * This one depends on the frame size of schedule().  Do a
 	 * "disass schedule" in gdb to find the frame size.  Also, the
--- a/arch/arc/include/asm/processor.h
+++ b/arch/arc/include/asm/processor.h
@@ -70,7 +70,7 @@ struct task_struct;
 extern void start_thread(struct pt_regs * regs, unsigned long pc,
 			 unsigned long usp);
 
-extern unsigned int get_wchan(struct task_struct *p);
+extern unsigned int __get_wchan(struct task_struct *p);
 
 #endif /* !__ASSEMBLY__ */
 
--- a/arch/arc/kernel/stacktrace.c
+++ b/arch/arc/kernel/stacktrace.c
@@ -15,7 +15,7 @@
  *      = specifics of data structs where trace is saved(CONFIG_STACKTRACE etc)
  *
  *  vineetg: March 2009
- *  -Implemented correct versions of thread_saved_pc() and get_wchan()
+ *  -Implemented correct versions of thread_saved_pc() and __get_wchan()
  *
  *  rajeshwarr: 2008
  *  -Initial implementation
@@ -248,7 +248,7 @@ void show_stack(struct task_struct *tsk,
  * Of course just returning schedule( ) would be pointless so unwind until
  * the function is not in schedular code
  */
-unsigned int get_wchan(struct task_struct *tsk)
+unsigned int __get_wchan(struct task_struct *tsk)
 {
 	return arc_unwind_core(tsk, NULL, __get_first_nonsched, NULL);
 }
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -84,7 +84,7 @@ struct task_struct;
 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define task_pt_regs(p) \
 	((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -276,13 +276,11 @@ int copy_thread(unsigned long clone_flag
 	return 0;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	struct stackframe frame;
 	unsigned long stack_page;
 	int count = 0;
-	if (!p || p == current || task_is_running(p))
-		return 0;
 
 	frame.fp = thread_saved_fp(p);
 	frame.sp = thread_saved_sp(p);
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -257,7 +257,7 @@ struct task_struct;
 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 void update_sctlr_el1(u64 sctlr);
 
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -528,13 +528,11 @@ __notrace_funcgraph struct task_struct *
 	return last;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	struct stackframe frame;
 	unsigned long stack_page, ret = 0;
 	int count = 0;
-	if (!p || p == current || task_is_running(p))
-		return 0;
 
 	stack_page = (unsigned long)try_get_task_stack(p);
 	if (!stack_page)
--- a/arch/csky/include/asm/processor.h
+++ b/arch/csky/include/asm/processor.h
@@ -81,7 +81,7 @@ static inline void release_thread(struct
 
 extern int kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define KSTK_EIP(tsk)		(task_pt_regs(tsk)->pc)
 #define KSTK_ESP(tsk)		(task_pt_regs(tsk)->usp)
--- a/arch/csky/kernel/stacktrace.c
+++ b/arch/csky/kernel/stacktrace.c
@@ -111,12 +111,11 @@ static bool save_wchan(unsigned long pc,
 	return false;
 }
 
-unsigned long get_wchan(struct task_struct *task)
+unsigned long __get_wchan(struct task_struct *task)
 {
 	unsigned long pc = 0;
 
-	if (likely(task && task != current && !task_is_running(task)))
-		walk_stackframe(task, NULL, save_wchan, &pc);
+	walk_stackframe(task, NULL, save_wchan, &pc);
 	return pc;
 }
 
--- a/arch/h8300/include/asm/processor.h
+++ b/arch/h8300/include/asm/processor.h
@@ -105,7 +105,7 @@ static inline void release_thread(struct
 {
 }
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define	KSTK_EIP(tsk)	\
 	({			 \
--- a/arch/h8300/kernel/process.c
+++ b/arch/h8300/kernel/process.c
@@ -128,15 +128,12 @@ int copy_thread(unsigned long clone_flag
 	return 0;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long fp, pc;
 	unsigned long stack_page;
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	stack_page = (unsigned long)p;
 	fp = ((struct pt_regs *)p->thread.ksp)->er6;
 	do {
--- a/arch/hexagon/include/asm/processor.h
+++ b/arch/hexagon/include/asm/processor.h
@@ -64,7 +64,7 @@ struct thread_struct {
 extern void release_thread(struct task_struct *dead_task);
 
 /* Get wait channel for task P.  */
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 /*  The following stuff is pretty HEXAGON specific.  */
 
--- a/arch/hexagon/kernel/process.c
+++ b/arch/hexagon/kernel/process.c
@@ -130,13 +130,11 @@ void flush_thread(void)
  * is an identification of the point at which the scheduler
  * was invoked by a blocked thread.
  */
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long fp, pc;
 	unsigned long stack_page;
 	int count = 0;
-	if (!p || p == current || task_is_running(p))
-		return 0;
 
 	stack_page = (unsigned long)task_stack_page(p);
 	fp = ((struct hexagon_switch_stack *)p->thread.switch_sp)->fp;
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -330,7 +330,7 @@ struct task_struct;
 #define release_thread(dead_task)
 
 /* Get wait channel for task P.  */
-extern unsigned long get_wchan (struct task_struct *p);
+extern unsigned long __get_wchan (struct task_struct *p);
 
 /* Return instruction pointer of blocked task TSK.  */
 #define KSTK_EIP(tsk)					\
--- a/arch/ia64/kernel/process.c
+++ b/arch/ia64/kernel/process.c
@@ -523,15 +523,12 @@ exit_thread (struct task_struct *tsk)
 }
 
 unsigned long
-get_wchan (struct task_struct *p)
+__get_wchan (struct task_struct *p)
 {
 	struct unw_frame_info info;
 	unsigned long ip;
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	/*
 	 * Note: p may not be a blocked task (it could be current or
 	 * another process running on some other CPU.  Rather than
--- a/arch/m68k/include/asm/processor.h
+++ b/arch/m68k/include/asm/processor.h
@@ -125,7 +125,7 @@ static inline void release_thread(struct
 {
 }
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define	KSTK_EIP(tsk)	\
     ({			\
--- a/arch/m68k/kernel/process.c
+++ b/arch/m68k/kernel/process.c
@@ -263,13 +263,11 @@ int dump_fpu (struct pt_regs *regs, stru
 }
 EXPORT_SYMBOL(dump_fpu);
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long fp, pc;
 	unsigned long stack_page;
 	int count = 0;
-	if (!p || p == current || task_is_running(p))
-		return 0;
 
 	stack_page = (unsigned long)task_stack_page(p);
 	fp = ((struct switch_stack *)p->thread.ksp)->a6;
--- a/arch/microblaze/include/asm/processor.h
+++ b/arch/microblaze/include/asm/processor.h
@@ -68,7 +68,7 @@ static inline void release_thread(struct
 {
 }
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 /* The size allocated for kernel stacks. This _must_ be a power of two! */
 # define KERNEL_STACK_SIZE	0x2000
--- a/arch/microblaze/kernel/process.c
+++ b/arch/microblaze/kernel/process.c
@@ -112,7 +112,7 @@ int copy_thread(unsigned long clone_flag
 	return 0;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 /* TBD (used by procfs) */
 	return 0;
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -369,7 +369,7 @@ static inline void flush_thread(void)
 {
 }
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define __KSTK_TOS(tsk) ((unsigned long)task_stack_page(tsk) + \
 			 THREAD_SIZE - 32 - sizeof(struct pt_regs))
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -511,7 +511,7 @@ static int __init frame_info_init(void)
 
 	/*
 	 * Without schedule() frame info, result given by
-	 * thread_saved_pc() and get_wchan() are not reliable.
+	 * thread_saved_pc() and __get_wchan() are not reliable.
 	 */
 	if (schedule_mfi.pc_offset < 0)
 		printk("Can't analyze schedule() prologue at %p\n", schedule);
@@ -652,9 +652,9 @@ unsigned long unwind_stack(struct task_s
 #endif
 
 /*
- * get_wchan - a maintenance nightmare^W^Wpain in the ass ...
+ * __get_wchan - a maintenance nightmare^W^Wpain in the ass ...
  */
-unsigned long get_wchan(struct task_struct *task)
+unsigned long __get_wchan(struct task_struct *task)
 {
 	unsigned long pc = 0;
 #ifdef CONFIG_KALLSYMS
@@ -662,8 +662,6 @@ unsigned long get_wchan(struct task_stru
 	unsigned long ra = 0;
 #endif
 
-	if (!task || task == current || task_is_running(task))
-		goto out;
 	if (!task_stack_page(task))
 		goto out;
 
--- a/arch/nds32/include/asm/processor.h
+++ b/arch/nds32/include/asm/processor.h
@@ -83,7 +83,7 @@ extern struct task_struct *last_task_use
 /* Prepare to copy thread state - unlazy all lazy status */
 #define prepare_to_copy(tsk)	do { } while (0)
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define cpu_relax()			barrier()
 
--- a/arch/nds32/kernel/process.c
+++ b/arch/nds32/kernel/process.c
@@ -233,15 +233,12 @@ int dump_fpu(struct pt_regs *regs, elf_f
 
 EXPORT_SYMBOL(dump_fpu);
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long fp, lr;
 	unsigned long stack_start, stack_end;
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	if (IS_ENABLED(CONFIG_FRAME_POINTER)) {
 		stack_start = (unsigned long)end_of_stack(p);
 		stack_end = (unsigned long)task_stack_page(p) + THREAD_SIZE;
@@ -258,5 +255,3 @@ unsigned long get_wchan(struct task_stru
 	}
 	return 0;
 }
-
-EXPORT_SYMBOL(get_wchan);
--- a/arch/nios2/include/asm/processor.h
+++ b/arch/nios2/include/asm/processor.h
@@ -69,7 +69,7 @@ static inline void release_thread(struct
 {
 }
 
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 #define task_pt_regs(p) \
 	((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1)
--- a/arch/nios2/kernel/process.c
+++ b/arch/nios2/kernel/process.c
@@ -217,15 +217,12 @@ void dump(struct pt_regs *fp)
 	pr_emerg("\n\n");
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long fp, pc;
 	unsigned long stack_page;
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	stack_page = (unsigned long)p;
 	fp = ((struct switch_stack *)p->thread.ksp)->fp;	/* ;dgt2 */
 	do {
--- a/arch/openrisc/include/asm/processor.h
+++ b/arch/openrisc/include/asm/processor.h
@@ -73,7 +73,7 @@ struct thread_struct {
 
 void start_thread(struct pt_regs *regs, unsigned long nip, unsigned long sp);
 void release_thread(struct task_struct *);
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define cpu_relax()     barrier()
 
--- a/arch/openrisc/kernel/process.c
+++ b/arch/openrisc/kernel/process.c
@@ -263,7 +263,7 @@ void dump_elf_thread(elf_greg_t *dest, s
 	dest[35] = 0;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	/* TODO */
 
--- a/arch/parisc/include/asm/processor.h
+++ b/arch/parisc/include/asm/processor.h
@@ -273,7 +273,7 @@ struct mm_struct;
 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
 
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 #define KSTK_EIP(tsk)	((tsk)->thread.regs.iaoq[0])
 #define KSTK_ESP(tsk)	((tsk)->thread.regs.gr[30])
--- a/arch/parisc/kernel/process.c
+++ b/arch/parisc/kernel/process.c
@@ -240,15 +240,12 @@ copy_thread(unsigned long clone_flags, u
 }
 
 unsigned long
-get_wchan(struct task_struct *p)
+__get_wchan(struct task_struct *p)
 {
 	struct unwind_frame_info info;
 	unsigned long ip;
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	/*
 	 * These bracket the sleeping functions..
 	 */
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -300,7 +300,7 @@ struct thread_struct {
 
 #define task_pt_regs(tsk)	((tsk)->thread.regs)
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define KSTK_EIP(tsk)  ((tsk)->thread.regs? (tsk)->thread.regs->nip: 0)
 #define KSTK_ESP(tsk)  ((tsk)->thread.regs? (tsk)->thread.regs->gpr[1]: 0)
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -2111,14 +2111,11 @@ int validate_sp(unsigned long sp, struct
 
 EXPORT_SYMBOL(validate_sp);
 
-static unsigned long __get_wchan(struct task_struct *p)
+static unsigned long ___get_wchan(struct task_struct *p)
 {
 	unsigned long ip, sp;
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	sp = p->thread.ksp;
 	if (!validate_sp(sp, p, STACK_FRAME_OVERHEAD))
 		return 0;
@@ -2137,14 +2134,14 @@ static unsigned long __get_wchan(struct
 	return 0;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long ret;
 
 	if (!try_get_task_stack(p))
 		return 0;
 
-	ret = __get_wchan(p);
+	ret = ___get_wchan(p);
 
 	put_task_stack(p);
 
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -66,7 +66,7 @@ static inline void release_thread(struct
 {
 }
 
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 
 static inline void wait_for_interrupt(void)
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -128,16 +128,14 @@ static bool save_wchan(void *arg, unsign
 	return true;
 }
 
-unsigned long get_wchan(struct task_struct *task)
+unsigned long __get_wchan(struct task_struct *task)
 {
 	unsigned long pc = 0;
 
-	if (likely(task && task != current && !task_is_running(task))) {
-		if (!try_get_task_stack(task))
-			return 0;
-		walk_stackframe(task, NULL, save_wchan, &pc);
-		put_task_stack(task);
-	}
+	if (!try_get_task_stack(task))
+		return 0;
+	walk_stackframe(task, NULL, save_wchan, &pc);
+	put_task_stack(task);
 	return pc;
 }
 
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -192,7 +192,7 @@ static inline void release_thread(struct
 void guarded_storage_release(struct task_struct *tsk);
 void gs_load_bc_cb(struct pt_regs *regs);
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 #define task_pt_regs(tsk) ((struct pt_regs *) \
         (task_stack_page(tsk) + THREAD_SIZE) - 1)
 #define KSTK_EIP(tsk)	(task_pt_regs(tsk)->psw.addr)
--- a/arch/s390/kernel/process.c
+++ b/arch/s390/kernel/process.c
@@ -181,12 +181,12 @@ void execve_tail(void)
 	asm volatile("sfpc %0" : : "d" (0));
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	struct unwind_state state;
 	unsigned long ip = 0;
 
-	if (!p || p == current || task_is_running(p) || !task_stack_page(p))
+	if (!task_stack_page(p))
 		return 0;
 
 	if (!try_get_task_stack(p))
--- a/arch/sh/include/asm/processor_32.h
+++ b/arch/sh/include/asm/processor_32.h
@@ -180,7 +180,7 @@ static inline void show_code(struct pt_r
 }
 #endif
 
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 #define KSTK_EIP(tsk)  (task_pt_regs(tsk)->pc)
 #define KSTK_ESP(tsk)  (task_pt_regs(tsk)->regs[15])
--- a/arch/sh/kernel/process_32.c
+++ b/arch/sh/kernel/process_32.c
@@ -182,13 +182,10 @@ __switch_to(struct task_struct *prev, st
 	return prev;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long pc;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	/*
 	 * The same comment as on the Alpha applies here, too ...
 	 */
--- a/arch/sparc/include/asm/processor_32.h
+++ b/arch/sparc/include/asm/processor_32.h
@@ -89,7 +89,7 @@ static inline void start_thread(struct p
 /* Free all resources held by a thread. */
 #define release_thread(tsk)		do { } while(0)
 
-unsigned long get_wchan(struct task_struct *);
+unsigned long __get_wchan(struct task_struct *);
 
 #define task_pt_regs(tsk) ((tsk)->thread.kregs)
 #define KSTK_EIP(tsk)  ((tsk)->thread.kregs->pc)
--- a/arch/sparc/include/asm/processor_64.h
+++ b/arch/sparc/include/asm/processor_64.h
@@ -183,7 +183,7 @@ do { \
 /* Free all resources held by a thread. */
 #define release_thread(tsk)		do { } while (0)
 
-unsigned long get_wchan(struct task_struct *task);
+unsigned long __get_wchan(struct task_struct *task);
 
 #define task_pt_regs(tsk) (task_thread_info(tsk)->kregs)
 #define KSTK_EIP(tsk)  (task_pt_regs(tsk)->tpc)
--- a/arch/sparc/kernel/process_32.c
+++ b/arch/sparc/kernel/process_32.c
@@ -365,7 +365,7 @@ int copy_thread(unsigned long clone_flag
 	return 0;
 }
 
-unsigned long get_wchan(struct task_struct *task)
+unsigned long __get_wchan(struct task_struct *task)
 {
 	unsigned long pc, fp, bias = 0;
 	unsigned long task_base = (unsigned long) task;
@@ -373,9 +373,6 @@ unsigned long get_wchan(struct task_stru
 	struct reg_window32 *rw;
 	int count = 0;
 
-	if (!task || task == current || task_is_running(task))
-		goto out;
-
 	fp = task_thread_info(task)->ksp + bias;
 	do {
 		/* Bogus frame pointer? */
--- a/arch/sparc/kernel/process_64.c
+++ b/arch/sparc/kernel/process_64.c
@@ -663,7 +663,7 @@ int arch_dup_task_struct(struct task_str
 	return 0;
 }
 
-unsigned long get_wchan(struct task_struct *task)
+unsigned long __get_wchan(struct task_struct *task)
 {
 	unsigned long pc, fp, bias = 0;
 	struct thread_info *tp;
@@ -671,9 +671,6 @@ unsigned long get_wchan(struct task_stru
         unsigned long ret = 0;
 	int count = 0; 
 
-	if (!task || task == current || task_is_running(task))
-		goto out;
-
 	tp = task_thread_info(task);
 	bias = STACK_BIAS;
 	fp = task_thread_info(task)->ksp + bias;
--- a/arch/um/include/asm/processor-generic.h
+++ b/arch/um/include/asm/processor-generic.h
@@ -106,6 +106,6 @@ extern struct cpuinfo_um boot_cpu_data;
 #define cache_line_size()	(boot_cpu_data.cache_alignment)
 
 #define KSTK_REG(tsk, reg) get_thread_reg(reg, &tsk->thread.switch_buf)
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 #endif
--- a/arch/um/kernel/process.c
+++ b/arch/um/kernel/process.c
@@ -364,14 +364,11 @@ unsigned long arch_align_stack(unsigned
 }
 #endif
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long stack_page, sp, ip;
 	bool seen_sched = 0;
 
-	if ((p == NULL) || (p == current) || task_is_running(p))
-		return 0;
-
 	stack_page = (unsigned long) task_stack_page(p);
 	/* Bail if the process has no kernel stack for some reason */
 	if (stack_page == 0)
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -590,7 +590,7 @@ static inline void load_sp0(unsigned lon
 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 /*
  * Generic CPUID function
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -943,13 +943,10 @@ unsigned long arch_randomize_brk(struct
  * because the task might wake up and we might look at a stack
  * changing under us.
  */
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long entry = 0;
 
-	if (p == current || task_is_running(p))
-		return 0;
-
 	stack_trace_save_tsk(p, &entry, 1, 0);
 	return entry;
 }
--- a/arch/xtensa/include/asm/processor.h
+++ b/arch/xtensa/include/asm/processor.h
@@ -215,7 +215,7 @@ struct mm_struct;
 /* Free all resources held by a thread. */
 #define release_thread(thread) do { } while(0)
 
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 #define KSTK_EIP(tsk)		(task_pt_regs(tsk)->pc)
 #define KSTK_ESP(tsk)		(task_pt_regs(tsk)->areg[1])
--- a/arch/xtensa/kernel/process.c
+++ b/arch/xtensa/kernel/process.c
@@ -298,15 +298,12 @@ int copy_thread(unsigned long clone_flag
  * These bracket the sleeping functions..
  */
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long sp, pc;
 	unsigned long stack_page = (unsigned long) task_stack_page(p);
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	sp = p->thread.sp;
 	pc = MAKE_PC_FROM_RA(p->thread.ra, p->thread.sp);
 
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2139,6 +2139,7 @@ static inline void set_task_cpu(struct t
 #endif /* CONFIG_SMP */
 
 extern bool sched_task_on_rq(struct task_struct *p);
+extern unsigned long get_wchan(struct task_struct *p);
 
 /*
  * In order to reduce various lock holder preemption latencies provide an
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1962,6 +1962,25 @@ bool sched_task_on_rq(struct task_struct
 	return task_on_rq_queued(p);
 }
 
+unsigned long get_wchan(struct task_struct *p)
+{
+	unsigned long ip = 0;
+	unsigned int state;
+
+	if (!p || p == current)
+		return 0;
+
+	/* Only get wchan if task is blocked and we can keep it that way. */
+	raw_spin_lock_irq(&p->pi_lock);
+	state = READ_ONCE(p->__state);
+	smp_rmb(); /* see try_to_wake_up() */
+	if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq)
+		ip = __get_wchan(p);
+	raw_spin_unlock_irq(&p->pi_lock);
+
+	return ip;
+}
+
 static inline void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
 {
 	if (!(flags & ENQUEUE_NOCLOCK))



^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT
  2021-10-08 11:15 [PATCH 0/7] wchan: Fix wchan support Peter Zijlstra
                   ` (4 preceding siblings ...)
  2021-10-08 11:15 ` [PATCH 5/7] sched: Add wrapper for get_wchan() to keep task blocked Peter Zijlstra
@ 2021-10-08 11:15 ` Peter Zijlstra
  2021-10-08 12:40   ` Mark Rutland
  2021-10-14 11:07   ` Russell King (Oracle)
  2021-10-08 11:15 ` [PATCH 7/7] arch: Fix STACKTRACE_SUPPORT Peter Zijlstra
  2021-10-14 12:02 ` [PATCH 0/7] wchan: Fix wchan support Russell King (Oracle)
  7 siblings, 2 replies; 29+ messages in thread
From: Peter Zijlstra @ 2021-10-08 11:15 UTC (permalink / raw)
  To: keescook, jannh
  Cc: linux-kernel, peterz, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, mark.rutland, axboe, metze, laijs, luto, dave.hansen,
	ebiederm, ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe,
	linux-hardening, linux-arch, vgupta, linux, will, guoren, bcain,
	monstr, tsbogend, nickhu, jonas, mpe, paul.walmsley, hca, ysato,
	davem, chris

It's trivial to implement __get_wchan() with CONFIG_STACKTRACE

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/arc/include/asm/processor.h        |    2 -
 arch/arc/kernel/stacktrace.c            |   17 --------------
 arch/arm/include/asm/processor.h        |    2 -
 arch/arm/kernel/process.c               |   22 -------------------
 arch/arm64/include/asm/processor.h      |    2 -
 arch/arm64/kernel/process.c             |   26 ----------------------
 arch/csky/include/asm/processor.h       |    2 -
 arch/csky/kernel/stacktrace.c           |   18 ---------------
 arch/hexagon/include/asm/processor.h    |    3 --
 arch/hexagon/kernel/process.c           |   26 ----------------------
 arch/ia64/include/asm/processor.h       |    3 --
 arch/ia64/kernel/process.c              |   28 ------------------------
 arch/microblaze/include/asm/processor.h |    2 -
 arch/microblaze/kernel/process.c        |    6 -----
 arch/mips/include/asm/processor.h       |    2 -
 arch/mips/kernel/process.c              |   27 -----------------------
 arch/nds32/include/asm/processor.h      |    2 -
 arch/nds32/kernel/process.c             |   23 -------------------
 arch/openrisc/include/asm/processor.h   |    1 
 arch/openrisc/kernel/process.c          |    6 -----
 arch/parisc/include/asm/processor.h     |    2 -
 arch/parisc/kernel/process.c            |   24 --------------------
 arch/powerpc/include/asm/processor.h    |    2 -
 arch/powerpc/kernel/process.c           |   37 --------------------------------
 arch/riscv/include/asm/processor.h      |    3 --
 arch/riscv/kernel/stacktrace.c          |   21 ------------------
 arch/s390/include/asm/processor.h       |    1 
 arch/s390/kernel/process.c              |   29 -------------------------
 arch/sh/include/asm/processor_32.h      |    2 -
 arch/sh/kernel/process_32.c             |   19 ----------------
 arch/sparc/include/asm/processor_64.h   |    2 -
 arch/sparc/kernel/process_64.c          |   28 ------------------------
 arch/um/include/asm/processor-generic.h |    1 
 arch/um/kernel/process.c                |   32 ---------------------------
 arch/x86/include/asm/processor.h        |    2 -
 arch/x86/kernel/process.c               |   14 ------------
 arch/xtensa/include/asm/processor.h     |    2 -
 arch/xtensa/kernel/process.c            |   29 -------------------------
 kernel/sched/core.c                     |   15 ++++++++++++
 lib/Kconfig.debug                       |    7 ------
 40 files changed, 16 insertions(+), 476 deletions(-)

--- a/arch/arc/include/asm/processor.h
+++ b/arch/arc/include/asm/processor.h
@@ -70,8 +70,6 @@ struct task_struct;
 extern void start_thread(struct pt_regs * regs, unsigned long pc,
 			 unsigned long usp);
 
-extern unsigned int __get_wchan(struct task_struct *p);
-
 #endif /* !__ASSEMBLY__ */
 
 /*
--- a/arch/arc/kernel/stacktrace.c
+++ b/arch/arc/kernel/stacktrace.c
@@ -217,14 +217,6 @@ static int __collect_all_but_sched(unsig
 
 #endif
 
-static int __get_first_nonsched(unsigned int address, void *unused)
-{
-	if (in_sched_functions(address))
-		return 0;
-
-	return -1;
-}
-
 /*-------------------------------------------------------------------------
  *              APIs expected by various kernel sub-systems
  *-------------------------------------------------------------------------
@@ -244,15 +236,6 @@ void show_stack(struct task_struct *tsk,
 	show_stacktrace(tsk, NULL, loglvl);
 }
 
-/* Another API expected by schedular, shows up in "ps" as Wait Channel
- * Of course just returning schedule( ) would be pointless so unwind until
- * the function is not in schedular code
- */
-unsigned int __get_wchan(struct task_struct *tsk)
-{
-	return arc_unwind_core(tsk, NULL, __get_first_nonsched, NULL);
-}
-
 #ifdef CONFIG_STACKTRACE
 
 /*
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -84,8 +84,6 @@ struct task_struct;
 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
 
-unsigned long __get_wchan(struct task_struct *p);
-
 #define task_pt_regs(p) \
 	((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
 
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -276,28 +276,6 @@ int copy_thread(unsigned long clone_flag
 	return 0;
 }
 
-unsigned long __get_wchan(struct task_struct *p)
-{
-	struct stackframe frame;
-	unsigned long stack_page;
-	int count = 0;
-
-	frame.fp = thread_saved_fp(p);
-	frame.sp = thread_saved_sp(p);
-	frame.lr = 0;			/* recovered from the stack */
-	frame.pc = thread_saved_pc(p);
-	stack_page = (unsigned long)task_stack_page(p);
-	do {
-		if (frame.sp < stack_page ||
-		    frame.sp >= stack_page + THREAD_SIZE ||
-		    unwind_frame(&frame) < 0)
-			return 0;
-		if (!in_sched_functions(frame.pc))
-			return frame.pc;
-	} while (count ++ < 16);
-	return 0;
-}
-
 #ifdef CONFIG_MMU
 #ifdef CONFIG_KUSER_HELPERS
 /*
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -257,8 +257,6 @@ struct task_struct;
 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
 
-unsigned long __get_wchan(struct task_struct *p);
-
 void update_sctlr_el1(u64 sctlr);
 
 /* Thread switching */
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -528,32 +528,6 @@ __notrace_funcgraph struct task_struct *
 	return last;
 }
 
-unsigned long __get_wchan(struct task_struct *p)
-{
-	struct stackframe frame;
-	unsigned long stack_page, ret = 0;
-	int count = 0;
-
-	stack_page = (unsigned long)try_get_task_stack(p);
-	if (!stack_page)
-		return 0;
-
-	start_backtrace(&frame, thread_saved_fp(p), thread_saved_pc(p));
-
-	do {
-		if (unwind_frame(p, &frame))
-			goto out;
-		if (!in_sched_functions(frame.pc)) {
-			ret = frame.pc;
-			goto out;
-		}
-	} while (count++ < 16);
-
-out:
-	put_task_stack(p);
-	return ret;
-}
-
 unsigned long arch_align_stack(unsigned long sp)
 {
 	if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
--- a/arch/csky/include/asm/processor.h
+++ b/arch/csky/include/asm/processor.h
@@ -81,8 +81,6 @@ static inline void release_thread(struct
 
 extern int kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
 
-unsigned long __get_wchan(struct task_struct *p);
-
 #define KSTK_EIP(tsk)		(task_pt_regs(tsk)->pc)
 #define KSTK_ESP(tsk)		(task_pt_regs(tsk)->usp)
 
--- a/arch/csky/kernel/stacktrace.c
+++ b/arch/csky/kernel/stacktrace.c
@@ -101,24 +101,6 @@ void show_stack(struct task_struct *task
 	walk_stackframe(task, NULL, print_trace_address, (void *)loglvl);
 }
 
-static bool save_wchan(unsigned long pc, void *arg)
-{
-	if (!in_sched_functions(pc)) {
-		unsigned long *p = arg;
-		*p = pc;
-		return true;
-	}
-	return false;
-}
-
-unsigned long __get_wchan(struct task_struct *task)
-{
-	unsigned long pc = 0;
-
-	walk_stackframe(task, NULL, save_wchan, &pc);
-	return pc;
-}
-
 #ifdef CONFIG_STACKTRACE
 static bool __save_trace(unsigned long pc, void *arg, bool nosched)
 {
--- a/arch/hexagon/include/asm/processor.h
+++ b/arch/hexagon/include/asm/processor.h
@@ -63,9 +63,6 @@ struct thread_struct {
 /*  Free all resources held by a thread; defined in process.c  */
 extern void release_thread(struct task_struct *dead_task);
 
-/* Get wait channel for task P.  */
-extern unsigned long __get_wchan(struct task_struct *p);
-
 /*  The following stuff is pretty HEXAGON specific.  */
 
 /*  This is really just here for __switch_to.
--- a/arch/hexagon/kernel/process.c
+++ b/arch/hexagon/kernel/process.c
@@ -126,32 +126,6 @@ void flush_thread(void)
 }
 
 /*
- * The "wait channel" terminology is archaic, but what we want
- * is an identification of the point at which the scheduler
- * was invoked by a blocked thread.
- */
-unsigned long __get_wchan(struct task_struct *p)
-{
-	unsigned long fp, pc;
-	unsigned long stack_page;
-	int count = 0;
-
-	stack_page = (unsigned long)task_stack_page(p);
-	fp = ((struct hexagon_switch_stack *)p->thread.switch_sp)->fp;
-	do {
-		if (fp < (stack_page + sizeof(struct thread_info)) ||
-			fp >= (THREAD_SIZE - 8 + stack_page))
-			return 0;
-		pc = ((unsigned long *)fp)[1];
-		if (!in_sched_functions(pc))
-			return pc;
-		fp = *(unsigned long *) fp;
-	} while (count++ < 16);
-
-	return 0;
-}
-
-/*
  * Called on the exit path of event entry; see vm_entry.S
  *
  * Interrupts will already be disabled.
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -329,9 +329,6 @@ struct task_struct;
  */
 #define release_thread(dead_task)
 
-/* Get wait channel for task P.  */
-extern unsigned long __get_wchan (struct task_struct *p);
-
 /* Return instruction pointer of blocked task TSK.  */
 #define KSTK_EIP(tsk)					\
   ({							\
--- a/arch/ia64/kernel/process.c
+++ b/arch/ia64/kernel/process.c
@@ -522,34 +522,6 @@ exit_thread (struct task_struct *tsk)
 	ia64_drop_fpu(tsk);
 }
 
-unsigned long
-__get_wchan (struct task_struct *p)
-{
-	struct unw_frame_info info;
-	unsigned long ip;
-	int count = 0;
-
-	/*
-	 * Note: p may not be a blocked task (it could be current or
-	 * another process running on some other CPU.  Rather than
-	 * trying to determine if p is really blocked, we just assume
-	 * it's blocked and rely on the unwind routines to fail
-	 * gracefully if the process wasn't really blocked after all.
-	 * --davidm 99/12/15
-	 */
-	unw_init_from_blocked_task(&info, p);
-	do {
-		if (task_is_running(p))
-			return 0;
-		if (unw_unwind(&info) < 0)
-			return 0;
-		unw_get_ip(&info, &ip);
-		if (!in_sched_functions(ip))
-			return ip;
-	} while (count++ < 16);
-	return 0;
-}
-
 void
 cpu_halt (void)
 {
--- a/arch/microblaze/include/asm/processor.h
+++ b/arch/microblaze/include/asm/processor.h
@@ -68,8 +68,6 @@ static inline void release_thread(struct
 {
 }
 
-unsigned long __get_wchan(struct task_struct *p);
-
 /* The size allocated for kernel stacks. This _must_ be a power of two! */
 # define KERNEL_STACK_SIZE	0x2000
 
--- a/arch/microblaze/kernel/process.c
+++ b/arch/microblaze/kernel/process.c
@@ -112,12 +112,6 @@ int copy_thread(unsigned long clone_flag
 	return 0;
 }
 
-unsigned long __get_wchan(struct task_struct *p)
-{
-/* TBD (used by procfs) */
-	return 0;
-}
-
 /* Set up a thread for executing a new program */
 void start_thread(struct pt_regs *regs, unsigned long pc, unsigned long usp)
 {
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -369,8 +369,6 @@ static inline void flush_thread(void)
 {
 }
 
-unsigned long __get_wchan(struct task_struct *p);
-
 #define __KSTK_TOS(tsk) ((unsigned long)task_stack_page(tsk) + \
 			 THREAD_SIZE - 32 - sizeof(struct pt_regs))
 #define task_pt_regs(tsk) ((struct pt_regs *)__KSTK_TOS(tsk))
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -651,33 +651,6 @@ unsigned long unwind_stack(struct task_s
 }
 #endif
 
-/*
- * __get_wchan - a maintenance nightmare^W^Wpain in the ass ...
- */
-unsigned long __get_wchan(struct task_struct *task)
-{
-	unsigned long pc = 0;
-#ifdef CONFIG_KALLSYMS
-	unsigned long sp;
-	unsigned long ra = 0;
-#endif
-
-	if (!task_stack_page(task))
-		goto out;
-
-	pc = thread_saved_pc(task);
-
-#ifdef CONFIG_KALLSYMS
-	sp = task->thread.reg29 + schedule_mfi.frame_size;
-
-	while (in_sched_functions(pc))
-		pc = unwind_stack(task, &sp, pc, &ra);
-#endif
-
-out:
-	return pc;
-}
-
 unsigned long mips_stack_top(void)
 {
 	unsigned long top = TASK_SIZE & PAGE_MASK;
--- a/arch/nds32/include/asm/processor.h
+++ b/arch/nds32/include/asm/processor.h
@@ -83,8 +83,6 @@ extern struct task_struct *last_task_use
 /* Prepare to copy thread state - unlazy all lazy status */
 #define prepare_to_copy(tsk)	do { } while (0)
 
-unsigned long __get_wchan(struct task_struct *p);
-
 #define cpu_relax()			barrier()
 
 #define task_pt_regs(task) \
--- a/arch/nds32/kernel/process.c
+++ b/arch/nds32/kernel/process.c
@@ -232,26 +232,3 @@ int dump_fpu(struct pt_regs *regs, elf_f
 }
 
 EXPORT_SYMBOL(dump_fpu);
-
-unsigned long __get_wchan(struct task_struct *p)
-{
-	unsigned long fp, lr;
-	unsigned long stack_start, stack_end;
-	int count = 0;
-
-	if (IS_ENABLED(CONFIG_FRAME_POINTER)) {
-		stack_start = (unsigned long)end_of_stack(p);
-		stack_end = (unsigned long)task_stack_page(p) + THREAD_SIZE;
-
-		fp = thread_saved_fp(p);
-		do {
-			if (fp < stack_start || fp > stack_end)
-				return 0;
-			lr = ((unsigned long *)fp)[0];
-			if (!in_sched_functions(lr))
-				return lr;
-			fp = *(unsigned long *)(fp + 4);
-		} while (count++ < 16);
-	}
-	return 0;
-}
--- a/arch/openrisc/include/asm/processor.h
+++ b/arch/openrisc/include/asm/processor.h
@@ -73,7 +73,6 @@ struct thread_struct {
 
 void start_thread(struct pt_regs *regs, unsigned long nip, unsigned long sp);
 void release_thread(struct task_struct *);
-unsigned long __get_wchan(struct task_struct *p);
 
 #define cpu_relax()     barrier()
 
--- a/arch/openrisc/kernel/process.c
+++ b/arch/openrisc/kernel/process.c
@@ -263,9 +263,3 @@ void dump_elf_thread(elf_greg_t *dest, s
 	dest[35] = 0;
 }
 
-unsigned long __get_wchan(struct task_struct *p)
-{
-	/* TODO */
-
-	return 0;
-}
--- a/arch/parisc/include/asm/processor.h
+++ b/arch/parisc/include/asm/processor.h
@@ -273,8 +273,6 @@ struct mm_struct;
 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
 
-extern unsigned long __get_wchan(struct task_struct *p);
-
 #define KSTK_EIP(tsk)	((tsk)->thread.regs.iaoq[0])
 #define KSTK_ESP(tsk)	((tsk)->thread.regs.gr[30])
 
--- a/arch/parisc/kernel/process.c
+++ b/arch/parisc/kernel/process.c
@@ -239,30 +239,6 @@ copy_thread(unsigned long clone_flags, u
 	return 0;
 }
 
-unsigned long
-__get_wchan(struct task_struct *p)
-{
-	struct unwind_frame_info info;
-	unsigned long ip;
-	int count = 0;
-
-	/*
-	 * These bracket the sleeping functions..
-	 */
-
-	unwind_frame_init_from_blocked_task(&info, p);
-	do {
-		if (unwind_once(&info) < 0)
-			return 0;
-		if (task_is_running(p))
-                        return 0;
-		ip = info.ip;
-		if (!in_sched_functions(ip))
-			return ip;
-	} while (count++ < MAX_UNWIND_ENTRIES);
-	return 0;
-}
-
 #ifdef CONFIG_64BIT
 void *dereference_function_descriptor(void *ptr)
 {
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -300,8 +300,6 @@ struct thread_struct {
 
 #define task_pt_regs(tsk)	((tsk)->thread.regs)
 
-unsigned long __get_wchan(struct task_struct *p);
-
 #define KSTK_EIP(tsk)  ((tsk)->thread.regs? (tsk)->thread.regs->nip: 0)
 #define KSTK_ESP(tsk)  ((tsk)->thread.regs? (tsk)->thread.regs->gpr[1]: 0)
 
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -2111,43 +2111,6 @@ int validate_sp(unsigned long sp, struct
 
 EXPORT_SYMBOL(validate_sp);
 
-static unsigned long ___get_wchan(struct task_struct *p)
-{
-	unsigned long ip, sp;
-	int count = 0;
-
-	sp = p->thread.ksp;
-	if (!validate_sp(sp, p, STACK_FRAME_OVERHEAD))
-		return 0;
-
-	do {
-		sp = *(unsigned long *)sp;
-		if (!validate_sp(sp, p, STACK_FRAME_OVERHEAD) ||
-		    task_is_running(p))
-			return 0;
-		if (count > 0) {
-			ip = ((unsigned long *)sp)[STACK_FRAME_LR_SAVE];
-			if (!in_sched_functions(ip))
-				return ip;
-		}
-	} while (count++ < 16);
-	return 0;
-}
-
-unsigned long __get_wchan(struct task_struct *p)
-{
-	unsigned long ret;
-
-	if (!try_get_task_stack(p))
-		return 0;
-
-	ret = ___get_wchan(p);
-
-	put_task_stack(p);
-
-	return ret;
-}
-
 static int kstack_depth_to_print = CONFIG_PRINT_STACK_DEPTH;
 
 void __no_sanitize_address show_stack(struct task_struct *tsk,
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -66,9 +66,6 @@ static inline void release_thread(struct
 {
 }
 
-extern unsigned long __get_wchan(struct task_struct *p);
-
-
 static inline void wait_for_interrupt(void)
 {
 	__asm__ __volatile__ ("wfi");
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -118,27 +118,6 @@ void show_stack(struct task_struct *task
 	dump_backtrace(NULL, task, loglvl);
 }
 
-static bool save_wchan(void *arg, unsigned long pc)
-{
-	if (!in_sched_functions(pc)) {
-		unsigned long *p = arg;
-		*p = pc;
-		return false;
-	}
-	return true;
-}
-
-unsigned long __get_wchan(struct task_struct *task)
-{
-	unsigned long pc = 0;
-
-	if (!try_get_task_stack(task))
-		return 0;
-	walk_stackframe(task, NULL, save_wchan, &pc);
-	put_task_stack(task);
-	return pc;
-}
-
 #ifdef CONFIG_STACKTRACE
 
 noinline void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -192,7 +192,6 @@ static inline void release_thread(struct
 void guarded_storage_release(struct task_struct *tsk);
 void gs_load_bc_cb(struct pt_regs *regs);
 
-unsigned long __get_wchan(struct task_struct *p);
 #define task_pt_regs(tsk) ((struct pt_regs *) \
         (task_stack_page(tsk) + THREAD_SIZE) - 1)
 #define KSTK_EIP(tsk)	(task_pt_regs(tsk)->psw.addr)
--- a/arch/s390/kernel/process.c
+++ b/arch/s390/kernel/process.c
@@ -181,35 +181,6 @@ void execve_tail(void)
 	asm volatile("sfpc %0" : : "d" (0));
 }
 
-unsigned long __get_wchan(struct task_struct *p)
-{
-	struct unwind_state state;
-	unsigned long ip = 0;
-
-	if (!task_stack_page(p))
-		return 0;
-
-	if (!try_get_task_stack(p))
-		return 0;
-
-	unwind_for_each_frame(&state, p, NULL, 0) {
-		if (state.stack_info.type != STACK_TYPE_TASK) {
-			ip = 0;
-			break;
-		}
-
-		ip = unwind_get_return_address(&state);
-		if (!ip)
-			break;
-
-		if (!in_sched_functions(ip))
-			break;
-	}
-
-	put_task_stack(p);
-	return ip;
-}
-
 unsigned long arch_align_stack(unsigned long sp)
 {
 	if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
--- a/arch/sh/include/asm/processor_32.h
+++ b/arch/sh/include/asm/processor_32.h
@@ -180,8 +180,6 @@ static inline void show_code(struct pt_r
 }
 #endif
 
-extern unsigned long __get_wchan(struct task_struct *p);
-
 #define KSTK_EIP(tsk)  (task_pt_regs(tsk)->pc)
 #define KSTK_ESP(tsk)  (task_pt_regs(tsk)->regs[15])
 
--- a/arch/sh/kernel/process_32.c
+++ b/arch/sh/kernel/process_32.c
@@ -181,22 +181,3 @@ __switch_to(struct task_struct *prev, st
 
 	return prev;
 }
-
-unsigned long __get_wchan(struct task_struct *p)
-{
-	unsigned long pc;
-
-	/*
-	 * The same comment as on the Alpha applies here, too ...
-	 */
-	pc = thread_saved_pc(p);
-
-#ifdef CONFIG_FRAME_POINTER
-	if (in_sched_functions(pc)) {
-		unsigned long schedule_frame = (unsigned long)p->thread.sp;
-		return ((unsigned long *)schedule_frame)[21];
-	}
-#endif
-
-	return pc;
-}
--- a/arch/sparc/include/asm/processor_64.h
+++ b/arch/sparc/include/asm/processor_64.h
@@ -183,8 +183,6 @@ do { \
 /* Free all resources held by a thread. */
 #define release_thread(tsk)		do { } while (0)
 
-unsigned long __get_wchan(struct task_struct *task);
-
 #define task_pt_regs(tsk) (task_thread_info(tsk)->kregs)
 #define KSTK_EIP(tsk)  (task_pt_regs(tsk)->tpc)
 #define KSTK_ESP(tsk)  (task_pt_regs(tsk)->u_regs[UREG_FP])
--- a/arch/sparc/kernel/process_64.c
+++ b/arch/sparc/kernel/process_64.c
@@ -662,31 +662,3 @@ int arch_dup_task_struct(struct task_str
 	*dst = *src;
 	return 0;
 }
-
-unsigned long __get_wchan(struct task_struct *task)
-{
-	unsigned long pc, fp, bias = 0;
-	struct thread_info *tp;
-	struct reg_window *rw;
-        unsigned long ret = 0;
-	int count = 0; 
-
-	tp = task_thread_info(task);
-	bias = STACK_BIAS;
-	fp = task_thread_info(task)->ksp + bias;
-
-	do {
-		if (!kstack_valid(tp, fp))
-			break;
-		rw = (struct reg_window *) fp;
-		pc = rw->ins[7];
-		if (!in_sched_functions(pc)) {
-			ret = pc;
-			goto out;
-		}
-		fp = rw->ins[6] + bias;
-	} while (++count < 16);
-
-out:
-	return ret;
-}
--- a/arch/um/include/asm/processor-generic.h
+++ b/arch/um/include/asm/processor-generic.h
@@ -106,6 +106,5 @@ extern struct cpuinfo_um boot_cpu_data;
 #define cache_line_size()	(boot_cpu_data.cache_alignment)
 
 #define KSTK_REG(tsk, reg) get_thread_reg(reg, &tsk->thread.switch_buf)
-extern unsigned long __get_wchan(struct task_struct *p);
 
 #endif
--- a/arch/um/kernel/process.c
+++ b/arch/um/kernel/process.c
@@ -364,38 +364,6 @@ unsigned long arch_align_stack(unsigned
 }
 #endif
 
-unsigned long __get_wchan(struct task_struct *p)
-{
-	unsigned long stack_page, sp, ip;
-	bool seen_sched = 0;
-
-	stack_page = (unsigned long) task_stack_page(p);
-	/* Bail if the process has no kernel stack for some reason */
-	if (stack_page == 0)
-		return 0;
-
-	sp = p->thread.switch_buf->JB_SP;
-	/*
-	 * Bail if the stack pointer is below the bottom of the kernel
-	 * stack for some reason
-	 */
-	if (sp < stack_page)
-		return 0;
-
-	while (sp < stack_page + THREAD_SIZE) {
-		ip = *((unsigned long *) sp);
-		if (in_sched_functions(ip))
-			/* Ignore everything until we're above the scheduler */
-			seen_sched = 1;
-		else if (kernel_text_address(ip) && seen_sched)
-			return ip;
-
-		sp += sizeof(unsigned long);
-	}
-
-	return 0;
-}
-
 int elf_core_copy_fpregs(struct task_struct *t, elf_fpregset_t *fpu)
 {
 	int cpu = current_thread_info()->cpu;
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -590,8 +590,6 @@ static inline void load_sp0(unsigned lon
 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
 
-unsigned long __get_wchan(struct task_struct *p);
-
 /*
  * Generic CPUID function
  * clear %ecx since some cpus (Cyrix MII) do not set or clear %ecx
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -937,20 +937,6 @@ unsigned long arch_randomize_brk(struct
 	return randomize_page(mm->brk, 0x02000000);
 }
 
-/*
- * Called from fs/proc with a reference on @p to find the function
- * which called into schedule(). This needs to be done carefully
- * because the task might wake up and we might look at a stack
- * changing under us.
- */
-unsigned long __get_wchan(struct task_struct *p)
-{
-	unsigned long entry = 0;
-
-	stack_trace_save_tsk(p, &entry, 1, 0);
-	return entry;
-}
-
 long do_arch_prctl_common(struct task_struct *task, int option,
 			  unsigned long cpuid_enabled)
 {
--- a/arch/xtensa/include/asm/processor.h
+++ b/arch/xtensa/include/asm/processor.h
@@ -215,8 +215,6 @@ struct mm_struct;
 /* Free all resources held by a thread. */
 #define release_thread(thread) do { } while(0)
 
-extern unsigned long __get_wchan(struct task_struct *p);
-
 #define KSTK_EIP(tsk)		(task_pt_regs(tsk)->pc)
 #define KSTK_ESP(tsk)		(task_pt_regs(tsk)->areg[1])
 
--- a/arch/xtensa/kernel/process.c
+++ b/arch/xtensa/kernel/process.c
@@ -293,32 +293,3 @@ int copy_thread(unsigned long clone_flag
 	return 0;
 }
 
-
-/*
- * These bracket the sleeping functions..
- */
-
-unsigned long __get_wchan(struct task_struct *p)
-{
-	unsigned long sp, pc;
-	unsigned long stack_page = (unsigned long) task_stack_page(p);
-	int count = 0;
-
-	sp = p->thread.sp;
-	pc = MAKE_PC_FROM_RA(p->thread.ra, p->thread.sp);
-
-	do {
-		if (sp < stack_page + sizeof(struct task_struct) ||
-		    sp >= (stack_page + THREAD_SIZE) ||
-		    pc == 0)
-			return 0;
-		if (!in_sched_functions(pc))
-			return pc;
-
-		/* Stack layout: sp-4: ra, sp-3: sp' */
-
-		pc = MAKE_PC_FROM_RA(SPILL_SLOT(sp, 0), sp);
-		sp = SPILL_SLOT(sp, 1);
-	} while (count++ < 16);
-	return 0;
-}
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1966,6 +1966,21 @@ bool sched_task_on_rq(struct task_struct
 	return task_on_rq_queued(p);
 }
 
+#ifdef CONFIG_STACKTRACE
+static unsigned long __get_wchan(struct task_struct *p)
+{
+	unsigned long entry = 0;
+
+	stack_trace_save_tsk(p, &entry, 1, 0);
+
+	return entry;
+}
+#endif
+
+/*
+ * Called from fs/proc with a reference on @p to find the function
+ * which called into schedule().
+ */
 unsigned long get_wchan(struct task_struct *p)
 {
 	unsigned long ip = 0;
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1531,13 +1531,8 @@ config DEBUG_IRQFLAGS
 	  are enabled.
 
 config STACKTRACE
-	bool "Stack backtrace support"
+	def_bool y
 	depends on STACKTRACE_SUPPORT
-	help
-	  This option causes the kernel to create a /proc/pid/stack for
-	  every process, showing its current stack trace.
-	  It is also used by various kernel debugging features that require
-	  stack trace generation.
 
 config WARN_ALL_UNSEEDED_RANDOM
 	bool "Warn for all uses of unseeded randomness"



^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 7/7] arch: Fix STACKTRACE_SUPPORT
  2021-10-08 11:15 [PATCH 0/7] wchan: Fix wchan support Peter Zijlstra
                   ` (5 preceding siblings ...)
  2021-10-08 11:15 ` [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT Peter Zijlstra
@ 2021-10-08 11:15 ` Peter Zijlstra
  2021-10-08 12:52   ` Mark Rutland
  2021-10-09  9:36   ` Guo Ren
  2021-10-14 12:02 ` [PATCH 0/7] wchan: Fix wchan support Russell King (Oracle)
  7 siblings, 2 replies; 29+ messages in thread
From: Peter Zijlstra @ 2021-10-08 11:15 UTC (permalink / raw)
  To: keescook, jannh
  Cc: linux-kernel, peterz, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, mark.rutland, axboe, metze, laijs, luto, dave.hansen,
	ebiederm, ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe,
	linux-hardening, linux-arch, vgupta, linux, will, guoren, bcain,
	monstr, tsbogend, nickhu, jonas, mpe, paul.walmsley, hca, ysato,
	davem, chris

A few archs got save_stack_trace_tsk() vs in_sched_functions() wrong.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/csky/kernel/stacktrace.c  |    7 ++++++-
 arch/mips/kernel/stacktrace.c  |   27 ++++++++++++++++-----------
 arch/nds32/kernel/stacktrace.c |   21 +++++++++++----------
 3 files changed, 33 insertions(+), 22 deletions(-)

--- a/arch/csky/kernel/stacktrace.c
+++ b/arch/csky/kernel/stacktrace.c
@@ -122,12 +122,17 @@ static bool save_trace(unsigned long pc,
 	return __save_trace(pc, arg, false);
 }
 
+static bool save_trace_nosched(unsigned long pc, void *arg)
+{
+	return __save_trace(pc, arg, true);
+}
+
 /*
  * Save stack-backtrace addresses into a stack_trace buffer.
  */
 void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
 {
-	walk_stackframe(tsk, NULL, save_trace, trace);
+	walk_stackframe(tsk, NULL, save_trace_nosched, trace);
 }
 EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
 
--- a/arch/mips/kernel/stacktrace.c
+++ b/arch/mips/kernel/stacktrace.c
@@ -66,16 +66,7 @@ static void save_context_stack(struct st
 #endif
 }
 
-/*
- * Save stack-backtrace addresses into a stack_trace buffer.
- */
-void save_stack_trace(struct stack_trace *trace)
-{
-	save_stack_trace_tsk(current, trace);
-}
-EXPORT_SYMBOL_GPL(save_stack_trace);
-
-void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
+static void __save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace, bool savesched)
 {
 	struct pt_regs dummyregs;
 	struct pt_regs *regs = &dummyregs;
@@ -88,6 +79,20 @@ void save_stack_trace_tsk(struct task_st
 		regs->cp0_epc = tsk->thread.reg31;
 	} else
 		prepare_frametrace(regs);
-	save_context_stack(trace, tsk, regs, tsk == current);
+	save_context_stack(trace, tsk, regs, savesched);
+}
+
+/*
+ * Save stack-backtrace addresses into a stack_trace buffer.
+ */
+void save_stack_trace(struct stack_trace *trace)
+{
+	__save_stack_trace_tsk(current, trace, true);
+}
+EXPORT_SYMBOL_GPL(save_stack_trace);
+
+void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
+{
+	__save_stack_trace_tsk(tsk, trace, false);
 }
 EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
--- a/arch/nds32/kernel/stacktrace.c
+++ b/arch/nds32/kernel/stacktrace.c
@@ -6,25 +6,16 @@
 #include <linux/stacktrace.h>
 #include <linux/ftrace.h>
 
-void save_stack_trace(struct stack_trace *trace)
-{
-	save_stack_trace_tsk(current, trace);
-}
-EXPORT_SYMBOL_GPL(save_stack_trace);
-
-void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
+static void __save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace, bool savesched)
 {
 	unsigned long *fpn;
 	int skip = trace->skip;
-	int savesched;
 	int graph_idx = 0;
 
 	if (tsk == current) {
 		__asm__ __volatile__("\tori\t%0, $fp, #0\n":"=r"(fpn));
-		savesched = 1;
 	} else {
 		fpn = (unsigned long *)thread_saved_fp(tsk);
-		savesched = 0;
 	}
 
 	while (!kstack_end(fpn) && !((unsigned long)fpn & 0x3)
@@ -50,4 +41,14 @@ void save_stack_trace_tsk(struct task_st
 		fpn = (unsigned long *)fpp;
 	}
 }
+void save_stack_trace(struct stack_trace *trace)
+{
+	__save_stack_trace_tsk(current, trace, true);
+}
+EXPORT_SYMBOL_GPL(save_stack_trace);
+
+void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
+{
+	__save_stack_trace_tsk(tsk, trace, false);
+}
 EXPORT_SYMBOL_GPL(save_stack_trace_tsk);



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 5/7] sched: Add wrapper for get_wchan() to keep task blocked
  2021-10-08 11:15 ` [PATCH 5/7] sched: Add wrapper for get_wchan() to keep task blocked Peter Zijlstra
@ 2021-10-08 11:26   ` Geert Uytterhoeven
  2021-10-08 12:45   ` Mark Rutland
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: Geert Uytterhoeven @ 2021-10-08 11:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Kees Cook, Jann Horn, Linux Kernel Mailing List, vcaputo,
	Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Andrew Morton, Christian Brauner,
	amistry, Kenta.Tada, legion, michael.weiss, Michal Hocko,
	Helge Deller, zhengqi.arch, Tobin C. Harding, Tycho Andersen,
	Thomas Gleixner, Borislav Petkov, H. Peter Anvin, Mark Rutland,
	Jens Axboe, metze, laijs, Andy Lutomirski, Dave Hansen,
	Eric W. Biederman, ohoono.kwon, kaleshsingh, YiFei Zhu,
	Josh Poimboeuf, linux-hardening, Linux-Arch, Vineet Gupta,
	Russell King, Will Deacon, Guo Ren, Brian Cain, Michal Simek,
	Thomas Bogendoerfer, Nick Hu, Jonas Bonn, Michael Ellerman,
	Paul Walmsley, Heiko Carstens, Yoshinori Sato, David S. Miller,
	Chris Zankel

On Fri, Oct 8, 2021 at 1:20 PM Peter Zijlstra <peterz@infradead.org> wrote:
> From: Kees Cook <keescook@chromium.org>
>
> Having a stable wchan means the process must be blocked and for it to
> stay that way while performing stack unwinding.
>
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

>  arch/m68k/include/asm/processor.h       |    2 +-
>  arch/m68k/kernel/process.c              |    4 +---

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT
  2021-10-08 11:15 ` [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT Peter Zijlstra
@ 2021-10-08 12:40   ` Mark Rutland
  2021-10-08 13:45     ` Peter Zijlstra
  2021-10-14 11:07   ` Russell King (Oracle)
  1 sibling, 1 reply; 29+ messages in thread
From: Mark Rutland @ 2021-10-08 12:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: keescook, jannh, linux-kernel, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, axboe, metze, laijs, luto, dave.hansen, ebiederm,
	ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe, linux-hardening,
	linux-arch, vgupta, linux, will, guoren, bcain, monstr, tsbogend,
	nickhu, jonas, mpe, paul.walmsley, hca, ysato, davem, chris

[Adding Josh, since there might be a concern here from a livepatch pov]

On Fri, Oct 08, 2021 at 01:15:33PM +0200, Peter Zijlstra wrote:
> It's trivial to implement __get_wchan() with CONFIG_STACKTRACE

Possibly, but I don't think this is quite right -- semantic issue below.
 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---

> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -257,8 +257,6 @@ struct task_struct;
>  /* Free all resources held by a thread. */
>  extern void release_thread(struct task_struct *);
>  
> -unsigned long __get_wchan(struct task_struct *p);
> -
>  void update_sctlr_el1(u64 sctlr);
>  
>  /* Thread switching */
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -528,32 +528,6 @@ __notrace_funcgraph struct task_struct *
>  	return last;
>  }
>  
> -unsigned long __get_wchan(struct task_struct *p)
> -{
> -	struct stackframe frame;
> -	unsigned long stack_page, ret = 0;
> -	int count = 0;
> -
> -	stack_page = (unsigned long)try_get_task_stack(p);
> -	if (!stack_page)
> -		return 0;
> -
> -	start_backtrace(&frame, thread_saved_fp(p), thread_saved_pc(p));
> -
> -	do {
> -		if (unwind_frame(p, &frame))
> -			goto out;
> -		if (!in_sched_functions(frame.pc)) {
> -			ret = frame.pc;
> -			goto out;
> -		}
> -	} while (count++ < 16);
> -
> -out:
> -	put_task_stack(p);
> -	return ret;
> -}
> -

> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1966,6 +1966,21 @@ bool sched_task_on_rq(struct task_struct
>  	return task_on_rq_queued(p);
>  }
>  
> +#ifdef CONFIG_STACKTRACE
> +static unsigned long __get_wchan(struct task_struct *p)
> +{
> +	unsigned long entry = 0;
> +
> +	stack_trace_save_tsk(p, &entry, 1, 0);

This assumes stack_trace_save_tsk() will skip sched functions, but I
don't think that's ever been a requirement? It's certinaly not
documented anywhere that I could find, and arm64 doesn't do so today,
and this patch causes wchan to just log `__switch_to` for everything.

I realise you "fix" that for some arches in the next patch, but it's not
clear to me that's the right thing to do -- I would expect that
stack_trace_save_tsk() *shouldn't* skip anything unless we've explicitly
told it to via skipnr, because I'd expect that
stack_trace_save_tsk_reliable() mustn't, in case we ever need to patch
anything in the scheduler (or arch ctxsw code) with a livepatch, or if
you ever *want* to have the sched functions in a trace.

So I have two big questions:

1) Where precisely should stack_trace_save_tsk() and
   stack_trace_save_tsk_reliable() start from?

1) What should you do when you *do* want sched functions in a trace?

We could side-step the issue here by using arch_stack_walk(), which'd
make it easy to skip sched functions in the core code.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 5/7] sched: Add wrapper for get_wchan() to keep task blocked
  2021-10-08 11:15 ` [PATCH 5/7] sched: Add wrapper for get_wchan() to keep task blocked Peter Zijlstra
  2021-10-08 11:26   ` Geert Uytterhoeven
@ 2021-10-08 12:45   ` Mark Rutland
  2021-10-14 10:46   ` Russell King (Oracle)
  2021-10-15  9:45   ` [tip: sched/core] " tip-bot2 for Kees Cook
  3 siblings, 0 replies; 29+ messages in thread
From: Mark Rutland @ 2021-10-08 12:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: keescook, jannh, linux-kernel, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, axboe, metze, laijs, luto, dave.hansen, ebiederm,
	ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe, linux-hardening,
	linux-arch, vgupta, linux, will, guoren, bcain, monstr, tsbogend,
	nickhu, jonas, mpe, paul.walmsley, hca, ysato, davem, chris

On Fri, Oct 08, 2021 at 01:15:32PM +0200, Peter Zijlstra wrote:
> From: Kees Cook <keescook@chromium.org>
> 
> Having a stable wchan means the process must be blocked and for it to
> stay that way while performing stack unwinding.
> 
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

FWIW, this seems to wpork on arm64:

Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]

Mark.

> ---
>  arch/alpha/include/asm/processor.h      |    2 +-
>  arch/alpha/kernel/process.c             |    5 ++---
>  arch/arc/include/asm/processor.h        |    2 +-
>  arch/arc/kernel/stacktrace.c            |    4 ++--
>  arch/arm/include/asm/processor.h        |    2 +-
>  arch/arm/kernel/process.c               |    4 +---
>  arch/arm64/include/asm/processor.h      |    2 +-
>  arch/arm64/kernel/process.c             |    4 +---
>  arch/csky/include/asm/processor.h       |    2 +-
>  arch/csky/kernel/stacktrace.c           |    5 ++---
>  arch/h8300/include/asm/processor.h      |    2 +-
>  arch/h8300/kernel/process.c             |    5 +----
>  arch/hexagon/include/asm/processor.h    |    2 +-
>  arch/hexagon/kernel/process.c           |    4 +---
>  arch/ia64/include/asm/processor.h       |    2 +-
>  arch/ia64/kernel/process.c              |    5 +----
>  arch/m68k/include/asm/processor.h       |    2 +-
>  arch/m68k/kernel/process.c              |    4 +---
>  arch/microblaze/include/asm/processor.h |    2 +-
>  arch/microblaze/kernel/process.c        |    2 +-
>  arch/mips/include/asm/processor.h       |    2 +-
>  arch/mips/kernel/process.c              |    8 +++-----
>  arch/nds32/include/asm/processor.h      |    2 +-
>  arch/nds32/kernel/process.c             |    7 +------
>  arch/nios2/include/asm/processor.h      |    2 +-
>  arch/nios2/kernel/process.c             |    5 +----
>  arch/openrisc/include/asm/processor.h   |    2 +-
>  arch/openrisc/kernel/process.c          |    2 +-
>  arch/parisc/include/asm/processor.h     |    2 +-
>  arch/parisc/kernel/process.c            |    5 +----
>  arch/powerpc/include/asm/processor.h    |    2 +-
>  arch/powerpc/kernel/process.c           |    9 +++------
>  arch/riscv/include/asm/processor.h      |    2 +-
>  arch/riscv/kernel/stacktrace.c          |   12 +++++-------
>  arch/s390/include/asm/processor.h       |    2 +-
>  arch/s390/kernel/process.c              |    4 ++--
>  arch/sh/include/asm/processor_32.h      |    2 +-
>  arch/sh/kernel/process_32.c             |    5 +----
>  arch/sparc/include/asm/processor_32.h   |    2 +-
>  arch/sparc/include/asm/processor_64.h   |    2 +-
>  arch/sparc/kernel/process_32.c          |    5 +----
>  arch/sparc/kernel/process_64.c          |    5 +----
>  arch/um/include/asm/processor-generic.h |    2 +-
>  arch/um/kernel/process.c                |    5 +----
>  arch/x86/include/asm/processor.h        |    2 +-
>  arch/x86/kernel/process.c               |    5 +----
>  arch/xtensa/include/asm/processor.h     |    2 +-
>  arch/xtensa/kernel/process.c            |    5 +----
>  include/linux/sched.h                   |    1 +
>  kernel/sched/core.c                     |   19 +++++++++++++++++++
>  50 files changed, 80 insertions(+), 112 deletions(-)
> 
> --- a/arch/alpha/include/asm/processor.h
> +++ b/arch/alpha/include/asm/processor.h
> @@ -42,7 +42,7 @@ extern void start_thread(struct pt_regs
>  struct task_struct;
>  extern void release_thread(struct task_struct *);
>  
> -unsigned long get_wchan(struct task_struct *p);
> +unsigned long __get_wchan(struct task_struct *p);
>  
>  #define KSTK_EIP(tsk) (task_pt_regs(tsk)->pc)
>  
> --- a/arch/alpha/kernel/process.c
> +++ b/arch/alpha/kernel/process.c
> @@ -376,12 +376,11 @@ thread_saved_pc(struct task_struct *t)
>  }
>  
>  unsigned long
> -get_wchan(struct task_struct *p)
> +__get_wchan(struct task_struct *p)
>  {
>  	unsigned long schedule_frame;
>  	unsigned long pc;
> -	if (!p || p == current || task_is_running(p))
> -		return 0;
> +
>  	/*
>  	 * This one depends on the frame size of schedule().  Do a
>  	 * "disass schedule" in gdb to find the frame size.  Also, the
> --- a/arch/arc/include/asm/processor.h
> +++ b/arch/arc/include/asm/processor.h
> @@ -70,7 +70,7 @@ struct task_struct;
>  extern void start_thread(struct pt_regs * regs, unsigned long pc,
>  			 unsigned long usp);
>  
> -extern unsigned int get_wchan(struct task_struct *p);
> +extern unsigned int __get_wchan(struct task_struct *p);
>  
>  #endif /* !__ASSEMBLY__ */
>  
> --- a/arch/arc/kernel/stacktrace.c
> +++ b/arch/arc/kernel/stacktrace.c
> @@ -15,7 +15,7 @@
>   *      = specifics of data structs where trace is saved(CONFIG_STACKTRACE etc)
>   *
>   *  vineetg: March 2009
> - *  -Implemented correct versions of thread_saved_pc() and get_wchan()
> + *  -Implemented correct versions of thread_saved_pc() and __get_wchan()
>   *
>   *  rajeshwarr: 2008
>   *  -Initial implementation
> @@ -248,7 +248,7 @@ void show_stack(struct task_struct *tsk,
>   * Of course just returning schedule( ) would be pointless so unwind until
>   * the function is not in schedular code
>   */
> -unsigned int get_wchan(struct task_struct *tsk)
> +unsigned int __get_wchan(struct task_struct *tsk)
>  {
>  	return arc_unwind_core(tsk, NULL, __get_first_nonsched, NULL);
>  }
> --- a/arch/arm/include/asm/processor.h
> +++ b/arch/arm/include/asm/processor.h
> @@ -84,7 +84,7 @@ struct task_struct;
>  /* Free all resources held by a thread. */
>  extern void release_thread(struct task_struct *);
>  
> -unsigned long get_wchan(struct task_struct *p);
> +unsigned long __get_wchan(struct task_struct *p);
>  
>  #define task_pt_regs(p) \
>  	((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
> --- a/arch/arm/kernel/process.c
> +++ b/arch/arm/kernel/process.c
> @@ -276,13 +276,11 @@ int copy_thread(unsigned long clone_flag
>  	return 0;
>  }
>  
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	struct stackframe frame;
>  	unsigned long stack_page;
>  	int count = 0;
> -	if (!p || p == current || task_is_running(p))
> -		return 0;
>  
>  	frame.fp = thread_saved_fp(p);
>  	frame.sp = thread_saved_sp(p);
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -257,7 +257,7 @@ struct task_struct;
>  /* Free all resources held by a thread. */
>  extern void release_thread(struct task_struct *);
>  
> -unsigned long get_wchan(struct task_struct *p);
> +unsigned long __get_wchan(struct task_struct *p);
>  
>  void update_sctlr_el1(u64 sctlr);
>  
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -528,13 +528,11 @@ __notrace_funcgraph struct task_struct *
>  	return last;
>  }
>  
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	struct stackframe frame;
>  	unsigned long stack_page, ret = 0;
>  	int count = 0;
> -	if (!p || p == current || task_is_running(p))
> -		return 0;
>  
>  	stack_page = (unsigned long)try_get_task_stack(p);
>  	if (!stack_page)
> --- a/arch/csky/include/asm/processor.h
> +++ b/arch/csky/include/asm/processor.h
> @@ -81,7 +81,7 @@ static inline void release_thread(struct
>  
>  extern int kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
>  
> -unsigned long get_wchan(struct task_struct *p);
> +unsigned long __get_wchan(struct task_struct *p);
>  
>  #define KSTK_EIP(tsk)		(task_pt_regs(tsk)->pc)
>  #define KSTK_ESP(tsk)		(task_pt_regs(tsk)->usp)
> --- a/arch/csky/kernel/stacktrace.c
> +++ b/arch/csky/kernel/stacktrace.c
> @@ -111,12 +111,11 @@ static bool save_wchan(unsigned long pc,
>  	return false;
>  }
>  
> -unsigned long get_wchan(struct task_struct *task)
> +unsigned long __get_wchan(struct task_struct *task)
>  {
>  	unsigned long pc = 0;
>  
> -	if (likely(task && task != current && !task_is_running(task)))
> -		walk_stackframe(task, NULL, save_wchan, &pc);
> +	walk_stackframe(task, NULL, save_wchan, &pc);
>  	return pc;
>  }
>  
> --- a/arch/h8300/include/asm/processor.h
> +++ b/arch/h8300/include/asm/processor.h
> @@ -105,7 +105,7 @@ static inline void release_thread(struct
>  {
>  }
>  
> -unsigned long get_wchan(struct task_struct *p);
> +unsigned long __get_wchan(struct task_struct *p);
>  
>  #define	KSTK_EIP(tsk)	\
>  	({			 \
> --- a/arch/h8300/kernel/process.c
> +++ b/arch/h8300/kernel/process.c
> @@ -128,15 +128,12 @@ int copy_thread(unsigned long clone_flag
>  	return 0;
>  }
>  
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	unsigned long fp, pc;
>  	unsigned long stack_page;
>  	int count = 0;
>  
> -	if (!p || p == current || task_is_running(p))
> -		return 0;
> -
>  	stack_page = (unsigned long)p;
>  	fp = ((struct pt_regs *)p->thread.ksp)->er6;
>  	do {
> --- a/arch/hexagon/include/asm/processor.h
> +++ b/arch/hexagon/include/asm/processor.h
> @@ -64,7 +64,7 @@ struct thread_struct {
>  extern void release_thread(struct task_struct *dead_task);
>  
>  /* Get wait channel for task P.  */
> -extern unsigned long get_wchan(struct task_struct *p);
> +extern unsigned long __get_wchan(struct task_struct *p);
>  
>  /*  The following stuff is pretty HEXAGON specific.  */
>  
> --- a/arch/hexagon/kernel/process.c
> +++ b/arch/hexagon/kernel/process.c
> @@ -130,13 +130,11 @@ void flush_thread(void)
>   * is an identification of the point at which the scheduler
>   * was invoked by a blocked thread.
>   */
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	unsigned long fp, pc;
>  	unsigned long stack_page;
>  	int count = 0;
> -	if (!p || p == current || task_is_running(p))
> -		return 0;
>  
>  	stack_page = (unsigned long)task_stack_page(p);
>  	fp = ((struct hexagon_switch_stack *)p->thread.switch_sp)->fp;
> --- a/arch/ia64/include/asm/processor.h
> +++ b/arch/ia64/include/asm/processor.h
> @@ -330,7 +330,7 @@ struct task_struct;
>  #define release_thread(dead_task)
>  
>  /* Get wait channel for task P.  */
> -extern unsigned long get_wchan (struct task_struct *p);
> +extern unsigned long __get_wchan (struct task_struct *p);
>  
>  /* Return instruction pointer of blocked task TSK.  */
>  #define KSTK_EIP(tsk)					\
> --- a/arch/ia64/kernel/process.c
> +++ b/arch/ia64/kernel/process.c
> @@ -523,15 +523,12 @@ exit_thread (struct task_struct *tsk)
>  }
>  
>  unsigned long
> -get_wchan (struct task_struct *p)
> +__get_wchan (struct task_struct *p)
>  {
>  	struct unw_frame_info info;
>  	unsigned long ip;
>  	int count = 0;
>  
> -	if (!p || p == current || task_is_running(p))
> -		return 0;
> -
>  	/*
>  	 * Note: p may not be a blocked task (it could be current or
>  	 * another process running on some other CPU.  Rather than
> --- a/arch/m68k/include/asm/processor.h
> +++ b/arch/m68k/include/asm/processor.h
> @@ -125,7 +125,7 @@ static inline void release_thread(struct
>  {
>  }
>  
> -unsigned long get_wchan(struct task_struct *p);
> +unsigned long __get_wchan(struct task_struct *p);
>  
>  #define	KSTK_EIP(tsk)	\
>      ({			\
> --- a/arch/m68k/kernel/process.c
> +++ b/arch/m68k/kernel/process.c
> @@ -263,13 +263,11 @@ int dump_fpu (struct pt_regs *regs, stru
>  }
>  EXPORT_SYMBOL(dump_fpu);
>  
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	unsigned long fp, pc;
>  	unsigned long stack_page;
>  	int count = 0;
> -	if (!p || p == current || task_is_running(p))
> -		return 0;
>  
>  	stack_page = (unsigned long)task_stack_page(p);
>  	fp = ((struct switch_stack *)p->thread.ksp)->a6;
> --- a/arch/microblaze/include/asm/processor.h
> +++ b/arch/microblaze/include/asm/processor.h
> @@ -68,7 +68,7 @@ static inline void release_thread(struct
>  {
>  }
>  
> -unsigned long get_wchan(struct task_struct *p);
> +unsigned long __get_wchan(struct task_struct *p);
>  
>  /* The size allocated for kernel stacks. This _must_ be a power of two! */
>  # define KERNEL_STACK_SIZE	0x2000
> --- a/arch/microblaze/kernel/process.c
> +++ b/arch/microblaze/kernel/process.c
> @@ -112,7 +112,7 @@ int copy_thread(unsigned long clone_flag
>  	return 0;
>  }
>  
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  /* TBD (used by procfs) */
>  	return 0;
> --- a/arch/mips/include/asm/processor.h
> +++ b/arch/mips/include/asm/processor.h
> @@ -369,7 +369,7 @@ static inline void flush_thread(void)
>  {
>  }
>  
> -unsigned long get_wchan(struct task_struct *p);
> +unsigned long __get_wchan(struct task_struct *p);
>  
>  #define __KSTK_TOS(tsk) ((unsigned long)task_stack_page(tsk) + \
>  			 THREAD_SIZE - 32 - sizeof(struct pt_regs))
> --- a/arch/mips/kernel/process.c
> +++ b/arch/mips/kernel/process.c
> @@ -511,7 +511,7 @@ static int __init frame_info_init(void)
>  
>  	/*
>  	 * Without schedule() frame info, result given by
> -	 * thread_saved_pc() and get_wchan() are not reliable.
> +	 * thread_saved_pc() and __get_wchan() are not reliable.
>  	 */
>  	if (schedule_mfi.pc_offset < 0)
>  		printk("Can't analyze schedule() prologue at %p\n", schedule);
> @@ -652,9 +652,9 @@ unsigned long unwind_stack(struct task_s
>  #endif
>  
>  /*
> - * get_wchan - a maintenance nightmare^W^Wpain in the ass ...
> + * __get_wchan - a maintenance nightmare^W^Wpain in the ass ...
>   */
> -unsigned long get_wchan(struct task_struct *task)
> +unsigned long __get_wchan(struct task_struct *task)
>  {
>  	unsigned long pc = 0;
>  #ifdef CONFIG_KALLSYMS
> @@ -662,8 +662,6 @@ unsigned long get_wchan(struct task_stru
>  	unsigned long ra = 0;
>  #endif
>  
> -	if (!task || task == current || task_is_running(task))
> -		goto out;
>  	if (!task_stack_page(task))
>  		goto out;
>  
> --- a/arch/nds32/include/asm/processor.h
> +++ b/arch/nds32/include/asm/processor.h
> @@ -83,7 +83,7 @@ extern struct task_struct *last_task_use
>  /* Prepare to copy thread state - unlazy all lazy status */
>  #define prepare_to_copy(tsk)	do { } while (0)
>  
> -unsigned long get_wchan(struct task_struct *p);
> +unsigned long __get_wchan(struct task_struct *p);
>  
>  #define cpu_relax()			barrier()
>  
> --- a/arch/nds32/kernel/process.c
> +++ b/arch/nds32/kernel/process.c
> @@ -233,15 +233,12 @@ int dump_fpu(struct pt_regs *regs, elf_f
>  
>  EXPORT_SYMBOL(dump_fpu);
>  
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	unsigned long fp, lr;
>  	unsigned long stack_start, stack_end;
>  	int count = 0;
>  
> -	if (!p || p == current || task_is_running(p))
> -		return 0;
> -
>  	if (IS_ENABLED(CONFIG_FRAME_POINTER)) {
>  		stack_start = (unsigned long)end_of_stack(p);
>  		stack_end = (unsigned long)task_stack_page(p) + THREAD_SIZE;
> @@ -258,5 +255,3 @@ unsigned long get_wchan(struct task_stru
>  	}
>  	return 0;
>  }
> -
> -EXPORT_SYMBOL(get_wchan);
> --- a/arch/nios2/include/asm/processor.h
> +++ b/arch/nios2/include/asm/processor.h
> @@ -69,7 +69,7 @@ static inline void release_thread(struct
>  {
>  }
>  
> -extern unsigned long get_wchan(struct task_struct *p);
> +extern unsigned long __get_wchan(struct task_struct *p);
>  
>  #define task_pt_regs(p) \
>  	((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1)
> --- a/arch/nios2/kernel/process.c
> +++ b/arch/nios2/kernel/process.c
> @@ -217,15 +217,12 @@ void dump(struct pt_regs *fp)
>  	pr_emerg("\n\n");
>  }
>  
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	unsigned long fp, pc;
>  	unsigned long stack_page;
>  	int count = 0;
>  
> -	if (!p || p == current || task_is_running(p))
> -		return 0;
> -
>  	stack_page = (unsigned long)p;
>  	fp = ((struct switch_stack *)p->thread.ksp)->fp;	/* ;dgt2 */
>  	do {
> --- a/arch/openrisc/include/asm/processor.h
> +++ b/arch/openrisc/include/asm/processor.h
> @@ -73,7 +73,7 @@ struct thread_struct {
>  
>  void start_thread(struct pt_regs *regs, unsigned long nip, unsigned long sp);
>  void release_thread(struct task_struct *);
> -unsigned long get_wchan(struct task_struct *p);
> +unsigned long __get_wchan(struct task_struct *p);
>  
>  #define cpu_relax()     barrier()
>  
> --- a/arch/openrisc/kernel/process.c
> +++ b/arch/openrisc/kernel/process.c
> @@ -263,7 +263,7 @@ void dump_elf_thread(elf_greg_t *dest, s
>  	dest[35] = 0;
>  }
>  
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	/* TODO */
>  
> --- a/arch/parisc/include/asm/processor.h
> +++ b/arch/parisc/include/asm/processor.h
> @@ -273,7 +273,7 @@ struct mm_struct;
>  /* Free all resources held by a thread. */
>  extern void release_thread(struct task_struct *);
>  
> -extern unsigned long get_wchan(struct task_struct *p);
> +extern unsigned long __get_wchan(struct task_struct *p);
>  
>  #define KSTK_EIP(tsk)	((tsk)->thread.regs.iaoq[0])
>  #define KSTK_ESP(tsk)	((tsk)->thread.regs.gr[30])
> --- a/arch/parisc/kernel/process.c
> +++ b/arch/parisc/kernel/process.c
> @@ -240,15 +240,12 @@ copy_thread(unsigned long clone_flags, u
>  }
>  
>  unsigned long
> -get_wchan(struct task_struct *p)
> +__get_wchan(struct task_struct *p)
>  {
>  	struct unwind_frame_info info;
>  	unsigned long ip;
>  	int count = 0;
>  
> -	if (!p || p == current || task_is_running(p))
> -		return 0;
> -
>  	/*
>  	 * These bracket the sleeping functions..
>  	 */
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -300,7 +300,7 @@ struct thread_struct {
>  
>  #define task_pt_regs(tsk)	((tsk)->thread.regs)
>  
> -unsigned long get_wchan(struct task_struct *p);
> +unsigned long __get_wchan(struct task_struct *p);
>  
>  #define KSTK_EIP(tsk)  ((tsk)->thread.regs? (tsk)->thread.regs->nip: 0)
>  #define KSTK_ESP(tsk)  ((tsk)->thread.regs? (tsk)->thread.regs->gpr[1]: 0)
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -2111,14 +2111,11 @@ int validate_sp(unsigned long sp, struct
>  
>  EXPORT_SYMBOL(validate_sp);
>  
> -static unsigned long __get_wchan(struct task_struct *p)
> +static unsigned long ___get_wchan(struct task_struct *p)
>  {
>  	unsigned long ip, sp;
>  	int count = 0;
>  
> -	if (!p || p == current || task_is_running(p))
> -		return 0;
> -
>  	sp = p->thread.ksp;
>  	if (!validate_sp(sp, p, STACK_FRAME_OVERHEAD))
>  		return 0;
> @@ -2137,14 +2134,14 @@ static unsigned long __get_wchan(struct
>  	return 0;
>  }
>  
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	unsigned long ret;
>  
>  	if (!try_get_task_stack(p))
>  		return 0;
>  
> -	ret = __get_wchan(p);
> +	ret = ___get_wchan(p);
>  
>  	put_task_stack(p);
>  
> --- a/arch/riscv/include/asm/processor.h
> +++ b/arch/riscv/include/asm/processor.h
> @@ -66,7 +66,7 @@ static inline void release_thread(struct
>  {
>  }
>  
> -extern unsigned long get_wchan(struct task_struct *p);
> +extern unsigned long __get_wchan(struct task_struct *p);
>  
>  
>  static inline void wait_for_interrupt(void)
> --- a/arch/riscv/kernel/stacktrace.c
> +++ b/arch/riscv/kernel/stacktrace.c
> @@ -128,16 +128,14 @@ static bool save_wchan(void *arg, unsign
>  	return true;
>  }
>  
> -unsigned long get_wchan(struct task_struct *task)
> +unsigned long __get_wchan(struct task_struct *task)
>  {
>  	unsigned long pc = 0;
>  
> -	if (likely(task && task != current && !task_is_running(task))) {
> -		if (!try_get_task_stack(task))
> -			return 0;
> -		walk_stackframe(task, NULL, save_wchan, &pc);
> -		put_task_stack(task);
> -	}
> +	if (!try_get_task_stack(task))
> +		return 0;
> +	walk_stackframe(task, NULL, save_wchan, &pc);
> +	put_task_stack(task);
>  	return pc;
>  }
>  
> --- a/arch/s390/include/asm/processor.h
> +++ b/arch/s390/include/asm/processor.h
> @@ -192,7 +192,7 @@ static inline void release_thread(struct
>  void guarded_storage_release(struct task_struct *tsk);
>  void gs_load_bc_cb(struct pt_regs *regs);
>  
> -unsigned long get_wchan(struct task_struct *p);
> +unsigned long __get_wchan(struct task_struct *p);
>  #define task_pt_regs(tsk) ((struct pt_regs *) \
>          (task_stack_page(tsk) + THREAD_SIZE) - 1)
>  #define KSTK_EIP(tsk)	(task_pt_regs(tsk)->psw.addr)
> --- a/arch/s390/kernel/process.c
> +++ b/arch/s390/kernel/process.c
> @@ -181,12 +181,12 @@ void execve_tail(void)
>  	asm volatile("sfpc %0" : : "d" (0));
>  }
>  
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	struct unwind_state state;
>  	unsigned long ip = 0;
>  
> -	if (!p || p == current || task_is_running(p) || !task_stack_page(p))
> +	if (!task_stack_page(p))
>  		return 0;
>  
>  	if (!try_get_task_stack(p))
> --- a/arch/sh/include/asm/processor_32.h
> +++ b/arch/sh/include/asm/processor_32.h
> @@ -180,7 +180,7 @@ static inline void show_code(struct pt_r
>  }
>  #endif
>  
> -extern unsigned long get_wchan(struct task_struct *p);
> +extern unsigned long __get_wchan(struct task_struct *p);
>  
>  #define KSTK_EIP(tsk)  (task_pt_regs(tsk)->pc)
>  #define KSTK_ESP(tsk)  (task_pt_regs(tsk)->regs[15])
> --- a/arch/sh/kernel/process_32.c
> +++ b/arch/sh/kernel/process_32.c
> @@ -182,13 +182,10 @@ __switch_to(struct task_struct *prev, st
>  	return prev;
>  }
>  
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	unsigned long pc;
>  
> -	if (!p || p == current || task_is_running(p))
> -		return 0;
> -
>  	/*
>  	 * The same comment as on the Alpha applies here, too ...
>  	 */
> --- a/arch/sparc/include/asm/processor_32.h
> +++ b/arch/sparc/include/asm/processor_32.h
> @@ -89,7 +89,7 @@ static inline void start_thread(struct p
>  /* Free all resources held by a thread. */
>  #define release_thread(tsk)		do { } while(0)
>  
> -unsigned long get_wchan(struct task_struct *);
> +unsigned long __get_wchan(struct task_struct *);
>  
>  #define task_pt_regs(tsk) ((tsk)->thread.kregs)
>  #define KSTK_EIP(tsk)  ((tsk)->thread.kregs->pc)
> --- a/arch/sparc/include/asm/processor_64.h
> +++ b/arch/sparc/include/asm/processor_64.h
> @@ -183,7 +183,7 @@ do { \
>  /* Free all resources held by a thread. */
>  #define release_thread(tsk)		do { } while (0)
>  
> -unsigned long get_wchan(struct task_struct *task);
> +unsigned long __get_wchan(struct task_struct *task);
>  
>  #define task_pt_regs(tsk) (task_thread_info(tsk)->kregs)
>  #define KSTK_EIP(tsk)  (task_pt_regs(tsk)->tpc)
> --- a/arch/sparc/kernel/process_32.c
> +++ b/arch/sparc/kernel/process_32.c
> @@ -365,7 +365,7 @@ int copy_thread(unsigned long clone_flag
>  	return 0;
>  }
>  
> -unsigned long get_wchan(struct task_struct *task)
> +unsigned long __get_wchan(struct task_struct *task)
>  {
>  	unsigned long pc, fp, bias = 0;
>  	unsigned long task_base = (unsigned long) task;
> @@ -373,9 +373,6 @@ unsigned long get_wchan(struct task_stru
>  	struct reg_window32 *rw;
>  	int count = 0;
>  
> -	if (!task || task == current || task_is_running(task))
> -		goto out;
> -
>  	fp = task_thread_info(task)->ksp + bias;
>  	do {
>  		/* Bogus frame pointer? */
> --- a/arch/sparc/kernel/process_64.c
> +++ b/arch/sparc/kernel/process_64.c
> @@ -663,7 +663,7 @@ int arch_dup_task_struct(struct task_str
>  	return 0;
>  }
>  
> -unsigned long get_wchan(struct task_struct *task)
> +unsigned long __get_wchan(struct task_struct *task)
>  {
>  	unsigned long pc, fp, bias = 0;
>  	struct thread_info *tp;
> @@ -671,9 +671,6 @@ unsigned long get_wchan(struct task_stru
>          unsigned long ret = 0;
>  	int count = 0; 
>  
> -	if (!task || task == current || task_is_running(task))
> -		goto out;
> -
>  	tp = task_thread_info(task);
>  	bias = STACK_BIAS;
>  	fp = task_thread_info(task)->ksp + bias;
> --- a/arch/um/include/asm/processor-generic.h
> +++ b/arch/um/include/asm/processor-generic.h
> @@ -106,6 +106,6 @@ extern struct cpuinfo_um boot_cpu_data;
>  #define cache_line_size()	(boot_cpu_data.cache_alignment)
>  
>  #define KSTK_REG(tsk, reg) get_thread_reg(reg, &tsk->thread.switch_buf)
> -extern unsigned long get_wchan(struct task_struct *p);
> +extern unsigned long __get_wchan(struct task_struct *p);
>  
>  #endif
> --- a/arch/um/kernel/process.c
> +++ b/arch/um/kernel/process.c
> @@ -364,14 +364,11 @@ unsigned long arch_align_stack(unsigned
>  }
>  #endif
>  
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	unsigned long stack_page, sp, ip;
>  	bool seen_sched = 0;
>  
> -	if ((p == NULL) || (p == current) || task_is_running(p))
> -		return 0;
> -
>  	stack_page = (unsigned long) task_stack_page(p);
>  	/* Bail if the process has no kernel stack for some reason */
>  	if (stack_page == 0)
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -590,7 +590,7 @@ static inline void load_sp0(unsigned lon
>  /* Free all resources held by a thread. */
>  extern void release_thread(struct task_struct *);
>  
> -unsigned long get_wchan(struct task_struct *p);
> +unsigned long __get_wchan(struct task_struct *p);
>  
>  /*
>   * Generic CPUID function
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -943,13 +943,10 @@ unsigned long arch_randomize_brk(struct
>   * because the task might wake up and we might look at a stack
>   * changing under us.
>   */
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	unsigned long entry = 0;
>  
> -	if (p == current || task_is_running(p))
> -		return 0;
> -
>  	stack_trace_save_tsk(p, &entry, 1, 0);
>  	return entry;
>  }
> --- a/arch/xtensa/include/asm/processor.h
> +++ b/arch/xtensa/include/asm/processor.h
> @@ -215,7 +215,7 @@ struct mm_struct;
>  /* Free all resources held by a thread. */
>  #define release_thread(thread) do { } while(0)
>  
> -extern unsigned long get_wchan(struct task_struct *p);
> +extern unsigned long __get_wchan(struct task_struct *p);
>  
>  #define KSTK_EIP(tsk)		(task_pt_regs(tsk)->pc)
>  #define KSTK_ESP(tsk)		(task_pt_regs(tsk)->areg[1])
> --- a/arch/xtensa/kernel/process.c
> +++ b/arch/xtensa/kernel/process.c
> @@ -298,15 +298,12 @@ int copy_thread(unsigned long clone_flag
>   * These bracket the sleeping functions..
>   */
>  
> -unsigned long get_wchan(struct task_struct *p)
> +unsigned long __get_wchan(struct task_struct *p)
>  {
>  	unsigned long sp, pc;
>  	unsigned long stack_page = (unsigned long) task_stack_page(p);
>  	int count = 0;
>  
> -	if (!p || p == current || task_is_running(p))
> -		return 0;
> -
>  	sp = p->thread.sp;
>  	pc = MAKE_PC_FROM_RA(p->thread.ra, p->thread.sp);
>  
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2139,6 +2139,7 @@ static inline void set_task_cpu(struct t
>  #endif /* CONFIG_SMP */
>  
>  extern bool sched_task_on_rq(struct task_struct *p);
> +extern unsigned long get_wchan(struct task_struct *p);
>  
>  /*
>   * In order to reduce various lock holder preemption latencies provide an
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1962,6 +1962,25 @@ bool sched_task_on_rq(struct task_struct
>  	return task_on_rq_queued(p);
>  }
>  
> +unsigned long get_wchan(struct task_struct *p)
> +{
> +	unsigned long ip = 0;
> +	unsigned int state;
> +
> +	if (!p || p == current)
> +		return 0;
> +
> +	/* Only get wchan if task is blocked and we can keep it that way. */
> +	raw_spin_lock_irq(&p->pi_lock);
> +	state = READ_ONCE(p->__state);
> +	smp_rmb(); /* see try_to_wake_up() */
> +	if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq)
> +		ip = __get_wchan(p);
> +	raw_spin_unlock_irq(&p->pi_lock);
> +
> +	return ip;
> +}
> +
>  static inline void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
>  {
>  	if (!(flags & ENQUEUE_NOCLOCK))
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 7/7] arch: Fix STACKTRACE_SUPPORT
  2021-10-08 11:15 ` [PATCH 7/7] arch: Fix STACKTRACE_SUPPORT Peter Zijlstra
@ 2021-10-08 12:52   ` Mark Rutland
  2021-10-09  9:36   ` Guo Ren
  1 sibling, 0 replies; 29+ messages in thread
From: Mark Rutland @ 2021-10-08 12:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: keescook, jannh, linux-kernel, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, axboe, metze, laijs, luto, dave.hansen, ebiederm,
	ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe, linux-hardening,
	linux-arch, vgupta, linux, will, guoren, bcain, monstr, tsbogend,
	nickhu, jonas, mpe, paul.walmsley, hca, ysato, davem, chris

On Fri, Oct 08, 2021 at 01:15:34PM +0200, Peter Zijlstra wrote:
> A few archs got save_stack_trace_tsk() vs in_sched_functions() wrong.

As mentioned on the last patch, it's not clear to me what the intended
semantic of save_stack_trace_tsk() is w.r.t. sched functions, as the
naive reading is that it should report *everything* a task may return
to.

If it's meant to skip sched functions, I think we need some explicit
documentation/commentary to that effect. In that case, there are other
architectures that need a fixup (e.g. arm64).

TBH, I don't think it *should* skip sched functions, and we should
filter out sched functions as required at a higher level, or deprecate
this interface in favour of arch_stack_walk() where it's easier to have
common filter functions invoked during the walk....

Thanks,
Mark.

> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/csky/kernel/stacktrace.c  |    7 ++++++-
>  arch/mips/kernel/stacktrace.c  |   27 ++++++++++++++++-----------
>  arch/nds32/kernel/stacktrace.c |   21 +++++++++++----------
>  3 files changed, 33 insertions(+), 22 deletions(-)
> 
> --- a/arch/csky/kernel/stacktrace.c
> +++ b/arch/csky/kernel/stacktrace.c
> @@ -122,12 +122,17 @@ static bool save_trace(unsigned long pc,
>  	return __save_trace(pc, arg, false);
>  }
>  
> +static bool save_trace_nosched(unsigned long pc, void *arg)
> +{
> +	return __save_trace(pc, arg, true);
> +}
> +
>  /*
>   * Save stack-backtrace addresses into a stack_trace buffer.
>   */
>  void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
>  {
> -	walk_stackframe(tsk, NULL, save_trace, trace);
> +	walk_stackframe(tsk, NULL, save_trace_nosched, trace);
>  }
>  EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
>  
> --- a/arch/mips/kernel/stacktrace.c
> +++ b/arch/mips/kernel/stacktrace.c
> @@ -66,16 +66,7 @@ static void save_context_stack(struct st
>  #endif
>  }
>  
> -/*
> - * Save stack-backtrace addresses into a stack_trace buffer.
> - */
> -void save_stack_trace(struct stack_trace *trace)
> -{
> -	save_stack_trace_tsk(current, trace);
> -}
> -EXPORT_SYMBOL_GPL(save_stack_trace);
> -
> -void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
> +static void __save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace, bool savesched)
>  {
>  	struct pt_regs dummyregs;
>  	struct pt_regs *regs = &dummyregs;
> @@ -88,6 +79,20 @@ void save_stack_trace_tsk(struct task_st
>  		regs->cp0_epc = tsk->thread.reg31;
>  	} else
>  		prepare_frametrace(regs);
> -	save_context_stack(trace, tsk, regs, tsk == current);
> +	save_context_stack(trace, tsk, regs, savesched);
> +}
> +
> +/*
> + * Save stack-backtrace addresses into a stack_trace buffer.
> + */
> +void save_stack_trace(struct stack_trace *trace)
> +{
> +	__save_stack_trace_tsk(current, trace, true);
> +}
> +EXPORT_SYMBOL_GPL(save_stack_trace);
> +
> +void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
> +{
> +	__save_stack_trace_tsk(tsk, trace, false);
>  }
>  EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
> --- a/arch/nds32/kernel/stacktrace.c
> +++ b/arch/nds32/kernel/stacktrace.c
> @@ -6,25 +6,16 @@
>  #include <linux/stacktrace.h>
>  #include <linux/ftrace.h>
>  
> -void save_stack_trace(struct stack_trace *trace)
> -{
> -	save_stack_trace_tsk(current, trace);
> -}
> -EXPORT_SYMBOL_GPL(save_stack_trace);
> -
> -void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
> +static void __save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace, bool savesched)
>  {
>  	unsigned long *fpn;
>  	int skip = trace->skip;
> -	int savesched;
>  	int graph_idx = 0;
>  
>  	if (tsk == current) {
>  		__asm__ __volatile__("\tori\t%0, $fp, #0\n":"=r"(fpn));
> -		savesched = 1;
>  	} else {
>  		fpn = (unsigned long *)thread_saved_fp(tsk);
> -		savesched = 0;
>  	}
>  
>  	while (!kstack_end(fpn) && !((unsigned long)fpn & 0x3)
> @@ -50,4 +41,14 @@ void save_stack_trace_tsk(struct task_st
>  		fpn = (unsigned long *)fpp;
>  	}
>  }
> +void save_stack_trace(struct stack_trace *trace)
> +{
> +	__save_stack_trace_tsk(current, trace, true);
> +}
> +EXPORT_SYMBOL_GPL(save_stack_trace);
> +
> +void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
> +{
> +	__save_stack_trace_tsk(tsk, trace, false);
> +}
>  EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT
  2021-10-08 12:40   ` Mark Rutland
@ 2021-10-08 13:45     ` Peter Zijlstra
  2021-10-08 16:17       ` Josh Poimboeuf
  2021-10-14 18:03       ` Mark Rutland
  0 siblings, 2 replies; 29+ messages in thread
From: Peter Zijlstra @ 2021-10-08 13:45 UTC (permalink / raw)
  To: Mark Rutland
  Cc: keescook, jannh, linux-kernel, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, axboe, metze, laijs, luto, dave.hansen, ebiederm,
	ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe, linux-hardening,
	linux-arch, vgupta, linux, will, guoren, bcain, monstr, tsbogend,
	nickhu, jonas, mpe, paul.walmsley, hca, ysato, davem, chris

On Fri, Oct 08, 2021 at 01:40:52PM +0100, Mark Rutland wrote:
> [Adding Josh, since there might be a concern here from a livepatch pov]
> 

> > +static unsigned long __get_wchan(struct task_struct *p)
> > +{
> > +	unsigned long entry = 0;
> > +
> > +	stack_trace_save_tsk(p, &entry, 1, 0);
> 
> This assumes stack_trace_save_tsk() will skip sched functions, but I
> don't think that's ever been a requirement? It's certinaly not
> documented anywhere that I could find, and arm64 doesn't do so today,
> and this patch causes wchan to just log `__switch_to` for everything.

Confused, arm64 has arch_stack_walk() and should thus use
kernel/stacktrace.c's stack_trace_consume_entry_nosched.

> I realise you "fix" that for some arches in the next patch, but it's not
> clear to me that's the right thing to do -- I would expect that

I only actually change the behaviour on csky, both mips and nds32 have
this 'savesched = (task == current)' logic which ends up being a very
confusing way to write things, but for wchan we never call on current,
and hence don't save the __sched functions.

> stack_trace_save_tsk() *shouldn't* skip anything unless we've explicitly
> told it to via skipnr, because I'd expect that

It's what most archs happen to do today and is what
stack_trace_save_tsk() as implemented using arch_stack_walk() does.
Which is I think the closest to canonical we have.

> stack_trace_save_tsk_reliable() mustn't, in case we ever need to patch
> anything in the scheduler (or arch ctxsw code) with a livepatch, or if
> you ever *want* to have the sched functions in a trace.
> 
> So I have two big questions:
> 
> 1) Where precisely should stack_trace_save_tsk() and
>    stack_trace_save_tsk_reliable() start from?
> 
> 1) What should you do when you *do* want sched functions in a trace?
> 
> We could side-step the issue here by using arch_stack_walk(), which'd
> make it easy to skip sched functions in the core code.

arch_stack_walk() is the modern API and should be used going forward,
and I've gone with the stack_trace_save*() implementation as per that.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT
  2021-10-08 13:45     ` Peter Zijlstra
@ 2021-10-08 16:17       ` Josh Poimboeuf
  2021-10-14 18:07         ` Mark Rutland
  2021-10-14 18:03       ` Mark Rutland
  1 sibling, 1 reply; 29+ messages in thread
From: Josh Poimboeuf @ 2021-10-08 16:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mark Rutland, keescook, jannh, linux-kernel, vcaputo, mingo,
	juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, akpm, christian.brauner, amistry, Kenta.Tada,
	legion, michael.weiss, mhocko, deller, zhengqi.arch, me, tycho,
	tglx, bp, hpa, axboe, metze, laijs, luto, dave.hansen, ebiederm,
	ohoono.kwon, kaleshsingh, yifeifz2, linux-hardening, linux-arch,
	vgupta, linux, will, guoren, bcain, monstr, tsbogend, nickhu,
	jonas, mpe, paul.walmsley, hca, ysato, davem, chris

On Fri, Oct 08, 2021 at 03:45:59PM +0200, Peter Zijlstra wrote:
> > stack_trace_save_tsk() *shouldn't* skip anything unless we've explicitly
> > told it to via skipnr, because I'd expect that
> 
> It's what most archs happen to do today and is what
> stack_trace_save_tsk() as implemented using arch_stack_walk() does.
> Which is I think the closest to canonical we have.

It *is* confusing though.  Even if 'nosched' may be the normally
desired behavior, stack_trace_save_tsk() should probably be named
stack_trace_save_tsk_nosched().

-- 
Josh


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 7/7] arch: Fix STACKTRACE_SUPPORT
  2021-10-08 11:15 ` [PATCH 7/7] arch: Fix STACKTRACE_SUPPORT Peter Zijlstra
  2021-10-08 12:52   ` Mark Rutland
@ 2021-10-09  9:36   ` Guo Ren
  1 sibling, 0 replies; 29+ messages in thread
From: Guo Ren @ 2021-10-09  9:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Kees Cook, Jann Horn, Linux Kernel Mailing List, vcaputo,
	Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Andrew Morton, christian.brauner,
	amistry, Kenta.Tada, legion, michael.weiss, mhocko, Helge Deller,
	zhengqi.arch, me, tycho, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, Mark Rutland, Jens Axboe, metze, laijs,
	Andy Lutomirski, Dave Hansen, Eric W. Biederman, ohoono.kwon,
	kaleshsingh, yifeifz2, jpoimboe, linux-hardening, linux-arch,
	vgupta, Russell King, Will Deacon, Brian Cain, Michal Simek,
	Thomas Bogendoerfer, Nick Hu, Jonas Bonn, Michael Ellerman,
	Paul Walmsley, Heiko Carstens, Yoshinori Sato, David Miller,
	Chris Zankel

Thx Peter,

On Fri, Oct 8, 2021 at 7:18 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> A few archs got save_stack_trace_tsk() vs in_sched_functions() wrong.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/csky/kernel/stacktrace.c  |    7 ++++++-
>  arch/mips/kernel/stacktrace.c  |   27 ++++++++++++++++-----------
>  arch/nds32/kernel/stacktrace.c |   21 +++++++++++----------
>  3 files changed, 33 insertions(+), 22 deletions(-)
>
> --- a/arch/csky/kernel/stacktrace.c
> +++ b/arch/csky/kernel/stacktrace.c
> @@ -122,12 +122,17 @@ static bool save_trace(unsigned long pc,
>         return __save_trace(pc, arg, false);
>  }
>
> +static bool save_trace_nosched(unsigned long pc, void *arg)
> +{
> +       return __save_trace(pc, arg, true);
> +}
> +
>  /*
>   * Save stack-backtrace addresses into a stack_trace buffer.
>   */
>  void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
>  {
> -       walk_stackframe(tsk, NULL, save_trace, trace);
> +       walk_stackframe(tsk, NULL, save_trace_nosched, trace);
>  }
>  EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
I think the patch should be:
@@ -138,7 +138,7 @@ static bool __save_trace(unsigned long pc, void
*arg, bool nosched)

 static bool save_trace(unsigned long pc, void *arg)
 {
-       return __save_trace(pc, arg, false);
+       return __save_trace(pc, arg, true);
 }

Another question:
If we put sched_text in the backtrace buffer, just cause put no useful
information in wchan, right?
(I think it wouldn't cause a worse problem than debugging.)


>
> --- a/arch/mips/kernel/stacktrace.c
> +++ b/arch/mips/kernel/stacktrace.c
> @@ -66,16 +66,7 @@ static void save_context_stack(struct st
>  #endif
>  }
>
> -/*
> - * Save stack-backtrace addresses into a stack_trace buffer.
> - */
> -void save_stack_trace(struct stack_trace *trace)
> -{
> -       save_stack_trace_tsk(current, trace);
> -}
> -EXPORT_SYMBOL_GPL(save_stack_trace);
> -
> -void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
> +static void __save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace, bool savesched)
>  {
>         struct pt_regs dummyregs;
>         struct pt_regs *regs = &dummyregs;
> @@ -88,6 +79,20 @@ void save_stack_trace_tsk(struct task_st
>                 regs->cp0_epc = tsk->thread.reg31;
>         } else
>                 prepare_frametrace(regs);
> -       save_context_stack(trace, tsk, regs, tsk == current);
> +       save_context_stack(trace, tsk, regs, savesched);
> +}
> +
> +/*
> + * Save stack-backtrace addresses into a stack_trace buffer.
> + */
> +void save_stack_trace(struct stack_trace *trace)
> +{
> +       __save_stack_trace_tsk(current, trace, true);
> +}
> +EXPORT_SYMBOL_GPL(save_stack_trace);
> +
> +void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
> +{
> +       __save_stack_trace_tsk(tsk, trace, false);
>  }
>  EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
> --- a/arch/nds32/kernel/stacktrace.c
> +++ b/arch/nds32/kernel/stacktrace.c
> @@ -6,25 +6,16 @@
>  #include <linux/stacktrace.h>
>  #include <linux/ftrace.h>
>
> -void save_stack_trace(struct stack_trace *trace)
> -{
> -       save_stack_trace_tsk(current, trace);
> -}
> -EXPORT_SYMBOL_GPL(save_stack_trace);
> -
> -void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
> +static void __save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace, bool savesched)
>  {
>         unsigned long *fpn;
>         int skip = trace->skip;
> -       int savesched;
>         int graph_idx = 0;
>
>         if (tsk == current) {
>                 __asm__ __volatile__("\tori\t%0, $fp, #0\n":"=r"(fpn));
> -               savesched = 1;
>         } else {
>                 fpn = (unsigned long *)thread_saved_fp(tsk);
> -               savesched = 0;
>         }
>
>         while (!kstack_end(fpn) && !((unsigned long)fpn & 0x3)
> @@ -50,4 +41,14 @@ void save_stack_trace_tsk(struct task_st
>                 fpn = (unsigned long *)fpp;
>         }
>  }
> +void save_stack_trace(struct stack_trace *trace)
> +{
> +       __save_stack_trace_tsk(current, trace, true);
> +}
> +EXPORT_SYMBOL_GPL(save_stack_trace);
> +
> +void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
> +{
> +       __save_stack_trace_tsk(tsk, trace, false);
> +}
>  EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
>
>


--
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 5/7] sched: Add wrapper for get_wchan() to keep task blocked
  2021-10-08 11:15 ` [PATCH 5/7] sched: Add wrapper for get_wchan() to keep task blocked Peter Zijlstra
  2021-10-08 11:26   ` Geert Uytterhoeven
  2021-10-08 12:45   ` Mark Rutland
@ 2021-10-14 10:46   ` Russell King (Oracle)
  2021-10-15  9:45   ` [tip: sched/core] " tip-bot2 for Kees Cook
  3 siblings, 0 replies; 29+ messages in thread
From: Russell King (Oracle) @ 2021-10-14 10:46 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: keescook, jannh, linux-kernel, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, mark.rutland, axboe, metze, laijs, luto, dave.hansen,
	ebiederm, ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe,
	linux-hardening, linux-arch, vgupta, will, guoren, bcain, monstr,
	tsbogend, nickhu, jonas, mpe, paul.walmsley, hca, ysato, davem,
	chris

On Fri, Oct 08, 2021 at 01:15:32PM +0200, Peter Zijlstra wrote:
> From: Kees Cook <keescook@chromium.org>
> 
> Having a stable wchan means the process must be blocked and for it to
> stay that way while performing stack unwinding.
> 
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm]

I will eventually get around to test this once I've caught up.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT
  2021-10-08 11:15 ` [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT Peter Zijlstra
  2021-10-08 12:40   ` Mark Rutland
@ 2021-10-14 11:07   ` Russell King (Oracle)
  1 sibling, 0 replies; 29+ messages in thread
From: Russell King (Oracle) @ 2021-10-14 11:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: keescook, jannh, linux-kernel, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, mark.rutland, axboe, metze, laijs, luto, dave.hansen,
	ebiederm, ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe,
	linux-hardening, linux-arch, vgupta, will, guoren, bcain, monstr,
	tsbogend, nickhu, jonas, mpe, paul.walmsley, hca, ysato, davem,
	chris

On Fri, Oct 08, 2021 at 01:15:33PM +0200, Peter Zijlstra wrote:
> It's trivial to implement __get_wchan() with CONFIG_STACKTRACE
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

This definitely needs testing, but looking at the code, I think this
will be compatible since we're essentially doing the same tests to
omit scheduling functions in both the old and replacement code.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] wchan: Fix wchan support
  2021-10-08 11:15 [PATCH 0/7] wchan: Fix wchan support Peter Zijlstra
                   ` (6 preceding siblings ...)
  2021-10-08 11:15 ` [PATCH 7/7] arch: Fix STACKTRACE_SUPPORT Peter Zijlstra
@ 2021-10-14 12:02 ` Russell King (Oracle)
  2021-10-14 13:38   ` Russell King (Oracle)
  7 siblings, 1 reply; 29+ messages in thread
From: Russell King (Oracle) @ 2021-10-14 12:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: keescook, jannh, linux-kernel, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, mark.rutland, axboe, metze, laijs, luto, dave.hansen,
	ebiederm, ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe,
	linux-hardening, linux-arch, vgupta, will, guoren, bcain, monstr,
	tsbogend, nickhu, jonas, mpe, paul.walmsley, hca, ysato, davem,
	chris

On Fri, Oct 08, 2021 at 01:15:27PM +0200, Peter Zijlstra wrote:
> Hi,
> 
> This fixes up wchan which is various degrees of broken across the
> architectures.
> 
> Patch 4 fixes wchan for x86, which has been returning 0 for the past many
> releases.
> 
> Patch 5 fixes the fundamental race against scheduling.
> 
> Patch 6 deletes a lot and makes STACKTRACE unconditional
> 
> patch 7 fixes up a few STACKTRACE arch oddities
> 
> 0day says all builds are good, so it must be perfect :-) I'm planning on
> queueing up at least the first 5 patches, but I'm hoping the last two patches
> can be too.
> 
> Also available here:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/wchan

These patches introduce a regression on ARM. Whereas before, I have
/proc/*/wchan populated with non-zero values, with these patches they
_all_ contain "0":

root@clearfog21:~# cat /proc/*/wchan
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000root@clearfog21:~#

I'll try to investigate what is going on later today.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] wchan: Fix wchan support
  2021-10-14 12:02 ` [PATCH 0/7] wchan: Fix wchan support Russell King (Oracle)
@ 2021-10-14 13:38   ` Russell King (Oracle)
  2021-10-14 19:51     ` Josh Poimboeuf
  0 siblings, 1 reply; 29+ messages in thread
From: Russell King (Oracle) @ 2021-10-14 13:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: keescook, jannh, linux-kernel, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, mark.rutland, axboe, metze, laijs, luto, dave.hansen,
	ebiederm, ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe,
	linux-hardening, linux-arch, vgupta, will, guoren, bcain, monstr,
	tsbogend, nickhu, jonas, mpe, paul.walmsley, hca, ysato, davem,
	chris

On Thu, Oct 14, 2021 at 01:02:34PM +0100, Russell King (Oracle) wrote:
> On Fri, Oct 08, 2021 at 01:15:27PM +0200, Peter Zijlstra wrote:
> > Hi,
> > 
> > This fixes up wchan which is various degrees of broken across the
> > architectures.
> > 
> > Patch 4 fixes wchan for x86, which has been returning 0 for the past many
> > releases.
> > 
> > Patch 5 fixes the fundamental race against scheduling.
> > 
> > Patch 6 deletes a lot and makes STACKTRACE unconditional
> > 
> > patch 7 fixes up a few STACKTRACE arch oddities
> > 
> > 0day says all builds are good, so it must be perfect :-) I'm planning on
> > queueing up at least the first 5 patches, but I'm hoping the last two patches
> > can be too.
> > 
> > Also available here:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/wchan
> 
> These patches introduce a regression on ARM. Whereas before, I have
> /proc/*/wchan populated with non-zero values, with these patches they
> _all_ contain "0":
> 
> root@clearfog21:~# cat /proc/*/wchan
> 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000root@clearfog21:~#
> 
> I'll try to investigate what is going on later today.

What is going on here is that the ARM stacktrace code refuses to trace
non-current tasks in a SMP environment due to the racy nature of doing
so if the non-current tasks are running.

When walking the stack with frame pointers, we:

- validate that the frame pointer is between the stack pointer and the
  top of stack defined by that stack pointer.
- we then load the next stack pointer and next frame pointer from the
  stack.

The reason this is unsafe when the task is not blocked is the stack can
change at any moment, which can cause the value read as a stack pointer
to be wildly different. If the read frame pointer value is roughly in
agreement, we can end up reading any part of memory, which would be an
information leak.

The table based unwinding is much more complex being essentially a set
of instructions to the unwinder code about which values to read from
the stack into a set of pseudo-registers, corrections to the stack
pointer, or transfers from the pseudo-registers. I haven't analysed
this code enough to really know the implications of what could be
possible if the values on the stack change while this code is running
on another CPU (it's not my code!) There is an attempt to bounds-limit
the virtual stack pointer after each unwind instruction is processed
to catch the unwinder doing anything silly, so it may be safe in so far
as it will fail should it encounter anything "stupid".

However, get_wchan() is a different case; we know for certain that the
task is blocked, so it won't be running on another CPU, and with your
patch 4, we have this guarantee. However, that is not true of all
callers to the stacktracing code, so I don't see how we can sanely
switch to using the stacktracing code for this.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT
  2021-10-08 13:45     ` Peter Zijlstra
  2021-10-08 16:17       ` Josh Poimboeuf
@ 2021-10-14 18:03       ` Mark Rutland
  2021-10-14 18:48         ` Josh Poimboeuf
  1 sibling, 1 reply; 29+ messages in thread
From: Mark Rutland @ 2021-10-14 18:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: keescook, jannh, linux-kernel, vcaputo, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, akpm, christian.brauner, amistry, Kenta.Tada, legion,
	michael.weiss, mhocko, deller, zhengqi.arch, me, tycho, tglx, bp,
	hpa, axboe, metze, laijs, luto, dave.hansen, ebiederm,
	ohoono.kwon, kaleshsingh, yifeifz2, jpoimboe, linux-hardening,
	linux-arch, vgupta, linux, will, guoren, bcain, monstr, tsbogend,
	nickhu, jonas, mpe, paul.walmsley, hca, ysato, davem, chris

On Fri, Oct 08, 2021 at 03:45:59PM +0200, Peter Zijlstra wrote:
> On Fri, Oct 08, 2021 at 01:40:52PM +0100, Mark Rutland wrote:
> > [Adding Josh, since there might be a concern here from a livepatch pov]
> > 
> 
> > > +static unsigned long __get_wchan(struct task_struct *p)
> > > +{
> > > +	unsigned long entry = 0;
> > > +
> > > +	stack_trace_save_tsk(p, &entry, 1, 0);
> > 
> > This assumes stack_trace_save_tsk() will skip sched functions, but I
> > don't think that's ever been a requirement? It's certinaly not
> > documented anywhere that I could find, and arm64 doesn't do so today,
> > and this patch causes wchan to just log `__switch_to` for everything.
> 
> Confused, arm64 has arch_stack_walk() and should thus use
> kernel/stacktrace.c's stack_trace_consume_entry_nosched.

Looking at this arm64's *current* get_wchan() unwinds once before
checking in_sched_functions(), so it skips __switch_to(). As of this
patch, we check in_sched_functions() first, which stops the unwind
immediately as __switch_to() isn't marked as __sched.

I think x86 gets away with this because switch_to() is asm, and that
tail-calls __switch_to() when returning.

Does switch_to() and below need to be marked __sched?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT
  2021-10-08 16:17       ` Josh Poimboeuf
@ 2021-10-14 18:07         ` Mark Rutland
  2021-10-14 18:41           ` Josh Poimboeuf
  0 siblings, 1 reply; 29+ messages in thread
From: Mark Rutland @ 2021-10-14 18:07 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, keescook, jannh, linux-kernel, vcaputo, mingo,
	juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, akpm, christian.brauner, amistry, Kenta.Tada,
	legion, michael.weiss, mhocko, deller, zhengqi.arch, me, tycho,
	tglx, bp, hpa, axboe, metze, laijs, luto, dave.hansen, ebiederm,
	ohoono.kwon, kaleshsingh, yifeifz2, linux-hardening, linux-arch,
	vgupta, linux, will, guoren, bcain, monstr, tsbogend, nickhu,
	jonas, mpe, paul.walmsley, hca, ysato, davem, chris

On Fri, Oct 08, 2021 at 09:17:07AM -0700, Josh Poimboeuf wrote:
> On Fri, Oct 08, 2021 at 03:45:59PM +0200, Peter Zijlstra wrote:
> > > stack_trace_save_tsk() *shouldn't* skip anything unless we've explicitly
> > > told it to via skipnr, because I'd expect that
> > 
> > It's what most archs happen to do today and is what
> > stack_trace_save_tsk() as implemented using arch_stack_walk() does.
> > Which is I think the closest to canonical we have.

Ah; and arch_stack_walk() itself shouldn't skip anything, which gives
the consistent low-level semantic I wanted.

> It *is* confusing though.  Even if 'nosched' may be the normally
> desired behavior, stack_trace_save_tsk() should probably be named
> stack_trace_save_tsk_nosched().

I agree that'd be less confusing!

Josh, am I right in my understanding that the reliable stacktrace
functions *shouldn't* skip sched functions, or should those similarly
gain a _nosched suffix?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT
  2021-10-14 18:07         ` Mark Rutland
@ 2021-10-14 18:41           ` Josh Poimboeuf
  0 siblings, 0 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2021-10-14 18:41 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Peter Zijlstra, keescook, jannh, linux-kernel, vcaputo, mingo,
	juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, akpm, christian.brauner, amistry, Kenta.Tada,
	legion, michael.weiss, mhocko, deller, zhengqi.arch, me, tycho,
	tglx, bp, hpa, axboe, metze, laijs, luto, dave.hansen, ebiederm,
	ohoono.kwon, kaleshsingh, yifeifz2, linux-hardening, linux-arch,
	vgupta, linux, will, guoren, bcain, monstr, tsbogend, nickhu,
	jonas, mpe, paul.walmsley, hca, ysato, davem, chris

On Thu, Oct 14, 2021 at 07:07:38PM +0100, Mark Rutland wrote:
> On Fri, Oct 08, 2021 at 09:17:07AM -0700, Josh Poimboeuf wrote:
> > On Fri, Oct 08, 2021 at 03:45:59PM +0200, Peter Zijlstra wrote:
> > > > stack_trace_save_tsk() *shouldn't* skip anything unless we've explicitly
> > > > told it to via skipnr, because I'd expect that
> > > 
> > > It's what most archs happen to do today and is what
> > > stack_trace_save_tsk() as implemented using arch_stack_walk() does.
> > > Which is I think the closest to canonical we have.
> 
> Ah; and arch_stack_walk() itself shouldn't skip anything, which gives
> the consistent low-level semantic I wanted.
> 
> > It *is* confusing though.  Even if 'nosched' may be the normally
> > desired behavior, stack_trace_save_tsk() should probably be named
> > stack_trace_save_tsk_nosched().
> 
> I agree that'd be less confusing!
> 
> Josh, am I right in my understanding that the reliable stacktrace
> functions *shouldn't* skip sched functions, or should those similarly
> gain a _nosched suffix?

Correct, the reliable variants need to see the entire call stack and
therefore they shouldn't skip sched functions.

-- 
Josh


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT
  2021-10-14 18:03       ` Mark Rutland
@ 2021-10-14 18:48         ` Josh Poimboeuf
  0 siblings, 0 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2021-10-14 18:48 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Peter Zijlstra, keescook, jannh, linux-kernel, vcaputo, mingo,
	juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, akpm, christian.brauner, amistry, Kenta.Tada,
	legion, michael.weiss, mhocko, deller, zhengqi.arch, me, tycho,
	tglx, bp, hpa, axboe, metze, laijs, luto, dave.hansen, ebiederm,
	ohoono.kwon, kaleshsingh, yifeifz2, linux-hardening, linux-arch,
	vgupta, linux, will, guoren, bcain, monstr, tsbogend, nickhu,
	jonas, mpe, paul.walmsley, hca, ysato, davem, chris

On Thu, Oct 14, 2021 at 07:03:07PM +0100, Mark Rutland wrote:
> On Fri, Oct 08, 2021 at 03:45:59PM +0200, Peter Zijlstra wrote:
> > On Fri, Oct 08, 2021 at 01:40:52PM +0100, Mark Rutland wrote:
> > > [Adding Josh, since there might be a concern here from a livepatch pov]
> > > 
> > 
> > > > +static unsigned long __get_wchan(struct task_struct *p)
> > > > +{
> > > > +	unsigned long entry = 0;
> > > > +
> > > > +	stack_trace_save_tsk(p, &entry, 1, 0);
> > > 
> > > This assumes stack_trace_save_tsk() will skip sched functions, but I
> > > don't think that's ever been a requirement? It's certinaly not
> > > documented anywhere that I could find, and arm64 doesn't do so today,
> > > and this patch causes wchan to just log `__switch_to` for everything.
> > 
> > Confused, arm64 has arch_stack_walk() and should thus use
> > kernel/stacktrace.c's stack_trace_consume_entry_nosched.
> 
> Looking at this arm64's *current* get_wchan() unwinds once before
> checking in_sched_functions(), so it skips __switch_to(). As of this
> patch, we check in_sched_functions() first, which stops the unwind
> immediately as __switch_to() isn't marked as __sched.
> 
> I think x86 gets away with this because switch_to() is asm, and that
> tail-calls __switch_to() when returning.
> 
> Does switch_to() and below need to be marked __sched?

Yes, I would think so, for arches where they otherwise show up on the
stacktrace.

-- 
Josh


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] wchan: Fix wchan support
  2021-10-14 13:38   ` Russell King (Oracle)
@ 2021-10-14 19:51     ` Josh Poimboeuf
  0 siblings, 0 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2021-10-14 19:51 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Peter Zijlstra, keescook, jannh, linux-kernel, vcaputo, mingo,
	juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, akpm, christian.brauner, amistry, Kenta.Tada,
	legion, michael.weiss, mhocko, deller, zhengqi.arch, me, tycho,
	tglx, bp, hpa, mark.rutland, axboe, metze, laijs, luto,
	dave.hansen, ebiederm, ohoono.kwon, kaleshsingh, yifeifz2,
	linux-hardening, linux-arch, vgupta, will, guoren, bcain, monstr,
	tsbogend, nickhu, jonas, mpe, paul.walmsley, hca, ysato, davem,
	chris

On Thu, Oct 14, 2021 at 02:38:19PM +0100, Russell King (Oracle) wrote:
> What is going on here is that the ARM stacktrace code refuses to trace
> non-current tasks in a SMP environment due to the racy nature of doing
> so if the non-current tasks are running.
> 
> When walking the stack with frame pointers, we:
> 
> - validate that the frame pointer is between the stack pointer and the
>   top of stack defined by that stack pointer.
> - we then load the next stack pointer and next frame pointer from the
>   stack.
> 
> The reason this is unsafe when the task is not blocked is the stack can
> change at any moment, which can cause the value read as a stack pointer
> to be wildly different. If the read frame pointer value is roughly in
> agreement, we can end up reading any part of memory, which would be an
> information leak.

It would be a good idea to add some guardrails to prevent that
regardless.  If there's stack corruption for any reason, the unwinder
shouldn't make things worse.

On x86 the unwinder relies on the caller to ensure the task is blocked
(or current).  If the caller doesn't do that, they might get garbage,
and they get to keep the pieces.

But an important part of that is that the unwinder has guardrails to
ensure it handles stack corruption gracefully by never accessing out of
bounds of the stack.

When multiple stacks are involved in a kernel execution path (task, irq,
exception, etc), the stacks link to each other (e.g., last word on the
irq stack might point to the task stack).  Also the irq/exception stack
addresses are stored in percpu variables, and the task stack is in the
task struct.  So the unwinder can easily make sure it's in-bounds.  See
get_stack_info() in arch/x86/kernel/dumpstack_64.c.

-- 
Josh


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [tip: sched/core] sched: Add wrapper for get_wchan() to keep task blocked
  2021-10-08 11:15 ` [PATCH 5/7] sched: Add wrapper for get_wchan() to keep task blocked Peter Zijlstra
                     ` (2 preceding siblings ...)
  2021-10-14 10:46   ` Russell King (Oracle)
@ 2021-10-15  9:45   ` tip-bot2 for Kees Cook
  3 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Kees Cook @ 2021-10-15  9:45 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra, Kees Cook, Geert Uytterhoeven,
	Russell King (Oracle),
	Mark Rutland, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     42a20f86dc19f9282d974df0ba4d226c865ab9dd
Gitweb:        https://git.kernel.org/tip/42a20f86dc19f9282d974df0ba4d226c865ab9dd
Author:        Kees Cook <keescook@chromium.org>
AuthorDate:    Wed, 29 Sep 2021 15:02:14 -07:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 15 Oct 2021 11:25:14 +02:00

sched: Add wrapper for get_wchan() to keep task blocked

Having a stable wchan means the process must be blocked and for it to
stay that way while performing stack unwinding.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm]
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Link: https://lkml.kernel.org/r/20211008111626.332092234@infradead.org
---
 arch/alpha/include/asm/processor.h      |  2 +-
 arch/alpha/kernel/process.c             |  5 ++---
 arch/arc/include/asm/processor.h        |  2 +-
 arch/arc/kernel/stacktrace.c            |  4 ++--
 arch/arm/include/asm/processor.h        |  2 +-
 arch/arm/kernel/process.c               |  4 +---
 arch/arm64/include/asm/processor.h      |  2 +-
 arch/arm64/kernel/process.c             |  4 +---
 arch/csky/include/asm/processor.h       |  2 +-
 arch/csky/kernel/stacktrace.c           |  5 ++---
 arch/h8300/include/asm/processor.h      |  2 +-
 arch/h8300/kernel/process.c             |  5 +----
 arch/hexagon/include/asm/processor.h    |  2 +-
 arch/hexagon/kernel/process.c           |  4 +---
 arch/ia64/include/asm/processor.h       |  2 +-
 arch/ia64/kernel/process.c              |  5 +----
 arch/m68k/include/asm/processor.h       |  2 +-
 arch/m68k/kernel/process.c              |  4 +---
 arch/microblaze/include/asm/processor.h |  2 +-
 arch/microblaze/kernel/process.c        |  2 +-
 arch/mips/include/asm/processor.h       |  2 +-
 arch/mips/kernel/process.c              |  8 +++-----
 arch/nds32/include/asm/processor.h      |  2 +-
 arch/nds32/kernel/process.c             |  7 +------
 arch/nios2/include/asm/processor.h      |  2 +-
 arch/nios2/kernel/process.c             |  5 +----
 arch/openrisc/include/asm/processor.h   |  2 +-
 arch/openrisc/kernel/process.c          |  2 +-
 arch/parisc/include/asm/processor.h     |  2 +-
 arch/parisc/kernel/process.c            |  5 +----
 arch/powerpc/include/asm/processor.h    |  2 +-
 arch/powerpc/kernel/process.c           |  9 +++------
 arch/riscv/include/asm/processor.h      |  2 +-
 arch/riscv/kernel/stacktrace.c          | 12 +++++-------
 arch/s390/include/asm/processor.h       |  2 +-
 arch/s390/kernel/process.c              |  4 ++--
 arch/sh/include/asm/processor_32.h      |  2 +-
 arch/sh/kernel/process_32.c             |  5 +----
 arch/sparc/include/asm/processor_32.h   |  2 +-
 arch/sparc/include/asm/processor_64.h   |  2 +-
 arch/sparc/kernel/process_32.c          |  5 +----
 arch/sparc/kernel/process_64.c          |  5 +----
 arch/um/include/asm/processor-generic.h |  2 +-
 arch/um/kernel/process.c                |  5 +----
 arch/x86/include/asm/processor.h        |  2 +-
 arch/x86/kernel/process.c               |  5 +----
 arch/xtensa/include/asm/processor.h     |  2 +-
 arch/xtensa/kernel/process.c            |  5 +----
 include/linux/sched.h                   |  1 +
 kernel/sched/core.c                     | 19 +++++++++++++++++++
 50 files changed, 80 insertions(+), 112 deletions(-)

diff --git a/arch/alpha/include/asm/processor.h b/arch/alpha/include/asm/processor.h
index 6100431..090499c 100644
--- a/arch/alpha/include/asm/processor.h
+++ b/arch/alpha/include/asm/processor.h
@@ -42,7 +42,7 @@ extern void start_thread(struct pt_regs *, unsigned long, unsigned long);
 struct task_struct;
 extern void release_thread(struct task_struct *);
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define KSTK_EIP(tsk) (task_pt_regs(tsk)->pc)
 
diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c
index a5123ea..5f85270 100644
--- a/arch/alpha/kernel/process.c
+++ b/arch/alpha/kernel/process.c
@@ -376,12 +376,11 @@ thread_saved_pc(struct task_struct *t)
 }
 
 unsigned long
-get_wchan(struct task_struct *p)
+__get_wchan(struct task_struct *p)
 {
 	unsigned long schedule_frame;
 	unsigned long pc;
-	if (!p || p == current || task_is_running(p))
-		return 0;
+
 	/*
 	 * This one depends on the frame size of schedule().  Do a
 	 * "disass schedule" in gdb to find the frame size.  Also, the
diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
index f28afcf..54db9d7 100644
--- a/arch/arc/include/asm/processor.h
+++ b/arch/arc/include/asm/processor.h
@@ -70,7 +70,7 @@ struct task_struct;
 extern void start_thread(struct pt_regs * regs, unsigned long pc,
 			 unsigned long usp);
 
-extern unsigned int get_wchan(struct task_struct *p);
+extern unsigned int __get_wchan(struct task_struct *p);
 
 #endif /* !__ASSEMBLY__ */
 
diff --git a/arch/arc/kernel/stacktrace.c b/arch/arc/kernel/stacktrace.c
index c376ff3..5372dc0 100644
--- a/arch/arc/kernel/stacktrace.c
+++ b/arch/arc/kernel/stacktrace.c
@@ -15,7 +15,7 @@
  *      = specifics of data structs where trace is saved(CONFIG_STACKTRACE etc)
  *
  *  vineetg: March 2009
- *  -Implemented correct versions of thread_saved_pc() and get_wchan()
+ *  -Implemented correct versions of thread_saved_pc() and __get_wchan()
  *
  *  rajeshwarr: 2008
  *  -Initial implementation
@@ -248,7 +248,7 @@ void show_stack(struct task_struct *tsk, unsigned long *sp, const char *loglvl)
  * Of course just returning schedule( ) would be pointless so unwind until
  * the function is not in schedular code
  */
-unsigned int get_wchan(struct task_struct *tsk)
+unsigned int __get_wchan(struct task_struct *tsk)
 {
 	return arc_unwind_core(tsk, NULL, __get_first_nonsched, NULL);
 }
diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index 9e6b972..6af68ed 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -84,7 +84,7 @@ struct task_struct;
 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define task_pt_regs(p) \
 	((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 0e2d305..96f577e 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -276,13 +276,11 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
 	return 0;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	struct stackframe frame;
 	unsigned long stack_page;
 	int count = 0;
-	if (!p || p == current || task_is_running(p))
-		return 0;
 
 	frame.fp = thread_saved_fp(p);
 	frame.sp = thread_saved_sp(p);
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index ee2bdc1..55ca034 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -257,7 +257,7 @@ struct task_struct;
 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 void update_sctlr_el1(u64 sctlr);
 
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 40adb8c..aacf2f5 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -528,13 +528,11 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
 	return last;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	struct stackframe frame;
 	unsigned long stack_page, ret = 0;
 	int count = 0;
-	if (!p || p == current || task_is_running(p))
-		return 0;
 
 	stack_page = (unsigned long)try_get_task_stack(p);
 	if (!stack_page)
diff --git a/arch/csky/include/asm/processor.h b/arch/csky/include/asm/processor.h
index 9e93302..817dd60 100644
--- a/arch/csky/include/asm/processor.h
+++ b/arch/csky/include/asm/processor.h
@@ -81,7 +81,7 @@ static inline void release_thread(struct task_struct *dead_task)
 
 extern int kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define KSTK_EIP(tsk)		(task_pt_regs(tsk)->pc)
 #define KSTK_ESP(tsk)		(task_pt_regs(tsk)->usp)
diff --git a/arch/csky/kernel/stacktrace.c b/arch/csky/kernel/stacktrace.c
index 1b280ef..9f78f5d 100644
--- a/arch/csky/kernel/stacktrace.c
+++ b/arch/csky/kernel/stacktrace.c
@@ -111,12 +111,11 @@ static bool save_wchan(unsigned long pc, void *arg)
 	return false;
 }
 
-unsigned long get_wchan(struct task_struct *task)
+unsigned long __get_wchan(struct task_struct *task)
 {
 	unsigned long pc = 0;
 
-	if (likely(task && task != current && !task_is_running(task)))
-		walk_stackframe(task, NULL, save_wchan, &pc);
+	walk_stackframe(task, NULL, save_wchan, &pc);
 	return pc;
 }
 
diff --git a/arch/h8300/include/asm/processor.h b/arch/h8300/include/asm/processor.h
index a060b41..141a23e 100644
--- a/arch/h8300/include/asm/processor.h
+++ b/arch/h8300/include/asm/processor.h
@@ -105,7 +105,7 @@ static inline void release_thread(struct task_struct *dead_task)
 {
 }
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define	KSTK_EIP(tsk)	\
 	({			 \
diff --git a/arch/h8300/kernel/process.c b/arch/h8300/kernel/process.c
index 2ac27e4..8833fa4 100644
--- a/arch/h8300/kernel/process.c
+++ b/arch/h8300/kernel/process.c
@@ -128,15 +128,12 @@ int copy_thread(unsigned long clone_flags, unsigned long usp,
 	return 0;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long fp, pc;
 	unsigned long stack_page;
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	stack_page = (unsigned long)p;
 	fp = ((struct pt_regs *)p->thread.ksp)->er6;
 	do {
diff --git a/arch/hexagon/include/asm/processor.h b/arch/hexagon/include/asm/processor.h
index 9f0cc99..615f7e4 100644
--- a/arch/hexagon/include/asm/processor.h
+++ b/arch/hexagon/include/asm/processor.h
@@ -64,7 +64,7 @@ struct thread_struct {
 extern void release_thread(struct task_struct *dead_task);
 
 /* Get wait channel for task P.  */
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 /*  The following stuff is pretty HEXAGON specific.  */
 
diff --git a/arch/hexagon/kernel/process.c b/arch/hexagon/kernel/process.c
index 6a6835f..232dfd8 100644
--- a/arch/hexagon/kernel/process.c
+++ b/arch/hexagon/kernel/process.c
@@ -130,13 +130,11 @@ void flush_thread(void)
  * is an identification of the point at which the scheduler
  * was invoked by a blocked thread.
  */
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long fp, pc;
 	unsigned long stack_page;
 	int count = 0;
-	if (!p || p == current || task_is_running(p))
-		return 0;
 
 	stack_page = (unsigned long)task_stack_page(p);
 	fp = ((struct hexagon_switch_stack *)p->thread.switch_sp)->fp;
diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
index 2d8bcdc..45365c2 100644
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -330,7 +330,7 @@ struct task_struct;
 #define release_thread(dead_task)
 
 /* Get wait channel for task P.  */
-extern unsigned long get_wchan (struct task_struct *p);
+extern unsigned long __get_wchan (struct task_struct *p);
 
 /* Return instruction pointer of blocked task TSK.  */
 #define KSTK_EIP(tsk)					\
diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c
index e56d63f..834df24 100644
--- a/arch/ia64/kernel/process.c
+++ b/arch/ia64/kernel/process.c
@@ -523,15 +523,12 @@ exit_thread (struct task_struct *tsk)
 }
 
 unsigned long
-get_wchan (struct task_struct *p)
+__get_wchan (struct task_struct *p)
 {
 	struct unw_frame_info info;
 	unsigned long ip;
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	/*
 	 * Note: p may not be a blocked task (it could be current or
 	 * another process running on some other CPU.  Rather than
diff --git a/arch/m68k/include/asm/processor.h b/arch/m68k/include/asm/processor.h
index f4d82c6..ffeda9a 100644
--- a/arch/m68k/include/asm/processor.h
+++ b/arch/m68k/include/asm/processor.h
@@ -150,7 +150,7 @@ static inline void release_thread(struct task_struct *dead_task)
 {
 }
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define	KSTK_EIP(tsk)	\
     ({			\
diff --git a/arch/m68k/kernel/process.c b/arch/m68k/kernel/process.c
index 1ab692b..a6030db 100644
--- a/arch/m68k/kernel/process.c
+++ b/arch/m68k/kernel/process.c
@@ -263,13 +263,11 @@ int dump_fpu (struct pt_regs *regs, struct user_m68kfp_struct *fpu)
 }
 EXPORT_SYMBOL(dump_fpu);
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long fp, pc;
 	unsigned long stack_page;
 	int count = 0;
-	if (!p || p == current || task_is_running(p))
-		return 0;
 
 	stack_page = (unsigned long)task_stack_page(p);
 	fp = ((struct switch_stack *)p->thread.ksp)->a6;
diff --git a/arch/microblaze/include/asm/processor.h b/arch/microblaze/include/asm/processor.h
index 06c6e49..7e9e926 100644
--- a/arch/microblaze/include/asm/processor.h
+++ b/arch/microblaze/include/asm/processor.h
@@ -68,7 +68,7 @@ static inline void release_thread(struct task_struct *dead_task)
 {
 }
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 /* The size allocated for kernel stacks. This _must_ be a power of two! */
 # define KERNEL_STACK_SIZE	0x2000
diff --git a/arch/microblaze/kernel/process.c b/arch/microblaze/kernel/process.c
index 62aa237..5e2b91c 100644
--- a/arch/microblaze/kernel/process.c
+++ b/arch/microblaze/kernel/process.c
@@ -112,7 +112,7 @@ int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
 	return 0;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 /* TBD (used by procfs) */
 	return 0;
diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index 0c3550c..252ed38 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -369,7 +369,7 @@ static inline void flush_thread(void)
 {
 }
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define __KSTK_TOS(tsk) ((unsigned long)task_stack_page(tsk) + \
 			 THREAD_SIZE - 32 - sizeof(struct pt_regs))
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 95aa86f..cbff1b9 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -511,7 +511,7 @@ static int __init frame_info_init(void)
 
 	/*
 	 * Without schedule() frame info, result given by
-	 * thread_saved_pc() and get_wchan() are not reliable.
+	 * thread_saved_pc() and __get_wchan() are not reliable.
 	 */
 	if (schedule_mfi.pc_offset < 0)
 		printk("Can't analyze schedule() prologue at %p\n", schedule);
@@ -652,9 +652,9 @@ unsigned long unwind_stack(struct task_struct *task, unsigned long *sp,
 #endif
 
 /*
- * get_wchan - a maintenance nightmare^W^Wpain in the ass ...
+ * __get_wchan - a maintenance nightmare^W^Wpain in the ass ...
  */
-unsigned long get_wchan(struct task_struct *task)
+unsigned long __get_wchan(struct task_struct *task)
 {
 	unsigned long pc = 0;
 #ifdef CONFIG_KALLSYMS
@@ -662,8 +662,6 @@ unsigned long get_wchan(struct task_struct *task)
 	unsigned long ra = 0;
 #endif
 
-	if (!task || task == current || task_is_running(task))
-		goto out;
 	if (!task_stack_page(task))
 		goto out;
 
diff --git a/arch/nds32/include/asm/processor.h b/arch/nds32/include/asm/processor.h
index b82369c..e6bfc74 100644
--- a/arch/nds32/include/asm/processor.h
+++ b/arch/nds32/include/asm/processor.h
@@ -83,7 +83,7 @@ extern struct task_struct *last_task_used_math;
 /* Prepare to copy thread state - unlazy all lazy status */
 #define prepare_to_copy(tsk)	do { } while (0)
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define cpu_relax()			barrier()
 
diff --git a/arch/nds32/kernel/process.c b/arch/nds32/kernel/process.c
index 391895b..49fab9e 100644
--- a/arch/nds32/kernel/process.c
+++ b/arch/nds32/kernel/process.c
@@ -233,15 +233,12 @@ int dump_fpu(struct pt_regs *regs, elf_fpregset_t * fpu)
 
 EXPORT_SYMBOL(dump_fpu);
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long fp, lr;
 	unsigned long stack_start, stack_end;
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	if (IS_ENABLED(CONFIG_FRAME_POINTER)) {
 		stack_start = (unsigned long)end_of_stack(p);
 		stack_end = (unsigned long)task_stack_page(p) + THREAD_SIZE;
@@ -258,5 +255,3 @@ unsigned long get_wchan(struct task_struct *p)
 	}
 	return 0;
 }
-
-EXPORT_SYMBOL(get_wchan);
diff --git a/arch/nios2/include/asm/processor.h b/arch/nios2/include/asm/processor.h
index 94bcb86..b8125df 100644
--- a/arch/nios2/include/asm/processor.h
+++ b/arch/nios2/include/asm/processor.h
@@ -69,7 +69,7 @@ static inline void release_thread(struct task_struct *dead_task)
 {
 }
 
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 #define task_pt_regs(p) \
 	((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1)
diff --git a/arch/nios2/kernel/process.c b/arch/nios2/kernel/process.c
index 9ff37ba..f8ea522 100644
--- a/arch/nios2/kernel/process.c
+++ b/arch/nios2/kernel/process.c
@@ -217,15 +217,12 @@ void dump(struct pt_regs *fp)
 	pr_emerg("\n\n");
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long fp, pc;
 	unsigned long stack_page;
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	stack_page = (unsigned long)p;
 	fp = ((struct switch_stack *)p->thread.ksp)->fp;	/* ;dgt2 */
 	do {
diff --git a/arch/openrisc/include/asm/processor.h b/arch/openrisc/include/asm/processor.h
index ad53b31..aa1699c 100644
--- a/arch/openrisc/include/asm/processor.h
+++ b/arch/openrisc/include/asm/processor.h
@@ -73,7 +73,7 @@ struct thread_struct {
 
 void start_thread(struct pt_regs *regs, unsigned long nip, unsigned long sp);
 void release_thread(struct task_struct *);
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define cpu_relax()     barrier()
 
diff --git a/arch/openrisc/kernel/process.c b/arch/openrisc/kernel/process.c
index b0698d9..3c0c91b 100644
--- a/arch/openrisc/kernel/process.c
+++ b/arch/openrisc/kernel/process.c
@@ -263,7 +263,7 @@ void dump_elf_thread(elf_greg_t *dest, struct pt_regs* regs)
 	dest[35] = 0;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	/* TODO */
 
diff --git a/arch/parisc/include/asm/processor.h b/arch/parisc/include/asm/processor.h
index eeb7da0..85a2dbf 100644
--- a/arch/parisc/include/asm/processor.h
+++ b/arch/parisc/include/asm/processor.h
@@ -273,7 +273,7 @@ struct mm_struct;
 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
 
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 #define KSTK_EIP(tsk)	((tsk)->thread.regs.iaoq[0])
 #define KSTK_ESP(tsk)	((tsk)->thread.regs.gr[30])
diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c
index 38ec4ae..4f562de 100644
--- a/arch/parisc/kernel/process.c
+++ b/arch/parisc/kernel/process.c
@@ -240,15 +240,12 @@ copy_thread(unsigned long clone_flags, unsigned long usp,
 }
 
 unsigned long
-get_wchan(struct task_struct *p)
+__get_wchan(struct task_struct *p)
 {
 	struct unwind_frame_info info;
 	unsigned long ip;
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	/*
 	 * These bracket the sleeping functions..
 	 */
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index f348e56..e39bd0f 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -300,7 +300,7 @@ struct thread_struct {
 
 #define task_pt_regs(tsk)	((tsk)->thread.regs)
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 #define KSTK_EIP(tsk)  ((tsk)->thread.regs? (tsk)->thread.regs->nip: 0)
 #define KSTK_ESP(tsk)  ((tsk)->thread.regs? (tsk)->thread.regs->gpr[1]: 0)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 50436b5..406d7ee 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -2111,14 +2111,11 @@ int validate_sp(unsigned long sp, struct task_struct *p,
 
 EXPORT_SYMBOL(validate_sp);
 
-static unsigned long __get_wchan(struct task_struct *p)
+static unsigned long ___get_wchan(struct task_struct *p)
 {
 	unsigned long ip, sp;
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	sp = p->thread.ksp;
 	if (!validate_sp(sp, p, STACK_FRAME_OVERHEAD))
 		return 0;
@@ -2137,14 +2134,14 @@ static unsigned long __get_wchan(struct task_struct *p)
 	return 0;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long ret;
 
 	if (!try_get_task_stack(p))
 		return 0;
 
-	ret = __get_wchan(p);
+	ret = ___get_wchan(p);
 
 	put_task_stack(p);
 
diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 46b492c..0749924 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -66,7 +66,7 @@ static inline void release_thread(struct task_struct *dead_task)
 {
 }
 
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 
 static inline void wait_for_interrupt(void)
diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
index 315db3d..0fcdc02 100644
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -128,16 +128,14 @@ static bool save_wchan(void *arg, unsigned long pc)
 	return true;
 }
 
-unsigned long get_wchan(struct task_struct *task)
+unsigned long __get_wchan(struct task_struct *task)
 {
 	unsigned long pc = 0;
 
-	if (likely(task && task != current && !task_is_running(task))) {
-		if (!try_get_task_stack(task))
-			return 0;
-		walk_stackframe(task, NULL, save_wchan, &pc);
-		put_task_stack(task);
-	}
+	if (!try_get_task_stack(task))
+		return 0;
+	walk_stackframe(task, NULL, save_wchan, &pc);
+	put_task_stack(task);
 	return pc;
 }
 
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index 879b8e3..f54c152 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -192,7 +192,7 @@ static inline void release_thread(struct task_struct *tsk) { }
 void guarded_storage_release(struct task_struct *tsk);
 void gs_load_bc_cb(struct pt_regs *regs);
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 #define task_pt_regs(tsk) ((struct pt_regs *) \
         (task_stack_page(tsk) + THREAD_SIZE) - 1)
 #define KSTK_EIP(tsk)	(task_pt_regs(tsk)->psw.addr)
diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c
index 350e94d..e5dd46b 100644
--- a/arch/s390/kernel/process.c
+++ b/arch/s390/kernel/process.c
@@ -181,12 +181,12 @@ void execve_tail(void)
 	asm volatile("sfpc %0" : : "d" (0));
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	struct unwind_state state;
 	unsigned long ip = 0;
 
-	if (!p || p == current || task_is_running(p) || !task_stack_page(p))
+	if (!task_stack_page(p))
 		return 0;
 
 	if (!try_get_task_stack(p))
diff --git a/arch/sh/include/asm/processor_32.h b/arch/sh/include/asm/processor_32.h
index aa92cc9..45240ec 100644
--- a/arch/sh/include/asm/processor_32.h
+++ b/arch/sh/include/asm/processor_32.h
@@ -180,7 +180,7 @@ static inline void show_code(struct pt_regs *regs)
 }
 #endif
 
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 #define KSTK_EIP(tsk)  (task_pt_regs(tsk)->pc)
 #define KSTK_ESP(tsk)  (task_pt_regs(tsk)->regs[15])
diff --git a/arch/sh/kernel/process_32.c b/arch/sh/kernel/process_32.c
index 717de05..1c28e3c 100644
--- a/arch/sh/kernel/process_32.c
+++ b/arch/sh/kernel/process_32.c
@@ -182,13 +182,10 @@ __switch_to(struct task_struct *prev, struct task_struct *next)
 	return prev;
 }
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long pc;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	/*
 	 * The same comment as on the Alpha applies here, too ...
 	 */
diff --git a/arch/sparc/include/asm/processor_32.h b/arch/sparc/include/asm/processor_32.h
index b6242f7..647bf0a 100644
--- a/arch/sparc/include/asm/processor_32.h
+++ b/arch/sparc/include/asm/processor_32.h
@@ -89,7 +89,7 @@ static inline void start_thread(struct pt_regs * regs, unsigned long pc,
 /* Free all resources held by a thread. */
 #define release_thread(tsk)		do { } while(0)
 
-unsigned long get_wchan(struct task_struct *);
+unsigned long __get_wchan(struct task_struct *);
 
 #define task_pt_regs(tsk) ((tsk)->thread.kregs)
 #define KSTK_EIP(tsk)  ((tsk)->thread.kregs->pc)
diff --git a/arch/sparc/include/asm/processor_64.h b/arch/sparc/include/asm/processor_64.h
index 5cf145f..ae851e8 100644
--- a/arch/sparc/include/asm/processor_64.h
+++ b/arch/sparc/include/asm/processor_64.h
@@ -183,7 +183,7 @@ do { \
 /* Free all resources held by a thread. */
 #define release_thread(tsk)		do { } while (0)
 
-unsigned long get_wchan(struct task_struct *task);
+unsigned long __get_wchan(struct task_struct *task);
 
 #define task_pt_regs(tsk) (task_thread_info(tsk)->kregs)
 #define KSTK_EIP(tsk)  (task_pt_regs(tsk)->tpc)
diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c
index bbbe0cf..2dc0bf9 100644
--- a/arch/sparc/kernel/process_32.c
+++ b/arch/sparc/kernel/process_32.c
@@ -365,7 +365,7 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg,
 	return 0;
 }
 
-unsigned long get_wchan(struct task_struct *task)
+unsigned long __get_wchan(struct task_struct *task)
 {
 	unsigned long pc, fp, bias = 0;
 	unsigned long task_base = (unsigned long) task;
@@ -373,9 +373,6 @@ unsigned long get_wchan(struct task_struct *task)
 	struct reg_window32 *rw;
 	int count = 0;
 
-	if (!task || task == current || task_is_running(task))
-		goto out;
-
 	fp = task_thread_info(task)->ksp + bias;
 	do {
 		/* Bogus frame pointer? */
diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c
index d1cc410..f5b2cac 100644
--- a/arch/sparc/kernel/process_64.c
+++ b/arch/sparc/kernel/process_64.c
@@ -663,7 +663,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	return 0;
 }
 
-unsigned long get_wchan(struct task_struct *task)
+unsigned long __get_wchan(struct task_struct *task)
 {
 	unsigned long pc, fp, bias = 0;
 	struct thread_info *tp;
@@ -671,9 +671,6 @@ unsigned long get_wchan(struct task_struct *task)
         unsigned long ret = 0;
 	int count = 0; 
 
-	if (!task || task == current || task_is_running(task))
-		goto out;
-
 	tp = task_thread_info(task);
 	bias = STACK_BIAS;
 	fp = task_thread_info(task)->ksp + bias;
diff --git a/arch/um/include/asm/processor-generic.h b/arch/um/include/asm/processor-generic.h
index b5cf0ed..579692a 100644
--- a/arch/um/include/asm/processor-generic.h
+++ b/arch/um/include/asm/processor-generic.h
@@ -106,6 +106,6 @@ extern struct cpuinfo_um boot_cpu_data;
 #define cache_line_size()	(boot_cpu_data.cache_alignment)
 
 #define KSTK_REG(tsk, reg) get_thread_reg(reg, &tsk->thread.switch_buf)
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 #endif
diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c
index 457a38d..8210737 100644
--- a/arch/um/kernel/process.c
+++ b/arch/um/kernel/process.c
@@ -364,14 +364,11 @@ unsigned long arch_align_stack(unsigned long sp)
 }
 #endif
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long stack_page, sp, ip;
 	bool seen_sched = 0;
 
-	if ((p == NULL) || (p == current) || task_is_running(p))
-		return 0;
-
 	stack_page = (unsigned long) task_stack_page(p);
 	/* Bail if the process has no kernel stack for some reason */
 	if (stack_page == 0)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 9ad2aca..e81177e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -589,7 +589,7 @@ static inline void load_sp0(unsigned long sp0)
 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
 
-unsigned long get_wchan(struct task_struct *p);
+unsigned long __get_wchan(struct task_struct *p);
 
 /*
  * Generic CPUID function
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index e645925..a69885a 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -942,13 +942,10 @@ unsigned long arch_randomize_brk(struct mm_struct *mm)
  * because the task might wake up and we might look at a stack
  * changing under us.
  */
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long entry = 0;
 
-	if (p == current || task_is_running(p))
-		return 0;
-
 	stack_trace_save_tsk(p, &entry, 1, 0);
 	return entry;
 }
diff --git a/arch/xtensa/include/asm/processor.h b/arch/xtensa/include/asm/processor.h
index 7f63aca..ad15fbc 100644
--- a/arch/xtensa/include/asm/processor.h
+++ b/arch/xtensa/include/asm/processor.h
@@ -215,7 +215,7 @@ struct mm_struct;
 /* Free all resources held by a thread. */
 #define release_thread(thread) do { } while(0)
 
-extern unsigned long get_wchan(struct task_struct *p);
+extern unsigned long __get_wchan(struct task_struct *p);
 
 #define KSTK_EIP(tsk)		(task_pt_regs(tsk)->pc)
 #define KSTK_ESP(tsk)		(task_pt_regs(tsk)->areg[1])
diff --git a/arch/xtensa/kernel/process.c b/arch/xtensa/kernel/process.c
index 0601653..47f933f 100644
--- a/arch/xtensa/kernel/process.c
+++ b/arch/xtensa/kernel/process.c
@@ -298,15 +298,12 @@ int copy_thread(unsigned long clone_flags, unsigned long usp_thread_fn,
  * These bracket the sleeping functions..
  */
 
-unsigned long get_wchan(struct task_struct *p)
+unsigned long __get_wchan(struct task_struct *p)
 {
 	unsigned long sp, pc;
 	unsigned long stack_page = (unsigned long) task_stack_page(p);
 	int count = 0;
 
-	if (!p || p == current || task_is_running(p))
-		return 0;
-
 	sp = p->thread.sp;
 	pc = MAKE_PC_FROM_RA(p->thread.ra, p->thread.sp);
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 343603f..5f8db54 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2139,6 +2139,7 @@ static inline void set_task_cpu(struct task_struct *p, unsigned int cpu)
 #endif /* CONFIG_SMP */
 
 extern bool sched_task_on_rq(struct task_struct *p);
+extern unsigned long get_wchan(struct task_struct *p);
 
 /*
  * In order to reduce various lock holder preemption latencies provide an
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 935c2da..f2611b9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1966,6 +1966,25 @@ bool sched_task_on_rq(struct task_struct *p)
 	return task_on_rq_queued(p);
 }
 
+unsigned long get_wchan(struct task_struct *p)
+{
+	unsigned long ip = 0;
+	unsigned int state;
+
+	if (!p || p == current)
+		return 0;
+
+	/* Only get wchan if task is blocked and we can keep it that way. */
+	raw_spin_lock_irq(&p->pi_lock);
+	state = READ_ONCE(p->__state);
+	smp_rmb(); /* see try_to_wake_up() */
+	if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq)
+		ip = __get_wchan(p);
+	raw_spin_unlock_irq(&p->pi_lock);
+
+	return ip;
+}
+
 static inline void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
 {
 	if (!(flags & ENQUEUE_NOCLOCK))

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [tip: sched/core] x86: Fix get_wchan() to support the ORC unwinder
  2021-10-08 11:15 ` [PATCH 4/7] x86: Fix get_wchan() to support the ORC unwinder Peter Zijlstra
@ 2021-10-15  9:45   ` tip-bot2 for Qi Zheng
  0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Qi Zheng @ 2021-10-15  9:45 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Qi Zheng, Kees Cook, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     bc9bbb81730ea667c31c5b284f95ee312bab466f
Gitweb:        https://git.kernel.org/tip/bc9bbb81730ea667c31c5b284f95ee312bab466f
Author:        Qi Zheng <zhengqi.arch@bytedance.com>
AuthorDate:    Wed, 29 Sep 2021 15:02:17 -07:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 15 Oct 2021 11:25:14 +02:00

x86: Fix get_wchan() to support the ORC unwinder

Currently, the kernel CONFIG_UNWINDER_ORC option is enabled by default
on x86, but the implementation of get_wchan() is still based on the frame
pointer unwinder, so the /proc/<pid>/wchan usually returned 0 regardless
of whether the task <pid> is running.

Reimplement get_wchan() by calling stack_trace_save_tsk(), which is
adapted to the ORC and frame pointer unwinders.

Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder")
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211008111626.271115116@infradead.org
---
 arch/x86/kernel/process.c | 51 ++------------------------------------
 1 file changed, 3 insertions(+), 48 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 1d9463e..e645925 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -944,58 +944,13 @@ unsigned long arch_randomize_brk(struct mm_struct *mm)
  */
 unsigned long get_wchan(struct task_struct *p)
 {
-	unsigned long start, bottom, top, sp, fp, ip, ret = 0;
-	int count = 0;
+	unsigned long entry = 0;
 
 	if (p == current || task_is_running(p))
 		return 0;
 
-	if (!try_get_task_stack(p))
-		return 0;
-
-	start = (unsigned long)task_stack_page(p);
-	if (!start)
-		goto out;
-
-	/*
-	 * Layout of the stack page:
-	 *
-	 * ----------- topmax = start + THREAD_SIZE - sizeof(unsigned long)
-	 * PADDING
-	 * ----------- top = topmax - TOP_OF_KERNEL_STACK_PADDING
-	 * stack
-	 * ----------- bottom = start
-	 *
-	 * The tasks stack pointer points at the location where the
-	 * framepointer is stored. The data on the stack is:
-	 * ... IP FP ... IP FP
-	 *
-	 * We need to read FP and IP, so we need to adjust the upper
-	 * bound by another unsigned long.
-	 */
-	top = start + THREAD_SIZE - TOP_OF_KERNEL_STACK_PADDING;
-	top -= 2 * sizeof(unsigned long);
-	bottom = start;
-
-	sp = READ_ONCE(p->thread.sp);
-	if (sp < bottom || sp > top)
-		goto out;
-
-	fp = READ_ONCE_NOCHECK(((struct inactive_task_frame *)sp)->bp);
-	do {
-		if (fp < bottom || fp > top)
-			goto out;
-		ip = READ_ONCE_NOCHECK(*(unsigned long *)(fp + sizeof(unsigned long)));
-		if (!in_sched_functions(ip)) {
-			ret = ip;
-			goto out;
-		}
-		fp = READ_ONCE_NOCHECK(*(unsigned long *)fp);
-	} while (count++ < 16 && !task_is_running(p));
-
-out:
-	put_task_stack(p);
-	return ret;
+	stack_trace_save_tsk(p, &entry, 1, 0);
+	return entry;
 }
 
 long do_arch_prctl_common(struct task_struct *task, int option,

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [tip: sched/core] proc: Use task_is_running() for wchan in /proc/$pid/stat
  2021-10-08 11:15 ` [PATCH 3/7] proc: Use task_is_running() for wchan in /proc/$pid/stat Peter Zijlstra
@ 2021-10-15  9:45   ` tip-bot2 for Kees Cook
  0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Kees Cook @ 2021-10-15  9:45 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Kees Cook, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     4e046156792c26bef8a4e30be711777fc8578257
Gitweb:        https://git.kernel.org/tip/4e046156792c26bef8a4e30be711777fc8578257
Author:        Kees Cook <keescook@chromium.org>
AuthorDate:    Wed, 29 Sep 2021 15:02:15 -07:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 15 Oct 2021 11:25:13 +02:00

proc: Use task_is_running() for wchan in /proc/$pid/stat

The implementations of get_wchan() can be expensive. The only information
imparted here is whether or not a process is currently blocked in the
scheduler (and even this doesn't need to be exact). Avoid doing the
heavy lifting of stack walking and just report that information by using
task_is_running().

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211008111626.211281780@infradead.org
---
 fs/proc/array.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 49be8c8..77cf418 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -541,7 +541,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 	}
 
 	if (permitted && (!whole || num_threads < 2))
-		wchan = get_wchan(task);
+		wchan = !task_is_running(task);
 	if (!whole) {
 		min_flt = task->min_flt;
 		maj_flt = task->maj_flt;
@@ -606,10 +606,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 	 *
 	 * This works with older implementations of procps as well.
 	 */
-	if (wchan)
-		seq_puts(m, " 1");
-	else
-		seq_puts(m, " 0");
+	seq_put_decimal_ull(m, " ", wchan);
 
 	seq_put_decimal_ull(m, " ", 0);
 	seq_put_decimal_ull(m, " ", 0);

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [tip: sched/core] leaking_addresses: Always print a trailing newline
  2021-10-08 11:15 ` [PATCH 2/7] leaking_addresses: Always print a trailing newline Peter Zijlstra
@ 2021-10-15  9:45   ` tip-bot2 for Kees Cook
  0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Kees Cook @ 2021-10-15  9:45 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Kees Cook, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     cf2a85efdade117e2169d6e26641016cbbf03ef0
Gitweb:        https://git.kernel.org/tip/cf2a85efdade117e2169d6e26641016cbbf03ef0
Author:        Kees Cook <keescook@chromium.org>
AuthorDate:    Wed, 29 Sep 2021 15:02:18 -07:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 15 Oct 2021 11:25:13 +02:00

leaking_addresses: Always print a trailing newline

For files that lack trailing newlines and match a leaking address (e.g.
wchan[1]), the leaking_addresses.pl report would run together with the
next line, making things look corrupted.

Unconditionally remove the newline on input, and write it back out on
output.

[1] https://lore.kernel.org/all/20210103142726.GC30643@xsang-OptiPlex-9020/

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211008111626.151570317@infradead.org
---
 scripts/leaking_addresses.pl | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/leaking_addresses.pl b/scripts/leaking_addresses.pl
index b2d8b8a..8f636a2 100755
--- a/scripts/leaking_addresses.pl
+++ b/scripts/leaking_addresses.pl
@@ -455,8 +455,9 @@ sub parse_file
 
 	open my $fh, "<", $file or return;
 	while ( <$fh> ) {
+		chomp;
 		if (may_leak_address($_)) {
-			print $file . ': ' . $_;
+			printf("$file: $_\n");
 		}
 	}
 	close $fh;

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [tip: sched/core] Revert "proc/wchan: use printk format instead of lookup_symbol_name()"
  2021-10-08 11:15 ` [PATCH 1/7] Revert "proc/wchan: use printk format instead of lookup_symbol_name()" Peter Zijlstra
@ 2021-10-15  9:45   ` tip-bot2 for Kees Cook
  0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Kees Cook @ 2021-10-15  9:45 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: kernel test robot, Vito Caputo, Jann Horn, Kees Cook,
	Peter Zijlstra (Intel),
	stable, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     54354c6a9f7fd5572d2b9ec108117c4f376d4d23
Gitweb:        https://git.kernel.org/tip/54354c6a9f7fd5572d2b9ec108117c4f376d4d23
Author:        Kees Cook <keescook@chromium.org>
AuthorDate:    Wed, 29 Sep 2021 15:02:13 -07:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 15 Oct 2021 11:25:13 +02:00

Revert "proc/wchan: use printk format instead of lookup_symbol_name()"

This reverts commit 152c432b128cb043fc107e8f211195fe94b2159c.

When a kernel address couldn't be symbolized for /proc/$pid/wchan, it
would leak the raw value, a potential information exposure. This is a
regression compared to the safer pre-v5.12 behavior.

Reported-by: kernel test robot <oliver.sang@intel.com>
Reported-by: Vito Caputo <vcaputo@pengaru.com>
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20211008111626.090829198@infradead.org
---
 fs/proc/base.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 533d583..1f39409 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -67,6 +67,7 @@
 #include <linux/mm.h>
 #include <linux/swap.h>
 #include <linux/rcupdate.h>
+#include <linux/kallsyms.h>
 #include <linux/stacktrace.h>
 #include <linux/resource.h>
 #include <linux/module.h>
@@ -386,17 +387,19 @@ static int proc_pid_wchan(struct seq_file *m, struct pid_namespace *ns,
 			  struct pid *pid, struct task_struct *task)
 {
 	unsigned long wchan;
+	char symname[KSYM_NAME_LEN];
 
-	if (ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS))
-		wchan = get_wchan(task);
-	else
-		wchan = 0;
+	if (!ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS))
+		goto print0;
 
-	if (wchan)
-		seq_printf(m, "%ps", (void *) wchan);
-	else
-		seq_putc(m, '0');
+	wchan = get_wchan(task);
+	if (wchan && !lookup_symbol_name(wchan, symname)) {
+		seq_puts(m, symname);
+		return 0;
+	}
 
+print0:
+	seq_putc(m, '0');
 	return 0;
 }
 #endif /* CONFIG_KALLSYMS */

^ permalink raw reply related	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2021-10-15  9:45 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-08 11:15 [PATCH 0/7] wchan: Fix wchan support Peter Zijlstra
2021-10-08 11:15 ` [PATCH 1/7] Revert "proc/wchan: use printk format instead of lookup_symbol_name()" Peter Zijlstra
2021-10-15  9:45   ` [tip: sched/core] " tip-bot2 for Kees Cook
2021-10-08 11:15 ` [PATCH 2/7] leaking_addresses: Always print a trailing newline Peter Zijlstra
2021-10-15  9:45   ` [tip: sched/core] " tip-bot2 for Kees Cook
2021-10-08 11:15 ` [PATCH 3/7] proc: Use task_is_running() for wchan in /proc/$pid/stat Peter Zijlstra
2021-10-15  9:45   ` [tip: sched/core] " tip-bot2 for Kees Cook
2021-10-08 11:15 ` [PATCH 4/7] x86: Fix get_wchan() to support the ORC unwinder Peter Zijlstra
2021-10-15  9:45   ` [tip: sched/core] " tip-bot2 for Qi Zheng
2021-10-08 11:15 ` [PATCH 5/7] sched: Add wrapper for get_wchan() to keep task blocked Peter Zijlstra
2021-10-08 11:26   ` Geert Uytterhoeven
2021-10-08 12:45   ` Mark Rutland
2021-10-14 10:46   ` Russell King (Oracle)
2021-10-15  9:45   ` [tip: sched/core] " tip-bot2 for Kees Cook
2021-10-08 11:15 ` [PATCH 6/7] arch: __get_wchan || STACKTRACE_SUPPORT Peter Zijlstra
2021-10-08 12:40   ` Mark Rutland
2021-10-08 13:45     ` Peter Zijlstra
2021-10-08 16:17       ` Josh Poimboeuf
2021-10-14 18:07         ` Mark Rutland
2021-10-14 18:41           ` Josh Poimboeuf
2021-10-14 18:03       ` Mark Rutland
2021-10-14 18:48         ` Josh Poimboeuf
2021-10-14 11:07   ` Russell King (Oracle)
2021-10-08 11:15 ` [PATCH 7/7] arch: Fix STACKTRACE_SUPPORT Peter Zijlstra
2021-10-08 12:52   ` Mark Rutland
2021-10-09  9:36   ` Guo Ren
2021-10-14 12:02 ` [PATCH 0/7] wchan: Fix wchan support Russell King (Oracle)
2021-10-14 13:38   ` Russell King (Oracle)
2021-10-14 19:51     ` Josh Poimboeuf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.