* [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in
@ 2019-12-02 8:35 zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 1/7] sched/core: Allow putting thread_info into task_struct zhangyi (F)
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: zhangyi (F) @ 2019-12-02 8:35 UTC (permalink / raw)
To: gregkh
Cc: linux-kernel, linux-fsdevel, luto, adobriyan, akpm, vbabka,
peterz, bigeasy, mhocko, john.ogness, yi.zhang, nixiaoming
Reporting eip and esp fields on a non-current task is dangerous,
so backport this series for 4.4 to fix protential oops and info leak
problems. The first 3 patch are depended on the 6/7 patch.
Alexey Dobriyan (1):
proc: fix coredump vs read /proc/*/stat race
Andy Lutomirski (3):
sched/core: Allow putting thread_info into task_struct
sched/core: Add try_get_task_stack() and put_task_stack()
fs/proc: Stop reporting eip and esp in /proc/PID/stat
Heiko Carstens (1):
sched/core, x86: Make struct thread_info arch specific again
John Ogness (2):
fs/proc: Report eip/esp in /prod/PID/stat for coredumping
fs/proc/array.c: allow reporting eip/esp for all coredumping threads
fs/proc/array.c | 18 ++++++++++---
include/linux/init_task.h | 9 +++++++
include/linux/sched.h | 52 +++++++++++++++++++++++++++++++++++--
include/linux/thread_info.h | 4 +++
init/Kconfig | 10 +++++++
init/init_task.c | 7 +++--
kernel/sched/sched.h | 4 +++
7 files changed, 97 insertions(+), 7 deletions(-)
--
2.17.2
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 4.4 1/7] sched/core: Allow putting thread_info into task_struct
2019-12-02 8:35 [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in zhangyi (F)
@ 2019-12-02 8:35 ` zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 2/7] sched/core: Add try_get_task_stack() and put_task_stack() zhangyi (F)
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: zhangyi (F) @ 2019-12-02 8:35 UTC (permalink / raw)
To: gregkh
Cc: linux-kernel, linux-fsdevel, luto, adobriyan, akpm, vbabka,
peterz, bigeasy, mhocko, john.ogness, yi.zhang, nixiaoming
From: Andy Lutomirski <luto@kernel.org>
commit c65eacbe290b8141554c71b2c94489e73ade8c8d upstream.
If an arch opts in by setting CONFIG_THREAD_INFO_IN_TASK_STRUCT,
then thread_info is defined as a single 'u32 flags' and is the first
entry of task_struct. thread_info::task is removed (it serves no
purpose if thread_info is embedded in task_struct), and
thread_info::cpu gets its own slot in task_struct.
This is heavily based on a patch written by Linus.
Originally-from: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a0898196f0476195ca02713691a5037a14f2aac5.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
---
include/linux/init_task.h | 9 +++++++++
include/linux/sched.h | 36 ++++++++++++++++++++++++++++++++++--
include/linux/thread_info.h | 15 +++++++++++++++
init/Kconfig | 7 +++++++
init/init_task.c | 7 +++++--
kernel/sched/sched.h | 4 ++++
6 files changed, 74 insertions(+), 4 deletions(-)
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 1c1ff7e4faa4..d25d3f70ee99 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -15,6 +15,8 @@
#include <net/net_namespace.h>
#include <linux/sched/rt.h>
+#include <asm/thread_info.h>
+
#ifdef CONFIG_SMP
# define INIT_PUSHABLE_TASKS(tsk) \
.pushable_tasks = PLIST_NODE_INIT(tsk.pushable_tasks, MAX_PRIO),
@@ -183,12 +185,19 @@ extern struct task_group root_task_group;
# define INIT_KASAN(tsk)
#endif
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+# define INIT_TASK_TI(tsk) .thread_info = INIT_THREAD_INFO(tsk),
+#else
+# define INIT_TASK_TI(tsk)
+#endif
+
/*
* INIT_TASK is used to set up the first task table, touch at
* your own risk!. Base=0, limit=0x1fffff (=2MB)
*/
#define INIT_TASK(tsk) \
{ \
+ INIT_TASK_TI(tsk) \
.state = 0, \
.stack = &init_thread_info, \
.usage = ATOMIC_INIT(2), \
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 1218980f53de..d6b2e77aad3f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1389,6 +1389,13 @@ struct tlbflush_unmap_batch {
};
struct task_struct {
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ /*
+ * For reasons of header soup (see current_thread_info()), this
+ * must be the first element of task_struct.
+ */
+ struct thread_info thread_info;
+#endif
volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */
void *stack;
atomic_t usage;
@@ -1398,6 +1405,9 @@ struct task_struct {
#ifdef CONFIG_SMP
struct llist_node wake_entry;
int on_cpu;
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ unsigned int cpu; /* current CPU */
+#endif
unsigned int wakee_flips;
unsigned long wakee_flip_decay_ts;
struct task_struct *last_wakee;
@@ -2440,7 +2450,9 @@ extern void set_curr_task(int cpu, struct task_struct *p);
void yield(void);
union thread_union {
+#ifndef CONFIG_THREAD_INFO_IN_TASK
struct thread_info thread_info;
+#endif
unsigned long stack[THREAD_SIZE/sizeof(long)];
};
@@ -2840,10 +2852,26 @@ static inline void threadgroup_change_end(struct task_struct *tsk)
cgroup_threadgroup_change_end(tsk);
}
-#ifndef __HAVE_THREAD_FUNCTIONS
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+
+static inline struct thread_info *task_thread_info(struct task_struct *task)
+{
+ return &task->thread_info;
+}
+static inline void *task_stack_page(const struct task_struct *task)
+{
+ return task->stack;
+}
+#define setup_thread_stack(new,old) do { } while(0)
+static inline unsigned long *end_of_stack(const struct task_struct *task)
+{
+ return task->stack;
+}
+
+#elif !defined(__HAVE_THREAD_FUNCTIONS)
#define task_thread_info(task) ((struct thread_info *)(task)->stack)
-#define task_stack_page(task) ((task)->stack)
+#define task_stack_page(task) ((void *)(task)->stack)
static inline void setup_thread_stack(struct task_struct *p, struct task_struct *org)
{
@@ -3135,7 +3163,11 @@ static inline void ptrace_signal_wake_up(struct task_struct *t, bool resume)
static inline unsigned int task_cpu(const struct task_struct *p)
{
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ return p->cpu;
+#else
return task_thread_info(p)->cpu;
+#endif
}
static inline int task_node(const struct task_struct *p)
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 646891f3bc1e..2813de75a96e 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -13,6 +13,21 @@
struct timespec;
struct compat_timespec;
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+struct thread_info {
+ u32 flags; /* low level flags */
+};
+
+#define INIT_THREAD_INFO(tsk) \
+{ \
+ .flags = 0, \
+}
+#endif
+
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+#define current_thread_info() ((struct thread_info *)current)
+#endif
+
/*
* System call restart block.
*/
diff --git a/init/Kconfig b/init/Kconfig
index 47b0bdcf33c2..5dec7d424980 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -26,6 +26,13 @@ config IRQ_WORK
config BUILDTIME_EXTABLE_SORT
bool
+config THREAD_INFO_IN_TASK
+ bool
+ help
+ Select this to move thread_info off the stack into task_struct. To
+ make this work, an arch will need to remove all thread_info fields
+ except flags and fix any runtime bugs.
+
menu "General setup"
config BROKEN
diff --git a/init/init_task.c b/init/init_task.c
index ba0a7f362d9e..11f83be1fa79 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -22,5 +22,8 @@ EXPORT_SYMBOL(init_task);
* Initial thread structure. Alignment of this is handled by a special
* linker map entry.
*/
-union thread_union init_thread_union __init_task_data =
- { INIT_THREAD_INFO(init_task) };
+union thread_union init_thread_union __init_task_data = {
+#ifndef CONFIG_THREAD_INFO_IN_TASK
+ INIT_THREAD_INFO(init_task)
+#endif
+};
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 8b96df04ba78..8afd9d62c56e 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -978,7 +978,11 @@ static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
* per-task data have been completed by this moment.
*/
smp_wmb();
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ p->cpu = cpu;
+#else
task_thread_info(p)->cpu = cpu;
+#endif
p->wake_cpu = cpu;
#endif
}
--
2.17.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4.4 2/7] sched/core: Add try_get_task_stack() and put_task_stack()
2019-12-02 8:35 [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 1/7] sched/core: Allow putting thread_info into task_struct zhangyi (F)
@ 2019-12-02 8:35 ` zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 3/7] sched/core, x86: Make struct thread_info arch specific again zhangyi (F)
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: zhangyi (F) @ 2019-12-02 8:35 UTC (permalink / raw)
To: gregkh
Cc: linux-kernel, linux-fsdevel, luto, adobriyan, akpm, vbabka,
peterz, bigeasy, mhocko, john.ogness, yi.zhang, nixiaoming
From: Andy Lutomirski <luto@kernel.org>
commit c6c314a613cd7d03fb97713e0d642b493de42e69 upstream.
There are a few places in the kernel that access stack memory
belonging to a different task. Before we can start freeing task
stacks before the task_struct is freed, we need a way for those code
paths to pin the stack.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/17a434f50ad3d77000104f21666575e10a9c1fbd.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
---
include/linux/sched.h | 16 ++++++++++++++++
init/Kconfig | 3 +++
2 files changed, 19 insertions(+)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index d6b2e77aad3f..761247c966a5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2858,11 +2858,19 @@ static inline struct thread_info *task_thread_info(struct task_struct *task)
{
return &task->thread_info;
}
+
+/*
+ * When accessing the stack of a non-current task that might exit, use
+ * try_get_task_stack() instead. task_stack_page will return a pointer
+ * that could get freed out from under you.
+ */
static inline void *task_stack_page(const struct task_struct *task)
{
return task->stack;
}
+
#define setup_thread_stack(new,old) do { } while(0)
+
static inline unsigned long *end_of_stack(const struct task_struct *task)
{
return task->stack;
@@ -2898,6 +2906,14 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
}
#endif
+
+static inline void *try_get_task_stack(struct task_struct *tsk)
+{
+ return task_stack_page(tsk);
+}
+
+static inline void put_task_stack(struct task_struct *tsk) {}
+
#define task_stack_end_corrupted(task) \
(*(end_of_stack(task)) != STACK_END_MAGIC)
diff --git a/init/Kconfig b/init/Kconfig
index 5dec7d424980..f9fb621c9562 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -33,6 +33,9 @@ config THREAD_INFO_IN_TASK
make this work, an arch will need to remove all thread_info fields
except flags and fix any runtime bugs.
+ One subtle change that will be needed is to use try_get_task_stack()
+ and put_task_stack() in save_thread_stack_tsk() and get_wchan().
+
menu "General setup"
config BROKEN
--
2.17.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4.4 3/7] sched/core, x86: Make struct thread_info arch specific again
2019-12-02 8:35 [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 1/7] sched/core: Allow putting thread_info into task_struct zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 2/7] sched/core: Add try_get_task_stack() and put_task_stack() zhangyi (F)
@ 2019-12-02 8:35 ` zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 4/7] fs/proc: Stop reporting eip and esp in /proc/PID/stat zhangyi (F)
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: zhangyi (F) @ 2019-12-02 8:35 UTC (permalink / raw)
To: gregkh
Cc: linux-kernel, linux-fsdevel, luto, adobriyan, akpm, vbabka,
peterz, bigeasy, mhocko, john.ogness, yi.zhang, nixiaoming
From: Heiko Carstens <heiko.carstens@de.ibm.com>
commit c8061485a0d7569a865a3cc3c63347b0f42b3765 upstream.
The following commit:
c65eacbe290b ("sched/core: Allow putting thread_info into task_struct")
... made 'struct thread_info' a generic struct with only a
single ::flags member, if CONFIG_THREAD_INFO_IN_TASK_STRUCT=y is
selected.
This change however seems to be quite x86 centric, since at least the
generic preemption code (asm-generic/preempt.h) assumes that struct
thread_info also has a preempt_count member, which apparently was not
true for x86.
We could add a bit more #ifdefs to solve this problem too, but it seems
to be much simpler to make struct thread_info arch specific
again. This also makes the conversion to THREAD_INFO_IN_TASK_STRUCT a
bit easier for architectures that have a couple of arch specific stuff
in their thread_info definition.
The arch specific stuff _could_ be moved to thread_struct. However
keeping them in thread_info makes it easier: accessing thread_info
members is simple, since it is at the beginning of the task_struct,
while the thread_struct is at the end. At least on s390 the offsets
needed to access members of the thread_struct (with task_struct as
base) are too large for various asm instructions. This is not a
problem when keeping these members within thread_info.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: keescook@chromium.org
Cc: linux-arch@vger.kernel.org
Link: http://lkml.kernel.org/r/1476901693-8492-2-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[ zhangyi: skip defination of INIT_THREAD_INFO and struct thread_info ]
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
---
include/linux/thread_info.h | 11 -----------
1 file changed, 11 deletions(-)
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 2813de75a96e..897e835379d8 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -13,17 +13,6 @@
struct timespec;
struct compat_timespec;
-#ifdef CONFIG_THREAD_INFO_IN_TASK
-struct thread_info {
- u32 flags; /* low level flags */
-};
-
-#define INIT_THREAD_INFO(tsk) \
-{ \
- .flags = 0, \
-}
-#endif
-
#ifdef CONFIG_THREAD_INFO_IN_TASK
#define current_thread_info() ((struct thread_info *)current)
#endif
--
2.17.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4.4 4/7] fs/proc: Stop reporting eip and esp in /proc/PID/stat
2019-12-02 8:35 [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in zhangyi (F)
` (2 preceding siblings ...)
2019-12-02 8:35 ` [PATCH 4.4 3/7] sched/core, x86: Make struct thread_info arch specific again zhangyi (F)
@ 2019-12-02 8:35 ` zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 5/7] fs/proc: Report eip/esp in /prod/PID/stat for coredumping zhangyi (F)
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: zhangyi (F) @ 2019-12-02 8:35 UTC (permalink / raw)
To: gregkh
Cc: linux-kernel, linux-fsdevel, luto, adobriyan, akpm, vbabka,
peterz, bigeasy, mhocko, john.ogness, yi.zhang, nixiaoming
From: Andy Lutomirski <luto@kernel.org>
commit 0a1eb2d474edfe75466be6b4677ad84e5e8ca3f5 upstream.
Reporting these fields on a non-current task is dangerous. If the
task is in any state other than normal kernel code, they may contain
garbage or even kernel addresses on some architectures. (x86_64
used to do this. I bet lots of architectures still do.) With
CONFIG_THREAD_INFO_IN_TASK=y, it can OOPS, too.
As far as I know, there are no use programs that make any material
use of these fields, so just get rid of them.
Reported-by: Jann Horn <jann@thejh.net>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux API <linux-api@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Link: http://lkml.kernel.org/r/a5fed4c3f4e33ed25d4bb03567e329bc5a712bcc.1475257877.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
---
fs/proc/array.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 60cbaa821164..618c83f1866d 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -425,10 +425,11 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
mm = get_task_mm(task);
if (mm) {
vsize = task_vsize(mm);
- if (permitted) {
- eip = KSTK_EIP(task);
- esp = KSTK_ESP(task);
- }
+ /*
+ * esp and eip are intentionally zeroed out. There is no
+ * non-racy way to read them without freezing the task.
+ * Programs that need reliable values can use ptrace(2).
+ */
}
get_task_comm(tcomm, task);
--
2.17.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4.4 5/7] fs/proc: Report eip/esp in /prod/PID/stat for coredumping
2019-12-02 8:35 [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in zhangyi (F)
` (3 preceding siblings ...)
2019-12-02 8:35 ` [PATCH 4.4 4/7] fs/proc: Stop reporting eip and esp in /proc/PID/stat zhangyi (F)
@ 2019-12-02 8:35 ` zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 6/7] proc: fix coredump vs read /proc/*/stat race zhangyi (F)
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: zhangyi (F) @ 2019-12-02 8:35 UTC (permalink / raw)
To: gregkh
Cc: linux-kernel, linux-fsdevel, luto, adobriyan, akpm, vbabka,
peterz, bigeasy, mhocko, john.ogness, yi.zhang, nixiaoming
From: John Ogness <john.ogness@linutronix.de>
commit fd7d56270b526ca3ed0c224362e3c64a0f86687a upstream.
Commit 0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in
/proc/PID/stat") stopped reporting eip/esp because it is
racy and dangerous for executing tasks. The comment adds:
As far as I know, there are no use programs that make any
material use of these fields, so just get rid of them.
However, existing userspace core-dump-handler applications (for
example, minicoredumper) are using these fields since they
provide an excellent cross-platform interface to these valuable
pointers. So that commit introduced a user space visible
regression.
Partially revert the change and make the readout possible for
tasks with the proper permissions and only if the target task
has the PF_DUMPCORE flag set.
Fixes: 0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in> /proc/PID/stat")
Reported-by: Marco Felsch <marco.felsch@preh.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: stable@vger.kernel.org
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linux API <linux-api@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/87poatfwg6.fsf@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ zhangyi: 68db0cf10678 does not merged, skip the task_stack.h for 4.4]
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
---
fs/proc/array.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 618c83f1866d..c3e1732bcaa5 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -429,7 +429,15 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
* esp and eip are intentionally zeroed out. There is no
* non-racy way to read them without freezing the task.
* Programs that need reliable values can use ptrace(2).
+ *
+ * The only exception is if the task is core dumping because
+ * a program is not able to use ptrace(2) in that case. It is
+ * safe because the task has stopped executing permanently.
*/
+ if (permitted && (task->flags & PF_DUMPCORE)) {
+ eip = KSTK_EIP(task);
+ esp = KSTK_ESP(task);
+ }
}
get_task_comm(tcomm, task);
--
2.17.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4.4 6/7] proc: fix coredump vs read /proc/*/stat race
2019-12-02 8:35 [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in zhangyi (F)
` (4 preceding siblings ...)
2019-12-02 8:35 ` [PATCH 4.4 5/7] fs/proc: Report eip/esp in /prod/PID/stat for coredumping zhangyi (F)
@ 2019-12-02 8:35 ` zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 7/7] fs/proc/array.c: allow reporting eip/esp for all coredumping threads zhangyi (F)
2019-12-12 17:15 ` [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in Greg KH
7 siblings, 0 replies; 9+ messages in thread
From: zhangyi (F) @ 2019-12-02 8:35 UTC (permalink / raw)
To: gregkh
Cc: linux-kernel, linux-fsdevel, luto, adobriyan, akpm, vbabka,
peterz, bigeasy, mhocko, john.ogness, yi.zhang, nixiaoming
From: Alexey Dobriyan <adobriyan@gmail.com>
commit 8bb2ee192e482c5d500df9f2b1b26a560bd3026f upstream.
do_task_stat() accesses IP and SP of a task without bumping reference
count of a stack (which became an entity with independent lifetime at
some point).
Steps to reproduce:
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <unistd.h>
#include <sys/wait.h>
int main(void)
{
setrlimit(RLIMIT_CORE, &(struct rlimit){});
while (1) {
char buf[64];
char buf2[4096];
pid_t pid;
int fd;
pid = fork();
if (pid == 0) {
*(volatile int *)0 = 0;
}
snprintf(buf, sizeof(buf), "/proc/%u/stat", pid);
fd = open(buf, O_RDONLY);
read(fd, buf2, sizeof(buf2));
close(fd);
waitpid(pid, NULL, 0);
}
return 0;
}
BUG: unable to handle kernel paging request at 0000000000003fd8
IP: do_task_stat+0x8b4/0xaf0
PGD 800000003d73e067 P4D 800000003d73e067 PUD 3d558067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 1417 Comm: a.out Not tainted 4.15.0-rc8-dirty #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014
RIP: 0010:do_task_stat+0x8b4/0xaf0
Call Trace:
proc_single_show+0x43/0x70
seq_read+0xe6/0x3b0
__vfs_read+0x1e/0x120
vfs_read+0x84/0x110
SyS_read+0x3d/0xa0
entry_SYSCALL_64_fastpath+0x13/0x6c
RIP: 0033:0x7f4d7928cba0
RSP: 002b:00007ffddb245158 EFLAGS: 00000246
Code: 03 b7 a0 01 00 00 4c 8b 4c 24 70 4c 8b 44 24 78 4c 89 74 24 18 e9 91 f9 ff ff f6 45 4d 02 0f 84 fd f7 ff ff 48 8b 45 40 48 89 ef <48> 8b 80 d8 3f 00 00 48 89 44 24 20 e8 9b 97 eb ff 48 89 44 24
RIP: do_task_stat+0x8b4/0xaf0 RSP: ffffc90000607cc8
CR2: 0000000000003fd8
John Ogness said: for my tests I added an else case to verify that the
race is hit and correctly mitigated.
Link: http://lkml.kernel.org/r/20180116175054.GA11513@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reported-by: "Kohli, Gaurav" <gkohli@codeaurora.org>
Tested-by: John Ogness <john.ogness@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
---
fs/proc/array.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/fs/proc/array.c b/fs/proc/array.c
index c3e1732bcaa5..42e33ea50d1c 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -435,8 +435,11 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
* safe because the task has stopped executing permanently.
*/
if (permitted && (task->flags & PF_DUMPCORE)) {
- eip = KSTK_EIP(task);
- esp = KSTK_ESP(task);
+ if (try_get_task_stack(task)) {
+ eip = KSTK_EIP(task);
+ esp = KSTK_ESP(task);
+ put_task_stack(task);
+ }
}
}
--
2.17.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4.4 7/7] fs/proc/array.c: allow reporting eip/esp for all coredumping threads
2019-12-02 8:35 [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in zhangyi (F)
` (5 preceding siblings ...)
2019-12-02 8:35 ` [PATCH 4.4 6/7] proc: fix coredump vs read /proc/*/stat race zhangyi (F)
@ 2019-12-02 8:35 ` zhangyi (F)
2019-12-12 17:15 ` [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in Greg KH
7 siblings, 0 replies; 9+ messages in thread
From: zhangyi (F) @ 2019-12-02 8:35 UTC (permalink / raw)
To: gregkh
Cc: linux-kernel, linux-fsdevel, luto, adobriyan, akpm, vbabka,
peterz, bigeasy, mhocko, john.ogness, yi.zhang, nixiaoming
From: John Ogness <john.ogness@linutronix.de>
commit cb8f381f1613cafe3aec30809991cd56e7135d92 upstream.
0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in /proc/PID/stat")
stopped reporting eip/esp and fd7d56270b52 ("fs/proc: Report eip/esp in
/prod/PID/stat for coredumping") reintroduced the feature to fix a
regression with userspace core dump handlers (such as minicoredumper).
Because PF_DUMPCORE is only set for the primary thread, this didn't fix
the original problem for secondary threads. Allow reporting the eip/esp
for all threads by checking for PF_EXITING as well. This is set for all
the other threads when they are killed. coredump_wait() waits for all the
tasks to become inactive before proceeding to invoke a core dumper.
Link: http://lkml.kernel.org/r/87y32p7i7a.fsf@linutronix.de
Link: http://lkml.kernel.org/r/20190522161614.628-1-jlu@pengutronix.de
Fixes: fd7d56270b526ca3 ("fs/proc: Report eip/esp in /prod/PID/stat for coredumping")
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reported-by: Jan Luebbe <jlu@pengutronix.de>
Tested-by: Jan Luebbe <jlu@pengutronix.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
---
fs/proc/array.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 42e33ea50d1c..6238f45eed02 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -434,7 +434,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
* a program is not able to use ptrace(2) in that case. It is
* safe because the task has stopped executing permanently.
*/
- if (permitted && (task->flags & PF_DUMPCORE)) {
+ if (permitted && (task->flags & (PF_EXITING|PF_DUMPCORE))) {
if (try_get_task_stack(task)) {
eip = KSTK_EIP(task);
esp = KSTK_ESP(task);
--
2.17.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in
2019-12-02 8:35 [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in zhangyi (F)
` (6 preceding siblings ...)
2019-12-02 8:35 ` [PATCH 4.4 7/7] fs/proc/array.c: allow reporting eip/esp for all coredumping threads zhangyi (F)
@ 2019-12-12 17:15 ` Greg KH
7 siblings, 0 replies; 9+ messages in thread
From: Greg KH @ 2019-12-12 17:15 UTC (permalink / raw)
To: zhangyi (F)
Cc: linux-kernel, linux-fsdevel, luto, adobriyan, akpm, vbabka,
peterz, bigeasy, mhocko, john.ogness, nixiaoming
On Mon, Dec 02, 2019 at 04:35:12PM +0800, zhangyi (F) wrote:
> Reporting eip and esp fields on a non-current task is dangerous,
> so backport this series for 4.4 to fix protential oops and info leak
> problems. The first 3 patch are depended on the 6/7 patch.
Thanks for the backports, now queued up.
greg k-h
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2019-12-12 17:15 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-02 8:35 [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 1/7] sched/core: Allow putting thread_info into task_struct zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 2/7] sched/core: Add try_get_task_stack() and put_task_stack() zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 3/7] sched/core, x86: Make struct thread_info arch specific again zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 4/7] fs/proc: Stop reporting eip and esp in /proc/PID/stat zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 5/7] fs/proc: Report eip/esp in /prod/PID/stat for coredumping zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 6/7] proc: fix coredump vs read /proc/*/stat race zhangyi (F)
2019-12-02 8:35 ` [PATCH 4.4 7/7] fs/proc/array.c: allow reporting eip/esp for all coredumping threads zhangyi (F)
2019-12-12 17:15 ` [PATCH 4.4 0/7] fs/proc: Stop reporting eip and esp in Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).