From: ebiederm@xmission.com (Eric W. Biederman)
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <linux-kernel@vger.kernel.org>,
Alexey Gladkov <gladkov.alexey@gmail.com>,
Oleg Nesterov <oleg@redhat.com>,
Christian Brauner <christian.brauner@ubuntu.com>
Subject: [GIT PULL] proc fix for 5.7-rc1
Date: Fri, 10 Apr 2020 08:03:04 -0500
Message-ID: <87sghbmr1z.fsf@x220.int.ebiederm.org>
In-Reply-To: <87blobnq02.fsf@x220.int.ebiederm.org> (Eric W. Biederman's message of "Wed, 01 Apr 2020 11:13:17 -0500")
Linus,
Please pull the for-linus branch from the git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git for-linus
HEAD: 63f818f46af9f8b3f17b9695501e8d08959feb60 proc: Use a dedicated lock in struct pid
A brown paper bag bug slipped through my proc changes, and syzkaller caught
it when the code ended up in your tree. I have opted to fix it the
simplest, cleanest way I know how, so there is no reasonable chance
for the bug to repeat.
Eric
From 63f818f46af9f8b3f17b9695501e8d08959feb60 Mon Sep 17 00:00:00 2001
From: "Eric W. Biederman" <ebiederm@xmission.com>
Date: Tue, 7 Apr 2020 09:43:04 -0500
Subject: [PATCH] proc: Use a dedicated lock in struct pid
syzbot wrote:
> ========================================================
> WARNING: possible irq lock inversion dependency detected
> 5.6.0-syzkaller #0 Not tainted
> --------------------------------------------------------
> swapper/1/0 just changed the state of lock:
> ffffffff898090d8 (tasklist_lock){.+.?}-{2:2}, at: send_sigurg+0x9f/0x320 fs/fcntl.c:840
> but this lock took another, SOFTIRQ-unsafe lock in the past:
> (&pid->wait_pidfd){+.+.}-{2:2}
>
>
> and interrupts could create inverse lock ordering between them.
>
>
> other info that might help us debug this:
> Possible interrupt unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(&pid->wait_pidfd);
> local_irq_disable();
> lock(tasklist_lock);
> lock(&pid->wait_pidfd);
> <Interrupt>
> lock(tasklist_lock);
>
> *** DEADLOCK ***
>
> 4 locks held by swapper/1/0:
The problem is that because wait_pidfd.lock is taken under the tasklist
lock, it must always be taken with irqs disabled. tasklist_lock can be
taken from interrupt context, and if wait_pidfd.lock was already held with
irqs enabled when that interrupt arrived, this would create a lock order
inversion.
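The rule behind the lockdep report can be written down as a toy check.
This is a userspace sketch only, not kernel code and not how lockdep is
implemented; struct lock_usage and irq_inversion_possible() are names made
up for illustration. It encodes one rule: a lock that is acquired while
holding a lock used from interrupt context must never itself be acquired
with irqs enabled.

```c
#include <stdbool.h>

/* Hypothetical summary of how one lock has been used so far. */
struct lock_usage {
	const char *name;
	bool taken_in_irq;       /* acquired from (soft)irq context at least once */
	bool taken_irqs_enabled; /* acquired with irqs enabled at least once */
};

/*
 * Toy version of the irq-safe -> irq-unsafe dependency check.
 * inner_under_outer says whether "inner" has ever been acquired while
 * "outer" was held. If it has, and "outer" is used from interrupt context
 * while "inner" is sometimes taken with irqs enabled, an interrupt that
 * arrives while "inner" is held can deadlock against another CPU holding
 * "outer" and waiting for "inner".
 */
static bool irq_inversion_possible(const struct lock_usage *outer,
				   const struct lock_usage *inner,
				   bool inner_under_outer)
{
	return inner_under_outer &&
	       outer->taken_in_irq &&
	       inner->taken_irqs_enabled;
}
```

With tasklist_lock as the outer lock (used from softirq context) and
wait_pidfd.lock as the inner lock (taken with irqs enabled in the proc
code), the check fires. A lock that is never taken under tasklist_lock
does not trip it.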
Oleg suggested just disabling irqs at the places where I have added extra
uses of wait_pidfd.lock. That should be safe, and I think the code will
eventually do that. Christian rightly pointed out that sharing
wait_pidfd.lock was a premature optimization.
It is also true that my pre-merge window testing was insufficient. So
remove the premature optimization and give struct pid a dedicated lock of
its own for struct pid things. I have verified that lockdep sees all 3
paths where we take the new pid->lock and lockdep does not complain.
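For readers outside the kernel, the shape of the fix can be sketched in
userspace C. This is an analogue under stated assumptions, not the kernel
code: a pthread mutex stands in for spinlock_t, a hand-rolled singly linked
list stands in for the RCU hlist, and struct pid_like, struct inode_node
and the pid_like_*() helpers are hypothetical names. The point it shows is
that the list is guarded by a lock that belongs to the structure and is
shared with nothing else.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a proc directory inode linked to a pid. */
struct inode_node {
	struct inode_node *next;
	int id;
};

/* Hypothetical stand-in for struct pid with its own dedicated lock. */
struct pid_like {
	pthread_mutex_t lock;      /* guards only the inodes list below */
	struct inode_node *inodes;
};

/* Initialize the lock when the object is set up, as alloc_pid() does. */
static void pid_like_init(struct pid_like *pid)
{
	pthread_mutex_init(&pid->lock, NULL);
	pid->inodes = NULL;
}

/* Link a node at the head of the list under the dedicated lock. */
static void pid_like_add(struct pid_like *pid, struct inode_node *node)
{
	pthread_mutex_lock(&pid->lock);
	node->next = pid->inodes;
	pid->inodes = node;
	pthread_mutex_unlock(&pid->lock);
}

/* Unlink a node, if present, under the same lock. */
static void pid_like_del(struct pid_like *pid, struct inode_node *node)
{
	struct inode_node **pp;

	pthread_mutex_lock(&pid->lock);
	for (pp = &pid->inodes; *pp; pp = &(*pp)->next) {
		if (*pp == node) {
			*pp = node->next;
			node->next = NULL;
			break;
		}
	}
	pthread_mutex_unlock(&pid->lock);
}

/* Count the linked nodes while holding the lock. */
static size_t pid_like_count(struct pid_like *pid)
{
	size_t n = 0;
	struct inode_node *p;

	pthread_mutex_lock(&pid->lock);
	for (p = pid->inodes; p; p = p->next)
		n++;
	pthread_mutex_unlock(&pid->lock);
	return n;
}
```

Because no other subsystem ever takes this lock, its ordering against other
locks is decided only by the three call sites that use it.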
It is my current daydream that one day pid->lock can be used to guard the
task lists as well, and then the tasklist_lock won't need to be held to
deliver signals. That will require taking pid->lock with irqs disabled.
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/lkml/00000000000011d66805a25cd73f@google.com/
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Reported-by: syzbot+343f75cdeea091340956@syzkaller.appspotmail.com
Reported-by: syzbot+832aabf700bc3ec920b9@syzkaller.appspotmail.com
Reported-by: syzbot+f675f964019f884dbd0f@syzkaller.appspotmail.com
Reported-by: syzbot+a9fb1457d720a55d6dc5@syzkaller.appspotmail.com
Fixes: 7bc3e6e55acf ("proc: Use a list of inodes to flush from proc")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
fs/proc/base.c | 10 +++++-----
include/linux/pid.h | 1 +
kernel/pid.c | 1 +
3 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 74f948a6b621..6042b646ab27 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1839,9 +1839,9 @@ void proc_pid_evict_inode(struct proc_inode *ei)
struct pid *pid = ei->pid;
if (S_ISDIR(ei->vfs_inode.i_mode)) {
- spin_lock(&pid->wait_pidfd.lock);
+ spin_lock(&pid->lock);
hlist_del_init_rcu(&ei->sibling_inodes);
- spin_unlock(&pid->wait_pidfd.lock);
+ spin_unlock(&pid->lock);
}
put_pid(pid);
@@ -1877,9 +1877,9 @@ struct inode *proc_pid_make_inode(struct super_block * sb,
/* Let the pid remember us for quick removal */
ei->pid = pid;
if (S_ISDIR(mode)) {
- spin_lock(&pid->wait_pidfd.lock);
+ spin_lock(&pid->lock);
hlist_add_head_rcu(&ei->sibling_inodes, &pid->inodes);
- spin_unlock(&pid->wait_pidfd.lock);
+ spin_unlock(&pid->lock);
}
task_dump_owner(task, 0, &inode->i_uid, &inode->i_gid);
@@ -3273,7 +3273,7 @@ static const struct inode_operations proc_tgid_base_inode_operations = {
void proc_flush_pid(struct pid *pid)
{
- proc_invalidate_siblings_dcache(&pid->inodes, &pid->wait_pidfd.lock);
+ proc_invalidate_siblings_dcache(&pid->inodes, &pid->lock);
put_pid(pid);
}
diff --git a/include/linux/pid.h b/include/linux/pid.h
index 01a0d4e28506..cc896f0fc4e3 100644
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -60,6 +60,7 @@ struct pid
{
refcount_t count;
unsigned int level;
+ spinlock_t lock;
/* lists of tasks that use this pid */
struct hlist_head tasks[PIDTYPE_MAX];
struct hlist_head inodes;
diff --git a/kernel/pid.c b/kernel/pid.c
index efd34874b3d1..517d0855d4cf 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -246,6 +246,7 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
get_pid_ns(ns);
refcount_set(&pid->count, 1);
+ spin_lock_init(&pid->lock);
for (type = 0; type < PIDTYPE_MAX; ++type)
INIT_HLIST_HEAD(&pid->tasks[type]);
--
2.20.1