linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Josef Bacik <josef@toxicpanda.com>
To: Andrii Nakryiko <andrii@kernel.org>,
	viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org
Cc: torvalds@linux-foundation.org, kernel-team@fb.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] fs: fix NULL dereference due to data race in prepend_path()
Date: Wed, 14 Oct 2020 17:12:39 -0400	[thread overview]
Message-ID: <0f91e494-1f81-d4ee-7b96-ce231bddbdb3@toxicpanda.com> (raw)
In-Reply-To: <20201014204529.934574-1-andrii@kernel.org>

On 10/14/20 4:45 PM, Andrii Nakryiko wrote:
> Fix data race in prepend_path() with re-reading mnt->mnt_ns twice without
> holding the lock. is_mounted() does check for NULL, but is_anon_ns(mnt->mnt_ns)
> might re-read the pointer again which could be NULL already, if in between
> reads one of kern_unmount()/kern_unmount_array()/umount_tree() sets mnt->mnt_ns
> to NULL.
> 
> This is seen in production with the following stack trace:
> 
> [22942.418012] BUG: kernel NULL pointer dereference, address: 0000000000000048
> ...
> [22942.976884] RIP: 0010:prepend_path.isra.4+0x1ce/0x2e0
> [22943.037706] Code: 89 c6 e9 0d ff ff ff 49 8b 85 c0 00 00 00 48 85 c0 0f 84 9d 00 00 00 48 3d 00 f0 ff ff 0f 87 91 00 00 00 49 8b 86 e0 00 00 00 <48> 83 78 48 00 0f 94 c0 0f b6 c0 83 c0 01 e9 3b ff ff ff 39 0d 29
> [22943.264141] RSP: 0018:ffffc90020d6fd98 EFLAGS: 00010283
> [22943.327058] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000007e5ee
> [22943.413041] RDX: ffff889fb56ac0c0 RSI: ffffffd05dc8147e RDI: ffff88b1f845ab7b
> [22943.499015] RBP: ffff889fbf8100c0 R08: ffffc90020d6fe30 R09: ffffc90020d6fe2c
> [22943.584992] R10: ffffc90020d6fe2c R11: ffffea00095836c0 R12: ffffc90020d6fe30
> [22943.670968] R13: ffff88b7d336bea0 R14: ffff88b7d336be80 R15: ffff88aeb78db980
> [22943.756944] FS:  00007f228447e980(0000) GS:ffff889fc00c0000(0000) knlGS:0000000000000000
> [22943.854448] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [22943.923653] CR2: 0000000000000048 CR3: 0000001ed235e001 CR4: 00000000007606e0
> [22944.009630] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [22944.095604] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [22944.181581] PKRU: 55555554
> [22944.214100] Call Trace:
> [22944.243485]  d_path+0xe6/0x150
> [22944.280202]  proc_pid_readlink+0x8f/0x100
> [22944.328449]  vfs_readlink+0xf8/0x110
> [22944.371456]  ? touch_atime+0x33/0xd0
> [22944.414466]  do_readlinkat+0xfd/0x120
> [22944.458522]  __x64_sys_readlinkat+0x1a/0x20
> [22944.508868]  do_syscall_64+0x42/0x110
> [22944.552928]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Fixes: f2683bd8d5bd ("[PATCH] fix d_absolute_path() interplay with fsmount()")
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
>   fs/d_path.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/d_path.c b/fs/d_path.c
> index 0f1fc1743302..a69e2cd36e6e 100644
> --- a/fs/d_path.c
> +++ b/fs/d_path.c
> @@ -102,6 +102,8 @@ static int prepend_path(const struct path *path,
>   
>   		if (dentry == vfsmnt->mnt_root || IS_ROOT(dentry)) {
>   			struct mount *parent = READ_ONCE(mnt->mnt_parent);
> +			struct mnt_namespace *mnt_ns;
> +
>   			/* Escaped? */
>   			if (dentry != vfsmnt->mnt_root) {
>   				bptr = *buffer;
> @@ -116,7 +118,9 @@ static int prepend_path(const struct path *path,
>   				vfsmnt = &mnt->mnt;
>   				continue;
>   			}
> -			if (is_mounted(vfsmnt) && !is_anon_ns(mnt->mnt_ns))
> +			mnt_ns = READ_ONCE(mnt->mnt_ns);
> +			/* open-coded is_mounted() to use local mnt_ns */
> +			if (!IS_ERR_OR_NULL(mnt_ns) && !is_anon_ns(mnt_ns))
>   				error = 1;	// absolute root
>   			else

I had to go look at this code carefully to make sure that mnt == 
real_mount(vfsmnt), which it does.  I was also afraid that if we could have 
mnt->mnt_ns change in between checks that we were just trading a possible NULL 
deref with a UAF, but we're under RCU here so we're good there as well.  You can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

  reply	other threads:[~2020-10-15  1:59 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-14 20:45 [PATCH] fs: fix NULL dereference due to data race in prepend_path() Andrii Nakryiko
2020-10-14 21:12 ` Josef Bacik [this message]
2020-10-14 21:49 ` Linus Torvalds
2020-10-14 23:08   ` Al Viro
2020-10-14 23:12     ` Linus Torvalds
2020-10-14 23:08 ` Al Viro
2020-10-15  1:07   ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0f91e494-1f81-d4ee-7b96-ce231bddbdb3@toxicpanda.com \
    --to=josef@toxicpanda.com \
    --cc=andrii@kernel.org \
    --cc=kernel-team@fb.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).