All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
	Tejun Heo <tj@kernel.org>, Zefan Li <lizefan.x@bytedance.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Richard Purdie <richard.purdie@linuxfoundation.org>,
	Paul Gortmaker <paul.gortmaker@windriver.com>
Subject: [PATCH 5.10 06/24] cgroup1: fix leaked context root causing sporadic NULL deref in LTP
Date: Thu, 29 Jul 2021 15:54:26 +0200	[thread overview]
Message-ID: <20210729135137.470133128@linuxfoundation.org> (raw)
In-Reply-To: <20210729135137.267680390@linuxfoundation.org>

From: Paul Gortmaker <paul.gortmaker@windriver.com>

commit 1e7107c5ef44431bc1ebbd4c353f1d7c22e5f2ec upstream.

Richard reported sporadic (roughly one in 10 or so) null dereferences and
other strange behaviour for a set of automated LTP tests.  Things like:

   BUG: kernel NULL pointer dereference, address: 0000000000000008
   #PF: supervisor read access in kernel mode
   #PF: error_code(0x0000) - not-present page
   PGD 0 P4D 0
   Oops: 0000 [#1] PREEMPT SMP PTI
   CPU: 0 PID: 1516 Comm: umount Not tainted 5.10.0-yocto-standard #1
   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
   RIP: 0010:kernfs_sop_show_path+0x1b/0x60

...or these others:

   RIP: 0010:do_mkdirat+0x6a/0xf0
   RIP: 0010:d_alloc_parallel+0x98/0x510
   RIP: 0010:do_readlinkat+0x86/0x120

There were other less common instances of some kind of a general scribble
but the common theme was mount and cgroup and a dubious dentry triggering
the NULL dereference.  I was only able to reproduce it under qemu by
replicating Richard's setup as closely as possible - I never did get it
to happen on bare metal, even while keeping everything else the same.

In commit 71d883c37e8d ("cgroup_do_mount(): massage calling conventions")
we see this as a part of the overall change:

   --------------
           struct cgroup_subsys *ss;
   -       struct dentry *dentry;

   [...]

   -       dentry = cgroup_do_mount(&cgroup_fs_type, fc->sb_flags, root,
   -                                CGROUP_SUPER_MAGIC, ns);

   [...]

   -       if (percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
   -               struct super_block *sb = dentry->d_sb;
   -               dput(dentry);
   +       ret = cgroup_do_mount(fc, CGROUP_SUPER_MAGIC, ns);
   +       if (!ret && percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
   +               struct super_block *sb = fc->root->d_sb;
   +               dput(fc->root);
                   deactivate_locked_super(sb);
                   msleep(10);
                   return restart_syscall();
           }
   --------------

In changing from the local "*dentry" variable to using fc->root, we now
export/leave that dentry pointer in the file context after doing the dput()
in the unlikely "is_dying" case.   With LTP doing a crazy amount of back to
back mount/unmount [testcases/bin/cgroup_regression_5_1.sh] the unlikely
becomes slightly likely and then bad things happen.

A fix would be to not leave the stale reference in fc->root as follows:

   --------------
                  dput(fc->root);
  +               fc->root = NULL;
                  deactivate_locked_super(sb);
   --------------

...but then we are just open-coding a duplicate of fc_drop_locked() so we
simply use that instead.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: stable@vger.kernel.org      # v5.1+
Reported-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Fixes: 71d883c37e8d ("cgroup_do_mount(): massage calling conventions")
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/internal.h              |    1 -
 include/linux/fs_context.h |    1 +
 kernel/cgroup/cgroup-v1.c  |    4 +---
 3 files changed, 2 insertions(+), 4 deletions(-)

--- a/fs/internal.h
+++ b/fs/internal.h
@@ -64,7 +64,6 @@ extern void __init chrdev_init(void);
  */
 extern const struct fs_context_operations legacy_fs_context_ops;
 extern int parse_monolithic_mount_data(struct fs_context *, void *);
-extern void fc_drop_locked(struct fs_context *);
 extern void vfs_clean_context(struct fs_context *fc);
 extern int finish_clean_context(struct fs_context *fc);
 
--- a/include/linux/fs_context.h
+++ b/include/linux/fs_context.h
@@ -139,6 +139,7 @@ extern int vfs_parse_fs_string(struct fs
 extern int generic_parse_monolithic(struct fs_context *fc, void *data);
 extern int vfs_get_tree(struct fs_context *fc);
 extern void put_fs_context(struct fs_context *fc);
+extern void fc_drop_locked(struct fs_context *fc);
 
 /*
  * sget() wrappers to be called from the ->get_tree() op.
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -1225,9 +1225,7 @@ int cgroup1_get_tree(struct fs_context *
 		ret = cgroup_do_get_tree(fc);
 
 	if (!ret && percpu_ref_is_dying(&ctx->root->cgrp.self.refcnt)) {
-		struct super_block *sb = fc->root->d_sb;
-		dput(fc->root);
-		deactivate_locked_super(sb);
+		fc_drop_locked(fc);
 		ret = 1;
 	}
 



  parent reply	other threads:[~2021-07-29 14:01 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-29 13:54 [PATCH 5.10 00/24] 5.10.55-rc1 review Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 01/24] tools: Allow proper CC/CXX/... override with LLVM=1 in Makefile.include Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 02/24] [PATCH] io_uring: fix link timeout refs Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 03/24] KVM: x86: determine if an exception has an error code only when injecting it Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 04/24] af_unix: fix garbage collect vs MSG_PEEK Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 05/24] workqueue: fix UAF in pwq_unbound_release_workfn() Greg Kroah-Hartman
2021-07-29 13:54 ` Greg Kroah-Hartman [this message]
2021-07-29 13:54 ` [PATCH 5.10 07/24] net/802/mrp: fix memleak in mrp_request_join() Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 08/24] net/802/garp: fix memleak in garp_request_join() Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 09/24] net: annotate data race around sk_ll_usec Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 10/24] sctp: move 198 addresses from unusable to private scope Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 11/24] rcu-tasks: Dont delete holdouts within trc_inspect_reader() Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 12/24] rcu-tasks: Dont delete holdouts within trc_wait_for_one_reader() Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 13/24] ipv6: allocate enough headroom in ip6_finish_output2() Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 14/24] drm/ttm: add a check against null pointer dereference Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 15/24] hfs: add missing clean-up in hfs_fill_super Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 16/24] hfs: fix high memory mapping in hfs_bnode_read Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 17/24] hfs: add lock nesting notation to hfs_find_init Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 18/24] firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 19/24] firmware: arm_scmi: Fix range check for the maximum number of pending messages Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 20/24] cifs: fix the out of range assignment to bit fields in parse_server_interfaces Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 21/24] iomap: remove the length variable in iomap_seek_data Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 22/24] iomap: remove the length variable in iomap_seek_hole Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 23/24] ARM: dts: versatile: Fix up interrupt controller node names Greg Kroah-Hartman
2021-07-29 13:54 ` [PATCH 5.10 24/24] ipv6: ip6_finish_output2: set sk into newly allocated nskb Greg Kroah-Hartman
2021-07-29 18:35 ` [PATCH 5.10 00/24] 5.10.55-rc1 review Jon Hunter
2021-07-29 19:47 ` Pavel Machek
2021-07-29 22:53 ` Shuah Khan
2021-07-30  4:58   ` Greg Kroah-Hartman
2021-07-29 23:36 ` Florian Fainelli
2021-07-30  2:40 ` Shuah Khan
2021-07-30  6:23 ` Naresh Kamboju
2021-07-30 10:12 ` Sudip Mukherjee
2021-07-31  4:43 ` Guenter Roeck
2021-07-31  7:51 ` Fox Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210729135137.470133128@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan.x@bytedance.com \
    --cc=paul.gortmaker@windriver.com \
    --cc=richard.purdie@linuxfoundation.org \
    --cc=stable@vger.kernel.org \
    --cc=tj@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.