From: Kees Cook <keescook@chromium.org>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: gregkh@linuxfoundation.org, rafael@kernel.org,
davem@davemloft.net, kuba@kernel.org, ast@kernel.org,
andriin@fb.com, daniel@iogearbox.net, atenart@kernel.org,
alobakin@pm.me, weiwan@google.com, ap420073@gmail.com,
jeyu@kernel.org, ngupta@vflare.org,
sergey.senozhatsky.work@gmail.com, minchan@kernel.org,
axboe@kernel.dk, mbenes@suse.com, jpoimboe@redhat.com,
tglx@linutronix.de, jikos@kernel.org, rostedt@goodmis.org,
peterz@infradead.org, linux-block@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4] sysfs: fix kobject refcount to address races with kobject removal
Date: Wed, 23 Jun 2021 15:59:37 -0700 [thread overview]
Message-ID: <202106231555.871D23D50@keescook> (raw)
In-Reply-To: <20210623215007.862787-1-mcgrof@kernel.org>
On Wed, Jun 23, 2021 at 02:50:07PM -0700, Luis Chamberlain wrote:
> It's possible today to have a device attribute read or store
> race against device removal. This is known to happen as follows:
>
> write system call -->
> ksys_write () -->
> vfs_write() -->
> __vfs_write() -->
> kernfs_fop_write_iter() -->
> sysfs_kf_write() -->
> dev_attr_store() -->
> null reference
>
> This happens because the dev_attr->store() callback can be
> removed prior to its call, after dev_attr_store() was initiated.
> The null dereference is possible because the sysfs ops can be
> removed on module removal, for instance, when device_del() is
> called, and a sysfs read / store is not doing any kobject reference
> bumps either. This allows a read/store call to initiate, a
> device_del() to kick off, and then the read/store call can be
> gone by the time to execute it.
>
> The sysfs filesystem is not doing any kobject reference bumps during a
> read / store ops to prevent this.
>
> To fix this in a simplified way, just bump the kobject reference when
> we create a directory and remove it on directory removal.
>
> The big unfortunate eye-sore is addressing the manual kobject reference
> assumption on the networking code, which leads me to believe we should
> end up replacing that eventually with another sort of check.
>
> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>
> This v4 moves to fixing the race condition on dev_attr_store() and
> dev_attr_read() to sysfs by bumping the kobject reference count
> on directory creation / deletion as suggested by Greg.
>
> Unfortunately at least the networking core has a manual refcount
> assumption, which needs to be adjusted to account for this change.
> This should also mean there is runtime for other kobjects which may
> not be explored yet which may need fixing as well. We may want to
> change the check to something else on the networking front, but its
> not clear to me yet what to use.
>
> fs/sysfs/dir.c | 3 +++
> net/core/dev.c | 4 ++--
> 2 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
> index 59dffd5ca517..6c47aa4af6f5 100644
> --- a/fs/sysfs/dir.c
> +++ b/fs/sysfs/dir.c
> @@ -56,12 +56,14 @@ int sysfs_create_dir_ns(struct kobject *kobj, const void *ns)
>
> kobject_get_ownership(kobj, &uid, &gid);
>
> + kobject_get(kobj);
> kn = kernfs_create_dir_ns(parent, kobject_name(kobj),
> S_IRWXU | S_IRUGO | S_IXUGO, uid, gid,
> kobj, ns);
> if (IS_ERR(kn)) {
> if (PTR_ERR(kn) == -EEXIST)
> sysfs_warn_dup(parent, kobject_name(kobj));
> + kobject_put(kobj);
> return PTR_ERR(kn);
> }
>
> @@ -100,6 +102,7 @@ void sysfs_remove_dir(struct kobject *kobj)
> if (kn) {
> WARN_ON_ONCE(kernfs_type(kn) != KERNFS_DIR);
> kernfs_remove(kn);
> + kobject_put(kobj);
> }
> }
Shouldn't this be taken on open() not sysfs creation, or is the problem
that the kobject is held the module memory rather than duplicated by
sysfs?
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 222b1d322c96..3a0ffa603d14 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10429,7 +10429,7 @@ static void netdev_wait_allrefs(struct net_device *dev)
> rebroadcast_time = warning_time = jiffies;
> refcnt = netdev_refcnt_read(dev);
>
> - while (refcnt != 1) {
> + while (refcnt != 3) {
> if (time_after(jiffies, rebroadcast_time + 1 * HZ)) {
> rtnl_lock();
>
> @@ -10544,7 +10544,7 @@ void netdev_run_todo(void)
> netdev_wait_allrefs(dev);
>
> /* paranoia */
> - BUG_ON(netdev_refcnt_read(dev) != 1);
> + BUG_ON(netdev_refcnt_read(dev) != 3);
And surely there are things besides netdevs that would suffer from this
change?
> BUG_ON(!list_empty(&dev->ptype_all));
> BUG_ON(!list_empty(&dev->ptype_specific));
> WARN_ON(rcu_access_pointer(dev->ip_ptr));
> --
> 2.27.0
>
--
Kees Cook
next prev parent reply other threads:[~2021-06-23 22:59 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-23 21:50 [PATCH v4] sysfs: fix kobject refcount to address races with kobject removal Luis Chamberlain
2021-06-23 22:59 ` Kees Cook [this message]
2021-06-24 1:09 ` Luis Chamberlain
2021-06-24 11:06 ` Greg KH
2021-06-24 11:09 ` Greg KH
2021-06-25 21:55 ` Luis Chamberlain
2021-07-01 22:48 ` Luis Chamberlain
2021-07-02 1:04 ` Luis Chamberlain
2021-07-21 11:30 ` Greg KH
2021-07-22 21:31 ` Luis Chamberlain
2021-07-23 11:14 ` Greg KH
2021-07-23 17:35 ` Luis Chamberlain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=202106231555.871D23D50@keescook \
--to=keescook@chromium.org \
--cc=alobakin@pm.me \
--cc=andriin@fb.com \
--cc=ap420073@gmail.com \
--cc=ast@kernel.org \
--cc=atenart@kernel.org \
--cc=axboe@kernel.dk \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=gregkh@linuxfoundation.org \
--cc=jeyu@kernel.org \
--cc=jikos@kernel.org \
--cc=jpoimboe@redhat.com \
--cc=kuba@kernel.org \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mbenes@suse.com \
--cc=mcgrof@kernel.org \
--cc=minchan@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=ngupta@vflare.org \
--cc=peterz@infradead.org \
--cc=rafael@kernel.org \
--cc=rostedt@goodmis.org \
--cc=sergey.senozhatsky.work@gmail.com \
--cc=tglx@linutronix.de \
--cc=weiwan@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).