linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4] sysfs: fix kobject refcount to address races with kobject removal
@ 2021-06-23 21:50 Luis Chamberlain
  2021-06-23 22:59 ` Kees Cook
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Luis Chamberlain @ 2021-06-23 21:50 UTC (permalink / raw)
  To: gregkh, rafael, davem, kuba, ast, andriin, daniel, atenart,
	alobakin, weiwan, ap420073
  Cc: jeyu, ngupta, sergey.senozhatsky.work, minchan, mcgrof, axboe,
	mbenes, jpoimboe, tglx, keescook, jikos, rostedt, peterz,
	linux-block, netdev, linux-kernel

It's possible today to have a device attribute read or store
race against device removal. This is known to happen as follows:

write system call -->
  ksys_write () -->
    vfs_write() -->
      __vfs_write() -->
        kernfs_fop_write_iter() -->
          sysfs_kf_write() -->
            dev_attr_store() -->
              null reference

This happens because the dev_attr->store() callback can be
removed prior to its call, after dev_attr_store() was initiated.
The null dereference is possible because the sysfs ops can be
removed on module removal, for instance, when device_del() is
called, and a sysfs read / store is not doing any kobject reference
bumps either. This allows a read/store call to initiate, a
device_del() to kick off, and then the read/store call can be
gone by the time to execute it.

The sysfs filesystem is not doing any kobject reference bumps during a
read / store ops to prevent this.

To fix this in a simplified way, just bump the kobject reference when
we create a directory and remove it on directory removal.

The big unfortunate eye-sore is addressing the manual kobject reference
assumption on the networking code, which leads me to believe we should
end up replacing that eventually with another sort of check.

Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---

This v4 moves to fixing the race condition on dev_attr_store() and
dev_attr_read() to sysfs by bumping the kobject reference count
on directory creation / deletion as suggested by Greg.

Unfortunately at least the networking core has a manual refcount
assumption, which needs to be adjusted to account for this change.
This should also mean there is runtime for other kobjects which may
not be explored yet which may need fixing as well. We may want to
change the check to something else on the networking front, but its
not clear to me yet what to use.

 fs/sysfs/dir.c | 3 +++
 net/core/dev.c | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 59dffd5ca517..6c47aa4af6f5 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -56,12 +56,14 @@ int sysfs_create_dir_ns(struct kobject *kobj, const void *ns)
 
 	kobject_get_ownership(kobj, &uid, &gid);
 
+	kobject_get(kobj);
 	kn = kernfs_create_dir_ns(parent, kobject_name(kobj),
 				  S_IRWXU | S_IRUGO | S_IXUGO, uid, gid,
 				  kobj, ns);
 	if (IS_ERR(kn)) {
 		if (PTR_ERR(kn) == -EEXIST)
 			sysfs_warn_dup(parent, kobject_name(kobj));
+		kobject_put(kobj);
 		return PTR_ERR(kn);
 	}
 
@@ -100,6 +102,7 @@ void sysfs_remove_dir(struct kobject *kobj)
 	if (kn) {
 		WARN_ON_ONCE(kernfs_type(kn) != KERNFS_DIR);
 		kernfs_remove(kn);
+		kobject_put(kobj);
 	}
 }
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 222b1d322c96..3a0ffa603d14 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10429,7 +10429,7 @@ static void netdev_wait_allrefs(struct net_device *dev)
 	rebroadcast_time = warning_time = jiffies;
 	refcnt = netdev_refcnt_read(dev);
 
-	while (refcnt != 1) {
+	while (refcnt != 3) {
 		if (time_after(jiffies, rebroadcast_time + 1 * HZ)) {
 			rtnl_lock();
 
@@ -10544,7 +10544,7 @@ void netdev_run_todo(void)
 		netdev_wait_allrefs(dev);
 
 		/* paranoia */
-		BUG_ON(netdev_refcnt_read(dev) != 1);
+		BUG_ON(netdev_refcnt_read(dev) != 3);
 		BUG_ON(!list_empty(&dev->ptype_all));
 		BUG_ON(!list_empty(&dev->ptype_specific));
 		WARN_ON(rcu_access_pointer(dev->ip_ptr));
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-07-23 17:36 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-23 21:50 [PATCH v4] sysfs: fix kobject refcount to address races with kobject removal Luis Chamberlain
2021-06-23 22:59 ` Kees Cook
2021-06-24  1:09   ` Luis Chamberlain
2021-06-24 11:06   ` Greg KH
2021-06-24 11:09 ` Greg KH
2021-06-25 21:55   ` Luis Chamberlain
2021-07-01 22:48     ` Luis Chamberlain
2021-07-02  1:04       ` Luis Chamberlain
2021-07-21 11:30       ` Greg KH
2021-07-22 21:31         ` Luis Chamberlain
2021-07-23 11:14           ` Greg KH
2021-07-23 17:35             ` Luis Chamberlain
2021-07-01  2:27 ` [sysfs] 1c04296f8f: suspend-stress.fail kernel test robot
2021-07-01 15:59   ` Luis Chamberlain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).