linux-kernel.vger.kernel.org archive mirror
* [PATCH] driver core: Fix use-after-free and double free on glue directory
@ 2019-04-23 14:32 Muchun Song
  2019-04-25  9:24 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Muchun Song @ 2019-04-23 14:32 UTC (permalink / raw)
  To: gregkh, rafael, benh; +Cc: linux-kernel

There is a race condition between removing a glue directory and adding a new
device under that glue directory. It can be reproduced with the following test:

path 1: Add the child device under glue dir
device_add()
    get_device_parent()
        mutex_lock(&gdp_mutex);
        ....
        /*find parent from glue_dirs.list*/
        list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry)
            if (k->parent == parent_kobj) {
                kobj = kobject_get(k);
                break;
            }
        ....
        mutex_unlock(&gdp_mutex);
        ....
    ....
    kobject_add()
        kobject_add_internal()
            create_dir()
                sysfs_create_dir_ns()
                    if (kobj->parent)
                        parent = kobj->parent->sd;
                    ....
                    kernfs_create_dir_ns(parent)
                        kernfs_new_node()
                            kernfs_get(parent)
                        ....
                        /* link in */
                        rc = kernfs_add_one(kn);
                        if (!rc)
                            return kn;

                        kernfs_put(kn)
                            ....
                            repeat:
                            kmem_cache_free(kn)
                            kn = parent;

                            if (kn) {
                                if (atomic_dec_and_test(&kn->count))
                                    goto repeat;
                            }
                        ....

path2: Remove last child device under glue dir
device_del()
    cleanup_device_parent()
        cleanup_glue_dir()
            mutex_lock(&gdp_mutex);
            if (!kobject_has_children(glue_dir))
                kobject_del(glue_dir);
            kobject_put(glue_dir);
            mutex_unlock(&gdp_mutex);

Suppose path1 adds a new device under the glue dir before path2 removes the
last child device. The reference count of the glue_dir kobject is increased
to 2 via kobject_get(k) in get_device_parent(). Path1 then enters
kernfs_new_node() but has not yet called kernfs_get(parent). Meanwhile,
path2 calls kobject_del(glue_dir) because kobject_has_children() returns 0.
As a result, glue_dir->sd is freed and its reference count drops to 0. When
path1 then calls kernfs_get(parent), it triggers the warning in kernfs_get()
(WARN_ON(!atomic_read(&kn->count))) and increases the reference count to 1.
Because glue_dir->sd has already been freed by path2, the subsequent
kernfs_add_one() call in path1 fails (this is also a use-after-free), and
kernfs_put() calls atomic_dec_and_test() to drop the reference. Because the
reference count is decremented to 0, kmem_cache_free() is called to free
glue_dir->sd again, which results in a double free.

To avoid this, we should not call kobject_del() on path2 when the reference
count of glue_dir is greater than 1. So we add a conditional statement to
fix it.
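
The reference-count sequence described above can be replayed in a small
userspace sketch. This is only an illustration under simplifying assumptions:
struct toy_node, toy_get(), toy_put() and replay_race() are hypothetical names,
the node counts frees instead of releasing memory, and the interleaving is
fixed in program order rather than produced by two real CPUs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a kernfs node: counts frees instead of freeing memory. */
struct toy_node {
	int count;	/* reference count, like kn->count */
	int frees;	/* how many times the node was "freed" */
	bool warned;	/* set when a get sees a zero count */
};

static void toy_get(struct toy_node *n)
{
	if (n->count == 0)
		n->warned = true;	/* mirrors WARN_ON(!atomic_read(&kn->count)) */
	n->count++;
}

static void toy_put(struct toy_node *n)
{
	if (--n->count == 0)
		n->frees++;		/* mirrors kmem_cache_free(kn) */
}

/*
 * Replay the racy interleaving in a fixed order and return how many times
 * the glue directory's node ends up being freed.
 */
static int replay_race(struct toy_node *sd)
{
	sd->count = 1;		/* glue_dir->sd holds one reference */
	sd->frees = 0;
	sd->warned = false;

	/* path2: kobject_del(glue_dir) drops the last reference */
	toy_put(sd);
	/* path1: kernfs_new_node() takes a reference on the stale parent */
	toy_get(sd);
	/* path1: kernfs_add_one() fails, so kernfs_put() drops it again */
	toy_put(sd);

	return sd->frees;
}
```

Calling replay_race() on a fresh struct toy_node reports two frees and sets the
warned flag, which corresponds to the WARN_ON and the double free in the traces.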

The following calltrace is captured in kernel 4.14 with the following patch
applied:

commit 726e41097920 ("drivers: core: Remove glue dirs from sysfs earlier")

--------------------------------------------------------------------------
[    3.633703] WARNING: CPU: 4 PID: 513 at .../fs/kernfs/dir.c:494
                Here is WARN_ON(!atomic_read(&kn->count)) in kernfs_get().
....
[    3.633986] Call trace:
[    3.633991]  kernfs_create_dir_ns+0xa8/0xb0
[    3.633994]  sysfs_create_dir_ns+0x54/0xe8
[    3.634001]  kobject_add_internal+0x22c/0x3f0
[    3.634005]  kobject_add+0xe4/0x118
[    3.634011]  device_add+0x200/0x870
[    3.634017]  _request_firmware+0x958/0xc38
[    3.634020]  request_firmware_into_buf+0x4c/0x70
....
[    3.634064] kernel BUG at .../mm/slub.c:294!
                Here is BUG_ON(object == fp) in set_freepointer().
....
[    3.634346] Call trace:
[    3.634351]  kmem_cache_free+0x504/0x6b8
[    3.634355]  kernfs_put+0x14c/0x1d8
[    3.634359]  kernfs_create_dir_ns+0x88/0xb0
[    3.634362]  sysfs_create_dir_ns+0x54/0xe8
[    3.634366]  kobject_add_internal+0x22c/0x3f0
[    3.634370]  kobject_add+0xe4/0x118
[    3.634374]  device_add+0x200/0x870
[    3.634378]  _request_firmware+0x958/0xc38
[    3.634381]  request_firmware_into_buf+0x4c/0x70
--------------------------------------------------------------------------

Fixes: 726e41097920 ("drivers: core: Remove glue dirs from sysfs earlier")

Signed-off-by: Muchun Song <smuchun@gmail.com>
---
 drivers/base/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 4aeaa0c92bda..5ac5376ae9af 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1825,7 +1825,7 @@ static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
 		return;
 
 	mutex_lock(&gdp_mutex);
-	if (!kobject_has_children(glue_dir))
+	if (!kobject_has_children(glue_dir) && kref_read(&glue_dir->kref) == 1)
 		kobject_del(glue_dir);
 	kobject_put(glue_dir);
 	mutex_unlock(&gdp_mutex);
-- 
2.17.1



* Re: [PATCH] driver core: Fix use-after-free and double free on glue directory
  2019-04-23 14:32 [PATCH] driver core: Fix use-after-free and double free on glue directory Muchun Song
@ 2019-04-25  9:24 ` Benjamin Herrenschmidt
  2019-04-25 15:44   ` Muchun Song
  0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2019-04-25  9:24 UTC (permalink / raw)
  To: Muchun Song, gregkh, rafael; +Cc: linux-kernel

On Tue, 2019-04-23 at 22:32 +0800, Muchun Song wrote:
> There is a race condition between removing glue directory and adding a new
> device under the glue directory. It can be reproduced in following test:
> 

 .../...

> In order to avoid this happening, we should not call kobject_del() on
> path2 when the reference count of glue_dir is greater than 1. So we add a
> conditional statement to fix it.

Good catch ! However I'm not completely happy about the fix you
propose.

I find relying on the object count for such decisions rather fragile as
it could be taken temporarily for other reasons, couldn't it ? In which
case we would just fail...
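
The concern above can be illustrated with a toy model of the proposed
condition. This is a sketch under assumptions, not kernel code: struct toy_dir
and both functions are hypothetical, and the "extra" reference stands for any
holder other than a device being added. Whether such a holder exists in the
kernel is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the glue directory kobject (names are hypothetical). */
struct toy_dir {
	int refcount;	/* like kref_read(&glue_dir->kref) */
	int children;	/* registered children */
	bool in_sysfs;	/* cleared by the toy equivalent of kobject_del() */
};

/*
 * Mirrors the proposed condition: delete only when the directory is empty
 * and the caller holds the only reference. Returns true if it deleted.
 */
static bool toy_cleanup_glue_dir(struct toy_dir *d)
{
	bool deleted = false;

	assert(d->refcount >= 1);
	if (d->children == 0 && d->refcount == 1) {
		d->in_sysfs = false;
		deleted = true;
	}
	d->refcount--;		/* kobject_put(glue_dir) */
	return deleted;
}

/* Run the cleanup while 'extra_refs' unrelated references are held. */
static bool cleanup_deletes_with_extra_refs(int extra_refs)
{
	struct toy_dir d = { .refcount = 1 + extra_refs, .children = 0,
			     .in_sysfs = true };

	return toy_cleanup_glue_dir(&d);
}
```

With no extra reference the cleanup deletes the directory; with one temporary
reference it silently skips the deletion, even though no new child is coming.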

Ideally, the looking up of the glue dir and creation of its child
should be protected by the same lock instance (the gdp_mutex in that
case).

That might require a bit of shuffling around though.

Greg, thoughts ? This whole gluedir business is annoyingly racy still.

My gut feeling is that the "right fix" is to ensure the lookup of the
glue dir and creation of the child object(s) are done under a single
instance of gdp_mutex so we never see a stale "empty" but still
potentially used glue dir around.

This should also be true when creating such gluedir in the first place
in fact, though that race is a lot harder to hit.
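
The difference between one and two critical sections can be sketched with a
deterministic toy model. Everything here is an assumption made for
illustration: struct toy_glue, remover_tries_to_run() and add_child() are
hypothetical, the lock is a plain flag, and the "other CPU" is simulated by a
function call placed in the window between lookup and creation.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_glue {
	bool lock_held;	/* stands in for gdp_mutex */
	bool deleted;	/* set once the remover has torn the directory down */
	int children;
};

/* The remover can only make progress while the lock is free. */
static void remover_tries_to_run(struct toy_glue *g)
{
	if (g->lock_held)
		return;		/* would block on the mutex */
	if (g->children == 0)
		g->deleted = true;
}

/*
 * Add one child. Returns true if the child ended up under a directory that
 * was deleted in the window between lookup and creation.
 */
static bool add_child(struct toy_glue *g, bool single_critical_section)
{
	bool stale;

	g->lock_held = true;		/* lookup under the lock */
	if (!single_critical_section)
		g->lock_held = false;	/* unlock before adding the child */

	remover_tries_to_run(g);	/* other CPU runs in the window */

	stale = g->deleted;
	g->children++;			/* creation of the child */
	g->lock_held = false;
	return stale;
}
```

With single_critical_section set to false the child lands under a deleted
directory; with it set to true the remover cannot run inside the window.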

Cheers,
Ben.




* Re: [PATCH] driver core: Fix use-after-free and double free on glue directory
  2019-04-25  9:24 ` Benjamin Herrenschmidt
@ 2019-04-25 15:44   ` Muchun Song
  2019-04-28 10:10     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Muchun Song @ 2019-04-25 15:44 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: gregkh, rafael, linux-kernel, zhaowuyun

Hi Ben,

On Thu, Apr 25, 2019 at 5:24 PM, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
> On Tue, 2019-04-23 at 22:32 +0800, Muchun Song wrote:
> > There is a race condition between removing glue directory and adding a new
> > device under the glue directory. It can be reproduced in following test:
> >
>
>  .../...
>
> > In order to avoid this happening, we should not call kobject_del() on
> > path2 when the reference count of glue_dir is greater than 1. So we add a
> > conditional statement to fix it.
>
> Good catch ! However I'm not completely happy about the fix you
> propose.
>
> I find relying on the object count for such decisions rather fragile as
> it could be taken temporarily for other reasons, couldn't it ? In which
> case we would just fail...

For what other reasons could the reference be taken temporarily?
I cannot figure out which case would lead to this.

>
> Ideally, the looking up of the glue dir and creation of its child
> should be protected by the same lock instance (the gdp_mutex in that
> case).
>
> That might require a bit of shuffling around though.
>
> Greg, thoughts ? This whole gluedir business is annoyingly racy still.
>
> My gut feeling is that the "right fix" is to ensure the lookup of the
> glue dir and creation of the child object(s) are done under a single
> instance of gdp_mutex so we never see a stale "empty" but still
> potentially used glue dir around.

I agree with you that the lookup of the glue dir and the creation of its child
should be protected by the same gdp_mutex critical section. So, do you agree
with the fix in the following code snippet?

--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1740,8 +1740,11 @@ class_dir_create_and_add(struct class *class, struct kobject *parent_kobj)
 static DEFINE_MUTEX(gdp_mutex);

 static struct kobject *get_device_parent(struct device *dev,
-                                        struct device *parent)
+                                        struct device *parent,
+                                        bool *locked)
 {
+       *locked = false;
+
        if (dev->class) {
                struct kobject *kobj = NULL;
                struct kobject *parent_kobj;
@@ -1779,7 +1782,7 @@ static struct kobject *get_device_parent(struct device *dev,
                        }
                spin_unlock(&dev->class->p->glue_dirs.list_lock);
                if (kobj) {
-                       mutex_unlock(&gdp_mutex);
+                       *locked = true;
                        return kobj;
                }

@@ -2007,6 +2010,7 @@ int device_add(struct device *dev)
        struct class_interface *class_intf;
        int error = -EINVAL;
        struct kobject *glue_dir = NULL;
+       bool locked;

        dev = get_device(dev);
        if (!dev)
@@ -2040,7 +2044,7 @@ int device_add(struct device *dev)
        pr_debug("device: '%s': %s\n", dev_name(dev), __func__);

        parent = get_device(dev->parent);
-       kobj = get_device_parent(dev, parent);
+       kobj = get_device_parent(dev, parent, &locked);
        if (IS_ERR(kobj)) {
                error = PTR_ERR(kobj);
                goto parent_error;
@@ -2057,9 +2061,14 @@ int device_add(struct device *dev)
        error = kobject_add(&dev->kobj, dev->kobj.parent, NULL);
        if (error) {
                glue_dir = get_glue_dir(dev);
+               if (locked)
+                       mutex_unlock(&gdp_mutex);
                goto Error;
        }

+       if (locked)
+               mutex_unlock(&gdp_mutex);
+
        /* notify platform of device entry */
        error = device_platform_notify(dev, KOBJ_ADD);
        if (error)

Yours
Muchun


* Re: [PATCH] driver core: Fix use-after-free and double free on glue directory
  2019-04-25 15:44   ` Muchun Song
@ 2019-04-28 10:10     ` Benjamin Herrenschmidt
  2019-04-28 14:49       ` Muchun Song
  0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2019-04-28 10:10 UTC (permalink / raw)
  To: Muchun Song; +Cc: gregkh, rafael, linux-kernel, zhaowuyun

On Thu, 2019-04-25 at 23:44 +0800, Muchun Song wrote:
> I agree with you that the looking up of the glue dir and creation of its child
> should be protected by the same lock of gdp_mutex. So, do you agree with
> the fix of the following code snippet?

The basic idea yes, the whole bool *locked is horrid though. Wouldn't it
work to have a get_device_parent_locked that always returns with the mutex held,
or just move the mutex to the caller or something simpler like this ?
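
The "always returns with the mutex held" alternative can be sketched as
follows. This is a userspace illustration under assumptions: struct toy_mutex,
struct toy_parent, lookup_parent_locked() and add_under_parent() are
hypothetical names, and the toy lock only asserts correct pairing instead of
blocking.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock that only checks pairing; it does not block. */
struct toy_mutex { int held; };

static void toy_lock(struct toy_mutex *m)
{
	assert(!m->held);
	m->held = 1;
}

static void toy_unlock(struct toy_mutex *m)
{
	assert(m->held);
	m->held = 0;
}

struct toy_parent { int children; };

/*
 * Hypothetical "_locked" variant: it returns with the mutex held on every
 * path, including when no parent is found.
 */
static struct toy_parent *lookup_parent_locked(struct toy_mutex *m,
					       struct toy_parent *candidate)
{
	toy_lock(m);
	return candidate;	/* may be NULL; the lock is held either way */
}

static int add_under_parent(struct toy_mutex *m, struct toy_parent *candidate)
{
	struct toy_parent *p = lookup_parent_locked(m, candidate);
	int ret = -1;

	if (p) {
		p->children++;	/* creation happens under the same lock */
		ret = 0;
	}
	toy_unlock(m);		/* unconditional, so no "locked" flag is needed */
	return ret;
}
```

Because the helper's contract is the same on every return path, the caller
unlocks unconditionally and no bool *locked has to be threaded through.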

Ben.




* Re: [PATCH] driver core: Fix use-after-free and double free on glue directory
  2019-04-28 10:10     ` Benjamin Herrenschmidt
@ 2019-04-28 14:49       ` Muchun Song
  2019-05-02  6:25         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Muchun Song @ 2019-04-28 14:49 UTC (permalink / raw)
  To: gregkh, rafael; +Cc: Benjamin Herrenschmidt, linux-kernel, zhaowuyun

Hi Greg and Rafael:


On Sun, Apr 28, 2019 at 6:10 PM, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
> The basic idea yes, the whole bool *locked is horrid though. Wouldn't it
> work to have a get_device_parent_locked that always returns with the mutex held,
> or just move the mutex to the caller or something simpler like this ?
>

Greg and Rafael, do you have any suggestions for this? Or do you also
agree with Ben?

Yours,
Muchun


* Re: [PATCH] driver core: Fix use-after-free and double free on glue directory
  2019-04-28 14:49       ` Muchun Song
@ 2019-05-02  6:25         ` Benjamin Herrenschmidt
  2019-05-04 14:47           ` Muchun Song
  0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2019-05-02  6:25 UTC (permalink / raw)
  To: Muchun Song, gregkh, rafael; +Cc: linux-kernel, zhaowuyun

On Sun, 2019-04-28 at 22:49 +0800, Muchun Song wrote:
> Hi Greg and Rafael:
> 
> 
> On Sun, Apr 28, 2019 at 6:10 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > 
> > The basic idea yes, the whole bool *locked is horrid though.
> > Wouldn't it
> > work to have a get_device_parent_locked that always returns with
> > the mutex held,
> > or just move the mutex to the caller or something simpler like this
> > ?
> > 
> 
> Greg and Rafael, do you have any suggestions for this? Or you also
> agree with Ben?

Ping guys ? This is worth fixing... 

Ben.



* Re: [PATCH] driver core: Fix use-after-free and double free on glue directory
  2019-05-02  6:25         ` Benjamin Herrenschmidt
@ 2019-05-04 14:47           ` Muchun Song
  2019-05-04 15:34             ` Greg KH
  2019-05-14 10:56             ` Mukesh Ojha
  0 siblings, 2 replies; 14+ messages in thread
From: Muchun Song @ 2019-05-04 14:47 UTC (permalink / raw)
  To: gregkh, rafael; +Cc: Benjamin Herrenschmidt, linux-kernel, zhaowuyun

[-- Attachment #1: Type: text/plain, Size: 4382 bytes --]

On Thu, May 2, 2019 at 2:25 PM, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> > > The basic idea yes, the whole bool *locked is horrid though.
> > > Wouldn't it
> > > work to have a get_device_parent_locked that always returns with
> > > the mutex held,
> > > or just move the mutex to the caller or something simpler like this
> > > ?
> > >
> >
> > Greg and Rafael, do you have any suggestions for this? Or you also
> > agree with Ben?
>
> Ping guys ? This is worth fixing...

I agree with you, but Greg and Rafael seem to have high latency right now.

Based on your suggestions, I think introducing get_device_parent_locked() may
be the easiest fix. So, do you agree with the fix in the following code
snippet (it is also attached)?

I introduce a new function named get_device_parent_locked_if_glue_dir(), which
returns with the mutex held only when we live in a glue dir. The caller must
then call unlock_if_glue_dir() to release the mutex, so
get_device_parent_locked_if_glue_dir() and unlock_if_glue_dir() should always
be called in pairs.
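
The pairing rule can be sketched in userspace as below. This is a simplified
illustration, not the patch itself: all toy_* names are hypothetical, the
mutex is a counter, and the parent_is_glue_dir field stands in for the
live_in_glue_dir() check. The point is that both sides must evaluate the same
predicate for the lock to stay balanced.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_mutex { int held; };

/* parent_is_glue_dir stands in for the result of live_in_glue_dir(). */
struct toy_dev { bool parent_is_glue_dir; };

static bool toy_in_glue_dir(const struct toy_dev *dev)
{
	return dev->parent_is_glue_dir;
}

/* Takes the lock only when the parent is a glue directory. */
static void toy_get_parent_locked_if_glue_dir(struct toy_mutex *m,
					      const struct toy_dev *dev)
{
	if (toy_in_glue_dir(dev))
		m->held++;
}

/* Releases the lock under exactly the same condition. */
static void toy_unlock_if_glue_dir(struct toy_mutex *m,
				   const struct toy_dev *dev)
{
	if (toy_in_glue_dir(dev))
		m->held--;
}

/* Returns how many times the lock was held while the child was added. */
static int toy_device_add(struct toy_mutex *m, const struct toy_dev *dev)
{
	int held_during_add;

	toy_get_parent_locked_if_glue_dir(m, dev);
	held_during_add = m->held;	/* the child would be added here */
	toy_unlock_if_glue_dir(m, dev);
	return held_during_add;
}
```

For a device under a glue directory the lock is held across the add and
released afterwards; for any other device it is never taken.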

---
drivers/base/core.c | 44 ++++++++++++++++++++++++++++++++++++--------
1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 4aeaa0c92bda..5112755c43fa 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1739,8 +1739,9 @@ class_dir_create_and_add(struct class *class, struct kobject *parent_kobj)
static DEFINE_MUTEX(gdp_mutex);
-static struct kobject *get_device_parent(struct device *dev,
-                    struct device *parent)
+static struct kobject *__get_device_parent(struct device *dev,
+                    struct device *parent,
+                    bool lock)
{
   if (dev->class) {
       struct kobject *kobj = NULL;
@@ -1779,14 +1780,16 @@ static struct kobject *get_device_parent(struct device *dev,
           }
       spin_unlock(&dev->class->p->glue_dirs.list_lock);
       if (kobj) {
-           mutex_unlock(&gdp_mutex);
+           if (!lock)
+               mutex_unlock(&gdp_mutex);
           return kobj;
       }
       /* or create a new class-directory at the parent device */
       k = class_dir_create_and_add(dev->class, parent_kobj);
       /* do not emit an uevent for this simple "glue" directory */
-       mutex_unlock(&gdp_mutex);
+       if (!lock)
+           mutex_unlock(&gdp_mutex);
       return k;
   }
@@ -1799,6 +1802,19 @@ static struct kobject *get_device_parent(struct device *dev,
   return NULL;
}
+static inline struct kobject *get_device_parent(struct device *dev,
+                       struct device *parent)
+{
+   return __get_device_parent(dev, parent, false);
+}
+
+static inline struct kobject *
+get_device_parent_locked_if_glue_dir(struct device *dev,
+                struct device *parent)
+{
+   return __get_device_parent(dev, parent, true);
+}
+
static inline bool live_in_glue_dir(struct kobject *kobj,
                struct device *dev)
{
@@ -1831,6 +1847,16 @@ static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
   mutex_unlock(&gdp_mutex);
}
+static inline void unlock_if_glue_dir(struct device *dev,
+                struct kobject *glue_dir)
+{
+   /* see if we live in a "glue" directory */
+   if (!live_in_glue_dir(glue_dir, dev))
+       return;
+
+   mutex_unlock(&gdp_mutex);
+}
+
static int device_add_class_symlinks(struct device *dev)
{
   struct device_node *of_node = dev_of_node(dev);
@@ -2040,7 +2066,7 @@ int device_add(struct device *dev)
   pr_debug("device: '%s': %s\n", dev_name(dev), __func__);
   parent = get_device(dev->parent);
-   kobj = get_device_parent(dev, parent);
+   kobj = get_device_parent_locked_if_glue_dir(dev, parent);
   if (IS_ERR(kobj)) {
       error = PTR_ERR(kobj);
       goto parent_error;
@@ -2055,10 +2081,12 @@ int device_add(struct device *dev)
   /* first, register with generic layer. */
   /* we require the name to be set before, and pass NULL */
   error = kobject_add(&dev->kobj, dev->kobj.parent, NULL);
-   if (error) {
-       glue_dir = get_glue_dir(dev);
+
+   glue_dir = get_glue_dir(dev);
+   unlock_if_glue_dir(dev, glue_dir);
+
+   if (error)
       goto Error;
-   }
   /* notify platform of device entry */
   error = device_platform_notify(dev, KOBJ_ADD);
--

[-- Attachment #2: 0001-driver-core-Fix-use-after-free-and-double-free-on-gl.patch --]
[-- Type: application/octet-stream, Size: 7681 bytes --]

From 091a0800664201c1719b2aa1bcd965f9578ee13e Mon Sep 17 00:00:00 2001
From: Muchun Song <smuchun@gmail.com>
Date: Sun, 28 Apr 2019 23:31:06 +0800
Subject: [PATCH] driver core: Fix use-after-free and double free on glue
 directory

There is a race condition between removing a glue directory and adding a new
device under that glue directory. It can be reproduced with the following test:

path 1: Add the child device under glue dir
device_add()
    get_device_parent()
        mutex_lock(&gdp_mutex);
        ....
        /*find parent from glue_dirs.list*/
        list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry)
            if (k->parent == parent_kobj) {
                kobj = kobject_get(k);
                break;
            }
        ....
        mutex_unlock(&gdp_mutex);
        ....
    ....
    kobject_add()
        kobject_add_internal()
            create_dir()
                sysfs_create_dir_ns()
                    if (kobj->parent)
                        parent = kobj->parent->sd;
                    ....
                    kernfs_create_dir_ns(parent)
                        kernfs_new_node()
                            kernfs_get(parent)
                        ....
                        /* link in */
                        rc = kernfs_add_one(kn);
                        if (!rc)
                            return kn;

                        kernfs_put(kn)
                            ....
                            repeat:
                            kmem_cache_free(kn)
                            kn = parent;

                            if (kn) {
                                if (atomic_dec_and_test(&kn->count))
                                    goto repeat;
                            }
                        ....

path2: Remove last child device under glue dir
device_del()
    cleanup_device_parent()
        cleanup_glue_dir()
            mutex_lock(&gdp_mutex);
            if (!kobject_has_children(glue_dir))
                kobject_del(glue_dir);
            kobject_put(glue_dir);
            mutex_unlock(&gdp_mutex);

Suppose path1 adds a new device under the glue dir before path2 removes the
last child device. The reference count of the glue_dir kobject is increased
to 2 via kobject_get(k) in get_device_parent(). Path1 then enters
kernfs_new_node() but has not yet called kernfs_get(parent). Meanwhile,
path2 calls kobject_del(glue_dir) because kobject_has_children() returns 0.
As a result, glue_dir->sd is freed and its reference count drops to 0. When
path1 then calls kernfs_get(parent), it triggers the warning in kernfs_get()
(WARN_ON(!atomic_read(&kn->count))) and increases the reference count to 1.
Because glue_dir->sd has already been freed by path2, the subsequent
kernfs_add_one() call in path1 fails (this is also a use-after-free), and
kernfs_put() calls atomic_dec_and_test() to drop the reference. Because the
reference count is decremented to 0, kmem_cache_free() is called to free
glue_dir->sd again, which results in a double free.

To avoid this, we should not call kobject_del() on path2 when the reference
count of glue_dir is greater than 1. So we add a conditional statement to
fix it.

The following calltrace is captured in kernel 4.14 with the following patch
applied:

commit 726e41097920 ("drivers: core: Remove glue dirs from sysfs earlier")

--------------------------------------------------------------------------
[    3.633703] WARNING: CPU: 4 PID: 513 at .../fs/kernfs/dir.c:494
                Here is WARN_ON(!atomic_read(&kn->count)) in kernfs_get().
....
[    3.633986] Call trace:
[    3.633991]  kernfs_create_dir_ns+0xa8/0xb0
[    3.633994]  sysfs_create_dir_ns+0x54/0xe8
[    3.634001]  kobject_add_internal+0x22c/0x3f0
[    3.634005]  kobject_add+0xe4/0x118
[    3.634011]  device_add+0x200/0x870
[    3.634017]  _request_firmware+0x958/0xc38
[    3.634020]  request_firmware_into_buf+0x4c/0x70
....
[    3.634064] kernel BUG at .../mm/slub.c:294!
                Here is BUG_ON(object == fp) in set_freepointer().
....
[    3.634346] Call trace:
[    3.634351]  kmem_cache_free+0x504/0x6b8
[    3.634355]  kernfs_put+0x14c/0x1d8
[    3.634359]  kernfs_create_dir_ns+0x88/0xb0
[    3.634362]  sysfs_create_dir_ns+0x54/0xe8
[    3.634366]  kobject_add_internal+0x22c/0x3f0
[    3.634370]  kobject_add+0xe4/0x118
[    3.634374]  device_add+0x200/0x870
[    3.634378]  _request_firmware+0x958/0xc38
[    3.634381]  request_firmware_into_buf+0x4c/0x70
--------------------------------------------------------------------------

Fixes: 726e41097920 ("drivers: core: Remove glue dirs from sysfs earlier")

Signed-off-by: Muchun Song <smuchun@gmail.com>
---
 drivers/base/core.c | 44 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 4aeaa0c92bda..5112755c43fa 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1739,8 +1739,9 @@ class_dir_create_and_add(struct class *class, struct kobject *parent_kobj)
 
 static DEFINE_MUTEX(gdp_mutex);
 
-static struct kobject *get_device_parent(struct device *dev,
-					 struct device *parent)
+static struct kobject *__get_device_parent(struct device *dev,
+					   struct device *parent,
+					   bool lock)
 {
 	if (dev->class) {
 		struct kobject *kobj = NULL;
@@ -1779,14 +1780,16 @@ static struct kobject *get_device_parent(struct device *dev,
 			}
 		spin_unlock(&dev->class->p->glue_dirs.list_lock);
 		if (kobj) {
-			mutex_unlock(&gdp_mutex);
+			if (!lock)
+				mutex_unlock(&gdp_mutex);
 			return kobj;
 		}
 
 		/* or create a new class-directory at the parent device */
 		k = class_dir_create_and_add(dev->class, parent_kobj);
 		/* do not emit an uevent for this simple "glue" directory */
-		mutex_unlock(&gdp_mutex);
+		if (!lock)
+			mutex_unlock(&gdp_mutex);
 		return k;
 	}
 
@@ -1799,6 +1802,19 @@ static struct kobject *get_device_parent(struct device *dev,
 	return NULL;
 }
 
+static inline struct kobject *get_device_parent(struct device *dev,
+						struct device *parent)
+{
+	return __get_device_parent(dev, parent, false);
+}
+
+static inline struct kobject *
+get_device_parent_locked_if_glue_dir(struct device *dev,
+				     struct device *parent)
+{
+	return __get_device_parent(dev, parent, true);
+}
+
 static inline bool live_in_glue_dir(struct kobject *kobj,
 				    struct device *dev)
 {
@@ -1831,6 +1847,16 @@ static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
 	mutex_unlock(&gdp_mutex);
 }
 
+static inline void unlock_if_glue_dir(struct device *dev,
+				      struct kobject *glue_dir)
+{
+	/* see if we live in a "glue" directory */
+	if (!live_in_glue_dir(glue_dir, dev))
+		return;
+
+	mutex_unlock(&gdp_mutex);
+}
+
 static int device_add_class_symlinks(struct device *dev)
 {
 	struct device_node *of_node = dev_of_node(dev);
@@ -2040,7 +2066,7 @@ int device_add(struct device *dev)
 	pr_debug("device: '%s': %s\n", dev_name(dev), __func__);
 
 	parent = get_device(dev->parent);
-	kobj = get_device_parent(dev, parent);
+	kobj = get_device_parent_locked_if_glue_dir(dev, parent);
 	if (IS_ERR(kobj)) {
 		error = PTR_ERR(kobj);
 		goto parent_error;
@@ -2055,10 +2081,12 @@ int device_add(struct device *dev)
 	/* first, register with generic layer. */
 	/* we require the name to be set before, and pass NULL */
 	error = kobject_add(&dev->kobj, dev->kobj.parent, NULL);
-	if (error) {
-		glue_dir = get_glue_dir(dev);
+
+	glue_dir = get_glue_dir(dev);
+	unlock_if_glue_dir(dev, glue_dir);
+
+	if (error)
 		goto Error;
-	}
 
 	/* notify platform of device entry */
 	error = device_platform_notify(dev, KOBJ_ADD);
-- 
2.17.1



* Re: [PATCH] driver core: Fix use-after-free and double free on glue directory
  2019-05-04 14:47           ` Muchun Song
@ 2019-05-04 15:34             ` Greg KH
  2019-05-09 14:38               ` Gaurav Kohli
  2019-05-14 10:56             ` Mukesh Ojha
  1 sibling, 1 reply; 14+ messages in thread
From: Greg KH @ 2019-05-04 15:34 UTC (permalink / raw)
  To: Muchun Song; +Cc: rafael, Benjamin Herrenschmidt, linux-kernel, zhaowuyun

On Sat, May 04, 2019 at 10:47:07PM +0800, Muchun Song wrote:
> On Thu, May 2, 2019 at 2:25 PM, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
> > > > The basic idea yes, the whole bool *locked is horrid though.
> > > > Wouldn't it
> > > > work to have a get_device_parent_locked that always returns with
> > > > the mutex held,
> > > > or just move the mutex to the caller or something simpler like this
> > > > ?
> > > >
> > >
> > > Greg and Rafael, do you have any suggestions for this? Or you also
> > > agree with Ben?
> >
> > Ping guys ? This is worth fixing...
> 
> I also agree with you. But Greg and Rafael seem to be high latency right now.

It's in my list of patches to get to, sorry, hopefully will dig out of
that next week with the buffer that the merge window provides me.

thanks,

greg k-h


* Re: [PATCH] driver core: Fix use-after-free and double free on glue directory
  2019-05-04 15:34             ` Greg KH
@ 2019-05-09 14:38               ` Gaurav Kohli
  2019-05-09 23:22                 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Gaurav Kohli @ 2019-05-09 14:38 UTC (permalink / raw)
  To: Greg KH, Muchun Song
  Cc: rafael, Benjamin Herrenschmidt, linux-kernel, zhaowuyun, linux-arm-msm

Hi ,

The last patch will serialize the addition of a child to the parent directory;
won't that affect performance?

Regards
Gaurav

On 5/4/2019 9:04 PM, Greg KH wrote:
> On Sat, May 04, 2019 at 10:47:07PM +0800, Muchun Song wrote:
>> On Thu, May 2, 2019 at 2:25 PM, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>>
>>>>> The basic idea yes, the whole bool *locked is horrid though.
>>>>> Wouldn't it
>>>>> work to have a get_device_parent_locked that always returns with
>>>>> the mutex held,
>>>>> or just move the mutex to the caller or something simpler like this
>>>>> ?
>>>>>
>>>>
>>>> Greg and Rafael, do you have any suggestions for this? Or you also
>>>> agree with Ben?
>>>
>>> Ping guys ? This is worth fixing...
>>
>> I also agree with you. But Greg and Rafael seem to be high latency right now.
> 
> It's in my list of patches to get to, sorry, hopefully will dig out of
> that next week with the buffer that the merge window provides me.
> 
> thanks,
> 
> greg k-h
> 

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


* Re: [PATCH] driver core: Fix use-after-free and double free on glue directory
  2019-05-09 14:38               ` Gaurav Kohli
@ 2019-05-09 23:22                 ` Benjamin Herrenschmidt
  2019-05-10  3:31                   ` Gaurav Kohli
  0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2019-05-09 23:22 UTC (permalink / raw)
  To: Gaurav Kohli, Greg KH, Muchun Song
  Cc: rafael, linux-kernel, zhaowuyun, linux-arm-msm

On Thu, 2019-05-09 at 20:08 +0530, Gaurav Kohli wrote:
> Hi ,
> 
> Last patch will serialize the addition of child to parent directory, 
> won't it affect performance.

I doubt this is a significant issue, and there's already a global lock
taken once or twice in that path, the fix is purely to make sure that
the same locked section is used both for the lookup and the addition as
the bug comes from the window in between those two operations allowing
the object to be removed after it was "found".

Cheers,
Ben.
 
> 
> Regards
> Gaurav
> 
> On 5/4/2019 9:04 PM, Greg KH wrote:
> > On Sat, May 04, 2019 at 10:47:07PM +0800, Muchun Song wrote:
> > > On Thu, May 2, 2019 at 2:25 PM, Benjamin Herrenschmidt
> > > <benh@kernel.crashing.org> wrote:
> > > 
> > > > > > The basic idea yes, the whole bool *locked is horrid
> > > > > > though.
> > > > > > Wouldn't it
> > > > > > work to have a get_device_parent_locked that always returns
> > > > > > with
> > > > > > the mutex held,
> > > > > > or just move the mutex to the caller or something simpler
> > > > > > like this
> > > > > > ?
> > > > > > 
> > > > > 
> > > > > Greg and Rafael, do you have any suggestions for this? Or you
> > > > > also
> > > > > agree with Ben?
> > > > 
> > > > Ping guys ? This is worth fixing...
> > > 
> > > I also agree with you. But Greg and Rafael seem to be high
> > > latency right now.
> > 
> > It's in my list of patches to get to, sorry, hopefully will dig out
> > of
> > that next week with the buffer that the merge window provides me.
> > 
> > thanks,
> > 
> > greg k-h
> > 
> 
> 



* Re: [PATCH] driver core: Fix use-after-free and double free on glue directory
  2019-05-09 23:22                 ` Benjamin Herrenschmidt
@ 2019-05-10  3:31                   ` Gaurav Kohli
  0 siblings, 0 replies; 14+ messages in thread
From: Gaurav Kohli @ 2019-05-10  3:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Greg KH, Muchun Song
  Cc: rafael, linux-kernel, zhaowuyun, linux-arm-msm

Thanks for the comment, will check the patch and update.

Regards
Gaurav

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


* Re: [PATCH] driver core: Fix use-after-free and double free on glue directory
  2019-05-04 14:47           ` Muchun Song
  2019-05-04 15:34             ` Greg KH
@ 2019-05-14 10:56             ` Mukesh Ojha
  2019-05-14 10:59               ` Prateek Sood
  1 sibling, 1 reply; 14+ messages in thread
From: Mukesh Ojha @ 2019-05-14 10:56 UTC (permalink / raw)
  To: Muchun Song, gregkh, rafael; +Cc: Benjamin Herrenschmidt, linux-kernel, prsood

++

On 5/4/2019 8:17 PM, Muchun Song wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on Thu, May 2, 2019 at 2:25 PM:
>
>>>> The basic idea yes, the whole bool *locked is horrid though.
>>>> Wouldn't it
>>>> work to have a get_device_parent_locked that always returns with
>>>> the mutex held,
>>>> or just move the mutex to the caller or something simpler like this
>>>> ?
>>>>
>>> Greg and Rafael, do you have any suggestions for this? Or you also
>>> agree with Ben?
>> Ping guys ? This is worth fixing...
> I also agree with you. But Greg and Rafael seem to be high latency right now.
>
> Following your suggestion, I think introducing get_device_parent_locked()
> may be an easy way to fix this. Do you agree with the fix in the following
> code snippet (you can also view the attachment)?
>
> I introduce a new function named get_device_parent_locked_if_glue_dir(),
> which returns with the mutex held only when we live in a glue dir. We
> should call unlock_if_glue_dir() to release the mutex.
> get_device_parent_locked_if_glue_dir() and unlock_if_glue_dir() should be
> called in pairs.
>
> ---
> drivers/base/core.c | 44 ++++++++++++++++++++++++++++++++++++--------
> 1 file changed, 36 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 4aeaa0c92bda..5112755c43fa 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -1739,8 +1739,9 @@ class_dir_create_and_add(struct class *class, struct kobject *parent_kobj)
> static DEFINE_MUTEX(gdp_mutex);
> -static struct kobject *get_device_parent(struct device *dev,
> -                    struct device *parent)
> +static struct kobject *__get_device_parent(struct device *dev,
> +                    struct device *parent,
> +                    bool lock)
> {
>     if (dev->class) {
>         struct kobject *kobj = NULL;
> @@ -1779,14 +1780,16 @@ static struct kobject *get_device_parent(struct device *dev,
>             }
>         spin_unlock(&dev->class->p->glue_dirs.list_lock);
>         if (kobj) {
> -           mutex_unlock(&gdp_mutex);
> +           if (!lock)
> +               mutex_unlock(&gdp_mutex);
>             return kobj;
>         }
>         /* or create a new class-directory at the parent device */
>         k = class_dir_create_and_add(dev->class, parent_kobj);
>         /* do not emit an uevent for this simple "glue" directory */
> -       mutex_unlock(&gdp_mutex);
> +       if (!lock)
> +           mutex_unlock(&gdp_mutex);
>         return k;
>     }
> @@ -1799,6 +1802,19 @@ static struct kobject *get_device_parent(struct device *dev,
>     return NULL;
> }
> +static inline struct kobject *get_device_parent(struct device *dev,
> +                       struct device *parent)
> +{
> +   return __get_device_parent(dev, parent, false);
> +}
> +
> +static inline struct kobject *
> +get_device_parent_locked_if_glue_dir(struct device *dev,
> +                struct device *parent)
> +{
> +   return __get_device_parent(dev, parent, true);
> +}
> +
> static inline bool live_in_glue_dir(struct kobject *kobj,
>                  struct device *dev)
> {
> @@ -1831,6 +1847,16 @@ static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
>     mutex_unlock(&gdp_mutex);
> }
> +static inline void unlock_if_glue_dir(struct device *dev,
> +                struct kobject *glue_dir)
> +{
> +   /* see if we live in a "glue" directory */
> +   if (!live_in_glue_dir(glue_dir, dev))
> +       return;
> +
> +   mutex_unlock(&gdp_mutex);
> +}
> +
> static int device_add_class_symlinks(struct device *dev)
> {
>     struct device_node *of_node = dev_of_node(dev);
> @@ -2040,7 +2066,7 @@ int device_add(struct device *dev)
>     pr_debug("device: '%s': %s\n", dev_name(dev), __func__);
>     parent = get_device(dev->parent);
> -   kobj = get_device_parent(dev, parent);
> +   kobj = get_device_parent_locked_if_glue_dir(dev, parent);
>     if (IS_ERR(kobj)) {
>         error = PTR_ERR(kobj);
>         goto parent_error;
> @@ -2055,10 +2081,12 @@ int device_add(struct device *dev)
>     /* first, register with generic layer. */
>     /* we require the name to be set before, and pass NULL */
>     error = kobject_add(&dev->kobj, dev->kobj.parent, NULL);
> -   if (error) {
> -       glue_dir = get_glue_dir(dev);
> +
> +   glue_dir = get_glue_dir(dev);
> +   unlock_if_glue_dir(dev, glue_dir);
> +
> +   if (error)
>         goto Error;
> -   }
>     /* notify platform of device entry */
>     error = device_platform_notify(dev, KOBJ_ADD);
> --
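
The pairing rule described in the quoted message can be sketched in
userspace as follows. This is an illustration under stated assumptions, not
the patch itself: struct node, its in_glue_dir flag (standing in for
live_in_glue_dir()), gdp_lock, lock_depth and the function names are all
hypothetical, and lock_depth exists only so the demo can observe whether
the mutex is held. The getter leaves the mutex held only in the glue-dir
case, and the unlock helper tests the same condition so the two stay in
step:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static pthread_mutex_t gdp_lock = PTHREAD_MUTEX_INITIALIZER;
static int lock_depth;    /* demo-only bookkeeping: 1 while gdp_lock is held */

/* Hypothetical object: in_glue_dir stands in for live_in_glue_dir(). */
struct node {
    bool in_glue_dir;
};

/* Returns the parent; the mutex is left held only for the glue-dir case. */
static struct node *get_parent_locked_if_glue(struct node *parent)
{
    if (!parent->in_glue_dir)
        return parent;            /* fast path: lock never taken */

    pthread_mutex_lock(&gdp_lock);
    lock_depth++;
    return parent;                /* caller must call unlock_if_glue() */
}

/* Must test the same condition as the getter, or lock/unlock go out of step. */
static void unlock_if_glue(struct node *parent)
{
    if (!parent->in_glue_dir)
        return;
    lock_depth--;
    pthread_mutex_unlock(&gdp_lock);
}

/* Adds a child under parent; returns the lock depth seen while adding. */
int add_under(struct node *parent)
{
    struct node *p = get_parent_locked_if_glue(parent);
    int held = lock_depth;        /* the "add" step would run here */

    unlock_if_glue(p);
    return held;
}

/* Reports the current lock depth (0 means the mutex is not held). */
int lock_depth_now(void)
{
    return lock_depth;
}
```

One consequence of this design is that every exit path between the two
calls, including error paths, has to reach the unlock helper, since the
caller cannot tell from the return value alone whether the mutex is held.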


* Re: [PATCH] driver core: Fix use-after-free and double free on glue directory
  2019-05-14 10:56             ` Mukesh Ojha
@ 2019-05-14 10:59               ` Prateek Sood
  2019-05-14 11:51                 ` Muchun Song
  0 siblings, 1 reply; 14+ messages in thread
From: Prateek Sood @ 2019-05-14 10:59 UTC (permalink / raw)
  To: Mukesh Ojha, Muchun Song, gregkh, rafael
  Cc: Benjamin Herrenschmidt, linux-kernel

On 5/14/19 4:26 PM, Mukesh Ojha wrote:
> ++

This change has been made in device_add(). AFAICT, the locked
version of get_device_parent() should also be used in
device_move().

Thanks

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project


* Re: [PATCH] driver core: Fix use-after-free and double free on glue directory
  2019-05-14 10:59               ` Prateek Sood
@ 2019-05-14 11:51                 ` Muchun Song
  0 siblings, 0 replies; 14+ messages in thread
From: Muchun Song @ 2019-05-14 11:51 UTC (permalink / raw)
  To: Prateek Sood
  Cc: Mukesh Ojha, gregkh, rafael, Benjamin Herrenschmidt,
	linux-kernel, zhaowuyun

Prateek Sood <prsood@codeaurora.org> wrote on Tue, May 14, 2019 at 7:00 PM:
>
> This change has been made in device_add(). AFAICT, the locked
> version of get_device_parent() should also be used in
> device_move().
>

Yeah, I agree with you. I will send a v2 patch later that fixes this as
well. Thanks.

Yours,
Muchun


end of thread

Thread overview: 14+ messages
2019-04-23 14:32 [PATCH] driver core: Fix use-after-free and double free on glue directory Muchun Song
2019-04-25  9:24 ` Benjamin Herrenschmidt
2019-04-25 15:44   ` Muchun Song
2019-04-28 10:10     ` Benjamin Herrenschmidt
2019-04-28 14:49       ` Muchun Song
2019-05-02  6:25         ` Benjamin Herrenschmidt
2019-05-04 14:47           ` Muchun Song
2019-05-04 15:34             ` Greg KH
2019-05-09 14:38               ` Gaurav Kohli
2019-05-09 23:22                 ` Benjamin Herrenschmidt
2019-05-10  3:31                   ` Gaurav Kohli
2019-05-14 10:56             ` Mukesh Ojha
2019-05-14 10:59               ` Prateek Sood
2019-05-14 11:51                 ` Muchun Song
