linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 ] devcoredump : Serialize devcd_del work
@ 2022-04-25 13:09 Mukesh Ojha
  2022-04-25 14:06 ` Greg KH
  2022-04-25 17:00 ` Thomas Gleixner
  0 siblings, 2 replies; 8+ messages in thread
From: Mukesh Ojha @ 2022-04-25 13:09 UTC (permalink / raw)
  To: linux-kernel; +Cc: tglx, sboyd, rafael, johannes, gregkh, Mukesh Ojha

In the following scenario (diagram below), one thread X running
dev_coredumpm() adds the devcd device to the framework, which sends a
uevent notification to userspace. Another thread Y reads this uevent
and calls devcd_data_write(), which eventually tries to delete the
queued timer before it has been initialized or queued.

As a result, the debug objects code reports a warning. Meanwhile, the
timer is initialized and queued from the X path; it then gets
reinitialized from the Y path, setting timer->entry.pprev = NULL, and
try_to_grab_pending() gets stuck.

To fix this, introduce a mutex to serialize this behaviour.

 	cpu0(X)			                      cpu1(Y)

    dev_coredump() uevent sent to userspace
    device_add()  =========================> userspace process Y reads the uevents
                                             writes to devcd fd which
                                             results into writes to

                                            devcd_data_write()
                                              mod_delayed_work()
                                                try_to_grab_pending()
                                                  del_timer()
                                                    debug_assert_init()
   INIT_DELAYED_WORK
   schedule_delayed_work
                                                     debug_object_fixup()
                                                      timer_fixup_assert_init()
                                                       timer_setup()
                                                         do_init_timer()   ==> reinitialized the
                                                                                 timer to
                                                                                 timer->entry.pprev=NULL

                                                  timer_pending()
                                                   !hlist_unhashed_lockless(&timer->entry)
                                                     !h->pprev  ==> del_timer checks
                                                                  and finds it to be NULL
                                                                  try_to_grab_pending() gets stuck.

Link: https://lore.kernel.org/lkml/2e1f81e2-428c-f11f-ce92-eb11048cb271@quicinc.com/
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
---
v1->v2:
 - Added del_wk_queued to serialize the race between devcd_data_write()
   and disabled_store().

 drivers/base/devcoredump.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
index f4d794d..3e6fd6b 100644
--- a/drivers/base/devcoredump.c
+++ b/drivers/base/devcoredump.c
@@ -25,6 +25,8 @@ struct devcd_entry {
 	struct device devcd_dev;
 	void *data;
 	size_t datalen;
+	struct mutex mutex;
+	bool del_wk_queued;
 	struct module *owner;
 	ssize_t (*read)(char *buffer, loff_t offset, size_t count,
 			void *data, size_t datalen);
@@ -84,7 +86,12 @@ static ssize_t devcd_data_write(struct file *filp, struct kobject *kobj,
 	struct device *dev = kobj_to_dev(kobj);
 	struct devcd_entry *devcd = dev_to_devcd(dev);
 
-	mod_delayed_work(system_wq, &devcd->del_wk, 0);
+	mutex_lock(&devcd->mutex);
+	if (!devcd->del_wk_queued) {
+		devcd->del_wk_queued = true;
+		mod_delayed_work(system_wq, &devcd->del_wk, 0);
+	}
+	mutex_unlock(&devcd->mutex);
 
 	return count;
 }
@@ -112,7 +119,12 @@ static int devcd_free(struct device *dev, void *data)
 {
 	struct devcd_entry *devcd = dev_to_devcd(dev);
 
+	mutex_lock(&devcd->mutex);
+	if (!devcd->del_wk_queued)
+		devcd->del_wk_queued = true;
+
 	flush_delayed_work(&devcd->del_wk);
+	mutex_unlock(&devcd->mutex);
 	return 0;
 }
 
@@ -278,13 +290,15 @@ void dev_coredumpm(struct device *dev, struct module *owner,
 	devcd->read = read;
 	devcd->free = free;
 	devcd->failing_dev = get_device(dev);
-
+	mutex_init(&devcd->mutex);
 	device_initialize(&devcd->devcd_dev);
 
 	dev_set_name(&devcd->devcd_dev, "devcd%d",
 		     atomic_inc_return(&devcd_count));
 	devcd->devcd_dev.class = &devcd_class;
 
+	mutex_lock(&devcd->mutex);
+	devcd->del_wk_queued = false;
 	if (device_add(&devcd->devcd_dev))
 		goto put_device;
 
@@ -301,10 +315,11 @@ void dev_coredumpm(struct device *dev, struct module *owner,
 
 	INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
 	schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
-
+	mutex_unlock(&devcd->mutex);
 	return;
  put_device:
 	put_device(&devcd->devcd_dev);
+	mutex_unlock(&devcd->mutex);
  put_module:
 	module_put(owner);
  free:
-- 
2.7.4



* Re: [PATCH v2 ] devcoredump : Serialize devcd_del work
  2022-04-25 13:09 [PATCH v2 ] devcoredump : Serialize devcd_del work Mukesh Ojha
@ 2022-04-25 14:06 ` Greg KH
  2022-04-25 17:00 ` Thomas Gleixner
  1 sibling, 0 replies; 8+ messages in thread
From: Greg KH @ 2022-04-25 14:06 UTC (permalink / raw)
  To: Mukesh Ojha; +Cc: linux-kernel, tglx, sboyd, rafael, johannes

On Mon, Apr 25, 2022 at 06:39:53PM +0530, Mukesh Ojha wrote:
> In following scenario(diagram), when one thread X running dev_coredumpm() adds devcd
> device to the framework which sends uevent notification to userspace
> and another thread Y reads this uevent and call to devcd_data_write()
> which eventually try to delete the queued timer that is not initialized/queued yet.
> 
> So, debug object reports some warning and in the meantime, timer is initialized
> and queued from X path. and from Y path, it gets reinitialized again and
> timer->entry.pprev=NULL and try_to_grab_pending() stucks.

Nit, please wrap your lines at 72 columns like git asked you to when you
made the commit

> 
> To fix this, introduce mutex to serialize the behaviour.
> 
>  	cpu0(X)			                      cpu1(Y)
> 
>     dev_coredump() uevent sent to userspace
>     device_add()  =========================> userspace process Y reads the uevents
>                                              writes to devcd fd which
>                                              results into writes to
> 
>                                             devcd_data_write()
>                                               mod_delayed_work()
>                                                 try_to_grab_pending()
>                                                   del_timer()
>                                                     debug_assert_init()
>    INIT_DELAYED_WORK
>    schedule_delayed_work
>                                                      debug_object_fixup()
>                                                       timer_fixup_assert_init()
>                                                        timer_setup()
>                                                          do_init_timer()   ==> reinitialized the
>                                                                                  timer to
>                                                                                  timer->entry.pprev=NULL
> 
>                                                   timer_pending()
>                                                    !hlist_unhashed_lockless(&timer->entry)
>                                                      !h->pprev  ==> del_timer checks
>                                                                   and finds it to be NULL
>  								  try_to_grab_pending() stucks.

Mix of tabs and spaces?  This can all go left a bit as well.

> 
> Link: https://lore.kernel.org/lkml/2e1f81e2-428c-f11f-ce92-eb11048cb271@quicinc.com/
> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
> ---
> v1->v2:
>  - Added del_wk_queued to serialize the race between devcd_data_write()
>    and disabled_store().
> 
>  drivers/base/devcoredump.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
> index f4d794d..3e6fd6b 100644
> --- a/drivers/base/devcoredump.c
> +++ b/drivers/base/devcoredump.c
> @@ -25,6 +25,8 @@ struct devcd_entry {
>  	struct device devcd_dev;
>  	void *data;
>  	size_t datalen;
> +	struct mutex mutex;

Document what this lock is for here please.  I think checkpatch asks you
for that, right?

> +	bool del_wk_queued;

Please spell this out better, you can use vowels :)

>  	struct module *owner;
>  	ssize_t (*read)(char *buffer, loff_t offset, size_t count,
>  			void *data, size_t datalen);
> @@ -84,7 +86,12 @@ static ssize_t devcd_data_write(struct file *filp, struct kobject *kobj,
>  	struct device *dev = kobj_to_dev(kobj);
>  	struct devcd_entry *devcd = dev_to_devcd(dev);
>  
> -	mod_delayed_work(system_wq, &devcd->del_wk, 0);
> +	mutex_lock(&devcd->mutex);
> +	if (!devcd->del_wk_queued) {
> +		devcd->del_wk_queued = true;
> +		mod_delayed_work(system_wq, &devcd->del_wk, 0);
> +	}
> +	mutex_unlock(&devcd->mutex);
>  
>  	return count;
>  }
> @@ -112,7 +119,12 @@ static int devcd_free(struct device *dev, void *data)
>  {
>  	struct devcd_entry *devcd = dev_to_devcd(dev);
>  
> +	mutex_lock(&devcd->mutex);
> +	if (!devcd->del_wk_queued)
> +		devcd->del_wk_queued = true;
> +
>  	flush_delayed_work(&devcd->del_wk);
> +	mutex_unlock(&devcd->mutex);
>  	return 0;
>  }
>  
> @@ -278,13 +290,15 @@ void dev_coredumpm(struct device *dev, struct module *owner,
>  	devcd->read = read;
>  	devcd->free = free;
>  	devcd->failing_dev = get_device(dev);
> -
> +	mutex_init(&devcd->mutex);

Why drop the blank line?

>  	device_initialize(&devcd->devcd_dev);
>  
>  	dev_set_name(&devcd->devcd_dev, "devcd%d",
>  		     atomic_inc_return(&devcd_count));
>  	devcd->devcd_dev.class = &devcd_class;
>  
> +	mutex_lock(&devcd->mutex);

Why lock this here?

> +	devcd->del_wk_queued = false;

This was already set to false above, right?  And if you want to
explicitly initialize it, do it where the other variables are
initialized up by mutex_init() please.

thanks,

greg k-h


* Re: [PATCH v2 ] devcoredump : Serialize devcd_del work
  2022-04-25 13:09 [PATCH v2 ] devcoredump : Serialize devcd_del work Mukesh Ojha
  2022-04-25 14:06 ` Greg KH
@ 2022-04-25 17:00 ` Thomas Gleixner
  2022-04-25 17:19   ` Johannes Berg
  1 sibling, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2022-04-25 17:00 UTC (permalink / raw)
  To: Mukesh Ojha, linux-kernel; +Cc: sboyd, rafael, johannes, gregkh, Mukesh Ojha

On Mon, Apr 25 2022 at 18:39, Mukesh Ojha wrote:
> v1->v2:
>  - Added del_wk_queued to serialize the race between devcd_data_write()
>    and disabled_store().

How so?

Neither the flag nor the mutex can prevent the race with the work
being executed in parallel.

disabled_store()                                worker()    

  class_for_each_device(&devcd_class, NULL, NULL, devcd_free)
    ...
    while ((dev = class_dev_iter_next(&iter)) {
    						devcd_del()
                                                 device_del()
                                                 put_device() <- last reference
          error = fn(dev, data)                   devcd_dev_release()
            devcd_free(dev, data)                  kfree(devcd)
              mutex_lock(&devcd->mutex);
      

There is zero protection of the class iterator against the work being
executed and removing the device and freeing its data. IOW, at the
point where fn(), i.e. devcd_free(), dereferences 'dev' to acquire the
mutex, it might be gone. No?

If disabled_store() really needs to flush all instances immediately,
then it requires global serialization, not device specific serialization.

Johannes, can you please explain whether this immediate flush in
disabled_store() is really required and if so, why?

Thanks,

        tglx


* Re: [PATCH v2 ] devcoredump : Serialize devcd_del work
  2022-04-25 17:00 ` Thomas Gleixner
@ 2022-04-25 17:19   ` Johannes Berg
  2022-04-25 19:37     ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Berg @ 2022-04-25 17:19 UTC (permalink / raw)
  To: Thomas Gleixner, Mukesh Ojha, linux-kernel; +Cc: sboyd, rafael, gregkh

On Mon, 2022-04-25 at 19:00 +0200, Thomas Gleixner wrote:
> 
> Johannes, can you please explain whether this immediate flush in
> disabled_store() is really required and if so, why?
> 

I don't really know, as I remember that requirement (or maybe even code,
not sure) came from Kees, who needed the lockdown.

Given the use case (ChromeOS?) I'm not sure I see a need to flush all of
them, since I guess a typical system would set the lockdown early in
boot and hopefully not have a crash-dump around already.


That said, I don't think the diagram you made works: fn() during the
iteration is guaranteed to be invoked with a reference of its own, so
the put_device() there can't drop the last reference; only as fn()
returns do you put the last reference *there*, freeing it.

johannes


* Re: [PATCH v2 ] devcoredump : Serialize devcd_del work
  2022-04-25 17:19   ` Johannes Berg
@ 2022-04-25 19:37     ` Thomas Gleixner
  2022-04-26 14:04       ` Mukesh Ojha
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2022-04-25 19:37 UTC (permalink / raw)
  To: Johannes Berg, Mukesh Ojha, linux-kernel; +Cc: sboyd, rafael, gregkh, Kees Cook

Cc+: Kees

On Mon, Apr 25 2022 at 19:19, Johannes Berg wrote:
> On Mon, 2022-04-25 at 19:00 +0200, Thomas Gleixner wrote:
>> 
>> Johannes, can you please explain whether this immediate flush in
>> disabled_store() is really required and if so, why?
>> 
> I don't really know, as I remember that requirement (or maybe even code,
> not sure) came from Kees, who needed the lockdown.
>
> Given the use case (ChromeOS?) I'm not sure I see a need to flush all of
> them, since I guess a typical system would set the lockdown early in
> boot and hopefully not have a crash-dump around already.
>
> That said, I don't think the diagram you made works - fn() during the
> iteration is guaranteed to be invoked with a reference of its own, so
> the put_device() there can't be the last reference, only as fn() returns
> you'd put the last reference *there*, freeing it.

Bah, you are right, it's magically protected by the klist ref, which
prevents devcd from going away. Damned obvious.

This really needs comments why this all can magically "work".

Thanks,

        tglx

* Re: [PATCH v2 ] devcoredump : Serialize devcd_del work
  2022-04-25 19:37     ` Thomas Gleixner
@ 2022-04-26 14:04       ` Mukesh Ojha
  2022-04-26 21:25         ` Kees Cook
  0 siblings, 1 reply; 8+ messages in thread
From: Mukesh Ojha @ 2022-04-26 14:04 UTC (permalink / raw)
  To: Thomas Gleixner, Johannes Berg, linux-kernel
  Cc: sboyd, rafael, gregkh, Kees Cook



On 4/26/2022 1:07 AM, Thomas Gleixner wrote:
> Cc+: Kees
> 
> On Mon, Apr 25 2022 at 19:19, Johannes Berg wrote:
>> On Mon, 2022-04-25 at 19:00 +0200, Thomas Gleixner wrote:
>>>
>>> Johannes, can you please explain whether this immediate flush in
>>> disabled_store() is really required and if so, why?
>>>
>> I don't really know, as I remember that requirement (or maybe even code,
>> not sure) came from Kees, who needed the lockdown.
>>
>> Given the use case (ChromeOS?) I'm not sure I see a need to flush all of
>> them, since I guess a typical system would set the lockdown early in
>> boot and hopefully not have a crash-dump around already.
>>
>> That said, I don't think the diagram you made works - fn() during the
>> iteration is guaranteed to be invoked with a reference of its own, so
>> the put_device() there can't be the last reference, only as fn() returns
>> you'd put the last reference *there*, freeing it.
> 
> Bah, you are right, it's magically protected by the klist ref, which
> prevents devcd from going away. Damned obvious.
> 
> This really needs comments why this all can magically "work".
> 
> Thanks,
> 
>          tglx
> 

Thank you all for your time reviewing this.
I have tried to address the comments in v3 here:

https://lore.kernel.org/lkml/1650981343-11739-1-git-send-email-quic_mojha@quicinc.com/

Meanwhile, we would still like to hear from Kees about the reason for
the immediate flush in disabled_store().

Regards,
-Mukesh

* Re: [PATCH v2 ] devcoredump : Serialize devcd_del work
  2022-04-26 14:04       ` Mukesh Ojha
@ 2022-04-26 21:25         ` Kees Cook
  2022-04-27 11:58           ` Johannes Berg
  0 siblings, 1 reply; 8+ messages in thread
From: Kees Cook @ 2022-04-26 21:25 UTC (permalink / raw)
  To: Mukesh Ojha
  Cc: Thomas Gleixner, Johannes Berg, linux-kernel, sboyd, rafael, gregkh

On Tue, Apr 26, 2022 at 07:34:07PM +0530, Mukesh Ojha wrote:
> 
> 
> On 4/26/2022 1:07 AM, Thomas Gleixner wrote:
> > Cc+: Kees
> > 
> > On Mon, Apr 25 2022 at 19:19, Johannes Berg wrote:
> > > On Mon, 2022-04-25 at 19:00 +0200, Thomas Gleixner wrote:
> > > > 
> > > > Johannes, can you please explain whether this immediate flush in
> > > > disabled_store() is really required and if so, why?
> > > > 
> > > I don't really know, as I remember that requirement (or maybe even code,
> > > not sure) came from Kees, who needed the lockdown.
> > > 
> > > Given the use case (ChromeOS?) I'm not sure I see a need to flush all of
> > > them, since I guess a typical system would set the lockdown early in
> > > boot and hopefully not have a crash-dump around already.
> > > 
> > > That said, I don't think the diagram you made works - fn() during the
> > > iteration is guaranteed to be invoked with a reference of its own, so
> > > the put_device() there can't be the last reference, only as fn() returns
> > > you'd put the last reference *there*, freeing it.
> > 
> > Bah, you are right, it's magically protected by the klist ref, which
> > prevents devcd from going away. Damned obvious.
> > 
> > This really needs comments why this all can magically "work".
> > 
> > Thanks,
> > 
> >          tglx
> > 
> 
> Thanks you all for your time in reviewing this.
> I tried to address few comments in v3 here.
> 
> https://lore.kernel.org/lkml/1650981343-11739-1-git-send-email-quic_mojha@quicinc.com/
> 
> While, we would like to hear from Kees about reason of immediate flush from
> disabled_store().

This is lost to ancient history in my brain right now. Do you have any
references to past threads on this? The only thing I remember about
device memory dumping was just to make sure that lockdown's
CONFIDENTIALITY mode would block it.

-- 
Kees Cook


* Re: [PATCH v2 ] devcoredump : Serialize devcd_del work
  2022-04-26 21:25         ` Kees Cook
@ 2022-04-27 11:58           ` Johannes Berg
  0 siblings, 0 replies; 8+ messages in thread
From: Johannes Berg @ 2022-04-27 11:58 UTC (permalink / raw)
  To: Kees Cook, Mukesh Ojha
  Cc: Thomas Gleixner, linux-kernel, sboyd, rafael, gregkh

On Tue, 2022-04-26 at 14:25 -0700, Kees Cook wrote:
> > 
> > https://lore.kernel.org/lkml/1650981343-11739-1-git-send-email-quic_mojha@quicinc.com/
> > 
> > While, we would like to hear from Kees about reason of immediate flush from
> > disabled_store().
> 
> This is lost to ancient history in my brain right now. Do you have any
> references to past threads on this? The only thing I remember about
> device memory dumping was just to make sure that lockdown's
> CONFIDENTIALITY mode would block it.
> 

Hm, no, I don't, sorry. I don't even have an email record of it now,
perhaps it went to my @intel.com and has long expired?

Anyway, I seem to remember you asking to have a disable bit and I
implemented it in commit d45333294da8, which you reviewed, but I have no
record of you asking for it even though I think you did :)

johannes

