linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RE: [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached
@ 2012-01-05  8:49 Chanho Min
  2012-01-15 10:28 ` Rabin Vincent
  0 siblings, 1 reply; 21+ messages in thread
From: Chanho Min @ 2012-01-05  8:49 UTC (permalink / raw)
  To: linux-kernel

>On Tue, Jan 03, 2012 at 12:23:44PM +0900, Chanho Min wrote:
>> >On Mon, Jan 02, 2012 at 06:38:21PM +0900,wrote:
>> >> from Chanho Min <chanho.min@lge.com>
>> >>
>> >> System may crash in backing-dev.c when removal SCSI device is detached.
>> >> bdi task is killed by bdi_unregister()/'khubd', but task's point
>remains.
>> >> Shortly afterward, If 'wb->wakeup_timer' is expired before
>> >> del_timer()/bdi_forker_thread,
>> >> wakeup_timer_fn() may wake up the dead thread which cause the crash.
>> >> 'bdi->wb.task' should be NULL as this patch.
>> >
>> >Is it some race condition between del_timer() and del_timer_sync()?
>> >
>> >bdi_unregister() calls
>> >
>> >        del_timer_sync
>> >        bdi_wb_shutdown
>> >            kthread_stop
>> >
>> >in turn, and del_timer_sync() should guarantee wakeup_timer_fn() is
>> >no longer called to access the stopped task.
>> >
>>
>> It is not race condition. This happens when USB is removed during write-
>access.
>> bdi_wakeup_thread_delayed is called after kthread_stop, and timer is
>activated again.
>>
>> 	bdi_unregister
>> 		kthread_stop
>> 	bdi_wakeup_thread_delayed (sys_write mostly calls this)
>> 	timer fires
>
>Ah OK, the timer could be restarted in the mean while, which breaks
>the synchronization rule in del_timer_sync().
>
>I noticed a related fix is merged recently, does your test kernel
>contain this commit?
>

No, I will try to reproduce with this patch.
But, bdi_destroy is not called during write-access. Same result is expected.

>commit 7a401a972df8e184b3d1a3fc958c0a4ddee8d312
>Author: Rabin Vincent <rabin.vincent@stericsson.com>
>Date:   Fri Nov 11 13:29:04 2011 +0100
>
>    backing-dev: ensure wakeup_timer is deleted
>
>> Anyway,Is this safeguard to prevent from waking up killed thread?
>
>This patch makes no guarantee wakeup_timer_fn() will see NULL
>bdi->wb.task before the task is stopped, so there is still race
>conditions. And still, the complete fix would be to prevent
>wakeup_timer_fn() from being called at all.

If wakeup_timer_fn() see NULL bdi->wb.task, wakeup_timer_fn regards
task as killed
and wake up forker thread instead of the defined thread.
Is this intended behavior of the bdi?

>
>Thanks,
>Fengguang
>
>> >> Signed-off-by: Chanho Min <chanho.min@lge.com>
>> >> ---
>> >>  mm/backing-dev.c |    1 +
>> >>  1 files changed, 1 insertions(+), 0 deletions(-)
>> >>
>> >> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
>> >> index 71034f4..4378a5e 100644
>> >> --- a/mm/backing-dev.c
>> >> +++ b/mm/backing-dev.c
>> >> @@ -607,6 +607,7 @@ static void bdi_wb_shutdown(struct backing_dev_info
>> >> *bdi)
>> >>         if (bdi->wb.task) {
>> >>                 thaw_process(bdi->wb.task);
>> >>                 kthread_stop(bdi->wb.task);
>> >> +               bdi->wb.task = NULL;
>> >>         }
>> >>  }
>> >>
>> >> --
>> >> 1.7.0.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached
  2012-01-05  8:49 [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached Chanho Min
@ 2012-01-15 10:28 ` Rabin Vincent
  2012-01-15 12:58   ` Wu Fengguang
  0 siblings, 1 reply; 21+ messages in thread
From: Rabin Vincent @ 2012-01-15 10:28 UTC (permalink / raw)
  To: Chanho Min, Jens Axboe; +Cc: linux-kernel, fengguang.wu

On Thu, Jan 5, 2012 at 14:19, Chanho Min <chanho0207@gmail.com> wrote:
>>On Tue, Jan 03, 2012 at 12:23:44PM +0900, Chanho Min wrote:
>>> >On Mon, Jan 02, 2012 at 06:38:21PM +0900,wrote:
>>> >> from Chanho Min <chanho.min@lge.com>
>>> >>
>>> >> System may crash in backing-dev.c when removal SCSI device is detached.
>>> >> bdi task is killed by bdi_unregister()/'khubd', but task's point
>>remains.
>>> >> Shortly afterward, If 'wb->wakeup_timer' is expired before
>>> >> del_timer()/bdi_forker_thread,
>>> >> wakeup_timer_fn() may wake up the dead thread which cause the crash.
>>> >> 'bdi->wb.task' should be NULL as this patch.
>>
>>I noticed a related fix is merged recently, does your test kernel
>>contain this commit?
>>
> No, I will try to reproduce with this patch.
> But, bdi_destroy is not called during write-access. Same result is expected.

I agree. 7a401a972df8e184b3d1a3fc958c0a4ddee8d312 only addressed the
problem of the bdi being destroyed with an active timer, but there are
other races that could happen before that.

>>This patch makes no guarantee wakeup_timer_fn() will see NULL
>>bdi->wb.task before the task is stopped, so there is still race
>>conditions. And still, the complete fix would be to prevent
>>wakeup_timer_fn() from being called at all.
>
> If wakeup_timer_fn() see NULL bdi->wb.task, wakeup_timer_fn regards
> task as killed
> and wake up forker thread instead of the defined thread.
> Is this intended behavior of the bdi?

This appears to be the intended behaviour before, but certainly not
after the bdi is unregistered, since anyway the forker thread will not
find the bdi on the list.  In fact, if tracing is enabled the kernel
crashes because dev_name() is called on a NULL bdi->dev from the
wake_forker_thread tracepoint.

The following patch should address these issues:

8<---------------------------
>From 271f92d34b661d701eaad9b262423de5dba1cc11 Mon Sep 17 00:00:00 2001
From: Rabin Vincent <rabin@rab.in>
Date: Sun, 15 Jan 2012 15:30:40 +0530
Subject: [PATCH] backing-dev: fix wakeup timer races with bdi_unregister()

While 7a401a972df8e18 ("backing-dev: ensure wakeup_timer is deleted")
addressed the problem of the bdi being freed with a queued wakeup
timer, there are other races that could happen if the wakeup timer
expires after/during bdi_unregister(), before bdi_destroy() is called.

wakeup_timer_fn() could attempt to wakeup a task which has already has
been freed, or could access a NULL bdi->dev via the wake_forker_thread
tracepoint.

Reported-by: Chanho Min <chanho.min@lge.com>
Signed-off-by: Rabin Vincent <rabin@rab.in>
---
 mm/backing-dev.c |   17 +++++++++++++----
 1 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 71034f4..a39ad70 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -318,7 +318,7 @@ static void wakeup_timer_fn(unsigned long data)
 	if (bdi->wb.task) {
 		trace_writeback_wake_thread(bdi);
 		wake_up_process(bdi->wb.task);
-	} else {
+	} else if (bdi->dev) {
 		/*
 		 * When bdi tasks are inactive for long time, they are killed.
 		 * In this case we have to wake-up the forker thread which
@@ -584,6 +584,8 @@ EXPORT_SYMBOL(bdi_register_dev);
  */
 static void bdi_wb_shutdown(struct backing_dev_info *bdi)
 {
+	struct task_struct *task = NULL;
+
 	if (!bdi_cap_writeback_dirty(bdi))
 		return;

@@ -604,9 +606,14 @@ static void bdi_wb_shutdown(struct backing_dev_info *bdi)
 	 * unfreeze of the thread before calling kthread_stop(), otherwise
 	 * it would never exet if it is currently stuck in the refrigerator.
 	 */
-	if (bdi->wb.task) {
-		thaw_process(bdi->wb.task);
-		kthread_stop(bdi->wb.task);
+	spin_lock_bh(&bdi->wb_lock);
+	task = bdi->wb.task;
+	bdi->wb.task = NULL;
+	spin_unlock_bh(&bdi->wb_lock);
+
+	if (task) {
+		thaw_process(task);
+		kthread_stop(task);
 	}
 }

@@ -637,7 +644,9 @@ void bdi_unregister(struct backing_dev_info *bdi)
 			bdi_wb_shutdown(bdi);
 		bdi_debug_unregister(bdi);
 		device_unregister(bdi->dev);
+		spin_lock_bh(&bdi->wb_lock);
 		bdi->dev = NULL;
+		spin_unlock_bh(&bdi->wb_lock);
 	}
 }
 EXPORT_SYMBOL(bdi_unregister);
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached
  2012-01-15 10:28 ` Rabin Vincent
@ 2012-01-15 12:58   ` Wu Fengguang
  2012-01-15 15:41     ` Rabin Vincent
  0 siblings, 1 reply; 21+ messages in thread
From: Wu Fengguang @ 2012-01-15 12:58 UTC (permalink / raw)
  To: Rabin Vincent; +Cc: Chanho Min, Jens Axboe, linux-kernel

On Sun, Jan 15, 2012 at 03:58:43PM +0530, Rabin Vincent wrote:
> On Thu, Jan 5, 2012 at 14:19, Chanho Min <chanho0207@gmail.com> wrote:
> >>On Tue, Jan 03, 2012 at 12:23:44PM +0900, Chanho Min wrote:
> >>> >On Mon, Jan 02, 2012 at 06:38:21PM +0900,wrote:
> >>> >> from Chanho Min <chanho.min@lge.com>
> >>> >>
> >>> >> System may crash in backing-dev.c when removal SCSI device is detached.
> >>> >> bdi task is killed by bdi_unregister()/'khubd', but task's point
> >>remains.
> >>> >> Shortly afterward, If 'wb->wakeup_timer' is expired before
> >>> >> del_timer()/bdi_forker_thread,
> >>> >> wakeup_timer_fn() may wake up the dead thread which cause the crash.
> >>> >> 'bdi->wb.task' should be NULL as this patch.
> >>
> >>I noticed a related fix is merged recently, does your test kernel
> >>contain this commit?
> >>
> > No, I will try to reproduce with this patch.
> > But, bdi_destroy is not called during write-access. Same result is expected.
> 
> I agree. 7a401a972df8e184b3d1a3fc958c0a4ddee8d312 only addressed the
> problem of the bdi being destroyed with an active timer, but there are
> other races that could happen before that.
> 
> >>This patch makes no guarantee wakeup_timer_fn() will see NULL
> >>bdi->wb.task before the task is stopped, so there is still race
> >>conditions. And still, the complete fix would be to prevent
> >>wakeup_timer_fn() from being called at all.
> >
> > If wakeup_timer_fn() see NULL bdi->wb.task, wakeup_timer_fn regards
> > task as killed
> > and wake up forker thread instead of the defined thread.
> > Is this intended behavior of the bdi?
> 
> This appears to be the intended behaviour before, but certainly not
> after the bdi is unregistered, since anyway the forker thread will not
> find the bdi on the list.  In fact, if tracing is enabled the kernel
> crashes because dev_name() is called on a NULL bdi->dev from the
> wake_forker_thread tracepoint.

Right.

> The following patch should address these issues:
> 
> 8<---------------------------
> >From 271f92d34b661d701eaad9b262423de5dba1cc11 Mon Sep 17 00:00:00 2001
> From: Rabin Vincent <rabin@rab.in>
> Date: Sun, 15 Jan 2012 15:30:40 +0530
> Subject: [PATCH] backing-dev: fix wakeup timer races with bdi_unregister()
> 
> While 7a401a972df8e18 ("backing-dev: ensure wakeup_timer is deleted")
> addressed the problem of the bdi being freed with a queued wakeup
> timer, there are other races that could happen if the wakeup timer
> expires after/during bdi_unregister(), before bdi_destroy() is called.
> 
> wakeup_timer_fn() could attempt to wakeup a task which has already has
> been freed, or could access a NULL bdi->dev via the wake_forker_thread
> tracepoint.
> 
> Reported-by: Chanho Min <chanho.min@lge.com>
> Signed-off-by: Rabin Vincent <rabin@rab.in>

Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>

>  mm/backing-dev.c |   17 +++++++++++++----
>  1 files changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> index 71034f4..a39ad70 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -318,7 +318,7 @@ static void wakeup_timer_fn(unsigned long data)
>  	if (bdi->wb.task) {
>  		trace_writeback_wake_thread(bdi);
>  		wake_up_process(bdi->wb.task);
> -	} else {
> +	} else if (bdi->dev) {
>  		/*
>  		 * When bdi tasks are inactive for long time, they are killed.
>  		 * In this case we have to wake-up the forker thread which
> @@ -584,6 +584,8 @@ EXPORT_SYMBOL(bdi_register_dev);
>   */
>  static void bdi_wb_shutdown(struct backing_dev_info *bdi)
>  {
> +	struct task_struct *task = NULL;

NULL not necessary?

Thanks,
Fengguang


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached
  2012-01-15 12:58   ` Wu Fengguang
@ 2012-01-15 15:41     ` Rabin Vincent
  2012-01-16  2:53       ` Wu Fengguang
  0 siblings, 1 reply; 21+ messages in thread
From: Rabin Vincent @ 2012-01-15 15:41 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Chanho Min, Jens Axboe, linux-kernel

On Sun, Jan 15, 2012 at 08:58:53PM +0800, Wu Fengguang wrote:
> On Sun, Jan 15, 2012 at 03:58:43PM +0530, Rabin Vincent wrote:
> > diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> > index 71034f4..a39ad70 100644
> > --- a/mm/backing-dev.c
> > +++ b/mm/backing-dev.c
> > @@ -318,7 +318,7 @@ static void wakeup_timer_fn(unsigned long data)
> >  	if (bdi->wb.task) {
> >  		trace_writeback_wake_thread(bdi);
> >  		wake_up_process(bdi->wb.task);
> > -	} else {
> > +	} else if (bdi->dev) {
> >  		/*
> >  		 * When bdi tasks are inactive for long time, they are killed.
> >  		 * In this case we have to wake-up the forker thread which
> > @@ -584,6 +584,8 @@ EXPORT_SYMBOL(bdi_register_dev);
> >   */
> >  static void bdi_wb_shutdown(struct backing_dev_info *bdi)
> >  {
> > +	struct task_struct *task = NULL;
> 
> NULL not necessary?

Will fix.

Also, I notice another problem here.  bdi->dev should be made NULL before the
device is unregistered, not after:

		if (!bdi_cap_flush_forker(bdi))
			bdi_wb_shutdown(bdi);
		bdi_debug_unregister(bdi);
		device_unregister(bdi->dev);
		bdi->dev = NULL;


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached
  2012-01-15 15:41     ` Rabin Vincent
@ 2012-01-16  2:53       ` Wu Fengguang
  2012-01-16  5:28         ` Chanho Min
  2012-01-19 16:50         ` [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister() Rabin Vincent
  0 siblings, 2 replies; 21+ messages in thread
From: Wu Fengguang @ 2012-01-16  2:53 UTC (permalink / raw)
  To: Rabin Vincent; +Cc: Chanho Min, Jens Axboe, linux-kernel

On Sun, Jan 15, 2012 at 09:11:07PM +0530, Rabin Vincent wrote:
> On Sun, Jan 15, 2012 at 08:58:53PM +0800, Wu Fengguang wrote:
> > On Sun, Jan 15, 2012 at 03:58:43PM +0530, Rabin Vincent wrote:
> > > diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> > > index 71034f4..a39ad70 100644
> > > --- a/mm/backing-dev.c
> > > +++ b/mm/backing-dev.c
> > > @@ -318,7 +318,7 @@ static void wakeup_timer_fn(unsigned long data)
> > >  	if (bdi->wb.task) {
> > >  		trace_writeback_wake_thread(bdi);
> > >  		wake_up_process(bdi->wb.task);
> > > -	} else {
> > > +	} else if (bdi->dev) {
> > >  		/*
> > >  		 * When bdi tasks are inactive for long time, they are killed.
> > >  		 * In this case we have to wake-up the forker thread which
> > > @@ -584,6 +584,8 @@ EXPORT_SYMBOL(bdi_register_dev);
> > >   */
> > >  static void bdi_wb_shutdown(struct backing_dev_info *bdi)
> > >  {
> > > +	struct task_struct *task = NULL;
> > 
> > NULL not necessary?
> 
> Will fix.
> 
> Also, I notice another problem here.  bdi->dev should be made NULL before the
> device is unregistered, not after:

Yes, good catch!

> 		if (!bdi_cap_flush_forker(bdi))
> 			bdi_wb_shutdown(bdi);
> 		bdi_debug_unregister(bdi);
> 		device_unregister(bdi->dev);
> 		bdi->dev = NULL;

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached
  2012-01-16  2:53       ` Wu Fengguang
@ 2012-01-16  5:28         ` Chanho Min
  2012-01-16  5:50           ` Wu Fengguang
  2012-01-19 16:50         ` [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister() Rabin Vincent
  1 sibling, 1 reply; 21+ messages in thread
From: Chanho Min @ 2012-01-16  5:28 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Rabin Vincent, Jens Axboe, linux-kernel

On Mon, Jan 16, 2012 at 11:53 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> On Sun, Jan 15, 2012 at 09:11:07PM +0530, Rabin Vincent wrote:
>> On Sun, Jan 15, 2012 at 08:58:53PM +0800, Wu Fengguang wrote:
>> > On Sun, Jan 15, 2012 at 03:58:43PM +0530, Rabin Vincent wrote:
>> > > diff --git a/mm/backing-dev.c b/mm/backing-dev.c
>> > > index 71034f4..a39ad70 100644
>> > > --- a/mm/backing-dev.c
>> > > +++ b/mm/backing-dev.c
>> > > @@ -318,7 +318,7 @@ static void wakeup_timer_fn(unsigned long data)
>> > >   if (bdi->wb.task) {
>> > >           trace_writeback_wake_thread(bdi);
>> > >           wake_up_process(bdi->wb.task);
>> > > - } else {
>> > > + } else if (bdi->dev) {
>> > >           /*
>> > >            * When bdi tasks are inactive for long time, they are killed.
>> > >            * In this case we have to wake-up the forker thread which
>> > > @@ -584,6 +584,8 @@ EXPORT_SYMBOL(bdi_register_dev);
>> > >   */
>> > >  static void bdi_wb_shutdown(struct backing_dev_info *bdi)
>> > >  {
>> > > + struct task_struct *task = NULL;

Thanks, I fully understand.
In addition, Would not bdi_wakeup_flusher also be fixed?

>From dc7d4e86911c0a7ea35043485b04e8a09aa74ffd Mon Sep 17 00:00:00 2001
From: Chanho Min <chanho.min@lge.com>
Date: Mon, 16 Jan 2012 14:14:48 +0900
Subject: [PATCH] backing-dev: add for 'fix wakeup timer races with
bdi_unregister()'

Signed-off-by: Chanho Min <chanho.min@lge.com>
---
 fs/fs-writeback.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index f855916..ee40833 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -97,7 +97,7 @@ static void bdi_wakeup_flusher(struct backing_dev_info *bdi)
 {
        if (bdi->wb.task) {
                wake_up_process(bdi->wb.task);
-       } else {
+       } else if (bdi->dev) {
                /*
                 * The bdi thread isn't there, wake up the forker thread which
                 * will create and run it.
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached
  2012-01-16  5:28         ` Chanho Min
@ 2012-01-16  5:50           ` Wu Fengguang
  2012-01-16  5:53             ` Wu Fengguang
  0 siblings, 1 reply; 21+ messages in thread
From: Wu Fengguang @ 2012-01-16  5:50 UTC (permalink / raw)
  To: Chanho Min; +Cc: Rabin Vincent, Jens Axboe, linux-kernel

On Mon, Jan 16, 2012 at 02:28:30PM +0900, Chanho Min wrote:
> On Mon, Jan 16, 2012 at 11:53 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> > On Sun, Jan 15, 2012 at 09:11:07PM +0530, Rabin Vincent wrote:
> >> On Sun, Jan 15, 2012 at 08:58:53PM +0800, Wu Fengguang wrote:
> >> > On Sun, Jan 15, 2012 at 03:58:43PM +0530, Rabin Vincent wrote:
> >> > > diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> >> > > index 71034f4..a39ad70 100644
> >> > > --- a/mm/backing-dev.c
> >> > > +++ b/mm/backing-dev.c
> >> > > @@ -318,7 +318,7 @@ static void wakeup_timer_fn(unsigned long data)
> >> > >   if (bdi->wb.task) {
> >> > >           trace_writeback_wake_thread(bdi);
> >> > >           wake_up_process(bdi->wb.task);
> >> > > - } else {
> >> > > + } else if (bdi->dev) {
> >> > >           /*
> >> > >            * When bdi tasks are inactive for long time, they are killed.
> >> > >            * In this case we have to wake-up the forker thread which
> >> > > @@ -584,6 +584,8 @@ EXPORT_SYMBOL(bdi_register_dev);
> >> > >   */
> >> > >  static void bdi_wb_shutdown(struct backing_dev_info *bdi)
> >> > >  {
> >> > > + struct task_struct *task = NULL;
> 
> Thanks, I fully understand.
> In addition, Would not bdi_wakeup_flusher also be fixed?

Yes, indeed!

Thanks,
Fengguang

> >From dc7d4e86911c0a7ea35043485b04e8a09aa74ffd Mon Sep 17 00:00:00 2001
> From: Chanho Min <chanho.min@lge.com>
> Date: Mon, 16 Jan 2012 14:14:48 +0900
> Subject: [PATCH] backing-dev: add for 'fix wakeup timer races with
> bdi_unregister()'
> 
> Signed-off-by: Chanho Min <chanho.min@lge.com>
> ---
>  fs/fs-writeback.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index f855916..ee40833 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -97,7 +97,7 @@ static void bdi_wakeup_flusher(struct backing_dev_info *bdi)
>  {
>         if (bdi->wb.task) {
>                 wake_up_process(bdi->wb.task);
> -       } else {
> +       } else if (bdi->dev) {
>                 /*
>                  * The bdi thread isn't there, wake up the forker thread which
>                  * will create and run it.
> -- 
> 1.7.0.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached
  2012-01-16  5:50           ` Wu Fengguang
@ 2012-01-16  5:53             ` Wu Fengguang
  2012-01-16  6:34               ` Chanho Min
  0 siblings, 1 reply; 21+ messages in thread
From: Wu Fengguang @ 2012-01-16  5:53 UTC (permalink / raw)
  To: Chanho Min; +Cc: Rabin Vincent, Jens Axboe, linux-kernel

On Mon, Jan 16, 2012 at 01:50:44PM +0800, Wu Fengguang wrote:
> On Mon, Jan 16, 2012 at 02:28:30PM +0900, Chanho Min wrote:
> > On Mon, Jan 16, 2012 at 11:53 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> > > On Sun, Jan 15, 2012 at 09:11:07PM +0530, Rabin Vincent wrote:
> > >> On Sun, Jan 15, 2012 at 08:58:53PM +0800, Wu Fengguang wrote:
> > >> > On Sun, Jan 15, 2012 at 03:58:43PM +0530, Rabin Vincent wrote:
> > >> > > diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> > >> > > index 71034f4..a39ad70 100644
> > >> > > --- a/mm/backing-dev.c
> > >> > > +++ b/mm/backing-dev.c
> > >> > > @@ -318,7 +318,7 @@ static void wakeup_timer_fn(unsigned long data)
> > >> > >   if (bdi->wb.task) {
> > >> > >           trace_writeback_wake_thread(bdi);
> > >> > >           wake_up_process(bdi->wb.task);
> > >> > > - } else {
> > >> > > + } else if (bdi->dev) {
> > >> > >           /*
> > >> > >            * When bdi tasks are inactive for long time, they are killed.
> > >> > >            * In this case we have to wake-up the forker thread which
> > >> > > @@ -584,6 +584,8 @@ EXPORT_SYMBOL(bdi_register_dev);
> > >> > >   */
> > >> > >  static void bdi_wb_shutdown(struct backing_dev_info *bdi)
> > >> > >  {
> > >> > > + struct task_struct *task = NULL;
> > 
> > Thanks, I fully understand.
> > In addition, Would not bdi_wakeup_flusher also be fixed?
> 
> Yes, indeed!

But wait.. Rabin's patch actually fixes the NULL deference in
the call

        trace_writeback_wake_forker_thread(bdi);

The wakeup of the forker thread should be harmless.

Thanks,
Fengguang

> > >From dc7d4e86911c0a7ea35043485b04e8a09aa74ffd Mon Sep 17 00:00:00 2001
> > From: Chanho Min <chanho.min@lge.com>
> > Date: Mon, 16 Jan 2012 14:14:48 +0900
> > Subject: [PATCH] backing-dev: add for 'fix wakeup timer races with
> > bdi_unregister()'
> > 
> > Signed-off-by: Chanho Min <chanho.min@lge.com>
> > ---
> >  fs/fs-writeback.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> > 
> > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> > index f855916..ee40833 100644
> > --- a/fs/fs-writeback.c
> > +++ b/fs/fs-writeback.c
> > @@ -97,7 +97,7 @@ static void bdi_wakeup_flusher(struct backing_dev_info *bdi)
> >  {
> >         if (bdi->wb.task) {
> >                 wake_up_process(bdi->wb.task);
> > -       } else {
> > +       } else if (bdi->dev) {
> >                 /*
> >                  * The bdi thread isn't there, wake up the forker thread which
> >                  * will create and run it.
> > -- 
> > 1.7.0.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached
  2012-01-16  5:53             ` Wu Fengguang
@ 2012-01-16  6:34               ` Chanho Min
  2012-01-18 19:43                 ` Rabin Vincent
  0 siblings, 1 reply; 21+ messages in thread
From: Chanho Min @ 2012-01-16  6:34 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Rabin Vincent, Jens Axboe, linux-kernel

> But wait.. Rabin's patch actually fixes the NULL deference in
> the call
>
>        trace_writeback_wake_forker_thread(bdi);
>
> The wakeup of the forker thread should be harmless.
>
The forker thread shoud not be woken up after the bdi is unregistered.
Is this also Rabin's intention?
Also, I'm not sure the wakeup of the forker thread is harmless as bellows.
..
case FORK_THREAD:
..
  task = kthread_create(bdi_writeback_thread, &bdi->wb,
		      "flush-%s", dev_name(bdi->dev));

Thanks
Chanho

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached
  2012-01-16  6:34               ` Chanho Min
@ 2012-01-18 19:43                 ` Rabin Vincent
  0 siblings, 0 replies; 21+ messages in thread
From: Rabin Vincent @ 2012-01-18 19:43 UTC (permalink / raw)
  To: Chanho Min; +Cc: Wu Fengguang, Jens Axboe, linux-kernel

On Mon, Jan 16, 2012 at 12:04, Chanho Min <chanho0207@gmail.com> wrote:
>> But wait.. Rabin's patch actually fixes the NULL deference in
>> the call
>>
>>        trace_writeback_wake_forker_thread(bdi);
>>
>> The wakeup of the forker thread should be harmless.
>>
> The forker thread shoud not be woken up after the bdi is unregistered.
> Is this also Rabin's intention?
> Also, I'm not sure the wakeup of the forker thread is harmless as bellows.
> ..
> case FORK_THREAD:
> ..
>  task = kthread_create(bdi_writeback_thread, &bdi->wb,
>                      "flush-%s", dev_name(bdi->dev));

By the time the bdi->dev is made NULL, the bdi has been removed from the
bdi_list (in bdi_wb_shutdown()), so even if the forker thread is woken
up after that it will not find the bdi and won't get to this part of the
code.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister()
  2012-01-16  2:53       ` Wu Fengguang
  2012-01-16  5:28         ` Chanho Min
@ 2012-01-19 16:50         ` Rabin Vincent
  2012-01-19 23:46           ` Namjae Jeon
  2012-01-31 13:24           ` Wu Fengguang
  1 sibling, 2 replies; 21+ messages in thread
From: Rabin Vincent @ 2012-01-19 16:50 UTC (permalink / raw)
  To: fengguang.wu, axboe; +Cc: linux-kernel, chanho0207, Rabin Vincent

While 7a401a972df8e18 ("backing-dev: ensure wakeup_timer is deleted")
addressed the problem of the bdi being freed with a queued wakeup
timer, there are other races that could happen if the wakeup timer
expires after/during bdi_unregister(), before bdi_destroy() is called.

wakeup_timer_fn() could attempt to wakeup a task which has already has
been freed, or could access a NULL bdi->dev via the wake_forker_thread
tracepoint.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Reported-by: Chanho Min <chanho.min@lge.com>
Signed-off-by: Rabin Vincent <rabin@rab.in>
---

v2:
 - rebase onto the latest kernel which removed the thaw()
 - don't unnecessarily initialize task
 - unregister device after setting bdi->dev to NULL, not before

 mm/backing-dev.c |   23 ++++++++++++++++++-----
 1 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 7ba8fea..dd8e2aa 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -318,7 +318,7 @@ static void wakeup_timer_fn(unsigned long data)
 	if (bdi->wb.task) {
 		trace_writeback_wake_thread(bdi);
 		wake_up_process(bdi->wb.task);
-	} else {
+	} else if (bdi->dev) {
 		/*
 		 * When bdi tasks are inactive for long time, they are killed.
 		 * In this case we have to wake-up the forker thread which
@@ -584,6 +584,8 @@ EXPORT_SYMBOL(bdi_register_dev);
  */
 static void bdi_wb_shutdown(struct backing_dev_info *bdi)
 {
+	struct task_struct *task;
+
 	if (!bdi_cap_writeback_dirty(bdi))
 		return;
 
@@ -602,8 +604,13 @@ static void bdi_wb_shutdown(struct backing_dev_info *bdi)
 	 * Finally, kill the kernel thread. We don't need to be RCU
 	 * safe anymore, since the bdi is gone from visibility.
 	 */
-	if (bdi->wb.task)
-		kthread_stop(bdi->wb.task);
+	spin_lock_bh(&bdi->wb_lock);
+	task = bdi->wb.task;
+	bdi->wb.task = NULL;
+	spin_unlock_bh(&bdi->wb_lock);
+
+	if (task)
+		kthread_stop(task);
 }
 
 /*
@@ -623,7 +630,9 @@ static void bdi_prune_sb(struct backing_dev_info *bdi)
 
 void bdi_unregister(struct backing_dev_info *bdi)
 {
-	if (bdi->dev) {
+	struct device *dev = bdi->dev;
+
+	if (dev) {
 		bdi_set_min_ratio(bdi, 0);
 		trace_writeback_bdi_unregister(bdi);
 		bdi_prune_sb(bdi);
@@ -632,8 +641,12 @@ void bdi_unregister(struct backing_dev_info *bdi)
 		if (!bdi_cap_flush_forker(bdi))
 			bdi_wb_shutdown(bdi);
 		bdi_debug_unregister(bdi);
-		device_unregister(bdi->dev);
+
+		spin_lock_bh(&bdi->wb_lock);
 		bdi->dev = NULL;
+		spin_unlock_bh(&bdi->wb_lock);
+
+		device_unregister(dev);
 	}
 }
 EXPORT_SYMBOL(bdi_unregister);
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister()
  2012-01-19 16:50         ` [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister() Rabin Vincent
@ 2012-01-19 23:46           ` Namjae Jeon
  2012-01-20  5:24             ` Rabin Vincent
  2012-01-31 13:24           ` Wu Fengguang
  1 sibling, 1 reply; 21+ messages in thread
From: Namjae Jeon @ 2012-01-19 23:46 UTC (permalink / raw)
  To: Rabin Vincent; +Cc: fengguang.wu, axboe, linux-kernel, chanho0207

>                bdi_debug_unregister(bdi);
> -               device_unregister(bdi->dev);
> +
> +               spin_lock_bh(&bdi->wb_lock);
>                bdi->dev = NULL;
> +               spin_unlock_bh(&bdi->wb_lock);
Hi.
Would you explain me why you add spinlock in here ?
Thanks.
> +
> +               device_unregister(dev);
>        }
>  }
>  EXPORT_SYMBOL(bdi_unregister);
> --
> 1.7.7.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister()
  2012-01-19 23:46           ` Namjae Jeon
@ 2012-01-20  5:24             ` Rabin Vincent
  2012-01-20  6:15               ` Namjae Jeon
  0 siblings, 1 reply; 21+ messages in thread
From: Rabin Vincent @ 2012-01-20  5:24 UTC (permalink / raw)
  To: Namjae Jeon; +Cc: fengguang.wu, axboe, linux-kernel, chanho0207

On Fri, Jan 20, 2012 at 05:16, Namjae Jeon <linkinjeon@gmail.com> wrote:
>>                bdi_debug_unregister(bdi);
>> -               device_unregister(bdi->dev);
>> +
>> +               spin_lock_bh(&bdi->wb_lock);
>>                bdi->dev = NULL;
>> +               spin_unlock_bh(&bdi->wb_lock);
> Hi.
> Would you explain me why you add spinlock in here ?

wakeup_timer_fn() does the following, where the
trace_writeback_wake_forker_thread() also accesses bdi->dev.
It does this under the wb_lock:

	} else if (bdi->dev) {
		/*
		 * When bdi tasks are inactive for long time, they are killed.
		 * In this case we have to wake-up the forker thread which
		 * should create and run the bdi thread.
		 */
		trace_writeback_wake_forker_thread(bdi);

If we don't have the lock above, the bdi->dev could potentially be
cleared after the check but before the tracepoint is hit, leading to a
NULL pointer dereference.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister()
  2012-01-20  5:24             ` Rabin Vincent
@ 2012-01-20  6:15               ` Namjae Jeon
  2012-01-20 10:03                 ` Rabin Vincent
  0 siblings, 1 reply; 21+ messages in thread
From: Namjae Jeon @ 2012-01-20  6:15 UTC (permalink / raw)
  To: Rabin Vincent; +Cc: fengguang.wu, axboe, linux-kernel, chanho0207

2012/1/20 Rabin Vincent <rabin@rab.in>:
> On Fri, Jan 20, 2012 at 05:16, Namjae Jeon <linkinjeon@gmail.com> wrote:
>>>                bdi_debug_unregister(bdi);
>>> -               device_unregister(bdi->dev);
>>> +
>>> +               spin_lock_bh(&bdi->wb_lock);
>>>                bdi->dev = NULL;
>>> +               spin_unlock_bh(&bdi->wb_lock);
>> Hi.
>> Would you explain me why you add spinlock in here ?
>
> wakeup_timer_fn() does the following, where the
> trace_writeback_wake_forker_thread() also accesses bdi->dev.
> It does this under the wb_lock:
>
>        } else if (bdi->dev) {
>                /*
>                 * When bdi tasks are inactive for long time, they are killed.
>                 * In this case we have to wake-up the forker thread which
>                 * should create and run the bdi thread.
>                 */
>                trace_writeback_wake_forker_thread(bdi);
>
> If we don't have the lock above, the bdi->dev could potentially be
> cleared after the check but before the tracepoint is hit, leading to a
> NULL pointer dereference.
Is there no possibility trace_writeback_wake_forker_thread is called
after spin_unlock of bdi->de= null ?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister()
  2012-01-20  6:15               ` Namjae Jeon
@ 2012-01-20 10:03                 ` Rabin Vincent
  2012-01-20 11:18                   ` Namjae Jeon
  0 siblings, 1 reply; 21+ messages in thread
From: Rabin Vincent @ 2012-01-20 10:03 UTC (permalink / raw)
  To: Namjae Jeon; +Cc: fengguang.wu, axboe, linux-kernel, chanho0207

On Fri, Jan 20, 2012 at 03:15:32PM +0900, Namjae Jeon wrote:
> 2012/1/20 Rabin Vincent <rabin@rab.in>:
> > On Fri, Jan 20, 2012 at 05:16, Namjae Jeon <linkinjeon@gmail.com> wrote:
> >>>                bdi_debug_unregister(bdi);
> >>> -               device_unregister(bdi->dev);
> >>> +
> >>> +               spin_lock_bh(&bdi->wb_lock);
> >>>                bdi->dev = NULL;
> >>> +               spin_unlock_bh(&bdi->wb_lock);
> >> Hi.
> >> Would you explain me why you add spinlock in here ?
> >
> > wakeup_timer_fn() does the following, where the
> > trace_writeback_wake_forker_thread() also accesses bdi->dev.
> > It does this under the wb_lock:
> >
> >        } else if (bdi->dev) {
> >                /*
> >                 * When bdi tasks are inactive for long time, they are killed.
> >                 * In this case we have to wake-up the forker thread which
> >                 * should create and run the bdi thread.
> >                 */
> >                trace_writeback_wake_forker_thread(bdi);
> >
> > If we don't have the lock above, the bdi->dev could potentially be
> > cleared after the check but before the tracepoint is hit, leading to a
> > NULL pointer dereference.
> Is there no possibility trace_writeback_wake_forker_thread is called
> after spin_unlock of bdi->de= null ?

wakeup_timer_fn() holds the wb_lock across the check for bdi->dev !=
NULL and the call to trace_writeback_wake_forker_thread(), so no.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister()
  2012-01-20 10:03                 ` Rabin Vincent
@ 2012-01-20 11:18                   ` Namjae Jeon
  2012-01-20 12:08                     ` Rabin Vincent
  0 siblings, 1 reply; 21+ messages in thread
From: Namjae Jeon @ 2012-01-20 11:18 UTC (permalink / raw)
  To: Rabin Vincent; +Cc: fengguang.wu, axboe, linux-kernel, chanho0207

2012/1/20 Rabin Vincent <rabin@rab.in>:
> On Fri, Jan 20, 2012 at 03:15:32PM +0900, Namjae Jeon wrote:
>> 2012/1/20 Rabin Vincent <rabin@rab.in>:
>> > On Fri, Jan 20, 2012 at 05:16, Namjae Jeon <linkinjeon@gmail.com> wrote:
>> >>>                bdi_debug_unregister(bdi);
>> >>> -               device_unregister(bdi->dev);
>> >>> +
>> >>> +               spin_lock_bh(&bdi->wb_lock);
>> >>>                bdi->dev = NULL;
>> >>> +               spin_unlock_bh(&bdi->wb_lock);
>> >> Hi.
>> >> Would you explain me why you add spinlock in here ?
>> >
>> > wakeup_timer_fn() does the following, where the
>> > trace_writeback_wake_forker_thread() also accesses bdi->dev.
>> > It does this under the wb_lock:
>> >
>> >        } else if (bdi->dev) {
>> >                /*
>> >                 * When bdi tasks are inactive for long time, they are killed.
>> >                 * In this case we have to wake-up the forker thread which
>> >                 * should create and run the bdi thread.
>> >                 */
>> >                trace_writeback_wake_forker_thread(bdi);
>> >
>> > If we don't have the lock above, the bdi->dev could potentially be
>> > cleared after the check but before the tracepoint is hit, leading to a
>> > NULL pointer dereference.
>> Is there no possibility trace_writeback_wake_forker_thread is called
>> after spin_unlock of bdi->de= null ?
>
> wakeup_timer_fn() holds the wb_lock across the check for bdi->dev !=
> NULL and the call to trace_writeback_wake_forker_thread(), so no.

The following case is not possible with racing of __mark_inode_dirty  ?
--------------------------------------------------------------------------------------------------
spin_lock
bdi->dev = NULL;
spin_unlock
                              spin_lock
                               ......
                               ......
                               trace_writeback_wake_forker_thread(bdi);
                               spin_unlock
-----------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister()
  2012-01-20 11:18                   ` Namjae Jeon
@ 2012-01-20 12:08                     ` Rabin Vincent
  2012-01-20 15:04                       ` Namjae Jeon
  0 siblings, 1 reply; 21+ messages in thread
From: Rabin Vincent @ 2012-01-20 12:08 UTC (permalink / raw)
  To: Namjae Jeon; +Cc: fengguang.wu, axboe, linux-kernel, chanho0207

On Fri, Jan 20, 2012 at 08:18:31PM +0900, Namjae Jeon wrote:
> The following case is not possible with racing of __mark_inode_dirty  ?
> --------------------------------------------------------------------------------------------------
> spin_lock
> bdi->dev = NULL;
> spin_unlock
>                               spin_lock
>                                ......
>                                ......
>                                trace_writeback_wake_forker_thread(bdi);
>                                spin_unlock
> -----------------------------------------------------------------------------

After this patch, this is what wakeup_timer_fn looks like:

static void wakeup_timer_fn(unsigned long data)
{
        struct backing_dev_info *bdi = (struct backing_dev_info *)data;

        spin_lock_bh(&bdi->wb_lock);
        if (bdi->wb.task) {
                trace_writeback_wake_thread(bdi);
                wake_up_process(bdi->wb.task);
        } else if (bdi->dev) {
                /*
                 * When bdi tasks are inactive for long time, they are killed.
                 * In this case we have to wake-up the forker thread which
                 * should create and run the bdi thread.
                 */
                trace_writeback_wake_forker_thread(bdi);
                wake_up_process(default_backing_dev_info.wb.task);
        }
        spin_unlock_bh(&bdi->wb_lock);
}

So how will trace_writeback_wake_forker_thread() be called if bdi->dev is NULL?

This patch added the if (bdi->dev) check, perhaps you overlooked that?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister()
  2012-01-20 12:08                     ` Rabin Vincent
@ 2012-01-20 15:04                       ` Namjae Jeon
  0 siblings, 0 replies; 21+ messages in thread
From: Namjae Jeon @ 2012-01-20 15:04 UTC (permalink / raw)
  To: Rabin Vincent; +Cc: fengguang.wu, axboe, linux-kernel, chanho0207

>
> After this patch, this is what wakeup_timer_fn looks like:
>
> static void wakeup_timer_fn(unsigned long data)
> {
>        struct backing_dev_info *bdi = (struct backing_dev_info *)data;
>
>        spin_lock_bh(&bdi->wb_lock);
>        if (bdi->wb.task) {
>                trace_writeback_wake_thread(bdi);
>                wake_up_process(bdi->wb.task);
>        } else if (bdi->dev) {
>                /*
>                 * When bdi tasks are inactive for long time, they are killed.
>                 * In this case we have to wake-up the forker thread which
>                 * should create and run the bdi thread.
>                 */
>                trace_writeback_wake_forker_thread(bdi);
>                wake_up_process(default_backing_dev_info.wb.task);
>        }
>        spin_unlock_bh(&bdi->wb_lock);
> }
>
> So how will trace_writeback_wake_forker_thread() be called if bdi->dev is NULL?
>
> This patch added the if (bdi->dev) check, perhaps you overlooked that?

Hi Rabin.
I clearly understand. Thanks for your explanation.

Reviewed-by: Namjae Jeon <linkinjeon@gmail.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister()
  2012-01-19 16:50         ` [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister() Rabin Vincent
  2012-01-19 23:46           ` Namjae Jeon
@ 2012-01-31 13:24           ` Wu Fengguang
  1 sibling, 0 replies; 21+ messages in thread
From: Wu Fengguang @ 2012-01-31 13:24 UTC (permalink / raw)
  To: Rabin Vincent; +Cc: axboe, linux-kernel, chanho0207

On Thu, Jan 19, 2012 at 10:20:20PM +0530, Rabin Vincent wrote:
> While 7a401a972df8e18 ("backing-dev: ensure wakeup_timer is deleted")
> addressed the problem of the bdi being freed with a queued wakeup
> timer, there are other races that could happen if the wakeup timer
> expires after/during bdi_unregister(), before bdi_destroy() is called.
> 
> wakeup_timer_fn() could attempt to wakeup a task which has already has
> been freed, or could access a NULL bdi->dev via the wake_forker_thread
> tracepoint.
> 
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Wu Fengguang <fengguang.wu@intel.com>
> Reported-by: Chanho Min <chanho.min@lge.com>
> Signed-off-by: Rabin Vincent <rabin@rab.in>
> ---

Thanks. I just pushed it to linux-next, together with this related one:

        Subject: writeback: fix NULL bdi->dev in trace writeback_single_inode
        Date: Tue Jan 17 11:18:56 CST 2012

        bdi_prune_sb() resets sb->s_bdi to default_backing_dev_info when the
        tearing down the original bdi. Fix trace_writeback_single_inode to
        use sb->s_bdi=default_backing_dev_info rather than bdi->dev=NULL for a
        teared down bdi.

        Reported-by: Rabin Vincent <rabin@rab.in>
        Tested-by: Rabin Vincent <rabin@rab.in>
        Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached
       [not found]   ` <002e01ccc9c7$1928c940$4b7a5bc0$@min@lge.com>
@ 2012-01-03  4:49     ` Wu Fengguang
  0 siblings, 0 replies; 21+ messages in thread
From: Wu Fengguang @ 2012-01-03  4:49 UTC (permalink / raw)
  To: Chanho Min
  Cc: linux-mm, linux-kernel, 'Jens Axboe',
	'Andrew Morton',
	Rabin Vincent, Linus Walleij

On Tue, Jan 03, 2012 at 12:23:44PM +0900, Chanho Min wrote:
> >On Mon, Jan 02, 2012 at 06:38:21PM +0900, ����ȣ wrote:
> >> from Chanho Min <chanho.min@lge.com>
> >>
> >> System may crash in backing-dev.c when removal SCSI device is detached.
> >> bdi task is killed by bdi_unregister()/'khubd', but task's point remains.
> >> Shortly afterward, If 'wb->wakeup_timer' is expired before
> >> del_timer()/bdi_forker_thread,
> >> wakeup_timer_fn() may wake up the dead thread which cause the crash.
> >> 'bdi->wb.task' should be NULL as this patch.
> >
> >Is it some race condition between del_timer() and del_timer_sync()?
> >
> >bdi_unregister() calls
> >
> >        del_timer_sync
> >        bdi_wb_shutdown
> >            kthread_stop
> >
> >in turn, and del_timer_sync() should guarantee wakeup_timer_fn() is
> >no longer called to access the stopped task.
> >
> 
> It is not race condition. This happens when USB is removed during write-access.
> bdi_wakeup_thread_delayed is called after kthread_stop, and timer is activated again.
> 
> 	bdi_unregister
> 		kthread_stop
> 	bdi_wakeup_thread_delayed (sys_write mostly calls this)
> 	timer fires

Ah OK, the timer could be restarted in the mean while, which breaks
the synchronization rule in del_timer_sync().

I noticed a related fix is merged recently, does your test kernel
contain this commit?

commit 7a401a972df8e184b3d1a3fc958c0a4ddee8d312
Author: Rabin Vincent <rabin.vincent@stericsson.com>
Date:   Fri Nov 11 13:29:04 2011 +0100

    backing-dev: ensure wakeup_timer is deleted

> Anyway,Is this safeguard to prevent from waking up killed thread?

This patch makes no guarantee wakeup_timer_fn() will see NULL
bdi->wb.task before the task is stopped, so there is still race
conditions. And still, the complete fix would be to prevent
wakeup_timer_fn() from being called at all.

Thanks,
Fengguang

> >> Signed-off-by: Chanho Min <chanho.min@lge.com>
> >> ---
> >>  mm/backing-dev.c |    1 +
> >>  1 files changed, 1 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> >> index 71034f4..4378a5e 100644
> >> --- a/mm/backing-dev.c
> >> +++ b/mm/backing-dev.c
> >> @@ -607,6 +607,7 @@ static void bdi_wb_shutdown(struct backing_dev_info
> >> *bdi)
> >>         if (bdi->wb.task) {
> >>                 thaw_process(bdi->wb.task);
> >>                 kthread_stop(bdi->wb.task);
> >> +               bdi->wb.task = NULL;
> >>         }
> >>  }
> >>
> >> --
> >> 1.7.0.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached
       [not found] <004401ccc932$444a0070$ccde0150$@min@lge.com>
@ 2012-01-02  9:57 ` Wu Fengguang
       [not found]   ` <002e01ccc9c7$1928c940$4b7a5bc0$@min@lge.com>
  0 siblings, 1 reply; 21+ messages in thread
From: Wu Fengguang @ 2012-01-02  9:57 UTC (permalink / raw)
  To: ����ȣ
  Cc: linux-mm, linux-kernel, 'Jens Axboe', 'Andrew Morton'

On Mon, Jan 02, 2012 at 06:38:21PM +0900, ����ȣ wrote:
> from Chanho Min <chanho.min@lge.com>
> 
> System may crash in backing-dev.c when removal SCSI device is detached.
> bdi task is killed by bdi_unregister()/'khubd', but task's point remains.
> Shortly afterward, If 'wb->wakeup_timer' is expired before
> del_timer()/bdi_forker_thread,
> wakeup_timer_fn() may wake up the dead thread which cause the crash.
> 'bdi->wb.task' should be NULL as this patch.

Is it some race condition between del_timer() and del_timer_sync()?

bdi_unregister() calls 

        del_timer_sync
        bdi_wb_shutdown
            kthread_stop

in turn, and del_timer_sync() should guarantee wakeup_timer_fn() is
no longer called to access the stopped task.

Thanks,
Fengguang


> Signed-off-by: Chanho Min <chanho.min@lge.com>
> ---
>  mm/backing-dev.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> index 71034f4..4378a5e 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -607,6 +607,7 @@ static void bdi_wb_shutdown(struct backing_dev_info
> *bdi)
>         if (bdi->wb.task) {
>                 thaw_process(bdi->wb.task);
>                 kthread_stop(bdi->wb.task);
> +               bdi->wb.task = NULL;
>         }
>  }
> 
> -- 
> 1.7.0.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2012-01-31 13:34 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-05  8:49 [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached Chanho Min
2012-01-15 10:28 ` Rabin Vincent
2012-01-15 12:58   ` Wu Fengguang
2012-01-15 15:41     ` Rabin Vincent
2012-01-16  2:53       ` Wu Fengguang
2012-01-16  5:28         ` Chanho Min
2012-01-16  5:50           ` Wu Fengguang
2012-01-16  5:53             ` Wu Fengguang
2012-01-16  6:34               ` Chanho Min
2012-01-18 19:43                 ` Rabin Vincent
2012-01-19 16:50         ` [PATCHv2] backing-dev: fix wakeup timer races with bdi_unregister() Rabin Vincent
2012-01-19 23:46           ` Namjae Jeon
2012-01-20  5:24             ` Rabin Vincent
2012-01-20  6:15               ` Namjae Jeon
2012-01-20 10:03                 ` Rabin Vincent
2012-01-20 11:18                   ` Namjae Jeon
2012-01-20 12:08                     ` Rabin Vincent
2012-01-20 15:04                       ` Namjae Jeon
2012-01-31 13:24           ` Wu Fengguang
     [not found] <004401ccc932$444a0070$ccde0150$@min@lge.com>
2012-01-02  9:57 ` [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached Wu Fengguang
     [not found]   ` <002e01ccc9c7$1928c940$4b7a5bc0$@min@lge.com>
2012-01-03  4:49     ` Wu Fengguang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).