* [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
@ 2011-09-01 15:57 ` Kautuk Consul
  0 siblings, 0 replies; 26+ messages in thread
From: Kautuk Consul @ 2011-09-01 15:57 UTC (permalink / raw)
  To: Andrew Morton, Jens Axboe, Wu Fengguang, Jan Kara, Dave Chinner
  Cc: linux-mm, linux-kernel, Kautuk Consul

This is important for SMP scenario, to check whether the timer
callback is executing on another CPU when we are deleting the
timer.

Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
---
 mm/backing-dev.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index d6edf8d..754b35a 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
 		 * dirty data on the default backing_dev_info
 		 */
 		if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
-			del_timer(&me->wakeup_timer);
+			del_timer_sync(&me->wakeup_timer);
 			wb_do_writeback(me, 0);
 		}
 
-- 
1.7.4.1
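
For reference, a minimal sketch of the semantic difference the patch relies
on. This is illustrative only and not part of mm/backing-dev.c (the helper
name stop_wakeup_timer() is made up): del_timer() only deactivates a pending
timer and may return while the callback is still running on another CPU,
whereas del_timer_sync() additionally waits for such a running callback to
finish before returning.

#include <linux/timer.h>
#include <linux/types.h>

/* Illustrative helper, not kernel code. */
static void stop_wakeup_timer(struct timer_list *timer, bool sync)
{
	if (sync)
		del_timer_sync(timer);	/* also waits out a handler running on another CPU */
	else
		del_timer(timer);	/* may return while the handler is still mid-flight */
}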



* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-01 15:57 ` Kautuk Consul
@ 2011-09-01 21:33   ` Andrew Morton
  -1 siblings, 0 replies; 26+ messages in thread
From: Andrew Morton @ 2011-09-01 21:33 UTC (permalink / raw)
  To: Kautuk Consul
  Cc: Jens Axboe, Wu Fengguang, Jan Kara, Dave Chinner, linux-mm, linux-kernel

On Thu,  1 Sep 2011 21:27:02 +0530
Kautuk Consul <consul.kautuk@gmail.com> wrote:

> This is important for SMP scenario, to check whether the timer
> callback is executing on another CPU when we are deleting the
> timer.
> 

I don't see why?

> index d6edf8d..754b35a 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
>  		 * dirty data on the default backing_dev_info
>  		 */
>  		if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
> -			del_timer(&me->wakeup_timer);
> +			del_timer_sync(&me->wakeup_timer);
>  			wb_do_writeback(me, 0);
>  		}

It isn't a use-after-free fix: bdi_unregister() safely shoots down any
running timer.

Please completely explain what you believe the problem is here.
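
For context, the teardown path mentioned above uses the synchronous variant.
A rough sketch under the ~3.1-era field names (bdi->wb.wakeup_timer), not a
verbatim copy of bdi_unregister():

#include <linux/backing-dev.h>
#include <linux/timer.h>

/* Sketch of the relevant step only; the rest of the teardown is elided. */
static void bdi_teardown_sketch(struct backing_dev_info *bdi)
{
	/* Waits for wakeup_timer_fn() if it is running on another CPU, so the
	 * timer cannot fire into a bdi that is being torn down. */
	del_timer_sync(&bdi->wb.wakeup_timer);
	/* ... shut down the flusher thread, unregister the device, ... */
}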


* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-01 21:33   ` Andrew Morton
@ 2011-09-02  5:17     ` kautuk.c @samsung.com
  -1 siblings, 0 replies; 26+ messages in thread
From: kautuk.c @samsung.com @ 2011-09-02  5:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jens Axboe, Wu Fengguang, Jan Kara, Dave Chinner, linux-mm, linux-kernel

Hi,

On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Thu,  1 Sep 2011 21:27:02 +0530
> Kautuk Consul <consul.kautuk@gmail.com> wrote:
>
>> This is important for SMP scenario, to check whether the timer
>> callback is executing on another CPU when we are deleting the
>> timer.
>>
>
> I don't see why?
>
>> index d6edf8d..754b35a 100644
>> --- a/mm/backing-dev.c
>> +++ b/mm/backing-dev.c
>> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
>>                * dirty data on the default backing_dev_info
>>                */
>>               if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
>> -                     del_timer(&me->wakeup_timer);
>> +                     del_timer_sync(&me->wakeup_timer);
>>                       wb_do_writeback(me, 0);
>>               }
>
> It isn't a use-after-free fix: bdi_unregister() safely shoots down any
> running timer.
>

In the situation that we do a del_timer at the same time that the
wakeup_timer_fn is executing on another CPU, there is one tiny possible
problem:
1)  The wakeup_timer_fn will call wake_up_process on the bdi-default
    thread. This will set the bdi-default thread's state to TASK_RUNNING.
2)  However, the code in bdi_writeback_thread() sets the state of the
    bdi-default process to TASK_INTERRUPTIBLE as it intends to sleep later.

If 2) happens before 1), then the bdi_forker_thread will not sleep inside
schedule() as is the intention of the bdi_forker_thread() code.

This protection is not achieved even by acquiring spinlocks before setting
task->state, as the spinlock used in wakeup_timer_fn is &bdi->wb_lock
whereas the code in bdi_forker_thread acquires &bdi_lock, which is a
different spinlock.

Am I correct in concluding this?

> Please completely explain what you believe the problem is here.
>
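
A condensed sketch of the interleaving described above. The function and
variable names are placeholders for illustration, not the actual
mm/backing-dev.c code:

#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/timer.h>

static struct task_struct *forker_task;	/* the bdi-default thread */
static struct timer_list wakeup_timer;

/* Timer callback; may still be running on another CPU after del_timer(). */
static void wakeup_timer_fn_sketch(unsigned long data)
{
	wake_up_process(forker_task);		/* 1) sets the thread back to TASK_RUNNING */
}

static int forker_thread_sketch(void *unused)
{
	while (!kthread_should_stop()) {
		del_timer(&wakeup_timer);	/* does not wait for a running callback */
		/* ... no dirty io and no queued work (the NO_ACTION case) ... */
		set_current_state(TASK_INTERRUPTIBLE);	/* 2) */
		/* If 1) runs after 2), the state flips back to TASK_RUNNING and
		 * schedule() returns immediately: one more trip around the loop
		 * instead of the intended sleep. */
		schedule();
	}
	return 0;
}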


* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-02  5:17     ` kautuk.c @samsung.com
@ 2011-09-02 11:21       ` Jan Kara
  -1 siblings, 0 replies; 26+ messages in thread
From: Jan Kara @ 2011-09-02 11:21 UTC (permalink / raw)
  To: kautuk.c @samsung.com
  Cc: Andrew Morton, Jens Axboe, Wu Fengguang, Jan Kara, Dave Chinner,
	linux-mm, linux-kernel

  Hello,

On Fri 02-09-11 10:47:03, kautuk.c @samsung.com wrote:
> On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Thu,  1 Sep 2011 21:27:02 +0530
> > Kautuk Consul <consul.kautuk@gmail.com> wrote:
> >
> >> This is important for SMP scenario, to check whether the timer
> >> callback is executing on another CPU when we are deleting the
> >> timer.
> >>
> >
> > I don't see why?
> >
> >> index d6edf8d..754b35a 100644
> >> --- a/mm/backing-dev.c
> >> +++ b/mm/backing-dev.c
> >> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
> >>                * dirty data on the default backing_dev_info
> >>                */
> >>               if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
> >> -                     del_timer(&me->wakeup_timer);
> >> +                     del_timer_sync(&me->wakeup_timer);
> >>                       wb_do_writeback(me, 0);
> >>               }
> >
> > It isn't a use-after-free fix: bdi_unregister() safely shoots down any
> > running timer.
> >
> 
> In the situation that we do a del_timer at the same time that the
> wakeup_timer_fn is
> executing on another CPU, there is one tiny possible problem:
> 1)  The wakeup_timer_fn will call wake_up_process on the bdi-default thread.
>       This will set the bdi-default thread's state to TASK_RUNNING.
> 2)  However, the code in bdi_writeback_thread() sets the state of the
> bdi-default process
>     to TASK_INTERRUPTIBLE as it intends to sleep later.
> 
> If 2) happens before 1), then the bdi_forker_thread will not sleep
> inside schedule as is the intention of the bdi_forker_thread() code.
  OK, I agree the code in bdi_forker_thread() might use some straightening
up wrt. task state handling, but is what you describe really an issue? Sure,
the task won't go to sleep, but the whole effect is that it will just loop
once more to find out there's nothing to do and then go to sleep - not a
big deal... Or am I missing something?

> This protection is not achieved even by acquiring spinlocks before
> setting the task->state
> as the spinlock used in wakeup_timer_fn is &bdi->wb_lock whereas the code in
> bdi_forker_thread acquires &bdi_lock which is a different spin_lock.
> 
> Am I correct in concluding this ?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-02 11:21       ` Jan Kara
@ 2011-09-02 11:44         ` kautuk.c @samsung.com
  -1 siblings, 0 replies; 26+ messages in thread
From: kautuk.c @samsung.com @ 2011-09-02 11:44 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm,
	linux-kernel

Hi,

On Fri, Sep 2, 2011 at 4:51 PM, Jan Kara <jack@suse.cz> wrote:
>  Hello,
>
> On Fri 02-09-11 10:47:03, kautuk.c @samsung.com wrote:
>> On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
>> > On Thu,  1 Sep 2011 21:27:02 +0530
>> > Kautuk Consul <consul.kautuk@gmail.com> wrote:
>> >
>> >> This is important for SMP scenario, to check whether the timer
>> >> callback is executing on another CPU when we are deleting the
>> >> timer.
>> >>
>> >
>> > I don't see why?
>> >
>> >> index d6edf8d..754b35a 100644
>> >> --- a/mm/backing-dev.c
>> >> +++ b/mm/backing-dev.c
>> >> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
>> >>                * dirty data on the default backing_dev_info
>> >>                */
>> >>               if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
>> >> -                     del_timer(&me->wakeup_timer);
>> >> +                     del_timer_sync(&me->wakeup_timer);
>> >>                       wb_do_writeback(me, 0);
>> >>               }
>> >
>> > It isn't a use-after-free fix: bdi_unregister() safely shoots down any
>> > running timer.
>> >
>>
>> In the situation that we do a del_timer at the same time that the
>> wakeup_timer_fn is
>> executing on another CPU, there is one tiny possible problem:
>> 1)  The wakeup_timer_fn will call wake_up_process on the bdi-default thread.
>>       This will set the bdi-default thread's state to TASK_RUNNING.
>> 2)  However, the code in bdi_writeback_thread() sets the state of the
>> bdi-default process
>>     to TASK_INTERRUPTIBLE as it intends to sleep later.
>>
>> If 2) happens before 1), then the bdi_forker_thread will not sleep
>> inside schedule as is the intention of the bdi_forker_thread() code.
>  OK, I agree the code in bdi_forker_thread() might use some straightening
> up wrt. task state handling but is what you decribe really an issue? Sure
> the task won't go to sleep but the whole effect is that it will just loop
> once more to find out there's nothing to do and then go to sleep - not a
> bug deal... Or am I missing something?

Yes, you are right.
I was studying the code and I found this inconsistency.
Anyways, if there is NO_ACTION it will just loop and go to sleep again.
I just posted this because I felt that the code was not achieving the logic
that was intended in terms of sleeps and wakeups.

I am currently trying to study the other patches you have just sent.

>
>> This protection is not achieved even by acquiring spinlocks before
>> setting the task->state
>> as the spinlock used in wakeup_timer_fn is &bdi->wb_lock whereas the code in
>> bdi_forker_thread acquires &bdi_lock which is a different spin_lock.
>>
>> Am I correct in concluding this ?
>
>                                                                Honza
> --
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
>


* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-02 11:44         ` kautuk.c @samsung.com
@ 2011-09-02 12:02           ` kautuk.c @samsung.com
  -1 siblings, 0 replies; 26+ messages in thread
From: kautuk.c @samsung.com @ 2011-09-02 12:02 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm,
	linux-kernel

Hi Jan,

I looked at that other patch you just sent.

I think that the task state problem can still happen in that case as the setting
of the task state is not protected by any lock and the timer callback can be
executing on another CPU at that time.

Am I right about this ?


On Fri, Sep 2, 2011 at 5:14 PM, kautuk.c @samsung.com
<consul.kautuk@gmail.com> wrote:
> Hi,
>
> On Fri, Sep 2, 2011 at 4:51 PM, Jan Kara <jack@suse.cz> wrote:
>>  Hello,
>>
>> On Fri 02-09-11 10:47:03, kautuk.c @samsung.com wrote:
>>> On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
>>> > On Thu,  1 Sep 2011 21:27:02 +0530
>>> > Kautuk Consul <consul.kautuk@gmail.com> wrote:
>>> >
>>> >> This is important for SMP scenario, to check whether the timer
>>> >> callback is executing on another CPU when we are deleting the
>>> >> timer.
>>> >>
>>> >
>>> > I don't see why?
>>> >
>>> >> index d6edf8d..754b35a 100644
>>> >> --- a/mm/backing-dev.c
>>> >> +++ b/mm/backing-dev.c
>>> >> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
>>> >>                * dirty data on the default backing_dev_info
>>> >>                */
>>> >>               if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
>>> >> -                     del_timer(&me->wakeup_timer);
>>> >> +                     del_timer_sync(&me->wakeup_timer);
>>> >>                       wb_do_writeback(me, 0);
>>> >>               }
>>> >
>>> > It isn't a use-after-free fix: bdi_unregister() safely shoots down any
>>> > running timer.
>>> >
>>>
>>> In the situation that we do a del_timer at the same time that the
>>> wakeup_timer_fn is
>>> executing on another CPU, there is one tiny possible problem:
>>> 1)  The wakeup_timer_fn will call wake_up_process on the bdi-default thread.
>>>       This will set the bdi-default thread's state to TASK_RUNNING.
>>> 2)  However, the code in bdi_writeback_thread() sets the state of the
>>> bdi-default process
>>>     to TASK_INTERRUPTIBLE as it intends to sleep later.
>>>
>>> If 2) happens before 1), then the bdi_forker_thread will not sleep
>>> inside schedule as is the intention of the bdi_forker_thread() code.
>>  OK, I agree the code in bdi_forker_thread() might use some straightening
>> up wrt. task state handling but is what you decribe really an issue? Sure
>> the task won't go to sleep but the whole effect is that it will just loop
>> once more to find out there's nothing to do and then go to sleep - not a
>> bug deal... Or am I missing something?
>
> Yes, you are right.
> I was studying the code and I found this inconsistency.
> Anyways, if there is NO_ACTION it will just loop and go to sleep again.
> I just posted this because I felt that the code was not achieving the logic
> that was intended in terms of sleeps and wakeups.
>
> I am currently trying to study the other patches you have just sent.
>
>>
>>> This protection is not achieved even by acquiring spinlocks before
>>> setting the task->state
>>> as the spinlock used in wakeup_timer_fn is &bdi->wb_lock whereas the code in
>>> bdi_forker_thread acquires &bdi_lock which is a different spin_lock.
>>>
>>> Am I correct in concluding this ?
>>
>>                                                                Honza
>> --
>> Jan Kara <jack@suse.cz>
>> SUSE Labs, CR
>>
>


* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-02 12:02           ` kautuk.c @samsung.com
@ 2011-09-02 15:14             ` Jan Kara
  -1 siblings, 0 replies; 26+ messages in thread
From: Jan Kara @ 2011-09-02 15:14 UTC (permalink / raw)
  To: kautuk.c @samsung.com
  Cc: Jan Kara, Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner,
	linux-mm, linux-kernel

On Fri 02-09-11 17:32:35, kautuk.c @samsung.com wrote:
> Hi Jan,
> 
> I looked at that other patch you just sent.
> 
> I think that the task state problem can still happen in that case as the setting
> of the task state is not protected by any lock and the timer callback can be
> executing on another CPU at that time.
> 
> Am I right about this ?
  Yes, the cleanup is not meant to change the scenario you describe - as I
said, there's no point in protecting against it as it's harmless...

								Honza

> On Fri, Sep 2, 2011 at 5:14 PM, kautuk.c @samsung.com
> <consul.kautuk@gmail.com> wrote:
> > Hi,
> >
> > On Fri, Sep 2, 2011 at 4:51 PM, Jan Kara <jack@suse.cz> wrote:
> >>  Hello,
> >>
> >> On Fri 02-09-11 10:47:03, kautuk.c @samsung.com wrote:
> >>> On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> >>> > On Thu,  1 Sep 2011 21:27:02 +0530
> >>> > Kautuk Consul <consul.kautuk@gmail.com> wrote:
> >>> >
> >>> >> This is important for SMP scenario, to check whether the timer
> >>> >> callback is executing on another CPU when we are deleting the
> >>> >> timer.
> >>> >>
> >>> >
> >>> > I don't see why?
> >>> >
> >>> >> index d6edf8d..754b35a 100644
> >>> >> --- a/mm/backing-dev.c
> >>> >> +++ b/mm/backing-dev.c
> >>> >> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
> >>> >>                * dirty data on the default backing_dev_info
> >>> >>                */
> >>> >>               if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
> >>> >> -                     del_timer(&me->wakeup_timer);
> >>> >> +                     del_timer_sync(&me->wakeup_timer);
> >>> >>                       wb_do_writeback(me, 0);
> >>> >>               }
> >>> >
> >>> > It isn't a use-after-free fix: bdi_unregister() safely shoots down any
> >>> > running timer.
> >>> >
> >>>
> >>> In the situation that we do a del_timer at the same time that the
> >>> wakeup_timer_fn is
> >>> executing on another CPU, there is one tiny possible problem:
> >>> 1)  The wakeup_timer_fn will call wake_up_process on the bdi-default thread.
> >>>       This will set the bdi-default thread's state to TASK_RUNNING.
> >>> 2)  However, the code in bdi_writeback_thread() sets the state of the
> >>> bdi-default process
> >>>     to TASK_INTERRUPTIBLE as it intends to sleep later.
> >>>
> >>> If 2) happens before 1), then the bdi_forker_thread will not sleep
> >>> inside schedule as is the intention of the bdi_forker_thread() code.
> >>  OK, I agree the code in bdi_forker_thread() might use some straightening
> >> up wrt. task state handling but is what you decribe really an issue? Sure
> >> the task won't go to sleep but the whole effect is that it will just loop
> >> once more to find out there's nothing to do and then go to sleep - not a
> >> bug deal... Or am I missing something?
> >
> > Yes, you are right.
> > I was studying the code and I found this inconsistency.
> > Anyways, if there is NO_ACTION it will just loop and go to sleep again.
> > I just posted this because I felt that the code was not achieving the logic
> > that was intended in terms of sleeps and wakeups.
> >
> > I am currently trying to study the other patches you have just sent.
> >
> >>
> >>> This protection is not achieved even by acquiring spinlocks before
> >>> setting the task->state
> >>> as the spinlock used in wakeup_timer_fn is &bdi->wb_lock whereas the code in
> >>> bdi_forker_thread acquires &bdi_lock which is a different spin_lock.
> >>>
> >>> Am I correct in concluding this ?
> >>
> >>                                                                Honza
> >> --
> >> Jan Kara <jack@suse.cz>
> >> SUSE Labs, CR
> >>
> >
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-02 15:14             ` Jan Kara
@ 2011-09-05  5:49               ` kautuk.c @samsung.com
  -1 siblings, 0 replies; 26+ messages in thread
From: kautuk.c @samsung.com @ 2011-09-05  5:49 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm,
	linux-kernel

On Fri, Sep 2, 2011 at 8:44 PM, Jan Kara <jack@suse.cz> wrote:
> On Fri 02-09-11 17:32:35, kautuk.c @samsung.com wrote:
>> Hi Jan,
>>
>> I looked at that other patch you just sent.
>>
>> I think that the task state problem can still happen in that case as the setting
>> of the task state is not protected by any lock and the timer callback can be
>> executing on another CPU at that time.
>>
>> Am I right about this ?
>  Yes, the cleanup is not meant to change the scenario you describe - as I
> said, there's no point in protecting against it as it's harmless...
>
>                                                                Honza
>

On second thought:
In the case that the timer_fn of the default bdi causes the
bdi_forker_thread to wake up, why waste CPU time on one more loop when we
could know convincingly that we would want to sleep?

Of course, if any of the other BDIs are scheduling work to the default bdi,
the default thread will wake up reliably as you mentioned.
But in the case that the race between the default BDI's own timer_fn
(me->wakeup_timer) on one CPU and the code in bdi_forker_thread on another
CPU happens, we will end up in one more loop, which will result in more CPU
usage when we could actually just go to sleep in the current iteration of
the loop if no work is found on its own bdi list (i.e., me->bdi->work_list).


>> On Fri, Sep 2, 2011 at 5:14 PM, kautuk.c @samsung.com
>> <consul.kautuk@gmail.com> wrote:
>> > Hi,
>> >
>> > On Fri, Sep 2, 2011 at 4:51 PM, Jan Kara <jack@suse.cz> wrote:
>> >>  Hello,
>> >>
>> >> On Fri 02-09-11 10:47:03, kautuk.c @samsung.com wrote:
>> >>> On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
>> >>> > On Thu,  1 Sep 2011 21:27:02 +0530
>> >>> > Kautuk Consul <consul.kautuk@gmail.com> wrote:
>> >>> >
>> >>> >> This is important for SMP scenario, to check whether the timer
>> >>> >> callback is executing on another CPU when we are deleting the
>> >>> >> timer.
>> >>> >>
>> >>> >
>> >>> > I don't see why?
>> >>> >
>> >>> >> index d6edf8d..754b35a 100644
>> >>> >> --- a/mm/backing-dev.c
>> >>> >> +++ b/mm/backing-dev.c
>> >>> >> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
>> >>> >>                * dirty data on the default backing_dev_info
>> >>> >>                */
>> >>> >>               if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
>> >>> >> -                     del_timer(&me->wakeup_timer);
>> >>> >> +                     del_timer_sync(&me->wakeup_timer);
>> >>> >>                       wb_do_writeback(me, 0);
>> >>> >>               }
>> >>> >
>> >>> > It isn't a use-after-free fix: bdi_unregister() safely shoots down any
>> >>> > running timer.
>> >>> >
>> >>>
>> >>> In the situation that we do a del_timer at the same time that the
>> >>> wakeup_timer_fn is
>> >>> executing on another CPU, there is one tiny possible problem:
>> >>> 1)  The wakeup_timer_fn will call wake_up_process on the bdi-default thread.
>> >>>       This will set the bdi-default thread's state to TASK_RUNNING.
>> >>> 2)  However, the code in bdi_writeback_thread() sets the state of the
>> >>> bdi-default process
>> >>>     to TASK_INTERRUPTIBLE as it intends to sleep later.
>> >>>
>> >>> If 2) happens before 1), then the bdi_forker_thread will not sleep
>> >>> inside schedule as is the intention of the bdi_forker_thread() code.
>> >>  OK, I agree the code in bdi_forker_thread() might use some straightening
>> >> up wrt. task state handling but is what you decribe really an issue? Sure
>> >> the task won't go to sleep but the whole effect is that it will just loop
>> >> once more to find out there's nothing to do and then go to sleep - not a
>> >> bug deal... Or am I missing something?
>> >
>> > Yes, you are right.
>> > I was studying the code and I found this inconsistency.
>> > Anyways, if there is NO_ACTION it will just loop and go to sleep again.
>> > I just posted this because I felt that the code was not achieving the logic
>> > that was intended in terms of sleeps and wakeups.
>> >
>> > I am currently trying to study the other patches you have just sent.
>> >
>> >>
>> >>> This protection is not achieved even by acquiring spinlocks before
>> >>> setting the task->state
>> >>> as the spinlock used in wakeup_timer_fn is &bdi->wb_lock whereas the code in
>> >>> bdi_forker_thread acquires &bdi_lock which is a different spin_lock.
>> >>>
>> >>> Am I correct in concluding this ?
>> >>
>> >>                                                                Honza
>> >> --
>> >> Jan Kara <jack@suse.cz>
>> >> SUSE Labs, CR
>> >>
>> >
> --
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
>


* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-05  5:49               ` kautuk.c @samsung.com
@ 2011-09-05 10:39                 ` Jan Kara
  -1 siblings, 0 replies; 26+ messages in thread
From: Jan Kara @ 2011-09-05 10:39 UTC (permalink / raw)
  To: kautuk.c @samsung.com
  Cc: Jan Kara, Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner,
	linux-mm, linux-kernel

On Mon 05-09-11 11:19:46, kautuk.c @samsung.com wrote:
> On Fri, Sep 2, 2011 at 8:44 PM, Jan Kara <jack@suse.cz> wrote:
> > On Fri 02-09-11 17:32:35, kautuk.c @samsung.com wrote:
> >> Hi Jan,
> >>
> >> I looked at that other patch you just sent.
> >>
> >> I think that the task state problem can still happen in that case as the setting
> >> of the task state is not protected by any lock and the timer callback can be
> >> executing on another CPU at that time.
> >>
> >> Am I right about this ?
> >  Yes, the cleanup is not meant to change the scenario you describe - as I
> > said, there's no point in protecting against it as it's harmless...
> >
> On second thought:
> In the case that the timer_fn of the default bdi causes the
> bdi_forker_thread to wake up, why waste CPU time on one more loop when we
> could know convincingly that we would want to sleep ?
  OK, I don't care much whether we have del_timer() or del_timer_sync()
there. Let me just say that the race you are afraid of is probably not
going to happen in practice, so I'm not sure it's valid to be afraid of CPU
cycles being burned needlessly. The timer is armed when a dirty inode is
first attached to the default bdi's dirty list. Then the default bdi
flusher thread would have to be woken up so that the following happens:
	CPU1				CPU2
  timer fires -> wakeup_timer_fn()
					bdi_forker_thread()
					  del_timer(&me->wakeup_timer);
					  wb_do_writeback(me, 0);
					  ...
					  set_current_state(TASK_INTERRUPTIBLE);
  wake_up_process(default_backing_dev_info.wb.task);

  Especially wb_do_writeback() is going to take a long time, so just that
single thing makes the race unlikely. Given del_timer_sync() is slightly
more costly than del_timer() even for an unarmed timer, it is questionable
whether (chance race happens * CPU spent in extra loop) > (extra CPU spent
in del_timer_sync() * frequency that code is executed in
bdi_forker_thread())...

								Honza

> Of course, if any of the other BDIs are scheduling work to the default
> the default thread
> will wake up reliably as you mentioned.
> But, in the case that the race between the default BDI's own timer_fn
> (me->wakeup_timer)
> on one CPU with the code in bdi_forker_thread on another CPU happens,
> we will end up in
> one more loop which will result in more CPU usage when we could
> actually just go to sleep in
> the current iteration of the loop if no work is found on its own bdi
> list (i.e., me->bdi->work_list).
> 
> 
> >> On Fri, Sep 2, 2011 at 5:14 PM, kautuk.c @samsung.com
> >> <consul.kautuk@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > On Fri, Sep 2, 2011 at 4:51 PM, Jan Kara <jack@suse.cz> wrote:
> >> >>  Hello,
> >> >>
> >> >> On Fri 02-09-11 10:47:03, kautuk.c @samsung.com wrote:
> >> >>> On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> >> >>> > On Thu,  1 Sep 2011 21:27:02 +0530
> >> >>> > Kautuk Consul <consul.kautuk@gmail.com> wrote:
> >> >>> >
> >> >>> >> This is important for SMP scenario, to check whether the timer
> >> >>> >> callback is executing on another CPU when we are deleting the
> >> >>> >> timer.
> >> >>> >>
> >> >>> >
> >> >>> > I don't see why?
> >> >>> >
> >> >>> >> index d6edf8d..754b35a 100644
> >> >>> >> --- a/mm/backing-dev.c
> >> >>> >> +++ b/mm/backing-dev.c
> >> >>> >> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
> >> >>> >>                * dirty data on the default backing_dev_info
> >> >>> >>                */
> >> >>> >>               if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
> >> >>> >> -                     del_timer(&me->wakeup_timer);
> >> >>> >> +                     del_timer_sync(&me->wakeup_timer);
> >> >>> >>                       wb_do_writeback(me, 0);
> >> >>> >>               }
> >> >>> >
> >> >>> > It isn't a use-after-free fix: bdi_unregister() safely shoots down any
> >> >>> > running timer.
> >> >>> >
> >> >>>
> >> >>> In the situation that we do a del_timer at the same time that the
> >> >>> wakeup_timer_fn is
> >> >>> executing on another CPU, there is one tiny possible problem:
> >> >>> 1)  The wakeup_timer_fn will call wake_up_process on the bdi-default thread.
> >> >>>       This will set the bdi-default thread's state to TASK_RUNNING.
> >> >>> 2)  However, the code in bdi_writeback_thread() sets the state of the
> >> >>> bdi-default process
> >> >>>     to TASK_INTERRUPTIBLE as it intends to sleep later.
> >> >>>
> >> >>> If 2) happens before 1), then the bdi_forker_thread will not sleep
> >> >>> inside schedule as is the intention of the bdi_forker_thread() code.
> >> >>  OK, I agree the code in bdi_forker_thread() might use some straightening
> >> >> up wrt. task state handling but is what you decribe really an issue? Sure
> >> >> the task won't go to sleep but the whole effect is that it will just loop
> >> >> once more to find out there's nothing to do and then go to sleep - not a
> >> >> bug deal... Or am I missing something?
> >> >
> >> > Yes, you are right.
> >> > I was studying the code and I found this inconsistency.
> >> > Anyways, if there is NO_ACTION it will just loop and go to sleep again.
> >> > I just posted this because I felt that the code was not achieving the logic
> >> > that was intended in terms of sleeps and wakeups.
> >> >
> >> > I am currently trying to study the other patches you have just sent.
> >> >
> >> >>
> >> >>> This protection is not achieved even by acquiring spinlocks before
> >> >>> setting the task->state
> >> >>> as the spinlock used in wakeup_timer_fn is &bdi->wb_lock whereas the code in
> >> >>> bdi_forker_thread acquires &bdi_lock which is a different spin_lock.
> >> >>>
> >> >>> Am I correct in concluding this ?
> >> >>
> >> >>                                                                Honza
> >> >> --
> >> >> Jan Kara <jack@suse.cz>
> >> >> SUSE Labs, CR
> >> >>
> >> >
> > --
> > Jan Kara <jack@suse.cz>
> > SUSE Labs, CR
> >
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-05 10:39                 ` Jan Kara
@ 2011-09-05 14:36                   ` kautuk.c @samsung.com
  -1 siblings, 0 replies; 26+ messages in thread
From: kautuk.c @samsung.com @ 2011-09-05 14:36 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm,
	linux-kernel

Hi,

>  OK, I don't care much whether we have there del_timer() or
> del_timer_sync(). Let me just say that the race you are afraid of is
> probably not going to happen in practice so I'm not sure it's valid to be
> afraid of CPU cycles being burned needlessly. The timer is armed when a
> dirty inode is first attached to the default bdi's dirty list. Then the default
> bdi flusher thread would have to be woken up so that the following happens:
>        CPU1                            CPU2
>  timer fires -> wakeup_timer_fn()
>                                        bdi_forker_thread()
>                                          del_timer(&me->wakeup_timer);
>                                          wb_do_writeback(me, 0);
>                                          ...
>                                          set_current_state(TASK_INTERRUPTIBLE);
>  wake_up_process(default_backing_dev_info.wb.task);
>
>  Especially wb_do_writeback() is going to take a long time so just that
> single thing makes the race unlikely. Given del_timer_sync() is slightly
> more costly than del_timer() even for unarmed timer, it is questionable
> whether (chance race happens * CPU spent in extra loop) > (extra CPU spent
> in del_timer_sync() * frequency that code is executed in
> bdi_forker_thread())...
>

Ok, so this means that we can compare the following 2 paths of code:
i)   One extra iteration of the bdi_forker_thread loop, versus
ii)  The amount of time it takes del_timer_sync to wait until the timer_fn
     on the other CPU finishes executing, followed by a schedule that results
     in a guaranteed sleep.

Considering both situations as a race until the task is ejected from the
runqueue (i.e., actually sleeps), I think ii) is the better option, don't
you think?
Scenario i)  results in one full execution of schedule() without the task
actually sleeping. Also, if another task gets scheduled, it could take a
lot of CPU cycles before we return to this (bdi-default) task.
Scenario ii) only results in a couple of extra iterations of the
del_timer_sync loop, which responds quickly once the timer_fn on the other
CPU completes, and then leads to the current task being removed from the
runqueue by schedule() with a guaranteed sleep.
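
For concreteness, here is a rough, kernel-style sketch of the interleaving I
have in mind. It is purely illustrative -- forker_task, wakeup_timer,
forker_has_work() and forker_do_work() are made-up stand-ins, not the actual
mm/backing-dev.c code:

/*
 * Illustrative sketch only; the symbols below are hypothetical stand-ins
 * for the real default-bdi forker thread and its wakeup timer.
 */
static struct task_struct *forker_task;
static struct timer_list wakeup_timer;
static int forker_has_work(void);
static void forker_do_work(void);

/* CPU1: what the timer callback boils down to */
static void timer_callback(unsigned long data)
{
	wake_up_process(forker_task);	/* marks the forker task runnable again */
}

/* CPU2: main loop of the forker thread */
static int forker_loop(void *unused)
{
	for (;;) {
		if (forker_has_work()) {
			del_timer(&wakeup_timer);	/* the line the patch changes */
			forker_do_work();
		}

		set_current_state(TASK_INTERRUPTIBLE);

		/*
		 * If wake_up_process() on CPU1 runs after the
		 * set_current_state() above, the task is back in
		 * TASK_RUNNING by the time we reach schedule(), so
		 * schedule() returns without sleeping and the loop makes
		 * one extra pass.  The wakeup is never lost; the question
		 * is only whether that extra pass is worth avoiding.
		 */
		schedule();
	}
	return 0;
}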

Is my reasoning correct/adequate ?

I know that bdi_forker_thread doesn't do much on its own anyway; I'm just
trying to understand your expert opinion(s) on this aspect of the kernel
code. :)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-05 14:36                   ` kautuk.c @samsung.com
@ 2011-09-05 16:05                     ` Jan Kara
  -1 siblings, 0 replies; 26+ messages in thread
From: Jan Kara @ 2011-09-05 16:05 UTC (permalink / raw)
  To: kautuk.c @samsung.com
  Cc: Jan Kara, Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner,
	linux-mm, linux-kernel

  Hi,

On Mon 05-09-11 20:06:04, kautuk.c @samsung.com wrote:
> >  OK, I don't care much whether we have there del_timer() or
> > del_timer_sync(). Let me just say that the race you are afraid of is
> > probably not going to happen in practice so I'm not sure it's valid to be
> > afraid of CPU cycles being burned needlessly. The timer is armed when a
> > dirty inode is first attached to the default bdi's dirty list. Then the default
> > bdi flusher thread would have to be woken up so that the following happens:
> >        CPU1                            CPU2
> >  timer fires -> wakeup_timer_fn()
> >                                        bdi_forker_thread()
> >                                          del_timer(&me->wakeup_timer);
> >                                          wb_do_writeback(me, 0);
> >                                          ...
> >                                          set_current_state(TASK_INTERRUPTIBLE);
> >  wake_up_process(default_backing_dev_info.wb.task);
> >
> >  Especially wb_do_writeback() is going to take a long time so just that
> > single thing makes the race unlikely. Given del_timer_sync() is slightly
> > more costly than del_timer() even for unarmed timer, it is questionable
> > whether (chance race happens * CPU spent in extra loop) > (extra CPU spent
> > in del_timer_sync() * frequency that code is executed in
> > bdi_forker_thread())...
> >
> 
> Ok, so this means that we can compare the following 2 paths of code:
> i)   One extra iteration of the bdi_forker_thread loop, versus
> ii)  The amount of time it takes for the del_timer_sync to wait till the
> timer_fn on the other CPU finishes executing + schedule resulting in a
> guaranteed sleep.
  No, ii) is going to be just as rare as the race itself. But instead you
should compare i) against:
iii) The amount of time it takes del_timer_sync() to check whether the
timer_fn is running on a different CPU (which is work del_timer() doesn't
do).

  We are going to spend time in iii) each and every time
if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list))
  evaluates to true.

  Now frequency of i) and iii) happening is hard to evaluate so it's not
clear what's going to be better. Certainly I don't think such evaluation is
worth my time...

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-05 16:05                     ` Jan Kara
@ 2011-09-06  4:11                       ` kautuk.c @samsung.com
  -1 siblings, 0 replies; 26+ messages in thread
From: kautuk.c @samsung.com @ 2011-09-06  4:11 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm,
	linux-kernel

Hi,

On Mon, Sep 5, 2011 at 9:35 PM, Jan Kara <jack@suse.cz> wrote:
>  Hi,
>
> On Mon 05-09-11 20:06:04, kautuk.c @samsung.com wrote:
>> >  OK, I don't care much whether we have there del_timer() or
>> > del_timer_sync(). Let me just say that the race you are afraid of is
>> > probably not going to happen in practice so I'm not sure it's valid to be
> >> > afraid of CPU cycles being burned needlessly. The timer is armed when a
> >> > dirty inode is first attached to the default bdi's dirty list. Then the default
> >> > bdi flusher thread would have to be woken up so that the following happens:
>> >        CPU1                            CPU2
>> >  timer fires -> wakeup_timer_fn()
>> >                                        bdi_forker_thread()
>> >                                          del_timer(&me->wakeup_timer);
>> >                                          wb_do_writeback(me, 0);
>> >                                          ...
>> >                                          set_current_state(TASK_INTERRUPTIBLE);
>> >  wake_up_process(default_backing_dev_info.wb.task);
>> >
>> >  Especially wb_do_writeback() is going to take a long time so just that
>> > single thing makes the race unlikely. Given del_timer_sync() is slightly
>> > more costly than del_timer() even for unarmed timer, it is questionable
>> > whether (chance race happens * CPU spent in extra loop) > (extra CPU spent
>> > in del_timer_sync() * frequency that code is executed in
>> > bdi_forker_thread())...
>> >
>>
>> Ok, so this means that we can compare the following 2 paths of code:
>> i)   One extra iteration of the bdi_forker_thread loop, versus
>> ii)  The amount of time it takes for the del_timer_sync to wait till the
>> timer_fn on the other CPU finishes executing + schedule resulting in a
>> guaranteed sleep.
>  No, ii) is going to be as rare. But instead you should compare i) against:
> iii) The amount of time it takes del_timer_sync() to check whether the
> timer_fn is running on a different CPU (which is work del_timer() doesn't
> do).

The amount of time it takes del_timer_sync to check whether the timer_fn is
running should be negligible.
In fact, try_to_del_timer_sync (which del_timer_sync loops over) only adds
one check on top of what del_timer does:
if (base->running_timer == timer)
    goto out;
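
For reference, this is roughly how the two relate in kernel/timer.c of this
era (a simplified sketch from memory -- lockdep, timer-stats and
debug-objects details are omitted, so treat it as an approximation rather
than the exact source):

int try_to_del_timer_sync(struct timer_list *timer)
{
	struct tvec_base *base;
	unsigned long flags;
	int ret = -1;

	base = lock_timer_base(timer, &flags);

	/* Is the callback running on another CPU right now? */
	if (base->running_timer == timer)
		goto out;

	ret = 0;
	if (timer_pending(timer)) {
		detach_timer(timer, 1);
		ret = 1;
	}
out:
	spin_unlock_irqrestore(&base->lock, flags);
	return ret;
}

int del_timer_sync(struct timer_list *timer)
{
	for (;;) {
		int ret = try_to_del_timer_sync(timer);

		if (ret >= 0)
			return ret;	/* deleted, and the callback is not running */
		cpu_relax();		/* callback still running: spin and retry */
	}
}

So when the callback is not running, the extra check itself is just that one
comparison, although this path does always take the base lock, which
del_timer avoids for a timer that is not pending.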

>
>  We are going to spend time in iii) each and every time
> if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list))
>  evaluates to true.

The amount of time spent on this check will not matter much each time, as
the task is still preemptible at that point. However, note that for most of
the bdi_forker_thread loop we disable preemption because we are holding a
spinlock, so an additional pass through that loop might be more costly.

>
>  Now frequency of i) and iii) happening is hard to evaluate so it's not
> clear what's going to be better. Certainly I don't think such evaluation is
> worth my time...
>

Ok. Anyways, thanks for explaining all this to me.
I really appreciate your time. :)

>                                                                Honza
> --
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-06  4:11                       ` kautuk.c @samsung.com
@ 2011-09-06  9:14                         ` Jan Kara
  -1 siblings, 0 replies; 26+ messages in thread
From: Jan Kara @ 2011-09-06  9:14 UTC (permalink / raw)
  To: kautuk.c @samsung.com
  Cc: Jan Kara, Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner,
	linux-mm, linux-kernel

On Tue 06-09-11 09:41:42, kautuk.c @samsung.com wrote:
> > On Mon 05-09-11 20:06:04, kautuk.c @samsung.com wrote:
> >> >  OK, I don't care much whether we have there del_timer() or
> >> > del_timer_sync(). Let me just say that the race you are afraid of is
> >> > probably not going to happen in practice so I'm not sure it's valid to be
> >> > afraid of CPU cycles being burned needlessly. The timer is armed when a
> >> > dirty inode is first attached to the default bdi's dirty list. Then the default
> >> > bdi flusher thread would have to be woken up so that the following happens:
> >> >        CPU1                            CPU2
> >> >  timer fires -> wakeup_timer_fn()
> >> >                                        bdi_forker_thread()
> >> >                                          del_timer(&me->wakeup_timer);
> >> >                                          wb_do_writeback(me, 0);
> >> >                                          ...
> >> >                                          set_current_state(TASK_INTERRUPTIBLE);
> >> >  wake_up_process(default_backing_dev_info.wb.task);
> >> >
> >> >  Especially wb_do_writeback() is going to take a long time so just that
> >> > single thing makes the race unlikely. Given del_timer_sync() is slightly
> >> > more costly than del_timer() even for unarmed timer, it is questionable
> >> > whether (chance race happens * CPU spent in extra loop) > (extra CPU spent
> >> > in del_timer_sync() * frequency that code is executed in
> >> > bdi_forker_thread())...
> >> >
> >>
> >> Ok, so this means that we can compare the following 2 paths of code:
> >> i)   One extra iteration of the bdi_forker_thread loop, versus
> >> ii)  The amount of time it takes for the del_timer_sync to wait till the
> >> timer_fn on the other CPU finishes executing + schedule resulting in a
> >> guaranteed sleep.
> >  No, ii) is going to be as rare. But instead you should compare i) against:
> > iii) The amount of time it takes del_timer_sync() to check whether the
> > timer_fn is running on a different CPU (which is work del_timer() doesn't
> > do).
> 
> The amount of time it takes del_timer_sync to check the timer_fn should be
> negligible.
> In fact, try_to_del_timer_sync differs from del_timer_sync in only
> that it performs
> an additional check:
> if (base->running_timer == timer)
>     goto out;
  Yes, but the probability the race happens is also negligible. So you are
comparing two negligible things... 

> >  We are going to spend time in iii) each and every time
> > if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list))
> >  evaluates to true.
> 
> The amount of time spent on this every time will not matter much, as the
> task will still be preemptible. However, if you notice that in most of
> the bdi_forker_thread loop, we disable preemption due to taking a
> spinlock so an additional loop there might be more costly.
  So either you are speaking about CPU cost in terms of cycles spent - and
there I still don't buy that del_timer_sync() is clearly better than
del_timer() - or you are speaking about latency, which is a different thing.
From a latency POV that additional loop might be worse. But I still don't
think it's clear enough to change it without any measurement...

> >  Now frequency of i) and iii) happening is hard to evaluate so it's not
> > clear what's going to be better. Certainly I don't think such evaluation is
> > worth my time...
> >
> 
> Ok. Anyways, thanks for explaining all this to me.
> I really appreciate your time. :)
  You are welcome. You made me refresh my memory about some parts of the
kernel, which is also valuable, so thanks go to you as well :)

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2011-09-06  9:14 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-01 15:57 [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer Kautuk Consul
2011-09-01 21:33 ` Andrew Morton
2011-09-02  5:17   ` kautuk.c @samsung.com
2011-09-02 11:21     ` Jan Kara
2011-09-02 11:44       ` kautuk.c @samsung.com
2011-09-02 12:02         ` kautuk.c @samsung.com
2011-09-02 15:14           ` Jan Kara
2011-09-05  5:49             ` kautuk.c @samsung.com
2011-09-05 10:39               ` Jan Kara
2011-09-05 14:36                 ` kautuk.c @samsung.com
2011-09-05 16:05                   ` Jan Kara
2011-09-06  4:11                     ` kautuk.c @samsung.com
2011-09-06  9:14                       ` Jan Kara

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.