* [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
@ 2011-09-01 15:57 ` Kautuk Consul
From: Kautuk Consul @ 2011-09-01 15:57 UTC
To: Andrew Morton, Jens Axboe, Wu Fengguang, Jan Kara, Dave Chinner
Cc: linux-mm, linux-kernel, Kautuk Consul

This is important for SMP scenario, to check whether the timer
callback is executing on another CPU when we are deleting the
timer.

Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
---
 mm/backing-dev.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index d6edf8d..754b35a 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
 		 * dirty data on the default backing_dev_info
 		 */
 		if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
-			del_timer(&me->wakeup_timer);
+			del_timer_sync(&me->wakeup_timer);
 			wb_do_writeback(me, 0);
 		}

--
1.7.4.1

* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-01 15:57 ` Kautuk Consul
@ 2011-09-01 21:33 ` Andrew Morton
From: Andrew Morton @ 2011-09-01 21:33 UTC
To: Kautuk Consul
Cc: Jens Axboe, Wu Fengguang, Jan Kara, Dave Chinner, linux-mm, linux-kernel

On Thu, 1 Sep 2011 21:27:02 +0530
Kautuk Consul <consul.kautuk@gmail.com> wrote:

> This is important for SMP scenario, to check whether the timer
> callback is executing on another CPU when we are deleting the
> timer.
>

I don't see why?

> index d6edf8d..754b35a 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
>  		 * dirty data on the default backing_dev_info
>  		 */
>  		if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
> -			del_timer(&me->wakeup_timer);
> +			del_timer_sync(&me->wakeup_timer);
>  			wb_do_writeback(me, 0);
>  		}

It isn't a use-after-free fix: bdi_unregister() safely shoots down any
running timer.

Please completely explain what you believe the problem is here.

* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-01 21:33 ` Andrew Morton
@ 2011-09-02  5:17 ` kautuk.c @samsung.com
From: kautuk.c @samsung.com @ 2011-09-02 5:17 UTC
To: Andrew Morton
Cc: Jens Axboe, Wu Fengguang, Jan Kara, Dave Chinner, linux-mm, linux-kernel

Hi,

On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Thu, 1 Sep 2011 21:27:02 +0530
> Kautuk Consul <consul.kautuk@gmail.com> wrote:
>
>> This is important for SMP scenario, to check whether the timer
>> callback is executing on another CPU when we are deleting the
>> timer.
>>
>
> I don't see why?
>
>> index d6edf8d..754b35a 100644
>> --- a/mm/backing-dev.c
>> +++ b/mm/backing-dev.c
>> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
>>  		 * dirty data on the default backing_dev_info
>>  		 */
>>  		if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
>> -			del_timer(&me->wakeup_timer);
>> +			del_timer_sync(&me->wakeup_timer);
>>  			wb_do_writeback(me, 0);
>>  		}
>
> It isn't a use-after-free fix: bdi_unregister() safely shoots down any
> running timer.
>

In the situation that we do a del_timer at the same time that the
wakeup_timer_fn is executing on another CPU, there is one tiny
possible problem:
1) The wakeup_timer_fn will call wake_up_process on the bdi-default thread.
   This will set the bdi-default thread's state to TASK_RUNNING.
2) However, the code in bdi_writeback_thread() sets the state of the
   bdi-default process to TASK_INTERRUPTIBLE as it intends to sleep later.

If 2) happens before 1), then the bdi_forker_thread will not sleep
inside schedule as is the intention of the bdi_forker_thread() code.

This protection is not achieved even by acquiring spinlocks before
setting the task->state as the spinlock used in wakeup_timer_fn is
&bdi->wb_lock whereas the code in bdi_forker_thread acquires &bdi_lock
which is a different spin_lock.

Am I correct in concluding this ?

> Please completely explain what you believe the problem is here.
>

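The interleaving in steps 1) and 2) above can be sketched as a small
single-threaded user-space model. This is only an illustration under stated
assumptions, not kernel code: struct model_task, model_wake_up(),
model_schedule_sleeps() and forker_pass_sleeps() are hypothetical names, and
the real wake_up_process() and schedule() do far more than flip one field.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two task states named in the mail. */
enum model_state { MODEL_TASK_RUNNING, MODEL_TASK_INTERRUPTIBLE };

struct model_task {
	enum model_state state;
};

/* Models the effect of wake_up_process(): the task becomes runnable. */
static void model_wake_up(struct model_task *t)
{
	t->state = MODEL_TASK_RUNNING;
}

/*
 * Models schedule(): the task only really sleeps if it is not runnable
 * at the moment it calls schedule(). Returns true if it slept.
 */
static bool model_schedule_sleeps(struct model_task *t)
{
	bool sleeps = (t->state != MODEL_TASK_RUNNING);

	/* Either way, the task is running again once schedule() returns. */
	t->state = MODEL_TASK_RUNNING;
	return sleeps;
}

/*
 * One pass of the modeled forker loop. If timer_fires_late is true, the
 * timer callback runs on "another CPU" after the state was set, which is
 * the "2) happens before 1)" ordering.
 */
static bool forker_pass_sleeps(struct model_task *t, bool timer_fires_late)
{
	assert(t);
	t->state = MODEL_TASK_INTERRUPTIBLE;	/* step 2) */
	if (timer_fires_late)
		model_wake_up(t);		/* step 1) */
	return model_schedule_sleeps(t);
}
```

In this model a pass with timer_fires_late set returns false (no sleep), and
the following pass without a late timer returns true, so the thread misses at
most one sleep per stray callback.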
* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-02  5:17 ` kautuk.c @samsung.com
@ 2011-09-02 11:21 ` Jan Kara
From: Jan Kara @ 2011-09-02 11:21 UTC
To: kautuk.c @samsung.com
Cc: Andrew Morton, Jens Axboe, Wu Fengguang, Jan Kara, Dave Chinner, linux-mm, linux-kernel

  Hello,

On Fri 02-09-11 10:47:03, kautuk.c @samsung.com wrote:
> On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Thu, 1 Sep 2011 21:27:02 +0530
> > Kautuk Consul <consul.kautuk@gmail.com> wrote:
> >
> >> This is important for SMP scenario, to check whether the timer
> >> callback is executing on another CPU when we are deleting the
> >> timer.
> >>
> >
> > I don't see why?
> >
> >> index d6edf8d..754b35a 100644
> >> --- a/mm/backing-dev.c
> >> +++ b/mm/backing-dev.c
> >> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
> >>  		 * dirty data on the default backing_dev_info
> >>  		 */
> >>  		if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
> >> -			del_timer(&me->wakeup_timer);
> >> +			del_timer_sync(&me->wakeup_timer);
> >>  			wb_do_writeback(me, 0);
> >>  		}
> >
> > It isn't a use-after-free fix: bdi_unregister() safely shoots down any
> > running timer.
> >
>
> In the situation that we do a del_timer at the same time that the
> wakeup_timer_fn is
> executing on another CPU, there is one tiny possible problem:
> 1) The wakeup_timer_fn will call wake_up_process on the bdi-default thread.
> This will set the bdi-default thread's state to TASK_RUNNING.
> 2) However, the code in bdi_writeback_thread() sets the state of the
> bdi-default process
> to TASK_INTERRUPTIBLE as it intends to sleep later.
>
> If 2) happens before 1), then the bdi_forker_thread will not sleep
> inside schedule as is the intention of the bdi_forker_thread() code.
  OK, I agree the code in bdi_forker_thread() might use some straightening
up wrt. task state handling but is what you describe really an issue? Sure
the task won't go to sleep but the whole effect is that it will just loop
once more to find out there's nothing to do and then go to sleep - not a
big deal... Or am I missing something?

> This protection is not achieved even by acquiring spinlocks before
> setting the task->state
> as the spinlock used in wakeup_timer_fn is &bdi->wb_lock whereas the code in
> bdi_forker_thread acquires &bdi_lock which is a different spin_lock.
>
> Am I correct in concluding this ?

								Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR

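The cost of a stray wakeup, as argued in the reply above, can be put in
numbers with a simplified user-space model of the loop. This is a sketch
under assumptions, not the real bdi_forker_thread(): passes_until_sleep()
and its parameters are hypothetical, all queued work is assumed to be
drained in one pass, and each stray timer callback is assumed to defeat
exactly one attempt to sleep.

```c
#include <assert.h>

/* Hypothetical stand-ins for the task states involved. */
enum model_state { MODEL_RUNNING, MODEL_INTERRUPTIBLE };

/*
 * Count how many loop passes the modeled thread makes before it really
 * goes to sleep.
 *
 *   work_items    - units of queued work present when the loop starts
 *   stray_wakeups - number of passes in which a timer callback races in
 *                   after the state was set to "interruptible"
 */
static int passes_until_sleep(int work_items, int stray_wakeups)
{
	int passes = 0;

	assert(work_items >= 0 && stray_wakeups >= 0);

	for (;;) {
		enum model_state state;

		passes++;

		/* Stand-in for the writeback call: drain whatever is queued. */
		if (work_items > 0)
			work_items = 0;

		/* The thread announces that it intends to sleep. */
		state = MODEL_INTERRUPTIBLE;

		/* A late timer callback makes the thread runnable again. */
		if (stray_wakeups > 0) {
			stray_wakeups--;
			state = MODEL_RUNNING;
		}

		/* Modeled schedule(): sleep only if still not runnable. */
		if (state == MODEL_INTERRUPTIBLE)
			return passes;
	}
}
```

With no stray wakeup the model sleeps after one pass; each stray wakeup adds
exactly one extra pass that finds nothing to do, which is the "loop once
more" behaviour described above.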
* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-02 11:21 ` Jan Kara
@ 2011-09-02 11:44 ` kautuk.c @samsung.com
From: kautuk.c @samsung.com @ 2011-09-02 11:44 UTC
To: Jan Kara
Cc: Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm, linux-kernel

Hi,

On Fri, Sep 2, 2011 at 4:51 PM, Jan Kara <jack@suse.cz> wrote:
>  Hello,
>
> On Fri 02-09-11 10:47:03, kautuk.c @samsung.com wrote:
>> On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
>> > On Thu, 1 Sep 2011 21:27:02 +0530
>> > Kautuk Consul <consul.kautuk@gmail.com> wrote:
>> >
>> >> This is important for SMP scenario, to check whether the timer
>> >> callback is executing on another CPU when we are deleting the
>> >> timer.
>> >>
>> >
>> > I don't see why?
>> >
>> >> index d6edf8d..754b35a 100644
>> >> --- a/mm/backing-dev.c
>> >> +++ b/mm/backing-dev.c
>> >> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
>> >>  		 * dirty data on the default backing_dev_info
>> >>  		 */
>> >>  		if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
>> >> -			del_timer(&me->wakeup_timer);
>> >> +			del_timer_sync(&me->wakeup_timer);
>> >>  			wb_do_writeback(me, 0);
>> >>  		}
>> >
>> > It isn't a use-after-free fix: bdi_unregister() safely shoots down any
>> > running timer.
>> >
>>
>> In the situation that we do a del_timer at the same time that the
>> wakeup_timer_fn is
>> executing on another CPU, there is one tiny possible problem:
>> 1) The wakeup_timer_fn will call wake_up_process on the bdi-default thread.
>> This will set the bdi-default thread's state to TASK_RUNNING.
>> 2) However, the code in bdi_writeback_thread() sets the state of the
>> bdi-default process
>> to TASK_INTERRUPTIBLE as it intends to sleep later.
>>
>> If 2) happens before 1), then the bdi_forker_thread will not sleep
>> inside schedule as is the intention of the bdi_forker_thread() code.
>  OK, I agree the code in bdi_forker_thread() might use some straightening
> up wrt. task state handling but is what you describe really an issue? Sure
> the task won't go to sleep but the whole effect is that it will just loop
> once more to find out there's nothing to do and then go to sleep - not a
> big deal... Or am I missing something?

Yes, you are right.
I was studying the code and I found this inconsistency.
Anyways, if there is NO_ACTION it will just loop and go to sleep again.
I just posted this because I felt that the code was not achieving the logic
that was intended in terms of sleeps and wakeups.

I am currently trying to study the other patches you have just sent.

>
>> This protection is not achieved even by acquiring spinlocks before
>> setting the task->state
>> as the spinlock used in wakeup_timer_fn is &bdi->wb_lock whereas the code in
>> bdi_forker_thread acquires &bdi_lock which is a different spin_lock.
>>
>> Am I correct in concluding this ?
>
>                                                                Honza
> --
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
>

* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-02 11:44 ` kautuk.c @samsung.com
@ 2011-09-02 12:02 ` kautuk.c @samsung.com
From: kautuk.c @samsung.com @ 2011-09-02 12:02 UTC
To: Jan Kara
Cc: Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm, linux-kernel

Hi Jan,

I looked at that other patch you just sent.

I think that the task state problem can still happen in that case as the setting
of the task state is not protected by any lock and the timer callback can be
executing on another CPU at that time.

Am I right about this ?

On Fri, Sep 2, 2011 at 5:14 PM, kautuk.c @samsung.com
<consul.kautuk@gmail.com> wrote:
> Hi,
>
> On Fri, Sep 2, 2011 at 4:51 PM, Jan Kara <jack@suse.cz> wrote:
>>  Hello,
>>
>> On Fri 02-09-11 10:47:03, kautuk.c @samsung.com wrote:
>>> On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
>>> > On Thu, 1 Sep 2011 21:27:02 +0530
>>> > Kautuk Consul <consul.kautuk@gmail.com> wrote:
>>> >
>>> >> This is important for SMP scenario, to check whether the timer
>>> >> callback is executing on another CPU when we are deleting the
>>> >> timer.
>>> >>
>>> >
>>> > I don't see why?
>>> >
>>> >> index d6edf8d..754b35a 100644
>>> >> --- a/mm/backing-dev.c
>>> >> +++ b/mm/backing-dev.c
>>> >> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
>>> >>  		 * dirty data on the default backing_dev_info
>>> >>  		 */
>>> >>  		if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
>>> >> -			del_timer(&me->wakeup_timer);
>>> >> +			del_timer_sync(&me->wakeup_timer);
>>> >>  			wb_do_writeback(me, 0);
>>> >>  		}
>>> >
>>> > It isn't a use-after-free fix: bdi_unregister() safely shoots down any
>>> > running timer.
>>> >
>>>
>>> In the situation that we do a del_timer at the same time that the
>>> wakeup_timer_fn is
>>> executing on another CPU, there is one tiny possible problem:
>>> 1) The wakeup_timer_fn will call wake_up_process on the bdi-default thread.
>>> This will set the bdi-default thread's state to TASK_RUNNING.
>>> 2) However, the code in bdi_writeback_thread() sets the state of the
>>> bdi-default process
>>> to TASK_INTERRUPTIBLE as it intends to sleep later.
>>>
>>> If 2) happens before 1), then the bdi_forker_thread will not sleep
>>> inside schedule as is the intention of the bdi_forker_thread() code.
>>  OK, I agree the code in bdi_forker_thread() might use some straightening
>> up wrt. task state handling but is what you describe really an issue? Sure
>> the task won't go to sleep but the whole effect is that it will just loop
>> once more to find out there's nothing to do and then go to sleep - not a
>> big deal... Or am I missing something?
>
> Yes, you are right.
> I was studying the code and I found this inconsistency.
> Anyways, if there is NO_ACTION it will just loop and go to sleep again.
> I just posted this because I felt that the code was not achieving the logic
> that was intended in terms of sleeps and wakeups.
>
> I am currently trying to study the other patches you have just sent.
>
>>
>>> This protection is not achieved even by acquiring spinlocks before
>>> setting the task->state
>>> as the spinlock used in wakeup_timer_fn is &bdi->wb_lock whereas the code in
>>> bdi_forker_thread acquires &bdi_lock which is a different spin_lock.
>>>
>>> Am I correct in concluding this ?
>>
>>                                                                Honza
>> --
>> Jan Kara <jack@suse.cz>
>> SUSE Labs, CR
>>
>

* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
  2011-09-02 12:02 ` kautuk.c @samsung.com
@ 2011-09-02 15:14 ` Jan Kara
From: Jan Kara @ 2011-09-02 15:14 UTC
To: kautuk.c @samsung.com
Cc: Jan Kara, Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm, linux-kernel

On Fri 02-09-11 17:32:35, kautuk.c @samsung.com wrote:
> Hi Jan,
>
> I looked at that other patch you just sent.
>
> I think that the task state problem can still happen in that case as the setting
> of the task state is not protected by any lock and the timer callback can be
> executing on another CPU at that time.
>
> Am I right about this ?
  Yes, the cleanup is not meant to change the scenario you describe - as I
said, there's no point in protecting against it as it's harmless...

								Honza

> On Fri, Sep 2, 2011 at 5:14 PM, kautuk.c @samsung.com
> <consul.kautuk@gmail.com> wrote:
> > Hi,
> >
> > On Fri, Sep 2, 2011 at 4:51 PM, Jan Kara <jack@suse.cz> wrote:
> >>  Hello,
> >>
> >> On Fri 02-09-11 10:47:03, kautuk.c @samsung.com wrote:
> >>> On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> >>> > On Thu, 1 Sep 2011 21:27:02 +0530
> >>> > Kautuk Consul <consul.kautuk@gmail.com> wrote:
> >>> >
> >>> >> This is important for SMP scenario, to check whether the timer
> >>> >> callback is executing on another CPU when we are deleting the
> >>> >> timer.
> >>> >>
> >>> >
> >>> > I don't see why?
> >>> >
> >>> >> index d6edf8d..754b35a 100644
> >>> >> --- a/mm/backing-dev.c
> >>> >> +++ b/mm/backing-dev.c
> >>> >> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr)
> >>> >>  		 * dirty data on the default backing_dev_info
> >>> >>  		 */
> >>> >>  		if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) {
> >>> >> -			del_timer(&me->wakeup_timer);
> >>> >> +			del_timer_sync(&me->wakeup_timer);
> >>> >>  			wb_do_writeback(me, 0);
> >>> >>  		}
> >>> >
> >>> > It isn't a use-after-free fix: bdi_unregister() safely shoots down any
> >>> > running timer.
> >>> >
> >>>
> >>> In the situation that we do a del_timer at the same time that the
> >>> wakeup_timer_fn is
> >>> executing on another CPU, there is one tiny possible problem:
> >>> 1) The wakeup_timer_fn will call wake_up_process on the bdi-default thread.
> >>> This will set the bdi-default thread's state to TASK_RUNNING.
> >>> 2) However, the code in bdi_writeback_thread() sets the state of the
> >>> bdi-default process
> >>> to TASK_INTERRUPTIBLE as it intends to sleep later.
> >>>
> >>> If 2) happens before 1), then the bdi_forker_thread will not sleep
> >>> inside schedule as is the intention of the bdi_forker_thread() code.
> >>  OK, I agree the code in bdi_forker_thread() might use some straightening
> >> up wrt. task state handling but is what you describe really an issue? Sure
> >> the task won't go to sleep but the whole effect is that it will just loop
> >> once more to find out there's nothing to do and then go to sleep - not a
> >> big deal... Or am I missing something?
> >
> > Yes, you are right.
> > I was studying the code and I found this inconsistency.
> > Anyways, if there is NO_ACTION it will just loop and go to sleep again.
> > I just posted this because I felt that the code was not achieving the logic
> > that was intended in terms of sleeps and wakeups.
> >
> > I am currently trying to study the other patches you have just sent.
> >
> >>
> >>> This protection is not achieved even by acquiring spinlocks before
> >>> setting the task->state
> >>> as the spinlock used in wakeup_timer_fn is &bdi->wb_lock whereas the code in
> >>> bdi_forker_thread acquires &bdi_lock which is a different spin_lock.
> >>>
> >>> Am I correct in concluding this ?
> >>
> >>                                                                Honza
> >> --
> >> Jan Kara <jack@suse.cz>
> >> SUSE Labs, CR
> >>
> >
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer 2011-09-02 15:14 ` Jan Kara @ 2011-09-05 5:49 ` kautuk.c @samsung.com -1 siblings, 0 replies; 26+ messages in thread From: kautuk.c @samsung.com @ 2011-09-05 5:49 UTC (permalink / raw) To: Jan Kara Cc: Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm, linux-kernel On Fri, Sep 2, 2011 at 8:44 PM, Jan Kara <jack@suse.cz> wrote: > On Fri 02-09-11 17:32:35, kautuk.c @samsung.com wrote: >> Hi Jan, >> >> I looked at that other patch you just sent. >> >> I think that the task state problem can still happen in that case as the setting >> of the task state is not protected by any lock and the timer callback can be >> executing on another CPU at that time. >> >> Am I right about this ? > Yes, the cleanup is not meant to change the scenario you describe - as I > said, there's no point in protecting against it as it's harmless... > > Honza > On second thought: if the timer_fn of the default bdi wakes up the bdi_forker_thread, why waste CPU time on one more loop when we could know for certain that we want to sleep? Of course, if any of the other BDIs schedule work to the default BDI, the default thread will wake up reliably, as you mentioned. But if the default BDI's own timer_fn (me->wakeup_timer) on one CPU races with the code in bdi_forker_thread on another CPU, we end up in one more loop, which costs extra CPU when we could simply go to sleep in the current iteration of the loop if no work is found on its own bdi list (i.e., me->bdi->work_list). 
>> On Fri, Sep 2, 2011 at 5:14 PM, kautuk.c @samsung.com >> <consul.kautuk@gmail.com> wrote: >> > Hi, >> > >> > On Fri, Sep 2, 2011 at 4:51 PM, Jan Kara <jack@suse.cz> wrote: >> >> Hello, >> >> >> >> On Fri 02-09-11 10:47:03, kautuk.c @samsung.com wrote: >> >>> On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote: >> >>> > On Thu, 1 Sep 2011 21:27:02 +0530 >> >>> > Kautuk Consul <consul.kautuk@gmail.com> wrote: >> >>> > >> >>> >> This is important for SMP scenario, to check whether the timer >> >>> >> callback is executing on another CPU when we are deleting the >> >>> >> timer. >> >>> >> >> >>> > >> >>> > I don't see why? >> >>> > >> >>> >> index d6edf8d..754b35a 100644 >> >>> >> --- a/mm/backing-dev.c >> >>> >> +++ b/mm/backing-dev.c >> >>> >> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr) >> >>> >> * dirty data on the default backing_dev_info >> >>> >> */ >> >>> >> if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) { >> >>> >> - del_timer(&me->wakeup_timer); >> >>> >> + del_timer_sync(&me->wakeup_timer); >> >>> >> wb_do_writeback(me, 0); >> >>> >> } >> >>> > >> >>> > It isn't a use-after-free fix: bdi_unregister() safely shoots down any >> >>> > running timer. >> >>> > >> >>> >> >>> In the situation that we do a del_timer at the same time that the >> >>> wakeup_timer_fn is >> >>> executing on another CPU, there is one tiny possible problem: >> >>> 1) The wakeup_timer_fn will call wake_up_process on the bdi-default thread. >> >>> This will set the bdi-default thread's state to TASK_RUNNING. >> >>> 2) However, the code in bdi_writeback_thread() sets the state of the >> >>> bdi-default process >> >>> to TASK_INTERRUPTIBLE as it intends to sleep later. >> >>> >> >>> If 2) happens before 1), then the bdi_forker_thread will not sleep >> >>> inside schedule as is the intention of the bdi_forker_thread() code. >> >> OK, I agree the code in bdi_forker_thread() might use some straightening >> >> up wrt. 
task state handling but is what you describe really an issue? Sure >> >> the task won't go to sleep but the whole effect is that it will just loop >> >> once more to find out there's nothing to do and then go to sleep - not a >> >> big deal... Or am I missing something? >> > >> > Yes, you are right. >> > I was studying the code and I found this inconsistency. >> > Anyways, if there is NO_ACTION it will just loop and go to sleep again. >> > I just posted this because I felt that the code was not achieving the logic >> > that was intended in terms of sleeps and wakeups. >> > >> > I am currently trying to study the other patches you have just sent. >> > >> >> >> >>> This protection is not achieved even by acquiring spinlocks before >> >>> setting the task->state >> >>> as the spinlock used in wakeup_timer_fn is &bdi->wb_lock whereas the code in >> >>> bdi_forker_thread acquires &bdi_lock which is a different spin_lock. >> >>> >> >>> Am I correct in concluding this ? >> >> >> >> Honza >> >> -- >> >> Jan Kara <jack@suse.cz> >> >> SUSE Labs, CR >> >> >> > > -- > Jan Kara <jack@suse.cz> > SUSE Labs, CR > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer 2011-09-05 5:49 ` kautuk.c @samsung.com @ 2011-09-05 10:39 ` Jan Kara -1 siblings, 0 replies; 26+ messages in thread From: Jan Kara @ 2011-09-05 10:39 UTC (permalink / raw) To: kautuk.c @samsung.com Cc: Jan Kara, Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm, linux-kernel On Mon 05-09-11 11:19:46, kautuk.c @samsung.com wrote: > On Fri, Sep 2, 2011 at 8:44 PM, Jan Kara <jack@suse.cz> wrote: > > On Fri 02-09-11 17:32:35, kautuk.c @samsung.com wrote: > >> Hi Jan, > >> > >> I looked at that other patch you just sent. > >> > >> I think that the task state problem can still happen in that case as the setting > >> of the task state is not protected by any lock and the timer callback can be > >> executing on another CPU at that time. > >> > >> Am I right about this ? > > Yes, the cleanup is not meant to change the scenario you describe - as I > > said, there's no point in protecting against it as it's harmless... > > > On second thought: > In the case that the timer_fn of the default bdi causes the > bdi_forker_thread to wake up, why waste CPU time on one more loop when we > could know convincingly that we would want to sleep ?
OK, I don't care much whether we have del_timer() or del_timer_sync() there. Let me just say that the race you are afraid of is probably not going to happen in practice, so I'm not sure it's valid to be afraid of CPU cycles being burned needlessly. The timer is armed when a dirty inode is first attached to the default bdi's dirty list. Then the default bdi flusher thread would have to be woken up so that the following happens:
  CPU1                               CPU2
  timer fires -> wakeup_timer_fn()
                                     bdi_forker_thread()
                                       del_timer(&me->wakeup_timer);
                                       wb_do_writeback(me, 0);
                                       ...
                                       set_current_state(TASK_INTERRUPTIBLE);
  wake_up_process(default_backing_dev_info.wb.task);
wb_do_writeback() in particular is going to take a long time, so that alone makes the race unlikely.
Given del_timer_sync() is slightly more costly than del_timer() even for unarmed timer, it is questionable whether (chance race happens * CPU spent in extra loop) > (extra CPU spent in del_timer_sync() * frequency that code is executed in bdi_forker_thread())... Honza > Of course, if any of the other BDIs are scheduling work to the default > the default thread > will wake up reliably as you mentioned. > But, in the case that the race between the default BDI's own timer_fn > (me->wakeup_timer) > on one CPU with the code in bdi_forker_thread on another CPU happens, > we will end up in > one more loop which will result in more CPU usage when we could > actually just go to sleep in > the current iteration of the loop if no work is found on its own bdi > list (i.e., me->bdi->work_list). > > > >> On Fri, Sep 2, 2011 at 5:14 PM, kautuk.c @samsung.com > >> <consul.kautuk@gmail.com> wrote: > >> > Hi, > >> > > >> > On Fri, Sep 2, 2011 at 4:51 PM, Jan Kara <jack@suse.cz> wrote: > >> >> Hello, > >> >> > >> >> On Fri 02-09-11 10:47:03, kautuk.c @samsung.com wrote: > >> >>> On Fri, Sep 2, 2011 at 3:03 AM, Andrew Morton <akpm@linux-foundation.org> wrote: > >> >>> > On Thu, 1 Sep 2011 21:27:02 +0530 > >> >>> > Kautuk Consul <consul.kautuk@gmail.com> wrote: > >> >>> > > >> >>> >> This is important for SMP scenario, to check whether the timer > >> >>> >> callback is executing on another CPU when we are deleting the > >> >>> >> timer. > >> >>> >> > >> >>> > > >> >>> > I don't see why? 
> >> >>> > > >> >>> >> index d6edf8d..754b35a 100644 > >> >>> >> --- a/mm/backing-dev.c > >> >>> >> +++ b/mm/backing-dev.c > >> >>> >> @@ -385,7 +385,7 @@ static int bdi_forker_thread(void *ptr) > >> >>> >> * dirty data on the default backing_dev_info > >> >>> >> */ > >> >>> >> if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list)) { > >> >>> >> - del_timer(&me->wakeup_timer); > >> >>> >> + del_timer_sync(&me->wakeup_timer); > >> >>> >> wb_do_writeback(me, 0); > >> >>> >> } > >> >>> > > >> >>> > It isn't a use-after-free fix: bdi_unregister() safely shoots down any > >> >>> > running timer. > >> >>> > > >> >>> > >> >>> In the situation that we do a del_timer at the same time that the > >> >>> wakeup_timer_fn is > >> >>> executing on another CPU, there is one tiny possible problem: > >> >>> 1) The wakeup_timer_fn will call wake_up_process on the bdi-default thread. > >> >>> This will set the bdi-default thread's state to TASK_RUNNING. > >> >>> 2) However, the code in bdi_writeback_thread() sets the state of the > >> >>> bdi-default process > >> >>> to TASK_INTERRUPTIBLE as it intends to sleep later. > >> >>> > >> >>> If 2) happens before 1), then the bdi_forker_thread will not sleep > >> >>> inside schedule as is the intention of the bdi_forker_thread() code. > >> >> OK, I agree the code in bdi_forker_thread() might use some straightening > >> >> up wrt. task state handling but is what you describe really an issue? Sure > >> >> the task won't go to sleep but the whole effect is that it will just loop > >> >> once more to find out there's nothing to do and then go to sleep - not a > >> >> big deal... Or am I missing something? > >> > > >> > Yes, you are right. > >> > I was studying the code and I found this inconsistency. > >> > Anyways, if there is NO_ACTION it will just loop and go to sleep again. > >> > I just posted this because I felt that the code was not achieving the logic > >> > that was intended in terms of sleeps and wakeups. 
> >> > > >> > I am currently trying to study the other patches you have just sent. > >> > > >> >> > >> >>> This protection is not achieved even by acquiring spinlocks before > >> >>> setting the task->state > >> >>> as the spinlock used in wakeup_timer_fn is &bdi->wb_lock whereas the code in > >> >>> bdi_forker_thread acquires &bdi_lock which is a different spin_lock. > >> >>> > >> >>> Am I correct in concluding this ? > >> >> > >> >> Honza > >> >> -- > >> >> Jan Kara <jack@suse.cz> > >> >> SUSE Labs, CR > >> >> > >> > > > -- > > Jan Kara <jack@suse.cz> > > SUSE Labs, CR > > -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 26+ messages in thread
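The ordering discussed in this message can be modelled outside the kernel. The following is only a user-space sketch under stated assumptions, not kernel code: `struct task`, `model_wake_up()`, `model_prepare_to_sleep()`, `model_schedule_sleeps()` and `iterations_until_sleep()` are hypothetical stand-ins for the task state, wake_up_process(), set_current_state(TASK_INTERRUPTIBLE) and schedule(). It shows that a wakeup landing after the state has been set to TASK_INTERRUPTIBLE costs exactly one extra pass through the loop, while a wakeup landing before it costs none.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a task's scheduling state. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

struct task {
    enum task_state state;
};

/* Stand-in for wake_up_process(): the task becomes runnable again. */
static void model_wake_up(struct task *t)
{
    t->state = TASK_RUNNING;
}

/* Stand-in for set_current_state(TASK_INTERRUPTIBLE). */
static void model_prepare_to_sleep(struct task *t)
{
    t->state = TASK_INTERRUPTIBLE;
}

/* Stand-in for schedule(): the task only sleeps if it is not TASK_RUNNING. */
static bool model_schedule_sleeps(const struct task *t)
{
    return t->state != TASK_RUNNING;
}

/*
 * Run the modelled forker loop with a single pending timer wakeup and
 * return how many iterations pass before the task actually sleeps.
 * wake_after_state_set selects the ordering from the CPU1/CPU2 diagram:
 * true means the wakeup arrives after the state was set to
 * TASK_INTERRUPTIBLE, false means it arrives before.
 */
static int iterations_until_sleep(bool wake_after_state_set)
{
    struct task t = { TASK_RUNNING };
    bool wake_pending = true;
    int iterations = 0;

    for (;;) {
        iterations++;

        if (wake_pending && !wake_after_state_set) {
            model_wake_up(&t);
            wake_pending = false;
        }

        model_prepare_to_sleep(&t);

        if (wake_pending && wake_after_state_set) {
            model_wake_up(&t);
            wake_pending = false;
        }

        if (model_schedule_sleeps(&t))
            return iterations;
    }
}
```

In this model, `iterations_until_sleep(true)` needs two passes and `iterations_until_sleep(false)` needs one, which matches the "it will just loop once more" description in the thread.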
* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer 2011-09-05 10:39 ` Jan Kara @ 2011-09-05 14:36 ` kautuk.c @samsung.com -1 siblings, 0 replies; 26+ messages in thread From: kautuk.c @samsung.com @ 2011-09-05 14:36 UTC (permalink / raw) To: Jan Kara Cc: Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm, linux-kernel Hi, > OK, I don't care much whether we have there del_timer() or > del_timer_sync(). Let me just say that the race you are afraid of is > probably not going to happen in practice so I'm not sure it's valid to be > afraid of CPU cycles being burned needlessly. The timer is armed when a > dirty inode is first attached to default bdi's dirty list. Then the default > bdi flusher thread would have to be woken up so that following happens: > CPU1 CPU2 > timer fires -> wakeup_timer_fn() > bdi_forker_thread() > del_timer(&me->wakeup_timer); > wb_do_writeback(me, 0); > ... > set_current_state(TASK_INTERRUPTIBLE); > wake_up_process(default_backing_dev_info.wb.task); > > Especially wb_do_writeback() is going to take a long time so just that > single thing makes the race unlikely. Given del_timer_sync() is slightly > more costly than del_timer() even for unarmed timer, it is questionable > whether (chance race happens * CPU spent in extra loop) > (extra CPU spent > in del_timer_sync() * frequency that code is executed in > bdi_forker_thread())... > Ok, so this means that we can compare the following 2 paths of code: i) One extra iteration of the bdi_forker_thread loop, versus ii) The amount of time it takes for the del_timer_sync to wait till the timer_fn on the other CPU finishes executing + schedule resulting in a guaranteed sleep. Considering both situations to be a race until the task is removed from the runqueue (i.e., sleeps), I think ii) should be the better option, don't you think? Scenario i) will result in execution of the entire schedule() function once without resulting in the "sleep" of the task. 
Also, if another task is scheduled in, it could take a lot of CPU cycles before we return to this (bdi-default) task. Scenario ii) will result only in a couple more iterations of the del_timer_sync loop, which will quickly notice the completion of timer_fn on the other CPU, after which the call to schedule removes the current task from the runqueue with a guaranteed sleep. Is my reasoning correct/adequate? I know that the bdi_forker_thread doesn't do much on its own anyway, but I'm just trying to understand your expert opinion(s) on this aspect of the kernel code. :) ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
From: Jan Kara @ 2011-09-05 16:05 UTC
To: kautuk.c @samsung.com
Cc: Jan Kara, Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm, linux-kernel

  Hi,

On Mon 05-09-11 20:06:04, kautuk.c @samsung.com wrote:
> > OK, I don't care much whether we have there del_timer() or
> > del_timer_sync(). Let me just say that the race you are afraid of is
> > probably not going to happen in practice so I'm not sure it's valid to be
> > afraid of CPU cycles being burned needlessly. The timer is armed when an
> > dirty inode is first attached to default bdi's dirty list. Then the default
> > bdi flusher thread would have to be woken up so that following happens:
> >       CPU1                                CPU2
> >   timer fires -> wakeup_timer_fn()
> >                                       bdi_forker_thread()
> >                                         del_timer(&me->wakeup_timer);
> >                                         wb_do_writeback(me, 0);
> >                                         ...
> >                                         set_current_state(TASK_INTERRUPTIBLE);
> >     wake_up_process(default_backing_dev_info.wb.task);
> >
> > Especially wb_do_writeback() is going to take a long time so just that
> > single thing makes the race unlikely. Given del_timer_sync() is slightly
> > more costly than del_timer() even for unarmed timer, it is questionable
> > whether (chance race happens * CPU spent in extra loop) > (extra CPU spent
> > in del_timer_sync() * frequency that code is executed in
> > bdi_forker_thread())...
> >
>
> Ok, so this means that we can compare the following 2 paths of code:
> i)  One extra iteration of the bdi_forker_thread loop, versus
> ii) The amount of time it takes for the del_timer_sync to wait till the
> timer_fn on the other CPU finishes executing + schedule resulting in a
> guaranteed sleep.
  No, ii) is going to be as rare. But instead you should compare i) against:
iii) The amount of time it takes del_timer_sync() to check whether the
timer_fn is running on a different CPU (which is work del_timer() doesn't
do).

  We are going to spend time in iii) each and every time
if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list))
  evaluates to true.

  Now frequency of i) and iii) happening is hard to evaluate so it's not
clear what's going to be better. Certainly I don't think such evaluation is
worth my time...

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
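The comparison between i) and iii) can be written down as a small expected-cost calculation. The following is only a sketch under stated assumptions: the struct, its fields, and the function names are hypothetical, the costs are normalized per execution of the `if` branch in bdi_forker_thread(), and the rare waiting case ii) is ignored. No value here is measured.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs for illustration only; none of these are measured. */
struct timer_cost_model {
	double race_probability;  /* chance per call that the callback races with deletion */
	double extra_loop_cycles; /* cycles burned by one needless bdi_forker_thread iteration */
	double sync_check_cycles; /* extra cycles del_timer_sync() spends per call over del_timer() */
};

/* Case i): with del_timer() the extra loop is paid only when the race happens. */
static double expected_cost_del_timer(const struct timer_cost_model *m)
{
	assert(m->race_probability >= 0.0 && m->race_probability <= 1.0);
	return m->race_probability * m->extra_loop_cycles;
}

/* Case iii): with del_timer_sync() the running-callback check is paid on every call. */
static double expected_cost_del_timer_sync(const struct timer_cost_model *m)
{
	return m->sync_check_cycles;
}

/* True when the always-paid check costs less than the occasionally-paid extra loop. */
static bool sync_is_cheaper(const struct timer_cost_model *m)
{
	return expected_cost_del_timer_sync(m) < expected_cost_del_timer(m);
}
```

With made-up inputs such as a race probability of 0.000001, 5000 cycles for the extra loop and 20 cycles for the check, the model favours del_timer(); with a race probability of 0.5 it favours del_timer_sync(). Which regime the real code is in is exactly what the thread says cannot be decided without measurement.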
* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
From: kautuk.c @samsung.com @ 2011-09-06 4:11 UTC
To: Jan Kara
Cc: Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm, linux-kernel

Hi,

On Mon, Sep 5, 2011 at 9:35 PM, Jan Kara <jack@suse.cz> wrote:
>  Hi,
>
> On Mon 05-09-11 20:06:04, kautuk.c @samsung.com wrote:
>> > OK, I don't care much whether we have there del_timer() or
>> > del_timer_sync(). Let me just say that the race you are afraid of is
>> > probably not going to happen in practice so I'm not sure it's valid to be
>> > afraid of CPU cycles being burned needlessly. The timer is armed when an
>> > dirty inode is first attached to default bdi's dirty list. Then the default
>> > bdi flusher thread would have to be woken up so that following happens:
>> >       CPU1                                CPU2
>> >   timer fires -> wakeup_timer_fn()
>> >                                       bdi_forker_thread()
>> >                                         del_timer(&me->wakeup_timer);
>> >                                         wb_do_writeback(me, 0);
>> >                                         ...
>> >                                         set_current_state(TASK_INTERRUPTIBLE);
>> >     wake_up_process(default_backing_dev_info.wb.task);
>> >
>> > Especially wb_do_writeback() is going to take a long time so just that
>> > single thing makes the race unlikely. Given del_timer_sync() is slightly
>> > more costly than del_timer() even for unarmed timer, it is questionable
>> > whether (chance race happens * CPU spent in extra loop) > (extra CPU spent
>> > in del_timer_sync() * frequency that code is executed in
>> > bdi_forker_thread())...
>> >
>>
>> Ok, so this means that we can compare the following 2 paths of code:
>> i)  One extra iteration of the bdi_forker_thread loop, versus
>> ii) The amount of time it takes for the del_timer_sync to wait till the
>> timer_fn on the other CPU finishes executing + schedule resulting in a
>> guaranteed sleep.
>  No, ii) is going to be as rare. But instead you should compare i) against:
> iii) The amount of time it takes del_timer_sync() to check whether the
> timer_fn is running on a different CPU (which is work del_timer() doesn't
> do).

The amount of time it takes del_timer_sync to check the timer_fn should be
negligible.
In fact, try_to_del_timer_sync differs from del_timer_sync in only
that it performs
an additional check:
        if (base->running_timer == timer)
                goto out;

>
>  We are going to spend time in iii) each and every time
> if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list))
>  evaluates to true.

The amount of time spent on this every time will not matter much, as the
task will still be preemptible. However, if you notice that in most of
the bdi_forker_thread loop, we disable preemption due to taking a
spinlock so an additional loop there might be more costly.

>
>  Now frequency of i) and iii) happening is hard to evaluate so it's not
> clear what's going to be better. Certainly I don't think such evaluation is
> worth my time...
>

Ok. Anyways, thanks for explaining all this to me.
I really appreciate your time. :)

>                                                                Honza
> --
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
>
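The "additional check" mentioned in this message can be illustrated with a simplified userspace model. This is a sketch, not kernel code: every type and function prefixed `model_` is hypothetical, locking and the timer wheel are omitted, and only the return-value convention (1 for a pending timer that was deactivated, 0 for an inactive one, -1 while the callback is running) is taken from the kernel interfaces. As far as the kernel sources of this period go, the `running_timer` comparison is what separates try_to_del_timer_sync() from del_timer(), and del_timer_sync() retries it until the result is no longer -1; treat that attribution as an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_timer;

/* Hypothetical stand-in for the per-CPU timer base. */
struct model_timer_base {
	const struct model_timer *running_timer; /* timer whose callback is executing, or NULL */
};

/* Hypothetical stand-in for struct timer_list. */
struct model_timer {
	struct model_timer_base *base;
	bool pending;
};

/* Plain deletion: deactivates a pending timer and never looks at running_timer. */
static int model_del_timer(struct model_timer *t)
{
	if (!t->pending)
		return 0;
	t->pending = false;
	return 1;
}

/* Deletion with the extra check: refuses with -1 while the callback is running. */
static int model_try_to_del_timer_sync(struct model_timer *t)
{
	assert(t->base != NULL);
	if (t->base->running_timer == t)
		return -1; /* caller is expected to retry */
	return model_del_timer(t);
}
```

In this model the extra per-call work is one pointer comparison, which matches the claim that the check itself is cheap. The model says nothing about the cost of taking the base lock or of spinning while a callback runs.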
* Re: [PATCH 1/1] mm/backing-dev.c: Call del_timer_sync instead of del_timer
From: Jan Kara @ 2011-09-06 9:14 UTC
To: kautuk.c @samsung.com
Cc: Jan Kara, Andrew Morton, Jens Axboe, Wu Fengguang, Dave Chinner, linux-mm, linux-kernel

On Tue 06-09-11 09:41:42, kautuk.c @samsung.com wrote:
> > On Mon 05-09-11 20:06:04, kautuk.c @samsung.com wrote:
> >> > OK, I don't care much whether we have there del_timer() or
> >> > del_timer_sync(). Let me just say that the race you are afraid of is
> >> > probably not going to happen in practice so I'm not sure it's valid to be
> >> > afraid of CPU cycles being burned needlessly. The timer is armed when an
> >> > dirty inode is first attached to default bdi's dirty list. Then the default
> >> > bdi flusher thread would have to be woken up so that following happens:
> >> >       CPU1                                CPU2
> >> >   timer fires -> wakeup_timer_fn()
> >> >                                       bdi_forker_thread()
> >> >                                         del_timer(&me->wakeup_timer);
> >> >                                         wb_do_writeback(me, 0);
> >> >                                         ...
> >> >                                         set_current_state(TASK_INTERRUPTIBLE);
> >> >     wake_up_process(default_backing_dev_info.wb.task);
> >> >
> >> > Especially wb_do_writeback() is going to take a long time so just that
> >> > single thing makes the race unlikely. Given del_timer_sync() is slightly
> >> > more costly than del_timer() even for unarmed timer, it is questionable
> >> > whether (chance race happens * CPU spent in extra loop) > (extra CPU spent
> >> > in del_timer_sync() * frequency that code is executed in
> >> > bdi_forker_thread())...
> >> >
> >>
> >> Ok, so this means that we can compare the following 2 paths of code:
> >> i)  One extra iteration of the bdi_forker_thread loop, versus
> >> ii) The amount of time it takes for the del_timer_sync to wait till the
> >> timer_fn on the other CPU finishes executing + schedule resulting in a
> >> guaranteed sleep.
> >  No, ii) is going to be as rare. But instead you should compare i) against:
> > iii) The amount of time it takes del_timer_sync() to check whether the
> > timer_fn is running on a different CPU (which is work del_timer() doesn't
> > do).
>
> The amount of time it takes del_timer_sync to check the timer_fn should be
> negligible.
> In fact, try_to_del_timer_sync differs from del_timer_sync in only
> that it performs
> an additional check:
>         if (base->running_timer == timer)
>                 goto out;
  Yes, but the probability the race happens is also negligible. So you are
comparing two negligible things...

> >  We are going to spend time in iii) each and every time
> > if (wb_has_dirty_io(me) || !list_empty(&me->bdi->work_list))
> >  evaluates to true.
>
> The amount of time spent on this every time will not matter much, as the
> task will still be preemptible. However, if you notice that in most of
> the bdi_forker_thread loop, we disable preemption due to taking a
> spinlock so an additional loop there might be more costly.
  So either you speak about CPU cost in amount of cycles spent - and there
I still don't buy that it's clear del_timer_sync() is better than
del_timer() - or you speak about latency which is a different thing. From
latency POV that additional loop might be worse. But still I don't think
it's clear enough to change it without any measurement...

> >  Now frequency of i) and iii) happening is hard to evaluate so it's not
> > clear what's going to be better. Certainly I don't think such evaluation is
> > worth my time...
> >
>
> Ok. Anyways, thanks for explaining all this to me.
> I really appreciate your time. :)
  You are welcome. You made me refresh my memory about some parts of kernel
which is also valuable so thanks goes also to you :)

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR