linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* add_timer_on: in-kernel users _all_ buggy ?
@ 2010-02-18 14:28 Mathieu Desnoyers
  2010-02-18 16:21 ` Neil Horman
  0 siblings, 1 reply; 5+ messages in thread
From: Mathieu Desnoyers @ 2010-02-18 14:28 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, ltt-dev, linux-kernel, Steven Rostedt, Neil Horman,
	Martin Schwidefsky

* Mathieu Desnoyers (compudj@krystal.dyndns.org) wrote:
> * Thomas Gleixner (tglx@linutronix.de) wrote:
> > On Tue, 16 Feb 2010, Mathieu Desnoyers wrote:
> > > > The function is called from an IPI. That's a LTTNG problem, not a RT one.
> > > 
> > > I use del_timer in IPI to delete lttng per-cpu timers on all CPUs. I
> > > have to do this because timers created with add_timer_on are documented
> > > to be incompatible with del_timer_sync():
> > > 
> > >  * Synchronization rules: Callers must prevent restarting of the timer,
> > >  * otherwise this function is meaningless. It must not be called from
> > >  * interrupt contexts. The caller must not hold locks which would prevent
> > >  * completion of the timer's handler. The timer's handler must not call
> > >  * add_timer_on(). Upon exit the timer is not queued and the handler is
> > >  * not running on any CPU.
> > 
> > Errm. The documentation says: 
> > 
> >       "The timer's handler must not call add_timer_on()." 
> > 
> > It's not talking about a timer which was initialized with
> > add_timer_on().
> > 
> >  And your per cpu timer handlers have no requirement to call
> > add_timer_on() simply because add/mod_timer() is requeueing the timer
> > on the same cpu on which the handler runs. 
> > 
> > So the IPI is just a solution for a non existing problem.
> 
> Oh, right. Thanks for the explanation. I'll look into moving LTTng to a
> saner del_timer_sync() scheme to delete the timers.

Double-checking this:

add_timer_on() needs to be paired with mod_timer_pinned(), otherwise
NO_HZ SMP config can rebalance the timer to a different CPU. I am fixing
this in lttng 0.194. These per-cpu timers, of course, should usually be
deferrable (they are in lttng).

(looking at kernel 2.6.32.4 here)
Looking at the kernel/time/clocksource.c watchdog, I wonder how
del_timer manages to synchronize the timer teardown. The handler,
clocksource_watchdog(), uses add_timer_on(), which prohibits using
del_timer_sync(). This seems rather odd. If we remove the watchdog and
re-add it, it may still be in use while we initialize the timer
structure.

Also, net/core/drop_monitor.c trace_drop_common usage of add_timer_on
seems odd:

Executing (AFAIK) with preempt on, data points to a per-cpu timer:

       if (!timer_pending(&data->send_timer)) {
                data->send_timer.expires = jiffies + dm_delay * HZ;
                add_timer_on(&data->send_timer, smp_processor_id());
       }

How is timer_pending synchronized with the target CPU timer wheel ?

Wait, there's more: arch/x86/kernel/cpu/mcheck/mce.c uses both
add_timer_on in its handler and del_timer_sync (which is incorrect).

arch/x86/kernel/apic/x2apic_uv_x.c almost has it right, but maybe it
should use del_timer_sync ?

arch/powerpc/platforms/chrp/setup.c should learn about
mod_timer_pinned().

Which leads to the following question: is there _any_ add_timer_on()
kernel user that's not currently buggy ? ;-) Maybe this calls for better
documentation of this interface. From what I've learn from digging into
cpufreq to debug its incorrect timer teardown last year, I fear there
are lots and lots of buggy per-cpu _and_ standard timer interface users
out there.

Maybe adding some debugging options, e.g. checking that a timer created
with add_timer_on is always modified by mod_timer_pinned, and is always
deferrable, and deleted by del_timer_sync could help discovering a
couple of outlawyer.

Thanks,

Mathieu


> 
> Thanks,
> 
> Mathieu
> 
> > 
> > Thanks,
> > 
> > 	tglx
> > 
> 
> -- 
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> 
> _______________________________________________
> ltt-dev mailing list
> ltt-dev@lists.casi.polymtl.ca
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: add_timer_on: in-kernel users _all_ buggy ?
  2010-02-18 14:28 add_timer_on: in-kernel users _all_ buggy ? Mathieu Desnoyers
@ 2010-02-18 16:21 ` Neil Horman
  2010-02-18 16:41   ` Mathieu Desnoyers
  0 siblings, 1 reply; 5+ messages in thread
From: Neil Horman @ 2010-02-18 16:21 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Thomas Gleixner, Ingo Molnar, ltt-dev, linux-kernel,
	Steven Rostedt, Martin Schwidefsky

On Thu, Feb 18, 2010 at 09:28:00AM -0500, Mathieu Desnoyers wrote:
> * Mathieu Desnoyers (compudj@krystal.dyndns.org) wrote:
> > * Thomas Gleixner (tglx@linutronix.de) wrote:
> > > On Tue, 16 Feb 2010, Mathieu Desnoyers wrote:
> > > > > The function is called from an IPI. That's a LTTNG problem, not a RT one.
> > > > 
> > > > I use del_timer in IPI to delete lttng per-cpu timers on all CPUs. I
> > > > have to do this because timers created with add_timer_on are documented
> > > > to be incompatible with del_timer_sync():
> > > > 
> > > >  * Synchronization rules: Callers must prevent restarting of the timer,
> > > >  * otherwise this function is meaningless. It must not be called from
> > > >  * interrupt contexts. The caller must not hold locks which would prevent
> > > >  * completion of the timer's handler. The timer's handler must not call
> > > >  * add_timer_on(). Upon exit the timer is not queued and the handler is
> > > >  * not running on any CPU.
> > > 
> > > Errm. The documentation says: 
> > > 
> > >       "The timer's handler must not call add_timer_on()." 
> > > 
> > > It's not talking about a timer which was initialized with
> > > add_timer_on().
> > > 
> > >  And your per cpu timer handlers have no requirement to call
> > > add_timer_on() simply because add/mod_timer() is requeueing the timer
> > > on the same cpu on which the handler runs. 
> > > 
> > > So the IPI is just a solution for a non existing problem.
> > 
> > Oh, right. Thanks for the explanation. I'll look into moving LTTng to a
> > saner del_timer_sync() scheme to delete the timers.
> 
> Double-checking this:
> 
> add_timer_on() needs to be paired with mod_timer_pinned(), otherwise
> NO_HZ SMP config can rebalance the timer to a different CPU. I am fixing
> this in lttng 0.194. These per-cpu timers, of course, should usually be
> deferrable (they are in lttng).
> 
> (looking at kernel 2.6.32.4 here)
> Looking at the kernel/time/clocksource.c watchdog, I wonder how
> del_timer manages to synchronize the timer teardown. The handler,
> clocksource_watchdog(), uses add_timer_on(), which prohibits using
> del_timer_sync(). This seems rather odd. If we remove the watchdog and
> re-add it, it may still be in use while we initialize the timer
> structure.
> 
> Also, net/core/drop_monitor.c trace_drop_common usage of add_timer_on
> seems odd:
> 
> Executing (AFAIK) with preempt on, data points to a per-cpu timer:
> 
>        if (!timer_pending(&data->send_timer)) {
>                 data->send_timer.expires = jiffies + dm_delay * HZ;
>                 add_timer_on(&data->send_timer, smp_processor_id());
>        }
> 
> How is timer_pending synchronized with the target CPU timer wheel ?
> 
Hm, I think I see your point here.  You're suggesting that a call to one of the
tracepoint hooks in the drop monitor can race against a second call to the hook
from an interrupt context that pre-empted the first, leading to double add of
the timer?  I agree, in fact I think its likely worse that that, the shared data
on the skb that I modify there can get corrupted in that case as well.  I expect
a bit of refactoring paired with a local_irq_save/restore should fix that.

Thanks!
Neil

> Wait, there's more: arch/x86/kernel/cpu/mcheck/mce.c uses both
> add_timer_on in its handler and del_timer_sync (which is incorrect).
> 
> arch/x86/kernel/apic/x2apic_uv_x.c almost has it right, but maybe it
> should use del_timer_sync ?
> 
> arch/powerpc/platforms/chrp/setup.c should learn about
> mod_timer_pinned().
> 
> Which leads to the following question: is there _any_ add_timer_on()
> kernel user that's not currently buggy ? ;-) Maybe this calls for better
> documentation of this interface. From what I've learn from digging into
> cpufreq to debug its incorrect timer teardown last year, I fear there
> are lots and lots of buggy per-cpu _and_ standard timer interface users
> out there.
> 
> Maybe adding some debugging options, e.g. checking that a timer created
> with add_timer_on is always modified by mod_timer_pinned, and is always
> deferrable, and deleted by del_timer_sync could help discovering a
> couple of outlawyer.
> 
> Thanks,
> 
> Mathieu
> 
> 
> > 
> > Thanks,
> > 
> > Mathieu
> > 
> > > 
> > > Thanks,
> > > 
> > > 	tglx
> > > 
> > 
> > -- 
> > Mathieu Desnoyers
> > OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> > 
> > _______________________________________________
> > ltt-dev mailing list
> > ltt-dev@lists.casi.polymtl.ca
> > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> > 
> 
> -- 
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: add_timer_on: in-kernel users _all_ buggy ?
  2010-02-18 16:21 ` Neil Horman
@ 2010-02-18 16:41   ` Mathieu Desnoyers
  2010-02-18 18:33     ` Neil Horman
  0 siblings, 1 reply; 5+ messages in thread
From: Mathieu Desnoyers @ 2010-02-18 16:41 UTC (permalink / raw)
  To: Neil Horman
  Cc: Thomas Gleixner, Ingo Molnar, ltt-dev, linux-kernel,
	Steven Rostedt, Martin Schwidefsky

* Neil Horman (nhorman@tuxdriver.com) wrote:
> On Thu, Feb 18, 2010 at 09:28:00AM -0500, Mathieu Desnoyers wrote:
> > * Mathieu Desnoyers (compudj@krystal.dyndns.org) wrote:
> > > * Thomas Gleixner (tglx@linutronix.de) wrote:
> > > > On Tue, 16 Feb 2010, Mathieu Desnoyers wrote:
> > > > > > The function is called from an IPI. That's a LTTNG problem, not a RT one.
> > > > > 
> > > > > I use del_timer in IPI to delete lttng per-cpu timers on all CPUs. I
> > > > > have to do this because timers created with add_timer_on are documented
> > > > > to be incompatible with del_timer_sync():
> > > > > 
> > > > >  * Synchronization rules: Callers must prevent restarting of the timer,
> > > > >  * otherwise this function is meaningless. It must not be called from
> > > > >  * interrupt contexts. The caller must not hold locks which would prevent
> > > > >  * completion of the timer's handler. The timer's handler must not call
> > > > >  * add_timer_on(). Upon exit the timer is not queued and the handler is
> > > > >  * not running on any CPU.
> > > > 
> > > > Errm. The documentation says: 
> > > > 
> > > >       "The timer's handler must not call add_timer_on()." 
> > > > 
> > > > It's not talking about a timer which was initialized with
> > > > add_timer_on().
> > > > 
> > > >  And your per cpu timer handlers have no requirement to call
> > > > add_timer_on() simply because add/mod_timer() is requeueing the timer
> > > > on the same cpu on which the handler runs. 
> > > > 
> > > > So the IPI is just a solution for a non existing problem.
> > > 
> > > Oh, right. Thanks for the explanation. I'll look into moving LTTng to a
> > > saner del_timer_sync() scheme to delete the timers.
> > 
> > Double-checking this:
> > 
> > add_timer_on() needs to be paired with mod_timer_pinned(), otherwise
> > NO_HZ SMP config can rebalance the timer to a different CPU. I am fixing
> > this in lttng 0.194. These per-cpu timers, of course, should usually be
> > deferrable (they are in lttng).
> > 
> > (looking at kernel 2.6.32.4 here)
> > Looking at the kernel/time/clocksource.c watchdog, I wonder how
> > del_timer manages to synchronize the timer teardown. The handler,
> > clocksource_watchdog(), uses add_timer_on(), which prohibits using
> > del_timer_sync(). This seems rather odd. If we remove the watchdog and
> > re-add it, it may still be in use while we initialize the timer
> > structure.
> > 
> > Also, net/core/drop_monitor.c trace_drop_common usage of add_timer_on
> > seems odd:
> > 
> > Executing (AFAIK) with preempt on, data points to a per-cpu timer:
> > 
> >        if (!timer_pending(&data->send_timer)) {
> >                 data->send_timer.expires = jiffies + dm_delay * HZ;
> >                 add_timer_on(&data->send_timer, smp_processor_id());
> >        }
> > 
> > How is timer_pending synchronized with the target CPU timer wheel ?
> > 
> Hm, I think I see your point here.  You're suggesting that a call to one of the
> tracepoint hooks in the drop monitor can race against a second call to the hook
> from an interrupt context that pre-empted the first, leading to double add of
> the timer?

Yes.

> I agree, in fact I think its likely worse that that, the shared data
> on the skb that I modify there can get corrupted in that case as well.  I expect
> a bit of refactoring paired with a local_irq_save/restore should fix that.

For this current code, you don't seem to need to delete the timer. I think
you'll have to find a way to do the timer_pending check and the add_timer_on
operations atomically wrt timer wheel execution on the CPU (which can be a
remote cpu because preemption is enabled). Your case is a bit weird..  the
typical scenario here would be to use add_timer_on directly to re-arm the timer
for a future expiration time (removing the current pending timer
unconditionnally at the same time). But you leave the timer at its current
expiration time if present, and only if not present add it. I wonder if it's
really your intent ?

Thanks,

Mathieu


> 
> Thanks!
> Neil
> 
> > Wait, there's more: arch/x86/kernel/cpu/mcheck/mce.c uses both
> > add_timer_on in its handler and del_timer_sync (which is incorrect).
> > 
> > arch/x86/kernel/apic/x2apic_uv_x.c almost has it right, but maybe it
> > should use del_timer_sync ?
> > 
> > arch/powerpc/platforms/chrp/setup.c should learn about
> > mod_timer_pinned().
> > 
> > Which leads to the following question: is there _any_ add_timer_on()
> > kernel user that's not currently buggy ? ;-) Maybe this calls for better
> > documentation of this interface. From what I've learn from digging into
> > cpufreq to debug its incorrect timer teardown last year, I fear there
> > are lots and lots of buggy per-cpu _and_ standard timer interface users
> > out there.
> > 
> > Maybe adding some debugging options, e.g. checking that a timer created
> > with add_timer_on is always modified by mod_timer_pinned, and is always
> > deferrable, and deleted by del_timer_sync could help discovering a
> > couple of outlawyer.
> > 
> > Thanks,
> > 
> > Mathieu
> > 
> > 
> > > 
> > > Thanks,
> > > 
> > > Mathieu
> > > 
> > > > 
> > > > Thanks,
> > > > 
> > > > 	tglx
> > > > 
> > > 
> > > -- 
> > > Mathieu Desnoyers
> > > OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> > > 
> > > _______________________________________________
> > > ltt-dev mailing list
> > > ltt-dev@lists.casi.polymtl.ca
> > > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> > > 
> > 
> > -- 
> > Mathieu Desnoyers
> > OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> > 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: add_timer_on: in-kernel users _all_ buggy ?
  2010-02-18 16:41   ` Mathieu Desnoyers
@ 2010-02-18 18:33     ` Neil Horman
  2010-02-23 23:11       ` Mathieu Desnoyers
  0 siblings, 1 reply; 5+ messages in thread
From: Neil Horman @ 2010-02-18 18:33 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Thomas Gleixner, Ingo Molnar, ltt-dev, linux-kernel,
	Steven Rostedt, Martin Schwidefsky

On Thu, Feb 18, 2010 at 11:41:18AM -0500, Mathieu Desnoyers wrote:
> * Neil Horman (nhorman@tuxdriver.com) wrote:
> > On Thu, Feb 18, 2010 at 09:28:00AM -0500, Mathieu Desnoyers wrote:
> > > * Mathieu Desnoyers (compudj@krystal.dyndns.org) wrote:
> > > > * Thomas Gleixner (tglx@linutronix.de) wrote:
> > > > > On Tue, 16 Feb 2010, Mathieu Desnoyers wrote:
> > > > > > > The function is called from an IPI. That's a LTTNG problem, not a RT one.
> > > > > > 
> > > > > > I use del_timer in IPI to delete lttng per-cpu timers on all CPUs. I
> > > > > > have to do this because timers created with add_timer_on are documented
> > > > > > to be incompatible with del_timer_sync():
> > > > > > 
> > > > > >  * Synchronization rules: Callers must prevent restarting of the timer,
> > > > > >  * otherwise this function is meaningless. It must not be called from
> > > > > >  * interrupt contexts. The caller must not hold locks which would prevent
> > > > > >  * completion of the timer's handler. The timer's handler must not call
> > > > > >  * add_timer_on(). Upon exit the timer is not queued and the handler is
> > > > > >  * not running on any CPU.
> > > > > 
> > > > > Errm. The documentation says: 
> > > > > 
> > > > >       "The timer's handler must not call add_timer_on()." 
> > > > > 
> > > > > It's not talking about a timer which was initialized with
> > > > > add_timer_on().
> > > > > 
> > > > >  And your per cpu timer handlers have no requirement to call
> > > > > add_timer_on() simply because add/mod_timer() is requeueing the timer
> > > > > on the same cpu on which the handler runs. 
> > > > > 
> > > > > So the IPI is just a solution for a non existing problem.
> > > > 
> > > > Oh, right. Thanks for the explanation. I'll look into moving LTTng to a
> > > > saner del_timer_sync() scheme to delete the timers.
> > > 
> > > Double-checking this:
> > > 
> > > add_timer_on() needs to be paired with mod_timer_pinned(), otherwise
> > > NO_HZ SMP config can rebalance the timer to a different CPU. I am fixing
> > > this in lttng 0.194. These per-cpu timers, of course, should usually be
> > > deferrable (they are in lttng).
> > > 
> > > (looking at kernel 2.6.32.4 here)
> > > Looking at the kernel/time/clocksource.c watchdog, I wonder how
> > > del_timer manages to synchronize the timer teardown. The handler,
> > > clocksource_watchdog(), uses add_timer_on(), which prohibits using
> > > del_timer_sync(). This seems rather odd. If we remove the watchdog and
> > > re-add it, it may still be in use while we initialize the timer
> > > structure.
> > > 
> > > Also, net/core/drop_monitor.c trace_drop_common usage of add_timer_on
> > > seems odd:
> > > 
> > > Executing (AFAIK) with preempt on, data points to a per-cpu timer:
> > > 
> > >        if (!timer_pending(&data->send_timer)) {
> > >                 data->send_timer.expires = jiffies + dm_delay * HZ;
> > >                 add_timer_on(&data->send_timer, smp_processor_id());
> > >        }
> > > 
> > > How is timer_pending synchronized with the target CPU timer wheel ?
> > > 
> > Hm, I think I see your point here.  You're suggesting that a call to one of the
> > tracepoint hooks in the drop monitor can race against a second call to the hook
> > from an interrupt context that pre-empted the first, leading to double add of
> > the timer?
> 
> Yes.
> 
> > I agree, in fact I think its likely worse that that, the shared data
> > on the skb that I modify there can get corrupted in that case as well.  I expect
> > a bit of refactoring paired with a local_irq_save/restore should fix that.
> 
> For this current code, you don't seem to need to delete the timer. I think
> you'll have to find a way to do the timer_pending check and the add_timer_on
> operations atomically wrt timer wheel execution on the CPU (which can be a
> remote cpu because preemption is enabled). Your case is a bit weird..  the
> typical scenario here would be to use add_timer_on directly to re-arm the timer
> for a future expiration time (removing the current pending timer
> unconditionnally at the same time). But you leave the timer at its current
> expiration time if present, and only if not present add it. I wonder if it's
> really your intent ?
> 

Yes, thats my intent.  The idea is that when I note a packet is lost, I record
that in a data buffer that is part of an skb, and schedule a timer to send that
skb a second or so later.  Other drops in that time frame are recorded within
the same skb.  Since the drop monitor isn't built as a module currently, I don't
need to worry about deleting the timer at all, I can just let it expire.

Sounds like I need to wrap up the data extract and time modifications in a
prempt_disable/enable and local_irq_save/restore set
Neil

> Thanks,
> 
> Mathieu
> 
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: add_timer_on: in-kernel users _all_ buggy ?
  2010-02-18 18:33     ` Neil Horman
@ 2010-02-23 23:11       ` Mathieu Desnoyers
  0 siblings, 0 replies; 5+ messages in thread
From: Mathieu Desnoyers @ 2010-02-23 23:11 UTC (permalink / raw)
  To: Neil Horman
  Cc: Thomas Gleixner, Ingo Molnar, ltt-dev, linux-kernel,
	Steven Rostedt, Martin Schwidefsky

* Neil Horman (nhorman@tuxdriver.com) wrote:
> On Thu, Feb 18, 2010 at 11:41:18AM -0500, Mathieu Desnoyers wrote:
> > * Neil Horman (nhorman@tuxdriver.com) wrote:
> > > On Thu, Feb 18, 2010 at 09:28:00AM -0500, Mathieu Desnoyers wrote:
> > > > * Mathieu Desnoyers (compudj@krystal.dyndns.org) wrote:
> > > > > * Thomas Gleixner (tglx@linutronix.de) wrote:
> > > > > > On Tue, 16 Feb 2010, Mathieu Desnoyers wrote:
> > > > > > > > The function is called from an IPI. That's a LTTNG problem, not a RT one.
> > > > > > > 
> > > > > > > I use del_timer in IPI to delete lttng per-cpu timers on all CPUs. I
> > > > > > > have to do this because timers created with add_timer_on are documented
> > > > > > > to be incompatible with del_timer_sync():
> > > > > > > 
> > > > > > >  * Synchronization rules: Callers must prevent restarting of the timer,
> > > > > > >  * otherwise this function is meaningless. It must not be called from
> > > > > > >  * interrupt contexts. The caller must not hold locks which would prevent
> > > > > > >  * completion of the timer's handler. The timer's handler must not call
> > > > > > >  * add_timer_on(). Upon exit the timer is not queued and the handler is
> > > > > > >  * not running on any CPU.
> > > > > > 
> > > > > > Errm. The documentation says: 
> > > > > > 
> > > > > >       "The timer's handler must not call add_timer_on()." 
> > > > > > 
> > > > > > It's not talking about a timer which was initialized with
> > > > > > add_timer_on().
> > > > > > 
> > > > > >  And your per cpu timer handlers have no requirement to call
> > > > > > add_timer_on() simply because add/mod_timer() is requeueing the timer
> > > > > > on the same cpu on which the handler runs. 
> > > > > > 
> > > > > > So the IPI is just a solution for a non existing problem.
> > > > > 
> > > > > Oh, right. Thanks for the explanation. I'll look into moving LTTng to a
> > > > > saner del_timer_sync() scheme to delete the timers.
> > > > 
> > > > Double-checking this:
> > > > 
> > > > add_timer_on() needs to be paired with mod_timer_pinned(), otherwise
> > > > NO_HZ SMP config can rebalance the timer to a different CPU. I am fixing
> > > > this in lttng 0.194. These per-cpu timers, of course, should usually be
> > > > deferrable (they are in lttng).
> > > > 
> > > > (looking at kernel 2.6.32.4 here)
> > > > Looking at the kernel/time/clocksource.c watchdog, I wonder how
> > > > del_timer manages to synchronize the timer teardown. The handler,
> > > > clocksource_watchdog(), uses add_timer_on(), which prohibits using
> > > > del_timer_sync(). This seems rather odd. If we remove the watchdog and
> > > > re-add it, it may still be in use while we initialize the timer
> > > > structure.
> > > > 
> > > > Also, net/core/drop_monitor.c trace_drop_common usage of add_timer_on
> > > > seems odd:
> > > > 
> > > > Executing (AFAIK) with preempt on, data points to a per-cpu timer:
> > > > 
> > > >        if (!timer_pending(&data->send_timer)) {
> > > >                 data->send_timer.expires = jiffies + dm_delay * HZ;
> > > >                 add_timer_on(&data->send_timer, smp_processor_id());
> > > >        }
> > > > 
> > > > How is timer_pending synchronized with the target CPU timer wheel ?
> > > > 
> > > Hm, I think I see your point here.  You're suggesting that a call to one of the
> > > tracepoint hooks in the drop monitor can race against a second call to the hook
> > > from an interrupt context that pre-empted the first, leading to double add of
> > > the timer?
> > 
> > Yes.
> > 
> > > I agree, in fact I think its likely worse that that, the shared data
> > > on the skb that I modify there can get corrupted in that case as well.  I expect
> > > a bit of refactoring paired with a local_irq_save/restore should fix that.
> > 
> > For this current code, you don't seem to need to delete the timer. I think
> > you'll have to find a way to do the timer_pending check and the add_timer_on
> > operations atomically wrt timer wheel execution on the CPU (which can be a
> > remote cpu because preemption is enabled). Your case is a bit weird..  the
> > typical scenario here would be to use add_timer_on directly to re-arm the timer
> > for a future expiration time (removing the current pending timer
> > unconditionnally at the same time). But you leave the timer at its current
> > expiration time if present, and only if not present add it. I wonder if it's
> > really your intent ?
> > 
> 
> Yes, thats my intent.  The idea is that when I note a packet is lost, I record
> that in a data buffer that is part of an skb, and schedule a timer to send that
> skb a second or so later.  Other drops in that time frame are recorded within
> the same skb.  Since the drop monitor isn't built as a module currently, I don't
> need to worry about deleting the timer at all, I can just let it expire.
> 
> Sounds like I need to wrap up the data extract and time modifications in a
> prempt_disable/enable and local_irq_save/restore set
> Neil

I fear this will probably break the RT kernel, because the timer spinlocks
become sleepable (I had the same issue within LTTng). Another approach you could
try is to add a new member to the timer API, which hold the local timer base
lock while:

if the timer is already pending
  - waits for already executing timers to complete their execution.
  - add_timer_on local cpu with new expiration.
else
  - do nothing

Thanks,

Mathieu

> 
> > Thanks,
> > 
> > Mathieu
> > 
> > 

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2010-02-23 23:11 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-18 14:28 add_timer_on: in-kernel users _all_ buggy ? Mathieu Desnoyers
2010-02-18 16:21 ` Neil Horman
2010-02-18 16:41   ` Mathieu Desnoyers
2010-02-18 18:33     ` Neil Horman
2010-02-23 23:11       ` Mathieu Desnoyers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).