linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] softirq: don't push timer softirq handling to ksoftirqd
@ 2018-11-15 17:07 Michael Zhivich
  2018-11-15 17:17 ` John Stultz
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Zhivich @ 2018-11-15 17:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: tiny.windzz, joel, alexander.levin, frederic, bigeasy, mingo,
	rostedt, paulmck, tglx, john.stultz, arnd, omosnace,
	jason.wessel, kreview, Michael Zhivich

Require TIMER_SOFTIRQ to be handled immediately instead of delaying until
ksoftirqd runs, thus preventing problems with reading clocksources that
wrap often (e.g. acpi_pm).

If acpi_pm is used as the clocksource watchdog, and machine is under heavy
load, the time period for the watchdog check may be significantly longer
than the requested 0.5 seconds.  If the watchdog check is delayed by 2
seconds (observed behavior), then acpi_pm time delta will be

    2.5 sec * 3579545 ticks/sec = 8948863 = 0x888c3f

which will be treated as negative (since acpi_pm is only 24-bits wide) and
truncated to 0.  This behavior will cause tsc to be incorrectly declared
unstable in clocksource_watchdog(), as it no longer agrees with acpi_pm.
If the clocksource watchdog check is delayed by more than 4.7 sec, then the
acpi_pm clocksource will wrap altogether and produce incorrect time delta.

The likely cause of this delay is that timer interrupts are serviced in
ksoftirqd when the machine is very busy.

Per Linus' comment in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
synchronous"):
   ...
   We should probably also consider the timer softirqs to be synchronous
   and not be delayed to ksoftirqd (since they were the issue with the
   earlier watchdog problems), but that should be done as a separate patch.
   ...

Signed-off-by: Michael Zhivich <mzhivich@akamai.com>
---
 kernel/softirq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index d28813306b2c..6d517ce0fba8 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -82,7 +82,8 @@ static void wakeup_softirqd(void)
  * right now. Let ksoftirqd handle this at its own rate, to get fairness,
  * unless we're doing some of the synchronous softirqs.
  */
-#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
+#define SOFTIRQ_NOW_MASK \
+	((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ) | (1 << TIMER_SOFTIRQ))
 static bool ksoftirqd_running(unsigned long pending)
 {
 	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] softirq: don't push timer softirq handling to ksoftirqd
  2018-11-15 17:07 [PATCH] softirq: don't push timer softirq handling to ksoftirqd Michael Zhivich
@ 2018-11-15 17:17 ` John Stultz
  2018-11-16 18:46   ` Zhivich, Michael
  0 siblings, 1 reply; 5+ messages in thread
From: John Stultz @ 2018-11-15 17:17 UTC (permalink / raw)
  To: Michael Zhivich
  Cc: lkml, tiny.windzz, Joel Fernandes, alexander.levin, frederic,
	Sebastian Andrzej Siewior, Ingo Molnar, Steven Rostedt,
	Paul E. McKenney, Thomas Gleixner, Arnd Bergmann,
	Ondrej Mosnacek, Jason Wessel, kreview

On Thu, Nov 15, 2018 at 9:07 AM, Michael Zhivich <mzhivich@akamai.com> wrote:
> Require TIMER_SOFTIRQ to be handled immediately instead of delaying until
> ksoftirqd runs, thus preventing problems with reading clocksources that
> wrap often (e.g. acpi_pm).
>
> If acpi_pm is used as the clocksource watchdog, and machine is under heavy
> load, the time period for the watchdog check may be significantly longer
> than the requested 0.5 seconds.  If the watchdog check is delayed by 2
> seconds (observed behavior), then acpi_pm time delta will be
>
>     2.5 sec * 3579545 ticks/sec = 8948863 = 0x888c3f
>
> which will be treated as negative (since acpi_pm is only 24-bits wide) and
> truncated to 0.  This behavior will cause tsc to be incorrectly declared
> unstable in clocksource_watchdog(), as it no longer agrees with acpi_pm.
> If the clocksource watchdog check is delayed by more than 4.7 sec, then the
> acpi_pm clocksource will wrap altogether and produce incorrect time delta.
>
> The likely cause of this delay is that timer interrupts are serviced in
> ksoftirqd when the machine is very busy.
>
> Per Linus' comment in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
> synchronous"):
>    ...
>    We should probably also consider the timer softirqs to be synchronous
>    and not be delayed to ksoftirqd (since they were the issue with the
>    earlier watchdog problems), but that should be done as a separate patch.
>    ...
>
> Signed-off-by: Michael Zhivich <mzhivich@akamai.com>
> ---
>  kernel/softirq.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index d28813306b2c..6d517ce0fba8 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -82,7 +82,8 @@ static void wakeup_softirqd(void)
>   * right now. Let ksoftirqd handle this at its own rate, to get fairness,
>   * unless we're doing some of the synchronous softirqs.
>   */
> -#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
> +#define SOFTIRQ_NOW_MASK \
> +       ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ) | (1 << TIMER_SOFTIRQ))
>  static bool ksoftirqd_running(unsigned long pending)
>  {
>         struct task_struct *tsk = __this_cpu_read(ksoftirqd);

Thanks so much for sending this along! Sorry I didn't get back to your
mail earlier this week, I've been at Plumbers.

So while this does try to attack the reliability issue w/ the
clocksource watchdog being delayed, I worry this will have to many
side-effects elsewhere.

Would a more focused fix be to move the clocksource watchdog from a
normal timer to a hrtimer?

thanks
-john

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] softirq: don't push timer softirq handling to ksoftirqd
  2018-11-15 17:17 ` John Stultz
@ 2018-11-16 18:46   ` Zhivich, Michael
  2018-11-28 15:21     ` Thomas Gleixner
  0 siblings, 1 reply; 5+ messages in thread
From: Zhivich, Michael @ 2018-11-16 18:46 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, tiny.windzz, Joel Fernandes, alexander.levin, frederic,
	Sebastian Andrzej Siewior, Ingo Molnar, Steven Rostedt,
	Paul E. McKenney, Thomas Gleixner, Arnd Bergmann,
	Ondrej Mosnacek, Jason Wessel, kreview

On 11/15/18, 12:17 PM, "John Stultz" <john.stultz@linaro.org> wrote:

    On Thu, Nov 15, 2018 at 9:07 AM, Michael Zhivich <mzhivich@akamai.com> wrote:
    > Require TIMER_SOFTIRQ to be handled immediately instead of delaying until
    > ksoftirqd runs, thus preventing problems with reading clocksources that
    > wrap often (e.g. acpi_pm).
    >
    > If acpi_pm is used as the clocksource watchdog, and machine is under heavy
    > load, the time period for the watchdog check may be significantly longer
    > than the requested 0.5 seconds.  If the watchdog check is delayed by 2
    > seconds (observed behavior), then acpi_pm time delta will be
    >
    >     2.5 sec * 3579545 ticks/sec = 8948863 = 0x888c3f
    >
    > which will be treated as negative (since acpi_pm is only 24-bits wide) and
    > truncated to 0.  This behavior will cause tsc to be incorrectly declared
    > unstable in clocksource_watchdog(), as it no longer agrees with acpi_pm.
    > If the clocksource watchdog check is delayed by more than 4.7 sec, then the
    > acpi_pm clocksource will wrap altogether and produce incorrect time delta.
    >
    > The likely cause of this delay is that timer interrupts are serviced in
    > ksoftirqd when the machine is very busy.
    >
    > Per Linus' comment in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
    > synchronous"):
    >    ...
    >    We should probably also consider the timer softirqs to be synchronous
    >    and not be delayed to ksoftirqd (since they were the issue with the
    >    earlier watchdog problems), but that should be done as a separate patch.
    >    ...
    >
    > Signed-off-by: Michael Zhivich <mzhivich@akamai.com>
    > ---
    >  kernel/softirq.c | 3 ++-
    >  1 file changed, 2 insertions(+), 1 deletion(-)
    >
    > diff --git a/kernel/softirq.c b/kernel/softirq.c
    > index d28813306b2c..6d517ce0fba8 100644
    > --- a/kernel/softirq.c
    > +++ b/kernel/softirq.c
    > @@ -82,7 +82,8 @@ static void wakeup_softirqd(void)
    >   * right now. Let ksoftirqd handle this at its own rate, to get fairness,
    >   * unless we're doing some of the synchronous softirqs.
    >   */
    > -#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
    > +#define SOFTIRQ_NOW_MASK \
    > +       ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ) | (1 << TIMER_SOFTIRQ))
    >  static bool ksoftirqd_running(unsigned long pending)
    >  {
    >         struct task_struct *tsk = __this_cpu_read(ksoftirqd);
    
    Thanks so much for sending this along! Sorry I didn't get back to your
    mail earlier this week, I've been at Plumbers.
    
    So while this does try to attack the reliability issue w/ the
    clocksource watchdog being delayed, I worry this will have to many
    side-effects elsewhere.
    
    Would a more focused fix be to move the clocksource watchdog from a
    normal timer to a hrtimer?
    
    thanks
    -john
    
Hi John,

That's an interesting idea - it would get clocksource watchdog out of ksoftirqd.  However, clocksource watchdog iterates over available CPUs to check the TSC on each core (see add_timer_on() call in clocksource_watchdog()).  I'm not seeing an API to start an hrtimer on a specific CPU - is this possible and I'm missing it?  Or would something like this have to be added to hrtimer?

Thanks,
~ Michael


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] softirq: don't push timer softirq handling to ksoftirqd
  2018-11-16 18:46   ` Zhivich, Michael
@ 2018-11-28 15:21     ` Thomas Gleixner
  2018-11-28 17:56       ` [kreview] " Zhivich, Michael
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Gleixner @ 2018-11-28 15:21 UTC (permalink / raw)
  To: Zhivich, Michael
  Cc: John Stultz, lkml, tiny.windzz, Joel Fernandes, alexander.levin,
	frederic, Sebastian Andrzej Siewior, Ingo Molnar, Steven Rostedt,
	Paul E. McKenney, Arnd Bergmann, Ondrej Mosnacek, Jason Wessel,
	kreview

Michael,

On Fri, 16 Nov 2018, Zhivich, Michael wrote:
> On 11/15/18, 12:17 PM, "John Stultz" <john.stultz@linaro.org> wrote:
>     Would a more focused fix be to move the clocksource watchdog from a
>     normal timer to a hrtimer?
>     
> That's an interesting idea - it would get clocksource watchdog out of
> ksoftirqd.  However, clocksource watchdog iterates over available CPUs to
> check the TSC on each core (see add_timer_on() call in
> clocksource_watchdog()).  I'm not seeing an API to start an hrtimer on a
> specific CPU - is this possible and I'm missing it?  Or would something
> like this have to be added to hrtimer?

It's surely an interesting idea, but it's not trivial.

There is no way to reliably queue hrtimers remote when high resolution mode
is enabled. It only works when the to be queued timer is not the first to
expire one. If it ends up being the first to expire timer, then there is
currently no way to rearm the remote per cpu clockevent device because it's
not remotely accesible.

It would need an SMP function call, but that needs to be asynchronous due
to locking/interrupt disabled constraints. Maybe ...

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [kreview] [PATCH] softirq: don't push timer softirq handling to ksoftirqd
  2018-11-28 15:21     ` Thomas Gleixner
@ 2018-11-28 17:56       ` Zhivich, Michael
  0 siblings, 0 replies; 5+ messages in thread
From: Zhivich, Michael @ 2018-11-28 17:56 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Arnd Bergmann, Ondrej Mosnacek, tiny.windzz,
	frederic, lkml, Steven Rostedt, alexander.levin, John Stultz,
	kreview, Jason Wessel, Joel Fernandes, Paul E. McKenney,
	Sebastian Andrzej Siewior

On 11/28/18, 10:22 AM, "Thomas Gleixner" <tglx@linutronix.de> wrote:

    Michael,
    
    On Fri, 16 Nov 2018, Zhivich, Michael wrote:
    > On 11/15/18, 12:17 PM, "John Stultz" <john.stultz@linaro.org> wrote:
    >     Would a more focused fix be to move the clocksource watchdog from a
    >     normal timer to a hrtimer?
    >     
    > That's an interesting idea - it would get clocksource watchdog out of
    > ksoftirqd.  However, clocksource watchdog iterates over available CPUs to
    > check the TSC on each core (see add_timer_on() call in
    > clocksource_watchdog()).  I'm not seeing an API to start an hrtimer on a
    > specific CPU - is this possible and I'm missing it?  Or would something
    > like this have to be added to hrtimer?
    
    It's surely an interesting idea, but it's not trivial.
    
    There is no way to reliably queue hrtimers remote when high resolution mode
    is enabled. It only works when the to be queued timer is not the first to
    expire one. If it ends up being the first to expire timer, then there is
    currently no way to rearm the remote per cpu clockevent device because it's
    not remotely accesible.
    
    It would need an SMP function call, but that needs to be asynchronous due
    to locking/interrupt disabled constraints. Maybe ...
    
    Thanks,
    
    	tglx

    
Thomas,

Thanks for the feedback - it does sound tricky to get right.  I'll spend some more time thinking about it.

My original patch tries to ensure that timer softirqs are serviced immediately, not punted to ksoftirqd.  It is perhaps too heavy-handed (all softirqs will get serviced if a timer softirq fired), but I imagine the clocksource watchdog is not the only timer that doesn't want to be arbitrarily delayed when the machine is busy.

Would it make sense to modify the patch such that only timer softirqs are serviced immediately if fired?  Or are most timers that require precision wakeups using hrtimer framework?

Thanks,
~ Michael



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-11-28 17:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-15 17:07 [PATCH] softirq: don't push timer softirq handling to ksoftirqd Michael Zhivich
2018-11-15 17:17 ` John Stultz
2018-11-16 18:46   ` Zhivich, Michael
2018-11-28 15:21     ` Thomas Gleixner
2018-11-28 17:56       ` [kreview] " Zhivich, Michael

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).