linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] hrtimer: About the warning message in hrtimer_interrupt()
@ 2021-12-24  2:39 Rei Yamamoto
  2022-02-24  9:28 ` [PATCH] hrtimer: Remove a " Rei Yamamoto
  0 siblings, 1 reply; 9+ messages in thread
From: Rei Yamamoto @ 2021-12-24  2:39 UTC (permalink / raw)
  To: tglx; +Cc: linux-kernel, geert+renesas, yamamoto.rei

When the following warning message of hrtimer_interrupt() is output,
I often ignore it because I don't know what I should do.
-----
kernel/time/hrtimer.c
void hrtimer_interrupt(struct clock_event_device *dev)
{
:
        pr_warn_once("hrtimer: interrupt took %llu ns\n", 
		      ktime_to_ns(delta));
-----

What is the intention of this message?
What action should be required for this warning?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] hrtimer: Remove a warning message in hrtimer_interrupt()
  2021-12-24  2:39 [RFC] hrtimer: About the warning message in hrtimer_interrupt() Rei Yamamoto
@ 2022-02-24  9:28 ` Rei Yamamoto
  2022-03-25 19:48   ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Rei Yamamoto @ 2022-02-24  9:28 UTC (permalink / raw)
  To: tglx; +Cc: geert+renesas, linux-kernel, yamamoto.rei

It seems the warning message in hrtimer_interrupt() has
no an intention.
The required action for the warning message in hrtimer_interrupt() 
is not clear.
Remove this message not to confuse user.

Signed-off-by: Rei Yamamoto <yamamoto.rei@jp.fujitsu.com>
---
 kernel/time/hrtimer.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 0ea8702eb516..4f4eb9d1f05a 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1866,7 +1866,6 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 	else
 		expires_next = ktime_add(now, delta);
 	tick_program_event(expires_next, 1);
-	pr_warn_once("hrtimer: interrupt took %llu ns\n", ktime_to_ns(delta));
 }
 
 /* called with interrupts disabled */
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] hrtimer: Remove a warning message in hrtimer_interrupt()
  2022-02-24  9:28 ` [PATCH] hrtimer: Remove a " Rei Yamamoto
@ 2022-03-25 19:48   ` Thomas Gleixner
  2022-06-24  7:00     ` [PATCH] hrtimer: CPU and entry_time is added to " Rei Yamamoto
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2022-03-25 19:48 UTC (permalink / raw)
  To: Rei Yamamoto; +Cc: geert+renesas, linux-kernel, yamamoto.rei

On Thu, Feb 24 2022 at 18:28, Rei Yamamoto wrote:
> It seems the warning message in hrtimer_interrupt() has
> no an intention.
> The required action for the warning message in hrtimer_interrupt() 
> is not clear.

The required action is to figure out why the hrtimer interrupt is
looping for a long time.

> Remove this message not to confuse user.

No. This message is interesting because as a consequence of an overly
long hrtimer interrupt the hrtimer interrupt will be deferred so tasks
can make progress. That means for the time of deferral there will be no
hrtimers fire, which is confusing if the message is removed because you
cannot figure out the reason.

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] hrtimer: CPU and entry_time is added to a warning message in hrtimer_interrupt()
  2022-03-25 19:48   ` Thomas Gleixner
@ 2022-06-24  7:00     ` Rei Yamamoto
  2022-06-24  7:09       ` Rei Yamamoto
  2022-12-01 21:30       ` Thomas Gleixner
  0 siblings, 2 replies; 9+ messages in thread
From: Rei Yamamoto @ 2022-06-24  7:00 UTC (permalink / raw)
  To: tglx; +Cc: geert+renesas, linux-kernel, yamamoto.rei

A warning message in hrtimer_interrupt() is output up to 5 times
by default, and CPU and entry_time are also shown.
These changes are helpful that the function spending a lot of time is clear
by using ftrace:
-----
dmesg
:
[ 1462.836971] start repro_hrtimer_interrupt
[ 1462.836976] test_handler = test_handler [repro] 0xffff9788f7bb3048
:
[ 1462.879117] hrtimer: CPU 7, entry_time = 1462807264840, interrupt took
60048886 ns             ^^^^^               ^^^^^^^^^^^^^

cat /sys/kernel/debug/tracing/trace
:
          <idle>-0       [007] d.h1.  1462.838075: hrtimer_expire_entry:
                         ^^^^^
hrtimer=0000000041fcee42 function=test_handler [repro] now=1462807264840
                                                           ^^^^^^^^^^^^^
-----

Signed-off-by: Rei Yamamoto <yamamoto.rei@jp.fujitsu.com>
---
 include/linux/hrtimer.h |  1 +
 kernel/sysctl.c         | 10 ++++++++++
 kernel/time/hrtimer.c   | 16 +++++++++++++++-
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 0ee140176f10..1a6e03b3015b 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -317,6 +317,7 @@ static inline int hrtimer_is_hres_active(struct hrtimer *timer)
 struct clock_event_device;
 
 extern void hrtimer_interrupt(struct clock_event_device *dev);
+extern int sysctl_hrtimer_interrupt_warnings;
 
 extern unsigned int hrtimer_resolution;
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index e52b6e372c60..b0420d60cec9 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -64,6 +64,7 @@
 #include <linux/mount.h>
 #include <linux/userfaultfd_k.h>
 #include <linux/pid.h>
+#include <linux/hrtimer.h>
 
 #include "../lib/kstrtox.h"
 
@@ -2038,6 +2039,15 @@ static struct ctl_table kern_table[] = {
 		.extra1		= SYSCTL_ONE,
 		.extra2		= SYSCTL_INT_MAX,
 	},
+#endif
+#ifdef CONFIG_HIGH_RES_TIMERS
+	{
+		.procname       = "hrtimer_interrupt_warnings",
+		.data           = &sysctl_hrtimer_interrupt_warnings,
+		.maxlen         = sizeof(int),
+		.mode           = 0644,
+		.proc_handler   = proc_dointvec,
+	},
 #endif
 	{ }
 };
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 0ea8702eb516..19adcbcf92e4 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1773,6 +1773,7 @@ static __latent_entropy void hrtimer_run_softirq(struct softirq_action *h)
 }
 
 #ifdef CONFIG_HIGH_RES_TIMERS
+int sysctl_hrtimer_interrupt_warnings = 5;
 
 /*
  * High resolution timer interrupt
@@ -1866,7 +1867,20 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 	else
 		expires_next = ktime_add(now, delta);
 	tick_program_event(expires_next, 1);
-	pr_warn_once("hrtimer: interrupt took %llu ns\n", ktime_to_ns(delta));
+
+	/*
+	 * If a message is output many times, the delayed funciton
+	 * may be identified by resetting sysctl_hrtimer_interrupt_warnings
+	 * and enabling ftrace.
+	 */
+	if (sysctl_hrtimer_interrupt_warnings) {
+		if (sysctl_hrtimer_interrupt_warnings > 0)
+			sysctl_hrtimer_interrupt_warnings--;
+		pr_warn("hrtimer: CPU %d, entry_time = %llu, "
+				"interrupt took %llu ns\n",
+				cpu_base->cpu, entry_time,
+				ktime_to_ns(delta));
+	}
 }
 
 /* called with interrupts disabled */
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] hrtimer: CPU and entry_time is added to a warning message in hrtimer_interrupt()
  2022-06-24  7:00     ` [PATCH] hrtimer: CPU and entry_time is added to " Rei Yamamoto
@ 2022-06-24  7:09       ` Rei Yamamoto
  2022-09-09  5:22         ` Rei Yamamoto
  2022-12-01 21:30       ` Thomas Gleixner
  1 sibling, 1 reply; 9+ messages in thread
From: Rei Yamamoto @ 2022-06-24  7:09 UTC (permalink / raw)
  To: tglx; +Cc: geert+renesas, linux-kernel, yamamoto.rei

On Fri, 24 Jun 2022 16:00:11, Rei Yamamoto wrote:
> dmesg
> :
> [ 1462.836971] start repro_hrtimer_interrupt
> [ 1462.836976] test_handler = test_handler [repro] 0xffff9788f7bb3048
> :
> [ 1462.879117] hrtimer: CPU 7, entry_time = 1462807264840, interrupt took
> 60048886 ns             ^^^^^               ^^^^^^^^^^^^^
> 
> cat /sys/kernel/debug/tracing/trace
> :
>           <idle>-0       [007] d.h1.  1462.838075: hrtimer_expire_entry:
>                          ^^^^^
> hrtimer=0000000041fcee42 function=test_handler [repro] now=1462807264840
>                                                            ^^^^^^^^^^^^^

test_handler() keeps running for 20 milliseconds due to a loop,
and it leads to exceed next_expire.

# grep -A 21 "hrtimer_restart test_handler" repro.c
static enum hrtimer_restart test_handler(struct hrtimer *hrtimer)
{
        ktime_t start, end;
        printk(KERN_INFO "start test_handler %d times\n", count);

        hrtimer_forward_now(hrtimer, ns_to_ktime(TICK_NSEC));

        start = ktime_get();

        for(;;){
                end = ktime_get();
                if((ktime_sub(end, start)) > 20000000)
                        break;
        }

        if(count < 20){
                count++;
                return HRTIMER_RESTART;
        }else{
                return HRTIMER_NORESTART;
        }
}

Thanks,
Rei

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] hrtimer: CPU and entry_time is added to a warning message in hrtimer_interrupt()
  2022-06-24  7:09       ` Rei Yamamoto
@ 2022-09-09  5:22         ` Rei Yamamoto
  2022-11-07  0:20           ` Rei Yamamoto
  0 siblings, 1 reply; 9+ messages in thread
From: Rei Yamamoto @ 2022-09-09  5:22 UTC (permalink / raw)
  To: tglx; +Cc: geert+renesas, linux-kernel, yamamoto.rei

Hi, Thomas Gleixner.

On Fri, 24 Jun 2022 16:00:11, Rei Yamamoto wrote:
> A warning message in hrtimer_interrupt() is output up to 5 times
> by default, and CPU and entry_time are also shown.
> These changes are helpful that the function spending a lot of time is clear
> by using ftrace:
> -----
> dmesg
> :
> [ 1462.836971] start repro_hrtimer_interrupt
> [ 1462.836976] test_handler = test_handler [repro] 0xffff9788f7bb3048
> :
> [ 1462.879117] hrtimer: CPU 7, entry_time = 1462807264840, interrupt took
> 60048886 ns             ^^^^^               ^^^^^^^^^^^^^
> 
> cat /sys/kernel/debug/tracing/trace
> :
>           <idle>-0       [007] d.h1.  1462.838075: hrtimer_expire_entry:
>                          ^^^^^
> hrtimer=0000000041fcee42 function=test_handler [repro] now=1462807264840
>                                                            ^^^^^^^^^^^^^
> -----
> 
> Signed-off-by: Rei Yamamoto <yamamoto.rei@jp.fujitsu.com>
> ---
>  include/linux/hrtimer.h |  1 +
>  kernel/sysctl.c         | 10 ++++++++++
>  kernel/time/hrtimer.c   | 16 +++++++++++++++-
>  3 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
> index 0ee140176f10..1a6e03b3015b 100644
> --- a/include/linux/hrtimer.h
> +++ b/include/linux/hrtimer.h
> @@ -317,6 +317,7 @@ static inline int hrtimer_is_hres_active(struct hrtimer *timer)
>  struct clock_event_device;
>  
>  extern void hrtimer_interrupt(struct clock_event_device *dev);
> +extern int sysctl_hrtimer_interrupt_warnings;
>  
>  extern unsigned int hrtimer_resolution;
>  
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index e52b6e372c60..b0420d60cec9 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -64,6 +64,7 @@
>  #include <linux/mount.h>
>  #include <linux/userfaultfd_k.h>
>  #include <linux/pid.h>
> +#include <linux/hrtimer.h>
>  
>  #include "../lib/kstrtox.h"
>  
> @@ -2038,6 +2039,15 @@ static struct ctl_table kern_table[] = {
>  		.extra1		= SYSCTL_ONE,
>  		.extra2		= SYSCTL_INT_MAX,
>  	},
> +#endif
> +#ifdef CONFIG_HIGH_RES_TIMERS
> +	{
> +		.procname       = "hrtimer_interrupt_warnings",
> +		.data           = &sysctl_hrtimer_interrupt_warnings,
> +		.maxlen         = sizeof(int),
> +		.mode           = 0644,
> +		.proc_handler   = proc_dointvec,
> +	},
>  #endif
>  	{ }
>  };
> diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
> index 0ea8702eb516..19adcbcf92e4 100644
> --- a/kernel/time/hrtimer.c
> +++ b/kernel/time/hrtimer.c
> @@ -1773,6 +1773,7 @@ static __latent_entropy void hrtimer_run_softirq(struct softirq_action *h)
>  }
>  
>  #ifdef CONFIG_HIGH_RES_TIMERS
> +int sysctl_hrtimer_interrupt_warnings = 5;
>  
>  /*
>   * High resolution timer interrupt
> @@ -1866,7 +1867,20 @@ void hrtimer_interrupt(struct clock_event_device *dev)
>  	else
>  		expires_next = ktime_add(now, delta);
>  	tick_program_event(expires_next, 1);
> -	pr_warn_once("hrtimer: interrupt took %llu ns\n", ktime_to_ns(delta));
> +
> +	/*
> +	 * If a message is output many times, the delayed funciton
> +	 * may be identified by resetting sysctl_hrtimer_interrupt_warnings
> +	 * and enabling ftrace.
> +	 */
> +	if (sysctl_hrtimer_interrupt_warnings) {
> +		if (sysctl_hrtimer_interrupt_warnings > 0)
> +			sysctl_hrtimer_interrupt_warnings--;
> +		pr_warn("hrtimer: CPU %d, entry_time = %llu, "
> +				"interrupt took %llu ns\n",
> +				cpu_base->cpu, entry_time,
> +				ktime_to_ns(delta));
> +	}
>  }
>  
>  /* called with interrupts disabled */

Could you pick this patch up?

Thanks,
Rei

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] hrtimer: CPU and entry_time is added to a warning message in hrtimer_interrupt()
  2022-09-09  5:22         ` Rei Yamamoto
@ 2022-11-07  0:20           ` Rei Yamamoto
  0 siblings, 0 replies; 9+ messages in thread
From: Rei Yamamoto @ 2022-11-07  0:20 UTC (permalink / raw)
  To: tglx; +Cc: geert+renesas, linux-kernel, yamamoto.rei

Hi, Thomas Gleixner.

On Fri,  9 Sep 2022 14:22:11, Rei Yamamoto wrote:
> On Fri, 24 Jun 2022 16:00:11, Rei Yamamoto wrote:
>> A warning message in hrtimer_interrupt() is output up to 5 times
>> by default, and CPU and entry_time are also shown.
>> These changes are helpful that the function spending a lot of time is clear
>> by using ftrace:
>> -----
>> dmesg
>> :
>> [ 1462.836971] start repro_hrtimer_interrupt
>> [ 1462.836976] test_handler = test_handler [repro] 0xffff9788f7bb3048
>> :
>> [ 1462.879117] hrtimer: CPU 7, entry_time = 1462807264840, interrupt took
>> 60048886 ns             ^^^^^               ^^^^^^^^^^^^^
>>
>> cat /sys/kernel/debug/tracing/trace
>> :
>>           <idle>-0       [007] d.h1.  1462.838075: hrtimer_expire_entry:
>>                          ^^^^^
>> hrtimer=0000000041fcee42 function=test_handler [repro] now=1462807264840
>>                                                            ^^^^^^^^^^^^^
>> -----
>>
>> Signed-off-by: Rei Yamamoto <yamamoto.rei@jp.fujitsu.com>
>> ---
>>  include/linux/hrtimer.h |  1 +
>>  kernel/sysctl.c         | 10 ++++++++++
>>  kernel/time/hrtimer.c   | 16 +++++++++++++++-
>>  3 files changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
>> index 0ee140176f10..1a6e03b3015b 100644
>> --- a/include/linux/hrtimer.h
>> +++ b/include/linux/hrtimer.h
>> @@ -317,6 +317,7 @@ static inline int hrtimer_is_hres_active(struct hrtimer *timer)
>>  struct clock_event_device;
>>
>>  extern void hrtimer_interrupt(struct clock_event_device *dev);
>> +extern int sysctl_hrtimer_interrupt_warnings;
>>
>>  extern unsigned int hrtimer_resolution;
>>
>> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
>> index e52b6e372c60..b0420d60cec9 100644
>> --- a/kernel/sysctl.c
>> +++ b/kernel/sysctl.c
>> @@ -64,6 +64,7 @@
>>  #include <linux/mount.h>
>>  #include <linux/userfaultfd_k.h>
>>  #include <linux/pid.h>
>> +#include <linux/hrtimer.h>
>>
>>  #include "../lib/kstrtox.h"
>>
>> @@ -2038,6 +2039,15 @@ static struct ctl_table kern_table[] = {
>>               .extra1         = SYSCTL_ONE,
>>               .extra2         = SYSCTL_INT_MAX,
>>       },
>> +#endif
>> +#ifdef CONFIG_HIGH_RES_TIMERS
>> +     {
>> +             .procname       = "hrtimer_interrupt_warnings",
>> +             .data           = &sysctl_hrtimer_interrupt_warnings,
>> +             .maxlen         = sizeof(int),
>> +             .mode           = 0644,
>> +             .proc_handler   = proc_dointvec,
>> +     },
>>  #endif
>>       { }
>>  };
>> diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
>> index 0ea8702eb516..19adcbcf92e4 100644
>> --- a/kernel/time/hrtimer.c
>> +++ b/kernel/time/hrtimer.c
>> @@ -1773,6 +1773,7 @@ static __latent_entropy void hrtimer_run_softirq(struct softirq_action *h)
>>  }
>>
>>  #ifdef CONFIG_HIGH_RES_TIMERS
>> +int sysctl_hrtimer_interrupt_warnings = 5;
>>
>>  /*
>>   * High resolution timer interrupt
>> @@ -1866,7 +1867,20 @@ void hrtimer_interrupt(struct clock_event_device *dev)
>>       else
>>               expires_next = ktime_add(now, delta);
>>       tick_program_event(expires_next, 1);
>> -     pr_warn_once("hrtimer: interrupt took %llu ns\n", ktime_to_ns(delta));
>> +
>> +     /*
>> +      * If a message is output many times, the delayed funciton
>> +      * may be identified by resetting sysctl_hrtimer_interrupt_warnings
>> +      * and enabling ftrace.
>> +      */
>> +     if (sysctl_hrtimer_interrupt_warnings) {
>> +             if (sysctl_hrtimer_interrupt_warnings > 0)
>> +                     sysctl_hrtimer_interrupt_warnings--;
>> +             pr_warn("hrtimer: CPU %d, entry_time = %llu, "
>> +                             "interrupt took %llu ns\n",
>> +                             cpu_base->cpu, entry_time,
>> +                             ktime_to_ns(delta));
>> +     }
>>  }
>>
>>  /* called with interrupts disabled */
>
> Could you pick this patch up?

Do you have any comments?

Thanks,
Rei

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] hrtimer: CPU and entry_time is added to a warning message in hrtimer_interrupt()
  2022-06-24  7:00     ` [PATCH] hrtimer: CPU and entry_time is added to " Rei Yamamoto
  2022-06-24  7:09       ` Rei Yamamoto
@ 2022-12-01 21:30       ` Thomas Gleixner
  2022-12-02  9:55         ` Rei Yamamoto
  1 sibling, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2022-12-01 21:30 UTC (permalink / raw)
  To: Rei Yamamoto; +Cc: geert+renesas, linux-kernel, yamamoto.rei

Rei!

On Fri, Jun 24 2022 at 16:00, Rei Yamamoto wrote:
> A warning message in hrtimer_interrupt() is output up to 5 times
> by default, and CPU and entry_time are also shown.

This describes to some extent _what_ the patch is doing, but not the
why.

> These changes are helpful that the function spending a lot of time is clear
> by using ftrace:

That's a constructed case. There are multiple reasons why this can
happen, not just because a single hrtimer callback misbehaves.

> @@ -2038,6 +2039,15 @@ static struct ctl_table kern_table[] = {
>  		.extra1		= SYSCTL_ONE,
>  		.extra2		= SYSCTL_INT_MAX,
>  	},
> +#endif
> +#ifdef CONFIG_HIGH_RES_TIMERS
> +	{
> +		.procname       = "hrtimer_interrupt_warnings",
> +		.data           = &sysctl_hrtimer_interrupt_warnings,
> +		.maxlen         = sizeof(int),
> +		.mode           = 0644,
> +		.proc_handler   = proc_dointvec,
> +	},

So this adds a new sysctl, but the changelog does not tell anything
about it. Aside of the dubious value of this sysctl, this lacks the
required documentation for new sysctls.

> +	/*
> +	 * If a message is output many times, the delayed funciton
> +	 * may be identified by resetting sysctl_hrtimer_interrupt_warnings
> +	 * and enabling ftrace.

What has the reset of sysctl_hrtimer_interrupt_warnings to do with
ftrace and how is that reset helpful to identify the root cause?

Also repeating the printk 5 times does not add any value at all. The
runaway detection already has logic to supress spurious events and if
the problem persists then it can be observed by ftrace without any of
these changes.

I assume - because you did not tell so - that you try to have a
correlation between ftrace and dmesg via the entry timestamp output,
right?

That's just a half thought out debug bandaid, really.

You can provide a way better mechanism by adding a tracepoint right at
the pr_warn_once(), which emits information for correlation right into
the trace.

That allows you to stop the trace once the tracepoint is emitted instead
of having to do all of this including the correlation manually.

Hmm?

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] hrtimer: CPU and entry_time is added to a warning message in hrtimer_interrupt()
  2022-12-01 21:30       ` Thomas Gleixner
@ 2022-12-02  9:55         ` Rei Yamamoto
  0 siblings, 0 replies; 9+ messages in thread
From: Rei Yamamoto @ 2022-12-02  9:55 UTC (permalink / raw)
  To: tglx; +Cc: geert+renesas, linux-kernel, yamamoto.rei

Hi Thomas Gleixner,

Thanks for your reply.

On  Thu, 01 Dec 2022 22:30:41, Thomas Gleixner wrote:
> On Fri, Jun 24 2022 at 16:00, Rei Yamamoto wrote:
>> A warning message in hrtimer_interrupt() is output up to 5 times
>> by default, and CPU and entry_time are also shown.
> 
> This describes to some extent _what_ the patch is doing, but not the
> why.
> 
>> These changes are helpful that the function spending a lot of time is clear
>> by using ftrace:
> 
> That's a constructed case. There are multiple reasons why this can
> happen, not just because a single hrtimer callback misbehaves.
> 
>> @@ -2038,6 +2039,15 @@ static struct ctl_table kern_table[] = {
>>  		.extra1		= SYSCTL_ONE,
>>  		.extra2		= SYSCTL_INT_MAX,
>>  	},
>> +#endif
>> +#ifdef CONFIG_HIGH_RES_TIMERS
>> +	{
>> +		.procname       = "hrtimer_interrupt_warnings",
>> +		.data           = &sysctl_hrtimer_interrupt_warnings,
>> +		.maxlen         = sizeof(int),
>> +		.mode           = 0644,
>> +		.proc_handler   = proc_dointvec,
>> +	},
> 
> So this adds a new sysctl, but the changelog does not tell anything
> about it. Aside of the dubious value of this sysctl, this lacks the
> required documentation for new sysctls.
> 
>> +	/*
>> +	 * If a message is output many times, the delayed funciton
>> +	 * may be identified by resetting sysctl_hrtimer_interrupt_warnings
>> +	 * and enabling ftrace.
> 
> What has the reset of sysctl_hrtimer_interrupt_warnings to do with
> ftrace and how is that reset helpful to identify the root cause?
>
> Also repeating the printk 5 times does not add any value at all. The
> runaway detection already has logic to supress spurious events and if
> the problem persists then it can be observed by ftrace without any of
> these changes.

If the system performance degradation exists and this message is output
some times at the same time, my idea was that the reset of
sysctl_hrtimer_interrupt_warnings helps to check whether hrtimer event
is related to the degradation.
However, as you said, it can be observed by ftrace.
Therefore, I withdraw an opinion to add hrtimer_interrupt_warnings as
the sysctl parameter.

> I assume - because you did not tell so - that you try to have a
> correlation between ftrace and dmesg via the entry timestamp output,
> right?

Yes, that's right.

> That's just a half thought out debug bandaid, really.
> 
> You can provide a way better mechanism by adding a tracepoint right at
> the pr_warn_once(), which emits information for correlation right into
> the trace.
> 
> That allows you to stop the trace once the tracepoint is emitted instead
> of having to do all of this including the correlation manually.

OK, I will recreate the patch with following your idea.

Thanks.
Rei

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-12-02 10:20 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-24  2:39 [RFC] hrtimer: About the warning message in hrtimer_interrupt() Rei Yamamoto
2022-02-24  9:28 ` [PATCH] hrtimer: Remove a " Rei Yamamoto
2022-03-25 19:48   ` Thomas Gleixner
2022-06-24  7:00     ` [PATCH] hrtimer: CPU and entry_time is added to " Rei Yamamoto
2022-06-24  7:09       ` Rei Yamamoto
2022-09-09  5:22         ` Rei Yamamoto
2022-11-07  0:20           ` Rei Yamamoto
2022-12-01 21:30       ` Thomas Gleixner
2022-12-02  9:55         ` Rei Yamamoto

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).