linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] timer: Fix bucket_expiry calculation
@ 2021-05-12 12:15 Xiongfeng Wang
  2021-05-12 14:42 ` Thomas Gleixner
  0 siblings, 1 reply; 3+ messages in thread
From: Xiongfeng Wang @ 2021-05-12 12:15 UTC (permalink / raw)
  To: john.stultz, tglx, sboyd; +Cc: linux-kernel, wangxiongfeng2

When I use schedule_timeout(5) to put a process into sleep on my machine
with HZ = 100. It always sleep about 60ms. I enable the timer trace and
find out, when the timer_list expires, 'now' is always equal to
'expires + 1'. I print 'base->next_expiry' in '__run_timers' and find out
'next_expiry' is always equal to 'expires + 1';

  my_test_thread-1230    [001] d...   382.627089: timer_start: timer=000000004ec021c9 function=process_timeout expires=4294975072 [timeout=5] cpu=1 idx=33 flags=
          <idle>-0       [001] ..s.   382.687082: run_timer_softirq: jiffies 4294975073  next_expiry 4294975073
          <idle>-0       [001] ..s.   382.687083: timer_expire_entry: timer=000000004ec021c9 function=process_timeout now=4294975073 baseclk=4294975073
  my_test_thread-1230    [001] d...   382.687089: timer_start: timer=000000004ec021c9 function=process_timeout expires=4294975078 [timeout=5] cpu=1 idx=39 flags=
          <idle>-0       [001] ..s.   382.747083: run_timer_softirq: jiffies 4294975079  next_expiry 4294975079
          <idle>-0       [001] ..s.   382.747084: timer_expire_entry: timer=000000004ec021c9 function=process_timeout now=4294975079 baseclk=4294975079

It is because we use the following equation to calculate bucket_expiry.

  bucket_expiry = ((expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl)) << LVL_SHIFT(lvl)

'bucket_expiry' is equal to 'expires + 1' when lvl = 0. So modify the
equation as follows to fix the issue.

  bucket_expiry = ((expires + LVL_GRAN(lvl) - 1) >> LVL_SHIFT(lvl)) << LVL_SHIFT(lvl)

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
---
 kernel/time/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index d111adf..a6a26da 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -501,7 +501,7 @@ static inline unsigned calc_index(unsigned long expires, unsigned lvl,
 	 *
 	 * Round up with level granularity to prevent this.
 	 */
-	expires = (expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl);
+	expires = (expires + LVL_GRAN(lvl) - 1) >> LVL_SHIFT(lvl);
 	*bucket_expiry = expires << LVL_SHIFT(lvl);
 	return LVL_OFFS(lvl) + (expires & LVL_MASK);
 }
-- 
1.7.12.4


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [RFC PATCH] timer: Fix bucket_expiry calculation
  2021-05-12 12:15 [RFC PATCH] timer: Fix bucket_expiry calculation Xiongfeng Wang
@ 2021-05-12 14:42 ` Thomas Gleixner
  2021-05-13  6:51   ` Xiongfeng Wang
  0 siblings, 1 reply; 3+ messages in thread
From: Thomas Gleixner @ 2021-05-12 14:42 UTC (permalink / raw)
  To: Xiongfeng Wang, john.stultz, sboyd; +Cc: linux-kernel, wangxiongfeng2

Xiongfeng,

On Wed, May 12 2021 at 20:15, Xiongfeng Wang wrote:
> When I use schedule_timeout(5) to put a process into sleep on my machine
> with HZ = 100. It always sleep about 60ms. I enable the timer trace and
> find out, when the timer_list expires, 'now' is always equal to
> 'expires + 1'. I print 'base->next_expiry' in '__run_timers' and find out
> 'next_expiry' is always equal to 'expires + 1';
>
> It is because we use the following equation to calculate bucket_expiry.
>
>   bucket_expiry = ((expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl)) << LVL_SHIFT(lvl)
>
> 'bucket_expiry' is equal to 'expires + 1' when lvl = 0. So modify the
> equation as follows to fix the issue.
>
>   bucket_expiry = ((expires + LVL_GRAN(lvl) - 1) >> LVL_SHIFT(lvl)) << LVL_SHIFT(lvl)

That's wrong because you move the expiry of each timer one jiffie ahead,
which violates the guarantee that a timer sleeps at least for one jiffie
for real and not measured in jiffies.

  jiffies = 0
  schedule_timeout(1)

   local_irq_disable()
			  -> timer interrupt is raised in HW
   timer->expires = jiffies + 1 <- 1
   add_timer(timer)
   local_irq_enable()
   timer interrupt
      jiffies++;
      softirq()
	  expire(timer); -> timer is expired immediately       

So the off by one has a reason and is required to prevent too short
timeouts. There is nothing you can do about that because that's a
property of low granularity tick based timer wheels.

That's even documented in the comment above the code you modified:

	/*
	 * The timer wheel has to guarantee that a timer does not fire
	 * early. Early expiry can happen due to:
	 * - Timer is armed at the edge of a tick
	 * - Truncation of the expiry time in the outer wheel levels
	 *
	 * Round up with level granularity to prevent this.
	 */

Thanks,

	  tglx

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [RFC PATCH] timer: Fix bucket_expiry calculation
  2021-05-12 14:42 ` Thomas Gleixner
@ 2021-05-13  6:51   ` Xiongfeng Wang
  0 siblings, 0 replies; 3+ messages in thread
From: Xiongfeng Wang @ 2021-05-13  6:51 UTC (permalink / raw)
  To: Thomas Gleixner, john.stultz, sboyd; +Cc: linux-kernel

Hi Thomas,

On 2021/5/12 22:42, Thomas Gleixner wrote:
> Xiongfeng,
> 
> On Wed, May 12 2021 at 20:15, Xiongfeng Wang wrote:
>> When I use schedule_timeout(5) to put a process into sleep on my machine
>> with HZ = 100. It always sleep about 60ms. I enable the timer trace and
>> find out, when the timer_list expires, 'now' is always equal to
>> 'expires + 1'. I print 'base->next_expiry' in '__run_timers' and find out
>> 'next_expiry' is always equal to 'expires + 1';
>>
>> It is because we use the following equation to calculate bucket_expiry.
>>
>>   bucket_expiry = ((expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl)) << LVL_SHIFT(lvl)
>>
>> 'bucket_expiry' is equal to 'expires + 1' when lvl = 0. So modify the
>> equation as follows to fix the issue.
>>
>>   bucket_expiry = ((expires + LVL_GRAN(lvl) - 1) >> LVL_SHIFT(lvl)) << LVL_SHIFT(lvl)
> 
> That's wrong because you move the expiry of each timer one jiffie ahead,
> which violates the guarantee that a timer sleeps at least for one jiffie
> for real and not measured in jiffies.
> 
>   jiffies = 0
>   schedule_timeout(1)
> 
>    local_irq_disable()
> 			  -> timer interrupt is raised in HW
>    timer->expires = jiffies + 1 <- 1
>    add_timer(timer)
>    local_irq_enable()
>    timer interrupt
>       jiffies++;
>       softirq()
> 	  expire(timer); -> timer is expired immediately       
> 
> So the off by one has a reason and is required to prevent too short
> timeouts. There is nothing you can do about that because that's a
> property of low granularity tick based timer wheels.
> 
> That's even documented in the comment above the code you modified:
> 
> 	/*
> 	 * The timer wheel has to guarantee that a timer does not fire
> 	 * early. Early expiry can happen due to:
> 	 * - Timer is armed at the edge of a tick
> 	 * - Truncation of the expiry time in the outer wheel levels
> 	 *
> 	 * Round up with level granularity to prevent this.
> 	 */

Thanks for your explanation. I got it !

Thanks,
Xiongfeng

> 
> Thanks,
> 
> 	  tglx
> .
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-05-13  6:51 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-12 12:15 [RFC PATCH] timer: Fix bucket_expiry calculation Xiongfeng Wang
2021-05-12 14:42 ` Thomas Gleixner
2021-05-13  6:51   ` Xiongfeng Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).