linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjanvandeven@gmail.com>,
	Eric Dumazet <edumazet@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Chris Mason <clm@fb.com>, Arjan van de Ven <arjan@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	George Spelvin <linux@sciencehorizons.net>
Subject: Re: [patch 13/20] timer: Switch to a non cascading wheel
Date: Thu, 16 Jun 2016 09:02:15 -0700	[thread overview]
Message-ID: <20160616160215.GQ3923@linux.vnet.ibm.com> (raw)
In-Reply-To: <alpine.DEB.2.11.1606161700150.5839@nanos>

On Thu, Jun 16, 2016 at 05:43:36PM +0200, Thomas Gleixner wrote:
> On Wed, 15 Jun 2016, Thomas Gleixner wrote:
> > On Wed, 15 Jun 2016, Arjan van de Ven wrote:
> > > what would 1 more timer wheel do?
> > 
> > Waste storage space and make the collection of expired timers more expensive.
> > 
> > The selection of the timer wheel properties is combination of:
> > 
> >     1) Granularity 
> > 
> >     2) Storage space
> > 
> >     3) Number of levels to collect
> 
> So I came up with a slightly different solution for this. The problem case is
> HZ=1000 and again looking at the data, there is no reason why we need actual
> 1ms granularity for timer wheel timers. That's independent of the desired ms
> based interfaces.
> 
> We can simply run the wheel internaly with 4ms base level resolution and
> degrade from there. That gives us 6 days+ and a simple cutoff at the capacity
> of the 7th level wheel.
> 
>  0     0        4 ms               0 ms -        255 ms		    
>  1    64       32 ms             256 ms -       2047 ms (256ms - ~2s)
>  2   128      256 ms            2048 ms -      16383 ms (~2s - ~16s) 
>  3   192     2048 ms (~2s)     16384 ms -     131071 ms (~16s - ~2m)
>  4   256    16384 ms (~16s)   131072 ms -    1048575 ms (~2m - ~17m)
>  5   320   131072 ms (~2m)   1048576 ms -    8388607 ms (~17m - ~2h)
>  6   384  1048576 ms (~17m)  8388608 ms -   67108863 ms (~2h - ~18h)
>  7   448  8388608 ms (~2h)  67108864 ms -  536870911 ms (~18h - ~6d)
> 
> That works really nice and has the interesting side effect that we batch in
> the first level wheel which helps networking. I'll repost the series with the
> other review points addressed later tonight.
> 
> Btw, I also thought a bit more about the milliseconds interfaces. I think we
> shouldn't invent new interfaces. The correct solution IMHO is to distangle the
> scheduler tick frequency and jiffies. If we have that completely seperated
> then we can do the following:
> 
> 1) Force HZ=1000. That means jiffies and timer wheel units are 1ms. If the
>    tick frequency is != 1000 we simply increment jiffies in the tick by the
>    proper amount (4 @250 ticks/sec, 10 @100 ticks/sec).
> 
>    So all msec_to_jiffies() invocations compile out into nothing magically and
>    we can remove them gradually over time.

Some of RCU's heuristics assume that if scheduling-clock ticks happen,
they happen once per jiffy.  These would need to be adjusted, which would
not be a big deal, just a bit more use of HZ.

> 2) When we do that right, we can make the tick frequency a command line option
>    and just have a compiled in default.

As long as there is something that tells RCU what the tick frequency
actually is at runtime, this should not be a problem.  For example,
in rcu_implicit_dynticks_qs(), the following:

	rdp->rsp->jiffies_resched += 5;

Would instead need to be something like:

	rdp->rsp->jiffies_resched += 5 * jiffies_per_tick;

Changing tick frequency at runtime would be a bit more tricky, as it would
be tough to avoid some oddball false positives during the transition.
But setting it at boot time would be fine.  ;-)

							Thanx, Paul

> Thoughts?
> 
> Thanks,
> 
> 	tglx
> 

  reply	other threads:[~2016-06-16 16:02 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-13  8:40 [patch 00/20] timer: Refactor the timer wheel Thomas Gleixner
2016-06-13  8:40 ` [patch 01/20] timer: Make pinned a timer property Thomas Gleixner
2016-06-13  8:40 ` [patch 02/20] x86/apic/uv: Initialize timer as pinned Thomas Gleixner
2016-06-13  8:40 ` [patch 03/20] x86/mce: " Thomas Gleixner
2016-06-13  8:40 ` [patch 04/20] cpufreq/powernv: " Thomas Gleixner
2016-06-13 13:18   ` Arjan van de Ven
2016-06-13  8:40 ` [patch 05/20] driver/net/ethernet/tile: " Thomas Gleixner
2016-06-13  8:40 ` [patch 06/20] drivers/tty/metag_da: " Thomas Gleixner
2016-06-13 13:13   ` Arjan van de Ven
2016-06-13  8:40 ` [patch 07/20] drivers/tty/mips_ejtag: " Thomas Gleixner
2016-06-13  8:40 ` [patch 08/20] net/ipv4/inet: Initialize timers " Thomas Gleixner
2016-06-13  8:40 ` [patch 09/20] timer: Remove mod_timer_pinned Thomas Gleixner
2016-06-13  8:40 ` [patch 10/20] timer: Add a cascading tracepoint Thomas Gleixner
2016-06-13  8:40 ` [patch 11/20] hlist: Add hlist_is_last_node() helper Thomas Gleixner
2016-06-13 10:27   ` Paolo Bonzini
2016-06-13  8:40 ` [patch 12/20] timer: Give a few structs and members proper names Thomas Gleixner
2016-06-13  8:41 ` [patch 13/20] timer: Switch to a non cascading wheel Thomas Gleixner
2016-06-13 11:40   ` Peter Zijlstra
2016-06-13 12:30     ` Thomas Gleixner
2016-06-13 12:46       ` Eric Dumazet
2016-06-13 14:30         ` Thomas Gleixner
2016-06-14 10:11       ` Ingo Molnar
2016-06-14 16:28         ` Thomas Gleixner
2016-06-14 17:14           ` Arjan van de Ven
2016-06-14 18:05             ` Thomas Gleixner
2016-06-14 20:34               ` Peter Zijlstra
2016-06-14 20:42               ` Peter Zijlstra
2016-06-14 21:17                 ` Eric Dumazet
2016-06-15 14:53                   ` Thomas Gleixner
2016-06-15 14:55                     ` Arjan van de Ven
2016-06-15 16:43                       ` Thomas Gleixner
2016-06-16 15:43                         ` Thomas Gleixner
2016-06-16 16:02                           ` Paul E. McKenney [this message]
2016-06-16 18:14                             ` Peter Zijlstra
2016-06-17  0:40                               ` Paul E. McKenney
2016-06-17  4:04                                 ` Paul E. McKenney
2016-06-16 16:04                           ` Arjan van de Ven
2016-06-16 16:09                             ` Thomas Gleixner
2016-06-15 15:05                     ` Eric Dumazet
2016-06-13 14:36   ` Richard Cochran
2016-06-13 14:39     ` Thomas Gleixner
2016-06-13  8:41 ` [patch 15/20] timer: Move __run_timers() function Thomas Gleixner
2016-06-13  8:41 ` [patch 14/20] timer: Remove slack leftovers Thomas Gleixner
2016-06-13  8:41 ` [patch 16/20] timer: Optimize collect timers for NOHZ Thomas Gleixner
2016-06-13  8:41 ` [patch 17/20] tick/sched: Remove pointless empty function Thomas Gleixner
2016-06-13  8:41 ` [patch 18/20] timer: Forward wheel clock whenever possible Thomas Gleixner
2016-06-13 15:14   ` Richard Cochran
2016-06-13 15:18     ` Thomas Gleixner
2016-06-13  8:41 ` [patch 19/20] timer: Split out index calculation Thomas Gleixner
2016-06-13  8:41 ` [patch 20/20] timer: Optimization for same expiry time in mod_timer() Thomas Gleixner
2016-06-13 14:10 ` [patch 00/20] timer: Refactor the timer wheel Eric Dumazet
2016-06-13 16:15 ` Paul E. McKenney
2016-06-15 15:15   ` Paul E. McKenney
2016-06-15 17:02     ` Thomas Gleixner
2016-06-15 20:26       ` Paul E. McKenney
2016-06-14  8:16 [patch 13/20] timer: Switch to a non cascading wheel George Spelvin
2016-06-14  8:50 ` Thomas Gleixner
2016-06-14 10:15   ` George Spelvin
2016-06-14 10:20 ` Peter Zijlstra
2016-06-14 12:58   ` George Spelvin
2016-06-14 16:48     ` Thomas Gleixner
2016-06-14 19:56       ` George Spelvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160616160215.GQ3923@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=arjan@infradead.org \
    --cc=arjanvandeven@gmail.com \
    --cc=clm@fb.com \
    --cc=edumazet@google.com \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@sciencehorizons.net \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).