From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjanvandeven@gmail.com>,
Eric Dumazet <edumazet@google.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Chris Mason <clm@fb.com>, Arjan van de Ven <arjan@infradead.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
George Spelvin <linux@sciencehorizons.net>
Subject: Re: [patch 13/20] timer: Switch to a non cascading wheel
Date: Thu, 16 Jun 2016 09:02:15 -0700
Message-ID: <20160616160215.GQ3923@linux.vnet.ibm.com>
In-Reply-To: <alpine.DEB.2.11.1606161700150.5839@nanos>
On Thu, Jun 16, 2016 at 05:43:36PM +0200, Thomas Gleixner wrote:
> On Wed, 15 Jun 2016, Thomas Gleixner wrote:
> > On Wed, 15 Jun 2016, Arjan van de Ven wrote:
> > > what would 1 more timer wheel do?
> >
> > Waste storage space and make the collection of expired timers more expensive.
> >
> > The selection of the timer wheel properties is a combination of:
> >
> > 1) Granularity
> >
> > 2) Storage space
> >
> > 3) Number of levels to collect
>
> So I came up with a slightly different solution for this. The problem case is
> HZ=1000 and again looking at the data, there is no reason why we need actual
> 1ms granularity for timer wheel timers. That's independent of the desired ms
> based interfaces.
>
> We can simply run the wheel internally with 4ms base level resolution and
> degrade from there. That gives us 6 days+ and a simple cutoff at the capacity
> of the 7th level wheel.
>
> Level Offset Granularity                Range
>     0      0        4 ms                 0 ms -       255 ms
>     1     64       32 ms               256 ms -      2047 ms (256ms - ~2s)
>     2    128      256 ms              2048 ms -     16383 ms (~2s - ~16s)
>     3    192     2048 ms (~2s)       16384 ms -    131071 ms (~16s - ~2m)
>     4    256    16384 ms (~16s)     131072 ms -   1048575 ms (~2m - ~17m)
>     5    320   131072 ms (~2m)     1048576 ms -   8388607 ms (~17m - ~2h)
>     6    384  1048576 ms (~17m)    8388608 ms -  67108863 ms (~2h - ~18h)
>     7    448  8388608 ms (~2h)    67108864 ms - 536870911 ms (~18h - ~6d)
>
> That works really nice and has the interesting side effect that we batch in
> the first level wheel which helps networking. I'll repost the series with the
> other review points addressed later tonight.
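
The level table quoted above can be reproduced with a few shifts. The following is only a minimal sketch: it assumes 64 buckets per level and a clock that gets 8x coarser at each level, both inferred from the offsets and granularities in the table rather than taken from the patch. All macro and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed parameters, inferred from the quoted table (not from the patch). */
#define BASE_GRAN_MS	4u	/* level 0 granularity in ms */
#define LVL_BITS	6u	/* 64 buckets per level (offsets step by 64) */
#define LVL_SIZE	(1u << LVL_BITS)
#define LVL_CLK_SHIFT	3u	/* each level is 8x coarser than the previous */
#define LVL_DEPTH	8u	/* levels 0..7 */

/* Granularity of one bucket at the given level, in milliseconds. */
static uint64_t lvl_gran_ms(unsigned int level)
{
	assert(level < LVL_DEPTH);
	return (uint64_t)BASE_GRAN_MS << (LVL_CLK_SHIFT * level);
}

/* Index of the first bucket of the given level in a flat bucket array. */
static unsigned int lvl_offset(unsigned int level)
{
	assert(level < LVL_DEPTH);
	return level * LVL_SIZE;
}

/* First timeout (ms) that falls into this level. */
static uint64_t lvl_start_ms(unsigned int level)
{
	return level ? LVL_SIZE * lvl_gran_ms(level - 1) : 0;
}

/* Last timeout (ms) covered by this level: 64 buckets of its granularity. */
static uint64_t lvl_end_ms(unsigned int level)
{
	return LVL_SIZE * lvl_gran_ms(level) - 1;
}
```

With these assumptions, lvl_end_ms(7) is 536870911 ms, which matches the ~6 day cutoff in the last row.
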
>
> Btw, I also thought a bit more about the milliseconds interfaces. I think we
> shouldn't invent new interfaces. The correct solution IMHO is to disentangle the
> scheduler tick frequency and jiffies. If we have that completely separated
> then we can do the following:
>
> 1) Force HZ=1000. That means jiffies and timer wheel units are 1ms. If the
> tick frequency is != 1000 we simply increment jiffies in the tick by the
> proper amount (4 @250 ticks/sec, 10 @100 ticks/sec).
>
> So all msecs_to_jiffies() invocations compile out into nothing magically and
> we can remove them gradually over time.

Some of RCU's heuristics assume that if scheduling-clock ticks happen,
they happen once per jiffy. These would need to be adjusted, which would
not be a big deal, just a bit more use of HZ.

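To make the "more than one jiffy per tick" case concrete, here is a rough sketch of a tick handler that advances a jiffies counter by HZ divided by the tick frequency. It assumes HZ is forced to 1000 and that the tick frequency divides it evenly. The names tick_hz, jiffies_sketch and tick_sketch() are hypothetical stand-ins, not existing kernel symbols.

```c
#include <assert.h>

#define HZ 1000u	/* forced, so one jiffy is 1 ms */

static unsigned long jiffies_sketch;	/* stand-in for the kernel's jiffies */
static unsigned int tick_hz = 250;	/* hypothetical tick frequency, set at boot */

/* Number of jiffies that elapse between two scheduling-clock ticks. */
static unsigned int jiffies_per_tick(void)
{
	/* Sketch only: require a tick rate that divides HZ evenly. */
	assert(tick_hz != 0 && HZ % tick_hz == 0);
	return HZ / tick_hz;
}

/* Called once per tick: advance jiffies by the proper amount. */
static void tick_sketch(void)
{
	jiffies_sketch += jiffies_per_tick();
}
```

At 250 ticks/sec each call adds 4 jiffies, and at 100 ticks/sec it adds 10, so any heuristic that counts ticks by watching jiffies has to scale by the same factor.
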
> 2) When we do that right, we can make the tick frequency a command line option
> and just have a compiled in default.

As long as there is something that tells RCU what the tick frequency
actually is at runtime, this should not be a problem. For example,
in rcu_implicit_dynticks_qs(), the following:

	rdp->rsp->jiffies_resched += 5;

Would instead need to be something like:

	rdp->rsp->jiffies_resched += 5 * jiffies_per_tick;

Changing tick frequency at runtime would be a bit more tricky, as it would
be tough to avoid some oddball false positives during the transition.
But setting it at boot time would be fine. ;-)
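
A self-contained mock-up of that adjustment, for illustration only. The two structs below are cut-down stand-ins that carry just the field used here, and jiffies_per_tick is assumed to be a variable initialized once at boot. None of this is the actual RCU code.

```c
#include <assert.h>

/* Stand-ins for the kernel structures; only the fields used here. */
struct rcu_state_sketch {
	unsigned long jiffies_resched;
};

struct rcu_data_sketch {
	struct rcu_state_sketch *rsp;
};

/* Hypothetical: jiffies elapsed per scheduling-clock tick, set once at boot. */
static unsigned long jiffies_per_tick = 1;

/* Push the resched deadline out by five ticks' worth of jiffies. */
static void bump_jiffies_resched(struct rcu_data_sketch *rdp)
{
	assert(jiffies_per_tick > 0);
	rdp->rsp->jiffies_resched += 5 * jiffies_per_tick;
}
```

With jiffies_per_tick left at 1 this behaves exactly like the current "+= 5", which is the case where ticks and jiffies coincide.
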

							Thanx, Paul

> Thoughts?
>
> Thanks,
>
> tglx
>