From: Frederic Weisbecker <fweisbec@gmail.com>
To: Chris Metcalf <cmetcalf@tilera.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Gilad Ben-Yossef <gilad@benyossef.com>,
	linux-kernel@vger.kernel.org, Christoph Lameter <cl@linux.com>,
	linux-mm@kvack.org, Pekka Enberg <penberg@kernel.org>,
	Matt Mackall <mpm@selenic.com>,
	Sasha Levin <levinsasha928@gmail.com>,
	Rik van Riel <riel@redhat.com>, Andi Kleen <andi@firstfloor.org>,
	Mel Gorman <mel@csn.ul.ie>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Avi Kivity <avi@redhat.com>,
	Michal Nazarewicz <mina86@mina86.com>,
	Kosaki Motohiro <kosaki.motohiro@gmail.com>,
	Milton Miller <miltonm@bga.com>
Subject: Re: [v7 0/8] Reduce cross CPU IPI interference
Date: Tue, 21 Feb 2012 02:34:45 +0100
Message-ID: <20120221013443.GA13403@somewhere.redhat.com>
In-Reply-To: <4F3C28AF.9080005@tilera.com>

On Wed, Feb 15, 2012 at 04:50:39PM -0500, Chris Metcalf wrote:
> On 2/10/2012 1:33 PM, Peter Zijlstra wrote:
> > On Thu, 2012-02-02 at 10:41 -0500, Chris Metcalf wrote:
> >> At Tilera we have been supporting a "dataplane" mode (aka Zero Overhead
> >> Linux - the marketing name).  This is configured on a per-cpu basis, and in
> >> addition to setting isolcpus for those nodes, also suppresses various
> >> things that might otherwise run (soft lockup detection, vmstat work,
> >> etc.).  
> > See that's wrong.. it starts being wrong by depending on cpuisol and
> > goes from there.
> >
> >> The claim is that you need to specify these kinds of things
> >> per-core since it's not always possible for the kernel to know that you
> >> really don't want the scheduler or any other interrupt source to touch the
> >> core, as opposed to the case where you just happen to have a single process
> >> scheduled on the core and you don't mind occasional interrupts.
> > Right, so that claim is proven false I think.
> >
> >> But
> >> there's definitely appeal in having the kernel do it adaptively too,
> >> particularly if it can be made to work just as well as configuring it
> >> statically. 
> > I see no reason why it shouldn't work as well or even better.
> 
> Thanks for the feedback.  To echo Gilad's guess in a later email, the code
> as-is is not intended as a patch planned for a merge.  The code is in use
> by our customers, who have found it useful, but what I'd really like to do
> is to make sure to integrate all the functionality that's useful in our
> "dataplane" mode into Frederic's ongoing work with nohz cpusets.
> 
> The goal of the work we've done is to provide a way for customers to ensure
> they reliably have zero jitter on cpus that are trying to process real-time
> or otherwise low-latency events.  A good example is 10 Gb network traffic,
> where at min-packet sizes you have only 50-odd cpu cycles to dispatch the
> packet to one of our 64 cores, and each core then has a budget of only a
> few thousand cycles to deal with the core.  A kernel interrupt would mean
> dropping packets on the floor.  Similarly, for something like
> high-frequency trading, you'd want guaranteed low-latency response.
> 
> The Tilera dataplane code is available on the "dataplane" branch (off of
> 3.3-rc3 at the moment):
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile.git
> 
> I'm still looking at Frederic's git tree, but I'm assuming the following
> are all true of tasks that are running on a nohz cpuset core:
> 
> - The core will not run the global scheduler to do work stealing, since
> otherwise you can't guarantee that only tasks that care about userspace
> nohz get to run there.  (I suppose you could loosen this such that the core
> would do work stealing as long as no task was pinned to that core by
> affinity, at which point the pinned task would become the only runnable task.)

A nohz cpuset doesn't really control that. It actually reacts to the scheduler's
actions: it tries to stop the tick if there is only one task on the runqueue,
and restarts it when there are more.
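
To illustrate that reactive behaviour, here is a minimal userspace sketch, not
code from the nohz cpuset tree. The names toy_rq, toy_update_tick, toy_enqueue
and toy_dequeue are hypothetical, and the real decision likely depends on more
conditions than the task count alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-CPU runqueue; not the kernel's struct rq. */
struct toy_rq {
	unsigned int nr_running;	/* runnable tasks on this CPU */
	bool tick_stopped;		/* true while the periodic tick is off */
};

/*
 * Re-evaluate the tick after every runqueue change. The nohz cpuset does
 * not decide what runs here; it only reacts to what the scheduler did.
 * Assumption of this toy model: with zero tasks the tick is left alone,
 * since idle tick handling is a separate mechanism.
 */
static void toy_update_tick(struct toy_rq *rq)
{
	rq->tick_stopped = (rq->nr_running == 1);
}

/* A task became runnable on this CPU. */
static void toy_enqueue(struct toy_rq *rq)
{
	rq->nr_running++;
	toy_update_tick(rq);
}

/* A task left this CPU's runqueue. */
static void toy_dequeue(struct toy_rq *rq)
{
	assert(rq->nr_running > 0);
	rq->nr_running--;
	toy_update_tick(rq);
}
```

In this model a second toy_enqueue() restarts the tick, and a later
toy_dequeue() that leaves a single task stops it again.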

Ensuring the CPU doesn't get distracted is rather the role of the user, who
sets the right cpusets to get the desired affinity. And if we still have
noise from workqueues or something else, this is something we need to look at
and fix on a case-by-case basis.


> - Kernel "background" tasks are disabled on that core, at least while
> userspace nohz tasks are running: softlockup watchdog, slab reap timer,
> vmstat thread, etc.

Yeah, those are examples of "noisy" things. They are in fact separate issues
that nohz cpusets don't touch. nohz cpusets are really only about trying to
shut down the periodic tick, or defer it as far as possible into the future.

Now the nohz cpuset uses some user/kernel entry/exit hooks that we can extend
to cover some of these cases. We may want to make some timers "user-deferrable",
ie: deactivate them when we return to userspace and reactivate them on kernel
entry.
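
A rough userspace sketch of the "user-deferrable" idea, assuming the timers
are parked on return to userspace and re-armed on kernel entry. Nothing here
is an existing kernel API: toy_timer, toy_kernel_exit and toy_kernel_entry
are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical timer record; not the kernel's struct timer_list. */
struct toy_timer {
	bool user_deferrable;	/* may be parked while the CPU is in userspace */
	bool active;		/* currently armed */
	bool parked;		/* deactivated by the kernel-exit hook */
};

/* Hook run when the CPU returns to userspace: park deferrable timers. */
static void toy_kernel_exit(struct toy_timer *timers, size_t n)
{
	assert(timers != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (timers[i].user_deferrable && timers[i].active) {
			timers[i].active = false;
			timers[i].parked = true;
		}
	}
}

/* Hook run on kernel entry: re-arm only the timers we parked ourselves. */
static void toy_kernel_entry(struct toy_timer *timers, size_t n)
{
	assert(timers != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (timers[i].parked) {
			timers[i].active = true;
			timers[i].parked = false;
		}
	}
}
```

Timers that are not marked user_deferrable stay armed across both hooks, so
only the opted-in ones stop interrupting a task that stays in userspace.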

That needs some thinking though; this may not be a win for every workload.
But those that run mostly in userspace can profit.