linux-kernel.vger.kernel.org archive mirror
From: Gilad Ben-Yossef <gilad@benyossef.com>
To: Chris Metcalf <cmetcalf@tilera.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	linux-kernel@vger.kernel.org, Christoph Lameter <cl@linux.com>,
	linux-mm@kvack.org, Pekka Enberg <penberg@kernel.org>,
	Matt Mackall <mpm@selenic.com>,
	Sasha Levin <levinsasha928@gmail.com>,
	Rik van Riel <riel@redhat.com>, Andi Kleen <andi@firstfloor.org>,
	Mel Gorman <mel@csn.ul.ie>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Avi Kivity <avi@redhat.com>,
	Michal Nazarewicz <mina86@mina86.com>,
	Kosaki Motohiro <kosaki.motohiro@gmail.com>,
	Milton Miller <miltonm@bga.com>
Subject: Re: [v7 0/8] Reduce cross CPU IPI interference
Date: Sun, 5 Feb 2012 13:46:25 +0200
Message-ID: <CAOtvUMfE3xpwmRKnFPTsstr3SuUG7SnpWn5eomEQzkap4_nfrg@mail.gmail.com>
In-Reply-To: <4F2AAEB9.9070302@tilera.com>

On Thu, Feb 2, 2012 at 5:41 PM, Chris Metcalf <cmetcalf@tilera.com> wrote:
> On 2/2/2012 3:46 AM, Gilad Ben-Yossef wrote:
>
>> Yes, that is what drives me as well. I have a bare metal program
>> I'm trying to kill here, I researched CPU isolation and ran into your
>> nohz patch set and asked myself: "OK, if we disable the tick what else
>> is on the way?"
>
> At Tilera we have been supporting a "dataplane" mode (aka Zero Overhead
> Linux - the marketing name).  This is configured on a per-cpu basis, and in
> addition to setting isolcpus for those nodes, also suppresses various
> things that might otherwise run (soft lockup detection, vmstat work,
> etc.).  The claim is that you need to specify these kinds of things
> per-core since it's not always possible for the kernel to know that you
> really don't want the scheduler or any other interrupt source to touch the
> core, as opposed to the case where you just happen to have a single process
> scheduled on the core and you don't mind occasional interrupts.  But
> there's definitely appeal in having the kernel do it adaptively too,
> particularly if it can be made to work just as well as configuring it
> statically.

Currently, adaptive tick needs to be enabled as a cpuset property in
order to apply, but once enabled it is activated automatically when
feasible.

The combination of per-cpuset enabling and automatic activation makes
sense to me, since cpusets are the way to go for isolating CPUs for
specific tasks going forward.

>
> We also have a set_dataplane() syscall that a task can make to allow it to
> request some additional semantics from the kernel, such as various
> debugging modes, a flag to request populating the page table fully, and a
> flag to request that all pending kernel timer ticks, etc., happen while the
> task spins in the kernel before actually returning to userspace from a
> syscall (so you don't get unexpected interrupts once you're back in
> userspace).

Oohh.. I like that :-)

> I've appended the relevant bits of <asm/dataplane.h> for more
> details.
>
> We've been planning to start working with the community on returning this,
> but since fiddling with the scheduler is pretty tricky stuff and it wasn't
> clear there was a lot of interest, we've been deferring it in favor of
> other activities.  But seeing more about Frederic Weisbecker's and Gilad
> Ben-Yossef's work makes me think that it might be a good time for us to
> start that process.  For a start I'll see about putting up a git branch on
> kernel.org that has our dataplane stuff in it, for reference.
>

This sounds very interesting. Thank you!

I for one will be delighted to see that tree as a reference. There is nothing
I hate more than re-inventing the wheel... :-)

> /*
>  * Quiesce the timer interrupt before returning to user space after a
>  * system call.  Normally if a task on a dataplane core makes a
>  * syscall, the system will run one or more timer ticks after the
>  * syscall has completed, causing unexpected interrupts in userspace.
>  * Setting DP_QUIESCE avoids that problem by having the kernel "hold"
>  * the task in kernel mode until the timer ticks are complete.  This
>  * will make syscalls dramatically slower.
>  *
>  * If multiple dataplane tasks are scheduled on a single core, this
>  * in effect silently disables DP_QUIESCE, which allows the tasks to make
>  * progress, but without actually disabling the timer tick.
>  */
> #define DP_QUIESCE      0x1
>
> /*
>  * Disallow the application from entering the kernel in any way,
>  * unless it calls set_dataplane() again without this bit set.
>  * Issuing any other syscall or causing a page fault would generate a
>  * kernel message, and "kill -9" the process.
>  *
>  * Setting this flag automatically sets DP_QUIESCE as well.
>  */
> #define DP_STRICT       0x2
>
> /*
>  * Debug dataplane interrupts, so that if any interrupt source
>  * attempts to involve a dataplane cpu, a kernel message and stack
>  * backtrace will be generated on the console.  As this warning is a
>  * slow event, it may make sense to avoid this mode in production code
>  * to avoid making any possible interrupts even more heavyweight.
>  *
>  * Setting this flag automatically sets DP_QUIESCE as well.
>  */
> #define DP_DEBUG        0x4
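
To check that I read the flag implications above correctly, here is a
minimal sketch of how I understand them. The flag values are copied from
the quoted header; dp_effective_flags() is a hypothetical helper of my
own, not something from the Tilera tree, and it only models the
documented rule that DP_STRICT and DP_DEBUG each imply DP_QUIESCE.

```c
/* Flag values copied from the quoted <asm/dataplane.h> excerpt. */
#define DP_QUIESCE      0x1
#define DP_STRICT       0x2
#define DP_DEBUG        0x4

/*
 * Hypothetical helper (an assumption for illustration only): return
 * the flag set that is effectively in force for a requested set,
 * applying the documented rule that DP_STRICT and DP_DEBUG each
 * automatically set DP_QUIESCE.
 */
static unsigned int dp_effective_flags(unsigned int requested)
{
	if (requested & (DP_STRICT | DP_DEBUG))
		requested |= DP_QUIESCE;
	return requested;
}
```

So asking for DP_STRICT alone would behave like DP_STRICT | DP_QUIESCE,
while asking for no flags leaves the set empty.
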
>
> /*
>  * Cause all memory mappings to be populated in the page table.
>  * Specifying this when entering dataplane mode ensures that no future
>  * page fault events will occur to cause interrupts into the Linux
>  * kernel, as long as no new mappings are installed by mmap(), etc.
>  * Note that since the hardware TLB is of finite size, there will
>  * still be the potential for TLB misses that the hypervisor handles,
>  * either via its software TLB cache (fast path) or by walking the
>  * kernel page tables (slow path), so touching large amounts of memory
>  * will still incur hypervisor interrupt overhead.
>  */
> #define DP_POPULATE     0x8

Hmm... I've probably missed something, but doesn't this replicate
mlockall(MCL_CURRENT | MCL_FUTURE)?
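
For reference, the userspace call I have in mind looks roughly like the
sketch below. lock_all_memory() is just a name I made up for
illustration; mlockall() and the MCL_* flags are the standard ones from
<sys/mman.h>. Whether this covers everything DP_POPULATE does is exactly
my question.

```c
#include <stdio.h>
#include <sys/mman.h>

/*
 * Illustrative wrapper (hypothetical name): lock all current and
 * future mappings of the calling process into RAM, so that they are
 * resident and are not paged out later.
 * Returns 0 on success, -1 on failure (errno is set by mlockall()).
 */
static int lock_all_memory(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
		perror("mlockall");
		return -1;
	}
	return 0;
}
```

Note that the call can fail with a low RLIMIT_MEMLOCK and no
CAP_IPC_LOCK, so a real application would have to check the return
value before relying on it.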

Thanks!
Gilad




-- 
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@benyossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com

"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
 -- Jean-Baptiste Queru
