linux-kernel.vger.kernel.org archive mirror
From: Mike Travis <travis@sgi.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	davej@codemonkey.org.uk, David Miller <davem@davemloft.net>,
	Eric Dumazet <dada1@cosmosbay.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Jack Steiner <steiner@sgi.com>,
	Jeremy Fitzhardinge <jeremy@goop.org>, Jes Sorensen <jes@sgi.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC 07/13] sched: Reduce stack size requirements in kernel/sched.c
Date: Mon, 08 Sep 2008 07:54:09 -0700
Message-ID: <48C53C91.70604@sgi.com>
In-Reply-To: <1220783087.8687.73.camel@twins.programming.kicks-ass.net>

Peter Zijlstra wrote:
> On Sat, 2008-09-06 at 16:50 -0700, Mike Travis wrote:
>> plain text document attachment (stack-hogs-kernel_sched_c)
>> * Make the following changes to kernel/sched.c functions:
>>
>>     - use node_to_cpumask_ptr in place of node_to_cpumask
>>     - use get_cpumask_var for temporary cpumask_t variables
>>     - use alloc_cpumask_ptr where available
>>
>>   * Remove special code for SCHED_CPUMASK_ALLOC and use CPUMASK_ALLOC
>>     from linux/cpumask.h.
>>
>>   * The resultant stack savings are:
>>
>>     ====== Stack (-l 100)
>>
>> 	1 - initial
>> 	2 - stack-hogs-kernel_sched_c
>> 	'.' is less than the limit(100)
>>
>>        .1.    .2.    ..final..
>>       2216  -1536 680   -69%  __build_sched_domains
>>       1592  -1592   .  -100%  move_task_off_dead_cpu
>>       1096  -1096   .  -100%  sched_balance_self
>>       1032  -1032   .  -100%  sched_setaffinity
>>        616   -616   .  -100%  rebalance_domains
>>        552   -552   .  -100%  free_sched_groups
>>        512   -512   .  -100%  cpu_to_allnodes_group
>>       7616  -6936 680   -91%  Totals
>>
>>
>> Applies to linux-2.6.tip/master.
>>
>> Signed-off-by: Mike Travis <travis@sgi.com>
>> ---
>>  kernel/sched.c |  151 ++++++++++++++++++++++++++++++---------------------------
>>  1 file changed, 81 insertions(+), 70 deletions(-)
>>
>> --- linux-2.6.tip.orig/kernel/sched.c
>> +++ linux-2.6.tip/kernel/sched.c
>> @@ -70,6 +70,7 @@
>>  #include <linux/bootmem.h>
>>  #include <linux/debugfs.h>
>>  #include <linux/ctype.h>
>> +#include <linux/cpumask_ptr.h>
>>  #include <linux/ftrace.h>
>>  #include <trace/sched.h>
>>  
>> @@ -117,6 +118,12 @@
>>   */
>>  #define RUNTIME_INF	((u64)~0ULL)
>>  
>> +/*
>> + * temp cpumask variables
>> + */
>> +static DEFINE_PER_CPUMASK(temp_cpumask_1);
>> +static DEFINE_PER_CPUMASK(temp_cpumask_2);
> 
> Yuck, that relies on turning preemption off everywhere you want to use
> those.
> 
> 
>> @@ -5384,11 +5400,14 @@ out_unlock:
>>  
>>  long sched_setaffinity(pid_t pid, const cpumask_t *in_mask)
>>  {
>> -	cpumask_t cpus_allowed;
>> -	cpumask_t new_mask = *in_mask;
>> +	cpumask_ptr cpus_allowed;
>> +	cpumask_ptr new_mask;
>>  	struct task_struct *p;
>>  	int retval;
>>  
>> +	get_cpumask_var(cpus_allowed, temp_cpumask_1);
>> +	get_cpumask_var(new_mask, temp_cpumask_2);
>> +	*new_mask = *in_mask;
>>  	get_online_cpus();
>>  	read_lock(&tasklist_lock);
> 
> BUG!
> 
> get_online_cpus() can sleep, but you just disabled preemption with those
> get_cpumask_var() horribles!
> 
> Couldn't be arsed to look through the rest, but I really hate this
> cpumask_ptr() stuff that relies on disabling preemption.
> 
> NAK

Yeah, I really agree as well.  But I wanted to start playing with using
cpumask_t pointers in some fairly straightforward manner.  Linus's and
Ingo's suggestion to just bite the bullet and redefine the cpumask_t
would force a lot of changes to be made, but perhaps that's really the
way to go.

As to obtaining temp cpumask_t's (both early and late), perhaps a pool of
them would be better?  I believe it could be done similarly to alloc_bootmem
(but much simpler), and I don't think there's enough nesting to require a
very large pool.  (4 was the largest depth I could find in io_apic.c.)  Of
course, with preemption enabled, other problems arise...
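
To make the pool idea concrete, here is a rough userspace sketch, not
kernel code.  Every name in it (sketch_cpumask, sketch_pool_get,
sketch_pool_put, the 4096-CPU size) is made up for illustration, and it
assumes strictly nested (LIFO) use by a single owner.  It has no locking
at all, so it does nothing about the preemption problem mentioned above.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NR_CPUS    4096  /* hypothetical CPU count */
#define SKETCH_WORD_BITS  (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS \
	((SKETCH_NR_CPUS + SKETCH_WORD_BITS - 1) / SKETCH_WORD_BITS)
#define SKETCH_POOL_DEPTH 4     /* deepest nesting reported for io_apic.c */

/* Stand-in for cpumask_t: one bit per possible CPU. */
struct sketch_cpumask {
	unsigned long bits[SKETCH_WORDS];
};

/* Fixed pool of temporaries, handed out like a bump allocator. */
struct sketch_cpumask_pool {
	struct sketch_cpumask slots[SKETCH_POOL_DEPTH];
	size_t next;            /* index of the first free slot */
};

/* Hand out the next free slot, cleared, or NULL if the pool is exhausted. */
static struct sketch_cpumask *sketch_pool_get(struct sketch_cpumask_pool *pool)
{
	struct sketch_cpumask *mask;

	if (pool->next == SKETCH_POOL_DEPTH)
		return NULL;
	mask = &pool->slots[pool->next++];
	memset(mask, 0, sizeof(*mask));
	return mask;
}

/* Return a slot; only the most recently handed-out slot may be returned. */
static bool sketch_pool_put(struct sketch_cpumask_pool *pool,
			    struct sketch_cpumask *mask)
{
	if (pool->next == 0 || mask != &pool->slots[pool->next - 1])
		return false;
	pool->next--;
	return true;
}
```

The stack-like discipline keeps the bookkeeping to a single index, which
is why a nesting depth of 4 would be enough for a pool this simple.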

One other really big use was for the "allbutself" cpumask in the send_IPI
functions.  I think here, preemption is ok because the ownership of the
cpumask temp is very short lived.
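
For reference, an "allbutself" mask is just the online mask with the
sending CPU's bit cleared.  A minimal userspace sketch follows; the
types and helpers (sketch_mask, sketch_allbutself, the 128-CPU size) are
invented for illustration and are not the genapic code.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_MAX_CPUS  128  /* hypothetical CPU count */
#define SKETCH_BITS      (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_NWORDS    ((SKETCH_MAX_CPUS + SKETCH_BITS - 1) / SKETCH_BITS)

/* Stand-in for cpumask_t. */
struct sketch_mask {
	unsigned long bits[SKETCH_NWORDS];
};

static void sketch_set_cpu(struct sketch_mask *m, unsigned int cpu)
{
	m->bits[cpu / SKETCH_BITS] |= 1UL << (cpu % SKETCH_BITS);
}

static int sketch_test_cpu(const struct sketch_mask *m, unsigned int cpu)
{
	return (m->bits[cpu / SKETCH_BITS] >> (cpu % SKETCH_BITS)) & 1UL;
}

/* dst = online with the bit for 'self' cleared. */
static void sketch_allbutself(struct sketch_mask *dst,
			      const struct sketch_mask *online,
			      unsigned int self)
{
	*dst = *online;
	dst->bits[self / SKETCH_BITS] &= ~(1UL << (self % SKETCH_BITS));
}
```

The temporary only has to live from the copy until the IPI has been
sent, which is the short ownership window referred to above.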

But thanks for pointing out the get_online_cpus problem.  I did try to
chase down as many call trees as I could, but I obviously missed one
important one.
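
The ordering problem can be modelled in userspace with a toy preemption
counter.  This is only a sketch: the sketch_* names are invented, the
"may sleep" function merely stands in for something like
get_online_cpus(), and it assumes (as stated in the review) that
obtaining a temporary via get_cpumask_var() disables preemption.

```c
#include <assert.h>

static int sketch_preempt_count;     /* > 0 means "preemption disabled" */
static int sketch_sleep_violations;  /* sleeps attempted while disabled */

static void sketch_preempt_disable(void) { sketch_preempt_count++; }
static void sketch_preempt_enable(void)  { sketch_preempt_count--; }

/* Stand-in for a call that may sleep; records misuse instead of crashing. */
static void sketch_may_sleep(void)
{
	if (sketch_preempt_count > 0)
		sketch_sleep_violations++;
}

/* Order used in the patch: grab the temporary first, then sleep. */
static void sketch_buggy_order(void)
{
	sketch_preempt_disable();    /* models obtaining the temp cpumask */
	sketch_may_sleep();          /* sleeping call with preemption off */
	sketch_preempt_enable();     /* models releasing the temp cpumask */
}

/* Sleeping call completes before the preempt-disabled section begins. */
static void sketch_safe_order(void)
{
	sketch_may_sleep();
	sketch_preempt_disable();
	sketch_preempt_enable();
}
```

Reordering alone only helps when nothing inside the preempt-disabled
section can sleep, so every call tree under it still has to be checked.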

And thanks for looking it over!
Mike

