* [PATCH 00/12] cpumask: reduce stack pressure from local/passed cpumask variables v2
@ 2008-03-26  1:38 ` Mike Travis
  0 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, linux-mm, linux-kernel


Modify usage of cpumask_t variables so that cpumasks are passed and
referenced by pointer as much as possible.

Changes are:

	* Use an allocated array of cpumask_t's for cpumask_of_cpu() when
	  NR_CPUS is large.  This removes 26168 bytes of stack usage
	  (see chart below) and reduces the code generated for each usage.

	* Modify set_cpus_allowed() to take a pointer to the "newly
	  allowed" cpumask (see the short sketch after this list).  This
	  removes 10792 bytes of stack usage but is an ABI change.

	* Add node_to_cpumask_ptr(), which returns a pointer to the cpumask
	  for the specified node.  This removes 10256 bytes of stack usage.

	* Modify build_sched_domains() and related sub-functions to pass
	  pointers to cpumask temp variables.  This consolidates stack
	  space that was spread over various functions.

	* Remove large array from numa_initmem_init() [-8248 bytes].

	* Optimize usages of {CPU,NODE}_MASK_{NONE,ALL} [-9408 bytes].

	* Various other changes to reduce stack size and silence checkpatch
	  warnings [-7672 bytes].
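
As a rough before/after sketch of the calling-convention change
(illustrative only -- the real conversions are in the individual
patches), a typical "pin to one CPU" sequence stops passing cpumask_t
values and passes pointers instead:

	/* before: full cpumask_t values are built and copied on the stack */
	cpumask_t saved_mask = current->cpus_allowed;
	set_cpus_allowed(current, cpumask_of_cpu(cpu));
	/* ... do work pinned to 'cpu' ... */
	set_cpus_allowed(current, saved_mask);

	/* after: only pointers cross the call boundary; with a large
	 * NR_CPUS, &cpumask_of_cpu(cpu) points into a preallocated map */
	cpumask_t saved_mask = current->cpus_allowed;
	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
	/* ... do work pinned to 'cpu' ... */
	set_cpus_allowed(current, &saved_mask);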

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

Cc: Anton Blanchard <anton@samba.org>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Cliff Wickman <cpw@sgi.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: William L. Irwin <wli@holomorphy.com>

Signed-off-by: Mike Travis <travis@sgi.com>
---

v2: resubmitted based on x86/latest.

Summaries:

	1 - Memory Usages Changes
	2 - Build & Test Results

--- ---------------------------------------------------------
* Memory Usages Changes

Summary of memory usage changes across the patch series, using the akpm2
config file with NR_CPUS=4096 and MAX_NUMNODES=512.


====== Data (-l 500)
    1 - initial
    2 - cpumask_of_cpu
    8 - sched_domain
   13 - CPU_NODE_MASK

   .1.   .2.    .8.  .13.    ..final..
  3553     .  -1146  +320 2727   -23%  build_sched_domains(.text)
   533  -533      .     .    .  -100%  hpet_enable(.init.text)
   512     .      .  -512    .  -100%  C(.rodata)
     0     .      .  +512  512      .  cpu_mask_all(.data)
 4598 -533 -1146 +320 3239  -29%  Totals

====== Text/Data ()
    1 - initial
    2 - cpumask_of_cpu
    3 - set_cpus_allowed
    6 - numa_initmem_init
    9 - kern_sched
   13 - CPU_NODE_MASK

        .1.    .2.    .3.    .6.    .9.   .13.    ..final..
    3397632  -2048      .      .      .      .   3395584    <1%  TextSize
    1642496  +2048  +4096      .  -4096  -4096   1640448    <1%  DataSize
    1658880      .      .  +8192      .      .   1667072    <1%  InitSize
  287709184      .  +4096      .  -4096  +4096 287713280    <1%  OtherSize
  294408192      .  +8192  +8192  -8192      . 294416384    +0%  Totals

====== Stack (-l 500)
    1 - initial
    2 - cpumask_of_cpu
    3 - set_cpus_allowed
    4 - cpumask_affinity
    6 - numa_initmem_init
    7 - node_to_cpumask_ptr
    8 - sched_domain
    9 - kern_sched
   11 - build_sched_domains
   12 - cpu_coregroup_map
   13 - CPU_NODE_MASK

    .1.    .2.    .3.    .4.    .6.    .7.    .8.   .9.  .11.  .12.   .13.    ..final..
  11080      .      .      .      .   -512  -6352     .  -976   +16   -512 2744   -75%  build_sched_domains
   8248      .      .      .  -8248      .      .     .     .     .      .    .  -100%  numa_initmem_init
   3672  -1024   -496      .      .      .      .     .     .     .      . 2152   -41%  centrino_target
   3176      .      .      .      .  -2576      .     .     .     .      .  600   -81%  sched_domain_node_span
   3096  -1536   -520      .      .      .      .     .     .     .      . 1040   -66%  acpi_processor_set_throttling
   2600  -1536      .      .      .      .      .     .     .     .   -512  552   -78%  powernowk8_cpu_init
   2120  -1024   -512      .      .      .      .     .     .     .      .  584   -72%  cache_add_dev
   2104  -1008      .      .      .      .      .     .     .     .   -512  584   -72%  powernowk8_target
   2104      .   -512      .      .      .      .     .     .     .   -512 1080   -48%  _cpu_down
   2072   -512      .      .      .      .      .     .     .     .      . 1560   -24%  tick_notify
   2064  -1024      .      .      .      .      .     .     .     .   -504  536   -74%  check_supported_cpu
   2056      .  -1544   +520      .      .      .     .     .     .      . 1032   -49%  sched_setaffinity
   2056  -1024   -512      .      .      .      .     .     .     .      .  520   -74%  get_cur_freq
   2056      .   -512  -1032      .      .      .     .     .     .   -512    .  -100%  affinity_set
   2056  -1024   -520      .      .      .      .     .     .     .      .  512   -75%  acpi_processor_get_throttling
   2056  -1024   -512      .      .      .      .     .     .     .      .  520   -74%  acpi_processor_ffh_cstate_probe
   2048  -1016   -520      .      .      .      .     .     .     .      .  512   -75%  powernowk8_get
   1784  -1024      .      .      .      .      .     .     .     .      .  760   -57%  cpufreq_add_dev
   1768      .   -512      .      .  -1256      .     .     .     .      .    .  -100%  kswapd
   1608  -1608      .      .      .      .      .     .     .     .      .    .  -100%  disable_smp
   1592      .      .      .      .  -1592      .     .     .     .      .    .  -100%  do_tune_cpucache
   1576      .      .      .      .      .      .  -480     .     .  -1096    .  -100%  init_sched_build_groups
   1560      .   -528      .      .   -512      .     .     .     .      .  520   -66%  pci_device_probe
   1552      .   -512      .      .      .      .     .     .     .  -1040    .  -100%  kthreadd
   1544  -1024   -520      .      .      .      .     .     .     .      .    .  -100%  stopmachine
   1544  -1032   -512      .      .      .      .     .     .     .      .    .  -100%  native_machine_shutdown
   1544  -1008      .      .      .      .      .     .     .     .      .  536   -65%  alloc_ldt
   1536   -504      .      .      .      .      .     .     .     .      . 1032   -32%  smp_call_function_single
   1536  -1024      .      .      .      .      .     .     .     .      .  512   -66%  native_smp_send_reschedule
   1176      .      .      .      .      .      .  -512     .     .      .  664   -43%  thread_return
   1176      .      .      .      .      .      .  -512     .     .      .  664   -43%  schedule
   1160      .      .      .      .      .      .  -512     .     .      .  648   -44%  run_rebalance_domains
   1160      .      .      .      .  -1160      .     .     .     .      .    .  -100%  __build_all_zonelists
   1144      .      .   +512      .      .      .     .     .     .   -512 1144      .  threshold_create_device
   1080      .   -520      .      .      .      .     .     .     .      .  560   -48%  pdflush
   1080      .   -512      .      .      .      .     .     .     .   -568    .  -100%  kernel_init
   1064      .      .      .      .  -1064      .     .     .     .      .    .  -100%  cpuup_canceled
   1064      .      .      .      .  -1064      .     .     .     .      .    .  -100%  cpuup_callback
   1032  -1032      .      .      .      .      .     .     .     .      .    .  -100%  setup_pit_timer
   1032      .      .      .      .      .      .     .     .     .   -520  512   -50%  physflat_vector_allocation_domain
   1032  -1032      .      .      .      .      .     .     .     .      .    .  -100%  init_workqueues
   1032  -1032      .      .      .      .      .     .     .     .      .    .  -100%  init_idle
   1032      .      .      .      .      .      .     .     .     .   -512  520   -49%  destroy_irq
   1024      .      .   -512      .      .      .     .     .     .      .  512   -50%  sys_sched_setaffinity
   1024  -1024      .      .      .      .      .     .     .     .      .    .  -100%  setup_APIC_timer
   1024      .   -504      .      .      .      .     .     .     .      .  520   -49%  sched_init_smp
   1024  -1024      .      .      .      .      .     .     .     .      .    .  -100%  native_smp_prepare_cpus
   1024  -1024      .      .      .      .      .     .     .     .      .    .  -100%  kthread_bind
   1024  -1024      .      .      .      .      .     .     .     .      .    .  -100%  hpet_enable
   1024      .      .   -512      .      .      .     .     .     .      .  512   -50%  compat_sys_sched_setaffinity
   1024      .      .      .      .      .      .     .     .     .   -512  512   -50%  __percpu_populate_mask
   1024      .   -512      .      .      .      .     .     .     .   -512    .  -100%  ____call_usermodehelper
    568      .      .      .      .      .      .  -568     .     .      .    .  -100%  cpu_attach_domain
    552      .      .      .      .      .      .     .     .     .   -552    .  -100%  migration_call
    520      .      .      .      .   -520      .     .     .     .      .    .  -100%  node_read_cpumap
    520      .      .      .      .      .      .     .     .     .   -520    .  -100%  dynamic_irq_init
    520      .      .      .      .      .      .    -8     .  -512      .    .  -100%  cpu_to_phys_group
    520      .      .      .      .      .      .  -520     .     .      .    .  -100%  cpu_to_core_group
      0      .      .      .      .      .   +760     .     .     .      .  760      .  sd_init_SIBLING
      0      .      .      .      .      .   +760     .     .     .      .  760      .  sd_init_NODE
      0      .      .      .      .      .   +752     .     .     .      .  752      .  sd_init_MC
      0      .      .      .      .      .   +752     .     .     .      .  752      .  sd_init_CPU
      0      .      .      .      .      .   +752     .     .     .      .  752      .  sd_init_ALLNODES
      0      .      .      .      .      .      .  +512     .     .      .  512      .  detach_destroy_domains
 101488 -26168 -10792  -1024  -8248 -10256  -2576 -2600  -976  -496  -9408 28944  -71%  Totals

--- ---------------------------------------------------------
* Build & Test Results

Built/tested:

    nosmp
    nonuma
    defconfig (NR_CPUS/MAX_NUMNODES: 32/64 and 4096/512)
    akpm2 config (NR_CPUS/MAX_NUMNODES: 255/64 and 4096/512)

Built with no errors:

    allyesconfig
    allnoconfig
    allmodconfig
    current-x86_64-default
    current-ia64-sn2
    current-ia64-default
    current-ia64-nosmp
    current-ia64-zx1
    current-s390-default
    current-arm-default
    current-sparc-default
    current-sparc64-default
    current-sparc64-smp
    current-ppc-pmac32

Not Built (previous errors):

    current-x86_64-single
	drivers/built-in.o: In function `sas_request_addr':
	(.text+0x814bd): undefined reference to `request_firmware'
	drivers/built-in.o: In function `sas_request_addr':
	(.text+0x81556): undefined reference to `release_firmware'
    current-x86_64-8psmp
	drivers/built-in.o: In function `sas_request_addr':
	(.text+0x814bd): undefined reference to `request_firmware'
	drivers/built-in.o: In function `sas_request_addr':
	(.text+0x81556): undefined reference to `release_firmware'
    current-x86_64-debug
	sas_scsi_host.c:1091: undefined reference to `request_firmware'
	sas_scsi_host.c:1103: undefined reference to `release_firmware'
    current-x86_64-numa
	drivers/built-in.o: In function `sas_request_addr':
	(.text+0x8540d): undefined reference to `request_firmware'
	drivers/built-in.o: In function `sas_request_addr':
	(.text+0x854a6): undefined reference to `release_firmware'
    current-i386-single
	drivers/built-in.o: In function `sas_request_addr':
	(.text+0x7617a): undefined reference to `request_firmware'
	drivers/built-in.o: In function `sas_request_addr':
	(.text+0x76208): undefined reference to `release_firmware'
    current-i386-smp
	drivers/built-in.o: In function `sas_request_addr':
	(.text+0x7985a): undefined reference to `request_firmware'
	drivers/built-in.o: In function `sas_request_addr':
	(.text+0x798e8): undefined reference to `release_firmware'
    current-ppc-smp
	WRAP    arch/powerpc/boot/uImage
	ln: accessing `arch/powerpc/boot/uImage': No such file or directory

(Note: building with the patches applied did not change these errors.)


--- ---------------------------------------------------------

-- 

* [PATCH 01/12] cpumask: Convert cpumask_of_cpu to allocated array v2
  2008-03-26  1:38 ` Mike Travis
@ 2008-03-26  1:38   ` Mike Travis
  -1 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, linux-mm, linux-kernel, Tony Luck, Paul Mackerras,
	Anton Blanchard, David S. Miller, William L. Irwin,
	Thomas Gleixner, H. Peter Anvin, Christoph Lameter

[-- Attachment #1: cpumask_of_cpu --]
[-- Type: text/plain, Size: 5197 bytes --]

Here is a simple patch to use an allocated array of cpumasks to
represent cpumask_of_cpu() instead of constructing one on the
stack, when the size of cpumask_t is significant.

The allocated array is used only when NR_CPUS > BITS_PER_LONG; when
NR_CPUS is less than or equal to BITS_PER_LONG, cpumask_of_cpu() still
builds the mask as a single unsigned long on the fly.  In either case
the macro now yields an lvalue, so a pointer to cpumask_of_cpu() can be
taken.

This removes 26168 bytes of stack usage and reduces the code generated
for each usage.
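
For illustration only (hypothetical caller, not part of this patch), the
lvalue form lets the address of the mask be taken directly:

	/* No cpumask_t is built on the caller's stack; with
	 * NR_CPUS > BITS_PER_LONG the pointer refers into the boot-time
	 * allocated cpumask_of_cpu_map[] array.
	 */
	const cpumask_t *mask = &cpumask_of_cpu(cpu);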

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

# ia64
Cc: Tony Luck <tony.luck@intel.com>

# powerpc
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>

# sparc
Cc: David S. Miller <davem@davemloft.net>
Cc: William L. Irwin <wli@holomorphy.com>

# x86
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>


Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
v2: rebased on linux-2.6.git + linux-2.6-x86.git
    ... and changed to use an allocated array of cpumask_t's instead
        of a percpu variable.
---
 arch/ia64/kernel/setup.c       |    3 +++
 arch/powerpc/kernel/setup_64.c |    3 +++
 arch/sparc64/mm/init.c         |    3 +++
 arch/x86/kernel/setup.c        |    3 +++
 include/linux/cpumask.h        |   30 ++++++++++++++++++------------
 init/main.c                    |   18 ++++++++++++++++++
 6 files changed, 48 insertions(+), 12 deletions(-)

--- linux.trees.git.orig/arch/ia64/kernel/setup.c
+++ linux.trees.git/arch/ia64/kernel/setup.c
@@ -772,6 +772,9 @@ setup_per_cpu_areas (void)
 		highest_cpu = cpu;
 
 	nr_cpu_ids = highest_cpu + 1;
+
+	/* Setup cpumask_of_cpu() map */
+	setup_cpumask_of_cpu(nr_cpu_ids);
 #endif
 }
 
--- linux.trees.git.orig/arch/powerpc/kernel/setup_64.c
+++ linux.trees.git/arch/powerpc/kernel/setup_64.c
@@ -601,6 +601,9 @@ void __init setup_per_cpu_areas(void)
 
 	/* Now that per_cpu is setup, initialize cpu_sibling_map */
 	smp_setup_cpu_sibling_map();
+
+	/* Setup cpumask_of_cpu() map */
+	setup_cpumask_of_cpu(nr_cpu_ids);
 }
 #endif
 
--- linux.trees.git.orig/arch/sparc64/mm/init.c
+++ linux.trees.git/arch/sparc64/mm/init.c
@@ -1302,6 +1302,9 @@ void __init setup_per_cpu_areas(void)
 		highest_cpu = cpu;
 
 	nr_cpu_ids = highest_cpu + 1;
+
+	/* Setup cpumask_of_cpu() map */
+	setup_cpumask_of_cpu(nr_cpu_ids);
 }
 #endif
 
--- linux.trees.git.orig/arch/x86/kernel/setup.c
+++ linux.trees.git/arch/x86/kernel/setup.c
@@ -96,6 +96,9 @@ void __init setup_per_cpu_areas(void)
 
 	/* Setup percpu data maps */
 	setup_per_cpu_maps();
+
+	/* Setup cpumask_of_cpu() map */
+	setup_cpumask_of_cpu(nr_cpu_ids);
 }
 
 #endif
--- linux.trees.git.orig/include/linux/cpumask.h
+++ linux.trees.git/include/linux/cpumask.h
@@ -222,18 +222,6 @@ int __next_cpu(int n, const cpumask_t *s
 #define next_cpu(n, src)	({ (void)(src); 1; })
 #endif
 
-#define cpumask_of_cpu(cpu)						\
-({									\
-	typeof(_unused_cpumask_arg_) m;					\
-	if (sizeof(m) == sizeof(unsigned long)) {			\
-		m.bits[0] = 1UL<<(cpu);					\
-	} else {							\
-		cpus_clear(m);						\
-		cpu_set((cpu), m);					\
-	}								\
-	m;								\
-})
-
 #define CPU_MASK_LAST_WORD BITMAP_LAST_WORD_MASK(NR_CPUS)
 
 #if NR_CPUS <= BITS_PER_LONG
@@ -243,6 +231,19 @@ int __next_cpu(int n, const cpumask_t *s
 	[BITS_TO_LONGS(NR_CPUS)-1] = CPU_MASK_LAST_WORD			\
 } }
 
+#define cpumask_of_cpu(cpu)						\
+(*({									\
+	typeof(_unused_cpumask_arg_) m;					\
+	if (sizeof(m) == sizeof(unsigned long)) {			\
+		m.bits[0] = 1UL<<(cpu);					\
+	} else {							\
+		cpus_clear(m);						\
+		cpu_set((cpu), m);					\
+	}								\
+	&m;								\
+}))
+static inline void setup_cpumask_of_cpu(int num) {}
+
 #else
 
 #define CPU_MASK_ALL							\
@@ -251,6 +252,11 @@ int __next_cpu(int n, const cpumask_t *s
 	[BITS_TO_LONGS(NR_CPUS)-1] = CPU_MASK_LAST_WORD			\
 } }
 
+/* cpumask_of_cpu_map is in init/main.c */
+#define cpumask_of_cpu(cpu)    (cpumask_of_cpu_map[cpu])
+extern cpumask_t *cpumask_of_cpu_map;
+void setup_cpumask_of_cpu(int num);
+
 #endif
 
 #define CPU_MASK_NONE							\
--- linux.trees.git.orig/init/main.c
+++ linux.trees.git/init/main.c
@@ -367,6 +367,21 @@ static inline void smp_prepare_cpus(unsi
 int nr_cpu_ids __read_mostly = NR_CPUS;
 EXPORT_SYMBOL(nr_cpu_ids);
 
+#if NR_CPUS > BITS_PER_LONG
+cpumask_t *cpumask_of_cpu_map __read_mostly;
+EXPORT_SYMBOL(cpumask_of_cpu_map);
+
+void __init setup_cpumask_of_cpu(int num)
+{
+	int i;
+
+	/* alloc_bootmem zeroes memory */
+	cpumask_of_cpu_map = alloc_bootmem_low(sizeof(cpumask_t) * num);
+	for (i = 0; i < num; i++)
+		cpu_set(i, cpumask_of_cpu_map[i]);
+}
+#endif
+
 #ifndef CONFIG_HAVE_SETUP_PER_CPU_AREA
 unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
 EXPORT_SYMBOL(__per_cpu_offset);
@@ -393,6 +408,9 @@ static void __init setup_per_cpu_areas(v
 
 	nr_cpu_ids = highest_cpu + 1;
 	printk(KERN_DEBUG "NR_CPUS:%d (nr_cpu_ids:%d)\n", NR_CPUS, nr_cpu_ids);
+
+	/* Setup cpumask_of_cpu() map */
+	setup_cpumask_of_cpu(nr_cpu_ids);
 }
 #endif /* CONFIG_HAVE_SETUP_PER_CPU_AREA */
 

-- 

* [PATCH 02/12] cpumask: pass pointer to cpumask for set_cpus_allowed() v2
  2008-03-26  1:38 ` Mike Travis
@ 2008-03-26  1:38   ` Mike Travis
  -1 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, linux-mm, linux-kernel

[-- Attachment #1: set_cpus_allowed --]
[-- Type: text/plain, Size: 43560 bytes --]

Instead of passing the "newly allowed cpus" cpumask argument by value,
pass a pointer:

-int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+int set_cpus_allowed(struct task_struct *p, const cpumask_t *new_mask)

This is a major ABI change and unfortunately touches a number of files,
as the function is very commonly used.  I had considered using a macro
to "silently" pass the 2nd arg as a pointer, but that loses in the case
where the caller already has a pointer to the new cpumask (see the
sketch below).

This removes 10792 bytes of stack usage.
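
As a sketch of why the explicit pointer argument was preferred over a
wrapper macro (bind_task() is a hypothetical helper, for illustration
only): a caller that already holds a cpumask_t pointer can now pass it
straight through, without dereferencing and copying a full mask:

	static int bind_task(struct task_struct *p, const cpumask_t *mask)
	{
		/* no cpumask_t copy; 'mask' is forwarded as-is */
		return set_cpus_allowed(p, mask);
	}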

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

Signed-off-by: Mike Travis <travis@sgi.com>
---
v2: rebased on linux-2.6.git + linux-2.6-x86.git
---
 arch/arm/mach-integrator/cpu.c                   |   10 ++++-----
 arch/ia64/kernel/cpufreq/acpi-cpufreq.c          |   10 ++++-----
 arch/ia64/kernel/salinfo.c                       |    4 +--
 arch/ia64/kernel/topology.c                      |    4 +--
 arch/ia64/sn/kernel/sn2/sn_hwperf.c              |    4 +--
 arch/ia64/sn/kernel/xpc_main.c                   |    4 +--
 arch/mips/kernel/mips-mt-fpaff.c                 |    4 +--
 arch/mips/kernel/traps.c                         |    2 -
 arch/powerpc/kernel/smp.c                        |    4 +--
 arch/powerpc/kernel/sysfs.c                      |    4 +--
 arch/powerpc/platforms/pseries/rtasd.c           |    4 +--
 arch/sh/kernel/cpufreq.c                         |    4 +--
 arch/sparc64/kernel/sysfs.c                      |    4 +--
 arch/sparc64/kernel/us2e_cpufreq.c               |    8 +++----
 arch/sparc64/kernel/us3_cpufreq.c                |    8 +++----
 arch/x86/kernel/acpi/cstate.c                    |    4 +--
 arch/x86/kernel/apm_32.c                         |    6 ++---
 arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c       |   12 +++++------
 arch/x86/kernel/cpu/cpufreq/powernow-k8.c        |   20 +++++++++---------
 arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c |   12 +++++------
 arch/x86/kernel/cpu/cpufreq/speedstep-ich.c      |   12 +++++------
 arch/x86/kernel/cpu/intel_cacheinfo.c            |    4 +--
 arch/x86/kernel/cpu/mcheck/mce_amd_64.c          |    4 +--
 arch/x86/kernel/microcode.c                      |   16 +++++++-------
 arch/x86/kernel/process_64.c                     |    1 
 arch/x86/kernel/reboot.c                         |    2 -
 drivers/acpi/processor_throttling.c              |   10 ++++-----
 drivers/firmware/dcdbas.c                        |    4 +--
 drivers/pci/pci-driver.c                         |   10 ++++++---
 include/linux/cpuset.h                           |   13 ++++++-----
 include/linux/sched.h                            |   10 +++++----
 init/main.c                                      |    2 -
 kernel/cpu.c                                     |    4 +--
 kernel/cpuset.c                                  |   22 +++++++-------------
 kernel/kmod.c                                    |    2 -
 kernel/kthread.c                                 |    4 +--
 kernel/rcutorture.c                              |   11 +++++-----
 kernel/sched.c                                   |   25 +++++++++++------------
 kernel/sched_rt.c                                |    3 +-
 kernel/stop_machine.c                            |    2 -
 mm/pdflush.c                                     |    4 +--
 mm/vmscan.c                                      |    6 ++---
 net/sunrpc/svc.c                                 |   18 +++++++++++-----
 43 files changed, 166 insertions(+), 155 deletions(-)

--- linux.trees.git.orig/arch/arm/mach-integrator/cpu.c
+++ linux.trees.git/arch/arm/mach-integrator/cpu.c
@@ -94,7 +94,7 @@ static int integrator_set_target(struct 
 	 * Bind to the specified CPU.  When this call returns,
 	 * we should be running on the right CPU.
 	 */
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	BUG_ON(cpu != smp_processor_id());
 
 	/* get current setting */
@@ -122,7 +122,7 @@ static int integrator_set_target(struct 
 	freqs.cpu = policy->cpu;
 
 	if (freqs.old == freqs.new) {
-		set_cpus_allowed(current, cpus_allowed);
+		set_cpus_allowed(current, &cpus_allowed);
 		return 0;
 	}
 
@@ -145,7 +145,7 @@ static int integrator_set_target(struct 
 	/*
 	 * Restore the CPUs allowed mask.
 	 */
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 
 	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
 
@@ -161,7 +161,7 @@ static unsigned int integrator_get(unsig
 
 	cpus_allowed = current->cpus_allowed;
 
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	BUG_ON(cpu != smp_processor_id());
 
 	/* detect memory etc. */
@@ -177,7 +177,7 @@ static unsigned int integrator_get(unsig
 
 	current_freq = icst525_khz(&cclk_params, vco); /* current freq */
 
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 
 	return current_freq;
 }
--- linux.trees.git.orig/arch/ia64/kernel/cpufreq/acpi-cpufreq.c
+++ linux.trees.git/arch/ia64/kernel/cpufreq/acpi-cpufreq.c
@@ -112,7 +112,7 @@ processor_get_freq (
 	dprintk("processor_get_freq\n");
 
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (smp_processor_id() != cpu)
 		goto migrate_end;
 
@@ -120,7 +120,7 @@ processor_get_freq (
 	ret = processor_get_pstate(&value);
 
 	if (ret) {
-		set_cpus_allowed(current, saved_mask);
+		set_cpus_allowed(current, &saved_mask);
 		printk(KERN_WARNING "get performance failed with error %d\n",
 		       ret);
 		ret = 0;
@@ -130,7 +130,7 @@ processor_get_freq (
 	ret = (clock_freq*1000);
 
 migrate_end:
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return ret;
 }
 
@@ -150,7 +150,7 @@ processor_set_freq (
 	dprintk("processor_set_freq\n");
 
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (smp_processor_id() != cpu) {
 		retval = -EAGAIN;
 		goto migrate_end;
@@ -207,7 +207,7 @@ processor_set_freq (
 	retval = 0;
 
 migrate_end:
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return (retval);
 }
 
--- linux.trees.git.orig/arch/ia64/kernel/salinfo.c
+++ linux.trees.git/arch/ia64/kernel/salinfo.c
@@ -405,9 +405,9 @@ call_on_cpu(int cpu, void (*fn)(void *),
 {
 	cpumask_t save_cpus_allowed = current->cpus_allowed;
 	cpumask_t new_cpus_allowed = cpumask_of_cpu(cpu);
-	set_cpus_allowed(current, new_cpus_allowed);
+	set_cpus_allowed(current, &new_cpus_allowed);
 	(*fn)(arg);
-	set_cpus_allowed(current, save_cpus_allowed);
+	set_cpus_allowed(current, &save_cpus_allowed);
 }
 
 static void
--- linux.trees.git.orig/arch/ia64/kernel/topology.c
+++ linux.trees.git/arch/ia64/kernel/topology.c
@@ -345,12 +345,12 @@ static int __cpuinit cache_add_dev(struc
 		return 0;
 
 	oldmask = current->cpus_allowed;
-	retval = set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	retval = set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (unlikely(retval))
 		return retval;
 
 	retval = cpu_cache_sysfs_init(cpu);
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 	if (unlikely(retval < 0))
 		return retval;
 
--- linux.trees.git.orig/arch/ia64/sn/kernel/sn2/sn_hwperf.c
+++ linux.trees.git/arch/ia64/sn/kernel/sn2/sn_hwperf.c
@@ -635,9 +635,9 @@ static int sn_hwperf_op_cpu(struct sn_hw
 		else {
 			/* migrate the task before calling SAL */ 
 			save_allowed = current->cpus_allowed;
-			set_cpus_allowed(current, cpumask_of_cpu(cpu));
+			set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 			sn_hwperf_call_sal(op_info);
-			set_cpus_allowed(current, save_allowed);
+			set_cpus_allowed(current, &save_allowed);
 		}
 	}
 	r = op_info->ret;
--- linux.trees.git.orig/arch/ia64/sn/kernel/xpc_main.c
+++ linux.trees.git/arch/ia64/sn/kernel/xpc_main.c
@@ -255,7 +255,7 @@ xpc_hb_checker(void *ignore)
 
 	daemonize(XPC_HB_CHECK_THREAD_NAME);
 
-	set_cpus_allowed(current, cpumask_of_cpu(XPC_HB_CHECK_CPU));
+	set_cpus_allowed(current, &cpumask_of_cpu(XPC_HB_CHECK_CPU));
 
 	/* set our heartbeating to other partitions into motion */
 	xpc_hb_check_timeout = jiffies + (xpc_hb_check_interval * HZ);
@@ -509,7 +509,7 @@ xpc_activating(void *__partid)
 	}
 
 	/* allow this thread and its children to run on any CPU */
-	set_cpus_allowed(current, CPU_MASK_ALL);
+	set_cpus_allowed(current, &CPU_MASK_ALL);
 
 	/*
 	 * Register the remote partition's AMOs with SAL so it can handle
--- linux.trees.git.orig/arch/mips/kernel/mips-mt-fpaff.c
+++ linux.trees.git/arch/mips/kernel/mips-mt-fpaff.c
@@ -98,10 +98,10 @@ asmlinkage long mipsmt_sys_sched_setaffi
 	if (test_ti_thread_flag(ti, TIF_FPUBOUND) &&
 	    cpus_intersects(new_mask, mt_fpu_cpumask)) {
 		cpus_and(effective_mask, new_mask, mt_fpu_cpumask);
-		retval = set_cpus_allowed(p, effective_mask);
+		retval = set_cpus_allowed(p, &effective_mask);
 	} else {
 		clear_ti_thread_flag(ti, TIF_FPUBOUND);
-		retval = set_cpus_allowed(p, new_mask);
+		retval = set_cpus_allowed(p, &new_mask);
 	}
 
 out_unlock:
--- linux.trees.git.orig/arch/mips/kernel/traps.c
+++ linux.trees.git/arch/mips/kernel/traps.c
@@ -804,7 +804,7 @@ static void mt_ase_fp_affinity(void)
 
 			cpus_and(tmask, current->thread.user_cpus_allowed,
 			         mt_fpu_cpumask);
-			set_cpus_allowed(current, tmask);
+			set_cpus_allowed(current, &tmask);
 			set_thread_flag(TIF_FPUBOUND);
 		}
 	}
--- linux.trees.git.orig/arch/powerpc/kernel/smp.c
+++ linux.trees.git/arch/powerpc/kernel/smp.c
@@ -618,12 +618,12 @@ void __init smp_cpus_done(unsigned int m
 	 * se we pin us down to CPU 0 for a short while
 	 */
 	old_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(boot_cpuid));
+	set_cpus_allowed(current, &cpumask_of_cpu(boot_cpuid));
 	
 	if (smp_ops)
 		smp_ops->setup_cpu(boot_cpuid);
 
-	set_cpus_allowed(current, old_mask);
+	set_cpus_allowed(current, &old_mask);
 
 	snapshot_timebases();
 
--- linux.trees.git.orig/arch/powerpc/kernel/sysfs.c
+++ linux.trees.git/arch/powerpc/kernel/sysfs.c
@@ -131,12 +131,12 @@ static unsigned long run_on_cpu(unsigned
 	unsigned long ret;
 
 	/* should return -EINVAL to userspace */
-	if (set_cpus_allowed(current, cpumask_of_cpu(cpu)))
+	if (set_cpus_allowed(current, &cpumask_of_cpu(cpu)))
 		return 0;
 
 	ret = func(arg);
 
-	set_cpus_allowed(current, old_affinity);
+	set_cpus_allowed(current, &old_affinity);
 
 	return ret;
 }
--- linux.trees.git.orig/arch/powerpc/platforms/pseries/rtasd.c
+++ linux.trees.git/arch/powerpc/platforms/pseries/rtasd.c
@@ -385,9 +385,9 @@ static void do_event_scan_all_cpus(long 
 	get_online_cpus();
 	cpu = first_cpu(cpu_online_map);
 	for (;;) {
-		set_cpus_allowed(current, cpumask_of_cpu(cpu));
+		set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 		do_event_scan();
-		set_cpus_allowed(current, CPU_MASK_ALL);
+		set_cpus_allowed(current, &CPU_MASK_ALL);
 
 		/* Drop hotplug lock, and sleep for the specified delay */
 		put_online_cpus();
--- linux.trees.git.orig/arch/sh/kernel/cpufreq.c
+++ linux.trees.git/arch/sh/kernel/cpufreq.c
@@ -48,7 +48,7 @@ static int sh_cpufreq_target(struct cpuf
 		return -ENODEV;
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	BUG_ON(smp_processor_id() != cpu);
 
@@ -66,7 +66,7 @@ static int sh_cpufreq_target(struct cpuf
 	freqs.flags	= 0;
 
 	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 	clk_set_rate(cpuclk, freq);
 	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
 
--- linux.trees.git.orig/arch/sparc64/kernel/sysfs.c
+++ linux.trees.git/arch/sparc64/kernel/sysfs.c
@@ -104,12 +104,12 @@ static unsigned long run_on_cpu(unsigned
 	unsigned long ret;
 
 	/* should return -EINVAL to userspace */
-	if (set_cpus_allowed(current, cpumask_of_cpu(cpu)))
+	if (set_cpus_allowed(current, &cpumask_of_cpu(cpu)))
 		return 0;
 
 	ret = func(arg);
 
-	set_cpus_allowed(current, old_affinity);
+	set_cpus_allowed(current, &old_affinity);
 
 	return ret;
 }
--- linux.trees.git.orig/arch/sparc64/kernel/us2e_cpufreq.c
+++ linux.trees.git/arch/sparc64/kernel/us2e_cpufreq.c
@@ -238,12 +238,12 @@ static unsigned int us2e_freq_get(unsign
 		return 0;
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	clock_tick = sparc64_get_clock_tick(cpu) / 1000;
 	estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR);
 
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 
 	return clock_tick / estar_to_divisor(estar);
 }
@@ -259,7 +259,7 @@ static void us2e_set_cpu_divider_index(u
 		return;
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	new_freq = clock_tick = sparc64_get_clock_tick(cpu) / 1000;
 	new_bits = index_to_estar_mode(index);
@@ -281,7 +281,7 @@ static void us2e_set_cpu_divider_index(u
 
 	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
 
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 }
 
 static int us2e_freq_target(struct cpufreq_policy *policy,
--- linux.trees.git.orig/arch/sparc64/kernel/us3_cpufreq.c
+++ linux.trees.git/arch/sparc64/kernel/us3_cpufreq.c
@@ -86,12 +86,12 @@ static unsigned int us3_freq_get(unsigne
 		return 0;
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	reg = read_safari_cfg();
 	ret = get_current_freq(cpu, reg);
 
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 
 	return ret;
 }
@@ -106,7 +106,7 @@ static void us3_set_cpu_divider_index(un
 		return;
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	new_freq = sparc64_get_clock_tick(cpu) / 1000;
 	switch (index) {
@@ -140,7 +140,7 @@ static void us3_set_cpu_divider_index(un
 
 	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
 
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 }
 
 static int us3_freq_target(struct cpufreq_policy *policy,
--- linux.trees.git.orig/arch/x86/kernel/acpi/cstate.c
+++ linux.trees.git/arch/x86/kernel/acpi/cstate.c
@@ -91,7 +91,7 @@ int acpi_processor_ffh_cstate_probe(unsi
 
 	/* Make sure we are running on right CPU */
 	saved_mask = current->cpus_allowed;
-	retval = set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	retval = set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (retval)
 		return -1;
 
@@ -128,7 +128,7 @@ int acpi_processor_ffh_cstate_probe(unsi
 		 cx->address);
 
 out:
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return retval;
 }
 EXPORT_SYMBOL_GPL(acpi_processor_ffh_cstate_probe);
--- linux.trees.git.orig/arch/x86/kernel/apm_32.c
+++ linux.trees.git/arch/x86/kernel/apm_32.c
@@ -496,14 +496,14 @@ static cpumask_t apm_save_cpus(void)
 {
 	cpumask_t x = current->cpus_allowed;
 	/* Some bioses don't like being called from CPU != 0 */
-	set_cpus_allowed(current, cpumask_of_cpu(0));
+	set_cpus_allowed(current, &cpumask_of_cpu(0));
 	BUG_ON(smp_processor_id() != 0);
 	return x;
 }
 
 static inline void apm_restore_cpus(cpumask_t mask)
 {
-	set_cpus_allowed(current, mask);
+	set_cpus_allowed(current, &mask);
 }
 
 #else
@@ -1694,7 +1694,7 @@ static int apm(void *unused)
 	 * Some bioses don't like being called from CPU != 0.
 	 * Method suggested by Ingo Molnar.
 	 */
-	set_cpus_allowed(current, cpumask_of_cpu(0));
+	set_cpus_allowed(current, &cpumask_of_cpu(0));
 	BUG_ON(smp_processor_id() != 0);
 #endif
 
--- linux.trees.git.orig/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
+++ linux.trees.git/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
@@ -192,9 +192,9 @@ static void drv_read(struct drv_cmd *cmd
 	cpumask_t saved_mask = current->cpus_allowed;
 	cmd->val = 0;
 
-	set_cpus_allowed(current, cmd->mask);
+	set_cpus_allowed(current, &cmd->mask);
 	do_drv_read(cmd);
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 }
 
 static void drv_write(struct drv_cmd *cmd)
@@ -203,11 +203,11 @@ static void drv_write(struct drv_cmd *cm
 	unsigned int i;
 
 	for_each_cpu_mask(i, cmd->mask) {
-		set_cpus_allowed(current, cpumask_of_cpu(i));
+		set_cpus_allowed(current, &cpumask_of_cpu(i));
 		do_drv_write(cmd);
 	}
 
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return;
 }
 
@@ -271,7 +271,7 @@ static unsigned int get_measured_perf(un
 	unsigned int retval;
 
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (get_cpu() != cpu) {
 		/* We were not able to run on requested processor */
 		put_cpu();
@@ -329,7 +329,7 @@ static unsigned int get_measured_perf(un
 	retval = per_cpu(drv_data, cpu)->max_freq * perf_percent / 100;
 
 	put_cpu();
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 
 	dprintk("cpu %d: performance percent %d\n", cpu, perf_percent);
 	return retval;
--- linux.trees.git.orig/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
+++ linux.trees.git/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
@@ -483,7 +483,7 @@ static int check_supported_cpu(unsigned 
 	unsigned int rc = 0;
 
 	oldmask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	if (smp_processor_id() != cpu) {
 		printk(KERN_ERR PFX "limiting to cpu %u failed\n", cpu);
@@ -528,7 +528,7 @@ static int check_supported_cpu(unsigned 
 	rc = 1;
 
 out:
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 	return rc;
 }
 
@@ -1030,7 +1030,7 @@ static int powernowk8_target(struct cpuf
 
 	/* only run on specific CPU from here on */
 	oldmask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(pol->cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(pol->cpu));
 
 	if (smp_processor_id() != pol->cpu) {
 		printk(KERN_ERR PFX "limiting to cpu %u failed\n", pol->cpu);
@@ -1085,7 +1085,7 @@ static int powernowk8_target(struct cpuf
 	ret = 0;
 
 err_out:
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 	return ret;
 }
 
@@ -1145,7 +1145,7 @@ static int __cpuinit powernowk8_cpu_init
 
 	/* only run on specific CPU from here on */
 	oldmask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(pol->cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(pol->cpu));
 
 	if (smp_processor_id() != pol->cpu) {
 		printk(KERN_ERR PFX "limiting to cpu %u failed\n", pol->cpu);
@@ -1164,7 +1164,7 @@ static int __cpuinit powernowk8_cpu_init
 		fidvid_msr_init();
 
 	/* run on any CPU again */
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 
 	if (cpu_family == CPU_HW_PSTATE)
 		pol->cpus = cpumask_of_cpu(pol->cpu);
@@ -1205,7 +1205,7 @@ static int __cpuinit powernowk8_cpu_init
 	return 0;
 
 err_out:
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 	powernow_k8_cpu_exit_acpi(data);
 
 	kfree(data);
@@ -1242,10 +1242,10 @@ static unsigned int powernowk8_get (unsi
 	if (!data)
 		return -EINVAL;
 
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (smp_processor_id() != cpu) {
 		printk(KERN_ERR PFX "limiting to CPU %d failed in powernowk8_get\n", cpu);
-		set_cpus_allowed(current, oldmask);
+		set_cpus_allowed(current, &oldmask);
 		return 0;
 	}
 
@@ -1259,7 +1259,7 @@ static unsigned int powernowk8_get (unsi
 
 
 out:
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 	return khz;
 }
 
--- linux.trees.git.orig/arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c
+++ linux.trees.git/arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c
@@ -315,7 +315,7 @@ static unsigned int get_cur_freq(unsigne
 	cpumask_t saved_mask;
 
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (smp_processor_id() != cpu)
 		return 0;
 
@@ -333,7 +333,7 @@ static unsigned int get_cur_freq(unsigne
 		clock_freq = extract_clock(l, cpu, 1);
 	}
 
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return clock_freq;
 }
 
@@ -487,7 +487,7 @@ static int centrino_target (struct cpufr
 		else
 			cpu_set(j, set_mask);
 
-		set_cpus_allowed(current, set_mask);
+		set_cpus_allowed(current, &set_mask);
 		preempt_disable();
 		if (unlikely(!cpu_isset(smp_processor_id(), set_mask))) {
 			dprintk("couldn't limit to CPUs in this domain\n");
@@ -555,7 +555,7 @@ static int centrino_target (struct cpufr
 
 		if (!cpus_empty(covered_cpus)) {
 			for_each_cpu_mask(j, covered_cpus) {
-				set_cpus_allowed(current, cpumask_of_cpu(j));
+				set_cpus_allowed(current, &cpumask_of_cpu(j));
 				wrmsr(MSR_IA32_PERF_CTL, oldmsr, h);
 			}
 		}
@@ -569,12 +569,12 @@ static int centrino_target (struct cpufr
 			cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
 		}
 	}
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return 0;
 
 migrate_end:
 	preempt_enable();
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return 0;
 }
 
--- linux.trees.git.orig/arch/x86/kernel/cpu/cpufreq/speedstep-ich.c
+++ linux.trees.git/arch/x86/kernel/cpu/cpufreq/speedstep-ich.c
@@ -235,9 +235,9 @@ static unsigned int _speedstep_get(cpuma
 	cpumask_t cpus_allowed;
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, cpus);
+	set_cpus_allowed(current, &cpus);
 	speed = speedstep_get_processor_frequency(speedstep_processor);
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 	dprintk("detected %u kHz as current frequency\n", speed);
 	return speed;
 }
@@ -285,12 +285,12 @@ static int speedstep_target (struct cpuf
 	}
 
 	/* switch to physical CPU where state is to be changed */
-	set_cpus_allowed(current, policy->cpus);
+	set_cpus_allowed(current, &policy->cpus);
 
 	speedstep_set_state(newstate);
 
 	/* allow to be run on all CPUs */
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 
 	for_each_cpu_mask(i, policy->cpus) {
 		freqs.cpu = i;
@@ -326,7 +326,7 @@ static int speedstep_cpu_init(struct cpu
 #endif
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, policy->cpus);
+	set_cpus_allowed(current, &policy->cpus);
 
 	/* detect low and high frequency and transition latency */
 	result = speedstep_get_freqs(speedstep_processor,
@@ -334,7 +334,7 @@ static int speedstep_cpu_init(struct cpu
 				     &speedstep_freqs[SPEEDSTEP_HIGH].frequency,
 				     &policy->cpuinfo.transition_latency,
 				     &speedstep_set_state);
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 	if (result)
 		return result;
 
--- linux.trees.git.orig/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ linux.trees.git/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -525,7 +525,7 @@ static int __cpuinit detect_cache_attrib
 		return -ENOMEM;
 
 	oldmask = current->cpus_allowed;
-	retval = set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	retval = set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (retval)
 		goto out;
 
@@ -542,7 +542,7 @@ static int __cpuinit detect_cache_attrib
 		}
 		cache_shared_cpu_map_setup(cpu, j);
 	}
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 
 out:
 	if (retval) {
--- linux.trees.git.orig/arch/x86/kernel/cpu/mcheck/mce_amd_64.c
+++ linux.trees.git/arch/x86/kernel/cpu/mcheck/mce_amd_64.c
@@ -256,13 +256,13 @@ static cpumask_t affinity_set(unsigned i
 	cpumask_t oldmask = current->cpus_allowed;
 	cpumask_t newmask = CPU_MASK_NONE;
 	cpu_set(cpu, newmask);
-	set_cpus_allowed(current, newmask);
+	set_cpus_allowed(current, &newmask);
 	return oldmask;
 }
 
 static void affinity_restore(cpumask_t oldmask)
 {
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 }
 
 #define SHOW_FIELDS(name)                                           \
--- linux.trees.git.orig/arch/x86/kernel/microcode.c
+++ linux.trees.git/arch/x86/kernel/microcode.c
@@ -402,7 +402,7 @@ static int do_microcode_update (void)
 
 			if (!uci->valid)
 				continue;
-			set_cpus_allowed(current, cpumask_of_cpu(cpu));
+			set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 			error = get_maching_microcode(new_mc, cpu);
 			if (error < 0)
 				goto out;
@@ -416,7 +416,7 @@ out:
 		vfree(new_mc);
 	if (cursor < 0)
 		error = cursor;
-	set_cpus_allowed(current, old);
+	set_cpus_allowed(current, &old);
 	return error;
 }
 
@@ -579,7 +579,7 @@ static int apply_microcode_check_cpu(int
 		return 0;
 
 	old = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	/* Check if the microcode we have in memory matches the CPU */
 	if (c->x86_vendor != X86_VENDOR_INTEL || c->x86 < 6 ||
@@ -610,7 +610,7 @@ static int apply_microcode_check_cpu(int
 			" sig=0x%x, pf=0x%x, rev=0x%x\n",
 			cpu, uci->sig, uci->pf, uci->rev);
 
-	set_cpus_allowed(current, old);
+	set_cpus_allowed(current, &old);
 	return err;
 }
 
@@ -621,13 +621,13 @@ static void microcode_init_cpu(int cpu, 
 
 	old = current->cpus_allowed;
 
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	mutex_lock(&microcode_mutex);
 	collect_cpu_info(cpu);
 	if (uci->valid && system_state == SYSTEM_RUNNING && !resume)
 		cpu_request_microcode(cpu);
 	mutex_unlock(&microcode_mutex);
-	set_cpus_allowed(current, old);
+	set_cpus_allowed(current, &old);
 }
 
 static void microcode_fini_cpu(int cpu)
@@ -657,14 +657,14 @@ static ssize_t reload_store(struct sys_d
 		old = current->cpus_allowed;
 
 		get_online_cpus();
-		set_cpus_allowed(current, cpumask_of_cpu(cpu));
+		set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 		mutex_lock(&microcode_mutex);
 		if (uci->valid)
 			err = cpu_request_microcode(cpu);
 		mutex_unlock(&microcode_mutex);
 		put_online_cpus();
-		set_cpus_allowed(current, old);
+		set_cpus_allowed(current, &old);
 	}
 	if (err)
 		return err;
--- linux.trees.git.orig/arch/x86/kernel/process_64.c
+++ linux.trees.git/arch/x86/kernel/process_64.c
@@ -37,6 +37,7 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/tick.h>
+#include <linux/sched.h>
 
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
--- linux.trees.git.orig/arch/x86/kernel/reboot.c
+++ linux.trees.git/arch/x86/kernel/reboot.c
@@ -420,7 +420,7 @@ static void native_machine_shutdown(void
 		reboot_cpu_id = smp_processor_id();
 
 	/* Make certain I only run on the appropriate processor */
-	set_cpus_allowed(current, cpumask_of_cpu(reboot_cpu_id));
+	set_cpus_allowed(current, &cpumask_of_cpu(reboot_cpu_id));
 
 	/* O.K Now that I'm on the appropriate processor,
 	 * stop all of the others.
--- linux.trees.git.orig/drivers/acpi/processor_throttling.c
+++ linux.trees.git/drivers/acpi/processor_throttling.c
@@ -838,10 +838,10 @@ static int acpi_processor_get_throttling
 	 * Migrate task to the cpu pointed by pr.
 	 */
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(pr->id));
+	set_cpus_allowed(current, &cpumask_of_cpu(pr->id));
 	ret = pr->throttling.acpi_processor_get_throttling(pr);
 	/* restore the previous state */
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 
 	return ret;
 }
@@ -1025,7 +1025,7 @@ int acpi_processor_set_throttling(struct
 	 * it can be called only for the cpu pointed by pr.
 	 */
 	if (p_throttling->shared_type == DOMAIN_COORD_TYPE_SW_ANY) {
-		set_cpus_allowed(current, cpumask_of_cpu(pr->id));
+		set_cpus_allowed(current, &cpumask_of_cpu(pr->id));
 		ret = p_throttling->acpi_processor_set_throttling(pr,
 						t_state.target_state);
 	} else {
@@ -1056,7 +1056,7 @@ int acpi_processor_set_throttling(struct
 				continue;
 			}
 			t_state.cpu = i;
-			set_cpus_allowed(current, cpumask_of_cpu(i));
+			set_cpus_allowed(current, &cpumask_of_cpu(i));
 			ret = match_pr->throttling.
 				acpi_processor_set_throttling(
 				match_pr, t_state.target_state);
@@ -1074,7 +1074,7 @@ int acpi_processor_set_throttling(struct
 							&t_state);
 	}
 	/* restore the previous state */
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return ret;
 }
 
--- linux.trees.git.orig/drivers/firmware/dcdbas.c
+++ linux.trees.git/drivers/firmware/dcdbas.c
@@ -265,7 +265,7 @@ static int smi_request(struct smi_cmd *s
 
 	/* SMI requires CPU 0 */
 	old_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(0));
+	set_cpus_allowed(current, &cpumask_of_cpu(0));
 	if (smp_processor_id() != 0) {
 		dev_dbg(&dcdbas_pdev->dev, "%s: failed to get CPU 0\n",
 			__FUNCTION__);
@@ -285,7 +285,7 @@ static int smi_request(struct smi_cmd *s
 	);
 
 out:
-	set_cpus_allowed(current, old_mask);
+	set_cpus_allowed(current, &old_mask);
 	return ret;
 }
 
--- linux.trees.git.orig/drivers/pci/pci-driver.c
+++ linux.trees.git/drivers/pci/pci-driver.c
@@ -182,15 +182,19 @@ static int pci_call_probe(struct pci_dri
 	struct mempolicy *oldpol;
 	cpumask_t oldmask = current->cpus_allowed;
 	int node = dev_to_node(&dev->dev);
-	if (node >= 0)
-	    set_cpus_allowed(current, node_to_cpumask(node));
+
+	if (node >= 0) {
+		cpumask_t nodecpumask = node_to_cpumask(node);
+		set_cpus_allowed(current, &nodecpumask);
+	}
+
 	/* And set default memory allocation policy */
 	oldpol = current->mempolicy;
 	current->mempolicy = NULL;	/* fall back to system default policy */
 #endif
 	error = drv->probe(dev, id);
 #ifdef CONFIG_NUMA
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 	current->mempolicy = oldpol;
 #endif
 	return error;
--- linux.trees.git.orig/include/linux/cpuset.h
+++ linux.trees.git/include/linux/cpuset.h
@@ -20,8 +20,8 @@ extern int number_of_cpusets;	/* How man
 extern int cpuset_init_early(void);
 extern int cpuset_init(void);
 extern void cpuset_init_smp(void);
-extern cpumask_t cpuset_cpus_allowed(struct task_struct *p);
-extern cpumask_t cpuset_cpus_allowed_locked(struct task_struct *p);
+extern void cpuset_cpus_allowed(struct task_struct *p, cpumask_t *mask);
+extern void cpuset_cpus_allowed_locked(struct task_struct *p, cpumask_t *mask);
 extern nodemask_t cpuset_mems_allowed(struct task_struct *p);
 #define cpuset_current_mems_allowed (current->mems_allowed)
 void cpuset_init_current_mems_allowed(void);
@@ -84,13 +84,14 @@ static inline int cpuset_init_early(void
 static inline int cpuset_init(void) { return 0; }
 static inline void cpuset_init_smp(void) {}
 
-static inline cpumask_t cpuset_cpus_allowed(struct task_struct *p)
+static inline void cpuset_cpus_allowed(struct task_struct *p, cpumask_t *mask)
 {
-	return cpu_possible_map;
+	*mask = cpu_possible_map;
 }
-static inline cpumask_t cpuset_cpus_allowed_locked(struct task_struct *p)
+static inline void cpuset_cpus_allowed_locked(struct task_struct *p,
+								cpumask_t *mask)
 {
-	return cpu_possible_map;
+	*mask = cpu_possible_map;
 }
 
 static inline nodemask_t cpuset_mems_allowed(struct task_struct *p)
--- linux.trees.git.orig/include/linux/sched.h
+++ linux.trees.git/include/linux/sched.h
@@ -889,7 +889,8 @@ struct sched_class {
 	void (*set_curr_task) (struct rq *rq);
 	void (*task_tick) (struct rq *rq, struct task_struct *p, int queued);
 	void (*task_new) (struct rq *rq, struct task_struct *p);
-	void (*set_cpus_allowed)(struct task_struct *p, cpumask_t *newmask);
+	void (*set_cpus_allowed)(struct task_struct *p,
+						const cpumask_t *newmask);
 
 	void (*join_domain)(struct rq *rq);
 	void (*leave_domain)(struct rq *rq);
@@ -1501,11 +1502,12 @@ static inline void put_task_struct(struc
 #define used_math() tsk_used_math(current)
 
 #ifdef CONFIG_SMP
-extern int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask);
+extern int set_cpus_allowed(struct task_struct *p, const cpumask_t *new_mask);
 #else
-static inline int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+static inline int set_cpus_allowed(struct task_struct *p,
+				   const cpumask_t *new_mask)
 {
-	if (!cpu_isset(0, new_mask))
+	if (!cpu_isset(0, *new_mask))
 		return -EINVAL;
 	return 0;
 }
--- linux.trees.git.orig/init/main.c
+++ linux.trees.git/init/main.c
@@ -845,7 +845,7 @@ static int __init kernel_init(void * unu
 	/*
 	 * init can run on any cpu.
 	 */
-	set_cpus_allowed(current, CPU_MASK_ALL);
+	set_cpus_allowed(current, &CPU_MASK_ALL);
 	/*
 	 * Tell the world that we're going to be the grim
 	 * reaper of innocent orphaned children.
--- linux.trees.git.orig/kernel/cpu.c
+++ linux.trees.git/kernel/cpu.c
@@ -234,7 +234,7 @@ static int _cpu_down(unsigned int cpu, i
 	old_allowed = current->cpus_allowed;
 	tmp = CPU_MASK_ALL;
 	cpu_clear(cpu, tmp);
-	set_cpus_allowed(current, tmp);
+	set_cpus_allowed(current, &tmp);
 
 	p = __stop_machine_run(take_cpu_down, &tcd_param, cpu);
 
@@ -268,7 +268,7 @@ static int _cpu_down(unsigned int cpu, i
 out_thread:
 	err = kthread_stop(p);
 out_allowed:
-	set_cpus_allowed(current, old_allowed);
+	set_cpus_allowed(current, &old_allowed);
 out_release:
 	cpu_hotplug_done();
 	return err;
--- linux.trees.git.orig/kernel/cpuset.c
+++ linux.trees.git/kernel/cpuset.c
@@ -729,7 +729,8 @@ int cpuset_test_cpumask(struct task_stru
  */
 void cpuset_change_cpumask(struct task_struct *tsk, struct cgroup_scanner *scan)
 {
-	set_cpus_allowed(tsk, (cgroup_cs(scan->cg))->cpus_allowed);
+	cpumask_t newmask = cgroup_cs(scan->cg)->cpus_allowed;
+	set_cpus_allowed(tsk, &newmask);
 }
 
 /**
@@ -1178,7 +1179,7 @@ static void cpuset_attach(struct cgroup_
 
 	mutex_lock(&callback_mutex);
 	guarantee_online_cpus(cs, &cpus);
-	set_cpus_allowed(tsk, cpus);
+	set_cpus_allowed(tsk, &cpus);
 	mutex_unlock(&callback_mutex);
 
 	from = oldcs->mems_allowed;
@@ -1844,6 +1845,7 @@ void __init cpuset_init_smp(void)
 
  * cpuset_cpus_allowed - return cpus_allowed mask from a tasks cpuset.
  * @tsk: pointer to task_struct from which to obtain cpuset->cpus_allowed.
+ * @mask: pointer to cpumask to be returned.
  *
  * Description: Returns the cpumask_t cpus_allowed of the cpuset
  * attached to the specified @tsk.  Guaranteed to return some non-empty
@@ -1851,30 +1853,22 @@ void __init cpuset_init_smp(void)
  * tasks cpuset.
  **/
 
-cpumask_t cpuset_cpus_allowed(struct task_struct *tsk)
+void cpuset_cpus_allowed(struct task_struct *tsk, cpumask_t *mask)
 {
-	cpumask_t mask;
-
 	mutex_lock(&callback_mutex);
-	mask = cpuset_cpus_allowed_locked(tsk);
+	cpuset_cpus_allowed_locked(tsk, mask);
 	mutex_unlock(&callback_mutex);
-
-	return mask;
 }
 
 /**
  * cpuset_cpus_allowed_locked - return cpus_allowed mask from a tasks cpuset.
  * Must be called with callback_mutex held.
  **/
-cpumask_t cpuset_cpus_allowed_locked(struct task_struct *tsk)
+void cpuset_cpus_allowed_locked(struct task_struct *tsk, cpumask_t *mask)
 {
-	cpumask_t mask;
-
 	task_lock(tsk);
-	guarantee_online_cpus(task_cs(tsk), &mask);
+	guarantee_online_cpus(task_cs(tsk), mask);
 	task_unlock(tsk);
-
-	return mask;
 }
 
 void cpuset_init_current_mems_allowed(void)
--- linux.trees.git.orig/kernel/kmod.c
+++ linux.trees.git/kernel/kmod.c
@@ -165,7 +165,7 @@ static int ____call_usermodehelper(void 
 	}
 
 	/* We can run anywhere, unlike our parent keventd(). */
-	set_cpus_allowed(current, CPU_MASK_ALL);
+	set_cpus_allowed(current, &CPU_MASK_ALL);
 
 	/*
 	 * Our parent is keventd, which runs with elevated scheduling priority.
--- linux.trees.git.orig/kernel/kthread.c
+++ linux.trees.git/kernel/kthread.c
@@ -107,7 +107,7 @@ static void create_kthread(struct kthrea
 		 */
 		sched_setscheduler(create->result, SCHED_NORMAL, &param);
 		set_user_nice(create->result, KTHREAD_NICE_LEVEL);
-		set_cpus_allowed(create->result, CPU_MASK_ALL);
+		set_cpus_allowed(create->result, &CPU_MASK_ALL);
 	}
 	complete(&create->done);
 }
@@ -232,7 +232,7 @@ int kthreadd(void *unused)
 	set_task_comm(tsk, "kthreadd");
 	ignore_signals(tsk);
 	set_user_nice(tsk, KTHREAD_NICE_LEVEL);
-	set_cpus_allowed(tsk, CPU_MASK_ALL);
+	set_cpus_allowed(tsk, &CPU_MASK_ALL);
 
 	current->flags |= PF_NOFREEZE;
 
--- linux.trees.git.orig/kernel/rcutorture.c
+++ linux.trees.git/kernel/rcutorture.c
@@ -737,25 +737,26 @@ static void rcu_torture_shuffle_tasks(vo
 	if (rcu_idle_cpu != -1)
 		cpu_clear(rcu_idle_cpu, tmp_mask);
 
-	set_cpus_allowed(current, tmp_mask);
+	set_cpus_allowed(current, &tmp_mask);
 
 	if (reader_tasks) {
 		for (i = 0; i < nrealreaders; i++)
 			if (reader_tasks[i])
-				set_cpus_allowed(reader_tasks[i], tmp_mask);
+				set_cpus_allowed(reader_tasks[i], &tmp_mask);
 	}
 
 	if (fakewriter_tasks) {
 		for (i = 0; i < nfakewriters; i++)
 			if (fakewriter_tasks[i])
-				set_cpus_allowed(fakewriter_tasks[i], tmp_mask);
+				set_cpus_allowed(fakewriter_tasks[i],
+						 &tmp_mask);
 	}
 
 	if (writer_task)
-		set_cpus_allowed(writer_task, tmp_mask);
+		set_cpus_allowed(writer_task, &tmp_mask);
 
 	if (stats_task)
-		set_cpus_allowed(stats_task, tmp_mask);
+		set_cpus_allowed(stats_task, &tmp_mask);
 
 	if (rcu_idle_cpu == -1)
 		rcu_idle_cpu = num_online_cpus() - 1;
--- linux.trees.git.orig/kernel/sched.c
+++ linux.trees.git/kernel/sched.c
@@ -4739,13 +4739,13 @@ long sched_setaffinity(pid_t pid, cpumas
 	if (retval)
 		goto out_unlock;
 
-	cpus_allowed = cpuset_cpus_allowed(p);
+	cpuset_cpus_allowed(p, &cpus_allowed);
 	cpus_and(new_mask, new_mask, cpus_allowed);
  again:
-	retval = set_cpus_allowed(p, new_mask);
+	retval = set_cpus_allowed(p, &new_mask);
 
 	if (!retval) {
-		cpus_allowed = cpuset_cpus_allowed(p);
+		cpuset_cpus_allowed(p, &cpus_allowed);
 		if (!cpus_subset(new_mask, cpus_allowed)) {
 			/*
 			 * We must have raced with a concurrent cpuset
@@ -5280,7 +5280,7 @@ static inline void sched_init_granularit
  * task must not exit() & deallocate itself prematurely. The
  * call is not atomic; no spinlocks may be held.
  */
-int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+int set_cpus_allowed(struct task_struct *p, const cpumask_t *new_mask)
 {
 	struct migration_req req;
 	unsigned long flags;
@@ -5288,23 +5288,23 @@ int set_cpus_allowed(struct task_struct 
 	int ret = 0;
 
 	rq = task_rq_lock(p, &flags);
-	if (!cpus_intersects(new_mask, cpu_online_map)) {
+	if (!cpus_intersects(*new_mask, cpu_online_map)) {
 		ret = -EINVAL;
 		goto out;
 	}
 
 	if (p->sched_class->set_cpus_allowed)
-		p->sched_class->set_cpus_allowed(p, &new_mask);
+		p->sched_class->set_cpus_allowed(p, new_mask);
 	else {
-		p->cpus_allowed = new_mask;
-		p->rt.nr_cpus_allowed = cpus_weight(new_mask);
+		p->cpus_allowed = *new_mask;
+		p->rt.nr_cpus_allowed = cpus_weight(*new_mask);
 	}
 
 	/* Can the task run on the task's current CPU? If so, we're done */
-	if (cpu_isset(task_cpu(p), new_mask))
+	if (cpu_isset(task_cpu(p), *new_mask))
 		goto out;
 
-	if (migrate_task(p, any_online_cpu(new_mask), &req)) {
+	if (migrate_task(p, any_online_cpu(*new_mask), &req)) {
 		/* Need help from migration thread: drop lock and wait. */
 		task_rq_unlock(rq, &flags);
 		wake_up_process(rq->migration_thread);
@@ -5460,7 +5460,8 @@ static void move_task_off_dead_cpu(int d
 
 		/* No more Mr. Nice Guy. */
 		if (dest_cpu >= nr_cpu_ids) {
-			cpumask_t cpus_allowed = cpuset_cpus_allowed_locked(p);
+			cpumask_t cpus_allowed;
+			cpuset_cpus_allowed_locked(p, &cpus_allowed);
 			/*
 			 * Try to stay on the same cpuset, where the
 			 * current cpuset may be a subset of all cpus.
@@ -7049,7 +7050,7 @@ void __init sched_init_smp(void)
 	hotcpu_notifier(update_sched_domains, 0);
 
 	/* Move init over to a non-isolated CPU */
-	if (set_cpus_allowed(current, non_isolated_cpus) < 0)
+	if (set_cpus_allowed(current, &non_isolated_cpus) < 0)
 		BUG();
 	sched_init_granularity();
 }
--- linux.trees.git.orig/kernel/sched_rt.c
+++ linux.trees.git/kernel/sched_rt.c
@@ -1001,7 +1001,8 @@ move_one_task_rt(struct rq *this_rq, int
 	return 0;
 }
 
-static void set_cpus_allowed_rt(struct task_struct *p, cpumask_t *new_mask)
+static void set_cpus_allowed_rt(struct task_struct *p,
+				const cpumask_t *new_mask)
 {
 	int weight = cpus_weight(*new_mask);
 
--- linux.trees.git.orig/kernel/stop_machine.c
+++ linux.trees.git/kernel/stop_machine.c
@@ -35,7 +35,7 @@ static int stopmachine(void *cpu)
 	int irqs_disabled = 0;
 	int prepared = 0;
 
-	set_cpus_allowed(current, cpumask_of_cpu((int)(long)cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu((int)(long)cpu));
 
 	/* Ack: we are alive */
 	smp_mb(); /* Theoretically the ack = 0 might not be on this CPU yet. */
--- linux.trees.git.orig/mm/pdflush.c
+++ linux.trees.git/mm/pdflush.c
@@ -187,8 +187,8 @@ static int pdflush(void *dummy)
 	 * This is needed as pdflush's are dynamically created and destroyed.
 	 * The boottime pdflush's are easily placed w/o these 2 lines.
 	 */
-	cpus_allowed = cpuset_cpus_allowed(current);
-	set_cpus_allowed(current, cpus_allowed);
+	cpuset_cpus_allowed(current, &cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 
 	return __pdflush(&my_work);
 }
--- linux.trees.git.orig/mm/vmscan.c
+++ linux.trees.git/mm/vmscan.c
@@ -1668,7 +1668,7 @@ static int kswapd(void *p)
 
 	cpumask = node_to_cpumask(pgdat->node_id);
 	if (!cpus_empty(cpumask))
-		set_cpus_allowed(tsk, cpumask);
+		set_cpus_allowed(tsk, &cpumask);
 	current->reclaim_state = &reclaim_state;
 
 	/*
@@ -1905,9 +1905,9 @@ static int __devinit cpu_callback(struct
 		for_each_node_state(nid, N_HIGH_MEMORY) {
 			pgdat = NODE_DATA(nid);
 			mask = node_to_cpumask(pgdat->node_id);
-			if (any_online_cpu(mask) != NR_CPUS)
+			if (any_online_cpu(mask) < nr_cpu_ids)
 				/* One of our CPUs online: restore mask */
-				set_cpus_allowed(pgdat->kswapd, mask);
+				set_cpus_allowed(pgdat->kswapd, &mask);
 		}
 	}
 	return NOTIFY_OK;
--- linux.trees.git.orig/net/sunrpc/svc.c
+++ linux.trees.git/net/sunrpc/svc.c
@@ -301,7 +301,6 @@ static inline int
 svc_pool_map_set_cpumask(unsigned int pidx, cpumask_t *oldmask)
 {
 	struct svc_pool_map *m = &svc_pool_map;
-	unsigned int node; /* or cpu */
 
 	/*
 	 * The caller checks for sv_nrpools > 1, which
@@ -314,16 +313,23 @@ svc_pool_map_set_cpumask(unsigned int pi
 	default:
 		return 0;
 	case SVC_POOL_PERCPU:
-		node = m->pool_to[pidx];
+	{
+		unsigned int cpu = m->pool_to[pidx];
+
 		*oldmask = current->cpus_allowed;
-		set_cpus_allowed(current, cpumask_of_cpu(node));
+		set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 		return 1;
+	}
 	case SVC_POOL_PERNODE:
-		node = m->pool_to[pidx];
+	{
+		unsigned int node = m->pool_to[pidx];
+		cpumask_t nodecpumask = node_to_cpumask(node);
+
 		*oldmask = current->cpus_allowed;
-		set_cpus_allowed(current, node_to_cpumask(node));
+		set_cpus_allowed(current, &nodecpumask);
 		return 1;
 	}
+	}
 }
 
 /*
@@ -598,7 +604,7 @@ __svc_create_thread(svc_thread_fn func, 
 	error = kernel_thread((int (*)(void *)) func, rqstp, 0);
 
 	if (have_oldmask)
-		set_cpus_allowed(current, oldmask);
+		set_cpus_allowed(current, &oldmask);
 
 	if (error < 0)
 		goto out_thread;

-- 


* [PATCH 02/12] cpumask: pass pointer to cpumask for set_cpus_allowed() v2
@ 2008-03-26  1:38   ` Mike Travis
  0 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, linux-mm, linux-kernel

[-- Attachment #1: set_cpus_allowed --]
[-- Type: text/plain, Size: 43786 bytes --]

Instead of passing the "newly allowed cpus" cpumask argument by value,
pass a pointer:

-int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+int set_cpus_allowed(struct task_struct *p, const cpumask_t *new_mask)
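
For reference, the common conversion pattern at the call sites touched
below looks roughly like this.  This is only a sketch; the function and
variable names are illustrative and do not come from any one file in
the patch:

	#include <linux/sched.h>
	#include <linux/cpumask.h>

	static int run_on_cpu_example(int cpu)
	{
		/* save the current mask so it can be restored later */
		cpumask_t saved_mask = current->cpus_allowed;
		int ret;

		/* new convention: pass a pointer, not a cpumask_t copy */
		ret = set_cpus_allowed(current, &cpumask_of_cpu(cpu));
		if (ret)
			return ret;

		/* ... per-cpu work runs here ... */

		/* restore the previously allowed cpus */
		set_cpus_allowed(current, &saved_mask);
		return 0;
	}

Taking the address of cpumask_of_cpu(cpu) assumes it evaluates to an
addressable cpumask_t, as it does elsewhere in this series.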

This is a major ABI change and unfortunately touches a number of files,
as the function is very commonly used.  I had considered using a macro
to "silently" pass the second argument as a pointer, but that approach
loses out when the caller already has a pointer to the new cpumask.
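
Such a wrapper might look something like the sketch below
(__set_cpus_allowed is an invented name, not something introduced by
this patch); a caller that already holds a cpumask_t pointer would then
have to dereference it only so the macro can take its address again:

	/* hypothetical compatibility macro, not part of this patch */
	#define set_cpus_allowed(p, mask) __set_cpus_allowed((p), &(mask))

	/* caller that already has "cpumask_t *maskp" */
	set_cpus_allowed(p, *maskp);	/* expands to &(*maskp) */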

This removes 10792 bytes of stack usage.

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

Signed-off-by: Mike Travis <travis@sgi.com>
---
v2: rebased on linux-2.6.git + linux-2.6-x86.git
---
 arch/arm/mach-integrator/cpu.c                   |   10 ++++-----
 arch/ia64/kernel/cpufreq/acpi-cpufreq.c          |   10 ++++-----
 arch/ia64/kernel/salinfo.c                       |    4 +--
 arch/ia64/kernel/topology.c                      |    4 +--
 arch/ia64/sn/kernel/sn2/sn_hwperf.c              |    4 +--
 arch/ia64/sn/kernel/xpc_main.c                   |    4 +--
 arch/mips/kernel/mips-mt-fpaff.c                 |    4 +--
 arch/mips/kernel/traps.c                         |    2 -
 arch/powerpc/kernel/smp.c                        |    4 +--
 arch/powerpc/kernel/sysfs.c                      |    4 +--
 arch/powerpc/platforms/pseries/rtasd.c           |    4 +--
 arch/sh/kernel/cpufreq.c                         |    4 +--
 arch/sparc64/kernel/sysfs.c                      |    4 +--
 arch/sparc64/kernel/us2e_cpufreq.c               |    8 +++----
 arch/sparc64/kernel/us3_cpufreq.c                |    8 +++----
 arch/x86/kernel/acpi/cstate.c                    |    4 +--
 arch/x86/kernel/apm_32.c                         |    6 ++---
 arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c       |   12 +++++------
 arch/x86/kernel/cpu/cpufreq/powernow-k8.c        |   20 +++++++++---------
 arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c |   12 +++++------
 arch/x86/kernel/cpu/cpufreq/speedstep-ich.c      |   12 +++++------
 arch/x86/kernel/cpu/intel_cacheinfo.c            |    4 +--
 arch/x86/kernel/cpu/mcheck/mce_amd_64.c          |    4 +--
 arch/x86/kernel/microcode.c                      |   16 +++++++-------
 arch/x86/kernel/process_64.c                     |    1 
 arch/x86/kernel/reboot.c                         |    2 -
 drivers/acpi/processor_throttling.c              |   10 ++++-----
 drivers/firmware/dcdbas.c                        |    4 +--
 drivers/pci/pci-driver.c                         |   10 ++++++---
 include/linux/cpuset.h                           |   13 ++++++-----
 include/linux/sched.h                            |   10 +++++----
 init/main.c                                      |    2 -
 kernel/cpu.c                                     |    4 +--
 kernel/cpuset.c                                  |   22 +++++++-------------
 kernel/kmod.c                                    |    2 -
 kernel/kthread.c                                 |    4 +--
 kernel/rcutorture.c                              |   11 +++++-----
 kernel/sched.c                                   |   25 +++++++++++------------
 kernel/sched_rt.c                                |    3 +-
 kernel/stop_machine.c                            |    2 -
 mm/pdflush.c                                     |    4 +--
 mm/vmscan.c                                      |    6 ++---
 net/sunrpc/svc.c                                 |   18 +++++++++++-----
 43 files changed, 166 insertions(+), 155 deletions(-)

--- linux.trees.git.orig/arch/arm/mach-integrator/cpu.c
+++ linux.trees.git/arch/arm/mach-integrator/cpu.c
@@ -94,7 +94,7 @@ static int integrator_set_target(struct 
 	 * Bind to the specified CPU.  When this call returns,
 	 * we should be running on the right CPU.
 	 */
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	BUG_ON(cpu != smp_processor_id());
 
 	/* get current setting */
@@ -122,7 +122,7 @@ static int integrator_set_target(struct 
 	freqs.cpu = policy->cpu;
 
 	if (freqs.old == freqs.new) {
-		set_cpus_allowed(current, cpus_allowed);
+		set_cpus_allowed(current, &cpus_allowed);
 		return 0;
 	}
 
@@ -145,7 +145,7 @@ static int integrator_set_target(struct 
 	/*
 	 * Restore the CPUs allowed mask.
 	 */
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 
 	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
 
@@ -161,7 +161,7 @@ static unsigned int integrator_get(unsig
 
 	cpus_allowed = current->cpus_allowed;
 
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	BUG_ON(cpu != smp_processor_id());
 
 	/* detect memory etc. */
@@ -177,7 +177,7 @@ static unsigned int integrator_get(unsig
 
 	current_freq = icst525_khz(&cclk_params, vco); /* current freq */
 
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 
 	return current_freq;
 }
--- linux.trees.git.orig/arch/ia64/kernel/cpufreq/acpi-cpufreq.c
+++ linux.trees.git/arch/ia64/kernel/cpufreq/acpi-cpufreq.c
@@ -112,7 +112,7 @@ processor_get_freq (
 	dprintk("processor_get_freq\n");
 
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (smp_processor_id() != cpu)
 		goto migrate_end;
 
@@ -120,7 +120,7 @@ processor_get_freq (
 	ret = processor_get_pstate(&value);
 
 	if (ret) {
-		set_cpus_allowed(current, saved_mask);
+		set_cpus_allowed(current, &saved_mask);
 		printk(KERN_WARNING "get performance failed with error %d\n",
 		       ret);
 		ret = 0;
@@ -130,7 +130,7 @@ processor_get_freq (
 	ret = (clock_freq*1000);
 
 migrate_end:
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return ret;
 }
 
@@ -150,7 +150,7 @@ processor_set_freq (
 	dprintk("processor_set_freq\n");
 
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (smp_processor_id() != cpu) {
 		retval = -EAGAIN;
 		goto migrate_end;
@@ -207,7 +207,7 @@ processor_set_freq (
 	retval = 0;
 
 migrate_end:
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return (retval);
 }
 
--- linux.trees.git.orig/arch/ia64/kernel/salinfo.c
+++ linux.trees.git/arch/ia64/kernel/salinfo.c
@@ -405,9 +405,9 @@ call_on_cpu(int cpu, void (*fn)(void *),
 {
 	cpumask_t save_cpus_allowed = current->cpus_allowed;
 	cpumask_t new_cpus_allowed = cpumask_of_cpu(cpu);
-	set_cpus_allowed(current, new_cpus_allowed);
+	set_cpus_allowed(current, &new_cpus_allowed);
 	(*fn)(arg);
-	set_cpus_allowed(current, save_cpus_allowed);
+	set_cpus_allowed(current, &save_cpus_allowed);
 }
 
 static void
--- linux.trees.git.orig/arch/ia64/kernel/topology.c
+++ linux.trees.git/arch/ia64/kernel/topology.c
@@ -345,12 +345,12 @@ static int __cpuinit cache_add_dev(struc
 		return 0;
 
 	oldmask = current->cpus_allowed;
-	retval = set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	retval = set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (unlikely(retval))
 		return retval;
 
 	retval = cpu_cache_sysfs_init(cpu);
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 	if (unlikely(retval < 0))
 		return retval;
 
--- linux.trees.git.orig/arch/ia64/sn/kernel/sn2/sn_hwperf.c
+++ linux.trees.git/arch/ia64/sn/kernel/sn2/sn_hwperf.c
@@ -635,9 +635,9 @@ static int sn_hwperf_op_cpu(struct sn_hw
 		else {
 			/* migrate the task before calling SAL */ 
 			save_allowed = current->cpus_allowed;
-			set_cpus_allowed(current, cpumask_of_cpu(cpu));
+			set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 			sn_hwperf_call_sal(op_info);
-			set_cpus_allowed(current, save_allowed);
+			set_cpus_allowed(current, &save_allowed);
 		}
 	}
 	r = op_info->ret;
--- linux.trees.git.orig/arch/ia64/sn/kernel/xpc_main.c
+++ linux.trees.git/arch/ia64/sn/kernel/xpc_main.c
@@ -255,7 +255,7 @@ xpc_hb_checker(void *ignore)
 
 	daemonize(XPC_HB_CHECK_THREAD_NAME);
 
-	set_cpus_allowed(current, cpumask_of_cpu(XPC_HB_CHECK_CPU));
+	set_cpus_allowed(current, &cpumask_of_cpu(XPC_HB_CHECK_CPU));
 
 	/* set our heartbeating to other partitions into motion */
 	xpc_hb_check_timeout = jiffies + (xpc_hb_check_interval * HZ);
@@ -509,7 +509,7 @@ xpc_activating(void *__partid)
 	}
 
 	/* allow this thread and its children to run on any CPU */
-	set_cpus_allowed(current, CPU_MASK_ALL);
+	set_cpus_allowed(current, &CPU_MASK_ALL);
 
 	/*
 	 * Register the remote partition's AMOs with SAL so it can handle
--- linux.trees.git.orig/arch/mips/kernel/mips-mt-fpaff.c
+++ linux.trees.git/arch/mips/kernel/mips-mt-fpaff.c
@@ -98,10 +98,10 @@ asmlinkage long mipsmt_sys_sched_setaffi
 	if (test_ti_thread_flag(ti, TIF_FPUBOUND) &&
 	    cpus_intersects(new_mask, mt_fpu_cpumask)) {
 		cpus_and(effective_mask, new_mask, mt_fpu_cpumask);
-		retval = set_cpus_allowed(p, effective_mask);
+		retval = set_cpus_allowed(p, &effective_mask);
 	} else {
 		clear_ti_thread_flag(ti, TIF_FPUBOUND);
-		retval = set_cpus_allowed(p, new_mask);
+		retval = set_cpus_allowed(p, &new_mask);
 	}
 
 out_unlock:
--- linux.trees.git.orig/arch/mips/kernel/traps.c
+++ linux.trees.git/arch/mips/kernel/traps.c
@@ -804,7 +804,7 @@ static void mt_ase_fp_affinity(void)
 
 			cpus_and(tmask, current->thread.user_cpus_allowed,
 			         mt_fpu_cpumask);
-			set_cpus_allowed(current, tmask);
+			set_cpus_allowed(current, &tmask);
 			set_thread_flag(TIF_FPUBOUND);
 		}
 	}
--- linux.trees.git.orig/arch/powerpc/kernel/smp.c
+++ linux.trees.git/arch/powerpc/kernel/smp.c
@@ -618,12 +618,12 @@ void __init smp_cpus_done(unsigned int m
 	 * se we pin us down to CPU 0 for a short while
 	 */
 	old_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(boot_cpuid));
+	set_cpus_allowed(current, &cpumask_of_cpu(boot_cpuid));
 	
 	if (smp_ops)
 		smp_ops->setup_cpu(boot_cpuid);
 
-	set_cpus_allowed(current, old_mask);
+	set_cpus_allowed(current, &old_mask);
 
 	snapshot_timebases();
 
--- linux.trees.git.orig/arch/powerpc/kernel/sysfs.c
+++ linux.trees.git/arch/powerpc/kernel/sysfs.c
@@ -131,12 +131,12 @@ static unsigned long run_on_cpu(unsigned
 	unsigned long ret;
 
 	/* should return -EINVAL to userspace */
-	if (set_cpus_allowed(current, cpumask_of_cpu(cpu)))
+	if (set_cpus_allowed(current, &cpumask_of_cpu(cpu)))
 		return 0;
 
 	ret = func(arg);
 
-	set_cpus_allowed(current, old_affinity);
+	set_cpus_allowed(current, &old_affinity);
 
 	return ret;
 }
--- linux.trees.git.orig/arch/powerpc/platforms/pseries/rtasd.c
+++ linux.trees.git/arch/powerpc/platforms/pseries/rtasd.c
@@ -385,9 +385,9 @@ static void do_event_scan_all_cpus(long 
 	get_online_cpus();
 	cpu = first_cpu(cpu_online_map);
 	for (;;) {
-		set_cpus_allowed(current, cpumask_of_cpu(cpu));
+		set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 		do_event_scan();
-		set_cpus_allowed(current, CPU_MASK_ALL);
+		set_cpus_allowed(current, &CPU_MASK_ALL);
 
 		/* Drop hotplug lock, and sleep for the specified delay */
 		put_online_cpus();
--- linux.trees.git.orig/arch/sh/kernel/cpufreq.c
+++ linux.trees.git/arch/sh/kernel/cpufreq.c
@@ -48,7 +48,7 @@ static int sh_cpufreq_target(struct cpuf
 		return -ENODEV;
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	BUG_ON(smp_processor_id() != cpu);
 
@@ -66,7 +66,7 @@ static int sh_cpufreq_target(struct cpuf
 	freqs.flags	= 0;
 
 	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 	clk_set_rate(cpuclk, freq);
 	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
 
--- linux.trees.git.orig/arch/sparc64/kernel/sysfs.c
+++ linux.trees.git/arch/sparc64/kernel/sysfs.c
@@ -104,12 +104,12 @@ static unsigned long run_on_cpu(unsigned
 	unsigned long ret;
 
 	/* should return -EINVAL to userspace */
-	if (set_cpus_allowed(current, cpumask_of_cpu(cpu)))
+	if (set_cpus_allowed(current, &cpumask_of_cpu(cpu)))
 		return 0;
 
 	ret = func(arg);
 
-	set_cpus_allowed(current, old_affinity);
+	set_cpus_allowed(current, &old_affinity);
 
 	return ret;
 }
--- linux.trees.git.orig/arch/sparc64/kernel/us2e_cpufreq.c
+++ linux.trees.git/arch/sparc64/kernel/us2e_cpufreq.c
@@ -238,12 +238,12 @@ static unsigned int us2e_freq_get(unsign
 		return 0;
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	clock_tick = sparc64_get_clock_tick(cpu) / 1000;
 	estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR);
 
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 
 	return clock_tick / estar_to_divisor(estar);
 }
@@ -259,7 +259,7 @@ static void us2e_set_cpu_divider_index(u
 		return;
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	new_freq = clock_tick = sparc64_get_clock_tick(cpu) / 1000;
 	new_bits = index_to_estar_mode(index);
@@ -281,7 +281,7 @@ static void us2e_set_cpu_divider_index(u
 
 	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
 
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 }
 
 static int us2e_freq_target(struct cpufreq_policy *policy,
--- linux.trees.git.orig/arch/sparc64/kernel/us3_cpufreq.c
+++ linux.trees.git/arch/sparc64/kernel/us3_cpufreq.c
@@ -86,12 +86,12 @@ static unsigned int us3_freq_get(unsigne
 		return 0;
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	reg = read_safari_cfg();
 	ret = get_current_freq(cpu, reg);
 
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 
 	return ret;
 }
@@ -106,7 +106,7 @@ static void us3_set_cpu_divider_index(un
 		return;
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	new_freq = sparc64_get_clock_tick(cpu) / 1000;
 	switch (index) {
@@ -140,7 +140,7 @@ static void us3_set_cpu_divider_index(un
 
 	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
 
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 }
 
 static int us3_freq_target(struct cpufreq_policy *policy,
--- linux.trees.git.orig/arch/x86/kernel/acpi/cstate.c
+++ linux.trees.git/arch/x86/kernel/acpi/cstate.c
@@ -91,7 +91,7 @@ int acpi_processor_ffh_cstate_probe(unsi
 
 	/* Make sure we are running on right CPU */
 	saved_mask = current->cpus_allowed;
-	retval = set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	retval = set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (retval)
 		return -1;
 
@@ -128,7 +128,7 @@ int acpi_processor_ffh_cstate_probe(unsi
 		 cx->address);
 
 out:
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return retval;
 }
 EXPORT_SYMBOL_GPL(acpi_processor_ffh_cstate_probe);
--- linux.trees.git.orig/arch/x86/kernel/apm_32.c
+++ linux.trees.git/arch/x86/kernel/apm_32.c
@@ -496,14 +496,14 @@ static cpumask_t apm_save_cpus(void)
 {
 	cpumask_t x = current->cpus_allowed;
 	/* Some bioses don't like being called from CPU != 0 */
-	set_cpus_allowed(current, cpumask_of_cpu(0));
+	set_cpus_allowed(current, &cpumask_of_cpu(0));
 	BUG_ON(smp_processor_id() != 0);
 	return x;
 }
 
 static inline void apm_restore_cpus(cpumask_t mask)
 {
-	set_cpus_allowed(current, mask);
+	set_cpus_allowed(current, &mask);
 }
 
 #else
@@ -1694,7 +1694,7 @@ static int apm(void *unused)
 	 * Some bioses don't like being called from CPU != 0.
 	 * Method suggested by Ingo Molnar.
 	 */
-	set_cpus_allowed(current, cpumask_of_cpu(0));
+	set_cpus_allowed(current, &cpumask_of_cpu(0));
 	BUG_ON(smp_processor_id() != 0);
 #endif
 
--- linux.trees.git.orig/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
+++ linux.trees.git/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
@@ -192,9 +192,9 @@ static void drv_read(struct drv_cmd *cmd
 	cpumask_t saved_mask = current->cpus_allowed;
 	cmd->val = 0;
 
-	set_cpus_allowed(current, cmd->mask);
+	set_cpus_allowed(current, &cmd->mask);
 	do_drv_read(cmd);
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 }
 
 static void drv_write(struct drv_cmd *cmd)
@@ -203,11 +203,11 @@ static void drv_write(struct drv_cmd *cm
 	unsigned int i;
 
 	for_each_cpu_mask(i, cmd->mask) {
-		set_cpus_allowed(current, cpumask_of_cpu(i));
+		set_cpus_allowed(current, &cpumask_of_cpu(i));
 		do_drv_write(cmd);
 	}
 
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return;
 }
 
@@ -271,7 +271,7 @@ static unsigned int get_measured_perf(un
 	unsigned int retval;
 
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (get_cpu() != cpu) {
 		/* We were not able to run on requested processor */
 		put_cpu();
@@ -329,7 +329,7 @@ static unsigned int get_measured_perf(un
 	retval = per_cpu(drv_data, cpu)->max_freq * perf_percent / 100;
 
 	put_cpu();
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 
 	dprintk("cpu %d: performance percent %d\n", cpu, perf_percent);
 	return retval;
--- linux.trees.git.orig/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
+++ linux.trees.git/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
@@ -483,7 +483,7 @@ static int check_supported_cpu(unsigned 
 	unsigned int rc = 0;
 
 	oldmask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	if (smp_processor_id() != cpu) {
 		printk(KERN_ERR PFX "limiting to cpu %u failed\n", cpu);
@@ -528,7 +528,7 @@ static int check_supported_cpu(unsigned 
 	rc = 1;
 
 out:
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 	return rc;
 }
 
@@ -1030,7 +1030,7 @@ static int powernowk8_target(struct cpuf
 
 	/* only run on specific CPU from here on */
 	oldmask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(pol->cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(pol->cpu));
 
 	if (smp_processor_id() != pol->cpu) {
 		printk(KERN_ERR PFX "limiting to cpu %u failed\n", pol->cpu);
@@ -1085,7 +1085,7 @@ static int powernowk8_target(struct cpuf
 	ret = 0;
 
 err_out:
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 	return ret;
 }
 
@@ -1145,7 +1145,7 @@ static int __cpuinit powernowk8_cpu_init
 
 	/* only run on specific CPU from here on */
 	oldmask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(pol->cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(pol->cpu));
 
 	if (smp_processor_id() != pol->cpu) {
 		printk(KERN_ERR PFX "limiting to cpu %u failed\n", pol->cpu);
@@ -1164,7 +1164,7 @@ static int __cpuinit powernowk8_cpu_init
 		fidvid_msr_init();
 
 	/* run on any CPU again */
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 
 	if (cpu_family == CPU_HW_PSTATE)
 		pol->cpus = cpumask_of_cpu(pol->cpu);
@@ -1205,7 +1205,7 @@ static int __cpuinit powernowk8_cpu_init
 	return 0;
 
 err_out:
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 	powernow_k8_cpu_exit_acpi(data);
 
 	kfree(data);
@@ -1242,10 +1242,10 @@ static unsigned int powernowk8_get (unsi
 	if (!data)
 		return -EINVAL;
 
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (smp_processor_id() != cpu) {
 		printk(KERN_ERR PFX "limiting to CPU %d failed in powernowk8_get\n", cpu);
-		set_cpus_allowed(current, oldmask);
+		set_cpus_allowed(current, &oldmask);
 		return 0;
 	}
 
@@ -1259,7 +1259,7 @@ static unsigned int powernowk8_get (unsi
 
 
 out:
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 	return khz;
 }
 
--- linux.trees.git.orig/arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c
+++ linux.trees.git/arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c
@@ -315,7 +315,7 @@ static unsigned int get_cur_freq(unsigne
 	cpumask_t saved_mask;
 
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (smp_processor_id() != cpu)
 		return 0;
 
@@ -333,7 +333,7 @@ static unsigned int get_cur_freq(unsigne
 		clock_freq = extract_clock(l, cpu, 1);
 	}
 
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return clock_freq;
 }
 
@@ -487,7 +487,7 @@ static int centrino_target (struct cpufr
 		else
 			cpu_set(j, set_mask);
 
-		set_cpus_allowed(current, set_mask);
+		set_cpus_allowed(current, &set_mask);
 		preempt_disable();
 		if (unlikely(!cpu_isset(smp_processor_id(), set_mask))) {
 			dprintk("couldn't limit to CPUs in this domain\n");
@@ -555,7 +555,7 @@ static int centrino_target (struct cpufr
 
 		if (!cpus_empty(covered_cpus)) {
 			for_each_cpu_mask(j, covered_cpus) {
-				set_cpus_allowed(current, cpumask_of_cpu(j));
+				set_cpus_allowed(current, &cpumask_of_cpu(j));
 				wrmsr(MSR_IA32_PERF_CTL, oldmsr, h);
 			}
 		}
@@ -569,12 +569,12 @@ static int centrino_target (struct cpufr
 			cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
 		}
 	}
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return 0;
 
 migrate_end:
 	preempt_enable();
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return 0;
 }
 
--- linux.trees.git.orig/arch/x86/kernel/cpu/cpufreq/speedstep-ich.c
+++ linux.trees.git/arch/x86/kernel/cpu/cpufreq/speedstep-ich.c
@@ -235,9 +235,9 @@ static unsigned int _speedstep_get(cpuma
 	cpumask_t cpus_allowed;
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, cpus);
+	set_cpus_allowed(current, &cpus);
 	speed = speedstep_get_processor_frequency(speedstep_processor);
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 	dprintk("detected %u kHz as current frequency\n", speed);
 	return speed;
 }
@@ -285,12 +285,12 @@ static int speedstep_target (struct cpuf
 	}
 
 	/* switch to physical CPU where state is to be changed */
-	set_cpus_allowed(current, policy->cpus);
+	set_cpus_allowed(current, &policy->cpus);
 
 	speedstep_set_state(newstate);
 
 	/* allow to be run on all CPUs */
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 
 	for_each_cpu_mask(i, policy->cpus) {
 		freqs.cpu = i;
@@ -326,7 +326,7 @@ static int speedstep_cpu_init(struct cpu
 #endif
 
 	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed(current, policy->cpus);
+	set_cpus_allowed(current, &policy->cpus);
 
 	/* detect low and high frequency and transition latency */
 	result = speedstep_get_freqs(speedstep_processor,
@@ -334,7 +334,7 @@ static int speedstep_cpu_init(struct cpu
 				     &speedstep_freqs[SPEEDSTEP_HIGH].frequency,
 				     &policy->cpuinfo.transition_latency,
 				     &speedstep_set_state);
-	set_cpus_allowed(current, cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 	if (result)
 		return result;
 
--- linux.trees.git.orig/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ linux.trees.git/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -525,7 +525,7 @@ static int __cpuinit detect_cache_attrib
 		return -ENOMEM;
 
 	oldmask = current->cpus_allowed;
-	retval = set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	retval = set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	if (retval)
 		goto out;
 
@@ -542,7 +542,7 @@ static int __cpuinit detect_cache_attrib
 		}
 		cache_shared_cpu_map_setup(cpu, j);
 	}
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 
 out:
 	if (retval) {
--- linux.trees.git.orig/arch/x86/kernel/cpu/mcheck/mce_amd_64.c
+++ linux.trees.git/arch/x86/kernel/cpu/mcheck/mce_amd_64.c
@@ -256,13 +256,13 @@ static cpumask_t affinity_set(unsigned i
 	cpumask_t oldmask = current->cpus_allowed;
 	cpumask_t newmask = CPU_MASK_NONE;
 	cpu_set(cpu, newmask);
-	set_cpus_allowed(current, newmask);
+	set_cpus_allowed(current, &newmask);
 	return oldmask;
 }
 
 static void affinity_restore(cpumask_t oldmask)
 {
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 }
 
 #define SHOW_FIELDS(name)                                           \
--- linux.trees.git.orig/arch/x86/kernel/microcode.c
+++ linux.trees.git/arch/x86/kernel/microcode.c
@@ -402,7 +402,7 @@ static int do_microcode_update (void)
 
 			if (!uci->valid)
 				continue;
-			set_cpus_allowed(current, cpumask_of_cpu(cpu));
+			set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 			error = get_maching_microcode(new_mc, cpu);
 			if (error < 0)
 				goto out;
@@ -416,7 +416,7 @@ out:
 		vfree(new_mc);
 	if (cursor < 0)
 		error = cursor;
-	set_cpus_allowed(current, old);
+	set_cpus_allowed(current, &old);
 	return error;
 }
 
@@ -579,7 +579,7 @@ static int apply_microcode_check_cpu(int
 		return 0;
 
 	old = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 	/* Check if the microcode we have in memory matches the CPU */
 	if (c->x86_vendor != X86_VENDOR_INTEL || c->x86 < 6 ||
@@ -610,7 +610,7 @@ static int apply_microcode_check_cpu(int
 			" sig=0x%x, pf=0x%x, rev=0x%x\n",
 			cpu, uci->sig, uci->pf, uci->rev);
 
-	set_cpus_allowed(current, old);
+	set_cpus_allowed(current, &old);
 	return err;
 }
 
@@ -621,13 +621,13 @@ static void microcode_init_cpu(int cpu, 
 
 	old = current->cpus_allowed;
 
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 	mutex_lock(&microcode_mutex);
 	collect_cpu_info(cpu);
 	if (uci->valid && system_state == SYSTEM_RUNNING && !resume)
 		cpu_request_microcode(cpu);
 	mutex_unlock(&microcode_mutex);
-	set_cpus_allowed(current, old);
+	set_cpus_allowed(current, &old);
 }
 
 static void microcode_fini_cpu(int cpu)
@@ -657,14 +657,14 @@ static ssize_t reload_store(struct sys_d
 		old = current->cpus_allowed;
 
 		get_online_cpus();
-		set_cpus_allowed(current, cpumask_of_cpu(cpu));
+		set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 
 		mutex_lock(&microcode_mutex);
 		if (uci->valid)
 			err = cpu_request_microcode(cpu);
 		mutex_unlock(&microcode_mutex);
 		put_online_cpus();
-		set_cpus_allowed(current, old);
+		set_cpus_allowed(current, &old);
 	}
 	if (err)
 		return err;
--- linux.trees.git.orig/arch/x86/kernel/process_64.c
+++ linux.trees.git/arch/x86/kernel/process_64.c
@@ -37,6 +37,7 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/tick.h>
+#include <linux/sched.h>
 
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
--- linux.trees.git.orig/arch/x86/kernel/reboot.c
+++ linux.trees.git/arch/x86/kernel/reboot.c
@@ -420,7 +420,7 @@ static void native_machine_shutdown(void
 		reboot_cpu_id = smp_processor_id();
 
 	/* Make certain I only run on the appropriate processor */
-	set_cpus_allowed(current, cpumask_of_cpu(reboot_cpu_id));
+	set_cpus_allowed(current, &cpumask_of_cpu(reboot_cpu_id));
 
 	/* O.K Now that I'm on the appropriate processor,
 	 * stop all of the others.
--- linux.trees.git.orig/drivers/acpi/processor_throttling.c
+++ linux.trees.git/drivers/acpi/processor_throttling.c
@@ -838,10 +838,10 @@ static int acpi_processor_get_throttling
 	 * Migrate task to the cpu pointed by pr.
 	 */
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(pr->id));
+	set_cpus_allowed(current, &cpumask_of_cpu(pr->id));
 	ret = pr->throttling.acpi_processor_get_throttling(pr);
 	/* restore the previous state */
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 
 	return ret;
 }
@@ -1025,7 +1025,7 @@ int acpi_processor_set_throttling(struct
 	 * it can be called only for the cpu pointed by pr.
 	 */
 	if (p_throttling->shared_type == DOMAIN_COORD_TYPE_SW_ANY) {
-		set_cpus_allowed(current, cpumask_of_cpu(pr->id));
+		set_cpus_allowed(current, &cpumask_of_cpu(pr->id));
 		ret = p_throttling->acpi_processor_set_throttling(pr,
 						t_state.target_state);
 	} else {
@@ -1056,7 +1056,7 @@ int acpi_processor_set_throttling(struct
 				continue;
 			}
 			t_state.cpu = i;
-			set_cpus_allowed(current, cpumask_of_cpu(i));
+			set_cpus_allowed(current, &cpumask_of_cpu(i));
 			ret = match_pr->throttling.
 				acpi_processor_set_throttling(
 				match_pr, t_state.target_state);
@@ -1074,7 +1074,7 @@ int acpi_processor_set_throttling(struct
 							&t_state);
 	}
 	/* restore the previous state */
-	set_cpus_allowed(current, saved_mask);
+	set_cpus_allowed(current, &saved_mask);
 	return ret;
 }
 
--- linux.trees.git.orig/drivers/firmware/dcdbas.c
+++ linux.trees.git/drivers/firmware/dcdbas.c
@@ -265,7 +265,7 @@ static int smi_request(struct smi_cmd *s
 
 	/* SMI requires CPU 0 */
 	old_mask = current->cpus_allowed;
-	set_cpus_allowed(current, cpumask_of_cpu(0));
+	set_cpus_allowed(current, &cpumask_of_cpu(0));
 	if (smp_processor_id() != 0) {
 		dev_dbg(&dcdbas_pdev->dev, "%s: failed to get CPU 0\n",
 			__FUNCTION__);
@@ -285,7 +285,7 @@ static int smi_request(struct smi_cmd *s
 	);
 
 out:
-	set_cpus_allowed(current, old_mask);
+	set_cpus_allowed(current, &old_mask);
 	return ret;
 }
 
--- linux.trees.git.orig/drivers/pci/pci-driver.c
+++ linux.trees.git/drivers/pci/pci-driver.c
@@ -182,15 +182,19 @@ static int pci_call_probe(struct pci_dri
 	struct mempolicy *oldpol;
 	cpumask_t oldmask = current->cpus_allowed;
 	int node = dev_to_node(&dev->dev);
-	if (node >= 0)
-	    set_cpus_allowed(current, node_to_cpumask(node));
+
+	if (node >= 0) {
+		cpumask_t nodecpumask = node_to_cpumask(node);
+		set_cpus_allowed(current, &nodecpumask);
+	}
+
 	/* And set default memory allocation policy */
 	oldpol = current->mempolicy;
 	current->mempolicy = NULL;	/* fall back to system default policy */
 #endif
 	error = drv->probe(dev, id);
 #ifdef CONFIG_NUMA
-	set_cpus_allowed(current, oldmask);
+	set_cpus_allowed(current, &oldmask);
 	current->mempolicy = oldpol;
 #endif
 	return error;
--- linux.trees.git.orig/include/linux/cpuset.h
+++ linux.trees.git/include/linux/cpuset.h
@@ -20,8 +20,8 @@ extern int number_of_cpusets;	/* How man
 extern int cpuset_init_early(void);
 extern int cpuset_init(void);
 extern void cpuset_init_smp(void);
-extern cpumask_t cpuset_cpus_allowed(struct task_struct *p);
-extern cpumask_t cpuset_cpus_allowed_locked(struct task_struct *p);
+extern void cpuset_cpus_allowed(struct task_struct *p, cpumask_t *mask);
+extern void cpuset_cpus_allowed_locked(struct task_struct *p, cpumask_t *mask);
 extern nodemask_t cpuset_mems_allowed(struct task_struct *p);
 #define cpuset_current_mems_allowed (current->mems_allowed)
 void cpuset_init_current_mems_allowed(void);
@@ -84,13 +84,14 @@ static inline int cpuset_init_early(void
 static inline int cpuset_init(void) { return 0; }
 static inline void cpuset_init_smp(void) {}
 
-static inline cpumask_t cpuset_cpus_allowed(struct task_struct *p)
+static inline void cpuset_cpus_allowed(struct task_struct *p, cpumask_t *mask)
 {
-	return cpu_possible_map;
+	*mask = cpu_possible_map;
 }
-static inline cpumask_t cpuset_cpus_allowed_locked(struct task_struct *p)
+static inline void cpuset_cpus_allowed_locked(struct task_struct *p,
+								cpumask_t *mask)
 {
-	return cpu_possible_map;
+	*mask = cpu_possible_map;
 }
 
 static inline nodemask_t cpuset_mems_allowed(struct task_struct *p)
--- linux.trees.git.orig/include/linux/sched.h
+++ linux.trees.git/include/linux/sched.h
@@ -889,7 +889,8 @@ struct sched_class {
 	void (*set_curr_task) (struct rq *rq);
 	void (*task_tick) (struct rq *rq, struct task_struct *p, int queued);
 	void (*task_new) (struct rq *rq, struct task_struct *p);
-	void (*set_cpus_allowed)(struct task_struct *p, cpumask_t *newmask);
+	void (*set_cpus_allowed)(struct task_struct *p,
+						const cpumask_t *newmask);
 
 	void (*join_domain)(struct rq *rq);
 	void (*leave_domain)(struct rq *rq);
@@ -1501,11 +1502,12 @@ static inline void put_task_struct(struc
 #define used_math() tsk_used_math(current)
 
 #ifdef CONFIG_SMP
-extern int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask);
+extern int set_cpus_allowed(struct task_struct *p, const cpumask_t *new_mask);
 #else
-static inline int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+static inline int set_cpus_allowed(struct task_struct *p,
+				   const cpumask_t *new_mask)
 {
-	if (!cpu_isset(0, new_mask))
+	if (!cpu_isset(0, *new_mask))
 		return -EINVAL;
 	return 0;
 }
--- linux.trees.git.orig/init/main.c
+++ linux.trees.git/init/main.c
@@ -845,7 +845,7 @@ static int __init kernel_init(void * unu
 	/*
 	 * init can run on any cpu.
 	 */
-	set_cpus_allowed(current, CPU_MASK_ALL);
+	set_cpus_allowed(current, &CPU_MASK_ALL);
 	/*
 	 * Tell the world that we're going to be the grim
 	 * reaper of innocent orphaned children.
--- linux.trees.git.orig/kernel/cpu.c
+++ linux.trees.git/kernel/cpu.c
@@ -234,7 +234,7 @@ static int _cpu_down(unsigned int cpu, i
 	old_allowed = current->cpus_allowed;
 	tmp = CPU_MASK_ALL;
 	cpu_clear(cpu, tmp);
-	set_cpus_allowed(current, tmp);
+	set_cpus_allowed(current, &tmp);
 
 	p = __stop_machine_run(take_cpu_down, &tcd_param, cpu);
 
@@ -268,7 +268,7 @@ static int _cpu_down(unsigned int cpu, i
 out_thread:
 	err = kthread_stop(p);
 out_allowed:
-	set_cpus_allowed(current, old_allowed);
+	set_cpus_allowed(current, &old_allowed);
 out_release:
 	cpu_hotplug_done();
 	return err;
--- linux.trees.git.orig/kernel/cpuset.c
+++ linux.trees.git/kernel/cpuset.c
@@ -729,7 +729,8 @@ int cpuset_test_cpumask(struct task_stru
  */
 void cpuset_change_cpumask(struct task_struct *tsk, struct cgroup_scanner *scan)
 {
-	set_cpus_allowed(tsk, (cgroup_cs(scan->cg))->cpus_allowed);
+	cpumask_t newmask = cgroup_cs(scan->cg)->cpus_allowed;
+	set_cpus_allowed(tsk, &newmask);
 }
 
 /**
@@ -1178,7 +1179,7 @@ static void cpuset_attach(struct cgroup_
 
 	mutex_lock(&callback_mutex);
 	guarantee_online_cpus(cs, &cpus);
-	set_cpus_allowed(tsk, cpus);
+	set_cpus_allowed(tsk, &cpus);
 	mutex_unlock(&callback_mutex);
 
 	from = oldcs->mems_allowed;
@@ -1844,6 +1845,7 @@ void __init cpuset_init_smp(void)
 
  * cpuset_cpus_allowed - return cpus_allowed mask from a tasks cpuset.
  * @tsk: pointer to task_struct from which to obtain cpuset->cpus_allowed.
+ * @mask: pointer to cpumask to be returned.
  *
  * Description: Returns the cpumask_t cpus_allowed of the cpuset
  * attached to the specified @tsk.  Guaranteed to return some non-empty
@@ -1851,30 +1853,22 @@ void __init cpuset_init_smp(void)
  * tasks cpuset.
  **/
 
-cpumask_t cpuset_cpus_allowed(struct task_struct *tsk)
+void cpuset_cpus_allowed(struct task_struct *tsk, cpumask_t *mask)
 {
-	cpumask_t mask;
-
 	mutex_lock(&callback_mutex);
-	mask = cpuset_cpus_allowed_locked(tsk);
+	cpuset_cpus_allowed_locked(tsk, mask);
 	mutex_unlock(&callback_mutex);
-
-	return mask;
 }
 
 /**
  * cpuset_cpus_allowed_locked - return cpus_allowed mask from a tasks cpuset.
  * Must be called with callback_mutex held.
  **/
-cpumask_t cpuset_cpus_allowed_locked(struct task_struct *tsk)
+void cpuset_cpus_allowed_locked(struct task_struct *tsk, cpumask_t *mask)
 {
-	cpumask_t mask;
-
 	task_lock(tsk);
-	guarantee_online_cpus(task_cs(tsk), &mask);
+	guarantee_online_cpus(task_cs(tsk), mask);
 	task_unlock(tsk);
-
-	return mask;
 }
 
 void cpuset_init_current_mems_allowed(void)
--- linux.trees.git.orig/kernel/kmod.c
+++ linux.trees.git/kernel/kmod.c
@@ -165,7 +165,7 @@ static int ____call_usermodehelper(void 
 	}
 
 	/* We can run anywhere, unlike our parent keventd(). */
-	set_cpus_allowed(current, CPU_MASK_ALL);
+	set_cpus_allowed(current, &CPU_MASK_ALL);
 
 	/*
 	 * Our parent is keventd, which runs with elevated scheduling priority.
--- linux.trees.git.orig/kernel/kthread.c
+++ linux.trees.git/kernel/kthread.c
@@ -107,7 +107,7 @@ static void create_kthread(struct kthrea
 		 */
 		sched_setscheduler(create->result, SCHED_NORMAL, &param);
 		set_user_nice(create->result, KTHREAD_NICE_LEVEL);
-		set_cpus_allowed(create->result, CPU_MASK_ALL);
+		set_cpus_allowed(create->result, &CPU_MASK_ALL);
 	}
 	complete(&create->done);
 }
@@ -232,7 +232,7 @@ int kthreadd(void *unused)
 	set_task_comm(tsk, "kthreadd");
 	ignore_signals(tsk);
 	set_user_nice(tsk, KTHREAD_NICE_LEVEL);
-	set_cpus_allowed(tsk, CPU_MASK_ALL);
+	set_cpus_allowed(tsk, &CPU_MASK_ALL);
 
 	current->flags |= PF_NOFREEZE;
 
--- linux.trees.git.orig/kernel/rcutorture.c
+++ linux.trees.git/kernel/rcutorture.c
@@ -737,25 +737,26 @@ static void rcu_torture_shuffle_tasks(vo
 	if (rcu_idle_cpu != -1)
 		cpu_clear(rcu_idle_cpu, tmp_mask);
 
-	set_cpus_allowed(current, tmp_mask);
+	set_cpus_allowed(current, &tmp_mask);
 
 	if (reader_tasks) {
 		for (i = 0; i < nrealreaders; i++)
 			if (reader_tasks[i])
-				set_cpus_allowed(reader_tasks[i], tmp_mask);
+				set_cpus_allowed(reader_tasks[i], &tmp_mask);
 	}
 
 	if (fakewriter_tasks) {
 		for (i = 0; i < nfakewriters; i++)
 			if (fakewriter_tasks[i])
-				set_cpus_allowed(fakewriter_tasks[i], tmp_mask);
+				set_cpus_allowed(fakewriter_tasks[i],
+						 &tmp_mask);
 	}
 
 	if (writer_task)
-		set_cpus_allowed(writer_task, tmp_mask);
+		set_cpus_allowed(writer_task, &tmp_mask);
 
 	if (stats_task)
-		set_cpus_allowed(stats_task, tmp_mask);
+		set_cpus_allowed(stats_task, &tmp_mask);
 
 	if (rcu_idle_cpu == -1)
 		rcu_idle_cpu = num_online_cpus() - 1;
--- linux.trees.git.orig/kernel/sched.c
+++ linux.trees.git/kernel/sched.c
@@ -4739,13 +4739,13 @@ long sched_setaffinity(pid_t pid, cpumas
 	if (retval)
 		goto out_unlock;
 
-	cpus_allowed = cpuset_cpus_allowed(p);
+	cpuset_cpus_allowed(p, &cpus_allowed);
 	cpus_and(new_mask, new_mask, cpus_allowed);
  again:
-	retval = set_cpus_allowed(p, new_mask);
+	retval = set_cpus_allowed(p, &new_mask);
 
 	if (!retval) {
-		cpus_allowed = cpuset_cpus_allowed(p);
+		cpuset_cpus_allowed(p, &cpus_allowed);
 		if (!cpus_subset(new_mask, cpus_allowed)) {
 			/*
 			 * We must have raced with a concurrent cpuset
@@ -5280,7 +5280,7 @@ static inline void sched_init_granularit
  * task must not exit() & deallocate itself prematurely. The
  * call is not atomic; no spinlocks may be held.
  */
-int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+int set_cpus_allowed(struct task_struct *p, const cpumask_t *new_mask)
 {
 	struct migration_req req;
 	unsigned long flags;
@@ -5288,23 +5288,23 @@ int set_cpus_allowed(struct task_struct 
 	int ret = 0;
 
 	rq = task_rq_lock(p, &flags);
-	if (!cpus_intersects(new_mask, cpu_online_map)) {
+	if (!cpus_intersects(*new_mask, cpu_online_map)) {
 		ret = -EINVAL;
 		goto out;
 	}
 
 	if (p->sched_class->set_cpus_allowed)
-		p->sched_class->set_cpus_allowed(p, &new_mask);
+		p->sched_class->set_cpus_allowed(p, new_mask);
 	else {
-		p->cpus_allowed = new_mask;
-		p->rt.nr_cpus_allowed = cpus_weight(new_mask);
+		p->cpus_allowed = *new_mask;
+		p->rt.nr_cpus_allowed = cpus_weight(*new_mask);
 	}
 
 	/* Can the task run on the task's current CPU? If so, we're done */
-	if (cpu_isset(task_cpu(p), new_mask))
+	if (cpu_isset(task_cpu(p), *new_mask))
 		goto out;
 
-	if (migrate_task(p, any_online_cpu(new_mask), &req)) {
+	if (migrate_task(p, any_online_cpu(*new_mask), &req)) {
 		/* Need help from migration thread: drop lock and wait. */
 		task_rq_unlock(rq, &flags);
 		wake_up_process(rq->migration_thread);
@@ -5460,7 +5460,8 @@ static void move_task_off_dead_cpu(int d
 
 		/* No more Mr. Nice Guy. */
 		if (dest_cpu >= nr_cpu_ids) {
-			cpumask_t cpus_allowed = cpuset_cpus_allowed_locked(p);
+			cpumask_t cpus_allowed;
+			cpuset_cpus_allowed_locked(p, &cpus_allowed);
 			/*
 			 * Try to stay on the same cpuset, where the
 			 * current cpuset may be a subset of all cpus.
@@ -7049,7 +7050,7 @@ void __init sched_init_smp(void)
 	hotcpu_notifier(update_sched_domains, 0);
 
 	/* Move init over to a non-isolated CPU */
-	if (set_cpus_allowed(current, non_isolated_cpus) < 0)
+	if (set_cpus_allowed(current, &non_isolated_cpus) < 0)
 		BUG();
 	sched_init_granularity();
 }
--- linux.trees.git.orig/kernel/sched_rt.c
+++ linux.trees.git/kernel/sched_rt.c
@@ -1001,7 +1001,8 @@ move_one_task_rt(struct rq *this_rq, int
 	return 0;
 }
 
-static void set_cpus_allowed_rt(struct task_struct *p, cpumask_t *new_mask)
+static void set_cpus_allowed_rt(struct task_struct *p,
+				const cpumask_t *new_mask)
 {
 	int weight = cpus_weight(*new_mask);
 
--- linux.trees.git.orig/kernel/stop_machine.c
+++ linux.trees.git/kernel/stop_machine.c
@@ -35,7 +35,7 @@ static int stopmachine(void *cpu)
 	int irqs_disabled = 0;
 	int prepared = 0;
 
-	set_cpus_allowed(current, cpumask_of_cpu((int)(long)cpu));
+	set_cpus_allowed(current, &cpumask_of_cpu((int)(long)cpu));
 
 	/* Ack: we are alive */
 	smp_mb(); /* Theoretically the ack = 0 might not be on this CPU yet. */
--- linux.trees.git.orig/mm/pdflush.c
+++ linux.trees.git/mm/pdflush.c
@@ -187,8 +187,8 @@ static int pdflush(void *dummy)
 	 * This is needed as pdflush's are dynamically created and destroyed.
 	 * The boottime pdflush's are easily placed w/o these 2 lines.
 	 */
-	cpus_allowed = cpuset_cpus_allowed(current);
-	set_cpus_allowed(current, cpus_allowed);
+	cpuset_cpus_allowed(current, &cpus_allowed);
+	set_cpus_allowed(current, &cpus_allowed);
 
 	return __pdflush(&my_work);
 }
--- linux.trees.git.orig/mm/vmscan.c
+++ linux.trees.git/mm/vmscan.c
@@ -1668,7 +1668,7 @@ static int kswapd(void *p)
 
 	cpumask = node_to_cpumask(pgdat->node_id);
 	if (!cpus_empty(cpumask))
-		set_cpus_allowed(tsk, cpumask);
+		set_cpus_allowed(tsk, &cpumask);
 	current->reclaim_state = &reclaim_state;
 
 	/*
@@ -1905,9 +1905,9 @@ static int __devinit cpu_callback(struct
 		for_each_node_state(nid, N_HIGH_MEMORY) {
 			pgdat = NODE_DATA(nid);
 			mask = node_to_cpumask(pgdat->node_id);
-			if (any_online_cpu(mask) != NR_CPUS)
+			if (any_online_cpu(mask) < nr_cpu_ids)
 				/* One of our CPUs online: restore mask */
-				set_cpus_allowed(pgdat->kswapd, mask);
+				set_cpus_allowed(pgdat->kswapd, &mask);
 		}
 	}
 	return NOTIFY_OK;
--- linux.trees.git.orig/net/sunrpc/svc.c
+++ linux.trees.git/net/sunrpc/svc.c
@@ -301,7 +301,6 @@ static inline int
 svc_pool_map_set_cpumask(unsigned int pidx, cpumask_t *oldmask)
 {
 	struct svc_pool_map *m = &svc_pool_map;
-	unsigned int node; /* or cpu */
 
 	/*
 	 * The caller checks for sv_nrpools > 1, which
@@ -314,16 +313,23 @@ svc_pool_map_set_cpumask(unsigned int pi
 	default:
 		return 0;
 	case SVC_POOL_PERCPU:
-		node = m->pool_to[pidx];
+	{
+		unsigned int cpu = m->pool_to[pidx];
+
 		*oldmask = current->cpus_allowed;
-		set_cpus_allowed(current, cpumask_of_cpu(node));
+		set_cpus_allowed(current, &cpumask_of_cpu(cpu));
 		return 1;
+	}
 	case SVC_POOL_PERNODE:
-		node = m->pool_to[pidx];
+	{
+		unsigned int node = m->pool_to[pidx];
+		cpumask_t nodecpumask = node_to_cpumask(node);
+
 		*oldmask = current->cpus_allowed;
-		set_cpus_allowed(current, node_to_cpumask(node));
+		set_cpus_allowed(current, &nodecpumask);
 		return 1;
 	}
+	}
 }
 
 /*
@@ -598,7 +604,7 @@ __svc_create_thread(svc_thread_fn func, 
 	error = kernel_thread((int (*)(void *)) func, rqstp, 0);
 
 	if (have_oldmask)
-		set_cpus_allowed(current, oldmask);
+		set_cpus_allowed(current, &oldmask);
 
 	if (error < 0)
 		goto out_thread;

-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 03/12] cpumask: reduce stack pressure in sched_affinity
  2008-03-26  1:38 ` Mike Travis
@ 2008-03-26  1:38   ` Mike Travis
  -1 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, linux-mm, linux-kernel, Paul Jackson, Cliff Wickman

[-- Attachment #1: cpumask_affinity --]
[-- Type: text/plain, Size: 5977 bytes --]

Remove local and pass-by-value cpumask_t variables from sched_setaffinity()
and related affinity helpers, passing cpumask pointers instead.
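
As an illustration only (not part of the patch), a caller of the reworked
interface now passes a pointer instead of a full cpumask_t by value; the
helper below is hypothetical:

/* Sketch: pin task 'pid' to a single CPU via the pointer interface. */
static long example_pin_to_cpu(pid_t pid, int cpu)
{
	cpumask_t new_mask = CPU_MASK_NONE;

	cpu_set(cpu, new_mask);
	return sched_setaffinity(pid, &new_mask);
	/* was: return sched_setaffinity(pid, new_mask); */
}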

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Jackson <pj@sgi.com>
Cc: Cliff Wickman <cpw@sgi.com>

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86/kernel/cpu/mcheck/mce_amd_64.c |   46 ++++++++++++++++----------------
 include/linux/sched.h                   |    2 -
 kernel/compat.c                         |    2 -
 kernel/rcupreempt.c                     |    4 +-
 kernel/sched.c                          |    5 ++-
 5 files changed, 30 insertions(+), 29 deletions(-)

--- linux.trees.git.orig/arch/x86/kernel/cpu/mcheck/mce_amd_64.c
+++ linux.trees.git/arch/x86/kernel/cpu/mcheck/mce_amd_64.c
@@ -251,18 +251,18 @@ struct threshold_attr {
 	ssize_t(*store) (struct threshold_block *, const char *, size_t count);
 };
 
-static cpumask_t affinity_set(unsigned int cpu)
+static void affinity_set(unsigned int cpu, cpumask_t *oldmask,
+						cpumask_t *newmask)
 {
-	cpumask_t oldmask = current->cpus_allowed;
-	cpumask_t newmask = CPU_MASK_NONE;
-	cpu_set(cpu, newmask);
-	set_cpus_allowed(current, &newmask);
-	return oldmask;
+	*oldmask = current->cpus_allowed;
+	*newmask = CPU_MASK_NONE;
+	cpu_set(cpu, *newmask);
+	set_cpus_allowed(current, newmask);
 }
 
-static void affinity_restore(cpumask_t oldmask)
+static void affinity_restore(cpumask_t *oldmask)
 {
-	set_cpus_allowed(current, &oldmask);
+	set_cpus_allowed(current, oldmask);
 }
 
 #define SHOW_FIELDS(name)                                           \
@@ -277,15 +277,15 @@ static ssize_t store_interrupt_enable(st
 				      const char *buf, size_t count)
 {
 	char *end;
-	cpumask_t oldmask;
+	cpumask_t oldmask, newmask;
 	unsigned long new = simple_strtoul(buf, &end, 0);
 	if (end == buf)
 		return -EINVAL;
 	b->interrupt_enable = !!new;
 
-	oldmask = affinity_set(b->cpu);
+	affinity_set(b->cpu, &oldmask, &newmask);
 	threshold_restart_bank(b, 0, 0);
-	affinity_restore(oldmask);
+	affinity_restore(&oldmask);
 
 	return end - buf;
 }
@@ -294,7 +294,7 @@ static ssize_t store_threshold_limit(str
 				     const char *buf, size_t count)
 {
 	char *end;
-	cpumask_t oldmask;
+	cpumask_t oldmask, newmask;
 	u16 old;
 	unsigned long new = simple_strtoul(buf, &end, 0);
 	if (end == buf)
@@ -306,9 +306,9 @@ static ssize_t store_threshold_limit(str
 	old = b->threshold_limit;
 	b->threshold_limit = new;
 
-	oldmask = affinity_set(b->cpu);
+	affinity_set(b->cpu, &oldmask, &newmask);
 	threshold_restart_bank(b, 0, old);
-	affinity_restore(oldmask);
+	affinity_restore(&oldmask);
 
 	return end - buf;
 }
@@ -316,10 +316,10 @@ static ssize_t store_threshold_limit(str
 static ssize_t show_error_count(struct threshold_block *b, char *buf)
 {
 	u32 high, low;
-	cpumask_t oldmask;
-	oldmask = affinity_set(b->cpu);
+	cpumask_t oldmask, newmask;
+	affinity_set(b->cpu, &oldmask, &newmask);
 	rdmsr(b->address, low, high);
-	affinity_restore(oldmask);
+	affinity_restore(&oldmask);
 	return sprintf(buf, "%x\n",
 		       (high & 0xFFF) - (THRESHOLD_MAX - b->threshold_limit));
 }
@@ -327,10 +327,10 @@ static ssize_t show_error_count(struct t
 static ssize_t store_error_count(struct threshold_block *b,
 				 const char *buf, size_t count)
 {
-	cpumask_t oldmask;
-	oldmask = affinity_set(b->cpu);
+	cpumask_t oldmask, newmask;
+	affinity_set(b->cpu, &oldmask, &newmask);
 	threshold_restart_bank(b, 1, 0);
-	affinity_restore(oldmask);
+	affinity_restore(&oldmask);
 	return 1;
 }
 
@@ -468,7 +468,7 @@ static __cpuinit int threshold_create_ba
 {
 	int i, err = 0;
 	struct threshold_bank *b = NULL;
-	cpumask_t oldmask = CPU_MASK_NONE;
+	cpumask_t oldmask = CPU_MASK_NONE, newmask;
 	char name[32];
 
 	sprintf(name, "threshold_bank%i", bank);
@@ -519,10 +519,10 @@ static __cpuinit int threshold_create_ba
 
 	per_cpu(threshold_banks, cpu)[bank] = b;
 
-	oldmask = affinity_set(cpu);
+	affinity_set(cpu, &oldmask, &newmask);
 	err = allocate_threshold_blocks(cpu, bank, 0,
 					MSR_IA32_MC0_MISC + bank * 4);
-	affinity_restore(oldmask);
+	affinity_restore(&oldmask);
 
 	if (err)
 		goto out_free;
--- linux.trees.git.orig/include/linux/sched.h
+++ linux.trees.git/include/linux/sched.h
@@ -2026,7 +2026,7 @@ static inline void arch_pick_mmap_layout
 }
 #endif
 
-extern long sched_setaffinity(pid_t pid, cpumask_t new_mask);
+extern long sched_setaffinity(pid_t pid, const cpumask_t *new_mask);
 extern long sched_getaffinity(pid_t pid, cpumask_t *mask);
 
 extern int sched_mc_power_savings, sched_smt_power_savings;
--- linux.trees.git.orig/kernel/compat.c
+++ linux.trees.git/kernel/compat.c
@@ -446,7 +446,7 @@ asmlinkage long compat_sys_sched_setaffi
 	if (retval)
 		return retval;
 
-	return sched_setaffinity(pid, new_mask);
+	return sched_setaffinity(pid, &new_mask);
 }
 
 asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid, unsigned int len,
--- linux.trees.git.orig/kernel/rcupreempt.c
+++ linux.trees.git/kernel/rcupreempt.c
@@ -1007,10 +1007,10 @@ void __synchronize_sched(void)
 	if (sched_getaffinity(0, &oldmask) < 0)
 		oldmask = cpu_possible_map;
 	for_each_online_cpu(cpu) {
-		sched_setaffinity(0, cpumask_of_cpu(cpu));
+		sched_setaffinity(0, &cpumask_of_cpu(cpu));
 		schedule();
 	}
-	sched_setaffinity(0, oldmask);
+	sched_setaffinity(0, &oldmask);
 }
 EXPORT_SYMBOL_GPL(__synchronize_sched);
 
--- linux.trees.git.orig/kernel/sched.c
+++ linux.trees.git/kernel/sched.c
@@ -4706,9 +4706,10 @@ out_unlock:
 	return retval;
 }
 
-long sched_setaffinity(pid_t pid, cpumask_t new_mask)
+long sched_setaffinity(pid_t pid, const cpumask_t *in_mask)
 {
 	cpumask_t cpus_allowed;
+	cpumask_t new_mask = *in_mask;
 	struct task_struct *p;
 	int retval;
 
@@ -4789,7 +4790,7 @@ asmlinkage long sys_sched_setaffinity(pi
 	if (retval)
 		return retval;
 
-	return sched_setaffinity(pid, new_mask);
+	return sched_setaffinity(pid, &new_mask);
 }
 
 /*

-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 04/12] cpumask: pass cpumask by reference to acpi-cpufreq
  2008-03-26  1:38 ` Mike Travis
@ 2008-03-26  1:38   ` Mike Travis
  -1 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, linux-mm, linux-kernel, Len Brown, Dave Jones

[-- Attachment #1: check_freqs --]
[-- Type: text/plain, Size: 2376 bytes --]

Pass cpumask_t variables by reference in acpi-cpufreq functions.
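
For illustration only (the hunks below are authoritative), the read path now
takes the mask's address rather than copying the whole cpumask_t; the wrapper
below is hypothetical:

/* Sketch: read one CPU's current frequency with the new prototypes.
 * get_cur_val() and extract_freq() are the driver's existing helpers.
 */
static unsigned int example_cur_freq(unsigned int cpu,
				     struct acpi_cpufreq_data *data)
{
	return extract_freq(get_cur_val(&cpumask_of_cpu(cpu)), data);
}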

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

Cc: Len Brown <len.brown@intel.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

--- linux.trees.git.orig/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
+++ linux.trees.git/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
@@ -211,22 +211,22 @@ static void drv_write(struct drv_cmd *cm
 	return;
 }
 
-static u32 get_cur_val(cpumask_t mask)
+static u32 get_cur_val(const cpumask_t *mask)
 {
 	struct acpi_processor_performance *perf;
 	struct drv_cmd cmd;
 
-	if (unlikely(cpus_empty(mask)))
+	if (unlikely(cpus_empty(*mask)))
 		return 0;
 
-	switch (per_cpu(drv_data, first_cpu(mask))->cpu_feature) {
+	switch (per_cpu(drv_data, first_cpu(*mask))->cpu_feature) {
 	case SYSTEM_INTEL_MSR_CAPABLE:
 		cmd.type = SYSTEM_INTEL_MSR_CAPABLE;
 		cmd.addr.msr.reg = MSR_IA32_PERF_STATUS;
 		break;
 	case SYSTEM_IO_CAPABLE:
 		cmd.type = SYSTEM_IO_CAPABLE;
-		perf = per_cpu(drv_data, first_cpu(mask))->acpi_data;
+		perf = per_cpu(drv_data, first_cpu(*mask))->acpi_data;
 		cmd.addr.io.port = perf->control_register.address;
 		cmd.addr.io.bit_width = perf->control_register.bit_width;
 		break;
@@ -234,7 +234,7 @@ static u32 get_cur_val(cpumask_t mask)
 		return 0;
 	}
 
-	cmd.mask = mask;
+	cmd.mask = *mask;
 
 	drv_read(&cmd);
 
@@ -347,13 +347,13 @@ static unsigned int get_cur_freq_on_cpu(
 		return 0;
 	}
 
-	freq = extract_freq(get_cur_val(cpumask_of_cpu(cpu)), data);
+	freq = extract_freq(get_cur_val(&cpumask_of_cpu(cpu)), data);
 	dprintk("cur freq = %u\n", freq);
 
 	return freq;
 }
 
-static unsigned int check_freqs(cpumask_t mask, unsigned int freq,
+static unsigned int check_freqs(const cpumask_t *mask, unsigned int freq,
 				struct acpi_cpufreq_data *data)
 {
 	unsigned int cur_freq;
@@ -449,7 +449,7 @@ static int acpi_cpufreq_target(struct cp
 	drv_write(&cmd);
 
 	if (acpi_pstate_strict) {
-		if (!check_freqs(cmd.mask, freqs.new, data)) {
+		if (!check_freqs(&cmd.mask, freqs.new, data)) {
 			dprintk("acpi_cpufreq_target failed (%d)\n",
 				policy->cpu);
 			return -EAGAIN;

-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 05/12] init: move large array from stack to _initdata section
  2008-03-26  1:38 ` Mike Travis
@ 2008-03-26  1:38   ` Mike Travis
  -1 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, linux-mm, linux-kernel, Thomas Gleixner, H. Peter Anvin

[-- Attachment #1: numa_initmem_init --]
[-- Type: text/plain, Size: 1059 bytes --]

Move the large array "struct bootnode nodes[MAX_NUMNODES]" off the stack
and into the __initdata section.
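
The general pattern, sketched with hypothetical names (the real change is the
one-line hunk below):

/* Sketch: a large automatic array moves into .init.data, so it no longer
 * occupies stack space and is discarded after boot.
 */
static struct bootnode example_nodes[MAX_NUMNODES] __initdata;

static int __init example_numa_setup(void)
{
	/* was: struct bootnode example_nodes[MAX_NUMNODES]; on the stack */
	memset(example_nodes, 0, sizeof(example_nodes));
	return 0;
}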

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86/mm/numa_64.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- linux.trees.git.orig/arch/x86/mm/numa_64.c
+++ linux.trees.git/arch/x86/mm/numa_64.c
@@ -411,9 +411,10 @@ static int __init split_nodes_by_size(st
  * Sets up the system RAM area from start_pfn to end_pfn according to the
  * numa=fake command-line option.
  */
+static struct bootnode nodes[MAX_NUMNODES] __initdata;
+
 static int __init numa_emulation(unsigned long start_pfn, unsigned long end_pfn)
 {
-	struct bootnode nodes[MAX_NUMNODES];
 	u64 size, addr = start_pfn << PAGE_SHIFT;
 	u64 max_addr = end_pfn << PAGE_SHIFT;
 	int num_nodes = 0, num = 0, coeff_flag, coeff = -1, i;

-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 06/12] cpumask: create pointer to node_to_cpumask array element v2
  2008-03-26  1:38 ` Mike Travis
@ 2008-03-26  1:38   ` Mike Travis
  -1 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, linux-mm, linux-kernel, Richard Henderson,
	David Howells, Tony Luck, Paul Mackerras, Anton Blanchard,
	David S. Miller, William L. Irwin, Thomas Gleixner,
	H. Peter Anvin

[-- Attachment #1: node_to_cpumask_ptr --]
[-- Type: text/plain, Size: 12998 bytes --]

Create a simple macro to always return a pointer to the 
node_to_cpumask(node) value.  This relies on compiler optimization
to remove the extra indirection:

#define	node_to_cpumask_ptr(v, node) 		\
		cpumask_t _##v = node_to_cpumask(node), *v = &_##v

On systems with a large cpumask size, a true pointer to the array
element is used instead:

#define node_to_cpumask_ptr(v, node)		\
		cpumask_t *v = &(node_to_cpumask_map[node])

A node_to_cpumask_ptr_next() macro is provided to point an already-declared
variable at another node's cpumask value.

This removes 10256 bytes of stack usage.
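
As an illustration only (not part of the patch), a typical caller declares the
pointer through the macro and then dereferences it; example_cpus_on_node()
below is hypothetical:

/* Sketch: count the CPUs on a node without copying a cpumask_t when a
 * true pointer into node_to_cpumask_map[] is available.
 */
static int example_cpus_on_node(int node)
{
	node_to_cpumask_ptr(mask, node);	/* declares cpumask_t *mask */

	if (cpus_empty(*mask))
		return 0;
	return cpus_weight(*mask);
}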

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

# alpha
Cc: Richard Henderson <rth@twiddle.net>

# fujitsu
Cc: David Howells <dhowells@redhat.com>

# ia64
Cc: Tony Luck <tony.luck@intel.com>

# powerpc
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>

# sparc
Cc: David S. Miller <davem@davemloft.net>
Cc: William L. Irwin <wli@holomorphy.com>

# x86
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>


Signed-off-by: Mike Travis <travis@sgi.com>
---
v2: rebased on linux-2.6.git + linux-2.6-x86.git

One checkpatch error that I don't think can be fixed (was already in source):

ERROR: Macros with complex values should be enclosed in parenthesis
#230: FILE: include/linux/topology.h:49:

#define for_each_node_with_cpus(node)			\
        for_each_online_node(node)                      \
		if (nr_cpus_node(node))

total: 1 errors, 0 warnings, 315 lines checked

---
 drivers/base/node.c            |    4 ++--
 drivers/pci/pci-driver.c       |    4 ++--
 include/asm-alpha/topology.h   |    3 +--
 include/asm-frv/topology.h     |    4 +---
 include/asm-generic/topology.h |   14 ++++++++++++++
 include/asm-ia64/topology.h    |    5 +++++
 include/asm-powerpc/topology.h |    3 +--
 include/asm-x86/topology.h     |   15 +++++++++++++--
 include/linux/topology.h       |   13 ++++++-------
 kernel/sched.c                 |   29 ++++++++++++++---------------
 mm/page_alloc.c                |    6 +++---
 mm/slab.c                      |    5 ++---
 mm/vmscan.c                    |   18 ++++++++----------
 net/sunrpc/svc.c               |    4 ++--
 14 files changed, 74 insertions(+), 53 deletions(-)

--- linux.trees.git.orig/drivers/base/node.c
+++ linux.trees.git/drivers/base/node.c
@@ -22,13 +22,13 @@ static struct sysdev_class node_class = 
 static ssize_t node_read_cpumap(struct sys_device * dev, char * buf)
 {
 	struct node *node_dev = to_node(dev);
-	cpumask_t mask = node_to_cpumask(node_dev->sysdev.id);
+	node_to_cpumask_ptr(mask, node_dev->sysdev.id);
 	int len;
 
 	/* 2004/06/03: buf currently PAGE_SIZE, need > 1 char per 4 bits. */
 	BUILD_BUG_ON(MAX_NUMNODES/4 > PAGE_SIZE/2);
 
-	len = cpumask_scnprintf(buf, PAGE_SIZE-1, mask);
+	len = cpumask_scnprintf(buf, PAGE_SIZE-1, *mask);
 	len += sprintf(buf + len, "\n");
 	return len;
 }
--- linux.trees.git.orig/drivers/pci/pci-driver.c
+++ linux.trees.git/drivers/pci/pci-driver.c
@@ -184,8 +184,8 @@ static int pci_call_probe(struct pci_dri
 	int node = dev_to_node(&dev->dev);
 
 	if (node >= 0) {
-		cpumask_t nodecpumask = node_to_cpumask(node);
-		set_cpus_allowed(current, &nodecpumask);
+		node_to_cpumask_ptr(nodecpumask, node);
+		set_cpus_allowed(current, nodecpumask);
 	}
 
 	/* And set default memory allocation policy */
--- linux.trees.git.orig/include/asm-alpha/topology.h
+++ linux.trees.git/include/asm-alpha/topology.h
@@ -41,8 +41,7 @@ static inline cpumask_t node_to_cpumask(
 
 #define pcibus_to_cpumask(bus)	(cpu_online_map)
 
-#else /* CONFIG_NUMA */
-# include <asm-generic/topology.h>
 #endif /* !CONFIG_NUMA */
+# include <asm-generic/topology.h>
 
 #endif /* _ASM_ALPHA_TOPOLOGY_H */
--- linux.trees.git.orig/include/asm-frv/topology.h
+++ linux.trees.git/include/asm-frv/topology.h
@@ -5,10 +5,8 @@
 
 #error NUMA not supported yet
 
-#else /* !CONFIG_NUMA */
+#endif /* CONFIG_NUMA */
 
 #include <asm-generic/topology.h>
 
-#endif /* CONFIG_NUMA */
-
 #endif /* _ASM_TOPOLOGY_H */
--- linux.trees.git.orig/include/asm-generic/topology.h
+++ linux.trees.git/include/asm-generic/topology.h
@@ -27,6 +27,8 @@
 #ifndef _ASM_GENERIC_TOPOLOGY_H
 #define _ASM_GENERIC_TOPOLOGY_H
 
+#ifndef	CONFIG_NUMA
+
 /* Other architectures wishing to use this simple topology API should fill
    in the below functions as appropriate in their own <asm/topology.h> file. */
 #ifndef cpu_to_node
@@ -52,4 +54,16 @@
 				)
 #endif
 
+#endif	/* CONFIG_NUMA */
+
+/* returns pointer to cpumask for specified node */
+#ifndef node_to_cpumask_ptr
+
+#define	node_to_cpumask_ptr(v, node) 					\
+		cpumask_t _##v = node_to_cpumask(node), *v = &_##v
+
+#define node_to_cpumask_ptr_next(v, node)				\
+			  _##v = node_to_cpumask(node)
+#endif
+
 #endif /* _ASM_GENERIC_TOPOLOGY_H */
--- linux.trees.git.orig/include/asm-ia64/topology.h
+++ linux.trees.git/include/asm-ia64/topology.h
@@ -116,6 +116,11 @@ void build_cpu_to_node_map(void);
 #define smt_capable() 				(smp_num_siblings > 1)
 #endif
 
+#define pcibus_to_cpumask(bus)	(pcibus_to_node(bus) == -1 ? \
+					CPU_MASK_ALL : \
+					node_to_cpumask(pcibus_to_node(bus)) \
+				)
+
 #include <asm-generic/topology.h>
 
 #endif /* _ASM_IA64_TOPOLOGY_H */
--- linux.trees.git.orig/include/asm-powerpc/topology.h
+++ linux.trees.git/include/asm-powerpc/topology.h
@@ -96,11 +96,10 @@ static inline void sysfs_remove_device_f
 {
 }
 
+#endif /* CONFIG_NUMA */
 
 #include <asm-generic/topology.h>
 
-#endif /* CONFIG_NUMA */
-
 #ifdef CONFIG_SMP
 #include <asm/cputable.h>
 #define smt_capable()		(cpu_has_feature(CPU_FTR_SMT))
--- linux.trees.git.orig/include/asm-x86/topology.h
+++ linux.trees.git/include/asm-x86/topology.h
@@ -89,6 +89,17 @@ static inline int cpu_to_node(int cpu)
 #endif
 	return per_cpu(x86_cpu_to_node_map, cpu);
 }
+
+#ifdef	CONFIG_NUMA
+
+/* Returns a pointer to the cpumask of CPUs on Node 'node'. */
+#define node_to_cpumask_ptr(v, node)		\
+		cpumask_t *v = &(node_to_cpumask_map[node])
+
+#define node_to_cpumask_ptr_next(v, node)	\
+			   v = &(node_to_cpumask_map[node])
+#endif
+
 #endif /* CONFIG_X86_64 */
 
 /*
@@ -186,10 +197,10 @@ static inline void set_mp_bus_to_node(in
 {
 }
 
-#include <asm-generic/topology.h>
-
 #endif
 
+#include <asm-generic/topology.h>
+
 extern cpumask_t cpu_coregroup_map(int cpu);
 
 #ifdef ENABLE_TOPO_DEFINES
--- linux.trees.git.orig/include/linux/topology.h
+++ linux.trees.git/include/linux/topology.h
@@ -38,16 +38,15 @@
 #endif
 
 #ifndef nr_cpus_node
-#define nr_cpus_node(node)							\
-	({									\
-		cpumask_t __tmp__;						\
-		__tmp__ = node_to_cpumask(node);				\
-		cpus_weight(__tmp__);						\
+#define nr_cpus_node(node)				\
+	({						\
+		node_to_cpumask_ptr(__tmp__, node);	\
+		cpus_weight(*__tmp__);			\
 	})
 #endif
 
-#define for_each_node_with_cpus(node)						\
-	for_each_online_node(node)						\
+#define for_each_node_with_cpus(node)			\
+	for_each_online_node(node)			\
 		if (nr_cpus_node(node))
 
 void arch_update_cpu_topology(void);
--- linux.trees.git.orig/kernel/sched.c
+++ linux.trees.git/kernel/sched.c
@@ -6252,7 +6252,7 @@ init_sched_build_groups(cpumask_t span, 
  *
  * Should use nodemask_t.
  */
-static int find_next_best_node(int node, unsigned long *used_nodes)
+static int find_next_best_node(int node, nodemask_t *used_nodes)
 {
 	int i, n, val, min_val, best_node = 0;
 
@@ -6266,7 +6266,7 @@ static int find_next_best_node(int node,
 			continue;
 
 		/* Skip already used nodes */
-		if (test_bit(n, used_nodes))
+		if (node_isset(n, *used_nodes))
 			continue;
 
 		/* Simple min distance search */
@@ -6278,14 +6278,13 @@ static int find_next_best_node(int node,
 		}
 	}
 
-	set_bit(best_node, used_nodes);
+	node_set(best_node, *used_nodes);
 	return best_node;
 }
 
 /**
  * sched_domain_node_span - get a cpumask for a node's sched_domain
  * @node: node whose cpumask we're constructing
- * @size: number of nodes to include in this span
  *
  * Given a node, construct a good cpumask for its sched_domain to span. It
  * should be one that prevents unnecessary balancing, but also spreads tasks
@@ -6293,22 +6292,22 @@ static int find_next_best_node(int node,
  */
 static cpumask_t sched_domain_node_span(int node)
 {
-	DECLARE_BITMAP(used_nodes, MAX_NUMNODES);
-	cpumask_t span, nodemask;
+	nodemask_t used_nodes;
+	cpumask_t span;
+	node_to_cpumask_ptr(nodemask, node);
 	int i;
 
 	cpus_clear(span);
-	bitmap_zero(used_nodes, MAX_NUMNODES);
+	nodes_clear(used_nodes);
 
-	nodemask = node_to_cpumask(node);
-	cpus_or(span, span, nodemask);
-	set_bit(node, used_nodes);
+	cpus_or(span, span, *nodemask);
+	node_set(node, used_nodes);
 
 	for (i = 1; i < SD_NODES_PER_DOMAIN; i++) {
-		int next_node = find_next_best_node(node, used_nodes);
+		int next_node = find_next_best_node(node, &used_nodes);
 
-		nodemask = node_to_cpumask(next_node);
-		cpus_or(span, span, nodemask);
+		node_to_cpumask_ptr_next(nodemask, next_node);
+		cpus_or(span, span, *nodemask);
 	}
 
 	return span;
@@ -6705,6 +6704,7 @@ static int build_sched_domains(const cpu
 		for (j = 0; j < MAX_NUMNODES; j++) {
 			cpumask_t tmp, notcovered;
 			int n = (i + j) % MAX_NUMNODES;
+			node_to_cpumask_ptr(nodemask, n);
 
 			cpus_complement(notcovered, covered);
 			cpus_and(tmp, notcovered, *cpu_map);
@@ -6712,8 +6712,7 @@ static int build_sched_domains(const cpu
 			if (cpus_empty(tmp))
 				break;
 
-			nodemask = node_to_cpumask(n);
-			cpus_and(tmp, tmp, nodemask);
+			cpus_and(tmp, tmp, *nodemask);
 			if (cpus_empty(tmp))
 				continue;
 
--- linux.trees.git.orig/mm/page_alloc.c
+++ linux.trees.git/mm/page_alloc.c
@@ -2029,6 +2029,7 @@ static int find_next_best_node(int node,
 	int n, val;
 	int min_val = INT_MAX;
 	int best_node = -1;
+	node_to_cpumask_ptr(tmp, 0);
 
 	/* Use the local node if we haven't already */
 	if (!node_isset(node, *used_node_mask)) {
@@ -2037,7 +2038,6 @@ static int find_next_best_node(int node,
 	}
 
 	for_each_node_state(n, N_HIGH_MEMORY) {
-		cpumask_t tmp;
 
 		/* Don't want a node to appear more than once */
 		if (node_isset(n, *used_node_mask))
@@ -2050,8 +2050,8 @@ static int find_next_best_node(int node,
 		val += (n < node);
 
 		/* Give preference to headless and unused nodes */
-		tmp = node_to_cpumask(n);
-		if (!cpus_empty(tmp))
+		node_to_cpumask_ptr_next(tmp, n);
+		if (!cpus_empty(*tmp))
 			val += PENALTY_FOR_NODE_WITH_CPUS;
 
 		/* Slight preference for less loaded node */
--- linux.trees.git.orig/mm/slab.c
+++ linux.trees.git/mm/slab.c
@@ -1160,14 +1160,13 @@ static void __cpuinit cpuup_canceled(lon
 	struct kmem_cache *cachep;
 	struct kmem_list3 *l3 = NULL;
 	int node = cpu_to_node(cpu);
+	node_to_cpumask_ptr(mask, node);
 
 	list_for_each_entry(cachep, &cache_chain, next) {
 		struct array_cache *nc;
 		struct array_cache *shared;
 		struct array_cache **alien;
-		cpumask_t mask;
 
-		mask = node_to_cpumask(node);
 		/* cpu is dead; no one can alloc from it. */
 		nc = cachep->array[cpu];
 		cachep->array[cpu] = NULL;
@@ -1183,7 +1182,7 @@ static void __cpuinit cpuup_canceled(lon
 		if (nc)
 			free_block(cachep, nc->entry, nc->avail, node);
 
-		if (!cpus_empty(mask)) {
+		if (!cpus_empty(*mask)) {
 			spin_unlock_irq(&l3->list_lock);
 			goto free_array_cache;
 		}
--- linux.trees.git.orig/mm/vmscan.c
+++ linux.trees.git/mm/vmscan.c
@@ -1664,11 +1664,10 @@ static int kswapd(void *p)
 	struct reclaim_state reclaim_state = {
 		.reclaimed_slab = 0,
 	};
-	cpumask_t cpumask;
+	node_to_cpumask_ptr(cpumask, pgdat->node_id);
 
-	cpumask = node_to_cpumask(pgdat->node_id);
-	if (!cpus_empty(cpumask))
-		set_cpus_allowed(tsk, &cpumask);
+	if (!cpus_empty(*cpumask))
+		set_cpus_allowed(tsk, cpumask);
 	current->reclaim_state = &reclaim_state;
 
 	/*
@@ -1897,17 +1896,16 @@ out:
 static int __devinit cpu_callback(struct notifier_block *nfb,
 				  unsigned long action, void *hcpu)
 {
-	pg_data_t *pgdat;
-	cpumask_t mask;
 	int nid;
 
 	if (action == CPU_ONLINE || action == CPU_ONLINE_FROZEN) {
 		for_each_node_state(nid, N_HIGH_MEMORY) {
-			pgdat = NODE_DATA(nid);
-			mask = node_to_cpumask(pgdat->node_id);
-			if (any_online_cpu(mask) < nr_cpu_ids)
+			pg_data_t *pgdat = NODE_DATA(nid);
+			node_to_cpumask_ptr(mask, pgdat->node_id);
+
+			if (any_online_cpu(*mask) < nr_cpu_ids)
 				/* One of our CPUs online: restore mask */
-				set_cpus_allowed(pgdat->kswapd, &mask);
+				set_cpus_allowed(pgdat->kswapd, mask);
 		}
 	}
 	return NOTIFY_OK;
--- linux.trees.git.orig/net/sunrpc/svc.c
+++ linux.trees.git/net/sunrpc/svc.c
@@ -323,10 +323,10 @@ svc_pool_map_set_cpumask(unsigned int pi
 	case SVC_POOL_PERNODE:
 	{
 		unsigned int node = m->pool_to[pidx];
-		cpumask_t nodecpumask = node_to_cpumask(node);
+		node_to_cpumask_ptr(nodecpumask, node);
 
 		*oldmask = current->cpus_allowed;
-		set_cpus_allowed(current, &nodecpumask);
+		set_cpus_allowed(current, nodecpumask);
 		return 1;
 	}
 	}

-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 06/12] cpumask: create pointer to node_to_cpumask array element v2
@ 2008-03-26  1:38   ` Mike Travis
  0 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, linux-mm, linux-kernel, Richard Henderson,
	David Howells, Tony Luck, Paul Mackerras, Anton Blanchard,
	David S. Miller, William L. Irwin, Thomas Gleixner,
	H. Peter Anvin

[-- Attachment #1: node_to_cpumask_ptr --]
[-- Type: text/plain, Size: 13224 bytes --]

Create a simple macro to always return a pointer to the 
node_to_cpumask(node) value.  This relies on compiler optimization
to remove the extra indirection:

#define	node_to_cpumask_ptr(v, node) 		\
		cpumask_t _##v = node_to_cpumask(node), *v = &_##v

On systems with a large cpumask size, a true pointer to the array
element is used instead:

#define node_to_cpumask_ptr(v, node)		\
		cpumask_t *v = &(node_to_cpumask_map[node])

A node_to_cpumask_ptr_next() macro is provided to point an already-declared
variable at another node's cpumask value.

This removes 10256 bytes of stack usage.

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

# alpha
Cc: Richard Henderson <rth@twiddle.net>

# fujitsu
Cc: David Howells <dhowells@redhat.com>

# ia64
Cc: Tony Luck <tony.luck@intel.com>

# powerpc
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>

# sparc
Cc: David S. Miller <davem@davemloft.net>
Cc: William L. Irwin <wli@holomorphy.com>

# x86
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>


Signed-off-by: Mike Travis <travis@sgi.com>
---
v2: rebased on linux-2.6.git + linux-2.6-x86.git

One checkpatch error that I don't think can be fixed (was already in source):

ERROR: Macros with complex values should be enclosed in parenthesis
#230: FILE: include/linux/topology.h:49:

#define for_each_node_with_cpus(node)			\
        for_each_online_node(node)                      \
		if (nr_cpus_node(node))

total: 1 errors, 0 warnings, 315 lines checked

---
 drivers/base/node.c            |    4 ++--
 drivers/pci/pci-driver.c       |    4 ++--
 include/asm-alpha/topology.h   |    3 +--
 include/asm-frv/topology.h     |    4 +---
 include/asm-generic/topology.h |   14 ++++++++++++++
 include/asm-ia64/topology.h    |    5 +++++
 include/asm-powerpc/topology.h |    3 +--
 include/asm-x86/topology.h     |   15 +++++++++++++--
 include/linux/topology.h       |   13 ++++++-------
 kernel/sched.c                 |   29 ++++++++++++++---------------
 mm/page_alloc.c                |    6 +++---
 mm/slab.c                      |    5 ++---
 mm/vmscan.c                    |   18 ++++++++----------
 net/sunrpc/svc.c               |    4 ++--
 14 files changed, 74 insertions(+), 53 deletions(-)

--- linux.trees.git.orig/drivers/base/node.c
+++ linux.trees.git/drivers/base/node.c
@@ -22,13 +22,13 @@ static struct sysdev_class node_class = 
 static ssize_t node_read_cpumap(struct sys_device * dev, char * buf)
 {
 	struct node *node_dev = to_node(dev);
-	cpumask_t mask = node_to_cpumask(node_dev->sysdev.id);
+	node_to_cpumask_ptr(mask, node_dev->sysdev.id);
 	int len;
 
 	/* 2004/06/03: buf currently PAGE_SIZE, need > 1 char per 4 bits. */
 	BUILD_BUG_ON(MAX_NUMNODES/4 > PAGE_SIZE/2);
 
-	len = cpumask_scnprintf(buf, PAGE_SIZE-1, mask);
+	len = cpumask_scnprintf(buf, PAGE_SIZE-1, *mask);
 	len += sprintf(buf + len, "\n");
 	return len;
 }
--- linux.trees.git.orig/drivers/pci/pci-driver.c
+++ linux.trees.git/drivers/pci/pci-driver.c
@@ -184,8 +184,8 @@ static int pci_call_probe(struct pci_dri
 	int node = dev_to_node(&dev->dev);
 
 	if (node >= 0) {
-		cpumask_t nodecpumask = node_to_cpumask(node);
-		set_cpus_allowed(current, &nodecpumask);
+		node_to_cpumask_ptr(nodecpumask, node);
+		set_cpus_allowed(current, nodecpumask);
 	}
 
 	/* And set default memory allocation policy */
--- linux.trees.git.orig/include/asm-alpha/topology.h
+++ linux.trees.git/include/asm-alpha/topology.h
@@ -41,8 +41,7 @@ static inline cpumask_t node_to_cpumask(
 
 #define pcibus_to_cpumask(bus)	(cpu_online_map)
 
-#else /* CONFIG_NUMA */
-# include <asm-generic/topology.h>
 #endif /* !CONFIG_NUMA */
+# include <asm-generic/topology.h>
 
 #endif /* _ASM_ALPHA_TOPOLOGY_H */
--- linux.trees.git.orig/include/asm-frv/topology.h
+++ linux.trees.git/include/asm-frv/topology.h
@@ -5,10 +5,8 @@
 
 #error NUMA not supported yet
 
-#else /* !CONFIG_NUMA */
+#endif /* CONFIG_NUMA */
 
 #include <asm-generic/topology.h>
 
-#endif /* CONFIG_NUMA */
-
 #endif /* _ASM_TOPOLOGY_H */
--- linux.trees.git.orig/include/asm-generic/topology.h
+++ linux.trees.git/include/asm-generic/topology.h
@@ -27,6 +27,8 @@
 #ifndef _ASM_GENERIC_TOPOLOGY_H
 #define _ASM_GENERIC_TOPOLOGY_H
 
+#ifndef	CONFIG_NUMA
+
 /* Other architectures wishing to use this simple topology API should fill
    in the below functions as appropriate in their own <asm/topology.h> file. */
 #ifndef cpu_to_node
@@ -52,4 +54,16 @@
 				)
 #endif
 
+#endif	/* CONFIG_NUMA */
+
+/* returns pointer to cpumask for specified node */
+#ifndef node_to_cpumask_ptr
+
+#define	node_to_cpumask_ptr(v, node) 					\
+		cpumask_t _##v = node_to_cpumask(node), *v = &_##v
+
+#define node_to_cpumask_ptr_next(v, node)				\
+			  _##v = node_to_cpumask(node)
+#endif
+
 #endif /* _ASM_GENERIC_TOPOLOGY_H */
--- linux.trees.git.orig/include/asm-ia64/topology.h
+++ linux.trees.git/include/asm-ia64/topology.h
@@ -116,6 +116,11 @@ void build_cpu_to_node_map(void);
 #define smt_capable() 				(smp_num_siblings > 1)
 #endif
 
+#define pcibus_to_cpumask(bus)	(pcibus_to_node(bus) == -1 ? \
+					CPU_MASK_ALL : \
+					node_to_cpumask(pcibus_to_node(bus)) \
+				)
+
 #include <asm-generic/topology.h>
 
 #endif /* _ASM_IA64_TOPOLOGY_H */
--- linux.trees.git.orig/include/asm-powerpc/topology.h
+++ linux.trees.git/include/asm-powerpc/topology.h
@@ -96,11 +96,10 @@ static inline void sysfs_remove_device_f
 {
 }
 
+#endif /* CONFIG_NUMA */
 
 #include <asm-generic/topology.h>
 
-#endif /* CONFIG_NUMA */
-
 #ifdef CONFIG_SMP
 #include <asm/cputable.h>
 #define smt_capable()		(cpu_has_feature(CPU_FTR_SMT))
--- linux.trees.git.orig/include/asm-x86/topology.h
+++ linux.trees.git/include/asm-x86/topology.h
@@ -89,6 +89,17 @@ static inline int cpu_to_node(int cpu)
 #endif
 	return per_cpu(x86_cpu_to_node_map, cpu);
 }
+
+#ifdef	CONFIG_NUMA
+
+/* Returns a pointer to the cpumask of CPUs on Node 'node'. */
+#define node_to_cpumask_ptr(v, node)		\
+		cpumask_t *v = &(node_to_cpumask_map[node])
+
+#define node_to_cpumask_ptr_next(v, node)	\
+			   v = &(node_to_cpumask_map[node])
+#endif
+
 #endif /* CONFIG_X86_64 */
 
 /*
@@ -186,10 +197,10 @@ static inline void set_mp_bus_to_node(in
 {
 }
 
-#include <asm-generic/topology.h>
-
 #endif
 
+#include <asm-generic/topology.h>
+
 extern cpumask_t cpu_coregroup_map(int cpu);
 
 #ifdef ENABLE_TOPO_DEFINES
--- linux.trees.git.orig/include/linux/topology.h
+++ linux.trees.git/include/linux/topology.h
@@ -38,16 +38,15 @@
 #endif
 
 #ifndef nr_cpus_node
-#define nr_cpus_node(node)							\
-	({									\
-		cpumask_t __tmp__;						\
-		__tmp__ = node_to_cpumask(node);				\
-		cpus_weight(__tmp__);						\
+#define nr_cpus_node(node)				\
+	({						\
+		node_to_cpumask_ptr(__tmp__, node);	\
+		cpus_weight(*__tmp__);			\
 	})
 #endif
 
-#define for_each_node_with_cpus(node)						\
-	for_each_online_node(node)						\
+#define for_each_node_with_cpus(node)			\
+	for_each_online_node(node)			\
 		if (nr_cpus_node(node))
 
 void arch_update_cpu_topology(void);
--- linux.trees.git.orig/kernel/sched.c
+++ linux.trees.git/kernel/sched.c
@@ -6252,7 +6252,7 @@ init_sched_build_groups(cpumask_t span, 
  *
  * Should use nodemask_t.
  */
-static int find_next_best_node(int node, unsigned long *used_nodes)
+static int find_next_best_node(int node, nodemask_t *used_nodes)
 {
 	int i, n, val, min_val, best_node = 0;
 
@@ -6266,7 +6266,7 @@ static int find_next_best_node(int node,
 			continue;
 
 		/* Skip already used nodes */
-		if (test_bit(n, used_nodes))
+		if (node_isset(n, *used_nodes))
 			continue;
 
 		/* Simple min distance search */
@@ -6278,14 +6278,13 @@ static int find_next_best_node(int node,
 		}
 	}
 
-	set_bit(best_node, used_nodes);
+	node_set(best_node, *used_nodes);
 	return best_node;
 }
 
 /**
  * sched_domain_node_span - get a cpumask for a node's sched_domain
  * @node: node whose cpumask we're constructing
- * @size: number of nodes to include in this span
  *
  * Given a node, construct a good cpumask for its sched_domain to span. It
  * should be one that prevents unnecessary balancing, but also spreads tasks
@@ -6293,22 +6292,22 @@ static int find_next_best_node(int node,
  */
 static cpumask_t sched_domain_node_span(int node)
 {
-	DECLARE_BITMAP(used_nodes, MAX_NUMNODES);
-	cpumask_t span, nodemask;
+	nodemask_t used_nodes;
+	cpumask_t span;
+	node_to_cpumask_ptr(nodemask, node);
 	int i;
 
 	cpus_clear(span);
-	bitmap_zero(used_nodes, MAX_NUMNODES);
+	nodes_clear(used_nodes);
 
-	nodemask = node_to_cpumask(node);
-	cpus_or(span, span, nodemask);
-	set_bit(node, used_nodes);
+	cpus_or(span, span, *nodemask);
+	node_set(node, used_nodes);
 
 	for (i = 1; i < SD_NODES_PER_DOMAIN; i++) {
-		int next_node = find_next_best_node(node, used_nodes);
+		int next_node = find_next_best_node(node, &used_nodes);
 
-		nodemask = node_to_cpumask(next_node);
-		cpus_or(span, span, nodemask);
+		node_to_cpumask_ptr_next(nodemask, next_node);
+		cpus_or(span, span, *nodemask);
 	}
 
 	return span;
@@ -6705,6 +6704,7 @@ static int build_sched_domains(const cpu
 		for (j = 0; j < MAX_NUMNODES; j++) {
 			cpumask_t tmp, notcovered;
 			int n = (i + j) % MAX_NUMNODES;
+			node_to_cpumask_ptr(nodemask, n);
 
 			cpus_complement(notcovered, covered);
 			cpus_and(tmp, notcovered, *cpu_map);
@@ -6712,8 +6712,7 @@ static int build_sched_domains(const cpu
 			if (cpus_empty(tmp))
 				break;
 
-			nodemask = node_to_cpumask(n);
-			cpus_and(tmp, tmp, nodemask);
+			cpus_and(tmp, tmp, *nodemask);
 			if (cpus_empty(tmp))
 				continue;
 
--- linux.trees.git.orig/mm/page_alloc.c
+++ linux.trees.git/mm/page_alloc.c
@@ -2029,6 +2029,7 @@ static int find_next_best_node(int node,
 	int n, val;
 	int min_val = INT_MAX;
 	int best_node = -1;
+	node_to_cpumask_ptr(tmp, 0);
 
 	/* Use the local node if we haven't already */
 	if (!node_isset(node, *used_node_mask)) {
@@ -2037,7 +2038,6 @@ static int find_next_best_node(int node,
 	}
 
 	for_each_node_state(n, N_HIGH_MEMORY) {
-		cpumask_t tmp;
 
 		/* Don't want a node to appear more than once */
 		if (node_isset(n, *used_node_mask))
@@ -2050,8 +2050,8 @@ static int find_next_best_node(int node,
 		val += (n < node);
 
 		/* Give preference to headless and unused nodes */
-		tmp = node_to_cpumask(n);
-		if (!cpus_empty(tmp))
+		node_to_cpumask_ptr_next(tmp, n);
+		if (!cpus_empty(*tmp))
 			val += PENALTY_FOR_NODE_WITH_CPUS;
 
 		/* Slight preference for less loaded node */
--- linux.trees.git.orig/mm/slab.c
+++ linux.trees.git/mm/slab.c
@@ -1160,14 +1160,13 @@ static void __cpuinit cpuup_canceled(lon
 	struct kmem_cache *cachep;
 	struct kmem_list3 *l3 = NULL;
 	int node = cpu_to_node(cpu);
+	node_to_cpumask_ptr(mask, node);
 
 	list_for_each_entry(cachep, &cache_chain, next) {
 		struct array_cache *nc;
 		struct array_cache *shared;
 		struct array_cache **alien;
-		cpumask_t mask;
 
-		mask = node_to_cpumask(node);
 		/* cpu is dead; no one can alloc from it. */
 		nc = cachep->array[cpu];
 		cachep->array[cpu] = NULL;
@@ -1183,7 +1182,7 @@ static void __cpuinit cpuup_canceled(lon
 		if (nc)
 			free_block(cachep, nc->entry, nc->avail, node);
 
-		if (!cpus_empty(mask)) {
+		if (!cpus_empty(*mask)) {
 			spin_unlock_irq(&l3->list_lock);
 			goto free_array_cache;
 		}
--- linux.trees.git.orig/mm/vmscan.c
+++ linux.trees.git/mm/vmscan.c
@@ -1664,11 +1664,10 @@ static int kswapd(void *p)
 	struct reclaim_state reclaim_state = {
 		.reclaimed_slab = 0,
 	};
-	cpumask_t cpumask;
+	node_to_cpumask_ptr(cpumask, pgdat->node_id);
 
-	cpumask = node_to_cpumask(pgdat->node_id);
-	if (!cpus_empty(cpumask))
-		set_cpus_allowed(tsk, &cpumask);
+	if (!cpus_empty(*cpumask))
+		set_cpus_allowed(tsk, cpumask);
 	current->reclaim_state = &reclaim_state;
 
 	/*
@@ -1897,17 +1896,16 @@ out:
 static int __devinit cpu_callback(struct notifier_block *nfb,
 				  unsigned long action, void *hcpu)
 {
-	pg_data_t *pgdat;
-	cpumask_t mask;
 	int nid;
 
 	if (action == CPU_ONLINE || action == CPU_ONLINE_FROZEN) {
 		for_each_node_state(nid, N_HIGH_MEMORY) {
-			pgdat = NODE_DATA(nid);
-			mask = node_to_cpumask(pgdat->node_id);
-			if (any_online_cpu(mask) < nr_cpu_ids)
+			pg_data_t *pgdat = NODE_DATA(nid);
+			node_to_cpumask_ptr(mask, pgdat->node_id);
+
+			if (any_online_cpu(*mask) < nr_cpu_ids)
 				/* One of our CPUs online: restore mask */
-				set_cpus_allowed(pgdat->kswapd, &mask);
+				set_cpus_allowed(pgdat->kswapd, mask);
 		}
 	}
 	return NOTIFY_OK;
--- linux.trees.git.orig/net/sunrpc/svc.c
+++ linux.trees.git/net/sunrpc/svc.c
@@ -323,10 +323,10 @@ svc_pool_map_set_cpumask(unsigned int pi
 	case SVC_POOL_PERNODE:
 	{
 		unsigned int node = m->pool_to[pidx];
-		cpumask_t nodecpumask = node_to_cpumask(node);
+		node_to_cpumask_ptr(nodecpumask, node);
 
 		*oldmask = current->cpus_allowed;
-		set_cpus_allowed(current, &nodecpumask);
+		set_cpus_allowed(current, nodecpumask);
 		return 1;
 	}
 	}

-- 


* [PATCH 07/12] cpumask: reduce stack usage in SD_x_INIT initializers
  2008-03-26  1:38 ` Mike Travis
@ 2008-03-26  1:38   ` Mike Travis
  -1 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, linux-mm, linux-kernel, Thomas Gleixner, H. Peter Anvin

[-- Attachment #1: sched_domain --]
[-- Type: text/plain, Size: 7353 bytes --]

Remove the empty cpumask_t initializer (and all other zero/NULL field
initializers) from the SD_*_INIT macros.  Use memset(0) to clear the
structure instead.  Also, don't inline the initializer functions, to
save stack space in build_sched_domains().
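
As a side note, here is a minimal user-space sketch (hypothetical struct
and helper names, not kernel code) of why the dropped entries are safe to
omit: members not named in a designated initializer are zero-initialized
anyway, and clearing the target with memset() first, as the new
sd_init_*() helpers do, makes that explicit.  Marking the real helpers
noinline also keeps the large compound-literal temporary out of the
build_sched_domains() stack frame.

	#include <string.h>

	struct example_domain {
		int min_interval;
		int max_interval;
		int nr_balance_failed;	/* intentionally not listed below */
	};

	#define EXAMPLE_DOMAIN_INIT (struct example_domain) {	\
		.min_interval	= 1,				\
		.max_interval	= 4,				\
	}

	static void example_domain_init(struct example_domain *sd)
	{
		memset(sd, 0, sizeof(*sd));	/* explicit clear, as in SD_INIT_FUNC */
		*sd = EXAMPLE_DOMAIN_INIT;	/* unlisted members remain 0 */
	}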

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>

Signed-off-by: Mike Travis <travis@sgi.com>
---
One checkpatch error remains that I don't think can be avoided:

ERROR: Macros with complex values should be enclosed in parenthesis
#165: FILE: kernel/sched.c:6591:
#define SD_INIT_FUNC(type)	\
static noinline void sd_init_##type(struct sched_domain *sd)	\
{								\
	memset(sd, 0, sizeof(*sd));				\
	*sd = SD_##type##_INIT;					\
}

---
 include/asm-x86/topology.h |    5 -----
 include/linux/topology.h   |   33 ++-------------------------------
 kernel/sched.c             |   35 ++++++++++++++++++++++++++++++-----
 3 files changed, 32 insertions(+), 41 deletions(-)

--- linux.trees.git.orig/include/asm-x86/topology.h
+++ linux.trees.git/include/asm-x86/topology.h
@@ -155,10 +155,6 @@ extern unsigned long node_remap_size[];
 
 /* sched_domains SD_NODE_INIT for NUMAQ machines */
 #define SD_NODE_INIT (struct sched_domain) {		\
-	.span			= CPU_MASK_NONE,	\
-	.parent			= NULL,			\
-	.child			= NULL,			\
-	.groups			= NULL,			\
 	.min_interval		= 8,			\
 	.max_interval		= 32,			\
 	.busy_factor		= 32,			\
@@ -176,7 +172,6 @@ extern unsigned long node_remap_size[];
 				| SD_WAKE_BALANCE,	\
 	.last_balance		= jiffies,		\
 	.balance_interval	= 1,			\
-	.nr_balance_failed	= 0,			\
 }
 
 #ifdef CONFIG_X86_64_ACPI_NUMA
--- linux.trees.git.orig/include/linux/topology.h
+++ linux.trees.git/include/linux/topology.h
@@ -79,7 +79,9 @@ void arch_update_cpu_topology(void);
  * by defining their own arch-specific initializer in include/asm/topology.h.
  * A definition there will automagically override these default initializers
  * and allow arch-specific performance tuning of sched_domains.
+ * (Only non-zero and non-null fields need be specified.)
  */
+
 #ifdef CONFIG_SCHED_SMT
 /* MCD - Do we really need this?  It is always on if CONFIG_SCHED_SMT is,
  * so can't we drop this in favor of CONFIG_SCHED_SMT?
@@ -88,20 +90,10 @@ void arch_update_cpu_topology(void);
 /* Common values for SMT siblings */
 #ifndef SD_SIBLING_INIT
 #define SD_SIBLING_INIT (struct sched_domain) {		\
-	.span			= CPU_MASK_NONE,	\
-	.parent			= NULL,			\
-	.child			= NULL,			\
-	.groups			= NULL,			\
 	.min_interval		= 1,			\
 	.max_interval		= 2,			\
 	.busy_factor		= 64,			\
 	.imbalance_pct		= 110,			\
-	.cache_nice_tries	= 0,			\
-	.busy_idx		= 0,			\
-	.idle_idx		= 0,			\
-	.newidle_idx		= 0,			\
-	.wake_idx		= 0,			\
-	.forkexec_idx		= 0,			\
 	.flags			= SD_LOAD_BALANCE	\
 				| SD_BALANCE_NEWIDLE	\
 				| SD_BALANCE_FORK	\
@@ -111,7 +103,6 @@ void arch_update_cpu_topology(void);
 				| SD_SHARE_CPUPOWER,	\
 	.last_balance		= jiffies,		\
 	.balance_interval	= 1,			\
-	.nr_balance_failed	= 0,			\
 }
 #endif
 #endif /* CONFIG_SCHED_SMT */
@@ -120,18 +111,12 @@ void arch_update_cpu_topology(void);
 /* Common values for MC siblings. for now mostly derived from SD_CPU_INIT */
 #ifndef SD_MC_INIT
 #define SD_MC_INIT (struct sched_domain) {		\
-	.span			= CPU_MASK_NONE,	\
-	.parent			= NULL,			\
-	.child			= NULL,			\
-	.groups			= NULL,			\
 	.min_interval		= 1,			\
 	.max_interval		= 4,			\
 	.busy_factor		= 64,			\
 	.imbalance_pct		= 125,			\
 	.cache_nice_tries	= 1,			\
 	.busy_idx		= 2,			\
-	.idle_idx		= 0,			\
-	.newidle_idx		= 0,			\
 	.wake_idx		= 1,			\
 	.forkexec_idx		= 1,			\
 	.flags			= SD_LOAD_BALANCE	\
@@ -143,7 +128,6 @@ void arch_update_cpu_topology(void);
 				| BALANCE_FOR_MC_POWER,	\
 	.last_balance		= jiffies,		\
 	.balance_interval	= 1,			\
-	.nr_balance_failed	= 0,			\
 }
 #endif
 #endif /* CONFIG_SCHED_MC */
@@ -151,10 +135,6 @@ void arch_update_cpu_topology(void);
 /* Common values for CPUs */
 #ifndef SD_CPU_INIT
 #define SD_CPU_INIT (struct sched_domain) {		\
-	.span			= CPU_MASK_NONE,	\
-	.parent			= NULL,			\
-	.child			= NULL,			\
-	.groups			= NULL,			\
 	.min_interval		= 1,			\
 	.max_interval		= 4,			\
 	.busy_factor		= 64,			\
@@ -173,16 +153,11 @@ void arch_update_cpu_topology(void);
 				| BALANCE_FOR_PKG_POWER,\
 	.last_balance		= jiffies,		\
 	.balance_interval	= 1,			\
-	.nr_balance_failed	= 0,			\
 }
 #endif
 
 /* sched_domains SD_ALLNODES_INIT for NUMA machines */
 #define SD_ALLNODES_INIT (struct sched_domain) {	\
-	.span			= CPU_MASK_NONE,	\
-	.parent			= NULL,			\
-	.child			= NULL,			\
-	.groups			= NULL,			\
 	.min_interval		= 64,			\
 	.max_interval		= 64*num_online_cpus(),	\
 	.busy_factor		= 128,			\
@@ -190,14 +165,10 @@ void arch_update_cpu_topology(void);
 	.cache_nice_tries	= 1,			\
 	.busy_idx		= 3,			\
 	.idle_idx		= 3,			\
-	.newidle_idx		= 0, /* unused */	\
-	.wake_idx		= 0, /* unused */	\
-	.forkexec_idx		= 0, /* unused */	\
 	.flags			= SD_LOAD_BALANCE	\
 				| SD_SERIALIZE,	\
 	.last_balance		= jiffies,		\
 	.balance_interval	= 64,			\
-	.nr_balance_failed	= 0,			\
 }
 
 #ifdef CONFIG_NUMA
--- linux.trees.git.orig/kernel/sched.c
+++ linux.trees.git/kernel/sched.c
@@ -6532,6 +6532,31 @@ static void init_sched_groups_power(int 
 }
 
 /*
+ * Initializers for schedule domains
+ * Non-inlined to reduce accumulated stack pressure in build_sched_domains()
+ */
+
+#define	SD_INIT(sd, type)	sd_init_##type(sd)
+#define SD_INIT_FUNC(type)	\
+static noinline void sd_init_##type(struct sched_domain *sd)	\
+{								\
+	memset(sd, 0, sizeof(*sd));				\
+	*sd = SD_##type##_INIT;					\
+}
+
+SD_INIT_FUNC(CPU)
+#ifdef CONFIG_NUMA
+ SD_INIT_FUNC(ALLNODES)
+ SD_INIT_FUNC(NODE)
+#endif
+#ifdef CONFIG_SCHED_SMT
+ SD_INIT_FUNC(SIBLING)
+#endif
+#ifdef CONFIG_SCHED_MC
+ SD_INIT_FUNC(MC)
+#endif
+
+/*
  * Build sched domains for a given set of cpus and attach the sched domains
  * to the individual cpus
  */
@@ -6574,7 +6599,7 @@ static int build_sched_domains(const cpu
 		if (cpus_weight(*cpu_map) >
 				SD_NODES_PER_DOMAIN*cpus_weight(nodemask)) {
 			sd = &per_cpu(allnodes_domains, i);
-			*sd = SD_ALLNODES_INIT;
+			SD_INIT(sd, ALLNODES);
 			sd->span = *cpu_map;
 			cpu_to_allnodes_group(i, cpu_map, &sd->groups);
 			p = sd;
@@ -6583,7 +6608,7 @@ static int build_sched_domains(const cpu
 			p = NULL;
 
 		sd = &per_cpu(node_domains, i);
-		*sd = SD_NODE_INIT;
+		SD_INIT(sd, NODE);
 		sd->span = sched_domain_node_span(cpu_to_node(i));
 		sd->parent = p;
 		if (p)
@@ -6593,7 +6618,7 @@ static int build_sched_domains(const cpu
 
 		p = sd;
 		sd = &per_cpu(phys_domains, i);
-		*sd = SD_CPU_INIT;
+		SD_INIT(sd, CPU);
 		sd->span = nodemask;
 		sd->parent = p;
 		if (p)
@@ -6603,7 +6628,7 @@ static int build_sched_domains(const cpu
 #ifdef CONFIG_SCHED_MC
 		p = sd;
 		sd = &per_cpu(core_domains, i);
-		*sd = SD_MC_INIT;
+		SD_INIT(sd, MC);
 		sd->span = cpu_coregroup_map(i);
 		cpus_and(sd->span, sd->span, *cpu_map);
 		sd->parent = p;
@@ -6614,7 +6639,7 @@ static int build_sched_domains(const cpu
 #ifdef CONFIG_SCHED_SMT
 		p = sd;
 		sd = &per_cpu(cpu_domains, i);
-		*sd = SD_SIBLING_INIT;
+		SD_INIT(sd, SIBLING);
 		sd->span = per_cpu(cpu_sibling_map, i);
 		cpus_and(sd->span, sd->span, *cpu_map);
 		sd->parent = p;

-- 

* [PATCH 08/12] cpumask: pass temp cpumask variables in init_sched_build_groups
  2008-03-26  1:38 ` Mike Travis
@ 2008-03-26  1:38   ` Mike Travis
  -1 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, linux-mm, linux-kernel

[-- Attachment #1: kern_sched --]
[-- Type: text/plain, Size: 18852 bytes --]

Pass pointers to temporary cpumask variables instead of creating them on the stack.
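
As an illustration (hypothetical helper names, not taken from this patch),
the change turns per-function cpumask locals into caller-supplied scratch
space.  With NR_CPUS=4096 a cpumask_t is 512 bytes, so every such local
costs that much stack in whichever function declares it:

	/* before: the temporary lives on this function's stack */
	static int pick_cpu_old(struct sched_group *group, struct task_struct *p)
	{
		cpumask_t tmp;		/* 512 bytes with NR_CPUS=4096 */

		cpus_and(tmp, group->cpumask, p->cpus_allowed);
		return first_cpu(tmp);
	}

	/* after: the caller owns one scratch mask and passes it down */
	static int pick_cpu_new(struct sched_group *group, struct task_struct *p,
				cpumask_t *tmp)
	{
		cpus_and(*tmp, group->cpumask, p->cpus_allowed);
		return first_cpu(*tmp);
	}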

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

Cc: Ingo Molnar <mingo@elte.hu>

Signed-off-by: Mike Travis <travis@sgi.com>
---
 kernel/sched.c |  218 ++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 126 insertions(+), 92 deletions(-)

--- linux.trees.git.orig/kernel/sched.c
+++ linux.trees.git/kernel/sched.c
@@ -1670,17 +1670,17 @@ find_idlest_group(struct sched_domain *s
  * find_idlest_cpu - find the idlest cpu among the cpus in group.
  */
 static int
-find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
+find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu,
+		cpumask_t *tmp)
 {
-	cpumask_t tmp;
 	unsigned long load, min_load = ULONG_MAX;
 	int idlest = -1;
 	int i;
 
 	/* Traverse only the allowed CPUs */
-	cpus_and(tmp, group->cpumask, p->cpus_allowed);
+	cpus_and(*tmp, group->cpumask, p->cpus_allowed);
 
-	for_each_cpu_mask(i, tmp) {
+	for_each_cpu_mask(i, *tmp) {
 		load = weighted_cpuload(i);
 
 		if (load < min_load || (load == min_load && i == this_cpu)) {
@@ -1719,7 +1719,7 @@ static int sched_balance_self(int cpu, i
 	}
 
 	while (sd) {
-		cpumask_t span;
+		cpumask_t span, tmpmask;
 		struct sched_group *group;
 		int new_cpu, weight;
 
@@ -1735,7 +1735,7 @@ static int sched_balance_self(int cpu, i
 			continue;
 		}
 
-		new_cpu = find_idlest_cpu(group, t, cpu);
+		new_cpu = find_idlest_cpu(group, t, cpu, &tmpmask);
 		if (new_cpu == -1 || new_cpu == cpu) {
 			/* Now try balancing at a lower domain level of cpu */
 			sd = sd->child;
@@ -2616,7 +2616,7 @@ static int move_one_task(struct rq *this
 static struct sched_group *
 find_busiest_group(struct sched_domain *sd, int this_cpu,
 		   unsigned long *imbalance, enum cpu_idle_type idle,
-		   int *sd_idle, cpumask_t *cpus, int *balance)
+		   int *sd_idle, const cpumask_t *cpus, int *balance)
 {
 	struct sched_group *busiest = NULL, *this = NULL, *group = sd->groups;
 	unsigned long max_load, avg_load, total_load, this_load, total_pwr;
@@ -2917,7 +2917,7 @@ ret:
  */
 static struct rq *
 find_busiest_queue(struct sched_group *group, enum cpu_idle_type idle,
-		   unsigned long imbalance, cpumask_t *cpus)
+		   unsigned long imbalance, const cpumask_t *cpus)
 {
 	struct rq *busiest = NULL, *rq;
 	unsigned long max_load = 0;
@@ -2956,15 +2956,16 @@ find_busiest_queue(struct sched_group *g
  */
 static int load_balance(int this_cpu, struct rq *this_rq,
 			struct sched_domain *sd, enum cpu_idle_type idle,
-			int *balance)
+			int *balance, cpumask_t *cpus)
 {
 	int ld_moved, all_pinned = 0, active_balance = 0, sd_idle = 0;
 	struct sched_group *group;
 	unsigned long imbalance;
 	struct rq *busiest;
-	cpumask_t cpus = CPU_MASK_ALL;
 	unsigned long flags;
 
+	cpus_setall(*cpus);
+
 	/*
 	 * When power savings policy is enabled for the parent domain, idle
 	 * sibling can pick up load irrespective of busy siblings. In this case,
@@ -2979,7 +2980,7 @@ static int load_balance(int this_cpu, st
 
 redo:
 	group = find_busiest_group(sd, this_cpu, &imbalance, idle, &sd_idle,
-				   &cpus, balance);
+				   cpus, balance);
 
 	if (*balance == 0)
 		goto out_balanced;
@@ -2989,7 +2990,7 @@ redo:
 		goto out_balanced;
 	}
 
-	busiest = find_busiest_queue(group, idle, imbalance, &cpus);
+	busiest = find_busiest_queue(group, idle, imbalance, cpus);
 	if (!busiest) {
 		schedstat_inc(sd, lb_nobusyq[idle]);
 		goto out_balanced;
@@ -3022,8 +3023,8 @@ redo:
 
 		/* All tasks on this runqueue were pinned by CPU affinity */
 		if (unlikely(all_pinned)) {
-			cpu_clear(cpu_of(busiest), cpus);
-			if (!cpus_empty(cpus))
+			cpu_clear(cpu_of(busiest), *cpus);
+			if (!cpus_empty(*cpus))
 				goto redo;
 			goto out_balanced;
 		}
@@ -3108,7 +3109,8 @@ out_one_pinned:
  * this_rq is locked.
  */
 static int
-load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd)
+load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd,
+			cpumask_t *cpus)
 {
 	struct sched_group *group;
 	struct rq *busiest = NULL;
@@ -3116,7 +3118,8 @@ load_balance_newidle(int this_cpu, struc
 	int ld_moved = 0;
 	int sd_idle = 0;
 	int all_pinned = 0;
-	cpumask_t cpus = CPU_MASK_ALL;
+
+	cpus_setall(*cpus);
 
 	/*
 	 * When power savings policy is enabled for the parent domain, idle
@@ -3131,14 +3134,13 @@ load_balance_newidle(int this_cpu, struc
 	schedstat_inc(sd, lb_count[CPU_NEWLY_IDLE]);
 redo:
 	group = find_busiest_group(sd, this_cpu, &imbalance, CPU_NEWLY_IDLE,
-				   &sd_idle, &cpus, NULL);
+				   &sd_idle, cpus, NULL);
 	if (!group) {
 		schedstat_inc(sd, lb_nobusyg[CPU_NEWLY_IDLE]);
 		goto out_balanced;
 	}
 
-	busiest = find_busiest_queue(group, CPU_NEWLY_IDLE, imbalance,
-				&cpus);
+	busiest = find_busiest_queue(group, CPU_NEWLY_IDLE, imbalance, cpus);
 	if (!busiest) {
 		schedstat_inc(sd, lb_nobusyq[CPU_NEWLY_IDLE]);
 		goto out_balanced;
@@ -3160,8 +3162,8 @@ redo:
 		spin_unlock(&busiest->lock);
 
 		if (unlikely(all_pinned)) {
-			cpu_clear(cpu_of(busiest), cpus);
-			if (!cpus_empty(cpus))
+			cpu_clear(cpu_of(busiest), *cpus);
+			if (!cpus_empty(*cpus))
 				goto redo;
 		}
 	}
@@ -3195,6 +3197,7 @@ static void idle_balance(int this_cpu, s
 	struct sched_domain *sd;
 	int pulled_task = -1;
 	unsigned long next_balance = jiffies + HZ;
+	cpumask_t tmpmask;
 
 	for_each_domain(this_cpu, sd) {
 		unsigned long interval;
@@ -3204,8 +3207,8 @@ static void idle_balance(int this_cpu, s
 
 		if (sd->flags & SD_BALANCE_NEWIDLE)
 			/* If we've pulled tasks over stop searching: */
-			pulled_task = load_balance_newidle(this_cpu,
-								this_rq, sd);
+			pulled_task = load_balance_newidle(this_cpu, this_rq,
+							   sd, &tmpmask);
 
 		interval = msecs_to_jiffies(sd->balance_interval);
 		if (time_after(next_balance, sd->last_balance + interval))
@@ -3364,6 +3367,7 @@ static void rebalance_domains(int cpu, e
 	/* Earliest time when we have to do rebalance again */
 	unsigned long next_balance = jiffies + 60*HZ;
 	int update_next_balance = 0;
+	cpumask_t tmp;
 
 	for_each_domain(cpu, sd) {
 		if (!(sd->flags & SD_LOAD_BALANCE))
@@ -3387,7 +3391,7 @@ static void rebalance_domains(int cpu, e
 		}
 
 		if (time_after_eq(jiffies, sd->last_balance + interval)) {
-			if (load_balance(cpu, rq, sd, idle, &balance)) {
+			if (load_balance(cpu, rq, sd, idle, &balance, &tmp)) {
 				/*
 				 * We've pulled tasks over so either we're no
 				 * longer idle, or one of our SMT siblings is
@@ -5912,21 +5916,10 @@ void __init migration_init(void)
 
 #ifdef CONFIG_SCHED_DEBUG
 
-static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level)
+static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
+				  cpumask_t *groupmask, char *str, int len)
 {
 	struct sched_group *group = sd->groups;
-	cpumask_t groupmask;
-	int len = cpumask_scnprintf_len(nr_cpu_ids);
-	char *str = kmalloc(len, GFP_KERNEL);
-	int ret = 0;
-
-	if (!str) {
-		printk(KERN_DEBUG "Cannot load-balance (no memory)\n");
-		return -1;
-	}
-
-	cpumask_scnprintf(str, len, sd->span);
-	cpus_clear(groupmask);
 
 	printk(KERN_DEBUG "%*s domain %d: ", level, "", level);
 
@@ -5935,10 +5928,12 @@ static int sched_domain_debug_one(struct
 		if (sd->parent)
 			printk(KERN_ERR "ERROR: !SD_LOAD_BALANCE domain"
 					" has parent");
-		kfree(str);
 		return -1;
 	}
 
+	cpumask_scnprintf(str, len, sd->span);
+	cpus_clear(*groupmask);
+
 	printk(KERN_CONT "span %s\n", str);
 
 	if (!cpu_isset(cpu, sd->span)) {
@@ -5971,13 +5966,13 @@ static int sched_domain_debug_one(struct
 			break;
 		}
 
-		if (cpus_intersects(groupmask, group->cpumask)) {
+		if (cpus_intersects(*groupmask, group->cpumask)) {
 			printk(KERN_CONT "\n");
 			printk(KERN_ERR "ERROR: repeated CPUs\n");
 			break;
 		}
 
-		cpus_or(groupmask, groupmask, group->cpumask);
+		cpus_or(*groupmask, *groupmask, group->cpumask);
 
 		cpumask_scnprintf(str, len, group->cpumask);
 		printk(KERN_CONT " %s", str);
@@ -5986,36 +5981,49 @@ static int sched_domain_debug_one(struct
 	} while (group != sd->groups);
 	printk(KERN_CONT "\n");
 
-	if (!cpus_equal(sd->span, groupmask))
+	if (!cpus_equal(sd->span, *groupmask))
 		printk(KERN_ERR "ERROR: groups don't span domain->span\n");
 
-	if (sd->parent && !cpus_subset(groupmask, sd->parent->span))
+	if (sd->parent && !cpus_subset(*groupmask, sd->parent->span))
 		printk(KERN_ERR "ERROR: parent span is not a superset "
 			"of domain->span\n");
 
-	kfree(str);
 	return 0;
 }
 
 static void sched_domain_debug(struct sched_domain *sd, int cpu)
 {
 	int level = 0;
+	char *str = NULL;
+	cpumask_t *groupmask = NULL;
+	int len;
 
 	if (!sd) {
 		printk(KERN_DEBUG "CPU%d attaching NULL sched-domain.\n", cpu);
 		return;
 	}
 
+	groupmask = kmalloc(sizeof(cpumask_t), GFP_KERNEL);
+	len = cpumask_scnprintf_len(nr_cpu_ids);
+	str = kmalloc(len, GFP_KERNEL);
+	if (!groupmask || !str) {
+		printk(KERN_DEBUG "Cannot load-balance (out of memory)\n");
+		goto exit;
+	}
+
 	printk(KERN_DEBUG "CPU%d attaching sched-domain:\n", cpu);
 
 	for (;;) {
-		if (sched_domain_debug_one(sd, cpu, level))
+		if (sched_domain_debug_one(sd, cpu, level, groupmask, str, len))
 			break;
 		level++;
 		sd = sd->parent;
 		if (!sd)
 			break;
 	}
+exit:
+	kfree(str);
+	kfree(groupmask);
 }
 #else
 # define sched_domain_debug(sd, cpu) do { } while (0)
@@ -6203,30 +6211,33 @@ __setup("isolcpus=", isolated_cpu_setup)
  * and ->cpu_power to 0.
  */
 static void
-init_sched_build_groups(cpumask_t span, const cpumask_t *cpu_map,
+init_sched_build_groups(const cpumask_t *span, const cpumask_t *cpu_map,
 			int (*group_fn)(int cpu, const cpumask_t *cpu_map,
-					struct sched_group **sg))
+					struct sched_group **sg,
+					cpumask_t *tmpmask),
+			cpumask_t *covered, cpumask_t *tmpmask)
 {
 	struct sched_group *first = NULL, *last = NULL;
-	cpumask_t covered = CPU_MASK_NONE;
 	int i;
 
-	for_each_cpu_mask(i, span) {
+	*covered = CPU_MASK_NONE;
+
+	for_each_cpu_mask(i, *span) {
 		struct sched_group *sg;
-		int group = group_fn(i, cpu_map, &sg);
+		int group = group_fn(i, cpu_map, &sg, tmpmask);
 		int j;
 
-		if (cpu_isset(i, covered))
+		if (cpu_isset(i, *covered))
 			continue;
 
 		sg->cpumask = CPU_MASK_NONE;
 		sg->__cpu_power = 0;
 
-		for_each_cpu_mask(j, span) {
-			if (group_fn(j, cpu_map, NULL) != group)
+		for_each_cpu_mask(j, *span) {
+			if (group_fn(j, cpu_map, NULL, tmpmask) != group)
 				continue;
 
-			cpu_set(j, covered);
+			cpu_set(j, *covered);
 			cpu_set(j, sg->cpumask);
 		}
 		if (!first)
@@ -6324,7 +6335,8 @@ static DEFINE_PER_CPU(struct sched_domai
 static DEFINE_PER_CPU(struct sched_group, sched_group_cpus);
 
 static int
-cpu_to_cpu_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_cpu_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+		 cpumask_t *unused)
 {
 	if (sg)
 		*sg = &per_cpu(sched_group_cpus, cpu);
@@ -6342,19 +6354,22 @@ static DEFINE_PER_CPU(struct sched_group
 
 #if defined(CONFIG_SCHED_MC) && defined(CONFIG_SCHED_SMT)
 static int
-cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+		  cpumask_t *mask)
 {
 	int group;
-	cpumask_t mask = per_cpu(cpu_sibling_map, cpu);
-	cpus_and(mask, mask, *cpu_map);
-	group = first_cpu(mask);
+
+	*mask = per_cpu(cpu_sibling_map, cpu);
+	cpus_and(*mask, *mask, *cpu_map);
+	group = first_cpu(*mask);
 	if (sg)
 		*sg = &per_cpu(sched_group_core, group);
 	return group;
 }
 #elif defined(CONFIG_SCHED_MC)
 static int
-cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+		  cpumask_t *unused)
 {
 	if (sg)
 		*sg = &per_cpu(sched_group_core, cpu);
@@ -6366,17 +6381,18 @@ static DEFINE_PER_CPU(struct sched_domai
 static DEFINE_PER_CPU(struct sched_group, sched_group_phys);
 
 static int
-cpu_to_phys_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_phys_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+		  cpumask_t *mask)
 {
 	int group;
 #ifdef CONFIG_SCHED_MC
-	cpumask_t mask = cpu_coregroup_map(cpu);
-	cpus_and(mask, mask, *cpu_map);
-	group = first_cpu(mask);
+	*mask = cpu_coregroup_map(cpu);
+	cpus_and(*mask, *mask, *cpu_map);
+	group = first_cpu(*mask);
 #elif defined(CONFIG_SCHED_SMT)
-	cpumask_t mask = per_cpu(cpu_sibling_map, cpu);
-	cpus_and(mask, mask, *cpu_map);
-	group = first_cpu(mask);
+	*mask = per_cpu(cpu_sibling_map, cpu);
+	cpus_and(*mask, *mask, *cpu_map);
+	group = first_cpu(*mask);
 #else
 	group = cpu;
 #endif
@@ -6398,13 +6414,13 @@ static DEFINE_PER_CPU(struct sched_domai
 static DEFINE_PER_CPU(struct sched_group, sched_group_allnodes);
 
 static int cpu_to_allnodes_group(int cpu, const cpumask_t *cpu_map,
-				 struct sched_group **sg)
+				 struct sched_group **sg, cpumask_t *nodemask)
 {
-	cpumask_t nodemask = node_to_cpumask(cpu_to_node(cpu));
 	int group;
 
-	cpus_and(nodemask, nodemask, *cpu_map);
-	group = first_cpu(nodemask);
+	*nodemask = node_to_cpumask(cpu_to_node(cpu));
+	cpus_and(*nodemask, *nodemask, *cpu_map);
+	group = first_cpu(*nodemask);
 
 	if (sg)
 		*sg = &per_cpu(sched_group_allnodes, group);
@@ -6440,7 +6456,7 @@ static void init_numa_sched_groups_power
 
 #ifdef CONFIG_NUMA
 /* Free memory allocated for various sched_group structures */
-static void free_sched_groups(const cpumask_t *cpu_map)
+static void free_sched_groups(const cpumask_t *cpu_map, cpumask_t *nodemask)
 {
 	int cpu, i;
 
@@ -6452,11 +6468,11 @@ static void free_sched_groups(const cpum
 			continue;
 
 		for (i = 0; i < MAX_NUMNODES; i++) {
-			cpumask_t nodemask = node_to_cpumask(i);
 			struct sched_group *oldsg, *sg = sched_group_nodes[i];
 
-			cpus_and(nodemask, nodemask, *cpu_map);
-			if (cpus_empty(nodemask))
+			*nodemask = node_to_cpumask(i);
+			cpus_and(*nodemask, *nodemask, *cpu_map);
+			if (cpus_empty(*nodemask))
 				continue;
 
 			if (sg == NULL)
@@ -6474,7 +6490,7 @@ next_sg:
 	}
 }
 #else
-static void free_sched_groups(const cpumask_t *cpu_map)
+static void free_sched_groups(const cpumask_t *cpu_map, cpumask_t *nodemask)
 {
 }
 #endif
@@ -6564,6 +6580,7 @@ static int build_sched_domains(const cpu
 {
 	int i;
 	struct root_domain *rd;
+	cpumask_t tmpmask;
 #ifdef CONFIG_NUMA
 	struct sched_group **sched_group_nodes = NULL;
 	int sd_allnodes = 0;
@@ -6601,7 +6618,8 @@ static int build_sched_domains(const cpu
 			sd = &per_cpu(allnodes_domains, i);
 			SD_INIT(sd, ALLNODES);
 			sd->span = *cpu_map;
-			cpu_to_allnodes_group(i, cpu_map, &sd->groups);
+			cpu_to_allnodes_group(i, cpu_map, &sd->groups,
+								      &tmpmask);
 			p = sd;
 			sd_allnodes = 1;
 		} else
@@ -6623,7 +6641,7 @@ static int build_sched_domains(const cpu
 		sd->parent = p;
 		if (p)
 			p->child = sd;
-		cpu_to_phys_group(i, cpu_map, &sd->groups);
+		cpu_to_phys_group(i, cpu_map, &sd->groups, &tmpmask);
 
 #ifdef CONFIG_SCHED_MC
 		p = sd;
@@ -6633,7 +6651,7 @@ static int build_sched_domains(const cpu
 		cpus_and(sd->span, sd->span, *cpu_map);
 		sd->parent = p;
 		p->child = sd;
-		cpu_to_core_group(i, cpu_map, &sd->groups);
+		cpu_to_core_group(i, cpu_map, &sd->groups, &tmpmask);
 #endif
 
 #ifdef CONFIG_SCHED_SMT
@@ -6644,7 +6662,7 @@ static int build_sched_domains(const cpu
 		cpus_and(sd->span, sd->span, *cpu_map);
 		sd->parent = p;
 		p->child = sd;
-		cpu_to_cpu_group(i, cpu_map, &sd->groups);
+		cpu_to_cpu_group(i, cpu_map, &sd->groups, &tmpmask);
 #endif
 	}
 
@@ -6652,12 +6670,15 @@ static int build_sched_domains(const cpu
 	/* Set up CPU (sibling) groups */
 	for_each_cpu_mask(i, *cpu_map) {
 		cpumask_t this_sibling_map = per_cpu(cpu_sibling_map, i);
+		cpumask_t send_covered;
+
 		cpus_and(this_sibling_map, this_sibling_map, *cpu_map);
 		if (i != first_cpu(this_sibling_map))
 			continue;
 
-		init_sched_build_groups(this_sibling_map, cpu_map,
-					&cpu_to_cpu_group);
+		init_sched_build_groups(&this_sibling_map, cpu_map,
+					&cpu_to_cpu_group,
+					&send_covered, &tmpmask);
 	}
 #endif
 
@@ -6665,30 +6686,40 @@ static int build_sched_domains(const cpu
 	/* Set up multi-core groups */
 	for_each_cpu_mask(i, *cpu_map) {
 		cpumask_t this_core_map = cpu_coregroup_map(i);
+		cpumask_t send_covered;
+
 		cpus_and(this_core_map, this_core_map, *cpu_map);
 		if (i != first_cpu(this_core_map))
 			continue;
-		init_sched_build_groups(this_core_map, cpu_map,
-					&cpu_to_core_group);
+		init_sched_build_groups(&this_core_map, cpu_map,
+					&cpu_to_core_group,
+					&send_covered, &tmpmask);
 	}
 #endif
 
 	/* Set up physical groups */
 	for (i = 0; i < MAX_NUMNODES; i++) {
 		cpumask_t nodemask = node_to_cpumask(i);
+		cpumask_t send_covered;
 
 		cpus_and(nodemask, nodemask, *cpu_map);
 		if (cpus_empty(nodemask))
 			continue;
 
-		init_sched_build_groups(nodemask, cpu_map, &cpu_to_phys_group);
+		init_sched_build_groups(&nodemask, cpu_map,
+					&cpu_to_phys_group,
+					&send_covered, &tmpmask);
 	}
 
 #ifdef CONFIG_NUMA
 	/* Set up node groups */
-	if (sd_allnodes)
-		init_sched_build_groups(*cpu_map, cpu_map,
-					&cpu_to_allnodes_group);
+	if (sd_allnodes) {
+		cpumask_t send_covered;
+
+		init_sched_build_groups(cpu_map, cpu_map,
+					&cpu_to_allnodes_group,
+					&send_covered, &tmpmask);
+	}
 
 	for (i = 0; i < MAX_NUMNODES; i++) {
 		/* Set up node groups */
@@ -6787,7 +6818,8 @@ static int build_sched_domains(const cpu
 	if (sd_allnodes) {
 		struct sched_group *sg;
 
-		cpu_to_allnodes_group(first_cpu(*cpu_map), cpu_map, &sg);
+		cpu_to_allnodes_group(first_cpu(*cpu_map), cpu_map, &sg,
+								&tmpmask);
 		init_numa_sched_groups_power(sg);
 	}
 #endif
@@ -6809,7 +6841,7 @@ static int build_sched_domains(const cpu
 
 #ifdef CONFIG_NUMA
 error:
-	free_sched_groups(cpu_map);
+	free_sched_groups(cpu_map, &tmpmask);
 	return -ENOMEM;
 #endif
 }
@@ -6849,9 +6881,10 @@ static int arch_init_sched_domains(const
 	return err;
 }
 
-static void arch_destroy_sched_domains(const cpumask_t *cpu_map)
+static void arch_destroy_sched_domains(const cpumask_t *cpu_map,
+				       cpumask_t *tmpmask)
 {
-	free_sched_groups(cpu_map);
+	free_sched_groups(cpu_map, tmpmask);
 }
 
 /*
@@ -6860,6 +6893,7 @@ static void arch_destroy_sched_domains(c
  */
 static void detach_destroy_domains(const cpumask_t *cpu_map)
 {
+	cpumask_t tmpmask;
 	int i;
 
 	unregister_sched_domain_sysctl();
@@ -6867,7 +6901,7 @@ static void detach_destroy_domains(const
 	for_each_cpu_mask(i, *cpu_map)
 		cpu_attach_domain(NULL, &def_root_domain, i);
 	synchronize_sched();
-	arch_destroy_sched_domains(cpu_map);
+	arch_destroy_sched_domains(cpu_map, &tmpmask);
 }
 
 /*

-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 08/12] cpumask: pass temp cpumask variables in init_sched_build_groups
@ 2008-03-26  1:38   ` Mike Travis
  0 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, linux-mm, linux-kernel

[-- Attachment #1: kern_sched --]
[-- Type: text/plain, Size: 19078 bytes --]

Pass pointers to temporary cpumask variables instead of creating on the stack.

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

Cc: Ingo Molnar <mingo@elte.hu>

Signed-off-by: Mike Travis <travis@sgi.com>
---
 kernel/sched.c |  218 ++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 126 insertions(+), 92 deletions(-)

--- linux.trees.git.orig/kernel/sched.c
+++ linux.trees.git/kernel/sched.c
@@ -1670,17 +1670,17 @@ find_idlest_group(struct sched_domain *s
  * find_idlest_cpu - find the idlest cpu among the cpus in group.
  */
 static int
-find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
+find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu,
+		cpumask_t *tmp)
 {
-	cpumask_t tmp;
 	unsigned long load, min_load = ULONG_MAX;
 	int idlest = -1;
 	int i;
 
 	/* Traverse only the allowed CPUs */
-	cpus_and(tmp, group->cpumask, p->cpus_allowed);
+	cpus_and(*tmp, group->cpumask, p->cpus_allowed);
 
-	for_each_cpu_mask(i, tmp) {
+	for_each_cpu_mask(i, *tmp) {
 		load = weighted_cpuload(i);
 
 		if (load < min_load || (load == min_load && i == this_cpu)) {
@@ -1719,7 +1719,7 @@ static int sched_balance_self(int cpu, i
 	}
 
 	while (sd) {
-		cpumask_t span;
+		cpumask_t span, tmpmask;
 		struct sched_group *group;
 		int new_cpu, weight;
 
@@ -1735,7 +1735,7 @@ static int sched_balance_self(int cpu, i
 			continue;
 		}
 
-		new_cpu = find_idlest_cpu(group, t, cpu);
+		new_cpu = find_idlest_cpu(group, t, cpu, &tmpmask);
 		if (new_cpu == -1 || new_cpu == cpu) {
 			/* Now try balancing at a lower domain level of cpu */
 			sd = sd->child;
@@ -2616,7 +2616,7 @@ static int move_one_task(struct rq *this
 static struct sched_group *
 find_busiest_group(struct sched_domain *sd, int this_cpu,
 		   unsigned long *imbalance, enum cpu_idle_type idle,
-		   int *sd_idle, cpumask_t *cpus, int *balance)
+		   int *sd_idle, const cpumask_t *cpus, int *balance)
 {
 	struct sched_group *busiest = NULL, *this = NULL, *group = sd->groups;
 	unsigned long max_load, avg_load, total_load, this_load, total_pwr;
@@ -2917,7 +2917,7 @@ ret:
  */
 static struct rq *
 find_busiest_queue(struct sched_group *group, enum cpu_idle_type idle,
-		   unsigned long imbalance, cpumask_t *cpus)
+		   unsigned long imbalance, const cpumask_t *cpus)
 {
 	struct rq *busiest = NULL, *rq;
 	unsigned long max_load = 0;
@@ -2956,15 +2956,16 @@ find_busiest_queue(struct sched_group *g
  */
 static int load_balance(int this_cpu, struct rq *this_rq,
 			struct sched_domain *sd, enum cpu_idle_type idle,
-			int *balance)
+			int *balance, cpumask_t *cpus)
 {
 	int ld_moved, all_pinned = 0, active_balance = 0, sd_idle = 0;
 	struct sched_group *group;
 	unsigned long imbalance;
 	struct rq *busiest;
-	cpumask_t cpus = CPU_MASK_ALL;
 	unsigned long flags;
 
+	cpus_setall(*cpus);
+
 	/*
 	 * When power savings policy is enabled for the parent domain, idle
 	 * sibling can pick up load irrespective of busy siblings. In this case,
@@ -2979,7 +2980,7 @@ static int load_balance(int this_cpu, st
 
 redo:
 	group = find_busiest_group(sd, this_cpu, &imbalance, idle, &sd_idle,
-				   &cpus, balance);
+				   cpus, balance);
 
 	if (*balance == 0)
 		goto out_balanced;
@@ -2989,7 +2990,7 @@ redo:
 		goto out_balanced;
 	}
 
-	busiest = find_busiest_queue(group, idle, imbalance, &cpus);
+	busiest = find_busiest_queue(group, idle, imbalance, cpus);
 	if (!busiest) {
 		schedstat_inc(sd, lb_nobusyq[idle]);
 		goto out_balanced;
@@ -3022,8 +3023,8 @@ redo:
 
 		/* All tasks on this runqueue were pinned by CPU affinity */
 		if (unlikely(all_pinned)) {
-			cpu_clear(cpu_of(busiest), cpus);
-			if (!cpus_empty(cpus))
+			cpu_clear(cpu_of(busiest), *cpus);
+			if (!cpus_empty(*cpus))
 				goto redo;
 			goto out_balanced;
 		}
@@ -3108,7 +3109,8 @@ out_one_pinned:
  * this_rq is locked.
  */
 static int
-load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd)
+load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd,
+			cpumask_t *cpus)
 {
 	struct sched_group *group;
 	struct rq *busiest = NULL;
@@ -3116,7 +3118,8 @@ load_balance_newidle(int this_cpu, struc
 	int ld_moved = 0;
 	int sd_idle = 0;
 	int all_pinned = 0;
-	cpumask_t cpus = CPU_MASK_ALL;
+
+	cpus_setall(*cpus);
 
 	/*
 	 * When power savings policy is enabled for the parent domain, idle
@@ -3131,14 +3134,13 @@ load_balance_newidle(int this_cpu, struc
 	schedstat_inc(sd, lb_count[CPU_NEWLY_IDLE]);
 redo:
 	group = find_busiest_group(sd, this_cpu, &imbalance, CPU_NEWLY_IDLE,
-				   &sd_idle, &cpus, NULL);
+				   &sd_idle, cpus, NULL);
 	if (!group) {
 		schedstat_inc(sd, lb_nobusyg[CPU_NEWLY_IDLE]);
 		goto out_balanced;
 	}
 
-	busiest = find_busiest_queue(group, CPU_NEWLY_IDLE, imbalance,
-				&cpus);
+	busiest = find_busiest_queue(group, CPU_NEWLY_IDLE, imbalance, cpus);
 	if (!busiest) {
 		schedstat_inc(sd, lb_nobusyq[CPU_NEWLY_IDLE]);
 		goto out_balanced;
@@ -3160,8 +3162,8 @@ redo:
 		spin_unlock(&busiest->lock);
 
 		if (unlikely(all_pinned)) {
-			cpu_clear(cpu_of(busiest), cpus);
-			if (!cpus_empty(cpus))
+			cpu_clear(cpu_of(busiest), *cpus);
+			if (!cpus_empty(*cpus))
 				goto redo;
 		}
 	}
@@ -3195,6 +3197,7 @@ static void idle_balance(int this_cpu, s
 	struct sched_domain *sd;
 	int pulled_task = -1;
 	unsigned long next_balance = jiffies + HZ;
+	cpumask_t tmpmask;
 
 	for_each_domain(this_cpu, sd) {
 		unsigned long interval;
@@ -3204,8 +3207,8 @@ static void idle_balance(int this_cpu, s
 
 		if (sd->flags & SD_BALANCE_NEWIDLE)
 			/* If we've pulled tasks over stop searching: */
-			pulled_task = load_balance_newidle(this_cpu,
-								this_rq, sd);
+			pulled_task = load_balance_newidle(this_cpu, this_rq,
+							   sd, &tmpmask);
 
 		interval = msecs_to_jiffies(sd->balance_interval);
 		if (time_after(next_balance, sd->last_balance + interval))
@@ -3364,6 +3367,7 @@ static void rebalance_domains(int cpu, e
 	/* Earliest time when we have to do rebalance again */
 	unsigned long next_balance = jiffies + 60*HZ;
 	int update_next_balance = 0;
+	cpumask_t tmp;
 
 	for_each_domain(cpu, sd) {
 		if (!(sd->flags & SD_LOAD_BALANCE))
@@ -3387,7 +3391,7 @@ static void rebalance_domains(int cpu, e
 		}
 
 		if (time_after_eq(jiffies, sd->last_balance + interval)) {
-			if (load_balance(cpu, rq, sd, idle, &balance)) {
+			if (load_balance(cpu, rq, sd, idle, &balance, &tmp)) {
 				/*
 				 * We've pulled tasks over so either we're no
 				 * longer idle, or one of our SMT siblings is
@@ -5912,21 +5916,10 @@ void __init migration_init(void)
 
 #ifdef CONFIG_SCHED_DEBUG
 
-static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level)
+static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
+				  cpumask_t *groupmask, char *str, int len)
 {
 	struct sched_group *group = sd->groups;
-	cpumask_t groupmask;
-	int len = cpumask_scnprintf_len(nr_cpu_ids);
-	char *str = kmalloc(len, GFP_KERNEL);
-	int ret = 0;
-
-	if (!str) {
-		printk(KERN_DEBUG "Cannot load-balance (no memory)\n");
-		return -1;
-	}
-
-	cpumask_scnprintf(str, len, sd->span);
-	cpus_clear(groupmask);
 
 	printk(KERN_DEBUG "%*s domain %d: ", level, "", level);
 
@@ -5935,10 +5928,12 @@ static int sched_domain_debug_one(struct
 		if (sd->parent)
 			printk(KERN_ERR "ERROR: !SD_LOAD_BALANCE domain"
 					" has parent");
-		kfree(str);
 		return -1;
 	}
 
+	cpumask_scnprintf(str, len, sd->span);
+	cpus_clear(*groupmask);
+
 	printk(KERN_CONT "span %s\n", str);
 
 	if (!cpu_isset(cpu, sd->span)) {
@@ -5971,13 +5966,13 @@ static int sched_domain_debug_one(struct
 			break;
 		}
 
-		if (cpus_intersects(groupmask, group->cpumask)) {
+		if (cpus_intersects(*groupmask, group->cpumask)) {
 			printk(KERN_CONT "\n");
 			printk(KERN_ERR "ERROR: repeated CPUs\n");
 			break;
 		}
 
-		cpus_or(groupmask, groupmask, group->cpumask);
+		cpus_or(*groupmask, *groupmask, group->cpumask);
 
 		cpumask_scnprintf(str, len, group->cpumask);
 		printk(KERN_CONT " %s", str);
@@ -5986,36 +5981,49 @@ static int sched_domain_debug_one(struct
 	} while (group != sd->groups);
 	printk(KERN_CONT "\n");
 
-	if (!cpus_equal(sd->span, groupmask))
+	if (!cpus_equal(sd->span, *groupmask))
 		printk(KERN_ERR "ERROR: groups don't span domain->span\n");
 
-	if (sd->parent && !cpus_subset(groupmask, sd->parent->span))
+	if (sd->parent && !cpus_subset(*groupmask, sd->parent->span))
 		printk(KERN_ERR "ERROR: parent span is not a superset "
 			"of domain->span\n");
 
-	kfree(str);
 	return 0;
 }
 
 static void sched_domain_debug(struct sched_domain *sd, int cpu)
 {
 	int level = 0;
+	char *str = NULL;
+	cpumask_t *groupmask = NULL;
+	int len;
 
 	if (!sd) {
 		printk(KERN_DEBUG "CPU%d attaching NULL sched-domain.\n", cpu);
 		return;
 	}
 
+	groupmask = kmalloc(sizeof(cpumask_t), GFP_KERNEL);
+	len = cpumask_scnprintf_len(nr_cpu_ids);
+	str = kmalloc(len, GFP_KERNEL);
+	if (!groupmask || !str) {
+		printk(KERN_DEBUG "Cannot load-balance (out of memory)\n");
+		goto exit;
+	}
+
 	printk(KERN_DEBUG "CPU%d attaching sched-domain:\n", cpu);
 
 	for (;;) {
-		if (sched_domain_debug_one(sd, cpu, level))
+		if (sched_domain_debug_one(sd, cpu, level, groupmask, str, len))
 			break;
 		level++;
 		sd = sd->parent;
 		if (!sd)
 			break;
 	}
+exit:
+	kfree(str);
+	kfree(groupmask);
 }
 #else
 # define sched_domain_debug(sd, cpu) do { } while (0)
@@ -6203,30 +6211,33 @@ __setup("isolcpus=", isolated_cpu_setup)
  * and ->cpu_power to 0.
  */
 static void
-init_sched_build_groups(cpumask_t span, const cpumask_t *cpu_map,
+init_sched_build_groups(const cpumask_t *span, const cpumask_t *cpu_map,
 			int (*group_fn)(int cpu, const cpumask_t *cpu_map,
-					struct sched_group **sg))
+					struct sched_group **sg,
+					cpumask_t *tmpmask),
+			cpumask_t *covered, cpumask_t *tmpmask)
 {
 	struct sched_group *first = NULL, *last = NULL;
-	cpumask_t covered = CPU_MASK_NONE;
 	int i;
 
-	for_each_cpu_mask(i, span) {
+	*covered = CPU_MASK_NONE;
+
+	for_each_cpu_mask(i, *span) {
 		struct sched_group *sg;
-		int group = group_fn(i, cpu_map, &sg);
+		int group = group_fn(i, cpu_map, &sg, tmpmask);
 		int j;
 
-		if (cpu_isset(i, covered))
+		if (cpu_isset(i, *covered))
 			continue;
 
 		sg->cpumask = CPU_MASK_NONE;
 		sg->__cpu_power = 0;
 
-		for_each_cpu_mask(j, span) {
-			if (group_fn(j, cpu_map, NULL) != group)
+		for_each_cpu_mask(j, *span) {
+			if (group_fn(j, cpu_map, NULL, tmpmask) != group)
 				continue;
 
-			cpu_set(j, covered);
+			cpu_set(j, *covered);
 			cpu_set(j, sg->cpumask);
 		}
 		if (!first)
@@ -6324,7 +6335,8 @@ static DEFINE_PER_CPU(struct sched_domai
 static DEFINE_PER_CPU(struct sched_group, sched_group_cpus);
 
 static int
-cpu_to_cpu_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_cpu_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+		 cpumask_t *unused)
 {
 	if (sg)
 		*sg = &per_cpu(sched_group_cpus, cpu);
@@ -6342,19 +6354,22 @@ static DEFINE_PER_CPU(struct sched_group
 
 #if defined(CONFIG_SCHED_MC) && defined(CONFIG_SCHED_SMT)
 static int
-cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+		  cpumask_t *mask)
 {
 	int group;
-	cpumask_t mask = per_cpu(cpu_sibling_map, cpu);
-	cpus_and(mask, mask, *cpu_map);
-	group = first_cpu(mask);
+
+	*mask = per_cpu(cpu_sibling_map, cpu);
+	cpus_and(*mask, *mask, *cpu_map);
+	group = first_cpu(*mask);
 	if (sg)
 		*sg = &per_cpu(sched_group_core, group);
 	return group;
 }
 #elif defined(CONFIG_SCHED_MC)
 static int
-cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+		  cpumask_t *unused)
 {
 	if (sg)
 		*sg = &per_cpu(sched_group_core, cpu);
@@ -6366,17 +6381,18 @@ static DEFINE_PER_CPU(struct sched_domai
 static DEFINE_PER_CPU(struct sched_group, sched_group_phys);
 
 static int
-cpu_to_phys_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_phys_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+		  cpumask_t *mask)
 {
 	int group;
 #ifdef CONFIG_SCHED_MC
-	cpumask_t mask = cpu_coregroup_map(cpu);
-	cpus_and(mask, mask, *cpu_map);
-	group = first_cpu(mask);
+	*mask = cpu_coregroup_map(cpu);
+	cpus_and(*mask, *mask, *cpu_map);
+	group = first_cpu(*mask);
 #elif defined(CONFIG_SCHED_SMT)
-	cpumask_t mask = per_cpu(cpu_sibling_map, cpu);
-	cpus_and(mask, mask, *cpu_map);
-	group = first_cpu(mask);
+	*mask = per_cpu(cpu_sibling_map, cpu);
+	cpus_and(*mask, *mask, *cpu_map);
+	group = first_cpu(*mask);
 #else
 	group = cpu;
 #endif
@@ -6398,13 +6414,13 @@ static DEFINE_PER_CPU(struct sched_domai
 static DEFINE_PER_CPU(struct sched_group, sched_group_allnodes);
 
 static int cpu_to_allnodes_group(int cpu, const cpumask_t *cpu_map,
-				 struct sched_group **sg)
+				 struct sched_group **sg, cpumask_t *nodemask)
 {
-	cpumask_t nodemask = node_to_cpumask(cpu_to_node(cpu));
 	int group;
 
-	cpus_and(nodemask, nodemask, *cpu_map);
-	group = first_cpu(nodemask);
+	*nodemask = node_to_cpumask(cpu_to_node(cpu));
+	cpus_and(*nodemask, *nodemask, *cpu_map);
+	group = first_cpu(*nodemask);
 
 	if (sg)
 		*sg = &per_cpu(sched_group_allnodes, group);
@@ -6440,7 +6456,7 @@ static void init_numa_sched_groups_power
 
 #ifdef CONFIG_NUMA
 /* Free memory allocated for various sched_group structures */
-static void free_sched_groups(const cpumask_t *cpu_map)
+static void free_sched_groups(const cpumask_t *cpu_map, cpumask_t *nodemask)
 {
 	int cpu, i;
 
@@ -6452,11 +6468,11 @@ static void free_sched_groups(const cpum
 			continue;
 
 		for (i = 0; i < MAX_NUMNODES; i++) {
-			cpumask_t nodemask = node_to_cpumask(i);
 			struct sched_group *oldsg, *sg = sched_group_nodes[i];
 
-			cpus_and(nodemask, nodemask, *cpu_map);
-			if (cpus_empty(nodemask))
+			*nodemask = node_to_cpumask(i);
+			cpus_and(*nodemask, *nodemask, *cpu_map);
+			if (cpus_empty(*nodemask))
 				continue;
 
 			if (sg == NULL)
@@ -6474,7 +6490,7 @@ next_sg:
 	}
 }
 #else
-static void free_sched_groups(const cpumask_t *cpu_map)
+static void free_sched_groups(const cpumask_t *cpu_map, cpumask_t *nodemask)
 {
 }
 #endif
@@ -6564,6 +6580,7 @@ static int build_sched_domains(const cpu
 {
 	int i;
 	struct root_domain *rd;
+	cpumask_t tmpmask;
 #ifdef CONFIG_NUMA
 	struct sched_group **sched_group_nodes = NULL;
 	int sd_allnodes = 0;
@@ -6601,7 +6618,8 @@ static int build_sched_domains(const cpu
 			sd = &per_cpu(allnodes_domains, i);
 			SD_INIT(sd, ALLNODES);
 			sd->span = *cpu_map;
-			cpu_to_allnodes_group(i, cpu_map, &sd->groups);
+			cpu_to_allnodes_group(i, cpu_map, &sd->groups,
+								      &tmpmask);
 			p = sd;
 			sd_allnodes = 1;
 		} else
@@ -6623,7 +6641,7 @@ static int build_sched_domains(const cpu
 		sd->parent = p;
 		if (p)
 			p->child = sd;
-		cpu_to_phys_group(i, cpu_map, &sd->groups);
+		cpu_to_phys_group(i, cpu_map, &sd->groups, &tmpmask);
 
 #ifdef CONFIG_SCHED_MC
 		p = sd;
@@ -6633,7 +6651,7 @@ static int build_sched_domains(const cpu
 		cpus_and(sd->span, sd->span, *cpu_map);
 		sd->parent = p;
 		p->child = sd;
-		cpu_to_core_group(i, cpu_map, &sd->groups);
+		cpu_to_core_group(i, cpu_map, &sd->groups, &tmpmask);
 #endif
 
 #ifdef CONFIG_SCHED_SMT
@@ -6644,7 +6662,7 @@ static int build_sched_domains(const cpu
 		cpus_and(sd->span, sd->span, *cpu_map);
 		sd->parent = p;
 		p->child = sd;
-		cpu_to_cpu_group(i, cpu_map, &sd->groups);
+		cpu_to_cpu_group(i, cpu_map, &sd->groups, &tmpmask);
 #endif
 	}
 
@@ -6652,12 +6670,15 @@ static int build_sched_domains(const cpu
 	/* Set up CPU (sibling) groups */
 	for_each_cpu_mask(i, *cpu_map) {
 		cpumask_t this_sibling_map = per_cpu(cpu_sibling_map, i);
+		cpumask_t send_covered;
+
 		cpus_and(this_sibling_map, this_sibling_map, *cpu_map);
 		if (i != first_cpu(this_sibling_map))
 			continue;
 
-		init_sched_build_groups(this_sibling_map, cpu_map,
-					&cpu_to_cpu_group);
+		init_sched_build_groups(&this_sibling_map, cpu_map,
+					&cpu_to_cpu_group,
+					&send_covered, &tmpmask);
 	}
 #endif
 
@@ -6665,30 +6686,40 @@ static int build_sched_domains(const cpu
 	/* Set up multi-core groups */
 	for_each_cpu_mask(i, *cpu_map) {
 		cpumask_t this_core_map = cpu_coregroup_map(i);
+		cpumask_t send_covered;
+
 		cpus_and(this_core_map, this_core_map, *cpu_map);
 		if (i != first_cpu(this_core_map))
 			continue;
-		init_sched_build_groups(this_core_map, cpu_map,
-					&cpu_to_core_group);
+		init_sched_build_groups(&this_core_map, cpu_map,
+					&cpu_to_core_group,
+					&send_covered, &tmpmask);
 	}
 #endif
 
 	/* Set up physical groups */
 	for (i = 0; i < MAX_NUMNODES; i++) {
 		cpumask_t nodemask = node_to_cpumask(i);
+		cpumask_t send_covered;
 
 		cpus_and(nodemask, nodemask, *cpu_map);
 		if (cpus_empty(nodemask))
 			continue;
 
-		init_sched_build_groups(nodemask, cpu_map, &cpu_to_phys_group);
+		init_sched_build_groups(&nodemask, cpu_map,
+					&cpu_to_phys_group,
+					&send_covered, &tmpmask);
 	}
 
 #ifdef CONFIG_NUMA
 	/* Set up node groups */
-	if (sd_allnodes)
-		init_sched_build_groups(*cpu_map, cpu_map,
-					&cpu_to_allnodes_group);
+	if (sd_allnodes) {
+		cpumask_t send_covered;
+
+		init_sched_build_groups(cpu_map, cpu_map,
+					&cpu_to_allnodes_group,
+					&send_covered, &tmpmask);
+	}
 
 	for (i = 0; i < MAX_NUMNODES; i++) {
 		/* Set up node groups */
@@ -6787,7 +6818,8 @@ static int build_sched_domains(const cpu
 	if (sd_allnodes) {
 		struct sched_group *sg;
 
-		cpu_to_allnodes_group(first_cpu(*cpu_map), cpu_map, &sg);
+		cpu_to_allnodes_group(first_cpu(*cpu_map), cpu_map, &sg,
+								&tmpmask);
 		init_numa_sched_groups_power(sg);
 	}
 #endif
@@ -6809,7 +6841,7 @@ static int build_sched_domains(const cpu
 
 #ifdef CONFIG_NUMA
 error:
-	free_sched_groups(cpu_map);
+	free_sched_groups(cpu_map, &tmpmask);
 	return -ENOMEM;
 #endif
 }
@@ -6849,9 +6881,10 @@ static int arch_init_sched_domains(const
 	return err;
 }
 
-static void arch_destroy_sched_domains(const cpumask_t *cpu_map)
+static void arch_destroy_sched_domains(const cpumask_t *cpu_map,
+				       cpumask_t *tmpmask)
 {
-	free_sched_groups(cpu_map);
+	free_sched_groups(cpu_map, tmpmask);
 }
 
 /*
@@ -6860,6 +6893,7 @@ static void arch_destroy_sched_domains(c
  */
 static void detach_destroy_domains(const cpumask_t *cpu_map)
 {
+	cpumask_t tmpmask;
 	int i;
 
 	unregister_sched_domain_sysctl();
@@ -6867,7 +6901,7 @@ static void detach_destroy_domains(const
 	for_each_cpu_mask(i, *cpu_map)
 		cpu_attach_domain(NULL, &def_root_domain, i);
 	synchronize_sched();
-	arch_destroy_sched_domains(cpu_map);
+	arch_destroy_sched_domains(cpu_map, &tmpmask);
 }
 
 /*

-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 09/12] sched: fix memory leak in build_sched_domains
  2008-03-26  1:38 ` Mike Travis
@ 2008-03-26  1:38   ` Mike Travis
  -1 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, linux-mm, linux-kernel

[-- Attachment #1: build_sched_domain_leak_fix --]
[-- Type: text/plain, Size: 1219 bytes --]

I'm not 100% sure this is needed, but I can't find where the memory
allocated for sched_group_nodes is released if the kmalloc in
alloc_rootdomain() fails.  Also, sched_group_nodes_bycpu[] is set
before that allocation, so in the kmalloc failure case it is left
pointing at an array that is never completely filled in.
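
As a rough sketch (names as in kernel/sched.c, CONFIG_NUMA ifdefs
omitted; this only restates the intended flow of the hunk below, it is
not a standalone fix), the error handling ends up looking like:

	sched_group_nodes = kcalloc(MAX_NUMNODES,
				    sizeof(struct sched_group *), GFP_KERNEL);
	if (!sched_group_nodes)
		return -ENOMEM;

	rd = alloc_rootdomain();
	if (!rd) {
		kfree(sched_group_nodes);	/* don't leak the node list */
		return -ENOMEM;
	}

	/* publish the pointer only after all allocations have succeeded */
	sched_group_nodes_bycpu[first_cpu(*cpu_map)] = sched_group_nodes;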

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

Cc: Ingo Molnar <mingo@elte.hu>

Signed-off-by: Mike Travis <travis@sgi.com>
---
 kernel/sched.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- linux.trees.git.orig/kernel/sched.c
+++ linux.trees.git/kernel/sched.c
@@ -6594,15 +6594,21 @@ static int build_sched_domains(const cpu
 		printk(KERN_WARNING "Can not alloc sched group node list\n");
 		return -ENOMEM;
 	}
-	sched_group_nodes_bycpu[first_cpu(*cpu_map)] = sched_group_nodes;
 #endif
 
 	rd = alloc_rootdomain();
 	if (!rd) {
 		printk(KERN_WARNING "Cannot alloc root domain\n");
+#ifdef CONFIG_NUMA
+		kfree(sched_group_nodes);
+#endif
 		return -ENOMEM;
 	}
 
+#ifdef CONFIG_NUMA
+	sched_group_nodes_bycpu[first_cpu(*cpu_map)] = sched_group_nodes;
+#endif
+
 	/*
 	 * Set up domains for cpus specified by the cpu_map.
 	 */

-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 10/12] cpumask: reduce stack usage in build_sched_domains
  2008-03-26  1:38 ` Mike Travis
@ 2008-03-26  1:38   ` Mike Travis
  -1 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, linux-mm, linux-kernel

[-- Attachment #1: build_sched_domains --]
[-- Type: text/plain, Size: 9927 bytes --]

Reduce the amount of stack used in build_sched_domains by allocating
all of the scratch cpumasks in one structure and setting up individual
pointers into it.  The structure is kmalloc'd when NR_CPUS > 128;
otherwise it is small enough to live on the stack.
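
Condensed, the scheme looks like this (a sketch only, using the
SCHED_CPUMASK_* helpers and struct allmasks introduced below):

	SCHED_CPUMASK_DECLARE(allmasks);	/* ptr if NR_CPUS > 128, else on stack */
	cpumask_t *tmpmask;
	int i;

#if SCHED_CPUMASK_ALLOC
	allmasks = kmalloc(sizeof(*allmasks), GFP_KERNEL);	/* one allocation */
	if (!allmasks)
		return -ENOMEM;
#endif
	tmpmask = (cpumask_t *)allmasks;	/* tmpmask is the first member */

	for_each_cpu_mask(i, *cpu_map) {
		SCHED_CPUMASK_VAR(nodemask, allmasks);	/* cpumask_t * into allmasks */

		*nodemask = node_to_cpumask(cpu_to_node(i));
		cpus_and(*nodemask, *nodemask, *cpu_map);
	}

	SCHED_CPUMASK_FREE((void *)allmasks);	/* no-op in the on-stack case */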

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

Cc: Ingo Molnar <mingo@elte.hu>

Signed-off-by: Mike Travis <travis@sgi.com>
---
One checkpatch warning that I'm not sure how to remove:

ERROR: Macros with complex values should be enclosed in parenthesis
#61: FILE: kernel/sched.c:6656:
+#define        SCHED_CPU_VAR(v, a)     cpumask_t *v = (cpumask_t *) \
			((unsigned long)(a) + offsetof(struct allmasks, v))
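
For reference, each use of the macro expands to a pointer declaration;
SCHED_CPUMASK_VAR(nodemask, allmasks) (the name used in the patch below)
becomes roughly:

	cpumask_t *nodemask = (cpumask_t *)
		((unsigned long)(allmasks) + offsetof(struct allmasks, nodemask));

so the value is a declaration, not an expression, and cannot simply be
wrapped in parentheses as checkpatch suggests.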
---
 kernel/sched.c |  165 ++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 112 insertions(+), 53 deletions(-)

--- linux.trees.git.orig/kernel/sched.c
+++ linux.trees.git/kernel/sched.c
@@ -6573,6 +6573,40 @@ SD_INIT_FUNC(CPU)
 #endif
 
 /*
+ * To minimize stack usage kmalloc room for cpumasks and share the
+ * space as the usage in build_sched_domains() dictates.  Used only
+ * if the amount of space is significant.
+ */
+struct allmasks {
+	cpumask_t tmpmask;			/* make this one first */
+	union {
+		cpumask_t nodemask;
+		cpumask_t this_sibling_map;
+		cpumask_t this_core_map;
+	};
+	cpumask_t send_covered;
+
+#ifdef CONFIG_NUMA
+	cpumask_t domainspan;
+	cpumask_t covered;
+	cpumask_t notcovered;
+#endif
+};
+
+#if	NR_CPUS > 128
+#define	SCHED_CPUMASK_ALLOC		1
+#define	SCHED_CPUMASK_FREE(v)		kfree(v)
+#define	SCHED_CPUMASK_DECLARE(v)	struct allmasks *v
+#else
+#define	SCHED_CPUMASK_ALLOC		0
+#define	SCHED_CPUMASK_FREE(v)
+#define	SCHED_CPUMASK_DECLARE(v)	struct allmasks _v, *v = &_v
+#endif
+
+#define	SCHED_CPUMASK_VAR(v, a) 	cpumask_t *v = (cpumask_t *) \
+			((unsigned long)(a) + offsetof(struct allmasks, v))
+
+/*
  * Build sched domains for a given set of cpus and attach the sched domains
  * to the individual cpus
  */
@@ -6580,7 +6614,8 @@ static int build_sched_domains(const cpu
 {
 	int i;
 	struct root_domain *rd;
-	cpumask_t tmpmask;
+	SCHED_CPUMASK_DECLARE(allmasks);
+	cpumask_t *tmpmask;
 #ifdef CONFIG_NUMA
 	struct sched_group **sched_group_nodes = NULL;
 	int sd_allnodes = 0;
@@ -6605,6 +6640,21 @@ static int build_sched_domains(const cpu
 		return -ENOMEM;
 	}
 
+#if SCHED_CPUMASK_ALLOC
+	/* get space for all scratch cpumask variables */
+	allmasks = kmalloc(sizeof(*allmasks), GFP_KERNEL);
+	if (!allmasks) {
+		printk(KERN_WARNING "Cannot alloc cpumask array\n");
+		kfree(rd);
+#ifdef CONFIG_NUMA
+		kfree(sched_group_nodes);
+#endif
+		return -ENOMEM;
+	}
+#endif
+	tmpmask = (cpumask_t *)allmasks;
+
+
 #ifdef CONFIG_NUMA
 	sched_group_nodes_bycpu[first_cpu(*cpu_map)] = sched_group_nodes;
 #endif
@@ -6614,18 +6664,18 @@ static int build_sched_domains(const cpu
 	 */
 	for_each_cpu_mask(i, *cpu_map) {
 		struct sched_domain *sd = NULL, *p;
-		cpumask_t nodemask = node_to_cpumask(cpu_to_node(i));
+		SCHED_CPUMASK_VAR(nodemask, allmasks);
 
-		cpus_and(nodemask, nodemask, *cpu_map);
+		*nodemask = node_to_cpumask(cpu_to_node(i));
+		cpus_and(*nodemask, *nodemask, *cpu_map);
 
 #ifdef CONFIG_NUMA
 		if (cpus_weight(*cpu_map) >
-				SD_NODES_PER_DOMAIN*cpus_weight(nodemask)) {
+				SD_NODES_PER_DOMAIN*cpus_weight(*nodemask)) {
 			sd = &per_cpu(allnodes_domains, i);
 			SD_INIT(sd, ALLNODES);
 			sd->span = *cpu_map;
-			cpu_to_allnodes_group(i, cpu_map, &sd->groups,
-								      &tmpmask);
+			cpu_to_allnodes_group(i, cpu_map, &sd->groups, tmpmask);
 			p = sd;
 			sd_allnodes = 1;
 		} else
@@ -6643,11 +6693,11 @@ static int build_sched_domains(const cpu
 		p = sd;
 		sd = &per_cpu(phys_domains, i);
 		SD_INIT(sd, CPU);
-		sd->span = nodemask;
+		sd->span = *nodemask;
 		sd->parent = p;
 		if (p)
 			p->child = sd;
-		cpu_to_phys_group(i, cpu_map, &sd->groups, &tmpmask);
+		cpu_to_phys_group(i, cpu_map, &sd->groups, tmpmask);
 
 #ifdef CONFIG_SCHED_MC
 		p = sd;
@@ -6657,7 +6707,7 @@ static int build_sched_domains(const cpu
 		cpus_and(sd->span, sd->span, *cpu_map);
 		sd->parent = p;
 		p->child = sd;
-		cpu_to_core_group(i, cpu_map, &sd->groups, &tmpmask);
+		cpu_to_core_group(i, cpu_map, &sd->groups, tmpmask);
 #endif
 
 #ifdef CONFIG_SCHED_SMT
@@ -6668,81 +6718,88 @@ static int build_sched_domains(const cpu
 		cpus_and(sd->span, sd->span, *cpu_map);
 		sd->parent = p;
 		p->child = sd;
-		cpu_to_cpu_group(i, cpu_map, &sd->groups, &tmpmask);
+		cpu_to_cpu_group(i, cpu_map, &sd->groups, tmpmask);
 #endif
 	}
 
 #ifdef CONFIG_SCHED_SMT
 	/* Set up CPU (sibling) groups */
 	for_each_cpu_mask(i, *cpu_map) {
-		cpumask_t this_sibling_map = per_cpu(cpu_sibling_map, i);
-		cpumask_t send_covered;
+		SCHED_CPUMASK_VAR(this_sibling_map, allmasks);
+		SCHED_CPUMASK_VAR(send_covered, allmasks);
 
-		cpus_and(this_sibling_map, this_sibling_map, *cpu_map);
-		if (i != first_cpu(this_sibling_map))
+		*this_sibling_map = per_cpu(cpu_sibling_map, i);
+		cpus_and(*this_sibling_map, *this_sibling_map, *cpu_map);
+		if (i != first_cpu(*this_sibling_map))
 			continue;
 
-		init_sched_build_groups(&this_sibling_map, cpu_map,
-					&cpu_to_cpu_group,
-					&send_covered, &tmpmask);
+		init_sched_build_groups(this_sibling_map, cpu_map,
+					cpu_to_cpu_group,
+					send_covered, tmpmask);
 	}
 #endif
 
 #ifdef CONFIG_SCHED_MC
 	/* Set up multi-core groups */
 	for_each_cpu_mask(i, *cpu_map) {
-		cpumask_t this_core_map = cpu_coregroup_map(i);
-		cpumask_t send_covered;
+		SCHED_CPUMASK_VAR(this_core_map, allmasks);
+		SCHED_CPUMASK_VAR(send_covered, allmasks);
 
-		cpus_and(this_core_map, this_core_map, *cpu_map);
-		if (i != first_cpu(this_core_map))
+		*this_core_map = cpu_coregroup_map(i);
+		cpus_and(*this_core_map, *this_core_map, *cpu_map);
+		if (i != first_cpu(*this_core_map))
 			continue;
-		init_sched_build_groups(&this_core_map, cpu_map,
+
+		init_sched_build_groups(this_core_map, cpu_map,
 					&cpu_to_core_group,
-					&send_covered, &tmpmask);
+					send_covered, tmpmask);
 	}
 #endif
 
 	/* Set up physical groups */
 	for (i = 0; i < MAX_NUMNODES; i++) {
-		cpumask_t nodemask = node_to_cpumask(i);
-		cpumask_t send_covered;
+		SCHED_CPUMASK_VAR(nodemask, allmasks);
+		SCHED_CPUMASK_VAR(send_covered, allmasks);
 
-		cpus_and(nodemask, nodemask, *cpu_map);
-		if (cpus_empty(nodemask))
+		*nodemask = node_to_cpumask(i);
+		cpus_and(*nodemask, *nodemask, *cpu_map);
+		if (cpus_empty(*nodemask))
 			continue;
 
-		init_sched_build_groups(&nodemask, cpu_map,
+		init_sched_build_groups(nodemask, cpu_map,
 					&cpu_to_phys_group,
-					&send_covered, &tmpmask);
+					send_covered, tmpmask);
 	}
 
 #ifdef CONFIG_NUMA
 	/* Set up node groups */
 	if (sd_allnodes) {
-		cpumask_t send_covered;
+		SCHED_CPUMASK_VAR(send_covered, allmasks);
 
 		init_sched_build_groups(cpu_map, cpu_map,
 					&cpu_to_allnodes_group,
-					&send_covered, &tmpmask);
+					send_covered, tmpmask);
 	}
 
 	for (i = 0; i < MAX_NUMNODES; i++) {
 		/* Set up node groups */
 		struct sched_group *sg, *prev;
-		cpumask_t nodemask = node_to_cpumask(i);
-		cpumask_t domainspan;
-		cpumask_t covered = CPU_MASK_NONE;
+		SCHED_CPUMASK_VAR(nodemask, allmasks);
+		SCHED_CPUMASK_VAR(domainspan, allmasks);
+		SCHED_CPUMASK_VAR(covered, allmasks);
 		int j;
 
-		cpus_and(nodemask, nodemask, *cpu_map);
-		if (cpus_empty(nodemask)) {
+		*nodemask = node_to_cpumask(i);
+		*covered = CPU_MASK_NONE;
+
+		cpus_and(*nodemask, *nodemask, *cpu_map);
+		if (cpus_empty(*nodemask)) {
 			sched_group_nodes[i] = NULL;
 			continue;
 		}
 
-		domainspan = sched_domain_node_span(i);
-		cpus_and(domainspan, domainspan, *cpu_map);
+		*domainspan = sched_domain_node_span(i);
+		cpus_and(*domainspan, *domainspan, *cpu_map);
 
 		sg = kmalloc_node(sizeof(struct sched_group), GFP_KERNEL, i);
 		if (!sg) {
@@ -6751,31 +6808,31 @@ static int build_sched_domains(const cpu
 			goto error;
 		}
 		sched_group_nodes[i] = sg;
-		for_each_cpu_mask(j, nodemask) {
+		for_each_cpu_mask(j, *nodemask) {
 			struct sched_domain *sd;
 
 			sd = &per_cpu(node_domains, j);
 			sd->groups = sg;
 		}
 		sg->__cpu_power = 0;
-		sg->cpumask = nodemask;
+		sg->cpumask = *nodemask;
 		sg->next = sg;
-		cpus_or(covered, covered, nodemask);
+		cpus_or(*covered, *covered, *nodemask);
 		prev = sg;
 
 		for (j = 0; j < MAX_NUMNODES; j++) {
-			cpumask_t tmp, notcovered;
+			SCHED_CPUMASK_VAR(notcovered, allmasks);
 			int n = (i + j) % MAX_NUMNODES;
-			node_to_cpumask_ptr(nodemask, n);
 
-			cpus_complement(notcovered, covered);
-			cpus_and(tmp, notcovered, *cpu_map);
-			cpus_and(tmp, tmp, domainspan);
-			if (cpus_empty(tmp))
+			cpus_complement(*notcovered, *covered);
+			cpus_and(*tmpmask, *notcovered, *cpu_map);
+			cpus_and(*tmpmask, *tmpmask, *domainspan);
+			if (cpus_empty(*tmpmask))
 				break;
 
-			cpus_and(tmp, tmp, *nodemask);
-			if (cpus_empty(tmp))
+			*nodemask = node_to_cpumask(n);
+			cpus_and(*tmpmask, *tmpmask, *nodemask);
+			if (cpus_empty(*tmpmask))
 				continue;
 
 			sg = kmalloc_node(sizeof(struct sched_group),
@@ -6786,9 +6843,9 @@ static int build_sched_domains(const cpu
 				goto error;
 			}
 			sg->__cpu_power = 0;
-			sg->cpumask = tmp;
+			sg->cpumask = *tmpmask;
 			sg->next = prev->next;
-			cpus_or(covered, covered, tmp);
+			cpus_or(*covered, *covered, *tmpmask);
 			prev->next = sg;
 			prev = sg;
 		}
@@ -6825,7 +6882,7 @@ static int build_sched_domains(const cpu
 		struct sched_group *sg;
 
 		cpu_to_allnodes_group(first_cpu(*cpu_map), cpu_map, &sg,
-								&tmpmask);
+								tmpmask);
 		init_numa_sched_groups_power(sg);
 	}
 #endif
@@ -6843,11 +6900,13 @@ static int build_sched_domains(const cpu
 		cpu_attach_domain(sd, rd, i);
 	}
 
+	SCHED_CPUMASK_FREE((void *)allmasks);
 	return 0;
 
 #ifdef CONFIG_NUMA
 error:
-	free_sched_groups(cpu_map, &tmpmask);
+	free_sched_groups(cpu_map, tmpmask);
+	SCHED_CPUMASK_FREE((void *)allmasks);
 	return -ENOMEM;
 #endif
 }

-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 11/12] cpumask: reduce stack pressure in cpu_coregroup_map v2
  2008-03-26  1:38 ` Mike Travis
@ 2008-03-26  1:38   ` Mike Travis
  -1 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, linux-mm, linux-kernel, David S. Miller,
	William L. Irwin, Thomas Gleixner, H. Peter Anvin

[-- Attachment #1: cpu_coregroup_map --]
[-- Type: text/plain, Size: 3129 bytes --]

Return a pointer to the requested cpumask from the cpu_coregroup_map()
functions instead of returning a full cpumask_t by value.
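
At a call site this mirrors the kernel/sched.c hunks below: callers keep
only the pointer and dereference it where a copy into a non-stack
location is actually needed, e.g.:

	sd->span = *cpu_coregroup_map(i);
	cpus_and(sd->span, sd->span, *cpu_map);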

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

# sparc
Cc: David S. Miller <davem@davemloft.net>
Cc: William L. Irwin <wli@holomorphy.com>

# x86
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>

Signed-off-by: Mike Travis <travis@sgi.com>
---
v2: rebased on linux-2.6.git + linux-2.6-x86.git
---
 arch/x86/kernel/smpboot.c      |    6 +++---
 include/asm-sparc64/topology.h |    2 +-
 include/asm-x86/topology.h     |    2 +-
 kernel/sched.c                 |    6 +++---
 4 files changed, 8 insertions(+), 8 deletions(-)

--- linux.trees.git.orig/arch/x86/kernel/smpboot.c
+++ linux.trees.git/arch/x86/kernel/smpboot.c
@@ -538,7 +538,7 @@ void __cpuinit set_cpu_sibling_map(int c
 }
 
 /* maps the cpu to the sched domain representing multi-core */
-cpumask_t cpu_coregroup_map(int cpu)
+const cpumask_t *cpu_coregroup_map(int cpu)
 {
 	struct cpuinfo_x86 *c = &cpu_data(cpu);
 	/*
@@ -546,9 +546,9 @@ cpumask_t cpu_coregroup_map(int cpu)
 	 * And for power savings, we return cpu_core_map
 	 */
 	if (sched_mc_power_savings || sched_smt_power_savings)
-		return per_cpu(cpu_core_map, cpu);
+		return &per_cpu(cpu_core_map, cpu);
 	else
-		return c->llc_shared_map;
+		return &c->llc_shared_map;
 }
 
 /*
--- linux.trees.git.orig/include/asm-sparc64/topology.h
+++ linux.trees.git/include/asm-sparc64/topology.h
@@ -12,6 +12,6 @@
 
 #include <asm-generic/topology.h>
 
-#define cpu_coregroup_map(cpu)			(cpu_core_map[cpu])
+#define cpu_coregroup_map(cpu)			(&cpu_core_map[cpu])
 
 #endif /* _ASM_SPARC64_TOPOLOGY_H */
--- linux.trees.git.orig/include/asm-x86/topology.h
+++ linux.trees.git/include/asm-x86/topology.h
@@ -196,7 +196,7 @@ static inline void set_mp_bus_to_node(in
 
 #include <asm-generic/topology.h>
 
-extern cpumask_t cpu_coregroup_map(int cpu);
+const cpumask_t *cpu_coregroup_map(int cpu);
 
 #ifdef ENABLE_TOPO_DEFINES
 #define topology_physical_package_id(cpu)	(cpu_data(cpu).phys_proc_id)
--- linux.trees.git.orig/kernel/sched.c
+++ linux.trees.git/kernel/sched.c
@@ -6386,7 +6386,7 @@ cpu_to_phys_group(int cpu, const cpumask
 {
 	int group;
 #ifdef CONFIG_SCHED_MC
-	*mask = cpu_coregroup_map(cpu);
+	*mask = *cpu_coregroup_map(cpu);
 	cpus_and(*mask, *mask, *cpu_map);
 	group = first_cpu(*mask);
 #elif defined(CONFIG_SCHED_SMT)
@@ -6703,7 +6703,7 @@ static int build_sched_domains(const cpu
 		p = sd;
 		sd = &per_cpu(core_domains, i);
 		SD_INIT(sd, MC);
-		sd->span = cpu_coregroup_map(i);
+		sd->span = *cpu_coregroup_map(i);
 		cpus_and(sd->span, sd->span, *cpu_map);
 		sd->parent = p;
 		p->child = sd;
@@ -6745,7 +6745,7 @@ static int build_sched_domains(const cpu
 		SCHED_CPUMASK_VAR(this_core_map, allmasks);
 		SCHED_CPUMASK_VAR(send_covered, allmasks);
 
-		*this_core_map = cpu_coregroup_map(i);
+		*this_core_map = *cpu_coregroup_map(i);
 		cpus_and(*this_core_map, *this_core_map, *cpu_map);
 		if (i != first_cpu(*this_core_map))
 			continue;

-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 12/12] cpu/node mask: reduce stack usage using MASK_NONE, MASK_ALL
  2008-03-26  1:38 ` Mike Travis
@ 2008-03-26  1:38   ` Mike Travis
  -1 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26  1:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, linux-mm, linux-kernel, Thomas Gleixner, H. Peter Anvin

[-- Attachment #1: CPU_NODE_MASK --]
[-- Type: text/plain, Size: 9831 bytes --]

Replace usages of the CPU_MASK_NONE, CPU_MASK_ALL, NODE_MASK_NONE and
NODE_MASK_ALL initializers with the corresponding cpus_clear(),
cpus_setall(), nodes_clear() and nodes_setall() calls to reduce stack
requirements for large NR_CPUS and MAX_NUMNODES counts.  In some cases
the cpumask variable was initialized but then overwritten with another
value, so the initializer can simply be dropped.  This is the case for
changes like this:

-       cpumask_t oldmask = CPU_MASK_ALL;
+       cpumask_t oldmask;
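
Where the initial value really is needed, the aggregate initializer is
replaced by the corresponding helper call instead (a sketch; the actual
call sites are in the diff below):

	cpumask_t tmp_mask;
	nodemask_t mems;

	cpus_setall(tmp_mask);		/* was: cpumask_t tmp_mask = CPU_MASK_ALL; */
	nodes_clear(mems);		/* was: nodemask_t mems = NODE_MASK_NONE; */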


Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

# x86
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86/kernel/cpu/cpufreq/powernow-k8.c |    6 +++---
 arch/x86/kernel/cpu/mcheck/mce_amd_64.c   |    4 ++--
 arch/x86/kernel/genapic_flat_64.c         |    4 +++-
 arch/x86/kernel/io_apic_64.c              |    2 +-
 include/linux/cpumask.h                   |    6 ++++++
 init/main.c                               |    7 ++++++-
 kernel/cpu.c                              |    2 +-
 kernel/cpuset.c                           |   10 +++++-----
 kernel/irq/chip.c                         |    2 +-
 kernel/kmod.c                             |    2 +-
 kernel/kthread.c                          |    4 ++--
 kernel/rcutorture.c                       |    3 ++-
 kernel/sched.c                            |    8 ++++----
 mm/allocpercpu.c                          |    3 ++-
 14 files changed, 39 insertions(+), 24 deletions(-)

--- linux.trees.git.orig/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
+++ linux.trees.git/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
@@ -478,7 +478,7 @@ static int core_voltage_post_transition(
 
 static int check_supported_cpu(unsigned int cpu)
 {
-	cpumask_t oldmask = CPU_MASK_ALL;
+	cpumask_t oldmask;
 	u32 eax, ebx, ecx, edx;
 	unsigned int rc = 0;
 
@@ -1015,7 +1015,7 @@ static int transition_frequency_pstate(s
 /* Driver entry point to switch to the target frequency */
 static int powernowk8_target(struct cpufreq_policy *pol, unsigned targfreq, unsigned relation)
 {
-	cpumask_t oldmask = CPU_MASK_ALL;
+	cpumask_t oldmask;
 	struct powernow_k8_data *data = per_cpu(powernow_data, pol->cpu);
 	u32 checkfid;
 	u32 checkvid;
@@ -1104,7 +1104,7 @@ static int powernowk8_verify(struct cpuf
 static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
 {
 	struct powernow_k8_data *data;
-	cpumask_t oldmask = CPU_MASK_ALL;
+	cpumask_t oldmask;
 	int rc;
 
 	if (!cpu_online(pol->cpu))
--- linux.trees.git.orig/arch/x86/kernel/cpu/mcheck/mce_amd_64.c
+++ linux.trees.git/arch/x86/kernel/cpu/mcheck/mce_amd_64.c
@@ -255,7 +255,7 @@ static void affinity_set(unsigned int cp
 						cpumask_t *newmask)
 {
 	*oldmask = current->cpus_allowed;
-	*newmask = CPU_MASK_NONE;
+	cpus_clear(*newmask);
 	cpu_set(cpu, *newmask);
 	set_cpus_allowed(current, newmask);
 }
@@ -468,7 +468,7 @@ static __cpuinit int threshold_create_ba
 {
 	int i, err = 0;
 	struct threshold_bank *b = NULL;
-	cpumask_t oldmask = CPU_MASK_NONE, newmask;
+	cpumask_t oldmask, newmask;
 	char name[32];
 
 	sprintf(name, "threshold_bank%i", bank);
--- linux.trees.git.orig/arch/x86/kernel/genapic_flat_64.c
+++ linux.trees.git/arch/x86/kernel/genapic_flat_64.c
@@ -138,7 +138,9 @@ static cpumask_t physflat_target_cpus(vo
 
 static cpumask_t physflat_vector_allocation_domain(int cpu)
 {
-	cpumask_t domain = CPU_MASK_NONE;
+	cpumask_t domain;
+
+	cpus_clear(domain);
 	cpu_set(cpu, domain);
 	return domain;
 }
--- linux.trees.git.orig/arch/x86/kernel/io_apic_64.c
+++ linux.trees.git/arch/x86/kernel/io_apic_64.c
@@ -770,7 +770,7 @@ static void __clear_irq_vector(int irq)
 		per_cpu(vector_irq, cpu)[vector] = -1;
 
 	cfg->vector = 0;
-	cfg->domain = CPU_MASK_NONE;
+	cpus_clear(cfg->domain);
 }
 
 void __setup_vector_irq(int cpu)
--- linux.trees.git.orig/include/linux/cpumask.h
+++ linux.trees.git/include/linux/cpumask.h
@@ -244,6 +244,8 @@ int __next_cpu(int n, const cpumask_t *s
 }))
 static inline void setup_cpumask_of_cpu(int num) {}
 
+#define CPU_MASK_ALL_PTR	(&CPU_MASK_ALL)
+
 #else
 
 #define CPU_MASK_ALL							\
@@ -252,6 +254,10 @@ static inline void setup_cpumask_of_cpu(
 	[BITS_TO_LONGS(NR_CPUS)-1] = CPU_MASK_LAST_WORD			\
 } }
 
+/* cpu_mask_all is in init/main.c */
+extern cpumask_t cpu_mask_all;
+#define CPU_MASK_ALL_PTR	(&cpu_mask_all)
+
 /* cpumask_of_cpu_map is in init/main.c */
 #define cpumask_of_cpu(cpu)    (cpumask_of_cpu_map[cpu])
 extern cpumask_t *cpumask_of_cpu_map;
--- linux.trees.git.orig/init/main.c
+++ linux.trees.git/init/main.c
@@ -194,6 +194,11 @@ static const char *panic_later, *panic_p
 
 extern struct obs_kernel_param __setup_start[], __setup_end[];
 
+#if NR_CPUS > BITS_PER_LONG
+cpumask_t cpu_mask_all = CPU_MASK_ALL;
+EXPORT_SYMBOL(cpu_mask_all);
+#endif
+
 static int __init obsolete_checksetup(char *line)
 {
 	struct obs_kernel_param *p;
@@ -845,7 +850,7 @@ static int __init kernel_init(void * unu
 	/*
 	 * init can run on any cpu.
 	 */
-	set_cpus_allowed(current, &CPU_MASK_ALL);
+	set_cpus_allowed(current, CPU_MASK_ALL_PTR);
 	/*
 	 * Tell the world that we're going to be the grim
 	 * reaper of innocent orphaned children.
--- linux.trees.git.orig/kernel/cpu.c
+++ linux.trees.git/kernel/cpu.c
@@ -232,7 +232,7 @@ static int _cpu_down(unsigned int cpu, i
 
 	/* Ensure that we are not runnable on dying cpu */
 	old_allowed = current->cpus_allowed;
-	tmp = CPU_MASK_ALL;
+	cpus_setall(tmp);
 	cpu_clear(cpu, tmp);
 	set_cpus_allowed(current, &tmp);
 
--- linux.trees.git.orig/kernel/cpuset.c
+++ linux.trees.git/kernel/cpuset.c
@@ -1556,8 +1556,8 @@ static struct cgroup_subsys_state *cpuse
 	if (is_spread_slab(parent))
 		set_bit(CS_SPREAD_SLAB, &cs->flags);
 	set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
-	cs->cpus_allowed = CPU_MASK_NONE;
-	cs->mems_allowed = NODE_MASK_NONE;
+	cpus_clear(cs->cpus_allowed);
+	nodes_clear(cs->mems_allowed);
 	cs->mems_generation = cpuset_mems_generation++;
 	fmeter_init(&cs->fmeter);
 
@@ -1626,8 +1626,8 @@ int __init cpuset_init(void)
 {
 	int err = 0;
 
-	top_cpuset.cpus_allowed = CPU_MASK_ALL;
-	top_cpuset.mems_allowed = NODE_MASK_ALL;
+	cpus_setall(top_cpuset.cpus_allowed);
+	nodes_setall(top_cpuset.mems_allowed);
 
 	fmeter_init(&top_cpuset.fmeter);
 	top_cpuset.mems_generation = cpuset_mems_generation++;
@@ -1873,7 +1873,7 @@ void cpuset_cpus_allowed_locked(struct t
 
 void cpuset_init_current_mems_allowed(void)
 {
-	current->mems_allowed = NODE_MASK_ALL;
+	nodes_setall(current->mems_allowed);
 }
 
 /**
--- linux.trees.git.orig/kernel/irq/chip.c
+++ linux.trees.git/kernel/irq/chip.c
@@ -47,7 +47,7 @@ void dynamic_irq_init(unsigned int irq)
 	desc->irq_count = 0;
 	desc->irqs_unhandled = 0;
 #ifdef CONFIG_SMP
-	desc->affinity = CPU_MASK_ALL;
+	cpus_setall(desc->affinity);
 #endif
 	spin_unlock_irqrestore(&desc->lock, flags);
 }
--- linux.trees.git.orig/kernel/kmod.c
+++ linux.trees.git/kernel/kmod.c
@@ -165,7 +165,7 @@ static int ____call_usermodehelper(void 
 	}
 
 	/* We can run anywhere, unlike our parent keventd(). */
-	set_cpus_allowed(current, &CPU_MASK_ALL);
+	set_cpus_allowed(current, CPU_MASK_ALL_PTR);
 
 	/*
 	 * Our parent is keventd, which runs with elevated scheduling priority.
--- linux.trees.git.orig/kernel/kthread.c
+++ linux.trees.git/kernel/kthread.c
@@ -107,7 +107,7 @@ static void create_kthread(struct kthrea
 		 */
 		sched_setscheduler(create->result, SCHED_NORMAL, &param);
 		set_user_nice(create->result, KTHREAD_NICE_LEVEL);
-		set_cpus_allowed(create->result, &CPU_MASK_ALL);
+		set_cpus_allowed(create->result, CPU_MASK_ALL_PTR);
 	}
 	complete(&create->done);
 }
@@ -232,7 +232,7 @@ int kthreadd(void *unused)
 	set_task_comm(tsk, "kthreadd");
 	ignore_signals(tsk);
 	set_user_nice(tsk, KTHREAD_NICE_LEVEL);
-	set_cpus_allowed(tsk, &CPU_MASK_ALL);
+	set_cpus_allowed(tsk, CPU_MASK_ALL_PTR);
 
 	current->flags |= PF_NOFREEZE;
 
--- linux.trees.git.orig/kernel/rcutorture.c
+++ linux.trees.git/kernel/rcutorture.c
@@ -723,9 +723,10 @@ static int rcu_idle_cpu;	/* Force all to
  */
 static void rcu_torture_shuffle_tasks(void)
 {
-	cpumask_t tmp_mask = CPU_MASK_ALL;
+	cpumask_t tmp_mask;
 	int i;
 
+	cpus_setall(tmp_mask);
 	get_online_cpus();
 
 	/* No point in shuffling if there is only one online CPU (ex: UP) */
--- linux.trees.git.orig/kernel/sched.c
+++ linux.trees.git/kernel/sched.c
@@ -5502,7 +5502,7 @@ static void move_task_off_dead_cpu(int d
  */
 static void migrate_nr_uninterruptible(struct rq *rq_src)
 {
-	struct rq *rq_dest = cpu_rq(any_online_cpu(CPU_MASK_ALL));
+	struct rq *rq_dest = cpu_rq(any_online_cpu(*CPU_MASK_ALL_PTR));
 	unsigned long flags;
 
 	local_irq_save(flags);
@@ -6220,7 +6220,7 @@ init_sched_build_groups(const cpumask_t 
 	struct sched_group *first = NULL, *last = NULL;
 	int i;
 
-	*covered = CPU_MASK_NONE;
+	cpus_clear(*covered);
 
 	for_each_cpu_mask(i, *span) {
 		struct sched_group *sg;
@@ -6230,7 +6230,7 @@ init_sched_build_groups(const cpumask_t 
 		if (cpu_isset(i, *covered))
 			continue;
 
-		sg->cpumask = CPU_MASK_NONE;
+		cpus_clear(sg->cpumask);
 		sg->__cpu_power = 0;
 
 		for_each_cpu_mask(j, *span) {
@@ -6790,7 +6790,7 @@ static int build_sched_domains(const cpu
 		int j;
 
 		*nodemask = node_to_cpumask(i);
-		*covered = CPU_MASK_NONE;
+		cpus_clear(*covered);
 
 		cpus_and(*nodemask, *nodemask, *cpu_map);
 		if (cpus_empty(*nodemask)) {
--- linux.trees.git.orig/mm/allocpercpu.c
+++ linux.trees.git/mm/allocpercpu.c
@@ -82,9 +82,10 @@ EXPORT_SYMBOL_GPL(percpu_populate);
 int __percpu_populate_mask(void *__pdata, size_t size, gfp_t gfp,
 			   cpumask_t *mask)
 {
-	cpumask_t populated = CPU_MASK_NONE;
+	cpumask_t populated;
 	int cpu;
 
+	cpus_clear(populated);
 	for_each_cpu_mask(cpu, *mask)
 		if (unlikely(!percpu_populate(__pdata, size, gfp, cpu))) {
 			__percpu_depopulate_mask(__pdata, &populated);

-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 04/12] cpumask: pass cpumask by reference to acpi-cpufreq
  2008-03-26  1:38   ` Mike Travis
@ 2008-03-26  2:15     ` Dave Jones
  -1 siblings, 0 replies; 32+ messages in thread
From: Dave Jones @ 2008-03-26  2:15 UTC (permalink / raw)
  To: Mike Travis; +Cc: Andrew Morton, Ingo Molnar, linux-mm, linux-kernel, Len Brown

On Tue, Mar 25, 2008 at 06:38:15PM -0700, Mike Travis wrote:
 > Pass cpumask_t variables by reference in acpi-cpufreq functions.
 > 
 > Based on:
 > 	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
 > 	git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git
 > 
 > Cc: Len Brown <len.brown@intel.com>
 > Cc: Dave Jones <davej@codemonkey.org.uk>
 > Signed-off-by: Mike Travis <travis@sgi.com>

As this is dependent on non-cpufreq bits, I'm assuming this is going
via Ingo.  From a quick eyeball of this, and the change it's dependent on,
it looks ok to me.

	Dave

-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 00/12] cpumask: reduce stack pressure from local/passed cpumask variables v2
  2008-03-26  1:38 ` Mike Travis
@ 2008-03-26  6:18   ` Ingo Molnar
  -1 siblings, 0 replies; 32+ messages in thread
From: Ingo Molnar @ 2008-03-26  6:18 UTC (permalink / raw)
  To: Mike Travis; +Cc: Andrew Morton, linux-mm, linux-kernel


* Mike Travis <travis@sgi.com> wrote:

> Modify usage of cpumask_t variables to use pointers as much as 
> possible.

hm, why is there no minimal patch against -git that does nothing but
introduce the new pointer-based generic APIs (without using them) -
such as set_cpus_allowed_ptr()? Once that is upstream, all the
remaining changes can trickle in one arch and one subsystem at a time, and
once that's done, the old set_cpus_allowed() can be removed. This is far
more manageable than one large patch.

and the cpumask_of_cpu() change should be Kconfig-based initially - once
all arches have moved to it (or even sooner) we can remove that.

	Ingo
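
For reference, the coexistence described above could look roughly like the
sketch below.  set_cpus_allowed_ptr() is the name from the mail; the exact
prototype and return type here are assumptions, not the committed interface.

#include <linux/cpumask.h>
#include <linux/sched.h>

/* New primary interface: callers pass a pointer, so no cpumask is
 * copied across the call.  (Assumed prototype.) */
extern int set_cpus_allowed_ptr(struct task_struct *p,
				const cpumask_t *new_mask);

/* Old interface kept temporarily as a thin wrapper, so unconverted
 * callers and architectures keep building while the tree migrates;
 * it can be deleted once the last by-value user is converted. */
static inline int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
{
	return set_cpus_allowed_ptr(p, &new_mask);
}

Each subsystem can then convert on its own schedule, simply by switching to
the _ptr form and dropping its on-stack copy of the mask.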

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 00/12] cpumask: reduce stack pressure from local/passed cpumask variables v2
  2008-03-26  6:18   ` Ingo Molnar
@ 2008-03-26 15:53     ` Mike Travis
  -1 siblings, 0 replies; 32+ messages in thread
From: Mike Travis @ 2008-03-26 15:53 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andrew Morton, linux-mm, linux-kernel

Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
> 
>> Modify usage of cpumask_t variables to use pointers as much as 
>> possible.
> 
> hm, why is there no minimal patch against -git that does nothing but 
> introduces the new pointer based generic APIs (without using them) - 
> such as set_cpus_allowed_ptr(), etc.? Once that is upstream all the 
> remaining changes can trickle one arch and one subsystem at a time, and 
> once that's done, the old set_cpus_allowed() can be removed. This is far 
> more manageable than one large patch.
> 
> and the cpumask_of_cpu() change should be Kconfig based initially - once 
> all arches have moved to it (or even sooner) we can remove that.
> 
> 	Ingo

Yes, good idea!  I'll see about dividing them up, though 99% of the changes
seem to be in generic kernel code (kernel/sched.c is by far the biggest user).

There is one function pointer in a struct that would need an additional entry
if we keep both interfaces.

Thanks,
Mike
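
To illustrate that last point: if both interfaces coexist, any ops-style
structure that exposes a set_cpus_allowed callback needs a second slot (or a
changed signature) for the pointer form.  The structure and field names below
are hypothetical, made up for illustration only, since the thread does not
name the affected struct.

#include <linux/cpumask.h>
#include <linux/sched.h>

/* Hypothetical example; the real structure is not identified above. */
struct example_affinity_ops {
	/* existing by-value callback, kept while old callers remain */
	void (*set_cpus_allowed)(struct task_struct *p, cpumask_t new_mask);

	/* additional entry needed for the pointer-based interface */
	void (*set_cpus_allowed_ptr)(struct task_struct *p,
				     const cpumask_t *new_mask);
};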

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2008-03-26 15:53 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-26  1:38 [PATCH 00/12] cpumask: reduce stack pressure from local/passed cpumask variables v2 Mike Travis
2008-03-26  1:38 ` Mike Travis
2008-03-26  1:38 ` [PATCH 01/12] cpumask: Convert cpumask_of_cpu to allocated array v2 Mike Travis
2008-03-26  1:38   ` Mike Travis
2008-03-26  1:38 ` [PATCH 02/12] cpumask: pass pointer to cpumask for set_cpus_allowed() v2 Mike Travis
2008-03-26  1:38   ` Mike Travis
2008-03-26  1:38 ` [PATCH 03/12] cpumask: reduce stack pressure in sched_affinity Mike Travis
2008-03-26  1:38   ` Mike Travis
2008-03-26  1:38 ` [PATCH 04/12] cpumask: pass cpumask by reference to acpi-cpufreq Mike Travis
2008-03-26  1:38   ` Mike Travis
2008-03-26  2:15   ` Dave Jones
2008-03-26  2:15     ` Dave Jones
2008-03-26  1:38 ` [PATCH 05/12] init: move large array from stack to _initdata section Mike Travis
2008-03-26  1:38   ` Mike Travis
2008-03-26  1:38 ` [PATCH 06/12] cpumask: create pointer to node_to_cpumask array element v2 Mike Travis
2008-03-26  1:38   ` Mike Travis
2008-03-26  1:38 ` [PATCH 07/12] cpumask: reduce stack usage in SD_x_INIT initializers Mike Travis
2008-03-26  1:38   ` Mike Travis
2008-03-26  1:38 ` [PATCH 08/12] cpumask: pass temp cpumask variables in init_sched_build_groups Mike Travis
2008-03-26  1:38   ` Mike Travis
2008-03-26  1:38 ` [PATCH 09/12] sched: fix memory leak in build_sched_domains Mike Travis
2008-03-26  1:38   ` Mike Travis
2008-03-26  1:38 ` [PATCH 10/12] cpumask: reduce stack usage " Mike Travis
2008-03-26  1:38   ` Mike Travis
2008-03-26  1:38 ` [PATCH 11/12] cpumask: reduce stack pressure in cpu_coregroup_map v2 Mike Travis
2008-03-26  1:38   ` Mike Travis
2008-03-26  1:38 ` [PATCH 12/12] cpu/node mask: reduce stack usage using MASK_NONE, MASK_ALL Mike Travis
2008-03-26  1:38   ` Mike Travis
2008-03-26  6:18 ` [PATCH 00/12] cpumask: reduce stack pressure from local/passed cpumask variables v2 Ingo Molnar
2008-03-26  6:18   ` Ingo Molnar
2008-03-26 15:53   ` Mike Travis
2008-03-26 15:53     ` Mike Travis
