* Size of 2.6.20 task_struct on x86_64 machines
@ 2007-02-08 16:14 William Cohen
  2007-02-08 20:19 ` David Miller
  2007-02-11  0:20 ` Dave Jones
  0 siblings, 2 replies; 5+ messages in thread
From: William Cohen @ 2007-02-08 16:14 UTC
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 931 bytes --]

This past week I was playing around with that pahole tool
(http://oops.ghostprotocols.net:81/acme/dwarves/) and looking at the
size of various structs in the kernel. I was surprised by the size of
the task_struct on x86_64, approaching 4K.  I looked through the
fields in task_struct and found that a number of them were declared as
"unsigned long" rather than "unsigned int" despite them appearing okay
as 32-bit sized fields. On x86_64 "unsigned long" ends up being 8
bytes in size and forces 8 byte alignment. Is there a reason they are
"unsigned long"?

The patch below drops the size of the struct from 3808 bytes (60
64-byte cachelines) to 3760 bytes (59 64-byte cachelines). A couple of
other fields in the task struct take a significant amount of space:

struct thread_struct       thread;               688
struct held_lock           held_locks[30];       1680

CONFIG_LOCKDEP is turned on in the .config

-Will

[-- Attachment #2: task_struct_compress.diff --]
[-- Type: text/x-patch, Size: 1363 bytes --]

--- include/linux/sched.h.compress	2007-02-06 16:16:14.000000000 -0500
+++ include/linux/sched.h	2007-02-07 18:09:34.000000000 -0500
@@ -802,8 +802,8 @@
 	volatile long state;	/* -1 unrunnable, 0 runnable, >0 stopped */
 	struct thread_info *thread_info;
 	atomic_t usage;
-	unsigned long flags;	/* per process flags, defined below */
-	unsigned long ptrace;
+	unsigned int flags;	/* per process flags, defined below */
+	unsigned int ptrace;
 
 	int lock_depth;		/* BKL lock depth */
 
@@ -826,7 +826,7 @@
 	unsigned long long sched_time; /* sched_clock time spent running */
 	enum sleep_type sleep_type;
 
-	unsigned long policy;
+	unsigned int policy;
 	cpumask_t cpus_allowed;
 	unsigned int time_slice, first_time_slice;
 
@@ -846,11 +846,11 @@
 
 /* task state */
 	struct linux_binfmt *binfmt;
-	long exit_state;
+	int exit_state;
 	int exit_code, exit_signal;
 	int pdeath_signal;  /*  The signal sent when the parent dies  */
 	/* ??? */
-	unsigned long personality;
+	unsigned int personality;
 	unsigned did_exec:1;
 	pid_t pid;
 	pid_t tgid;
@@ -882,7 +882,7 @@
 	int __user *set_child_tid;		/* CLONE_CHILD_SETTID */
 	int __user *clear_child_tid;		/* CLONE_CHILD_CLEARTID */
 
-	unsigned long rt_priority;
+	unsigned int rt_priority;
 	cputime_t utime, stime;
 	unsigned long nvcsw, nivcsw; /* context switch counts */
 	struct timespec start_time;


* Re: Size of 2.6.20 task_struct on x86_64 machines
  2007-02-08 16:14 Size of 2.6.20 task_struct on x86_64 machines William Cohen
@ 2007-02-08 20:19 ` David Miller
  2007-02-08 21:03   ` Andrew Morton
  2007-02-11  0:20 ` Dave Jones
  1 sibling, 1 reply; 5+ messages in thread
From: David Miller @ 2007-02-08 20:19 UTC
  To: wcohen; +Cc: linux-kernel

From: William Cohen <wcohen@redhat.com>
Date: Thu, 08 Feb 2007 11:14:13 -0500

> This past week I was playing around with that pahole tool
> (http://oops.ghostprotocols.net:81/acme/dwarves/) and looking at the
> size of various structs in the kernel. I was surprised by the size of
> the task_struct on x86_64, approaching 4K.  I looked through the
> fields in task_struct and found that a number of them were declared as
> "unsigned long" rather than "unsigned int" despite them appearing okay
> as 32-bit sized fields. On x86_64 "unsigned long" ends up being 8
> bytes in size and forces 8 byte alignment. Is there a reason they are
> "unsigned long"?

I think at one point we used the atomic bit operations to operate on
things like tsk->flags, and those interfaces require unsigned long as
the type.

That doesn't appear to be the case any longer, so at a minimum
your tsk->flags conversion to unsigned int should be ok.


* Re: Size of 2.6.20 task_struct on x86_64 machines
  2007-02-08 20:19 ` David Miller
@ 2007-02-08 21:03   ` Andrew Morton
  0 siblings, 0 replies; 5+ messages in thread
From: Andrew Morton @ 2007-02-08 21:03 UTC
  To: David Miller; +Cc: wcohen, linux-kernel

On Thu, 08 Feb 2007 12:19:45 -0800 (PST)
David Miller <davem@davemloft.net> wrote:

> From: William Cohen <wcohen@redhat.com>
> Date: Thu, 08 Feb 2007 11:14:13 -0500
> 
> > This past week I was playing around with that pahole tool
> > (http://oops.ghostprotocols.net:81/acme/dwarves/) and looking at the
> > size of various structs in the kernel. I was surprised by the size of
> > the task_struct on x86_64, approaching 4K.  I looked through the
> > fields in task_struct and found that a number of them were declared as
> > "unsigned long" rather than "unsigned int" despite them appearing okay
> > as 32-bit sized fields. On x86_64 "unsigned long" ends up being 8
> > bytes in size and forces 8 byte alignment. Is there a reason they are
> > "unsigned long"?
> 
> I think at one point we used the atomic bit operations to operate on
> things like tsk->flags, and those interfaces require unsigned long as
> the type.
> 
> That doesn't appear to be the case any longer, so at a minimum
> your tsk->flags conversion to unsigned int should be ok.

Yeah, afaict everything in there is OK and happily all the
converted-to-32-bit quantities happen to be contiguous with other 32-bit
quantities.

Most architectures' bitops functions take unsigned long * so if anyone is
using bitops on these things we should get to hear about it.



* Re: Size of 2.6.20 task_struct on x86_64 machines
  2007-02-08 16:14 Size of 2.6.20 task_struct on x86_64 machines William Cohen
  2007-02-08 20:19 ` David Miller
@ 2007-02-11  0:20 ` Dave Jones
  2007-02-11  2:55   ` Linus Torvalds
  1 sibling, 1 reply; 5+ messages in thread
From: Dave Jones @ 2007-02-11  0:20 UTC
  To: William Cohen; +Cc: linux-kernel, Linus Torvalds, Andrew Morton

On Thu, Feb 08, 2007 at 11:14:13AM -0500, William Cohen wrote:
 > This past week I was playing around with that pahole tool
 > (http://oops.ghostprotocols.net:81/acme/dwarves/) and looking at the
 > size of various structs in the kernel. I was surprised by the size of
 > the task_struct on x86_64, approaching 4K.  I looked through the
 > fields in task_struct and found that a number of them were declared as
 > "unsigned long" rather than "unsigned int" despite them appearing okay
 > as 32-bit sized fields. On x86_64 "unsigned long" ends up being 8
 > bytes in size and forces 8 byte alignment. Is there a reason they are
 > "unsigned long"?
 > 
 > The patch below drops the size of the struct from 3808 bytes (60
 > 64-byte cachelines) to 3760 bytes (59 64-byte cachelines). A couple
 > other fields in the task struct take a significant amount of space:
 > 
 > struct thread_struct       thread;               688
 > struct held_lock           held_locks[30];       1680
 > 
 > CONFIG_LOCKDEP is turned on in the .config

I sent this .. http://lkml.org/lkml/2007/1/2/299
last month which shrinks task struct by 480 bytes when lockdep
is enabled. Ingo acked it, but then it fell on the floor.

Here it is again..

		Dave

Shrink the held_lock struct by using bitfields.
This shrinks task_struct on lockdep enabled kernels by 480 bytes.

Signed-off-by: Dave Jones <davej@redhat.com>

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index ea097dd..ba81cce 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -175,11 +175,11 @@ struct held_lock {
 	 * The following field is used to detect when we cross into an
 	 * interrupt context:
 	 */
-	int				irq_context;
-	int				trylock;
-	int				read;
-	int				check;
-	int				hardirqs_off;
+	unsigned char irq_context:1;
+	unsigned char trylock:1;
+	unsigned char read:2;
+	unsigned char check:1;
+	unsigned char hardirqs_off:1;
 };
 
 /*

-- 
http://www.codemonkey.org.uk


* Re: Size of 2.6.20 task_struct on x86_64 machines
  2007-02-11  0:20 ` Dave Jones
@ 2007-02-11  2:55   ` Linus Torvalds
  0 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2007-02-11  2:55 UTC
  To: Dave Jones
  Cc: William Cohen, Linux Kernel Mailing List, Andrew Morton, Ingo Molnar



On Sat, 10 Feb 2007, Dave Jones wrote:
> 
> Shrink the held_lock struct by using bitfields.
> This shrinks task_struct on lockdep enabled kernels by 480 bytes.

Are we sure that there are no users that depend on accessing the different 
fields under different locks?

Having them as separate "int" fields means that they don't have any 
interaction, and normal cache coherency will "just work". Once they are 
fields in the same word in memory, updating one field automatically will 
do a read-write cycle on the other fields, and if _they_ are updated by 
interrupts or other CPUs at the same time, a write can get lost.

So I'd like this to be ack'ed by Ingo.

Ingo?

		Linus
---
> Signed-off-by: Dave Jones <davej@redhat.com>
> 
> diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
> index ea097dd..ba81cce 100644
> --- a/include/linux/lockdep.h
> +++ b/include/linux/lockdep.h
> @@ -175,11 +175,11 @@ struct held_lock {
>  	 * The following field is used to detect when we cross into an
>  	 * interrupt context:
>  	 */
> -	int				irq_context;
> -	int				trylock;
> -	int				read;
> -	int				check;
> -	int				hardirqs_off;
> +	unsigned char irq_context:1;
> +	unsigned char trylock:1;
> +	unsigned char read:2;
> +	unsigned char check:1;
> +	unsigned char hardirqs_off:1;
>  };
>  
>  /*
> 
> -- 
> http://www.codemonkey.org.uk
> 

