linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RFC: remove __read_mostly
@ 2007-12-13 22:20 Adrian Bunk
  2007-12-13 22:29 ` Andi Kleen
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Adrian Bunk @ 2007-12-13 22:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arch

I tried the following patch with a full x86 .config [1]:

--- a/include/asm-x86/cache.h
+++ b/include/asm-x86/cache.h
-#define __read_mostly __attribute__((__section__(".data.read_mostly")))
+/* #define __read_mostly __attribute__((__section__(".data.read_mostly"))) */

The result [2,3] was:

-rwxrwxr-x 1 bunk bunk 46607243 2007-12-13 19:50 vmlinux.old
-rwxrwxr-x 1 bunk bunk 46598691 2007-12-13 21:55 vmlinux

It's not a surprise that the kernel can become bigger when __read_mostly 
gets used, especially in cases where __read_mostly prevents gcc 
optimizations.

My question is:
Is there anywhere in the kernel a case where __read_mostly brings a 
measurable improvement or can it be removed?

cu
Adrian

[1] mostly allyesconfig with CONFIG_MODULES=n
[2] vmlinux.old is the unpatched kernel
[3] gcc 4.2 was used

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-13 22:20 RFC: remove __read_mostly Adrian Bunk
@ 2007-12-13 22:29 ` Andi Kleen
  2007-12-13 22:41   ` Adrian Bunk
  2007-12-13 22:32 ` David Miller
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Andi Kleen @ 2007-12-13 22:29 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: linux-kernel, linux-arch

Adrian Bunk <bunk@kernel.org> writes:
>
> -rwxrwxr-x 1 bunk bunk 46607243 2007-12-13 19:50 vmlinux.old
> -rwxrwxr-x 1 bunk bunk 46598691 2007-12-13 21:55 vmlinux

File sizes are useless -- check size output.

> It's not a surprise that the kernel can become bigger when __read_mostly 
> gets used, especially in cases where __read_mostly prevents gcc 
> optimizations.

What optimizations do you think it prevents?  I don't think it 
should change the gcc generated code at all; the only difference
should be to the linker.

-Andi


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-13 22:20 RFC: remove __read_mostly Adrian Bunk
  2007-12-13 22:29 ` Andi Kleen
@ 2007-12-13 22:32 ` David Miller
  2007-12-13 22:44   ` Harvey Harrison
  2007-12-13 22:48 ` Eric Dumazet
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2007-12-13 22:32 UTC (permalink / raw)
  To: bunk; +Cc: linux-kernel, linux-arch

From: Adrian Bunk <bunk@kernel.org>
Date: Thu, 13 Dec 2007 23:20:44 +0100

> My question is:
> Is there anywhere in the kernel a case where __read_mostly brings a 
> measurable improvement or can it be removed?

Yes, on SMP when read-mostly objects share cache lines
with other objects which are frequently written to.

That is the whole reason we created __read_mostly

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-13 22:29 ` Andi Kleen
@ 2007-12-13 22:41   ` Adrian Bunk
  2007-12-14 16:16     ` Arnd Bergmann
  0 siblings, 1 reply; 21+ messages in thread
From: Adrian Bunk @ 2007-12-13 22:41 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, linux-arch

On Thu, Dec 13, 2007 at 11:29:08PM +0100, Andi Kleen wrote:
> Adrian Bunk <bunk@kernel.org> writes:
> >
> > -rwxrwxr-x 1 bunk bunk 46607243 2007-12-13 19:50 vmlinux.old
> > -rwxrwxr-x 1 bunk bunk 46598691 2007-12-13 21:55 vmlinux
> 
> File sizes are useless -- check size output.

    text    data     bss      dec     hex filename
29268488 3697961 5222400 38188849 246b731 vmlinux.old
29268435 3685565 5228784 38192784 246c690 vmlinux

> > It's not a surprise that the kernel can become bigger when __read_mostly 
> > gets used, especially in cases where __read_mostly prevents gcc 
> > optimizations.
> 
> What optimizations do you think it prevents?  I don't think it 
> should change the gcc generated code at all; the only difference
> should be to the linker.

What I have seen recently was a static variable marked __read_mostly 
being read-only with some CONFIG_DEBUG_FOO=n.

With __read_mostly gcc feeled obliged to emit a variable.

Without __read_mostly, gcc optimized the variable away.

> -Andi

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-13 22:32 ` David Miller
@ 2007-12-13 22:44   ` Harvey Harrison
  2007-12-13 23:06     ` Andi Kleen
  0 siblings, 1 reply; 21+ messages in thread
From: Harvey Harrison @ 2007-12-13 22:44 UTC (permalink / raw)
  To: David Miller; +Cc: bunk, linux-kernel, linux-arch, Mathieu Desnoyers

On Thu, 2007-12-13 at 14:32 -0800, David Miller wrote:
> From: Adrian Bunk <bunk@kernel.org>
> Date: Thu, 13 Dec 2007 23:20:44 +0100
> 
> > My question is:
> > Is there anywhere in the kernel a case where __read_mostly brings a 
> > measurable improvement or can it be removed?
> 
> Yes, on SMP when read-mostly objects share cache lines
> with other objects which are frequently written to.
> 
> That is the whole reason we created __read_mostly

I'm curious if anyone has been looking into replacing the __read_mostly
approach with Mathieu's immediate values patchset.  Wouldn't they solve
the cacheline sharing as well (perhaps more eficiently even with trading
some icache for dcache)?

Harvey


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-13 22:20 RFC: remove __read_mostly Adrian Bunk
  2007-12-13 22:29 ` Andi Kleen
  2007-12-13 22:32 ` David Miller
@ 2007-12-13 22:48 ` Eric Dumazet
  2007-12-13 23:00   ` Adrian Bunk
  2007-12-13 23:54 ` Kyle McMartin
  2007-12-14 15:24 ` Matt Mackall
  4 siblings, 1 reply; 21+ messages in thread
From: Eric Dumazet @ 2007-12-13 22:48 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: linux-kernel, linux-arch

Adrian Bunk a écrit :
> I tried the following patch with a full x86 .config [1]:
> 
> --- a/include/asm-x86/cache.h
> +++ b/include/asm-x86/cache.h
> -#define __read_mostly __attribute__((__section__(".data.read_mostly")))
> +/* #define __read_mostly __attribute__((__section__(".data.read_mostly"))) */
> 
> The result [2,3] was:
> 
> -rwxrwxr-x 1 bunk bunk 46607243 2007-12-13 19:50 vmlinux.old
> -rwxrwxr-x 1 bunk bunk 46598691 2007-12-13 21:55 vmlinux
> 
> It's not a surprise that the kernel can become bigger when __read_mostly 
> gets used, especially in cases where __read_mostly prevents gcc 
> optimizations.
> 
> My question is:
> Is there anywhere in the kernel a case where __read_mostly brings a 
> measurable improvement or can it be removed?

Yes, there are many cases where read_mostly brings huge improvements.

I did test your idea on a 4 CPUS server, and system time was roughly doubled, 
from 11% to 20%

Of course, you noticed that puting a __read_mostly attribute force the linker 
to reserve space for the variable. So a null variable previously in bss 
section (no space in vmlinux file) is now in .data.read_mostly. Not a big deal.

If you want, you could play some .lds games to create sort of a 
"bss.read_mostly" section to save 10000 bytes in vmlinux.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-13 22:48 ` Eric Dumazet
@ 2007-12-13 23:00   ` Adrian Bunk
  0 siblings, 0 replies; 21+ messages in thread
From: Adrian Bunk @ 2007-12-13 23:00 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, linux-arch

On Thu, Dec 13, 2007 at 11:48:27PM +0100, Eric Dumazet wrote:
> Adrian Bunk a écrit :
>> I tried the following patch with a full x86 .config [1]:
>>
>> --- a/include/asm-x86/cache.h
>> +++ b/include/asm-x86/cache.h
>> -#define __read_mostly __attribute__((__section__(".data.read_mostly")))
>> +/* #define __read_mostly __attribute__((__section__(".data.read_mostly"))) */
>>
>> The result [2,3] was:
>>
>> -rwxrwxr-x 1 bunk bunk 46607243 2007-12-13 19:50 vmlinux.old
>> -rwxrwxr-x 1 bunk bunk 46598691 2007-12-13 21:55 vmlinux
>>
>> It's not a surprise that the kernel can become bigger when __read_mostly 
>> gets used, especially in cases where __read_mostly prevents gcc 
>> optimizations.
>>
>> My question is:
>> Is there anywhere in the kernel a case where __read_mostly brings a 
>> measurable improvement or can it be removed?
>
> Yes, there are many cases where read_mostly brings huge improvements.
>
> I did test your idea on a 4 CPUS server, and system time was roughly 
> doubled, from 11% to 20%

Thanks, that answered my question.

> Of course, you noticed that puting a __read_mostly attribute force the 
> linker to reserve space for the variable. So a null variable previously in 
> bss section (no space in vmlinux file) is now in .data.read_mostly. Not a 
> big deal.
>
> If you want, you could play some .lds games to create sort of a 
> "bss.read_mostly" section to save 10000 bytes in vmlinux.

No, what I was thinking of kernel code like

static int fooxxxvar = 0;
static int __read_mostly fooyyyvar = 0;

void mytest()
{
        if (fooxxxvar)
                printk("fooxxx");

        if (fooyyyvar)
                printk("fooyyy");
}

If you run "strings file.c | grep foo" on the file it will only return
"fooyyy" but not "fooxxx".

It's not a big deal, but it's an area where __read_mostly affects gcc 
and not ld.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-13 22:44   ` Harvey Harrison
@ 2007-12-13 23:06     ` Andi Kleen
  0 siblings, 0 replies; 21+ messages in thread
From: Andi Kleen @ 2007-12-13 23:06 UTC (permalink / raw)
  To: Harvey Harrison
  Cc: David Miller, bunk, linux-kernel, linux-arch, Mathieu Desnoyers

Harvey Harrison <harvey.harrison@gmail.com> writes:

> On Thu, 2007-12-13 at 14:32 -0800, David Miller wrote:
>> From: Adrian Bunk <bunk@kernel.org>
>> Date: Thu, 13 Dec 2007 23:20:44 +0100
>> 
>> > My question is:
>> > Is there anywhere in the kernel a case where __read_mostly brings a 
>> > measurable improvement or can it be removed?
>> 
>> Yes, on SMP when read-mostly objects share cache lines
>> with other objects which are frequently written to.
>> 
>> That is the whole reason we created __read_mostly
>
> I'm curious if anyone has been looking into replacing the __read_mostly
> approach with Mathieu's immediate values patchset.  Wouldn't they solve
> the cacheline sharing as well

Yes it would in most cases.  This means the writing case is much
more expensive with them, so if there is anything which is still
relatively frequently written (just not "mostly") it would be 
probably not a good idea. It should be only done for settings
that practically never change.

The other problem is that you have to change all accesses to the new
accessor macros which is potentially a lot. Also you have to add
sync calls for the writing case, which means that sysctl etc. setup
will be more complicated because you need "strategy" functions now.

Also for things that are not that critical the uglification in
the source is likely not worth it.

I would suggest to only do it after looking at oprofile cache miss
profiles.

> (perhaps more eficiently even with trading
> some icache for dcache)?

It should be more efficient because there will be less dcache misses.
icache is typically already prefetched by the CPU. This is especially
useful for the kernel because it often runs cache cold because user
space used all the cache.

-Andi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-13 22:20 RFC: remove __read_mostly Adrian Bunk
                   ` (2 preceding siblings ...)
  2007-12-13 22:48 ` Eric Dumazet
@ 2007-12-13 23:54 ` Kyle McMartin
  2007-12-14  0:33   ` Andi Kleen
  2007-12-14 15:24 ` Matt Mackall
  4 siblings, 1 reply; 21+ messages in thread
From: Kyle McMartin @ 2007-12-13 23:54 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: linux-kernel, linux-arch

On Thu, Dec 13, 2007 at 11:20:44PM +0100, Adrian Bunk wrote:
> Is there anywhere in the kernel a case where __read_mostly brings a 
> measurable improvement or can it be removed?
> 

Yes, definitely[1]... the problem is, and this is also true of other
annotations[2], that people go absolutely nuts adding these annotations
without doing any profiling to see whether they actually improve things.

I'd bet, in the __read_mostly case at least, that there's no
improvement in almost all cases.

cheers, Kyle

1. It's hugely relevant on big-SMP machines.
2. I mean __read_mostly, likely, unlikely, etc., not the useful sparse
annotations.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-13 23:54 ` Kyle McMartin
@ 2007-12-14  0:33   ` Andi Kleen
  2007-12-17 10:33     ` Andrew Morton
  0 siblings, 1 reply; 21+ messages in thread
From: Andi Kleen @ 2007-12-14  0:33 UTC (permalink / raw)
  To: Kyle McMartin; +Cc: Adrian Bunk, linux-kernel, linux-arch

Kyle McMartin <kyle@mcmartin.ca> writes:

> I'd bet, in the __read_mostly case at least, that there's no
> improvement in almost all cases.

I bet you're wrong. Cache line behaviour is critical, much more
than pipeline behaviour (which unlikely affects). That is because
if you eat a cache miss it gets really expensive, which e.g.
a mispredicted jump is relatively cheap in comparison. We're talking
one or more orders of magnitude.

I admit I'm to blame for both (submitted unlikely and asked for
__read_mostly) and I now consider unlikely a mistake now by hindsight,
but still think __read_mostly was a good idea.

There is one potential problem in that if __read_mostly is used too
aggressively then the non __read_mostly variables will be all "write
mostly" with nothing inbetween and that could lead to more false
sharing, but so far this doesn't seem to be a big problem (and that
one could be solved too)

-Andi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-13 22:20 RFC: remove __read_mostly Adrian Bunk
                   ` (3 preceding siblings ...)
  2007-12-13 23:54 ` Kyle McMartin
@ 2007-12-14 15:24 ` Matt Mackall
  2007-12-14 15:38   ` Eric Dumazet
  4 siblings, 1 reply; 21+ messages in thread
From: Matt Mackall @ 2007-12-14 15:24 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: linux-kernel, linux-arch

On Thu, Dec 13, 2007 at 11:20:44PM +0100, Adrian Bunk wrote:
> I tried the following patch with a full x86 .config [1]:
> 
> --- a/include/asm-x86/cache.h
> +++ b/include/asm-x86/cache.h
> -#define __read_mostly __attribute__((__section__(".data.read_mostly")))
> +/* #define __read_mostly __attribute__((__section__(".data.read_mostly"))) */
> 
> The result [2,3] was:
> 
> -rwxrwxr-x 1 bunk bunk 46607243 2007-12-13 19:50 vmlinux.old
> -rwxrwxr-x 1 bunk bunk 46598691 2007-12-13 21:55 vmlinux
> 
> It's not a surprise that the kernel can become bigger when __read_mostly 
> gets used, especially in cases where __read_mostly prevents gcc 
> optimizations.
> 
> My question is:
> Is there anywhere in the kernel a case where __read_mostly brings a 
> measurable improvement or can it be removed?

Yes, but perhaps we can put it under CONFIG_BASE_FULL?

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-14 15:24 ` Matt Mackall
@ 2007-12-14 15:38   ` Eric Dumazet
  2007-12-14 15:42     ` Matt Mackall
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Dumazet @ 2007-12-14 15:38 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Adrian Bunk, linux-kernel, linux-arch

Matt Mackall a écrit :
> On Thu, Dec 13, 2007 at 11:20:44PM +0100, Adrian Bunk wrote:
>   
>> I tried the following patch with a full x86 .config [1]:
>>
>> --- a/include/asm-x86/cache.h
>> +++ b/include/asm-x86/cache.h
>> -#define __read_mostly __attribute__((__section__(".data.read_mostly")))
>> +/* #define __read_mostly __attribute__((__section__(".data.read_mostly"))) */
>>
>> The result [2,3] was:
>>
>> -rwxrwxr-x 1 bunk bunk 46607243 2007-12-13 19:50 vmlinux.old
>> -rwxrwxr-x 1 bunk bunk 46598691 2007-12-13 21:55 vmlinux
>>
>> It's not a surprise that the kernel can become bigger when __read_mostly 
>> gets used, especially in cases where __read_mostly prevents gcc 
>> optimizations.
>>
>> My question is:
>> Is there anywhere in the kernel a case where __read_mostly brings a 
>> measurable improvement or can it be removed?
>>     
>
> Yes, but perhaps we can put it under CONFIG_BASE_FULL?
>
>   
Yes, we probably can do something like that (in addition to !CONFIG_SMP)






^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-14 15:38   ` Eric Dumazet
@ 2007-12-14 15:42     ` Matt Mackall
  0 siblings, 0 replies; 21+ messages in thread
From: Matt Mackall @ 2007-12-14 15:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Adrian Bunk, linux-kernel, linux-arch

On Fri, Dec 14, 2007 at 04:38:04PM +0100, Eric Dumazet wrote:
> Matt Mackall a ?crit :
> >On Thu, Dec 13, 2007 at 11:20:44PM +0100, Adrian Bunk wrote:
> >  
> >>I tried the following patch with a full x86 .config [1]:
> >>
> >>--- a/include/asm-x86/cache.h
> >>+++ b/include/asm-x86/cache.h
> >>-#define __read_mostly __attribute__((__section__(".data.read_mostly")))
> >>+/* #define __read_mostly 
> >>__attribute__((__section__(".data.read_mostly"))) */
> >>
> >>The result [2,3] was:
> >>
> >>-rwxrwxr-x 1 bunk bunk 46607243 2007-12-13 19:50 vmlinux.old
> >>-rwxrwxr-x 1 bunk bunk 46598691 2007-12-13 21:55 vmlinux
> >>
> >>It's not a surprise that the kernel can become bigger when __read_mostly 
> >>gets used, especially in cases where __read_mostly prevents gcc 
> >>optimizations.
> >>
> >>My question is:
> >>Is there anywhere in the kernel a case where __read_mostly brings a 
> >>measurable improvement or can it be removed?
> >>    
> >
> >Yes, but perhaps we can put it under CONFIG_BASE_FULL?
> >
> >  
> Yes, we probably can do something like that (in addition to !CONFIG_SMP)

Excellent point. If either CONFIG_BASE_FULL or CONFIG_SMP are not set,
we should not define __read_mostly.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-13 22:41   ` Adrian Bunk
@ 2007-12-14 16:16     ` Arnd Bergmann
  2007-12-14 16:31       ` Eric Dumazet
  0 siblings, 1 reply; 21+ messages in thread
From: Arnd Bergmann @ 2007-12-14 16:16 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: Andi Kleen, linux-kernel, linux-arch

On Thursday 13 December 2007, Adrian Bunk wrote:
> On Thu, Dec 13, 2007 at 11:29:08PM +0100, Andi Kleen wrote:
> > Adrian Bunk <bunk@kernel.org> writes:
> > >
> > > -rwxrwxr-x 1 bunk bunk 46607243 2007-12-13 19:50 vmlinux.old
> > > -rwxrwxr-x 1 bunk bunk 46598691 2007-12-13 21:55 vmlinux
> > 
> > File sizes are useless -- check size output.
> 
>     text    data     bss      dec     hex filename
> 29268488 3697961 5222400 38188849 246b731 vmlinux.old
> 29268435 3685565 5228784 38192784 246c690 vmlinux

Just to make sure everyone interprets this correctly:

The file size in the first example suggests a 8552 byte
(0.02%) size improvement for removing __read_mostly.

The size output shows a -3935 byte (0.01%) size penalty
instead, much smaller because data that was moved out to
the .data.read_mostly section from .bss now takes space
in the binary but won't consume more RAM.

Since 'size' does not take any sections except text, data and
bss into account, its output is more often than not also
misleading, but at least it shows that the footprint is likely
to get larger without __read_mostly rather than smaller.

	Arnd <><

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-14 16:16     ` Arnd Bergmann
@ 2007-12-14 16:31       ` Eric Dumazet
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Dumazet @ 2007-12-14 16:31 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Adrian Bunk, Andi Kleen, linux-kernel, linux-arch

Arnd Bergmann a écrit :
> On Thursday 13 December 2007, Adrian Bunk wrote:
>   
>> On Thu, Dec 13, 2007 at 11:29:08PM +0100, Andi Kleen wrote:
>>     
>>> Adrian Bunk <bunk@kernel.org> writes:
>>>       
>>>> -rwxrwxr-x 1 bunk bunk 46607243 2007-12-13 19:50 vmlinux.old
>>>> -rwxrwxr-x 1 bunk bunk 46598691 2007-12-13 21:55 vmlinux
>>>>         
>>> File sizes are useless -- check size output.
>>>       
>>     text    data     bss      dec     hex filename
>> 29268488 3697961 5222400 38188849 246b731 vmlinux.old
>> 29268435 3685565 5228784 38192784 246c690 vmlinux
>>     
>
> Just to make sure everyone interprets this correctly:
>
> The file size in the first example suggests a 8552 byte
> (0.02%) size improvement for removing __read_mostly.
>
> The size output shows a -3935 byte (0.01%) size penalty
> instead, much smaller because data that was moved out to
> the .data.read_mostly section from .bss now takes space
> in the binary but won't consume more RAM.
>
> Since 'size' does not take any sections except text, data and
> bss into account, its output is more often than not also
> misleading, but at least it shows that the footprint is likely
> to get larger without __read_mostly rather than smaller.
>
>   
Your analysis is fine, except the last point. Just check "size -A 
vmlinux" and you'll see
that "size vmlinux" take into account not only "text, data and bss"

# size vmlinux
   text    data     bss     dec     hex filename
4835243  450722  610304 5896269  59f84d vmlinux
# size -A vmlinux
vmlinux  :
section                      size         addr
.text.head                    885   3222274048
.text                     3359764   3222278144
__ex_table                   3904   3225637920
.notes                         36   3225641824
__bug_table                 21120   3225641864
.rodata                   1130680   3225665536
.pci_fixup                   2192   3226796216
__ksymtab                   23104   3226798408
__ksymtab_gpl                7224   3226821512
__ksymtab_unused_gpl           16   3226828736
__ksymtab_gpl_future           24   3226828752
__ksymtab_strings           72470   3226828776
__param                      4020   3226901248
.data                      309056   3226906624
.data_nosave                 4096   3227217920
.data.page_aligned           2048   3227222016
.data.cacheline_aligned     46720   3227224064
.data.read_mostly           11388   3227270784
.data.init_task              8192   3227287552
.smp_locks                  19128   3227295744
.init.text                 137115   3227316224
.init.data                  40290   3227453344
.init.setup                  1656   3227493648
.initcall.init               1124   3227495304
.con_initcall.init              8   3227496428
.altinstructions            38771   3227496436
.altinstr_replacement        9908   3227535207
.exit.text                   4750   3227545116
.init.ramfs                   132   3227553792
.data.percpu                26144   3227557888
.bss                       610304   3227586560
.comment                    19746            0
Total                     5916015





^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-14  0:33   ` Andi Kleen
@ 2007-12-17 10:33     ` Andrew Morton
  2007-12-17 10:53       ` Eric Dumazet
  2007-12-17 12:15       ` Andi Kleen
  0 siblings, 2 replies; 21+ messages in thread
From: Andrew Morton @ 2007-12-17 10:33 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Kyle McMartin, Adrian Bunk, linux-kernel, linux-arch

On Fri, 14 Dec 2007 01:33:45 +0100 Andi Kleen <andi@firstfloor.org> wrote:

> Kyle McMartin <kyle@mcmartin.ca> writes:
> 
> > I'd bet, in the __read_mostly case at least, that there's no
> > improvement in almost all cases.
> 
> I bet you're wrong. Cache line behaviour is critical, much more
> than pipeline behaviour (which unlikely affects). That is because
> if you eat a cache miss it gets really expensive, which e.g.
> a mispredicted jump is relatively cheap in comparison. We're talking
> one or more orders of magnitude.

So...  once we've moved all read-mostly variables into __read_mostly, what
is left behind in bss?

All the write-often variables.  All optimally packed together to nicely
maximise cacheline sharing.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-17 10:33     ` Andrew Morton
@ 2007-12-17 10:53       ` Eric Dumazet
  2007-12-17 11:07         ` Andrew Morton
  2007-12-17 12:15       ` Andi Kleen
  1 sibling, 1 reply; 21+ messages in thread
From: Eric Dumazet @ 2007-12-17 10:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Kyle McMartin, Adrian Bunk, linux-kernel, linux-arch

On Mon, 17 Dec 2007 02:33:39 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Fri, 14 Dec 2007 01:33:45 +0100 Andi Kleen <andi@firstfloor.org> wrote:
> 
> > Kyle McMartin <kyle@mcmartin.ca> writes:
> > 
> > > I'd bet, in the __read_mostly case at least, that there's no
> > > improvement in almost all cases.
> > 
> > I bet you're wrong. Cache line behaviour is critical, much more
> > than pipeline behaviour (which unlikely affects). That is because
> > if you eat a cache miss it gets really expensive, which e.g.
> > a mispredicted jump is relatively cheap in comparison. We're talking
> > one or more orders of magnitude.
> 
> So...  once we've moved all read-mostly variables into __read_mostly, what
> is left behind in bss?
> 
> All the write-often variables.  All optimally packed together to nicely
> maximise cacheline sharing.

This is why it's important to group related variables together, so that they share
same cacheline.

Random example : vmlist_lock & vmlist

Currently in two separate cache lines (not that important since vmlist is
 so big that one extra cache line access is not measurable)

Other possibilities are :

1) to make sure that really critical hot spots are alone 
(they eventually waste a full cacheline, even if only 4 bytes are in use)

2) Or they are mixed with seldom used data. (One cache line contains one
critical object + other mostly_unused data). This kind of mixing
is really hard to do without a special linker.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-17 10:53       ` Eric Dumazet
@ 2007-12-17 11:07         ` Andrew Morton
  2007-12-17 12:19           ` Andi Kleen
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2007-12-17 11:07 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andi Kleen, Kyle McMartin, Adrian Bunk, linux-kernel, linux-arch

On Mon, 17 Dec 2007 11:53:36 +0100 Eric Dumazet <dada1@cosmosbay.com> wrote:

> n Mon, 17 Dec 2007 02:33:39 -0800
> Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > On Fri, 14 Dec 2007 01:33:45 +0100 Andi Kleen <andi@firstfloor.org> wrote:
> > 
> > > Kyle McMartin <kyle@mcmartin.ca> writes:
> > > 
> > > > I'd bet, in the __read_mostly case at least, that there's no
> > > > improvement in almost all cases.
> > > 
> > > I bet you're wrong. Cache line behaviour is critical, much more
> > > than pipeline behaviour (which unlikely affects). That is because
> > > if you eat a cache miss it gets really expensive, which e.g.
> > > a mispredicted jump is relatively cheap in comparison. We're talking
> > > one or more orders of magnitude.
> > 
> > So...  once we've moved all read-mostly variables into __read_mostly, what
> > is left behind in bss?
> > 
> > All the write-often variables.  All optimally packed together to nicely
> > maximise cacheline sharing.
> 
> This is why it's important to group related variables together, so that they share
> same cacheline.

Not feasible.  The linker is (I believe) free to place them anywhere it
likes unless we go and aggregate them in a struct.

Take (just for one example) inode_lock.  How do we prevent that from
sharing a cacheline with (to pick another example) rtnl_mutex?

The insidious thing about this is that is is highly dependent upon
compiler/linker version and upon kernel config.  So performance differences
will appears and disappear with us having very little understanding why.


I guess we could hunt down the write-very-often variables and put them in
private cachelines.  But there will be a *lot* of them when one considers
all possible workloads and all possible drivers.


Now, if we had named it __read_often rather than __read_mostly then we might
end up with a better result: all those read-mostly, read-rarely variables (and
there are a lot of those) could be very usefully deployed by packing them
in between the write-often variables.

It's crying out for a performance-guided solution.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-17 10:33     ` Andrew Morton
  2007-12-17 10:53       ` Eric Dumazet
@ 2007-12-17 12:15       ` Andi Kleen
  2007-12-17 12:40         ` Adrian Bunk
  1 sibling, 1 reply; 21+ messages in thread
From: Andi Kleen @ 2007-12-17 12:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Kyle McMartin, Adrian Bunk, linux-kernel, linux-arch

> So...  once we've moved all read-mostly variables into __read_mostly, what
> is left behind in bss?

I had already covered that in the next paragraph which you conveniently
snipped :)

Anyways I suspect the right solution for that would be more classes
of variables for even better grouping.

And for really often writen variables cache line padding is probably a good
idea.

-Andi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-17 11:07         ` Andrew Morton
@ 2007-12-17 12:19           ` Andi Kleen
  0 siblings, 0 replies; 21+ messages in thread
From: Andi Kleen @ 2007-12-17 12:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric Dumazet, Andi Kleen, Kyle McMartin, Adrian Bunk,
	linux-kernel, linux-arch

On Mon, Dec 17, 2007 at 03:07:43AM -0800, Andrew Morton wrote:
> On Mon, 17 Dec 2007 11:53:36 +0100 Eric Dumazet <dada1@cosmosbay.com> wrote:
> 
> > n Mon, 17 Dec 2007 02:33:39 -0800
> > Andrew Morton <akpm@linux-foundation.org> wrote:
> > 
> > > On Fri, 14 Dec 2007 01:33:45 +0100 Andi Kleen <andi@firstfloor.org> wrote:
> > > 
> > > > Kyle McMartin <kyle@mcmartin.ca> writes:
> > > > 
> > > > > I'd bet, in the __read_mostly case at least, that there's no
> > > > > improvement in almost all cases.
> > > > 
> > > > I bet you're wrong. Cache line behaviour is critical, much more
> > > > than pipeline behaviour (which unlikely affects). That is because
> > > > if you eat a cache miss it gets really expensive, which e.g.
> > > > a mispredicted jump is relatively cheap in comparison. We're talking
> > > > one or more orders of magnitude.
> > > 
> > > So...  once we've moved all read-mostly variables into __read_mostly, what
> > > is left behind in bss?
> > > 
> > > All the write-often variables.  All optimally packed together to nicely
> > > maximise cacheline sharing.
> > 
> > This is why it's important to group related variables together, so that they share
> > same cacheline.
> 
> Not feasible.  The linker is (I believe) free to place them anywhere it
> likes unless we go and aggregate them in a struct.

It won't normally though (they are put linear for each object file)
and if you really want to make sure you can always use a special section.

> The insidious thing about this is that is is highly dependent upon
> compiler/linker version and upon kernel config.  So performance differences

I'm not aware of the order of global variables changing that much.
Especially the linker seems to keep it rather stable.

> end up with a better result: all those read-mostly, read-rarely variables (and
> there are a lot of those) could be very usefully deployed by packing them
> in between the write-often variables.
> 
> It's crying out for a performance-guided solution.

My problem with profile feedback is that it will make it impossible
to ever recreate kernel binaries after the fact. So decoding of random
oopses would become much harder.

-Andi



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RFC: remove __read_mostly
  2007-12-17 12:15       ` Andi Kleen
@ 2007-12-17 12:40         ` Adrian Bunk
  0 siblings, 0 replies; 21+ messages in thread
From: Adrian Bunk @ 2007-12-17 12:40 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Kyle McMartin, linux-kernel, linux-arch

On Mon, Dec 17, 2007 at 01:15:56PM +0100, Andi Kleen wrote:
> > So...  once we've moved all read-mostly variables into __read_mostly, what
> > is left behind in bss?
> 
> I had already covered that in the next paragraph which you conveniently
> snipped :)
> 
> Anyways I suspect the right solution for that would be more classes
> of variables for even better grouping.
>...

Is there any way to tell gcc that it's still allowed to omit the 
variable when it's able to prove it's read-only?

My basic fear with all the __read_mostly, immediate values,... stuff is 
that it might interfere with gcc optimizations.

> -Andi

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2007-12-17 12:40 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-12-13 22:20 RFC: remove __read_mostly Adrian Bunk
2007-12-13 22:29 ` Andi Kleen
2007-12-13 22:41   ` Adrian Bunk
2007-12-14 16:16     ` Arnd Bergmann
2007-12-14 16:31       ` Eric Dumazet
2007-12-13 22:32 ` David Miller
2007-12-13 22:44   ` Harvey Harrison
2007-12-13 23:06     ` Andi Kleen
2007-12-13 22:48 ` Eric Dumazet
2007-12-13 23:00   ` Adrian Bunk
2007-12-13 23:54 ` Kyle McMartin
2007-12-14  0:33   ` Andi Kleen
2007-12-17 10:33     ` Andrew Morton
2007-12-17 10:53       ` Eric Dumazet
2007-12-17 11:07         ` Andrew Morton
2007-12-17 12:19           ` Andi Kleen
2007-12-17 12:15       ` Andi Kleen
2007-12-17 12:40         ` Adrian Bunk
2007-12-14 15:24 ` Matt Mackall
2007-12-14 15:38   ` Eric Dumazet
2007-12-14 15:42     ` Matt Mackall

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).