All of lore.kernel.org
 help / color / mirror / Atom feed
* String literals in __init functions
@ 2015-03-25 17:56 ` Mason
  0 siblings, 0 replies; 30+ messages in thread
From: Mason @ 2015-03-25 17:56 UTC (permalink / raw)
  To: Linux ARM; +Cc: LKML

Hello everyone,

AFAIU, functions only used at system init are tagged __init to have
the linker store them in a separate .init.text section, so memory can
be reclaimed once initialization is complete. Is that correct?

The corresponding tag for data is __initdata (section .init.data)

I started wondering if the string literals used in an __init functions
were automatically marked __initdata.

Looking at the objdump output, I see that the string literals are,
in fact, stored in the .rodata section. I suppose that .rodata is NOT
reclaimed after init?

This way seems to work:

static       char XyZa[] __initdata  = KERN_ALERT "foo";
static const char XyZb[] __initconst = KERN_ALERT "bar";
void __init XyZc(void) { printk(XyZa); printk(XyZb); }

$ arm-linux-gnueabihf-objdump -xd arch/arm/mach-tangox/time.o | grep XyZ
00000000 l     O .init.data	00000006 XyZa
00000000 l     O .init.rodata	00000006 XyZb
00000000 g     F .init.text	00000028 XyZc
00000000 <XyZc>:

$ arm-linux-gnueabihf-objdump -xd vmlinux | grep XyZ
c021e360 l     O .init.data	00000006 XyZa
c0220090 l     O .init.data	00000006 XyZb
c020d928 g     F .init.text	00000028 XyZc
c020d928 <XyZc>:

c020d928 <XyZc>:
c020d928:       e1a0c00d        mov     ip, sp
c020d92c:       e92dd800        push    {fp, ip, lr, pc}
c020d930:       e24cb004        sub     fp, ip, #4
c020d934:       e30e0360        movw    r0, #58208      ; 0xe360
c020d938:       e34c0021        movt    r0, #49185      ; 0xc021
c020d93c:       ebfe00c9        bl      c018dc68 <printk>
c020d940:       e3000090        movw    r0, #144        ; 0x90
c020d944:       e34c0022        movt    r0, #49186      ; 0xc022
c020d948:       ebfe00c6        bl      c018dc68 <printk>
c020d94c:       e89da800        ldm     sp, {fp, sp, pc}

Did I miss something in init.h?
Or should it be done like above to reclaim string literals?

Regards.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-03-25 17:56 ` Mason
  0 siblings, 0 replies; 30+ messages in thread
From: Mason @ 2015-03-25 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

Hello everyone,

AFAIU, functions only used at system init are tagged __init to have
the linker store them in a separate .init.text section, so memory can
be reclaimed once initialization is complete. Is that correct?

The corresponding tag for data is __initdata (section .init.data)

I started wondering if the string literals used in an __init functions
were automatically marked __initdata.

Looking at the objdump output, I see that the string literals are,
in fact, stored in the .rodata section. I suppose that .rodata is NOT
reclaimed after init?

This way seems to work:

static       char XyZa[] __initdata  = KERN_ALERT "foo";
static const char XyZb[] __initconst = KERN_ALERT "bar";
void __init XyZc(void) { printk(XyZa); printk(XyZb); }

$ arm-linux-gnueabihf-objdump -xd arch/arm/mach-tangox/time.o | grep XyZ
00000000 l     O .init.data	00000006 XyZa
00000000 l     O .init.rodata	00000006 XyZb
00000000 g     F .init.text	00000028 XyZc
00000000 <XyZc>:

$ arm-linux-gnueabihf-objdump -xd vmlinux | grep XyZ
c021e360 l     O .init.data	00000006 XyZa
c0220090 l     O .init.data	00000006 XyZb
c020d928 g     F .init.text	00000028 XyZc
c020d928 <XyZc>:

c020d928 <XyZc>:
c020d928:       e1a0c00d        mov     ip, sp
c020d92c:       e92dd800        push    {fp, ip, lr, pc}
c020d930:       e24cb004        sub     fp, ip, #4
c020d934:       e30e0360        movw    r0, #58208      ; 0xe360
c020d938:       e34c0021        movt    r0, #49185      ; 0xc021
c020d93c:       ebfe00c9        bl      c018dc68 <printk>
c020d940:       e3000090        movw    r0, #144        ; 0x90
c020d944:       e34c0022        movt    r0, #49186      ; 0xc022
c020d948:       ebfe00c6        bl      c018dc68 <printk>
c020d94c:       e89da800        ldm     sp, {fp, sp, pc}

Did I miss something in init.h?
Or should it be done like above to reclaim string literals?

Regards.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-03-25 17:56 ` Mason
@ 2015-03-25 18:01   ` Joe Perches
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Perches @ 2015-03-25 18:01 UTC (permalink / raw)
  To: Mason; +Cc: Linux ARM, LKML

On Wed, 2015-03-25 at 18:56 +0100, Mason wrote:
> Hello everyone,
> 
> AFAIU, functions only used at system init are tagged __init to have
> the linker store them in a separate .init.text section, so memory can
> be reclaimed once initialization is complete. Is that correct?
> 
> The corresponding tag for data is __initdata (section .init.data)
> 
> I started wondering if the string literals used in an __init functions
> were automatically marked __initdata.
> 
> Looking at the objdump output, I see that the string literals are,
> in fact, stored in the .rodata section. I suppose that .rodata is NOT
> reclaimed after init?
> 
> This way seems to work:
> 
> static       char XyZa[] __initdata  = KERN_ALERT "foo";
> static const char XyZb[] __initconst = KERN_ALERT "bar";
> void __init XyZc(void) { printk(XyZa); printk(XyZb); }
> 
> $ arm-linux-gnueabihf-objdump -xd arch/arm/mach-tangox/time.o | grep XyZ
> 00000000 l     O .init.data	00000006 XyZa
> 00000000 l     O .init.rodata	00000006 XyZb
> 00000000 g     F .init.text	00000028 XyZc
> 00000000 <XyZc>:
> 
> $ arm-linux-gnueabihf-objdump -xd vmlinux | grep XyZ
> c021e360 l     O .init.data	00000006 XyZa
> c0220090 l     O .init.data	00000006 XyZb
> c020d928 g     F .init.text	00000028 XyZc
> c020d928 <XyZc>:
> 
> c020d928 <XyZc>:
> c020d928:       e1a0c00d        mov     ip, sp
> c020d92c:       e92dd800        push    {fp, ip, lr, pc}
> c020d930:       e24cb004        sub     fp, ip, #4
> c020d934:       e30e0360        movw    r0, #58208      ; 0xe360
> c020d938:       e34c0021        movt    r0, #49185      ; 0xc021
> c020d93c:       ebfe00c9        bl      c018dc68 <printk>
> c020d940:       e3000090        movw    r0, #144        ; 0x90
> c020d944:       e34c0022        movt    r0, #49186      ; 0xc022
> c020d948:       ebfe00c6        bl      c018dc68 <printk>
> c020d94c:       e89da800        ldm     sp, {fp, sp, pc}
> 
> Did I miss something in init.h?
> Or should it be done like above to reclaim string literals?

No, you didn't miss anything.

One proposal:

https://lkml.org/lkml/2014/8/21/255
> Regards.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/




^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-03-25 18:01   ` Joe Perches
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Perches @ 2015-03-25 18:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 2015-03-25 at 18:56 +0100, Mason wrote:
> Hello everyone,
> 
> AFAIU, functions only used at system init are tagged __init to have
> the linker store them in a separate .init.text section, so memory can
> be reclaimed once initialization is complete. Is that correct?
> 
> The corresponding tag for data is __initdata (section .init.data)
> 
> I started wondering if the string literals used in an __init functions
> were automatically marked __initdata.
> 
> Looking at the objdump output, I see that the string literals are,
> in fact, stored in the .rodata section. I suppose that .rodata is NOT
> reclaimed after init?
> 
> This way seems to work:
> 
> static       char XyZa[] __initdata  = KERN_ALERT "foo";
> static const char XyZb[] __initconst = KERN_ALERT "bar";
> void __init XyZc(void) { printk(XyZa); printk(XyZb); }
> 
> $ arm-linux-gnueabihf-objdump -xd arch/arm/mach-tangox/time.o | grep XyZ
> 00000000 l     O .init.data	00000006 XyZa
> 00000000 l     O .init.rodata	00000006 XyZb
> 00000000 g     F .init.text	00000028 XyZc
> 00000000 <XyZc>:
> 
> $ arm-linux-gnueabihf-objdump -xd vmlinux | grep XyZ
> c021e360 l     O .init.data	00000006 XyZa
> c0220090 l     O .init.data	00000006 XyZb
> c020d928 g     F .init.text	00000028 XyZc
> c020d928 <XyZc>:
> 
> c020d928 <XyZc>:
> c020d928:       e1a0c00d        mov     ip, sp
> c020d92c:       e92dd800        push    {fp, ip, lr, pc}
> c020d930:       e24cb004        sub     fp, ip, #4
> c020d934:       e30e0360        movw    r0, #58208      ; 0xe360
> c020d938:       e34c0021        movt    r0, #49185      ; 0xc021
> c020d93c:       ebfe00c9        bl      c018dc68 <printk>
> c020d940:       e3000090        movw    r0, #144        ; 0x90
> c020d944:       e34c0022        movt    r0, #49186      ; 0xc022
> c020d948:       ebfe00c6        bl      c018dc68 <printk>
> c020d94c:       e89da800        ldm     sp, {fp, sp, pc}
> 
> Did I miss something in init.h?
> Or should it be done like above to reclaim string literals?

No, you didn't miss anything.

One proposal:

https://lkml.org/lkml/2014/8/21/255
> Regards.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-03-25 18:01   ` Joe Perches
@ 2015-03-26 12:40     ` Mason
  -1 siblings, 0 replies; 30+ messages in thread
From: Mason @ 2015-03-26 12:40 UTC (permalink / raw)
  To: Joe Perches; +Cc: Linux ARM, LKML, Ingo Molnar, Mathias Krause

On 25/03/2015 19:01, Joe Perches wrote:

> On Wed, 2015-03-25 at 18:56 +0100, Mason wrote:
>
>> AFAIU, functions only used at system init are tagged __init to have
>> the linker store them in a separate .init.text section, so memory can
>> be reclaimed once initialization is complete. Is that correct?
>>
>> The corresponding tag for data is __initdata (section .init.data)
>>
>> I started wondering if the string literals used in an __init functions
>> were automatically marked __initdata.
>>
>> Looking at the objdump output, I see that the string literals are,
>> in fact, stored in the .rodata section. I suppose that .rodata is NOT
>> reclaimed after init?
>>
>> This way seems to work:
>>
>> static       char XyZa[] __initdata  = KERN_ALERT "foo";
>> static const char XyZb[] __initconst = KERN_ALERT "bar";
>> void __init XyZc(void) { printk(XyZa); printk(XyZb); }
>>
>> $ arm-linux-gnueabihf-objdump -xd arch/arm/mach-tangox/time.o | grep XyZ
>> 00000000 l     O .init.data	00000006 XyZa
>> 00000000 l     O .init.rodata	00000006 XyZb
>> 00000000 g     F .init.text	00000028 XyZc
>> 00000000 <XyZc>:
>>
>> $ arm-linux-gnueabihf-objdump -xd vmlinux | grep XyZ
>> c021e360 l     O .init.data	00000006 XyZa
>> c0220090 l     O .init.data	00000006 XyZb
>> c020d928 g     F .init.text	00000028 XyZc
>> c020d928 <XyZc>:
>>
>> c020d928 <XyZc>:
>> c020d928:       e1a0c00d        mov     ip, sp
>> c020d92c:       e92dd800        push    {fp, ip, lr, pc}
>> c020d930:       e24cb004        sub     fp, ip, #4
>> c020d934:       e30e0360        movw    r0, #58208      ; 0xe360
>> c020d938:       e34c0021        movt    r0, #49185      ; 0xc021
>> c020d93c:       ebfe00c9        bl      c018dc68 <printk>
>> c020d940:       e3000090        movw    r0, #144        ; 0x90
>> c020d944:       e34c0022        movt    r0, #49186      ; 0xc022
>> c020d948:       ebfe00c6        bl      c018dc68 <printk>
>> c020d94c:       e89da800        ldm     sp, {fp, sp, pc}
>>
>> Did I miss something in init.h?
>> Or should it be done like above to reclaim string literals?
>
> No, you didn't miss anything.
>
> One proposal:
>
> https://lkml.org/lkml/2014/8/21/255

Thanks for the link!

Here's the equivalent gmane link for my own reference:
http://thread.gmane.org/gmane.linux.kernel/1771969

Basically, if I understand correctly, Ingo NAKed the patch, saying
this should be done automatically by the toolchain. That would make
for an interesting side-project...

For the record, I wrote a trivial wrapper for my limited use-case.

#define printk_init(format, ...) do { \
   static char fmt[] __initdata = format; printk(fmt, ## __VA_ARGS__); \
} while(0)

(I dislike the "statement-in-expression" extension, because vim thinks
there's a syntax error, and flashes a bright red block.)

https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html
https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html

Regards.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-03-26 12:40     ` Mason
  0 siblings, 0 replies; 30+ messages in thread
From: Mason @ 2015-03-26 12:40 UTC (permalink / raw)
  To: linux-arm-kernel

On 25/03/2015 19:01, Joe Perches wrote:

> On Wed, 2015-03-25 at 18:56 +0100, Mason wrote:
>
>> AFAIU, functions only used at system init are tagged __init to have
>> the linker store them in a separate .init.text section, so memory can
>> be reclaimed once initialization is complete. Is that correct?
>>
>> The corresponding tag for data is __initdata (section .init.data)
>>
>> I started wondering if the string literals used in an __init functions
>> were automatically marked __initdata.
>>
>> Looking at the objdump output, I see that the string literals are,
>> in fact, stored in the .rodata section. I suppose that .rodata is NOT
>> reclaimed after init?
>>
>> This way seems to work:
>>
>> static       char XyZa[] __initdata  = KERN_ALERT "foo";
>> static const char XyZb[] __initconst = KERN_ALERT "bar";
>> void __init XyZc(void) { printk(XyZa); printk(XyZb); }
>>
>> $ arm-linux-gnueabihf-objdump -xd arch/arm/mach-tangox/time.o | grep XyZ
>> 00000000 l     O .init.data	00000006 XyZa
>> 00000000 l     O .init.rodata	00000006 XyZb
>> 00000000 g     F .init.text	00000028 XyZc
>> 00000000 <XyZc>:
>>
>> $ arm-linux-gnueabihf-objdump -xd vmlinux | grep XyZ
>> c021e360 l     O .init.data	00000006 XyZa
>> c0220090 l     O .init.data	00000006 XyZb
>> c020d928 g     F .init.text	00000028 XyZc
>> c020d928 <XyZc>:
>>
>> c020d928 <XyZc>:
>> c020d928:       e1a0c00d        mov     ip, sp
>> c020d92c:       e92dd800        push    {fp, ip, lr, pc}
>> c020d930:       e24cb004        sub     fp, ip, #4
>> c020d934:       e30e0360        movw    r0, #58208      ; 0xe360
>> c020d938:       e34c0021        movt    r0, #49185      ; 0xc021
>> c020d93c:       ebfe00c9        bl      c018dc68 <printk>
>> c020d940:       e3000090        movw    r0, #144        ; 0x90
>> c020d944:       e34c0022        movt    r0, #49186      ; 0xc022
>> c020d948:       ebfe00c6        bl      c018dc68 <printk>
>> c020d94c:       e89da800        ldm     sp, {fp, sp, pc}
>>
>> Did I miss something in init.h?
>> Or should it be done like above to reclaim string literals?
>
> No, you didn't miss anything.
>
> One proposal:
>
> https://lkml.org/lkml/2014/8/21/255

Thanks for the link!

Here's the equivalent gmane link for my own reference:
http://thread.gmane.org/gmane.linux.kernel/1771969

Basically, if I understand correctly, Ingo NAKed the patch, saying
this should be done automatically by the toolchain. That would make
for an interesting side-project...

For the record, I wrote a trivial wrapper for my limited use-case.

#define printk_init(format, ...) do { \
   static char fmt[] __initdata = format; printk(fmt, ## __VA_ARGS__); \
} while(0)

(I dislike the "statement-in-expression" extension, because vim thinks
there's a syntax error, and flashes a bright red block.)

https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html
https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html

Regards.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-03-26 12:40     ` Mason
@ 2015-03-26 16:13       ` Joe Perches
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Perches @ 2015-03-26 16:13 UTC (permalink / raw)
  To: Mason; +Cc: Linux ARM, LKML, Ingo Molnar, Mathias Krause

On Thu, 2015-03-26 at 13:40 +0100, Mason wrote:
> On 25/03/2015 19:01, Joe Perches wrote:
> > On Wed, 2015-03-25 at 18:56 +0100, Mason wrote:
> >
> >> AFAIU, functions only used at system init are tagged __init to have
> >> the linker store them in a separate .init.text section, so memory can
> >> be reclaimed once initialization is complete. Is that correct?
> >>
> >> The corresponding tag for data is __initdata (section .init.data)
> >>
> >> I started wondering if the string literals used in an __init functions
> >> were automatically marked __initdata.
> >>
> >> Looking at the objdump output, I see that the string literals are,
> >> in fact, stored in the .rodata section. I suppose that .rodata is NOT
> >> reclaimed after init?
> >>
> >> This way seems to work:
> >>
> >> static       char XyZa[] __initdata  = KERN_ALERT "foo";
> >> static const char XyZb[] __initconst = KERN_ALERT "bar";
> >> void __init XyZc(void) { printk(XyZa); printk(XyZb); }
> >>
> >> $ arm-linux-gnueabihf-objdump -xd arch/arm/mach-tangox/time.o | grep XyZ
> >> 00000000 l     O .init.data	00000006 XyZa
> >> 00000000 l     O .init.rodata	00000006 XyZb
> >> 00000000 g     F .init.text	00000028 XyZc
> >> 00000000 <XyZc>:
> >>
> >> $ arm-linux-gnueabihf-objdump -xd vmlinux | grep XyZ
> >> c021e360 l     O .init.data	00000006 XyZa
> >> c0220090 l     O .init.data	00000006 XyZb
> >> c020d928 g     F .init.text	00000028 XyZc
> >> c020d928 <XyZc>:
> >>
> >> c020d928 <XyZc>:
> >> c020d928:       e1a0c00d        mov     ip, sp
> >> c020d92c:       e92dd800        push    {fp, ip, lr, pc}
> >> c020d930:       e24cb004        sub     fp, ip, #4
> >> c020d934:       e30e0360        movw    r0, #58208      ; 0xe360
> >> c020d938:       e34c0021        movt    r0, #49185      ; 0xc021
> >> c020d93c:       ebfe00c9        bl      c018dc68 <printk>
> >> c020d940:       e3000090        movw    r0, #144        ; 0x90
> >> c020d944:       e34c0022        movt    r0, #49186      ; 0xc022
> >> c020d948:       ebfe00c6        bl      c018dc68 <printk>
> >> c020d94c:       e89da800        ldm     sp, {fp, sp, pc}
> >>
> >> Did I miss something in init.h?
> >> Or should it be done like above to reclaim string literals?
> >
> > No, you didn't miss anything.
> >
> > One proposal:
> >
> > https://lkml.org/lkml/2014/8/21/255
> 
> Thanks for the link!
> 
> Here's the equivalent gmane link for my own reference:
> http://thread.gmane.org/gmane.linux.kernel/1771969
> 
> Basically, if I understand correctly, Ingo NAKed the patch, saying
> this should be done automatically by the toolchain. That would make
> for an interesting side-project...

True.  It's probably not feasible though.

Tracking string deduplication/reuse would be pretty difficult.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-03-26 16:13       ` Joe Perches
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Perches @ 2015-03-26 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2015-03-26 at 13:40 +0100, Mason wrote:
> On 25/03/2015 19:01, Joe Perches wrote:
> > On Wed, 2015-03-25 at 18:56 +0100, Mason wrote:
> >
> >> AFAIU, functions only used at system init are tagged __init to have
> >> the linker store them in a separate .init.text section, so memory can
> >> be reclaimed once initialization is complete. Is that correct?
> >>
> >> The corresponding tag for data is __initdata (section .init.data)
> >>
> >> I started wondering if the string literals used in an __init functions
> >> were automatically marked __initdata.
> >>
> >> Looking at the objdump output, I see that the string literals are,
> >> in fact, stored in the .rodata section. I suppose that .rodata is NOT
> >> reclaimed after init?
> >>
> >> This way seems to work:
> >>
> >> static       char XyZa[] __initdata  = KERN_ALERT "foo";
> >> static const char XyZb[] __initconst = KERN_ALERT "bar";
> >> void __init XyZc(void) { printk(XyZa); printk(XyZb); }
> >>
> >> $ arm-linux-gnueabihf-objdump -xd arch/arm/mach-tangox/time.o | grep XyZ
> >> 00000000 l     O .init.data	00000006 XyZa
> >> 00000000 l     O .init.rodata	00000006 XyZb
> >> 00000000 g     F .init.text	00000028 XyZc
> >> 00000000 <XyZc>:
> >>
> >> $ arm-linux-gnueabihf-objdump -xd vmlinux | grep XyZ
> >> c021e360 l     O .init.data	00000006 XyZa
> >> c0220090 l     O .init.data	00000006 XyZb
> >> c020d928 g     F .init.text	00000028 XyZc
> >> c020d928 <XyZc>:
> >>
> >> c020d928 <XyZc>:
> >> c020d928:       e1a0c00d        mov     ip, sp
> >> c020d92c:       e92dd800        push    {fp, ip, lr, pc}
> >> c020d930:       e24cb004        sub     fp, ip, #4
> >> c020d934:       e30e0360        movw    r0, #58208      ; 0xe360
> >> c020d938:       e34c0021        movt    r0, #49185      ; 0xc021
> >> c020d93c:       ebfe00c9        bl      c018dc68 <printk>
> >> c020d940:       e3000090        movw    r0, #144        ; 0x90
> >> c020d944:       e34c0022        movt    r0, #49186      ; 0xc022
> >> c020d948:       ebfe00c6        bl      c018dc68 <printk>
> >> c020d94c:       e89da800        ldm     sp, {fp, sp, pc}
> >>
> >> Did I miss something in init.h?
> >> Or should it be done like above to reclaim string literals?
> >
> > No, you didn't miss anything.
> >
> > One proposal:
> >
> > https://lkml.org/lkml/2014/8/21/255
> 
> Thanks for the link!
> 
> Here's the equivalent gmane link for my own reference:
> http://thread.gmane.org/gmane.linux.kernel/1771969
> 
> Basically, if I understand correctly, Ingo NAKed the patch, saying
> this should be done automatically by the toolchain. That would make
> for an interesting side-project...

True.  It's probably not feasible though.

Tracking string deduplication/reuse would be pretty difficult.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-03-26 16:13       ` Joe Perches
@ 2015-03-26 16:37         ` Mathias Krause
  -1 siblings, 0 replies; 30+ messages in thread
From: Mathias Krause @ 2015-03-26 16:37 UTC (permalink / raw)
  To: Joe Perches; +Cc: Mason, Linux ARM, LKML, Ingo Molnar, Mathias Krause

On 26 March 2015 at 17:13, Joe Perches <joe@perches.com> wrote:
> On Thu, 2015-03-26 at 13:40 +0100, Mason wrote:
>> On 25/03/2015 19:01, Joe Perches wrote:
>> > On Wed, 2015-03-25 at 18:56 +0100, Mason wrote:
>> >
>> >> AFAIU, functions only used at system init are tagged __init to have
>> >> the linker store them in a separate .init.text section, so memory can
>> >> be reclaimed once initialization is complete. Is that correct?
>> >>
>> >> The corresponding tag for data is __initdata (section .init.data)
>> >>
>> >> I started wondering if the string literals used in an __init functions
>> >> were automatically marked __initdata.
>> >>
>> >> Looking at the objdump output, I see that the string literals are,
>> >> in fact, stored in the .rodata section. I suppose that .rodata is NOT
>> >> reclaimed after init?
>> >>
>> >> [...]
>> >>
>> >> Did I miss something in init.h?
>> >> Or should it be done like above to reclaim string literals?
>> >
>> > No, you didn't miss anything.
>> >
>> > One proposal:
>> >
>> > https://lkml.org/lkml/2014/8/21/255
>>
>> Thanks for the link!
>>
>> Here's the equivalent gmane link for my own reference:
>> http://thread.gmane.org/gmane.linux.kernel/1771969
>>
>> Basically, if I understand correctly, Ingo NAKed the patch, saying
>> this should be done automatically by the toolchain. That would make
>> for an interesting side-project...
>
> True.  It's probably not feasible though.
>
> Tracking string deduplication/reuse would be pretty difficult.

Yep, that's why I simply didn't attempt to write a "toolchain" to
automatically put strings into the appropriate section. String
annotation and deduplication is best done in the compiler. It already
does impressive tricks to limit the amount of actual strings ending up
in the binary. If one would try to write a compiler plugin to
automatically flag __init / __exit strings it would have to be an LTO
pass as only there one would have the complete view where the string
will end up. It's not as simple as blindly marking all strings used in
__init / __exit functions to end up in the corresponding .rodata
section because those strings may be passed to functions that want to
keep a pointer, e.g. as an object name. But not all functions do! So
only an LTO pass *may* see the whole usage of a possible __init /
__exit string. Therefore I'm still not convinced that solving the
problem in the toolchain is the right thing to do. It's *way* more
complicated and probably gets it wrong more often than not. Therefore
the straight simple approach of manually marking the strings is IMHO
the best solution. Unfortunately, not everyone shares this opinion :/

Mathias

^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-03-26 16:37         ` Mathias Krause
  0 siblings, 0 replies; 30+ messages in thread
From: Mathias Krause @ 2015-03-26 16:37 UTC (permalink / raw)
  To: linux-arm-kernel

On 26 March 2015 at 17:13, Joe Perches <joe@perches.com> wrote:
> On Thu, 2015-03-26 at 13:40 +0100, Mason wrote:
>> On 25/03/2015 19:01, Joe Perches wrote:
>> > On Wed, 2015-03-25 at 18:56 +0100, Mason wrote:
>> >
>> >> AFAIU, functions only used at system init are tagged __init to have
>> >> the linker store them in a separate .init.text section, so memory can
>> >> be reclaimed once initialization is complete. Is that correct?
>> >>
>> >> The corresponding tag for data is __initdata (section .init.data)
>> >>
>> >> I started wondering if the string literals used in an __init functions
>> >> were automatically marked __initdata.
>> >>
>> >> Looking at the objdump output, I see that the string literals are,
>> >> in fact, stored in the .rodata section. I suppose that .rodata is NOT
>> >> reclaimed after init?
>> >>
>> >> [...]
>> >>
>> >> Did I miss something in init.h?
>> >> Or should it be done like above to reclaim string literals?
>> >
>> > No, you didn't miss anything.
>> >
>> > One proposal:
>> >
>> > https://lkml.org/lkml/2014/8/21/255
>>
>> Thanks for the link!
>>
>> Here's the equivalent gmane link for my own reference:
>> http://thread.gmane.org/gmane.linux.kernel/1771969
>>
>> Basically, if I understand correctly, Ingo NAKed the patch, saying
>> this should be done automatically by the toolchain. That would make
>> for an interesting side-project...
>
> True.  It's probably not feasible though.
>
> Tracking string deduplication/reuse would be pretty difficult.

Yep, that's why I simply didn't attempt to write a "toolchain" to
automatically put strings into the appropriate section. String
annotation and deduplication is best done in the compiler. It already
does impressive tricks to limit the amount of actual strings ending up
in the binary. If one would try to write a compiler plugin to
automatically flag __init / __exit strings it would have to be an LTO
pass as only there one would have the complete view where the string
will end up. It's not as simple as blindly marking all strings used in
__init / __exit functions to end up in the corresponding .rodata
section because those strings may be passed to functions that want to
keep a pointer, e.g. as an object name. But not all functions do! So
only an LTO pass *may* see the whole usage of a possible __init /
__exit string. Therefore I'm still not convinced that solving the
problem in the toolchain is the right thing to do. It's *way* more
complicated and probably gets it wrong more often than not. Therefore
the straight simple approach of manually marking the strings is IMHO
the best solution. Unfortunately, not everyone shares this opinion :/

Mathias

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-03-26 16:37         ` Mathias Krause
@ 2015-03-26 17:53           ` Joe Perches
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Perches @ 2015-03-26 17:53 UTC (permalink / raw)
  To: Mathias Krause, Andrew Morton; +Cc: Mason, Linux ARM, LKML, Ingo Molnar

On Thu, 2015-03-26 at 17:37 +0100, Mathias Krause wrote:
> On 26 March 2015 at 17:13, Joe Perches <joe@perches.com> wrote:
> > On Thu, 2015-03-26 at 13:40 +0100, Mason wrote:
> >> On 25/03/2015 19:01, Joe Perches wrote:
> >> > On Wed, 2015-03-25 at 18:56 +0100, Mason wrote:
> >> >
> >> >> AFAIU, functions only used at system init are tagged __init to have
> >> >> the linker store them in a separate .init.text section, so memory can
> >> >> be reclaimed once initialization is complete. Is that correct?
> >> >>
> >> >> The corresponding tag for data is __initdata (section .init.data)
> >> >>
> >> >> I started wondering if the string literals used in an __init functions
> >> >> were automatically marked __initdata.
> >> >>
> >> >> Looking at the objdump output, I see that the string literals are,
> >> >> in fact, stored in the .rodata section. I suppose that .rodata is NOT
> >> >> reclaimed after init?
> >> >>
> >> >> [...]
> >> >>
> >> >> Did I miss something in init.h?
> >> >> Or should it be done like above to reclaim string literals?
> >> >
> >> > No, you didn't miss anything.
> >> >
> >> > One proposal:
> >> >
> >> > https://lkml.org/lkml/2014/8/21/255
> >>
> >> Thanks for the link!
> >>
> >> Here's the equivalent gmane link for my own reference:
> >> http://thread.gmane.org/gmane.linux.kernel/1771969
> >>
> >> Basically, if I understand correctly, Ingo NAKed the patch, saying
> >> this should be done automatically by the toolchain. That would make
> >> for an interesting side-project...
> >
> > True.  It's probably not feasible though.
> >
> > Tracking string deduplication/reuse would be pretty difficult.
> 
> Yep, that's why I simply didn't attempt to write a "toolchain" to
> automatically put strings into the appropriate section. String
> annotation and deduplication is best done in the compiler. It already
> does impressive tricks to limit the amount of actual strings ending up
> in the binary. If one would try to write a compiler plugin to
> automatically flag __init / __exit strings it would have to be an LTO
> pass as only there one would have the complete view where the string
> will end up. It's not as simple as blindly marking all strings used in
> __init / __exit functions to end up in the corresponding .rodata
> section because those strings may be passed to functions that want to
> keep a pointer, e.g. as an object name. But not all functions do! So
> only an LTO pass *may* see the whole usage of a possible __init /
> __exit string. Therefore I'm still not convinced that solving the
> problem in the toolchain is the right thing to do. It's *way* more
> complicated and probably gets it wrong more often than not. Therefore
> the straight simple approach of manually marking the strings is IMHO
> the best solution. Unfortunately, not everyone shares this opinion :/

At least a few do though.

The first 4 patches still apply and are useful in my opinion.

Maybe you could resend them as a new patch set and cc Andrew Morton.
(cc'd here too)



^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-03-26 17:53           ` Joe Perches
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Perches @ 2015-03-26 17:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2015-03-26 at 17:37 +0100, Mathias Krause wrote:
> On 26 March 2015 at 17:13, Joe Perches <joe@perches.com> wrote:
> > On Thu, 2015-03-26 at 13:40 +0100, Mason wrote:
> >> On 25/03/2015 19:01, Joe Perches wrote:
> >> > On Wed, 2015-03-25 at 18:56 +0100, Mason wrote:
> >> >
> >> >> AFAIU, functions only used at system init are tagged __init to have
> >> >> the linker store them in a separate .init.text section, so memory can
> >> >> be reclaimed once initialization is complete. Is that correct?
> >> >>
> >> >> The corresponding tag for data is __initdata (section .init.data)
> >> >>
> >> >> I started wondering if the string literals used in an __init functions
> >> >> were automatically marked __initdata.
> >> >>
> >> >> Looking at the objdump output, I see that the string literals are,
> >> >> in fact, stored in the .rodata section. I suppose that .rodata is NOT
> >> >> reclaimed after init?
> >> >>
> >> >> [...]
> >> >>
> >> >> Did I miss something in init.h?
> >> >> Or should it be done like above to reclaim string literals?
> >> >
> >> > No, you didn't miss anything.
> >> >
> >> > One proposal:
> >> >
> >> > https://lkml.org/lkml/2014/8/21/255
> >>
> >> Thanks for the link!
> >>
> >> Here's the equivalent gmane link for my own reference:
> >> http://thread.gmane.org/gmane.linux.kernel/1771969
> >>
> >> Basically, if I understand correctly, Ingo NAKed the patch, saying
> >> this should be done automatically by the toolchain. That would make
> >> for an interesting side-project...
> >
> > True.  It's probably not feasible though.
> >
> > Tracking string deduplication/reuse would be pretty difficult.
> 
> Yep, that's why I simply didn't attempt to write a "toolchain" to
> automatically put strings into the appropriate section. String
> annotation and deduplication is best done in the compiler. It already
> does impressive tricks to limit the amount of actual strings ending up
> in the binary. If one would try to write a compiler plugin to
> automatically flag __init / __exit strings it would have to be an LTO
> pass as only there one would have the complete view where the string
> will end up. It's not as simple as blindly marking all strings used in
> __init / __exit functions to end up in the corresponding .rodata
> section because those strings may be passed to functions that want to
> keep a pointer, e.g. as an object name. But not all functions do! So
> only an LTO pass *may* see the whole usage of a possible __init /
> __exit string. Therefore I'm still not convinced that solving the
> problem in the toolchain is the right thing to do. It's *way* more
> complicated and probably gets it wrong more often than not. Therefore
> the straight simple approach of manually marking the strings is IMHO
> the best solution. Unfortunately, not everyone shares this opinion :/

At least a few do though.

The first 4 patches still apply and are useful in my opinion.

Maybe you could resend them as a new patch set and cc Andrew Morton.
(cc'd here too)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-03-26 17:53           ` Joe Perches
@ 2015-03-26 20:49             ` Mathias Krause
  -1 siblings, 0 replies; 30+ messages in thread
From: Mathias Krause @ 2015-03-26 20:49 UTC (permalink / raw)
  To: Joe Perches, Andrew Morton; +Cc: Mason, Linux ARM, LKML, Ingo Molnar

On 26 March 2015 at 18:53, Joe Perches <joe@perches.com> wrote:
> On Thu, 2015-03-26 at 17:37 +0100, Mathias Krause wrote:
>> On 26 March 2015 at 17:13, Joe Perches <joe@perches.com> wrote:
>> > On Thu, 2015-03-26 at 13:40 +0100, Mason wrote:
>> >> On 25/03/2015 19:01, Joe Perches wrote:
>> >> > On Wed, 2015-03-25 at 18:56 +0100, Mason wrote:
>> >> >
>> >> >> I started wondering if the string literals used in an __init functions
>> >> >> were automatically marked __initdata.
>> >> >>
>> >> > One proposal:
>> >> >
>> >> > https://lkml.org/lkml/2014/8/21/255
>> >>
>> >> Basically, if I understand correctly, Ingo NAKed the patch, saying
>> >> this should be done automatically by the toolchain. That would make
>> >> for an interesting side-project...
>> >
>> > True.  It's probably not feasible though.
>> >
>> > Tracking string deduplication/reuse would be pretty difficult.
>>
>> [...] Therefore I'm still not convinced that solving the
>> problem in the toolchain is the right thing to do. It's *way* more
>> complicated and probably gets it wrong more often than not. Therefore
>> the straight simple approach of manually marking the strings is IMHO
>> the best solution. Unfortunately, not everyone shares this opinion :/
>
> At least a few do though.
>
> The first 4 patches still apply and are useful in my opinion.
>
> Maybe you could resend them as a new patch set and cc Andrew Morton.
> (cc'd here too)

Andrew briefly commented on v2 of the patch set so I added him to the
Cc list when sending v3, linked above. But he did say nothing so I
guess Ingo's dislike of the approach is still valid?

Andrew, what's your opinion on such a patch set? Do you too think it's
useful? Or do you share Ingo's fear about the additional maintenance
burden?

Thanks,
Mathias

^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-03-26 20:49             ` Mathias Krause
  0 siblings, 0 replies; 30+ messages in thread
From: Mathias Krause @ 2015-03-26 20:49 UTC (permalink / raw)
  To: linux-arm-kernel

On 26 March 2015 at 18:53, Joe Perches <joe@perches.com> wrote:
> On Thu, 2015-03-26 at 17:37 +0100, Mathias Krause wrote:
>> On 26 March 2015 at 17:13, Joe Perches <joe@perches.com> wrote:
>> > On Thu, 2015-03-26 at 13:40 +0100, Mason wrote:
>> >> On 25/03/2015 19:01, Joe Perches wrote:
>> >> > On Wed, 2015-03-25 at 18:56 +0100, Mason wrote:
>> >> >
>> >> >> I started wondering if the string literals used in an __init functions
>> >> >> were automatically marked __initdata.
>> >> >>
>> >> > One proposal:
>> >> >
>> >> > https://lkml.org/lkml/2014/8/21/255
>> >>
>> >> Basically, if I understand correctly, Ingo NAKed the patch, saying
>> >> this should be done automatically by the toolchain. That would make
>> >> for an interesting side-project...
>> >
>> > True.  It's probably not feasible though.
>> >
>> > Tracking string deduplication/reuse would be pretty difficult.
>>
>> [...] Therefore I'm still not convinced that solving the
>> problem in the toolchain is the right thing to do. It's *way* more
>> complicated and probably gets it wrong more often than not. Therefore
>> the straight simple approach of manually marking the strings is IMHO
>> the best solution. Unfortunately, not everyone shares this opinion :/
>
> At least a few do though.
>
> The first 4 patches still apply and are useful in my opinion.
>
> Maybe you could resend them as a new patch set and cc Andrew Morton.
> (cc'd here too)

Andrew briefly commented on v2 of the patch set so I added him to the
Cc list when sending v3, linked above. But he did say nothing so I
guess Ingo's dislike of the approach is still valid?

Andrew, what's your opinion on such a patch set? Do you too think it's
useful? Or do you share Ingo's fear about the additional maintenance
burden?

Thanks,
Mathias

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-03-26 20:49             ` Mathias Krause
@ 2015-03-26 21:40               ` Andrew Morton
  -1 siblings, 0 replies; 30+ messages in thread
From: Andrew Morton @ 2015-03-26 21:40 UTC (permalink / raw)
  To: Mathias Krause; +Cc: Joe Perches, Mason, Linux ARM, LKML, Ingo Molnar

On Thu, 26 Mar 2015 21:49:06 +0100 Mathias Krause <minipli@googlemail.com> wrote:

> Andrew, what's your opinion on such a patch set? Do you too think it's
> useful? Or do you share Ingo's fear about the additional maintenance
> burden?

I don't think the burden would be toooo high, although it will mess the
code up a bit.

The post-build checking for section reference mismatches will help,
although that seems to have got itself turned off (what happened
there?).

Did anyone look at writing a postprocessor for the .s files?  It
doesn't look like it will be too hard from an initial peek.

Did anyone ask the gcc developers?  I'd have thought that a function-wide

	__attribute__((__string_section__(foo))

wouldn't be a ton of work to implement.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-03-26 21:40               ` Andrew Morton
  0 siblings, 0 replies; 30+ messages in thread
From: Andrew Morton @ 2015-03-26 21:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 26 Mar 2015 21:49:06 +0100 Mathias Krause <minipli@googlemail.com> wrote:

> Andrew, what's your opinion on such a patch set? Do you too think it's
> useful? Or do you share Ingo's fear about the additional maintenance
> burden?

I don't think the burden would be toooo high, although it will mess the
code up a bit.

The post-build checking for section reference mismatches will help,
although that seems to have got itself turned off (what happened
there?).

Did anyone look at writing a postprocessor for the .s files?  It
doesn't look like it will be too hard from an initial peek.

Did anyone ask the gcc developers?  I'd have thought that a function-wide

	__attribute__((__string_section__(foo))

wouldn't be a ton of work to implement.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-03-26 21:40               ` Andrew Morton
@ 2015-03-26 21:58                 ` Joe Perches
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Perches @ 2015-03-26 21:58 UTC (permalink / raw)
  To: Andrew Morton, gcc; +Cc: Mathias Krause, Mason, Linux ARM, LKML, Ingo Molnar

(adding gcc@gcc.gnu.org)

On Thu, 2015-03-26 at 14:40 -0700, Andrew Morton wrote:
> On Thu, 26 Mar 2015 21:49:06 +0100 Mathias Krause <minipli@googlemail.com> wrote:
> 
> > Andrew, what's your opinion on such a patch set? Do you too think it's
> > useful? Or do you share Ingo's fear about the additional maintenance
> > burden?
> 
> I don't think the burden would be toooo high, although it will mess the
> code up a bit.

I think it's overall a pretty low cost one-time pass
that Mathias has nearly completely automated.

Even if a future version of gcc implements string
constants in specific sections, the code isn't
difficult to understand or maintain for older versions.

> The post-build checking for section reference mismatches will help,
> although that seems to have got itself turned off (what happened
> there?).

I think the modprobe message works well.
What do you think missing?

> Did anyone ask the gcc developers?

Not to my knowledge.

> I'd have thought that a function-wide
> 	__attribute__((__string_section__(foo))
> wouldn't be a ton of work to implement.

Maybe not.

Could some future version of gcc move string constants
in a function to a specific section marked in a manner
similar to what Andrew described above?



^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-03-26 21:58                 ` Joe Perches
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Perches @ 2015-03-26 21:58 UTC (permalink / raw)
  To: linux-arm-kernel

(adding gcc at gcc.gnu.org)

On Thu, 2015-03-26 at 14:40 -0700, Andrew Morton wrote:
> On Thu, 26 Mar 2015 21:49:06 +0100 Mathias Krause <minipli@googlemail.com> wrote:
> 
> > Andrew, what's your opinion on such a patch set? Do you too think it's
> > useful? Or do you share Ingo's fear about the additional maintenance
> > burden?
> 
> I don't think the burden would be toooo high, although it will mess the
> code up a bit.

I think it's overall a pretty low cost one-time pass
that Mathias has nearly completely automated.

Even if a future version of gcc implements string
constants in specific sections, the code isn't
difficult to understand or maintain for older versions.

> The post-build checking for section reference mismatches will help,
> although that seems to have got itself turned off (what happened
> there?).

I think the modprobe message works well.
What do you think missing?

> Did anyone ask the gcc developers?

Not to my knowledge.

> I'd have thought that a function-wide
> 	__attribute__((__string_section__(foo))
> wouldn't be a ton of work to implement.

Maybe not.

Could some future version of gcc move string constants
in a function to a specific section marked in a manner
similar to what Andrew described above?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-03-26 21:58                 ` Joe Perches
@ 2015-03-26 22:15                   ` Andrew Morton
  -1 siblings, 0 replies; 30+ messages in thread
From: Andrew Morton @ 2015-03-26 22:15 UTC (permalink / raw)
  To: Joe Perches; +Cc: gcc, Mathias Krause, Mason, Linux ARM, LKML, Ingo Molnar

On Thu, 26 Mar 2015 14:58:40 -0700 Joe Perches <joe@perches.com> wrote:

> > I'd have thought that a function-wide
> > 	__attribute__((__string_section__(foo))
> > wouldn't be a ton of work to implement.
> 
> Maybe not.
> 
> Could some future version of gcc move string constants
> in a function to a specific section marked in a manner
> similar to what Andrew described above?

One thing which might complexicate this is

void foo()
{
	p("bar");
}

void  __attribute__((__string_section__(.init.rodata)) zot()
{
	p("bar");
}

It would be silly to create two instances of "bar".

Change it thusly:


#define __mark_str(str) \
	({ static const char var[] __attribute__((__section__(".init.string"))) = str; var; })

void foo()
{
	p("bar");
}

void zot()
{
	p(__mark_str("bar"));
}


and we indeed get two copies of "bar".

It would be nice not to do that, but I guess that losing this
optimization is a reasonable compromise.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-03-26 22:15                   ` Andrew Morton
  0 siblings, 0 replies; 30+ messages in thread
From: Andrew Morton @ 2015-03-26 22:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 26 Mar 2015 14:58:40 -0700 Joe Perches <joe@perches.com> wrote:

> > I'd have thought that a function-wide
> > 	__attribute__((__string_section__(foo))
> > wouldn't be a ton of work to implement.
> 
> Maybe not.
> 
> Could some future version of gcc move string constants
> in a function to a specific section marked in a manner
> similar to what Andrew described above?

One thing which might complexicate this is

void foo()
{
	p("bar");
}

void  __attribute__((__string_section__(.init.rodata)) zot()
{
	p("bar");
}

It would be silly to create two instances of "bar".

Change it thusly:


#define __mark_str(str) \
	({ static const char var[] __attribute__((__section__(".init.string"))) = str; var; })

void foo()
{
	p("bar");
}

void zot()
{
	p(__mark_str("bar"));
}


and we indeed get two copies of "bar".

It would be nice not to do that, but I guess that losing this
optimization is a reasonable compromise.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-03-26 21:40               ` Andrew Morton
@ 2015-03-27  7:05                 ` Mathias Krause
  -1 siblings, 0 replies; 30+ messages in thread
From: Mathias Krause @ 2015-03-27  7:05 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Joe Perches, Mason, Linux ARM, LKML, Ingo Molnar

On 26 March 2015 at 22:40, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Thu, 26 Mar 2015 21:49:06 +0100 Mathias Krause <minipli@googlemail.com> wrote:
>
>> Andrew, what's your opinion on such a patch set? Do you too think it's
>> useful? Or do you share Ingo's fear about the additional maintenance
>> burden?
>
> I don't think the burden would be toooo high, although it will mess the
> code up a bit.
>
> The post-build checking for section reference mismatches will help,
> although that seems to have got itself turned off (what happened
> there?).

Seem to be working fine here.

This is make M=lib/, building the test module:
  CC [M]  lib/test_module.o
  Building modules, stage 2.
  MODPOST 1 modules
  LD [M]  lib/test_module.ko

This is the same module with two pi_info() calls in a non-__init
function, therefore generating section mismatches:
  CC [M]  lib/test_module.o
  Building modules, stage 2.
  MODPOST 1 modules
WARNING: modpost: Found 2 section mismatch(es).
To see full details build your kernel with:
'make CONFIG_DEBUG_SECTION_MISMATCH=y'
  LD [M]  lib/test_module.ko

> Did anyone look at writing a postprocessor for the .s files?  It
> doesn't look like it will be too hard from an initial peek.
>
> Did anyone ask the gcc developers?  I'd have thought that a function-wide
>
>         __attribute__((__string_section__(foo))
>
> wouldn't be a ton of work to implement.

The point is you cannot blindly mark all strings referenced from
__init / __exit code to end up in a matching string section because
strings in this code might have to live longer when passed to
functions keeping a pointer on them. For example, the name passed to
class_create() won't be copied. If that one would go into the
.init.rodata section automatically, we would have dangling pointers
after the .init.* memory got freed.
To know if it's safe to automatically put a string into an .init /
.exit section one needs to see the whole code. That's why I'm
reasoning it needs to be an LTO pass, not a .s post-processor or
function wide section attribute. Or, as in my approach, a source level
annotation.

Thanks,
Mathias

^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-03-27  7:05                 ` Mathias Krause
  0 siblings, 0 replies; 30+ messages in thread
From: Mathias Krause @ 2015-03-27  7:05 UTC (permalink / raw)
  To: linux-arm-kernel

On 26 March 2015 at 22:40, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Thu, 26 Mar 2015 21:49:06 +0100 Mathias Krause <minipli@googlemail.com> wrote:
>
>> Andrew, what's your opinion on such a patch set? Do you too think it's
>> useful? Or do you share Ingo's fear about the additional maintenance
>> burden?
>
> I don't think the burden would be toooo high, although it will mess the
> code up a bit.
>
> The post-build checking for section reference mismatches will help,
> although that seems to have got itself turned off (what happened
> there?).

Seem to be working fine here.

This is make M=lib/, building the test module:
  CC [M]  lib/test_module.o
  Building modules, stage 2.
  MODPOST 1 modules
  LD [M]  lib/test_module.ko

This is the same module with two pi_info() calls in a non-__init
function, therefore generating section mismatches:
  CC [M]  lib/test_module.o
  Building modules, stage 2.
  MODPOST 1 modules
WARNING: modpost: Found 2 section mismatch(es).
To see full details build your kernel with:
'make CONFIG_DEBUG_SECTION_MISMATCH=y'
  LD [M]  lib/test_module.ko

> Did anyone look at writing a postprocessor for the .s files?  It
> doesn't look like it will be too hard from an initial peek.
>
> Did anyone ask the gcc developers?  I'd have thought that a function-wide
>
>         __attribute__((__string_section__(foo))
>
> wouldn't be a ton of work to implement.

The point is you cannot blindly mark all strings referenced from
__init / __exit code to end up in a matching string section because
strings in this code might have to live longer when passed to
functions keeping a pointer on them. For example, the name passed to
class_create() won't be copied. If that one would go into the
.init.rodata section automatically, we would have dangling pointers
after the .init.* memory got freed.
To know if it's safe to automatically put a string into an .init /
.exit section one needs to see the whole code. That's why I'm
reasoning it needs to be an LTO pass, not a .s post-processor or
function wide section attribute. Or, as in my approach, a source level
annotation.

Thanks,
Mathias

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-03-26 22:15                   ` Andrew Morton
@ 2015-03-27  7:16                     ` Mathias Krause
  -1 siblings, 0 replies; 30+ messages in thread
From: Mathias Krause @ 2015-03-27  7:16 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Joe Perches, gcc, Mason, Linux ARM, LKML, Ingo Molnar

On 26 March 2015 at 23:15, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Thu, 26 Mar 2015 14:58:40 -0700 Joe Perches <joe@perches.com> wrote:
>
>> > I'd have thought that a function-wide
>> >     __attribute__((__string_section__(foo))
>> > wouldn't be a ton of work to implement.
>>
>> Maybe not.
>>
>> Could some future version of gcc move string constants
>> in a function to a specific section marked in a manner
>> similar to what Andrew described above?
>
> One thing which might complexicate this is
>
> void foo()
> {
>         p("bar");
> }
>
> void  __attribute__((__string_section__(.init.rodata)) zot()
> {
>         p("bar");
> }
>
> It would be silly to create two instances of "bar".

No it wouldn't, because the "bar" in foo() has a different life time
than the "bar" in zot(). The compiler simply cannot know what the life
time of a section will be, so can only merge string within the same
section.

Beside that, your example is wrong and would generate a section
mismatch because zot() is not marked with __init so it's a function
that can be called after init. If one does, however, that p() will get
passed a dangling pointer. That's what modpost will complain about.

>
> Change it thusly:
>
>
> #define __mark_str(str) \
>         ({ static const char var[] __attribute__((__section__(".init.string"))) = str; var; })
>
> void foo()
> {
>         p("bar");
> }
>
> void zot()
> {
>         p(__mark_str("bar"));
> }
>
>
> and we indeed get two copies of "bar".
>
> It would be nice not to do that, but I guess that losing this
> optimization is a reasonable compromise.

As I said, it's legit to get two copies here, as the compiler simply
cannot deduce any semantics from the section name. However, this is a
synthetic example as in real life use-cases you'll only seldom have
the same format string be used twice. So this is, at best, a minor
issue.


Thanks,
Mathias

^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-03-27  7:16                     ` Mathias Krause
  0 siblings, 0 replies; 30+ messages in thread
From: Mathias Krause @ 2015-03-27  7:16 UTC (permalink / raw)
  To: linux-arm-kernel

On 26 March 2015 at 23:15, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Thu, 26 Mar 2015 14:58:40 -0700 Joe Perches <joe@perches.com> wrote:
>
>> > I'd have thought that a function-wide
>> >     __attribute__((__string_section__(foo))
>> > wouldn't be a ton of work to implement.
>>
>> Maybe not.
>>
>> Could some future version of gcc move string constants
>> in a function to a specific section marked in a manner
>> similar to what Andrew described above?
>
> One thing which might complexicate this is
>
> void foo()
> {
>         p("bar");
> }
>
> void  __attribute__((__string_section__(.init.rodata)) zot()
> {
>         p("bar");
> }
>
> It would be silly to create two instances of "bar".

No it wouldn't, because the "bar" in foo() has a different life time
than the "bar" in zot(). The compiler simply cannot know what the life
time of a section will be, so can only merge string within the same
section.

Beside that, your example is wrong and would generate a section
mismatch because zot() is not marked with __init so it's a function
that can be called after init. If one does, however, that p() will get
passed a dangling pointer. That's what modpost will complain about.

>
> Change it thusly:
>
>
> #define __mark_str(str) \
>         ({ static const char var[] __attribute__((__section__(".init.string"))) = str; var; })
>
> void foo()
> {
>         p("bar");
> }
>
> void zot()
> {
>         p(__mark_str("bar"));
> }
>
>
> and we indeed get two copies of "bar".
>
> It would be nice not to do that, but I guess that losing this
> optimization is a reasonable compromise.

As I said, it's legit to get two copies here, as the compiler simply
cannot deduce any semantics from the section name. However, this is a
synthetic example as in real life use-cases you'll only seldom have
the same format string be used twice. So this is, at best, a minor
issue.


Thanks,
Mathias

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-03-27  7:05                 ` Mathias Krause
@ 2015-03-27  7:32                   ` Joe Perches
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Perches @ 2015-03-27  7:32 UTC (permalink / raw)
  To: Mathias Krause; +Cc: Andrew Morton, Mason, Linux ARM, LKML, Ingo Molnar

On Fri, 2015-03-27 at 08:05 +0100, Mathias Krause wrote:
> On 26 March 2015 at 22:40, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Thu, 26 Mar 2015 21:49:06 +0100 Mathias Krause <minipli@googlemail.com> wrote:
> >
> >> Andrew, what's your opinion on such a patch set? Do you too think it's
> >> useful? Or do you share Ingo's fear about the additional maintenance
> >> burden?
> >
> > I don't think the burden would be toooo high, although it will mess the
> > code up a bit.
[]
> > Did anyone ask the gcc developers?  I'd have thought that a function-wide
> >         __attribute__((__string_section__(foo))
> > wouldn't be a ton of work to implement.
> 
> The point is you cannot blindly mark all strings referenced from
> __init / __exit code to end up in a matching string section because
> strings in this code might have to live longer when passed to
> functions keeping a pointer on them.

This is the primary reason I support the pi_<level>/pe_<level>/
printk_init/printk_exit markings.  It's simple and not a large
burden to the coder/reader.  If a few formats aren't marked
appropriately, it's not generally a significant loss, but it
is easily correctable by scripts.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-03-27  7:32                   ` Joe Perches
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Perches @ 2015-03-27  7:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2015-03-27 at 08:05 +0100, Mathias Krause wrote:
> On 26 March 2015 at 22:40, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Thu, 26 Mar 2015 21:49:06 +0100 Mathias Krause <minipli@googlemail.com> wrote:
> >
> >> Andrew, what's your opinion on such a patch set? Do you too think it's
> >> useful? Or do you share Ingo's fear about the additional maintenance
> >> burden?
> >
> > I don't think the burden would be toooo high, although it will mess the
> > code up a bit.
[]
> > Did anyone ask the gcc developers?  I'd have thought that a function-wide
> >         __attribute__((__string_section__(foo))
> > wouldn't be a ton of work to implement.
> 
> The point is you cannot blindly mark all strings referenced from
> __init / __exit code to end up in a matching string section because
> strings in this code might have to live longer when passed to
> functions keeping a pointer on them.

This is the primary reason I support the pi_<level>/pe_<level>/
printk_init/printk_exit markings.  It's simple and not a large
burden to the coder/reader.  If a few formats aren't marked
appropriately, it's not generally a significant loss, but it
is easily correctable by scripts.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-03-26 21:58                 ` Joe Perches
@ 2015-04-02 16:00                   ` Joseph Myers
  -1 siblings, 0 replies; 30+ messages in thread
From: Joseph Myers @ 2015-04-02 16:00 UTC (permalink / raw)
  To: Joe Perches
  Cc: Andrew Morton, gcc, Mathias Krause, Mason, Linux ARM, LKML, Ingo Molnar

On Thu, 26 Mar 2015, Joe Perches wrote:

> > I'd have thought that a function-wide
> > 	__attribute__((__string_section__(foo))
> > wouldn't be a ton of work to implement.
> 
> Maybe not.
> 
> Could some future version of gcc move string constants
> in a function to a specific section marked in a manner
> similar to what Andrew described above?

Putting string constants in a special section is an old project idea at 
<http://gcc.gnu.org/projects/optimize.html#putting_constants_in_special_sections>.  
I still think support for that would make sense.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-04-02 16:00                   ` Joseph Myers
  0 siblings, 0 replies; 30+ messages in thread
From: Joseph Myers @ 2015-04-02 16:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 26 Mar 2015, Joe Perches wrote:

> > I'd have thought that a function-wide
> > 	__attribute__((__string_section__(foo))
> > wouldn't be a ton of work to implement.
> 
> Maybe not.
> 
> Could some future version of gcc move string constants
> in a function to a specific section marked in a manner
> similar to what Andrew described above?

Putting string constants in a special section is an old project idea at 
<http://gcc.gnu.org/projects/optimize.html#putting_constants_in_special_sections>.  
I still think support for that would make sense.

-- 
Joseph S. Myers
joseph at codesourcery.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: String literals in __init functions
  2015-04-02 16:00                   ` Joseph Myers
@ 2015-04-02 16:23                     ` Joe Perches
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Perches @ 2015-04-02 16:23 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Andrew Morton, gcc, Mathias Krause, Mason, Linux ARM, LKML, Ingo Molnar

On Thu, 2015-04-02 at 16:00 +0000, Joseph Myers wrote:
> On Thu, 26 Mar 2015, Joe Perches wrote:
> > > I'd have thought that a function-wide
> > > 	__attribute__((__string_section__(foo))
> > > wouldn't be a ton of work to implement.
> > 
> > Maybe not.
> > 
> > Could some future version of gcc move string constants
> > in a function to a specific section marked in a manner
> > similar to what Andrew described above?
> 
> Putting string constants in a special section is an old project idea at 
> <http://gcc.gnu.org/projects/optimize.html#putting_constants_in_special_sections>.  

That's news to me.  Thanks for the link.

> I still think support for that would make sense.

That's good to know.

Do you know of a mechanism in place to prioritize
development for these optimization projects or is
this a placeholder for unscheduled or unresourced
TODOs?


^ permalink raw reply	[flat|nested] 30+ messages in thread

* String literals in __init functions
@ 2015-04-02 16:23                     ` Joe Perches
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Perches @ 2015-04-02 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2015-04-02 at 16:00 +0000, Joseph Myers wrote:
> On Thu, 26 Mar 2015, Joe Perches wrote:
> > > I'd have thought that a function-wide
> > > 	__attribute__((__string_section__(foo))
> > > wouldn't be a ton of work to implement.
> > 
> > Maybe not.
> > 
> > Could some future version of gcc move string constants
> > in a function to a specific section marked in a manner
> > similar to what Andrew described above?
> 
> Putting string constants in a special section is an old project idea at 
> <http://gcc.gnu.org/projects/optimize.html#putting_constants_in_special_sections>.  

That's news to me.  Thanks for the link.

> I still think support for that would make sense.

That's good to know.

Do you know of a mechanism in place to prioritize
development for these optimization projects or is
this a placeholder for unscheduled or unresourced
TODOs?

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2015-04-02 16:23 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-25 17:56 String literals in __init functions Mason
2015-03-25 17:56 ` Mason
2015-03-25 18:01 ` Joe Perches
2015-03-25 18:01   ` Joe Perches
2015-03-26 12:40   ` Mason
2015-03-26 12:40     ` Mason
2015-03-26 16:13     ` Joe Perches
2015-03-26 16:13       ` Joe Perches
2015-03-26 16:37       ` Mathias Krause
2015-03-26 16:37         ` Mathias Krause
2015-03-26 17:53         ` Joe Perches
2015-03-26 17:53           ` Joe Perches
2015-03-26 20:49           ` Mathias Krause
2015-03-26 20:49             ` Mathias Krause
2015-03-26 21:40             ` Andrew Morton
2015-03-26 21:40               ` Andrew Morton
2015-03-26 21:58               ` Joe Perches
2015-03-26 21:58                 ` Joe Perches
2015-03-26 22:15                 ` Andrew Morton
2015-03-26 22:15                   ` Andrew Morton
2015-03-27  7:16                   ` Mathias Krause
2015-03-27  7:16                     ` Mathias Krause
2015-04-02 16:00                 ` Joseph Myers
2015-04-02 16:00                   ` Joseph Myers
2015-04-02 16:23                   ` Joe Perches
2015-04-02 16:23                     ` Joe Perches
2015-03-27  7:05               ` Mathias Krause
2015-03-27  7:05                 ` Mathias Krause
2015-03-27  7:32                 ` Joe Perches
2015-03-27  7:32                   ` Joe Perches

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.