* [PATCH] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags @ 2019-11-20 14:51 Jessica Yu 2019-11-20 15:02 ` Jessica Yu ` (2 more replies) 0 siblings, 3 replies; 23+ messages in thread From: Jessica Yu @ 2019-11-20 14:51 UTC (permalink / raw) To: linux-kernel Cc: Matthias Maennich, Masahiro Yamada, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman, Jessica Yu Commit c3a6cf19e695 ("export: avoid code duplication in include/linux/export.h") refactors export.h quite nicely, but introduces a slight increase in memory usage due to using the empty string "" instead of NULL to indicate that an exported symbol has no namespace. As mentioned in that commit, this meant an increase of 1 byte per exported symbol without a namespace. For example, if a kernel configuration has about 10k exported symbols, this would mean that the size of __ksymtab_strings would increase by roughly 10kB. We can alleviate this situation by utilizing the SHF_MERGE and SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker that the data in the section are null-terminated strings that can be merged to eliminate duplication. More specifically, from the binutils documentation - "for sections with both M and S, a string which is a suffix of a larger string is considered a duplicate. Thus "def" will be merged with "abcdef"; A reference to the first "def" will be changed to a reference to "abcdef"+3". Thus, all the empty strings would be merged as well as any strings that can be merged according to the cited method above. For example, "memset" and "__memset" would be merged to just "__memset" in __ksymtab_strings. As of v5.4-rc5, the following statistics were gathered with x86 defconfig with approximately 10.7k exported symbols. Size of __ksymtab_strings in vmlinux: ------------------------------------- v5.4-rc5: 213834 bytes v5.4-rc5 with commit c3a6cf19e695: 224455 bytes v5.4-rc5 with this patch: 205759 bytes So, we already see memory savings of ~8kB compared to vanilla -rc5 and savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. Unfortunately, as of this writing, strings will not get deduplicated for kernel modules, as ld does not do the deduplication for SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which kernel modules are. A patch for ld is currently being worked on to hopefully allow for string deduplication in relocatable files in the future. Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Jessica Yu <jeyu@kernel.org> --- include/asm-generic/export.h | 13 ++++++++++--- include/linux/export.h | 28 ++++++++++++++++++++++------ 2 files changed, 32 insertions(+), 9 deletions(-) diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h index fa577978fbbd..d0704f2602f4 100644 --- a/include/asm-generic/export.h +++ b/include/asm-generic/export.h @@ -26,9 +26,16 @@ .endm /* - * note on .section use: @progbits vs %progbits nastiness doesn't matter, - * since we immediately emit into those sections anyway. + * note on .section use: we specify @progbits vs %progbits since usage of + * "M" (SHF_MERGE) section flag requires it. */ + +#ifdef CONFIG_ARM +#define ARCH_PROGBITS %progbits +#else +#define ARCH_PROGBITS @progbits +#endif + .macro ___EXPORT_SYMBOL name,val,sec #ifdef CONFIG_MODULES .globl __ksymtab_\name @@ -37,7 +44,7 @@ __ksymtab_\name: __put \val, __kstrtab_\name .previous - .section __ksymtab_strings,"a" + .section __ksymtab_strings,"aMS",ARCH_PROGBITS,1 __kstrtab_\name: .asciz "\name" .previous diff --git a/include/linux/export.h b/include/linux/export.h index 201262793369..ab325a8e6bee 100644 --- a/include/linux/export.h +++ b/include/linux/export.h @@ -81,16 +81,32 @@ struct kernel_symbol { #else +#ifdef CONFIG_ARM +#define ARCH_PROGBITS "%progbits" +#else +#define ARCH_PROGBITS "@progbits" +#endif + +#define __KSTRTAB_ENTRY(sym) \ + asm(" .section \"__ksymtab_strings\",\"aMS\","ARCH_PROGBITS",1\n" \ + "__kstrtab_" #sym ": \n" \ + " .asciz \"" #sym "\" \n" \ + " .previous \n") + +#define __KSTRTAB_NS_ENTRY(sym, ns) \ + asm(" .section \"__ksymtab_strings\",\"aMS\","ARCH_PROGBITS",1\n" \ + "__kstrtabns_" #sym ": \n" \ + " .asciz " #ns " \n" \ + " .previous \n") + /* For every exported symbol, place a struct in the __ksymtab section */ #define ___EXPORT_SYMBOL(sym, sec, ns) \ extern typeof(sym) sym; \ + extern const char __kstrtab_##sym[]; \ + extern const char __kstrtabns_##sym[]; \ __CRC_SYMBOL(sym, sec); \ - static const char __kstrtab_##sym[] \ - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ - = #sym; \ - static const char __kstrtabns_##sym[] \ - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ - = ns; \ + __KSTRTAB_ENTRY(sym); \ + __KSTRTAB_NS_ENTRY(sym, ns); \ __KSYMTAB_ENTRY(sym, sec) #endif -- 2.16.4 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-20 14:51 [PATCH] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags Jessica Yu @ 2019-11-20 15:02 ` Jessica Yu 2019-11-21 10:51 ` Rasmus Villemoes 2019-11-25 15:42 ` [PATCH v2] " Jessica Yu 2 siblings, 0 replies; 23+ messages in thread From: Jessica Yu @ 2019-11-20 15:02 UTC (permalink / raw) To: linux-kernel Cc: Matthias Maennich, Masahiro Yamada, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman +++ Jessica Yu [20/11/19 15:51 +0100]: >Commit c3a6cf19e695 ("export: avoid code duplication in >include/linux/export.h") refactors export.h quite nicely, but introduces >a slight increase in memory usage due to using the empty string "" >instead of NULL to indicate that an exported symbol has no namespace. As >mentioned in that commit, this meant an increase of 1 byte per exported >symbol without a namespace. For example, if a kernel configuration has >about 10k exported symbols, this would mean that the size of >__ksymtab_strings would increase by roughly 10kB. > >We can alleviate this situation by utilizing the SHF_MERGE and >SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker >that the data in the section are null-terminated strings that can be >merged to eliminate duplication. More specifically, from the binutils >documentation - "for sections with both M and S, a string which is a >suffix of a larger string is considered a duplicate. Thus "def" will be >merged with "abcdef"; A reference to the first "def" will be changed to >a reference to "abcdef"+3". Thus, all the empty strings would be merged >as well as any strings that can be merged according to the cited method >above. For example, "memset" and "__memset" would be merged to just >"__memset" in __ksymtab_strings. > >As of v5.4-rc5, the following statistics were gathered with x86 >defconfig with approximately 10.7k exported symbols. > >Size of __ksymtab_strings in vmlinux: >------------------------------------- >v5.4-rc5: 213834 bytes >v5.4-rc5 with commit c3a6cf19e695: 224455 bytes >v5.4-rc5 with this patch: 205759 bytes > >So, we already see memory savings of ~8kB compared to vanilla -rc5 and >savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. > >Unfortunately, as of this writing, strings will not get deduplicated for >kernel modules, as ld does not do the deduplication for >SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which >kernel modules are. A patch for ld is currently being worked on to >hopefully allow for string deduplication in relocatable files in the >future. > >Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> >Signed-off-by: Jessica Yu <jeyu@kernel.org> >--- > include/asm-generic/export.h | 13 ++++++++++--- > include/linux/export.h | 28 ++++++++++++++++++++++------ > 2 files changed, 32 insertions(+), 9 deletions(-) > >diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h >index fa577978fbbd..d0704f2602f4 100644 >--- a/include/asm-generic/export.h >+++ b/include/asm-generic/export.h >@@ -26,9 +26,16 @@ > .endm > > /* >- * note on .section use: @progbits vs %progbits nastiness doesn't matter, >- * since we immediately emit into those sections anyway. >+ * note on .section use: we specify @progbits vs %progbits since usage of >+ * "M" (SHF_MERGE) section flag requires it. > */ >+ >+#ifdef CONFIG_ARM >+#define ARCH_PROGBITS %progbits >+#else >+#define ARCH_PROGBITS @progbits >+#endif >+ > .macro ___EXPORT_SYMBOL name,val,sec > #ifdef CONFIG_MODULES > .globl __ksymtab_\name >@@ -37,7 +44,7 @@ > __ksymtab_\name: > __put \val, __kstrtab_\name > .previous >- .section __ksymtab_strings,"a" >+ .section __ksymtab_strings,"aMS",ARCH_PROGBITS,1 > __kstrtab_\name: > .asciz "\name" > .previous >diff --git a/include/linux/export.h b/include/linux/export.h >index 201262793369..ab325a8e6bee 100644 >--- a/include/linux/export.h >+++ b/include/linux/export.h >@@ -81,16 +81,32 @@ struct kernel_symbol { > > #else > >+#ifdef CONFIG_ARM >+#define ARCH_PROGBITS "%progbits" >+#else >+#define ARCH_PROGBITS "@progbits" >+#endif Just a comment, I don't like the duplication of this block, but I wasn't sure where was the best place to have this define since asm-generic/export.h and linux/export.h don't share includes, and includes are generally kept to a minimum. If anyone has a better idea, please let me know. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-20 14:51 [PATCH] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags Jessica Yu 2019-11-20 15:02 ` Jessica Yu @ 2019-11-21 10:51 ` Rasmus Villemoes 2019-11-21 16:09 ` Jessica Yu 2019-11-25 15:42 ` [PATCH v2] " Jessica Yu 2 siblings, 1 reply; 23+ messages in thread From: Rasmus Villemoes @ 2019-11-21 10:51 UTC (permalink / raw) To: Jessica Yu, linux-kernel Cc: Matthias Maennich, Masahiro Yamada, Arnd Bergmann, Greg Kroah-Hartman On 20/11/2019 15.51, Jessica Yu wrote: > > We can alleviate this situation by utilizing the SHF_MERGE and > SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker > that the data in the section are null-terminated strings [pet peeve: nul, not null.] > As of v5.4-rc5, the following statistics were gathered with x86 > defconfig with approximately 10.7k exported symbols. > > Size of __ksymtab_strings in vmlinux: > ------------------------------------- > v5.4-rc5: 213834 bytes > v5.4-rc5 with commit c3a6cf19e695: 224455 bytes > v5.4-rc5 with this patch: 205759 bytes > > So, we already see memory savings of ~8kB compared to vanilla -rc5 and > savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. Yippee :) Thanks for picking up the ball and sending this. > /* > - * note on .section use: @progbits vs %progbits nastiness doesn't matter, > - * since we immediately emit into those sections anyway. > + * note on .section use: we specify @progbits vs %progbits since usage of > + * "M" (SHF_MERGE) section flag requires it. > */ > + > +#ifdef CONFIG_ARM > +#define ARCH_PROGBITS %progbits > +#else > +#define ARCH_PROGBITS @progbits > +#endif Did you figure out a way to determine if ARM is the only odd one? I was just going by gas' documentation which mentions ARM as an example, but doesn't really provide a way to know what each arch uses. I suppose 0day will tell us shortly. If we want to avoid arch-ifdefs in these headers (and that could become unwieldy if ARM is not the only one) I think one could add a asm-generic/progbits.h, add progbits.h to mandatory-y in include/asm-generic/Kbuild, and then just provide a progbits.h on ARM (and the other exceptions) - then I think the kbuild logic automatically makes "#include <asm/progbits.h>" pick up the right one. And the header could define ARCH_PROGBITS with or without the double quotes depending on __ASSEMBLY__. OTOH, adding a whole new header just for this may be overkill. Rasmus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-21 10:51 ` Rasmus Villemoes @ 2019-11-21 16:09 ` Jessica Yu 2019-11-21 16:53 ` Will Deacon 2019-11-22 11:44 ` Masahiro Yamada 0 siblings, 2 replies; 23+ messages in thread From: Jessica Yu @ 2019-11-21 16:09 UTC (permalink / raw) To: Rasmus Villemoes Cc: linux-kernel, Matthias Maennich, Masahiro Yamada, Arnd Bergmann, Greg Kroah-Hartman, Will Deacon, Catalin Marinas +++ Rasmus Villemoes [21/11/19 11:51 +0100]: >On 20/11/2019 15.51, Jessica Yu wrote: >> >> We can alleviate this situation by utilizing the SHF_MERGE and >> SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker >> that the data in the section are null-terminated strings > >[pet peeve: nul, not null.] Ah, right you are! >> As of v5.4-rc5, the following statistics were gathered with x86 >> defconfig with approximately 10.7k exported symbols. >> >> Size of __ksymtab_strings in vmlinux: >> ------------------------------------- >> v5.4-rc5: 213834 bytes >> v5.4-rc5 with commit c3a6cf19e695: 224455 bytes >> v5.4-rc5 with this patch: 205759 bytes >> >> So, we already see memory savings of ~8kB compared to vanilla -rc5 and >> savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. > >Yippee :) Thanks for picking up the ball and sending this. And thanks for the idea in the first place :-) >> /* >> - * note on .section use: @progbits vs %progbits nastiness doesn't matter, >> - * since we immediately emit into those sections anyway. >> + * note on .section use: we specify @progbits vs %progbits since usage of >> + * "M" (SHF_MERGE) section flag requires it. >> */ >> + >> +#ifdef CONFIG_ARM >> +#define ARCH_PROGBITS %progbits >> +#else >> +#define ARCH_PROGBITS @progbits >> +#endif > >Did you figure out a way to determine if ARM is the only odd one? I was >just going by gas' documentation which mentions ARM as an example, but >doesn't really provide a way to know what each arch uses. I suppose 0day >will tell us shortly. I *think* so. At least, I was going off of drivers/base/firmware_loader/builtin/Makefile and scripts/recordmcount.pl, which were the only other places that I found that reference the %progbits vs @progbits oddity. They only use %progbits in the case of CONFIG_ARM and @progbits for other arches. I wasn't sure about arm64, but I looked at the assembly files gcc produced and it looked like @progbits was used there. Added some arm64 people to CC since they would know :-) >If we want to avoid arch-ifdefs in these headers (and that could become >unwieldy if ARM is not the only one) I think one could add a >asm-generic/progbits.h, add progbits.h to mandatory-y in >include/asm-generic/Kbuild, and then just provide a progbits.h on ARM >(and the other exceptions) - then I think the kbuild logic automatically >makes "#include <asm/progbits.h>" pick up the right one. And the header >could define ARCH_PROGBITS with or without the double quotes depending >on __ASSEMBLY__. I think this would work, and it feels like the more correct solution especially if arm isn't the only one with the odd progbits char. It might be overkill if it's just arm that's affected though. I leave it to Masahiro to see what he prefers. Thanks! Jessica ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-21 16:09 ` Jessica Yu @ 2019-11-21 16:53 ` Will Deacon 2019-11-22 11:44 ` Masahiro Yamada 1 sibling, 0 replies; 23+ messages in thread From: Will Deacon @ 2019-11-21 16:53 UTC (permalink / raw) To: Jessica Yu Cc: Rasmus Villemoes, linux-kernel, Matthias Maennich, Masahiro Yamada, Arnd Bergmann, Greg Kroah-Hartman, Catalin Marinas Hi Jessica, On Thu, Nov 21, 2019 at 05:09:20PM +0100, Jessica Yu wrote: > +++ Rasmus Villemoes [21/11/19 11:51 +0100]: > > On 20/11/2019 15.51, Jessica Yu wrote: > > > /* > > > - * note on .section use: @progbits vs %progbits nastiness doesn't matter, > > > - * since we immediately emit into those sections anyway. > > > + * note on .section use: we specify @progbits vs %progbits since usage of > > > + * "M" (SHF_MERGE) section flag requires it. > > > */ > > > + > > > +#ifdef CONFIG_ARM > > > +#define ARCH_PROGBITS %progbits > > > +#else > > > +#define ARCH_PROGBITS @progbits > > > +#endif > > > > Did you figure out a way to determine if ARM is the only odd one? I was > > just going by gas' documentation which mentions ARM as an example, but > > doesn't really provide a way to know what each arch uses. I suppose 0day > > will tell us shortly. > > I *think* so. At least, I was going off of > drivers/base/firmware_loader/builtin/Makefile and > scripts/recordmcount.pl, which were the only other places that I found > that reference the %progbits vs @progbits oddity. They only use > %progbits in the case of CONFIG_ARM and @progbits for other > arches. I wasn't sure about arm64, but I looked at the assembly files > gcc produced and it looked like @progbits was used there. Added some > arm64 people to CC since they would know :-) The '@' character is a comment delimiter for 32-bit ARM assembly, so that's why you end up having to use a different character there. This isn't the case for arm64, where you need to use standard C/C++ comment delimiters instead and so '@progbits' should work correctly. Hope that helps, Will ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-21 16:09 ` Jessica Yu 2019-11-21 16:53 ` Will Deacon @ 2019-11-22 11:44 ` Masahiro Yamada 2019-11-25 9:29 ` Rasmus Villemoes 1 sibling, 1 reply; 23+ messages in thread From: Masahiro Yamada @ 2019-11-22 11:44 UTC (permalink / raw) To: Jessica Yu Cc: Rasmus Villemoes, Linux Kernel Mailing List, Matthias Maennich, Arnd Bergmann, Greg Kroah-Hartman, Will Deacon, Catalin Marinas On Fri, Nov 22, 2019 at 1:09 AM Jessica Yu <jeyu@kernel.org> wrote: > > +++ Rasmus Villemoes [21/11/19 11:51 +0100]: > >On 20/11/2019 15.51, Jessica Yu wrote: > >> > >> We can alleviate this situation by utilizing the SHF_MERGE and > >> SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker > >> that the data in the section are null-terminated strings > > > >[pet peeve: nul, not null.] > > Ah, right you are! > > >> As of v5.4-rc5, the following statistics were gathered with x86 > >> defconfig with approximately 10.7k exported symbols. > >> > >> Size of __ksymtab_strings in vmlinux: > >> ------------------------------------- > >> v5.4-rc5: 213834 bytes > >> v5.4-rc5 with commit c3a6cf19e695: 224455 bytes > >> v5.4-rc5 with this patch: 205759 bytes > >> > >> So, we already see memory savings of ~8kB compared to vanilla -rc5 and > >> savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. > > > >Yippee :) Thanks for picking up the ball and sending this. > > And thanks for the idea in the first place :-) > > >> /* > >> - * note on .section use: @progbits vs %progbits nastiness doesn't matter, > >> - * since we immediately emit into those sections anyway. > >> + * note on .section use: we specify @progbits vs %progbits since usage of > >> + * "M" (SHF_MERGE) section flag requires it. > >> */ > >> + > >> +#ifdef CONFIG_ARM > >> +#define ARCH_PROGBITS %progbits > >> +#else > >> +#define ARCH_PROGBITS @progbits > >> +#endif > > > >Did you figure out a way to determine if ARM is the only odd one? I was > >just going by gas' documentation which mentions ARM as an example, but > >doesn't really provide a way to know what each arch uses. I suppose 0day > >will tell us shortly. > > I *think* so. At least, I was going off of > drivers/base/firmware_loader/builtin/Makefile and > scripts/recordmcount.pl, which were the only other places that I found > that reference the %progbits vs @progbits oddity. They only use > %progbits in the case of CONFIG_ARM and @progbits for other > arches. I wasn't sure about arm64, but I looked at the assembly files > gcc produced and it looked like @progbits was used there. Added some > arm64 people to CC since they would know :-) > > >If we want to avoid arch-ifdefs in these headers (and that could become > >unwieldy if ARM is not the only one) I think one could add a > >asm-generic/progbits.h, add progbits.h to mandatory-y in > >include/asm-generic/Kbuild, and then just provide a progbits.h on ARM > >(and the other exceptions) - then I think the kbuild logic automatically > >makes "#include <asm/progbits.h>" pick up the right one. And the header > >could define ARCH_PROGBITS with or without the double quotes depending > >on __ASSEMBLY__. > > I think this would work, and it feels like the more correct solution > especially if arm isn't the only one with the odd progbits char. It > might be overkill if it's just arm that's affected though. I leave it > to Masahiro to see what he prefers. > BTW, is there any reason why not use %progbits all the time? include/linux/init.h hard-codes %progbits #define __INITDATA .section ".init.data","aw",%progbits #define __INITRODATA .section ".init.rodata","a",%progbits So, my understanding is '%' works for all architectures, but it is better to ask 0-day bot to test it. -- Best Regards Masahiro Yamada ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-22 11:44 ` Masahiro Yamada @ 2019-11-25 9:29 ` Rasmus Villemoes 2019-11-25 9:45 ` Jessica Yu 0 siblings, 1 reply; 23+ messages in thread From: Rasmus Villemoes @ 2019-11-25 9:29 UTC (permalink / raw) To: Masahiro Yamada, Jessica Yu Cc: Linux Kernel Mailing List, Matthias Maennich, Arnd Bergmann, Greg Kroah-Hartman, Will Deacon, Catalin Marinas, binutils cc += binutils ML [on @progbits v %progbits in generic (data only) assembly code] On 22/11/2019 12.44, Masahiro Yamada wrote: > On Fri, Nov 22, 2019 at 1:09 AM Jessica Yu <jeyu@kernel.org> wrote: >> >> I think this would work, and it feels like the more correct solution >> especially if arm isn't the only one with the odd progbits char. It >> might be overkill if it's just arm that's affected though. I leave it >> to Masahiro to see what he prefers. >> > > > BTW, is there any reason why > not use %progbits all the time? > > > include/linux/init.h hard-codes %progbits > > #define __INITDATA .section ".init.data","aw",%progbits > #define __INITRODATA .section ".init.rodata","a",%progbits > > > So, my understanding is '%' works for all architectures, > but it is better to ask 0-day bot to test it. It seems that you're absolutely right. The binutils source has code like + if (*input_line_pointer == '@' || *input_line_pointer == '%') + { + ++input_line_pointer; + if (strncmp (input_line_pointer, "progbits", + sizeof "progbits" - 1) == 0) + { + type = SHT_PROGBITS; + input_line_pointer += sizeof "progbits" - 1; + } + else if (strncmp (input_line_pointer, "nobits", + sizeof "nobits" - 1) == 0) + { + type = SHT_NOBITS; + input_line_pointer += sizeof "nobits" - 1; + } all the way back from commit 252b5132c753830d5fd56823373aed85f2a0db63 (tag: binu_ss_19990502) Author: Richard Henderson <rth@redhat.com> Date: Mon May 3 07:29:11 1999 +0000 19990502 sourceware import So yes, let's just try unconditionally using %progbits, that makes everything much simpler. The only reason I thought one needed to do it differently on ARM is this from the gas docs: === The optional TYPE argument may contain one of the following constants: '@progbits' section contains data ... Note on targets where the '@' character is the start of a comment (eg ARM) then another character is used instead. For example the ARM port uses the '%' character. === That doesn't suggest the possibility that % or some other character might work on all architectures. Jessica, can you resend using just %progbits to let kbuild chew on that? Please include a comment about the misleading gas documentation. Rasmus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-25 9:29 ` Rasmus Villemoes @ 2019-11-25 9:45 ` Jessica Yu 0 siblings, 0 replies; 23+ messages in thread From: Jessica Yu @ 2019-11-25 9:45 UTC (permalink / raw) To: Rasmus Villemoes Cc: Masahiro Yamada, Linux Kernel Mailing List, Matthias Maennich, Arnd Bergmann, Greg Kroah-Hartman, Will Deacon, Catalin Marinas, binutils +++ Rasmus Villemoes [25/11/19 10:29 +0100]: >cc += binutils ML > >[on @progbits v %progbits in generic (data only) assembly code] > >On 22/11/2019 12.44, Masahiro Yamada wrote: >> On Fri, Nov 22, 2019 at 1:09 AM Jessica Yu <jeyu@kernel.org> wrote: >>> > >>> I think this would work, and it feels like the more correct solution >>> especially if arm isn't the only one with the odd progbits char. It >>> might be overkill if it's just arm that's affected though. I leave it >>> to Masahiro to see what he prefers. >>> >> >> >> BTW, is there any reason why >> not use %progbits all the time? >> >> >> include/linux/init.h hard-codes %progbits >> >> #define __INITDATA .section ".init.data","aw",%progbits >> #define __INITRODATA .section ".init.rodata","a",%progbits >> >> >> So, my understanding is '%' works for all architectures, >> but it is better to ask 0-day bot to test it. > >It seems that you're absolutely right. The binutils source has code like > >+ if (*input_line_pointer == '@' || *input_line_pointer == '%') >+ { >+ ++input_line_pointer; >+ if (strncmp (input_line_pointer, "progbits", >+ sizeof "progbits" - 1) == 0) >+ { >+ type = SHT_PROGBITS; >+ input_line_pointer += sizeof "progbits" - 1; >+ } >+ else if (strncmp (input_line_pointer, "nobits", >+ sizeof "nobits" - 1) == 0) >+ { >+ type = SHT_NOBITS; >+ input_line_pointer += sizeof "nobits" - 1; >+ } Yeah, I saw this too while digging. I also came across this commit in the binutils source: commit ecc054c097e7ced281d02ef9632eb0261a410b96 Author: Thomas Preud'homme <thomas.preudhomme@arm.com> Date: Fri Mar 2 11:51:34 2018 +0000 [GDB/testsuite] Use %progbits in watch-loc.c While using @progbits in .pushsection work on some targets, it does not work on arm target where this introduces a comment. This patch replaces its use in gdb.dlang/watch-loc.c and gdb.mi/dw2-ref-missing-frame-func.c by %progbits which should work on all targets since it is used in target-independent elf/section7.s GAS test. 2018-03-02 Thomas Preud'homme <thomas.preudhomme@arm.com> gdb/testsuite/ * gdb.dlang/watch-loc.c: Use %progbits instead of @progbits. * gdb.mi/dw2-ref-missing-frame-func.c: Likewise. So that seems to confirm this theory :-) I'm just surprised it isn't documented anywhere it seems. >The only reason I thought one needed to do it differently on ARM is this >from the gas docs: > >=== > The optional TYPE argument may contain one of the following >constants: > >'@progbits' > section contains data >... > Note on targets where the '@' character is the start of a comment (eg >ARM) then another character is used instead. For example the ARM port >uses the '%' character. >=== > >That doesn't suggest the possibility that % or some other character >might work on all architectures. > >Jessica, can you resend using just %progbits to let kbuild chew on that? >Please include a comment about the misleading gas documentation. Yup, sounds good. Thanks! Jessica ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-20 14:51 [PATCH] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags Jessica Yu 2019-11-20 15:02 ` Jessica Yu 2019-11-21 10:51 ` Rasmus Villemoes @ 2019-11-25 15:42 ` Jessica Yu 2019-11-26 8:32 ` Masahiro Yamada 2019-12-06 12:41 ` [PATCH v3] " Jessica Yu 2 siblings, 2 replies; 23+ messages in thread From: Jessica Yu @ 2019-11-25 15:42 UTC (permalink / raw) To: linux-kernel Cc: Matthias Maennich, Masahiro Yamada, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman, Jessica Yu Commit c3a6cf19e695 ("export: avoid code duplication in include/linux/export.h") refactors export.h quite nicely, but introduces a slight increase in memory usage due to using the empty string "" instead of NULL to indicate that an exported symbol has no namespace. As mentioned in that commit, this meant an increase of 1 byte per exported symbol without a namespace. For example, if a kernel configuration has about 10k exported symbols, this would mean that the size of __ksymtab_strings would increase by roughly 10kB. We can alleviate this situation by utilizing the SHF_MERGE and SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker that the data in the section are null-terminated strings that can be merged to eliminate duplication. More specifically, from the binutils documentation - "for sections with both M and S, a string which is a suffix of a larger string is considered a duplicate. Thus "def" will be merged with "abcdef"; A reference to the first "def" will be changed to a reference to "abcdef"+3". Thus, all the empty strings would be merged as well as any strings that can be merged according to the cited method above. For example, "memset" and "__memset" would be merged to just "__memset" in __ksymtab_strings. As of v5.4-rc5, the following statistics were gathered with x86 defconfig with approximately 10.7k exported symbols. Size of __ksymtab_strings in vmlinux: ------------------------------------- v5.4-rc5: 213834 bytes v5.4-rc5 with commit c3a6cf19e695: 224455 bytes v5.4-rc5 with this patch: 205759 bytes So, we already see memory savings of ~8kB compared to vanilla -rc5 and savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. Unfortunately, as of this writing, strings will not get deduplicated for kernel modules, as ld does not do the deduplication for SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which kernel modules are. A patch for ld is currently being worked on to hopefully allow for string deduplication in relocatable files in the future. Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Jessica Yu <jeyu@kernel.org> --- v2: use %progbits throughout and document the oddity in a comment. include/asm-generic/export.h | 8 +++++--- include/linux/export.h | 27 +++++++++++++++++++++------ 2 files changed, 26 insertions(+), 9 deletions(-) diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h index fa577978fbbd..23bc98e97a66 100644 --- a/include/asm-generic/export.h +++ b/include/asm-generic/export.h @@ -26,9 +26,11 @@ .endm /* - * note on .section use: @progbits vs %progbits nastiness doesn't matter, - * since we immediately emit into those sections anyway. + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) + * section flag requires it. Use '%progbits' instead of '@progbits' since the + * former apparently works on all arches according to the binutils source. */ + .macro ___EXPORT_SYMBOL name,val,sec #ifdef CONFIG_MODULES .globl __ksymtab_\name @@ -37,7 +39,7 @@ __ksymtab_\name: __put \val, __kstrtab_\name .previous - .section __ksymtab_strings,"a" + .section __ksymtab_strings,"aMS",%progbits,1 __kstrtab_\name: .asciz "\name" .previous diff --git a/include/linux/export.h b/include/linux/export.h index 201262793369..3d835ca34d33 100644 --- a/include/linux/export.h +++ b/include/linux/export.h @@ -81,16 +81,31 @@ struct kernel_symbol { #else +/* + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) + * section flag requires it. Use '%progbits' instead of '@progbits' since the + * former apparently works on all arches according to the binutils source. + */ +#define __KSTRTAB_ENTRY(sym) \ + asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ + "__kstrtab_" #sym ": \n" \ + " .asciz \"" #sym "\" \n" \ + " .previous \n") + +#define __KSTRTAB_NS_ENTRY(sym, ns) \ + asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ + "__kstrtabns_" #sym ": \n" \ + " .asciz " #ns " \n" \ + " .previous \n") + /* For every exported symbol, place a struct in the __ksymtab section */ #define ___EXPORT_SYMBOL(sym, sec, ns) \ extern typeof(sym) sym; \ + extern const char __kstrtab_##sym[]; \ + extern const char __kstrtabns_##sym[]; \ __CRC_SYMBOL(sym, sec); \ - static const char __kstrtab_##sym[] \ - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ - = #sym; \ - static const char __kstrtabns_##sym[] \ - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ - = ns; \ + __KSTRTAB_ENTRY(sym); \ + __KSTRTAB_NS_ENTRY(sym, ns); \ __KSYMTAB_ENTRY(sym, sec) #endif -- 2.16.4 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v2] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-25 15:42 ` [PATCH v2] " Jessica Yu @ 2019-11-26 8:32 ` Masahiro Yamada 2019-11-26 9:55 ` Jessica Yu 2019-11-26 13:56 ` Matthias Maennich 2019-12-06 12:41 ` [PATCH v3] " Jessica Yu 1 sibling, 2 replies; 23+ messages in thread From: Masahiro Yamada @ 2019-11-26 8:32 UTC (permalink / raw) To: Jessica Yu Cc: Linux Kernel Mailing List, Matthias Maennich, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman On Tue, Nov 26, 2019 at 12:42 AM Jessica Yu <jeyu@kernel.org> wrote: > > Commit c3a6cf19e695 ("export: avoid code duplication in > include/linux/export.h") refactors export.h quite nicely, but introduces > a slight increase in memory usage due to using the empty string "" > instead of NULL to indicate that an exported symbol has no namespace. As > mentioned in that commit, this meant an increase of 1 byte per exported > symbol without a namespace. For example, if a kernel configuration has > about 10k exported symbols, this would mean that the size of > __ksymtab_strings would increase by roughly 10kB. > > We can alleviate this situation by utilizing the SHF_MERGE and > SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker > that the data in the section are null-terminated strings that can be > merged to eliminate duplication. More specifically, from the binutils > documentation - "for sections with both M and S, a string which is a > suffix of a larger string is considered a duplicate. Thus "def" will be > merged with "abcdef"; A reference to the first "def" will be changed to > a reference to "abcdef"+3". Thus, all the empty strings would be merged > as well as any strings that can be merged according to the cited method > above. For example, "memset" and "__memset" would be merged to just > "__memset" in __ksymtab_strings. > > As of v5.4-rc5, the following statistics were gathered with x86 > defconfig with approximately 10.7k exported symbols. > > Size of __ksymtab_strings in vmlinux: > ------------------------------------- > v5.4-rc5: 213834 bytes > v5.4-rc5 with commit c3a6cf19e695: 224455 bytes > v5.4-rc5 with this patch: 205759 bytes > > So, we already see memory savings of ~8kB compared to vanilla -rc5 and > savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. > > Unfortunately, as of this writing, strings will not get deduplicated for > kernel modules, as ld does not do the deduplication for > SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which > kernel modules are. A patch for ld is currently being worked on to > hopefully allow for string deduplication in relocatable files in the > future. > > Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> > Signed-off-by: Jessica Yu <jeyu@kernel.org> > --- > > v2: use %progbits throughout and document the oddity in a comment. > > include/asm-generic/export.h | 8 +++++--- > include/linux/export.h | 27 +++++++++++++++++++++------ > 2 files changed, 26 insertions(+), 9 deletions(-) > > diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h > index fa577978fbbd..23bc98e97a66 100644 > --- a/include/asm-generic/export.h > +++ b/include/asm-generic/export.h > @@ -26,9 +26,11 @@ > .endm > > /* > - * note on .section use: @progbits vs %progbits nastiness doesn't matter, > - * since we immediately emit into those sections anyway. > + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) > + * section flag requires it. Use '%progbits' instead of '@progbits' since the > + * former apparently works on all arches according to the binutils source. > */ > + > .macro ___EXPORT_SYMBOL name,val,sec > #ifdef CONFIG_MODULES > .globl __ksymtab_\name > @@ -37,7 +39,7 @@ > __ksymtab_\name: > __put \val, __kstrtab_\name > .previous > - .section __ksymtab_strings,"a" > + .section __ksymtab_strings,"aMS",%progbits,1 > __kstrtab_\name: > .asciz "\name" > .previous > diff --git a/include/linux/export.h b/include/linux/export.h > index 201262793369..3d835ca34d33 100644 > --- a/include/linux/export.h > +++ b/include/linux/export.h > @@ -81,16 +81,31 @@ struct kernel_symbol { > > #else > > +/* > + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) > + * section flag requires it. Use '%progbits' instead of '@progbits' since the > + * former apparently works on all arches according to the binutils source. > + */ > +#define __KSTRTAB_ENTRY(sym) \ > + asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ > + "__kstrtab_" #sym ": \n" \ > + " .asciz \"" #sym "\" \n" \ > + " .previous \n") > + > +#define __KSTRTAB_NS_ENTRY(sym, ns) \ > + asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ > + "__kstrtabns_" #sym ": \n" \ > + " .asciz " #ns " \n" \ Hmm, it took some time for me to how this code works. ns is already a C string, then you added # to it, then I was confused. Personally, I prefer this code: " .asciz \"" ns "\" \n" so it looks in the same way as __KSTRTAB_ENTRY(). BTW, you duplicated \"aMS\",%progbits,1" and ".previous" I would write it shorter, like this: #define ___EXPORT_SYMBOL(sym, sec, ns) \ extern typeof(sym) sym; \ extern const char __kstrtab_##sym[]; \ extern const char __kstrtabns_##sym[]; \ __CRC_SYMBOL(sym, sec); \ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1\n" \ "__kstrtab_" #sym ": \n" \ " .asciz \"" #sym "\" \n" \ "__kstrtabns_" #sym ": \n" \ " .asciz \"" ns "\" \n" \ " .previous \n"); \ __KSYMTAB_ENTRY(sym, sec) > + " .previous \n") > + > /* For every exported symbol, place a struct in the __ksymtab section */ > #define ___EXPORT_SYMBOL(sym, sec, ns) \ > extern typeof(sym) sym; \ > + extern const char __kstrtab_##sym[]; \ > + extern const char __kstrtabns_##sym[]; \ > __CRC_SYMBOL(sym, sec); \ > - static const char __kstrtab_##sym[] \ > - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ > - = #sym; \ > - static const char __kstrtabns_##sym[] \ > - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ > - = ns; \ > + __KSTRTAB_ENTRY(sym); \ > + __KSTRTAB_NS_ENTRY(sym, ns); \ > __KSYMTAB_ENTRY(sym, sec) > > #endif > -- > 2.16.4 > -- Best Regards Masahiro Yamada ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-26 8:32 ` Masahiro Yamada @ 2019-11-26 9:55 ` Jessica Yu 2019-11-26 13:56 ` Matthias Maennich 1 sibling, 0 replies; 23+ messages in thread From: Jessica Yu @ 2019-11-26 9:55 UTC (permalink / raw) To: Masahiro Yamada Cc: Linux Kernel Mailing List, Matthias Maennich, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman +++ Masahiro Yamada [26/11/19 17:32 +0900]: >On Tue, Nov 26, 2019 at 12:42 AM Jessica Yu <jeyu@kernel.org> wrote: >> >> Commit c3a6cf19e695 ("export: avoid code duplication in >> include/linux/export.h") refactors export.h quite nicely, but introduces >> a slight increase in memory usage due to using the empty string "" >> instead of NULL to indicate that an exported symbol has no namespace. As >> mentioned in that commit, this meant an increase of 1 byte per exported >> symbol without a namespace. For example, if a kernel configuration has >> about 10k exported symbols, this would mean that the size of >> __ksymtab_strings would increase by roughly 10kB. >> >> We can alleviate this situation by utilizing the SHF_MERGE and >> SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker >> that the data in the section are null-terminated strings that can be >> merged to eliminate duplication. More specifically, from the binutils >> documentation - "for sections with both M and S, a string which is a >> suffix of a larger string is considered a duplicate. Thus "def" will be >> merged with "abcdef"; A reference to the first "def" will be changed to >> a reference to "abcdef"+3". Thus, all the empty strings would be merged >> as well as any strings that can be merged according to the cited method >> above. For example, "memset" and "__memset" would be merged to just >> "__memset" in __ksymtab_strings. >> >> As of v5.4-rc5, the following statistics were gathered with x86 >> defconfig with approximately 10.7k exported symbols. >> >> Size of __ksymtab_strings in vmlinux: >> ------------------------------------- >> v5.4-rc5: 213834 bytes >> v5.4-rc5 with commit c3a6cf19e695: 224455 bytes >> v5.4-rc5 with this patch: 205759 bytes >> >> So, we already see memory savings of ~8kB compared to vanilla -rc5 and >> savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. >> >> Unfortunately, as of this writing, strings will not get deduplicated for >> kernel modules, as ld does not do the deduplication for >> SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which >> kernel modules are. A patch for ld is currently being worked on to >> hopefully allow for string deduplication in relocatable files in the >> future. >> >> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> >> Signed-off-by: Jessica Yu <jeyu@kernel.org> >> --- >> >> v2: use %progbits throughout and document the oddity in a comment. >> >> include/asm-generic/export.h | 8 +++++--- >> include/linux/export.h | 27 +++++++++++++++++++++------ >> 2 files changed, 26 insertions(+), 9 deletions(-) >> >> diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h >> index fa577978fbbd..23bc98e97a66 100644 >> --- a/include/asm-generic/export.h >> +++ b/include/asm-generic/export.h >> @@ -26,9 +26,11 @@ >> .endm >> >> /* >> - * note on .section use: @progbits vs %progbits nastiness doesn't matter, >> - * since we immediately emit into those sections anyway. >> + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >> + * section flag requires it. Use '%progbits' instead of '@progbits' since the >> + * former apparently works on all arches according to the binutils source. >> */ >> + >> .macro ___EXPORT_SYMBOL name,val,sec >> #ifdef CONFIG_MODULES >> .globl __ksymtab_\name >> @@ -37,7 +39,7 @@ >> __ksymtab_\name: >> __put \val, __kstrtab_\name >> .previous >> - .section __ksymtab_strings,"a" >> + .section __ksymtab_strings,"aMS",%progbits,1 >> __kstrtab_\name: >> .asciz "\name" >> .previous >> diff --git a/include/linux/export.h b/include/linux/export.h >> index 201262793369..3d835ca34d33 100644 >> --- a/include/linux/export.h >> +++ b/include/linux/export.h >> @@ -81,16 +81,31 @@ struct kernel_symbol { >> >> #else >> >> +/* >> + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >> + * section flag requires it. Use '%progbits' instead of '@progbits' since the >> + * former apparently works on all arches according to the binutils source. >> + */ >> +#define __KSTRTAB_ENTRY(sym) \ >> + asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ >> + "__kstrtab_" #sym ": \n" \ >> + " .asciz \"" #sym "\" \n" \ >> + " .previous \n") >> + >> +#define __KSTRTAB_NS_ENTRY(sym, ns) \ >> + asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ >> + "__kstrtabns_" #sym ": \n" \ >> + " .asciz " #ns " \n" \ > > >Hmm, it took some time for me to how this code works. > >ns is already a C string, then you added # to it, >then I was confused. > >Personally, I prefer this code: >" .asciz \"" ns "\" \n" > >so it looks in the same way as __KSTRTAB_ENTRY(). > > > >BTW, you duplicated \"aMS\",%progbits,1" and ".previous" > > >I would write it shorter, like this: > > >#define ___EXPORT_SYMBOL(sym, sec, ns) \ > extern typeof(sym) sym; \ > extern const char __kstrtab_##sym[]; \ > extern const char __kstrtabns_##sym[]; \ > __CRC_SYMBOL(sym, sec); \ > asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1\n" \ > "__kstrtab_" #sym ": \n" \ > " .asciz \"" #sym "\" \n" \ > "__kstrtabns_" #sym ": \n" \ > " .asciz \"" ns "\" \n" \ > " .previous \n"); \ > __KSYMTAB_ENTRY(sym, sec) > Sure, I can change that. Just thought it'd be easier to read with the separate macros. Thanks for your comments! ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-26 8:32 ` Masahiro Yamada 2019-11-26 9:55 ` Jessica Yu @ 2019-11-26 13:56 ` Matthias Maennich 2019-11-26 15:31 ` Jessica Yu 1 sibling, 1 reply; 23+ messages in thread From: Matthias Maennich @ 2019-11-26 13:56 UTC (permalink / raw) To: Masahiro Yamada Cc: Jessica Yu, Linux Kernel Mailing List, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman On Tue, Nov 26, 2019 at 05:32:59PM +0900, Masahiro Yamada wrote: >On Tue, Nov 26, 2019 at 12:42 AM Jessica Yu <jeyu@kernel.org> wrote: >> >> Commit c3a6cf19e695 ("export: avoid code duplication in >> include/linux/export.h") refactors export.h quite nicely, but introduces >> a slight increase in memory usage due to using the empty string "" >> instead of NULL to indicate that an exported symbol has no namespace. As >> mentioned in that commit, this meant an increase of 1 byte per exported >> symbol without a namespace. For example, if a kernel configuration has >> about 10k exported symbols, this would mean that the size of >> __ksymtab_strings would increase by roughly 10kB. >> >> We can alleviate this situation by utilizing the SHF_MERGE and >> SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker >> that the data in the section are null-terminated strings that can be >> merged to eliminate duplication. More specifically, from the binutils >> documentation - "for sections with both M and S, a string which is a >> suffix of a larger string is considered a duplicate. Thus "def" will be >> merged with "abcdef"; A reference to the first "def" will be changed to >> a reference to "abcdef"+3". Thus, all the empty strings would be merged >> as well as any strings that can be merged according to the cited method >> above. For example, "memset" and "__memset" would be merged to just >> "__memset" in __ksymtab_strings. >> >> As of v5.4-rc5, the following statistics were gathered with x86 >> defconfig with approximately 10.7k exported symbols. >> >> Size of __ksymtab_strings in vmlinux: >> ------------------------------------- >> v5.4-rc5: 213834 bytes >> v5.4-rc5 with commit c3a6cf19e695: 224455 bytes >> v5.4-rc5 with this patch: 205759 bytes >> >> So, we already see memory savings of ~8kB compared to vanilla -rc5 and >> savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. >> >> Unfortunately, as of this writing, strings will not get deduplicated for >> kernel modules, as ld does not do the deduplication for >> SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which >> kernel modules are. A patch for ld is currently being worked on to >> hopefully allow for string deduplication in relocatable files in the >> future. >> Thanks for working on this! >> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> >> Signed-off-by: Jessica Yu <jeyu@kernel.org> >> --- >> >> v2: use %progbits throughout and document the oddity in a comment. >> >> include/asm-generic/export.h | 8 +++++--- >> include/linux/export.h | 27 +++++++++++++++++++++------ >> 2 files changed, 26 insertions(+), 9 deletions(-) >> >> diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h >> index fa577978fbbd..23bc98e97a66 100644 >> --- a/include/asm-generic/export.h >> +++ b/include/asm-generic/export.h >> @@ -26,9 +26,11 @@ >> .endm >> >> /* >> - * note on .section use: @progbits vs %progbits nastiness doesn't matter, >> - * since we immediately emit into those sections anyway. >> + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >> + * section flag requires it. Use '%progbits' instead of '@progbits' since the >> + * former apparently works on all arches according to the binutils source. >> */ >> + >> .macro ___EXPORT_SYMBOL name,val,sec >> #ifdef CONFIG_MODULES >> .globl __ksymtab_\name >> @@ -37,7 +39,7 @@ >> __ksymtab_\name: >> __put \val, __kstrtab_\name >> .previous >> - .section __ksymtab_strings,"a" >> + .section __ksymtab_strings,"aMS",%progbits,1 >> __kstrtab_\name: >> .asciz "\name" >> .previous >> diff --git a/include/linux/export.h b/include/linux/export.h >> index 201262793369..3d835ca34d33 100644 >> --- a/include/linux/export.h >> +++ b/include/linux/export.h >> @@ -81,16 +81,31 @@ struct kernel_symbol { >> >> #else >> >> +/* >> + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >> + * section flag requires it. Use '%progbits' instead of '@progbits' since the >> + * former apparently works on all arches according to the binutils source. >> + */ >> +#define __KSTRTAB_ENTRY(sym) \ >> + asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ >> + "__kstrtab_" #sym ": \n" \ >> + " .asciz \"" #sym "\" \n" \ >> + " .previous \n") >> + >> +#define __KSTRTAB_NS_ENTRY(sym, ns) \ >> + asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ >> + "__kstrtabns_" #sym ": \n" \ >> + " .asciz " #ns " \n" \ > > >Hmm, it took some time for me to how this code works. > >ns is already a C string, then you added # to it, >then I was confused. > >Personally, I prefer this code: >" .asciz \"" ns "\" \n" > >so it looks in the same way as __KSTRTAB_ENTRY(). I agree with this, these entries should be consistent. > > > >BTW, you duplicated \"aMS\",%progbits,1" and ".previous" > > >I would write it shorter, like this: > > >#define ___EXPORT_SYMBOL(sym, sec, ns) \ > extern typeof(sym) sym; \ > extern const char __kstrtab_##sym[]; \ > extern const char __kstrtabns_##sym[]; \ > __CRC_SYMBOL(sym, sec); \ > asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1\n" \ > "__kstrtab_" #sym ": \n" \ > " .asciz \"" #sym "\" \n" \ > "__kstrtabns_" #sym ": \n" \ > " .asciz \"" ns "\" \n" \ > " .previous \n"); \ > __KSYMTAB_ENTRY(sym, sec) > I would prefer the separate macros though (as initially proposed) as I find them much more readable. The code is already a bit tricky to reason about and I don't think the shorter version is enough of a gain. > > > > > > > >> + " .previous \n") >> + >> /* For every exported symbol, place a struct in the __ksymtab section */ >> #define ___EXPORT_SYMBOL(sym, sec, ns) \ >> extern typeof(sym) sym; \ >> + extern const char __kstrtab_##sym[]; \ >> + extern const char __kstrtabns_##sym[]; \ >> __CRC_SYMBOL(sym, sec); \ >> - static const char __kstrtab_##sym[] \ >> - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ >> - = #sym; \ You could keep simplified versions of these statements as comment for the above macros to increase readability. >> - static const char __kstrtabns_##sym[] \ >> - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ >> - = ns; \ >> + __KSTRTAB_ENTRY(sym); \ >> + __KSTRTAB_NS_ENTRY(sym, ns); \ >> __KSYMTAB_ENTRY(sym, sec) >> >> #endif >> -- >> 2.16.4 >> With the above addressed, please feel free to add Reviewed-by: Matthias Maennich <maennich@google.com> Cheers, Matthias > > >-- >Best Regards >Masahiro Yamada ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-26 13:56 ` Matthias Maennich @ 2019-11-26 15:31 ` Jessica Yu 2019-11-26 16:02 ` Matthias Maennich 2019-11-26 16:12 ` Masahiro Yamada 0 siblings, 2 replies; 23+ messages in thread From: Jessica Yu @ 2019-11-26 15:31 UTC (permalink / raw) To: Matthias Maennich Cc: Masahiro Yamada, Linux Kernel Mailing List, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman +++ Matthias Maennich [26/11/19 13:56 +0000]: >On Tue, Nov 26, 2019 at 05:32:59PM +0900, Masahiro Yamada wrote: >>On Tue, Nov 26, 2019 at 12:42 AM Jessica Yu <jeyu@kernel.org> wrote: >>> >>>Commit c3a6cf19e695 ("export: avoid code duplication in >>>include/linux/export.h") refactors export.h quite nicely, but introduces >>>a slight increase in memory usage due to using the empty string "" >>>instead of NULL to indicate that an exported symbol has no namespace. As >>>mentioned in that commit, this meant an increase of 1 byte per exported >>>symbol without a namespace. For example, if a kernel configuration has >>>about 10k exported symbols, this would mean that the size of >>>__ksymtab_strings would increase by roughly 10kB. >>> >>>We can alleviate this situation by utilizing the SHF_MERGE and >>>SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker >>>that the data in the section are null-terminated strings that can be >>>merged to eliminate duplication. More specifically, from the binutils >>>documentation - "for sections with both M and S, a string which is a >>>suffix of a larger string is considered a duplicate. Thus "def" will be >>>merged with "abcdef"; A reference to the first "def" will be changed to >>>a reference to "abcdef"+3". Thus, all the empty strings would be merged >>>as well as any strings that can be merged according to the cited method >>>above. For example, "memset" and "__memset" would be merged to just >>>"__memset" in __ksymtab_strings. >>> >>>As of v5.4-rc5, the following statistics were gathered with x86 >>>defconfig with approximately 10.7k exported symbols. >>> >>>Size of __ksymtab_strings in vmlinux: >>>------------------------------------- >>>v5.4-rc5: 213834 bytes >>>v5.4-rc5 with commit c3a6cf19e695: 224455 bytes >>>v5.4-rc5 with this patch: 205759 bytes >>> >>>So, we already see memory savings of ~8kB compared to vanilla -rc5 and >>>savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. >>> >>>Unfortunately, as of this writing, strings will not get deduplicated for >>>kernel modules, as ld does not do the deduplication for >>>SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which >>>kernel modules are. A patch for ld is currently being worked on to >>>hopefully allow for string deduplication in relocatable files in the >>>future. >>> > >Thanks for working on this! > >>>Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> >>>Signed-off-by: Jessica Yu <jeyu@kernel.org> >>>--- >>> >>>v2: use %progbits throughout and document the oddity in a comment. >>> >>> include/asm-generic/export.h | 8 +++++--- >>> include/linux/export.h | 27 +++++++++++++++++++++------ >>> 2 files changed, 26 insertions(+), 9 deletions(-) >>> >>>diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h >>>index fa577978fbbd..23bc98e97a66 100644 >>>--- a/include/asm-generic/export.h >>>+++ b/include/asm-generic/export.h >>>@@ -26,9 +26,11 @@ >>> .endm >>> >>> /* >>>- * note on .section use: @progbits vs %progbits nastiness doesn't matter, >>>- * since we immediately emit into those sections anyway. >>>+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >>>+ * section flag requires it. Use '%progbits' instead of '@progbits' since the >>>+ * former apparently works on all arches according to the binutils source. >>> */ >>>+ >>> .macro ___EXPORT_SYMBOL name,val,sec >>> #ifdef CONFIG_MODULES >>> .globl __ksymtab_\name >>>@@ -37,7 +39,7 @@ >>> __ksymtab_\name: >>> __put \val, __kstrtab_\name >>> .previous >>>- .section __ksymtab_strings,"a" >>>+ .section __ksymtab_strings,"aMS",%progbits,1 >>> __kstrtab_\name: >>> .asciz "\name" >>> .previous >>>diff --git a/include/linux/export.h b/include/linux/export.h >>>index 201262793369..3d835ca34d33 100644 >>>--- a/include/linux/export.h >>>+++ b/include/linux/export.h >>>@@ -81,16 +81,31 @@ struct kernel_symbol { >>> >>> #else >>> >>>+/* >>>+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >>>+ * section flag requires it. Use '%progbits' instead of '@progbits' since the >>>+ * former apparently works on all arches according to the binutils source. >>>+ */ >>>+#define __KSTRTAB_ENTRY(sym) \ >>>+ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ >>>+ "__kstrtab_" #sym ": \n" \ >>>+ " .asciz \"" #sym "\" \n" \ >>>+ " .previous \n") >>>+ >>>+#define __KSTRTAB_NS_ENTRY(sym, ns) \ >>>+ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ >>>+ "__kstrtabns_" #sym ": \n" \ >>>+ " .asciz " #ns " \n" \ >> >> >>Hmm, it took some time for me to how this code works. >> >>ns is already a C string, then you added # to it, >>then I was confused. >> >>Personally, I prefer this code: >>" .asciz \"" ns "\" \n" >> >>so it looks in the same way as __KSTRTAB_ENTRY(). > >I agree with this, these entries should be consistent. > >> >> >> >>BTW, you duplicated \"aMS\",%progbits,1" and ".previous" >> >> >>I would write it shorter, like this: >> >> >>#define ___EXPORT_SYMBOL(sym, sec, ns) \ >> extern typeof(sym) sym; \ >> extern const char __kstrtab_##sym[]; \ >> extern const char __kstrtabns_##sym[]; \ >> __CRC_SYMBOL(sym, sec); \ >> asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1\n" \ >> "__kstrtab_" #sym ": \n" \ >> " .asciz \"" #sym "\" \n" \ >> "__kstrtabns_" #sym ": \n" \ >> " .asciz \"" ns "\" \n" \ >> " .previous \n"); \ >> __KSYMTAB_ENTRY(sym, sec) >> > >I would prefer the separate macros though (as initially proposed) as I >find them much more readable. The code is already a bit tricky to reason >about and I don't think the shorter version is enough of a gain. Yeah, the macros were more readable IMO. But I could just squash them into one __KSTRTAB_ENTRY macro as a compromise for Masahiro maybe? Is this any better? diff --git a/include/linux/export.h b/include/linux/export.h index 201262793369..f4a8fc798a1b 100644 --- a/include/linux/export.h +++ b/include/linux/export.h @@ -81,16 +81,30 @@ struct kernel_symbol { #else +/* + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) + * section flag requires it. Use '%progbits' instead of '@progbits' since the + * former apparently works on all arches according to the binutils source. + * + * This basically corresponds to: + * const char __kstrtab_##sym[] __attribute__((section("__ksymtab_strings")) = #sym; + * const char __kstrtabns_##sym[] __attribute__((section("__ksymtab_strings")) = ns; + */ +#define __KSTRTAB_ENTRY(sym, ns) \ + asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ + "__kstrtab_" #sym ": \n" \ + " .asciz \"" #sym "\" \n" \ + "__kstrtabns_" #sym ": \n" \ + " .asciz \"" ns "\" \n" \ + " .previous \n") + /* For every exported symbol, place a struct in the __ksymtab section */ #define ___EXPORT_SYMBOL(sym, sec, ns) \ extern typeof(sym) sym; \ + extern const char __kstrtab_##sym[]; \ + extern const char __kstrtabns_##sym[]; \ __CRC_SYMBOL(sym, sec); \ - static const char __kstrtab_##sym[] \ - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ - = #sym; \ - static const char __kstrtabns_##sym[] \ - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ - = ns; \ + __KSTRTAB_ENTRY(sym, ns); \ __KSYMTAB_ENTRY(sym, sec) #endif >> >> >> >> >> >> >> >>>+ " .previous \n") >>>+ >>> /* For every exported symbol, place a struct in the __ksymtab section */ >>> #define ___EXPORT_SYMBOL(sym, sec, ns) \ >>> extern typeof(sym) sym; \ >>>+ extern const char __kstrtab_##sym[]; \ >>>+ extern const char __kstrtabns_##sym[]; \ >>> __CRC_SYMBOL(sym, sec); \ >>>- static const char __kstrtab_##sym[] \ >>>- __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ >>>- = #sym; \ > >You could keep simplified versions of these statements as comment for >the above macros to increase readability. > >>>- static const char __kstrtabns_##sym[] \ >>>- __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ >>>- = ns; \ >>>+ __KSTRTAB_ENTRY(sym); \ >>>+ __KSTRTAB_NS_ENTRY(sym, ns); \ >>> __KSYMTAB_ENTRY(sym, sec) >>> >>> #endif >>>-- >>>2.16.4 >>> > >With the above addressed, please feel free to add > >Reviewed-by: Matthias Maennich <maennich@google.com> > >Cheers, >Matthias > >> >> >>-- >>Best Regards >>Masahiro Yamada ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v2] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-26 15:31 ` Jessica Yu @ 2019-11-26 16:02 ` Matthias Maennich 2019-11-26 16:12 ` Masahiro Yamada 1 sibling, 0 replies; 23+ messages in thread From: Matthias Maennich @ 2019-11-26 16:02 UTC (permalink / raw) To: Jessica Yu Cc: Masahiro Yamada, Linux Kernel Mailing List, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman On Tue, Nov 26, 2019 at 04:31:54PM +0100, Jessica Yu wrote: >+++ Matthias Maennich [26/11/19 13:56 +0000]: >>On Tue, Nov 26, 2019 at 05:32:59PM +0900, Masahiro Yamada wrote: >>>On Tue, Nov 26, 2019 at 12:42 AM Jessica Yu <jeyu@kernel.org> wrote: >>>> >>>>Commit c3a6cf19e695 ("export: avoid code duplication in >>>>include/linux/export.h") refactors export.h quite nicely, but introduces >>>>a slight increase in memory usage due to using the empty string "" >>>>instead of NULL to indicate that an exported symbol has no namespace. As >>>>mentioned in that commit, this meant an increase of 1 byte per exported >>>>symbol without a namespace. For example, if a kernel configuration has >>>>about 10k exported symbols, this would mean that the size of >>>>__ksymtab_strings would increase by roughly 10kB. >>>> >>>>We can alleviate this situation by utilizing the SHF_MERGE and >>>>SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker >>>>that the data in the section are null-terminated strings that can be >>>>merged to eliminate duplication. More specifically, from the binutils >>>>documentation - "for sections with both M and S, a string which is a >>>>suffix of a larger string is considered a duplicate. Thus "def" will be >>>>merged with "abcdef"; A reference to the first "def" will be changed to >>>>a reference to "abcdef"+3". Thus, all the empty strings would be merged >>>>as well as any strings that can be merged according to the cited method >>>>above. For example, "memset" and "__memset" would be merged to just >>>>"__memset" in __ksymtab_strings. >>>> >>>>As of v5.4-rc5, the following statistics were gathered with x86 >>>>defconfig with approximately 10.7k exported symbols. >>>> >>>>Size of __ksymtab_strings in vmlinux: >>>>------------------------------------- >>>>v5.4-rc5: 213834 bytes >>>>v5.4-rc5 with commit c3a6cf19e695: 224455 bytes >>>>v5.4-rc5 with this patch: 205759 bytes >>>> >>>>So, we already see memory savings of ~8kB compared to vanilla -rc5 and >>>>savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. >>>> >>>>Unfortunately, as of this writing, strings will not get deduplicated for >>>>kernel modules, as ld does not do the deduplication for >>>>SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which >>>>kernel modules are. A patch for ld is currently being worked on to >>>>hopefully allow for string deduplication in relocatable files in the >>>>future. >>>> >> >>Thanks for working on this! >> >>>>Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> >>>>Signed-off-by: Jessica Yu <jeyu@kernel.org> >>>>--- >>>> >>>>v2: use %progbits throughout and document the oddity in a comment. >>>> >>>>include/asm-generic/export.h | 8 +++++--- >>>>include/linux/export.h | 27 +++++++++++++++++++++------ >>>>2 files changed, 26 insertions(+), 9 deletions(-) >>>> >>>>diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h >>>>index fa577978fbbd..23bc98e97a66 100644 >>>>--- a/include/asm-generic/export.h >>>>+++ b/include/asm-generic/export.h >>>>@@ -26,9 +26,11 @@ >>>>.endm >>>> >>>>/* >>>>- * note on .section use: @progbits vs %progbits nastiness doesn't matter, >>>>- * since we immediately emit into those sections anyway. >>>>+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >>>>+ * section flag requires it. Use '%progbits' instead of '@progbits' since the >>>>+ * former apparently works on all arches according to the binutils source. >>>> */ >>>>+ >>>>.macro ___EXPORT_SYMBOL name,val,sec >>>>#ifdef CONFIG_MODULES >>>> .globl __ksymtab_\name >>>>@@ -37,7 +39,7 @@ >>>>__ksymtab_\name: >>>> __put \val, __kstrtab_\name >>>> .previous >>>>- .section __ksymtab_strings,"a" >>>>+ .section __ksymtab_strings,"aMS",%progbits,1 >>>>__kstrtab_\name: >>>> .asciz "\name" >>>> .previous >>>>diff --git a/include/linux/export.h b/include/linux/export.h >>>>index 201262793369..3d835ca34d33 100644 >>>>--- a/include/linux/export.h >>>>+++ b/include/linux/export.h >>>>@@ -81,16 +81,31 @@ struct kernel_symbol { >>>> >>>>#else >>>> >>>>+/* >>>>+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >>>>+ * section flag requires it. Use '%progbits' instead of '@progbits' since the >>>>+ * former apparently works on all arches according to the binutils source. >>>>+ */ >>>>+#define __KSTRTAB_ENTRY(sym) \ >>>>+ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ >>>>+ "__kstrtab_" #sym ": \n" \ >>>>+ " .asciz \"" #sym "\" \n" \ >>>>+ " .previous \n") >>>>+ >>>>+#define __KSTRTAB_NS_ENTRY(sym, ns) \ >>>>+ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ >>>>+ "__kstrtabns_" #sym ": \n" \ >>>>+ " .asciz " #ns " \n" \ >>> >>> >>>Hmm, it took some time for me to how this code works. >>> >>>ns is already a C string, then you added # to it, >>>then I was confused. >>> >>>Personally, I prefer this code: >>>" .asciz \"" ns "\" \n" >>> >>>so it looks in the same way as __KSTRTAB_ENTRY(). >> >>I agree with this, these entries should be consistent. >> >>> >>> >>> >>>BTW, you duplicated \"aMS\",%progbits,1" and ".previous" >>> >>> >>>I would write it shorter, like this: >>> >>> >>>#define ___EXPORT_SYMBOL(sym, sec, ns) \ >>> extern typeof(sym) sym; \ >>> extern const char __kstrtab_##sym[]; \ >>> extern const char __kstrtabns_##sym[]; \ >>> __CRC_SYMBOL(sym, sec); \ >>> asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1\n" \ >>> "__kstrtab_" #sym ": \n" \ >>> " .asciz \"" #sym "\" \n" \ >>> "__kstrtabns_" #sym ": \n" \ >>> " .asciz \"" ns "\" \n" \ >>> " .previous \n"); \ >>> __KSYMTAB_ENTRY(sym, sec) >>> >> >>I would prefer the separate macros though (as initially proposed) as I >>find them much more readable. The code is already a bit tricky to reason >>about and I don't think the shorter version is enough of a gain. > >Yeah, the macros were more readable IMO. But I could just squash them into one >__KSTRTAB_ENTRY macro as a compromise for Masahiro maybe? > >Is this any better? Yes, I like this variant. > >diff --git a/include/linux/export.h b/include/linux/export.h >index 201262793369..f4a8fc798a1b 100644 >--- a/include/linux/export.h >+++ b/include/linux/export.h >@@ -81,16 +81,30 @@ struct kernel_symbol { > >#else > >+/* >+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >+ * section flag requires it. Use '%progbits' instead of '@progbits' since the >+ * former apparently works on all arches according to the binutils source. >+ * >+ * This basically corresponds to: >+ * const char __kstrtab_##sym[] __attribute__((section("__ksymtab_strings")) = #sym; >+ * const char __kstrtabns_##sym[] __attribute__((section("__ksymtab_strings")) = ns; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ You might want to drop the section attribute here to make the comment shorter (even though it is not entirely correct than anymore). Cheers, Matthias >+ */ >+#define __KSTRTAB_ENTRY(sym, ns) \ >+ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ >+ "__kstrtab_" #sym ": \n" \ >+ " .asciz \"" #sym "\" \n" \ >+ "__kstrtabns_" #sym ": \n" \ >+ " .asciz \"" ns "\" \n" \ >+ " .previous \n") >+ >/* For every exported symbol, place a struct in the __ksymtab section */ >#define ___EXPORT_SYMBOL(sym, sec, ns) \ > extern typeof(sym) sym; \ >+ extern const char __kstrtab_##sym[]; \ >+ extern const char __kstrtabns_##sym[]; \ > __CRC_SYMBOL(sym, sec); \ >- static const char __kstrtab_##sym[] \ >- __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ >- = #sym; \ >- static const char __kstrtabns_##sym[] \ >- __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ >- = ns; \ >+ __KSTRTAB_ENTRY(sym, ns); \ > __KSYMTAB_ENTRY(sym, sec) > >#endif > >>> >>> >>> >>> >>> >>> >>> >>>>+ " .previous \n") >>>>+ >>>>/* For every exported symbol, place a struct in the __ksymtab section */ >>>>#define ___EXPORT_SYMBOL(sym, sec, ns) \ >>>> extern typeof(sym) sym; \ >>>>+ extern const char __kstrtab_##sym[]; \ >>>>+ extern const char __kstrtabns_##sym[]; \ >>>> __CRC_SYMBOL(sym, sec); \ >>>>- static const char __kstrtab_##sym[] \ >>>>- __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ >>>>- = #sym; \ >> >>You could keep simplified versions of these statements as comment for >>the above macros to increase readability. >> >>>>- static const char __kstrtabns_##sym[] \ >>>>- __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ >>>>- = ns; \ >>>>+ __KSTRTAB_ENTRY(sym); \ >>>>+ __KSTRTAB_NS_ENTRY(sym, ns); \ >>>> __KSYMTAB_ENTRY(sym, sec) >>>> >>>>#endif >>>>-- >>>>2.16.4 >>>> >> >>With the above addressed, please feel free to add >> >>Reviewed-by: Matthias Maennich <maennich@google.com> >> >>Cheers, >>Matthias >> >>> >>> >>>-- >>>Best Regards >>>Masahiro Yamada ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-26 15:31 ` Jessica Yu 2019-11-26 16:02 ` Matthias Maennich @ 2019-11-26 16:12 ` Masahiro Yamada 2019-11-26 16:48 ` Jessica Yu 1 sibling, 1 reply; 23+ messages in thread From: Masahiro Yamada @ 2019-11-26 16:12 UTC (permalink / raw) To: Jessica Yu Cc: Matthias Maennich, Linux Kernel Mailing List, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman On Wed, Nov 27, 2019 at 12:32 AM Jessica Yu <jeyu@kernel.org> wrote: > > +++ Matthias Maennich [26/11/19 13:56 +0000]: > >On Tue, Nov 26, 2019 at 05:32:59PM +0900, Masahiro Yamada wrote: > >>On Tue, Nov 26, 2019 at 12:42 AM Jessica Yu <jeyu@kernel.org> wrote: > >>> > >>>Commit c3a6cf19e695 ("export: avoid code duplication in > >>>include/linux/export.h") refactors export.h quite nicely, but introduces > >>>a slight increase in memory usage due to using the empty string "" > >>>instead of NULL to indicate that an exported symbol has no namespace. As > >>>mentioned in that commit, this meant an increase of 1 byte per exported > >>>symbol without a namespace. For example, if a kernel configuration has > >>>about 10k exported symbols, this would mean that the size of > >>>__ksymtab_strings would increase by roughly 10kB. > >>> > >>>We can alleviate this situation by utilizing the SHF_MERGE and > >>>SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker > >>>that the data in the section are null-terminated strings that can be > >>>merged to eliminate duplication. More specifically, from the binutils > >>>documentation - "for sections with both M and S, a string which is a > >>>suffix of a larger string is considered a duplicate. Thus "def" will be > >>>merged with "abcdef"; A reference to the first "def" will be changed to > >>>a reference to "abcdef"+3". Thus, all the empty strings would be merged > >>>as well as any strings that can be merged according to the cited method > >>>above. For example, "memset" and "__memset" would be merged to just > >>>"__memset" in __ksymtab_strings. > >>> > >>>As of v5.4-rc5, the following statistics were gathered with x86 > >>>defconfig with approximately 10.7k exported symbols. > >>> > >>>Size of __ksymtab_strings in vmlinux: > >>>------------------------------------- > >>>v5.4-rc5: 213834 bytes > >>>v5.4-rc5 with commit c3a6cf19e695: 224455 bytes > >>>v5.4-rc5 with this patch: 205759 bytes > >>> > >>>So, we already see memory savings of ~8kB compared to vanilla -rc5 and > >>>savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. > >>> > >>>Unfortunately, as of this writing, strings will not get deduplicated for > >>>kernel modules, as ld does not do the deduplication for > >>>SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which > >>>kernel modules are. A patch for ld is currently being worked on to > >>>hopefully allow for string deduplication in relocatable files in the > >>>future. > >>> > > > >Thanks for working on this! > > > >>>Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> > >>>Signed-off-by: Jessica Yu <jeyu@kernel.org> > >>>--- > >>> > >>>v2: use %progbits throughout and document the oddity in a comment. > >>> > >>> include/asm-generic/export.h | 8 +++++--- > >>> include/linux/export.h | 27 +++++++++++++++++++++------ > >>> 2 files changed, 26 insertions(+), 9 deletions(-) > >>> > >>>diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h > >>>index fa577978fbbd..23bc98e97a66 100644 > >>>--- a/include/asm-generic/export.h > >>>+++ b/include/asm-generic/export.h > >>>@@ -26,9 +26,11 @@ > >>> .endm > >>> > >>> /* > >>>- * note on .section use: @progbits vs %progbits nastiness doesn't matter, > >>>- * since we immediately emit into those sections anyway. > >>>+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) > >>>+ * section flag requires it. Use '%progbits' instead of '@progbits' since the > >>>+ * former apparently works on all arches according to the binutils source. > >>> */ > >>>+ > >>> .macro ___EXPORT_SYMBOL name,val,sec > >>> #ifdef CONFIG_MODULES > >>> .globl __ksymtab_\name > >>>@@ -37,7 +39,7 @@ > >>> __ksymtab_\name: > >>> __put \val, __kstrtab_\name > >>> .previous > >>>- .section __ksymtab_strings,"a" > >>>+ .section __ksymtab_strings,"aMS",%progbits,1 > >>> __kstrtab_\name: > >>> .asciz "\name" > >>> .previous > >>>diff --git a/include/linux/export.h b/include/linux/export.h > >>>index 201262793369..3d835ca34d33 100644 > >>>--- a/include/linux/export.h > >>>+++ b/include/linux/export.h > >>>@@ -81,16 +81,31 @@ struct kernel_symbol { > >>> > >>> #else > >>> > >>>+/* > >>>+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) > >>>+ * section flag requires it. Use '%progbits' instead of '@progbits' since the > >>>+ * former apparently works on all arches according to the binutils source. > >>>+ */ > >>>+#define __KSTRTAB_ENTRY(sym) \ > >>>+ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ > >>>+ "__kstrtab_" #sym ": \n" \ > >>>+ " .asciz \"" #sym "\" \n" \ > >>>+ " .previous \n") > >>>+ > >>>+#define __KSTRTAB_NS_ENTRY(sym, ns) \ > >>>+ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ > >>>+ "__kstrtabns_" #sym ": \n" \ > >>>+ " .asciz " #ns " \n" \ > >> > >> > >>Hmm, it took some time for me to how this code works. > >> > >>ns is already a C string, then you added # to it, > >>then I was confused. > >> > >>Personally, I prefer this code: > >>" .asciz \"" ns "\" \n" > >> > >>so it looks in the same way as __KSTRTAB_ENTRY(). > > > >I agree with this, these entries should be consistent. > > > >> > >> > >> > >>BTW, you duplicated \"aMS\",%progbits,1" and ".previous" > >> > >> > >>I would write it shorter, like this: > >> > >> > >>#define ___EXPORT_SYMBOL(sym, sec, ns) \ > >> extern typeof(sym) sym; \ > >> extern const char __kstrtab_##sym[]; \ > >> extern const char __kstrtabns_##sym[]; \ > >> __CRC_SYMBOL(sym, sec); \ > >> asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1\n" \ > >> "__kstrtab_" #sym ": \n" \ > >> " .asciz \"" #sym "\" \n" \ > >> "__kstrtabns_" #sym ": \n" \ > >> " .asciz \"" ns "\" \n" \ > >> " .previous \n"); \ > >> __KSYMTAB_ENTRY(sym, sec) > >> > > > >I would prefer the separate macros though (as initially proposed) as I > >find them much more readable. The code is already a bit tricky to reason > >about and I don't think the shorter version is enough of a gain. > > Yeah, the macros were more readable IMO. But I could just squash them into one > __KSTRTAB_ENTRY macro as a compromise for Masahiro maybe? > > Is this any better? I prefer opposite. __CRC_SYMBOL() is macrofied because it is ifdef'ed by CONFIG_MODVERSIONS and CONFIG_MOD_REL_CRCS. __KSYMTAB_ENTRY() is macrofied because it is ifdef'ed by CONFIG_HAVE_ARCH_PREL32_RELOCATIONS. __KSTRTAB_ENTRY() does not depend on any CONFIG option, so it can be expanded in ___EXPORT_SYMBOL(). You need to check multiple locations to understand how it works as a whole. I do not understand why increasing macro-chains is readable. Masahiro > > diff --git a/include/linux/export.h b/include/linux/export.h > index 201262793369..f4a8fc798a1b 100644 > --- a/include/linux/export.h > +++ b/include/linux/export.h > @@ -81,16 +81,30 @@ struct kernel_symbol { > > #else > > +/* > + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) > + * section flag requires it. Use '%progbits' instead of '@progbits' since the > + * former apparently works on all arches according to the binutils source. > + * > + * This basically corresponds to: > + * const char __kstrtab_##sym[] __attribute__((section("__ksymtab_strings")) = #sym; > + * const char __kstrtabns_##sym[] __attribute__((section("__ksymtab_strings")) = ns; > + */ > +#define __KSTRTAB_ENTRY(sym, ns) \ > + asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ > + "__kstrtab_" #sym ": \n" \ > + " .asciz \"" #sym "\" \n" \ > + "__kstrtabns_" #sym ": \n" \ > + " .asciz \"" ns "\" \n" \ > + " .previous \n") > + > /* For every exported symbol, place a struct in the __ksymtab section */ > #define ___EXPORT_SYMBOL(sym, sec, ns) \ > extern typeof(sym) sym; \ > + extern const char __kstrtab_##sym[]; \ > + extern const char __kstrtabns_##sym[]; \ > __CRC_SYMBOL(sym, sec); \ > - static const char __kstrtab_##sym[] \ > - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ > - = #sym; \ > - static const char __kstrtabns_##sym[] \ > - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ > - = ns; \ > + __KSTRTAB_ENTRY(sym, ns); \ > __KSYMTAB_ENTRY(sym, sec) > > #endif > > >> > >> > >> > >> > >> > >> > >> > >>>+ " .previous \n") > >>>+ > >>> /* For every exported symbol, place a struct in the __ksymtab section */ > >>> #define ___EXPORT_SYMBOL(sym, sec, ns) \ > >>> extern typeof(sym) sym; \ > >>>+ extern const char __kstrtab_##sym[]; \ > >>>+ extern const char __kstrtabns_##sym[]; \ > >>> __CRC_SYMBOL(sym, sec); \ > >>>- static const char __kstrtab_##sym[] \ > >>>- __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ > >>>- = #sym; \ > > > >You could keep simplified versions of these statements as comment for > >the above macros to increase readability. > > > >>>- static const char __kstrtabns_##sym[] \ > >>>- __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ > >>>- = ns; \ > >>>+ __KSTRTAB_ENTRY(sym); \ > >>>+ __KSTRTAB_NS_ENTRY(sym, ns); \ > >>> __KSYMTAB_ENTRY(sym, sec) > >>> > >>> #endif > >>>-- > >>>2.16.4 > >>> > > > >With the above addressed, please feel free to add > > > >Reviewed-by: Matthias Maennich <maennich@google.com> > > > >Cheers, > >Matthias > > > >> > >> > >>-- > >>Best Regards > >>Masahiro Yamada -- Best Regards Masahiro Yamada ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-26 16:12 ` Masahiro Yamada @ 2019-11-26 16:48 ` Jessica Yu 2019-12-05 16:42 ` Matthias Maennich 0 siblings, 1 reply; 23+ messages in thread From: Jessica Yu @ 2019-11-26 16:48 UTC (permalink / raw) To: Masahiro Yamada Cc: Matthias Maennich, Linux Kernel Mailing List, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman +++ Masahiro Yamada [27/11/19 01:12 +0900]: >On Wed, Nov 27, 2019 at 12:32 AM Jessica Yu <jeyu@kernel.org> wrote: >> >> +++ Matthias Maennich [26/11/19 13:56 +0000]: >> >On Tue, Nov 26, 2019 at 05:32:59PM +0900, Masahiro Yamada wrote: >> >>On Tue, Nov 26, 2019 at 12:42 AM Jessica Yu <jeyu@kernel.org> wrote: >> >>> >> >>>Commit c3a6cf19e695 ("export: avoid code duplication in >> >>>include/linux/export.h") refactors export.h quite nicely, but introduces >> >>>a slight increase in memory usage due to using the empty string "" >> >>>instead of NULL to indicate that an exported symbol has no namespace. As >> >>>mentioned in that commit, this meant an increase of 1 byte per exported >> >>>symbol without a namespace. For example, if a kernel configuration has >> >>>about 10k exported symbols, this would mean that the size of >> >>>__ksymtab_strings would increase by roughly 10kB. >> >>> >> >>>We can alleviate this situation by utilizing the SHF_MERGE and >> >>>SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker >> >>>that the data in the section are null-terminated strings that can be >> >>>merged to eliminate duplication. More specifically, from the binutils >> >>>documentation - "for sections with both M and S, a string which is a >> >>>suffix of a larger string is considered a duplicate. Thus "def" will be >> >>>merged with "abcdef"; A reference to the first "def" will be changed to >> >>>a reference to "abcdef"+3". Thus, all the empty strings would be merged >> >>>as well as any strings that can be merged according to the cited method >> >>>above. For example, "memset" and "__memset" would be merged to just >> >>>"__memset" in __ksymtab_strings. >> >>> >> >>>As of v5.4-rc5, the following statistics were gathered with x86 >> >>>defconfig with approximately 10.7k exported symbols. >> >>> >> >>>Size of __ksymtab_strings in vmlinux: >> >>>------------------------------------- >> >>>v5.4-rc5: 213834 bytes >> >>>v5.4-rc5 with commit c3a6cf19e695: 224455 bytes >> >>>v5.4-rc5 with this patch: 205759 bytes >> >>> >> >>>So, we already see memory savings of ~8kB compared to vanilla -rc5 and >> >>>savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. >> >>> >> >>>Unfortunately, as of this writing, strings will not get deduplicated for >> >>>kernel modules, as ld does not do the deduplication for >> >>>SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which >> >>>kernel modules are. A patch for ld is currently being worked on to >> >>>hopefully allow for string deduplication in relocatable files in the >> >>>future. >> >>> >> > >> >Thanks for working on this! >> > >> >>>Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> >> >>>Signed-off-by: Jessica Yu <jeyu@kernel.org> >> >>>--- >> >>> >> >>>v2: use %progbits throughout and document the oddity in a comment. >> >>> >> >>> include/asm-generic/export.h | 8 +++++--- >> >>> include/linux/export.h | 27 +++++++++++++++++++++------ >> >>> 2 files changed, 26 insertions(+), 9 deletions(-) >> >>> >> >>>diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h >> >>>index fa577978fbbd..23bc98e97a66 100644 >> >>>--- a/include/asm-generic/export.h >> >>>+++ b/include/asm-generic/export.h >> >>>@@ -26,9 +26,11 @@ >> >>> .endm >> >>> >> >>> /* >> >>>- * note on .section use: @progbits vs %progbits nastiness doesn't matter, >> >>>- * since we immediately emit into those sections anyway. >> >>>+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >> >>>+ * section flag requires it. Use '%progbits' instead of '@progbits' since the >> >>>+ * former apparently works on all arches according to the binutils source. >> >>> */ >> >>>+ >> >>> .macro ___EXPORT_SYMBOL name,val,sec >> >>> #ifdef CONFIG_MODULES >> >>> .globl __ksymtab_\name >> >>>@@ -37,7 +39,7 @@ >> >>> __ksymtab_\name: >> >>> __put \val, __kstrtab_\name >> >>> .previous >> >>>- .section __ksymtab_strings,"a" >> >>>+ .section __ksymtab_strings,"aMS",%progbits,1 >> >>> __kstrtab_\name: >> >>> .asciz "\name" >> >>> .previous >> >>>diff --git a/include/linux/export.h b/include/linux/export.h >> >>>index 201262793369..3d835ca34d33 100644 >> >>>--- a/include/linux/export.h >> >>>+++ b/include/linux/export.h >> >>>@@ -81,16 +81,31 @@ struct kernel_symbol { >> >>> >> >>> #else >> >>> >> >>>+/* >> >>>+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >> >>>+ * section flag requires it. Use '%progbits' instead of '@progbits' since the >> >>>+ * former apparently works on all arches according to the binutils source. >> >>>+ */ >> >>>+#define __KSTRTAB_ENTRY(sym) \ >> >>>+ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ >> >>>+ "__kstrtab_" #sym ": \n" \ >> >>>+ " .asciz \"" #sym "\" \n" \ >> >>>+ " .previous \n") >> >>>+ >> >>>+#define __KSTRTAB_NS_ENTRY(sym, ns) \ >> >>>+ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ >> >>>+ "__kstrtabns_" #sym ": \n" \ >> >>>+ " .asciz " #ns " \n" \ >> >> >> >> >> >>Hmm, it took some time for me to how this code works. >> >> >> >>ns is already a C string, then you added # to it, >> >>then I was confused. >> >> >> >>Personally, I prefer this code: >> >>" .asciz \"" ns "\" \n" >> >> >> >>so it looks in the same way as __KSTRTAB_ENTRY(). >> > >> >I agree with this, these entries should be consistent. >> > >> >> >> >> >> >> >> >>BTW, you duplicated \"aMS\",%progbits,1" and ".previous" >> >> >> >> >> >>I would write it shorter, like this: >> >> >> >> >> >>#define ___EXPORT_SYMBOL(sym, sec, ns) \ >> >> extern typeof(sym) sym; \ >> >> extern const char __kstrtab_##sym[]; \ >> >> extern const char __kstrtabns_##sym[]; \ >> >> __CRC_SYMBOL(sym, sec); \ >> >> asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1\n" \ >> >> "__kstrtab_" #sym ": \n" \ >> >> " .asciz \"" #sym "\" \n" \ >> >> "__kstrtabns_" #sym ": \n" \ >> >> " .asciz \"" ns "\" \n" \ >> >> " .previous \n"); \ >> >> __KSYMTAB_ENTRY(sym, sec) >> >> >> > >> >I would prefer the separate macros though (as initially proposed) as I >> >find them much more readable. The code is already a bit tricky to reason >> >about and I don't think the shorter version is enough of a gain. >> >> Yeah, the macros were more readable IMO. But I could just squash them into one >> __KSTRTAB_ENTRY macro as a compromise for Masahiro maybe? >> >> Is this any better? > >I prefer opposite. > > >__CRC_SYMBOL() is macrofied because it is ifdef'ed by >CONFIG_MODVERSIONS and CONFIG_MOD_REL_CRCS. > >__KSYMTAB_ENTRY() is macrofied because it is ifdef'ed by >CONFIG_HAVE_ARCH_PREL32_RELOCATIONS. > > > >__KSTRTAB_ENTRY() does not depend on any CONFIG option, >so it can be expanded in ___EXPORT_SYMBOL(). > > >You need to check multiple locations >to understand how it works as a whole. >I do not understand why increasing macro-chains is readable. I think it is "readable" in the sense that when someone is reading export.h for the first time, they know exactly what ___EXPORT_SYMBOL() is supposed to do, without requiring them to read too deeply into the asm and swim through the #ifdefs. __CRC_SYMBOL(), depending on modversions, creates an entry in what would eventually be merged to become the __kcrctab section, __KSTRTAB_ENTRY() creates entries in __ksymtab_strings, and __KSYMTAB_ENTRY() creates an entry in (what would eventually become) the __ksymtab{,_gpl} section. It gives a general idea of what it's doing. However, I do understand your reasoning behind not having an extra macro. I'll wait a bit before respinning the patch to see if we can get at a consensus.. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-26 16:48 ` Jessica Yu @ 2019-12-05 16:42 ` Matthias Maennich 0 siblings, 0 replies; 23+ messages in thread From: Matthias Maennich @ 2019-12-05 16:42 UTC (permalink / raw) To: Jessica Yu Cc: Masahiro Yamada, Linux Kernel Mailing List, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman On Tue, Nov 26, 2019 at 05:48:40PM +0100, Jessica Yu wrote: >+++ Masahiro Yamada [27/11/19 01:12 +0900]: >>On Wed, Nov 27, 2019 at 12:32 AM Jessica Yu <jeyu@kernel.org> wrote: >>> >>>+++ Matthias Maennich [26/11/19 13:56 +0000]: >>>>On Tue, Nov 26, 2019 at 05:32:59PM +0900, Masahiro Yamada wrote: >>>>>On Tue, Nov 26, 2019 at 12:42 AM Jessica Yu <jeyu@kernel.org> wrote: >>>>>> >>>>>>Commit c3a6cf19e695 ("export: avoid code duplication in >>>>>>include/linux/export.h") refactors export.h quite nicely, but introduces >>>>>>a slight increase in memory usage due to using the empty string "" >>>>>>instead of NULL to indicate that an exported symbol has no namespace. As >>>>>>mentioned in that commit, this meant an increase of 1 byte per exported >>>>>>symbol without a namespace. For example, if a kernel configuration has >>>>>>about 10k exported symbols, this would mean that the size of >>>>>>__ksymtab_strings would increase by roughly 10kB. >>>>>> >>>>>>We can alleviate this situation by utilizing the SHF_MERGE and >>>>>>SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker >>>>>>that the data in the section are null-terminated strings that can be >>>>>>merged to eliminate duplication. More specifically, from the binutils >>>>>>documentation - "for sections with both M and S, a string which is a >>>>>>suffix of a larger string is considered a duplicate. Thus "def" will be >>>>>>merged with "abcdef"; A reference to the first "def" will be changed to >>>>>>a reference to "abcdef"+3". Thus, all the empty strings would be merged >>>>>>as well as any strings that can be merged according to the cited method >>>>>>above. For example, "memset" and "__memset" would be merged to just >>>>>>"__memset" in __ksymtab_strings. >>>>>> >>>>>>As of v5.4-rc5, the following statistics were gathered with x86 >>>>>>defconfig with approximately 10.7k exported symbols. >>>>>> >>>>>>Size of __ksymtab_strings in vmlinux: >>>>>>------------------------------------- >>>>>>v5.4-rc5: 213834 bytes >>>>>>v5.4-rc5 with commit c3a6cf19e695: 224455 bytes >>>>>>v5.4-rc5 with this patch: 205759 bytes >>>>>> >>>>>>So, we already see memory savings of ~8kB compared to vanilla -rc5 and >>>>>>savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. >>>>>> >>>>>>Unfortunately, as of this writing, strings will not get deduplicated for >>>>>>kernel modules, as ld does not do the deduplication for >>>>>>SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which >>>>>>kernel modules are. A patch for ld is currently being worked on to >>>>>>hopefully allow for string deduplication in relocatable files in the >>>>>>future. >>>>>> >>>> >>>>Thanks for working on this! >>>> >>>>>>Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> >>>>>>Signed-off-by: Jessica Yu <jeyu@kernel.org> >>>>>>--- >>>>>> >>>>>>v2: use %progbits throughout and document the oddity in a comment. >>>>>> >>>>>> include/asm-generic/export.h | 8 +++++--- >>>>>> include/linux/export.h | 27 +++++++++++++++++++++------ >>>>>> 2 files changed, 26 insertions(+), 9 deletions(-) >>>>>> >>>>>>diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h >>>>>>index fa577978fbbd..23bc98e97a66 100644 >>>>>>--- a/include/asm-generic/export.h >>>>>>+++ b/include/asm-generic/export.h >>>>>>@@ -26,9 +26,11 @@ >>>>>> .endm >>>>>> >>>>>> /* >>>>>>- * note on .section use: @progbits vs %progbits nastiness doesn't matter, >>>>>>- * since we immediately emit into those sections anyway. >>>>>>+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >>>>>>+ * section flag requires it. Use '%progbits' instead of '@progbits' since the >>>>>>+ * former apparently works on all arches according to the binutils source. >>>>>> */ >>>>>>+ >>>>>> .macro ___EXPORT_SYMBOL name,val,sec >>>>>> #ifdef CONFIG_MODULES >>>>>> .globl __ksymtab_\name >>>>>>@@ -37,7 +39,7 @@ >>>>>> __ksymtab_\name: >>>>>> __put \val, __kstrtab_\name >>>>>> .previous >>>>>>- .section __ksymtab_strings,"a" >>>>>>+ .section __ksymtab_strings,"aMS",%progbits,1 >>>>>> __kstrtab_\name: >>>>>> .asciz "\name" >>>>>> .previous >>>>>>diff --git a/include/linux/export.h b/include/linux/export.h >>>>>>index 201262793369..3d835ca34d33 100644 >>>>>>--- a/include/linux/export.h >>>>>>+++ b/include/linux/export.h >>>>>>@@ -81,16 +81,31 @@ struct kernel_symbol { >>>>>> >>>>>> #else >>>>>> >>>>>>+/* >>>>>>+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >>>>>>+ * section flag requires it. Use '%progbits' instead of '@progbits' since the >>>>>>+ * former apparently works on all arches according to the binutils source. >>>>>>+ */ >>>>>>+#define __KSTRTAB_ENTRY(sym) \ >>>>>>+ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ >>>>>>+ "__kstrtab_" #sym ": \n" \ >>>>>>+ " .asciz \"" #sym "\" \n" \ >>>>>>+ " .previous \n") >>>>>>+ >>>>>>+#define __KSTRTAB_NS_ENTRY(sym, ns) \ >>>>>>+ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \ >>>>>>+ "__kstrtabns_" #sym ": \n" \ >>>>>>+ " .asciz " #ns " \n" \ >>>>> >>>>> >>>>>Hmm, it took some time for me to how this code works. >>>>> >>>>>ns is already a C string, then you added # to it, >>>>>then I was confused. >>>>> >>>>>Personally, I prefer this code: >>>>>" .asciz \"" ns "\" \n" >>>>> >>>>>so it looks in the same way as __KSTRTAB_ENTRY(). >>>> >>>>I agree with this, these entries should be consistent. >>>> >>>>> >>>>> >>>>> >>>>>BTW, you duplicated \"aMS\",%progbits,1" and ".previous" >>>>> >>>>> >>>>>I would write it shorter, like this: >>>>> >>>>> >>>>>#define ___EXPORT_SYMBOL(sym, sec, ns) \ >>>>> extern typeof(sym) sym; \ >>>>> extern const char __kstrtab_##sym[]; \ >>>>> extern const char __kstrtabns_##sym[]; \ >>>>> __CRC_SYMBOL(sym, sec); \ >>>>> asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1\n" \ >>>>> "__kstrtab_" #sym ": \n" \ >>>>> " .asciz \"" #sym "\" \n" \ >>>>> "__kstrtabns_" #sym ": \n" \ >>>>> " .asciz \"" ns "\" \n" \ >>>>> " .previous \n"); \ >>>>> __KSYMTAB_ENTRY(sym, sec) >>>>> >>>> >>>>I would prefer the separate macros though (as initially proposed) as I >>>>find them much more readable. The code is already a bit tricky to reason >>>>about and I don't think the shorter version is enough of a gain. >>> >>>Yeah, the macros were more readable IMO. But I could just squash them into one >>>__KSTRTAB_ENTRY macro as a compromise for Masahiro maybe? >>> >>>Is this any better? >> >>I prefer opposite. >> >> >>__CRC_SYMBOL() is macrofied because it is ifdef'ed by >>CONFIG_MODVERSIONS and CONFIG_MOD_REL_CRCS. >> >>__KSYMTAB_ENTRY() is macrofied because it is ifdef'ed by >>CONFIG_HAVE_ARCH_PREL32_RELOCATIONS. >> >> >> >>__KSTRTAB_ENTRY() does not depend on any CONFIG option, >>so it can be expanded in ___EXPORT_SYMBOL(). >> >> >>You need to check multiple locations >>to understand how it works as a whole. >>I do not understand why increasing macro-chains is readable. > >I think it is "readable" in the sense that when someone is reading >export.h for the first time, they know exactly what ___EXPORT_SYMBOL() >is supposed to do, without requiring them to read too deeply into the >asm and swim through the #ifdefs. __CRC_SYMBOL(), depending on >modversions, creates an entry in what would eventually be merged to >become the __kcrctab section, __KSTRTAB_ENTRY() creates entries in >__ksymtab_strings, and __KSYMTAB_ENTRY() creates an entry in (what >would eventually become) the __ksymtab{,_gpl} section. It gives a >general idea of what it's doing. I agree with that thought. One option would be, though, to put a proper comment there to explain what happens in the code. Splitting the functionality across several macros has the effect that the code feels easier to reason about and has some implicit documentation through the macro names. If we could restore that in Masahiro's version, I am ok with that. Cheers, Matthias > >However, I do understand your reasoning behind not having an extra >macro. I'll wait a bit before respinning the patch to see if we can >get at a consensus.. > > ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v3] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-11-25 15:42 ` [PATCH v2] " Jessica Yu 2019-11-26 8:32 ` Masahiro Yamada @ 2019-12-06 12:41 ` Jessica Yu 2019-12-12 6:22 ` Masahiro Yamada 2019-12-12 14:16 ` [PATCH v4] " Jessica Yu 1 sibling, 2 replies; 23+ messages in thread From: Jessica Yu @ 2019-12-06 12:41 UTC (permalink / raw) To: linux-kernel Cc: Matthias Maennich, Masahiro Yamada, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman, Jessica Yu Commit c3a6cf19e695 ("export: avoid code duplication in include/linux/export.h") refactors export.h quite nicely, but introduces a slight increase in memory usage due to using the empty string "" instead of NULL to indicate that an exported symbol has no namespace. As mentioned in that commit, this meant an increase of 1 byte per exported symbol without a namespace. For example, if a kernel configuration has about 10k exported symbols, this would mean that the size of __ksymtab_strings would increase by roughly 10kB. We can alleviate this situation by utilizing the SHF_MERGE and SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker that the data in the section are null-terminated strings that can be merged to eliminate duplication. More specifically, from the binutils documentation - "for sections with both M and S, a string which is a suffix of a larger string is considered a duplicate. Thus "def" will be merged with "abcdef"; A reference to the first "def" will be changed to a reference to "abcdef"+3". Thus, all the empty strings would be merged as well as any strings that can be merged according to the cited method above. For example, "memset" and "__memset" would be merged to just "__memset" in __ksymtab_strings. As of v5.4-rc5, the following statistics were gathered with x86 defconfig with approximately 10.7k exported symbols. Size of __ksymtab_strings in vmlinux: ------------------------------------- v5.4-rc5: 213834 bytes v5.4-rc5 with commit c3a6cf19e695: 224455 bytes v5.4-rc5 with this patch: 205759 bytes So, we already see memory savings of ~8kB compared to vanilla -rc5 and savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. Unfortunately, as of this writing, strings will not get deduplicated for kernel modules, as ld does not do the deduplication for SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which kernel modules are. A patch for ld is currently being worked on to hopefully allow for string deduplication in relocatable files in the future. Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Jessica Yu <jeyu@kernel.org> --- v3: - remove __KSTRTAB_ENTRY macros in favor of just putting the asm directly in ___EXPORT_SYMBOL - Document more clearly what the ___EXPORT_SYMBOL macro does include/asm-generic/export.h | 8 +++++--- include/linux/export.h | 27 ++++++++++++++++++++------- 2 files changed, 25 insertions(+), 10 deletions(-) diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h index fa577978fbbd..23bc98e97a66 100644 --- a/include/asm-generic/export.h +++ b/include/asm-generic/export.h @@ -26,9 +26,11 @@ .endm /* - * note on .section use: @progbits vs %progbits nastiness doesn't matter, - * since we immediately emit into those sections anyway. + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) + * section flag requires it. Use '%progbits' instead of '@progbits' since the + * former apparently works on all arches according to the binutils source. */ + .macro ___EXPORT_SYMBOL name,val,sec #ifdef CONFIG_MODULES .globl __ksymtab_\name @@ -37,7 +39,7 @@ __ksymtab_\name: __put \val, __kstrtab_\name .previous - .section __ksymtab_strings,"a" + .section __ksymtab_strings,"aMS",%progbits,1 __kstrtab_\name: .asciz "\name" .previous diff --git a/include/linux/export.h b/include/linux/export.h index 201262793369..18dcdcd118e7 100644 --- a/include/linux/export.h +++ b/include/linux/export.h @@ -81,16 +81,29 @@ struct kernel_symbol { #else -/* For every exported symbol, place a struct in the __ksymtab section */ +/* + * For every exported symbol, do the following: + * + * - If applicable, place an entry in the __kcrctab section. + * - Put the name of the symbol and namespace (empty string "" for none) in + * __ksymtab_strings. + * - Place an entry in the __ksymtab section. + * + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) + * section flag requires it. Use '%progbits' instead of '@progbits' since the + * former apparently works on all arches according to the binutils source. + */ #define ___EXPORT_SYMBOL(sym, sec, ns) \ extern typeof(sym) sym; \ + extern const char __kstrtab_##sym[]; \ + extern const char __kstrtabns_##sym[]; \ __CRC_SYMBOL(sym, sec); \ - static const char __kstrtab_##sym[] \ - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ - = #sym; \ - static const char __kstrtabns_##sym[] \ - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ - = ns; \ + asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1\n" \ + "__kstrtab_" #sym ": \n" \ + " .asciz \"" #sym "\" \n" \ + "__kstrtabns_" #sym ": \n" \ + " .asciz \"" ns "\" \n" \ + " .previous \n"); \ __KSYMTAB_ENTRY(sym, sec) #endif -- 2.16.4 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v3] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-12-06 12:41 ` [PATCH v3] " Jessica Yu @ 2019-12-12 6:22 ` Masahiro Yamada 2019-12-12 10:36 ` Jessica Yu 2019-12-12 14:16 ` [PATCH v4] " Jessica Yu 1 sibling, 1 reply; 23+ messages in thread From: Masahiro Yamada @ 2019-12-12 6:22 UTC (permalink / raw) To: Jessica Yu Cc: Linux Kernel Mailing List, Matthias Maennich, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman On Fri, Dec 6, 2019 at 9:41 PM Jessica Yu <jeyu@kernel.org> wrote: > > Commit c3a6cf19e695 ("export: avoid code duplication in > include/linux/export.h") refactors export.h quite nicely, but introduces > a slight increase in memory usage due to using the empty string "" > instead of NULL to indicate that an exported symbol has no namespace. As > mentioned in that commit, this meant an increase of 1 byte per exported > symbol without a namespace. For example, if a kernel configuration has > about 10k exported symbols, this would mean that the size of > __ksymtab_strings would increase by roughly 10kB. > > We can alleviate this situation by utilizing the SHF_MERGE and > SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker > that the data in the section are null-terminated strings that can be > merged to eliminate duplication. More specifically, from the binutils > documentation - "for sections with both M and S, a string which is a > suffix of a larger string is considered a duplicate. Thus "def" will be > merged with "abcdef"; A reference to the first "def" will be changed to > a reference to "abcdef"+3". Thus, all the empty strings would be merged > as well as any strings that can be merged according to the cited method > above. For example, "memset" and "__memset" would be merged to just > "__memset" in __ksymtab_strings. > > As of v5.4-rc5, the following statistics were gathered with x86 > defconfig with approximately 10.7k exported symbols. > > Size of __ksymtab_strings in vmlinux: > ------------------------------------- > v5.4-rc5: 213834 bytes > v5.4-rc5 with commit c3a6cf19e695: 224455 bytes > v5.4-rc5 with this patch: 205759 bytes > > So, we already see memory savings of ~8kB compared to vanilla -rc5 and > savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. > > Unfortunately, as of this writing, strings will not get deduplicated for > kernel modules, as ld does not do the deduplication for > SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which > kernel modules are. A patch for ld is currently being worked on to > hopefully allow for string deduplication in relocatable files in the > future. > > Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> > Signed-off-by: Jessica Yu <jeyu@kernel.org> > --- > v3: > - remove __KSTRTAB_ENTRY macros in favor of just putting the asm directly > in ___EXPORT_SYMBOL > - Document more clearly what the ___EXPORT_SYMBOL macro does > > include/asm-generic/export.h | 8 +++++--- > include/linux/export.h | 27 ++++++++++++++++++++------- > 2 files changed, 25 insertions(+), 10 deletions(-) > > diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h > index fa577978fbbd..23bc98e97a66 100644 > --- a/include/asm-generic/export.h > +++ b/include/asm-generic/export.h > @@ -26,9 +26,11 @@ > .endm > > /* > - * note on .section use: @progbits vs %progbits nastiness doesn't matter, > - * since we immediately emit into those sections anyway. > + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) > + * section flag requires it. Use '%progbits' instead of '@progbits' since the > + * former apparently works on all arches according to the binutils source. > */ > + > .macro ___EXPORT_SYMBOL name,val,sec > #ifdef CONFIG_MODULES > .globl __ksymtab_\name > @@ -37,7 +39,7 @@ > __ksymtab_\name: > __put \val, __kstrtab_\name > .previous > - .section __ksymtab_strings,"a" > + .section __ksymtab_strings,"aMS",%progbits,1 > __kstrtab_\name: > .asciz "\name" > .previous > diff --git a/include/linux/export.h b/include/linux/export.h > index 201262793369..18dcdcd118e7 100644 > --- a/include/linux/export.h > +++ b/include/linux/export.h > @@ -81,16 +81,29 @@ struct kernel_symbol { > > #else > > -/* For every exported symbol, place a struct in the __ksymtab section */ > +/* > + * For every exported symbol, do the following: Just a nit. You mention "an entry" twice to talk about different classes of structures. It might be better to be explicit about "which entry". > + * > + * - If applicable, place an entry in the __kcrctab section. "place a CRC entry in the __kcrctab section" might be clearer ? > + * - Put the name of the symbol and namespace (empty string "" for none) in > + * __ksymtab_strings. > + * - Place an entry in the __ksymtab section. "Place a struct kernel_symbol entry in __ksymtab section" ? might be clearer ? Other than that: Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> > + * > + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) > + * section flag requires it. Use '%progbits' instead of '@progbits' since the > + * former apparently works on all arches according to the binutils source. > + */ > #define ___EXPORT_SYMBOL(sym, sec, ns) \ > extern typeof(sym) sym; \ > + extern const char __kstrtab_##sym[]; \ > + extern const char __kstrtabns_##sym[]; \ > __CRC_SYMBOL(sym, sec); \ > - static const char __kstrtab_##sym[] \ > - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ > - = #sym; \ > - static const char __kstrtabns_##sym[] \ > - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ > - = ns; \ > + asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1\n" \ > + "__kstrtab_" #sym ": \n" \ > + " .asciz \"" #sym "\" \n" \ > + "__kstrtabns_" #sym ": \n" \ > + " .asciz \"" ns "\" \n" \ > + " .previous \n"); \ > __KSYMTAB_ENTRY(sym, sec) > > #endif > -- > 2.16.4 > -- Best Regards Masahiro Yamada ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-12-12 6:22 ` Masahiro Yamada @ 2019-12-12 10:36 ` Jessica Yu 0 siblings, 0 replies; 23+ messages in thread From: Jessica Yu @ 2019-12-12 10:36 UTC (permalink / raw) To: Masahiro Yamada Cc: Linux Kernel Mailing List, Matthias Maennich, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman +++ Masahiro Yamada [12/12/19 15:22 +0900]: >On Fri, Dec 6, 2019 at 9:41 PM Jessica Yu <jeyu@kernel.org> wrote: >> >> Commit c3a6cf19e695 ("export: avoid code duplication in >> include/linux/export.h") refactors export.h quite nicely, but introduces >> a slight increase in memory usage due to using the empty string "" >> instead of NULL to indicate that an exported symbol has no namespace. As >> mentioned in that commit, this meant an increase of 1 byte per exported >> symbol without a namespace. For example, if a kernel configuration has >> about 10k exported symbols, this would mean that the size of >> __ksymtab_strings would increase by roughly 10kB. >> >> We can alleviate this situation by utilizing the SHF_MERGE and >> SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker >> that the data in the section are null-terminated strings that can be >> merged to eliminate duplication. More specifically, from the binutils >> documentation - "for sections with both M and S, a string which is a >> suffix of a larger string is considered a duplicate. Thus "def" will be >> merged with "abcdef"; A reference to the first "def" will be changed to >> a reference to "abcdef"+3". Thus, all the empty strings would be merged >> as well as any strings that can be merged according to the cited method >> above. For example, "memset" and "__memset" would be merged to just >> "__memset" in __ksymtab_strings. >> >> As of v5.4-rc5, the following statistics were gathered with x86 >> defconfig with approximately 10.7k exported symbols. >> >> Size of __ksymtab_strings in vmlinux: >> ------------------------------------- >> v5.4-rc5: 213834 bytes >> v5.4-rc5 with commit c3a6cf19e695: 224455 bytes >> v5.4-rc5 with this patch: 205759 bytes >> >> So, we already see memory savings of ~8kB compared to vanilla -rc5 and >> savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. >> >> Unfortunately, as of this writing, strings will not get deduplicated for >> kernel modules, as ld does not do the deduplication for >> SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which >> kernel modules are. A patch for ld is currently being worked on to >> hopefully allow for string deduplication in relocatable files in the >> future. >> >> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> >> Signed-off-by: Jessica Yu <jeyu@kernel.org> >> --- >> v3: >> - remove __KSTRTAB_ENTRY macros in favor of just putting the asm directly >> in ___EXPORT_SYMBOL >> - Document more clearly what the ___EXPORT_SYMBOL macro does >> >> include/asm-generic/export.h | 8 +++++--- >> include/linux/export.h | 27 ++++++++++++++++++++------- >> 2 files changed, 25 insertions(+), 10 deletions(-) >> >> diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h >> index fa577978fbbd..23bc98e97a66 100644 >> --- a/include/asm-generic/export.h >> +++ b/include/asm-generic/export.h >> @@ -26,9 +26,11 @@ >> .endm >> >> /* >> - * note on .section use: @progbits vs %progbits nastiness doesn't matter, >> - * since we immediately emit into those sections anyway. >> + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >> + * section flag requires it. Use '%progbits' instead of '@progbits' since the >> + * former apparently works on all arches according to the binutils source. >> */ >> + >> .macro ___EXPORT_SYMBOL name,val,sec >> #ifdef CONFIG_MODULES >> .globl __ksymtab_\name >> @@ -37,7 +39,7 @@ >> __ksymtab_\name: >> __put \val, __kstrtab_\name >> .previous >> - .section __ksymtab_strings,"a" >> + .section __ksymtab_strings,"aMS",%progbits,1 >> __kstrtab_\name: >> .asciz "\name" >> .previous >> diff --git a/include/linux/export.h b/include/linux/export.h >> index 201262793369..18dcdcd118e7 100644 >> --- a/include/linux/export.h >> +++ b/include/linux/export.h >> @@ -81,16 +81,29 @@ struct kernel_symbol { >> >> #else >> >> -/* For every exported symbol, place a struct in the __ksymtab section */ >> +/* >> + * For every exported symbol, do the following: > >Just a nit. > >You mention "an entry" twice >to talk about different classes of structures. > >It might be better to be explicit about "which entry". > > >> + * >> + * - If applicable, place an entry in the __kcrctab section. > > "place a CRC entry in the __kcrctab section" >might be clearer ? > > >> + * - Put the name of the symbol and namespace (empty string "" for none) in >> + * __ksymtab_strings. >> + * - Place an entry in the __ksymtab section. > > "Place a struct kernel_symbol entry in __ksymtab section" ? >might be clearer ? > > >Other than that: >Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Yeah, that is much clearer, will resend. Thanks! Jessica ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v4] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-12-06 12:41 ` [PATCH v3] " Jessica Yu 2019-12-12 6:22 ` Masahiro Yamada @ 2019-12-12 14:16 ` Jessica Yu 2019-12-12 14:29 ` Matthias Maennich 1 sibling, 1 reply; 23+ messages in thread From: Jessica Yu @ 2019-12-12 14:16 UTC (permalink / raw) To: linux-kernel Cc: Matthias Maennich, Masahiro Yamada, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman, Jessica Yu Commit c3a6cf19e695 ("export: avoid code duplication in include/linux/export.h") refactors export.h quite nicely, but introduces a slight increase in memory usage due to using the empty string "" instead of NULL to indicate that an exported symbol has no namespace. As mentioned in that commit, this meant an increase of 1 byte per exported symbol without a namespace. For example, if a kernel configuration has about 10k exported symbols, this would mean that the size of __ksymtab_strings would increase by roughly 10kB. We can alleviate this situation by utilizing the SHF_MERGE and SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker that the data in the section are null-terminated strings that can be merged to eliminate duplication. More specifically, from the binutils documentation - "for sections with both M and S, a string which is a suffix of a larger string is considered a duplicate. Thus "def" will be merged with "abcdef"; A reference to the first "def" will be changed to a reference to "abcdef"+3". Thus, all the empty strings would be merged as well as any strings that can be merged according to the cited method above. For example, "memset" and "__memset" would be merged to just "__memset" in __ksymtab_strings. As of v5.4-rc5, the following statistics were gathered with x86 defconfig with approximately 10.7k exported symbols. Size of __ksymtab_strings in vmlinux: ------------------------------------- v5.4-rc5: 213834 bytes v5.4-rc5 with commit c3a6cf19e695: 224455 bytes v5.4-rc5 with this patch: 205759 bytes So, we already see memory savings of ~8kB compared to vanilla -rc5 and savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. Unfortunately, as of this writing, strings will not get deduplicated for kernel modules, as ld does not do the deduplication for SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which kernel modules are. A patch for ld is currently being worked on to hopefully allow for string deduplication in relocatable files in the future. Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Jessica Yu <jeyu@kernel.org> --- v4: - fix the comment above ___EXPORT_SYMBOL to be more specific about what entries are being placed in their respective sections. include/asm-generic/export.h | 8 +++++--- include/linux/export.h | 27 ++++++++++++++++++++------- 2 files changed, 25 insertions(+), 10 deletions(-) diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h index afddc5442e92..365345f9a9e3 100644 --- a/include/asm-generic/export.h +++ b/include/asm-generic/export.h @@ -27,9 +27,11 @@ .endm /* - * note on .section use: @progbits vs %progbits nastiness doesn't matter, - * since we immediately emit into those sections anyway. + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) + * section flag requires it. Use '%progbits' instead of '@progbits' since the + * former apparently works on all arches according to the binutils source. */ + .macro ___EXPORT_SYMBOL name,val,sec #ifdef CONFIG_MODULES .section ___ksymtab\sec+\name,"a" @@ -37,7 +39,7 @@ __ksymtab_\name: __put \val, __kstrtab_\name .previous - .section __ksymtab_strings,"a" + .section __ksymtab_strings,"aMS",%progbits,1 __kstrtab_\name: .asciz "\name" .previous diff --git a/include/linux/export.h b/include/linux/export.h index 627841448293..c166d35e3d76 100644 --- a/include/linux/export.h +++ b/include/linux/export.h @@ -82,16 +82,29 @@ struct kernel_symbol { #else -/* For every exported symbol, place a struct in the __ksymtab section */ +/* + * For every exported symbol, do the following: + * + * - If applicable, place a CRC entry in the __kcrctab section. + * - Put the name of the symbol and namespace (empty string "" for none) in + * __ksymtab_strings. + * - Place a struct kernel_symbol entry in the __ksymtab section. + * + * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) + * section flag requires it. Use '%progbits' instead of '@progbits' since the + * former apparently works on all arches according to the binutils source. + */ #define ___EXPORT_SYMBOL(sym, sec, ns) \ extern typeof(sym) sym; \ + extern const char __kstrtab_##sym[]; \ + extern const char __kstrtabns_##sym[]; \ __CRC_SYMBOL(sym, sec); \ - static const char __kstrtab_##sym[] \ - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ - = #sym; \ - static const char __kstrtabns_##sym[] \ - __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ - = ns; \ + asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1\n" \ + "__kstrtab_" #sym ": \n" \ + " .asciz \"" #sym "\" \n" \ + "__kstrtabns_" #sym ": \n" \ + " .asciz \"" ns "\" \n" \ + " .previous \n"); \ __KSYMTAB_ENTRY(sym, sec) #endif -- 2.16.4 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v4] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-12-12 14:16 ` [PATCH v4] " Jessica Yu @ 2019-12-12 14:29 ` Matthias Maennich 2019-12-16 9:41 ` Jessica Yu 0 siblings, 1 reply; 23+ messages in thread From: Matthias Maennich @ 2019-12-12 14:29 UTC (permalink / raw) To: Jessica Yu Cc: linux-kernel, Masahiro Yamada, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman On Thu, Dec 12, 2019 at 03:16:13PM +0100, Jessica Yu wrote: >Commit c3a6cf19e695 ("export: avoid code duplication in >include/linux/export.h") refactors export.h quite nicely, but introduces >a slight increase in memory usage due to using the empty string "" >instead of NULL to indicate that an exported symbol has no namespace. As >mentioned in that commit, this meant an increase of 1 byte per exported >symbol without a namespace. For example, if a kernel configuration has >about 10k exported symbols, this would mean that the size of >__ksymtab_strings would increase by roughly 10kB. > >We can alleviate this situation by utilizing the SHF_MERGE and >SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker >that the data in the section are null-terminated strings that can be >merged to eliminate duplication. More specifically, from the binutils >documentation - "for sections with both M and S, a string which is a >suffix of a larger string is considered a duplicate. Thus "def" will be >merged with "abcdef"; A reference to the first "def" will be changed to >a reference to "abcdef"+3". Thus, all the empty strings would be merged >as well as any strings that can be merged according to the cited method >above. For example, "memset" and "__memset" would be merged to just >"__memset" in __ksymtab_strings. > >As of v5.4-rc5, the following statistics were gathered with x86 >defconfig with approximately 10.7k exported symbols. > >Size of __ksymtab_strings in vmlinux: >------------------------------------- >v5.4-rc5: 213834 bytes >v5.4-rc5 with commit c3a6cf19e695: 224455 bytes >v5.4-rc5 with this patch: 205759 bytes > >So, we already see memory savings of ~8kB compared to vanilla -rc5 and >savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. > >Unfortunately, as of this writing, strings will not get deduplicated for >kernel modules, as ld does not do the deduplication for >SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which >kernel modules are. A patch for ld is currently being worked on to >hopefully allow for string deduplication in relocatable files in the >future. > >Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> >Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> >Signed-off-by: Jessica Yu <jeyu@kernel.org> >--- >v4: > - fix the comment above ___EXPORT_SYMBOL to be more specific about what > entries are being placed in their respective sections. > > include/asm-generic/export.h | 8 +++++--- > include/linux/export.h | 27 ++++++++++++++++++++------- > > 2 files changed, 25 insertions(+), 10 deletions(-) > >diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h >index afddc5442e92..365345f9a9e3 100644 >--- a/include/asm-generic/export.h >+++ b/include/asm-generic/export.h >@@ -27,9 +27,11 @@ > .endm > > /* >- * note on .section use: @progbits vs %progbits nastiness doesn't matter, >- * since we immediately emit into those sections anyway. >+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >+ * section flag requires it. Use '%progbits' instead of '@progbits' since the >+ * former apparently works on all arches according to the binutils source. > */ >+ > .macro ___EXPORT_SYMBOL name,val,sec > #ifdef CONFIG_MODULES > .section ___ksymtab\sec+\name,"a" >@@ -37,7 +39,7 @@ > __ksymtab_\name: > __put \val, __kstrtab_\name > .previous >- .section __ksymtab_strings,"a" >+ .section __ksymtab_strings,"aMS",%progbits,1 > __kstrtab_\name: > .asciz "\name" > .previous >diff --git a/include/linux/export.h b/include/linux/export.h >index 627841448293..c166d35e3d76 100644 >--- a/include/linux/export.h >+++ b/include/linux/export.h >@@ -82,16 +82,29 @@ struct kernel_symbol { > > #else > >-/* For every exported symbol, place a struct in the __ksymtab section */ >+/* >+ * For every exported symbol, do the following: >+ * >+ * - If applicable, place a CRC entry in the __kcrctab section. >+ * - Put the name of the symbol and namespace (empty string "" for none) in >+ * __ksymtab_strings. >+ * - Place a struct kernel_symbol entry in the __ksymtab section. >+ * >+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >+ * section flag requires it. Use '%progbits' instead of '@progbits' since the >+ * former apparently works on all arches according to the binutils source. >+ */ > #define ___EXPORT_SYMBOL(sym, sec, ns) \ > extern typeof(sym) sym; \ >+ extern const char __kstrtab_##sym[]; \ >+ extern const char __kstrtabns_##sym[]; \ > __CRC_SYMBOL(sym, sec); \ >- static const char __kstrtab_##sym[] \ >- __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ >- = #sym; \ >- static const char __kstrtabns_##sym[] \ >- __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ >- = ns; \ >+ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1\n" \ >+ "__kstrtab_" #sym ": \n" \ >+ " .asciz \"" #sym "\" \n" \ >+ "__kstrtabns_" #sym ": \n" \ >+ " .asciz \"" ns "\" \n" \ >+ " .previous \n"); \ nit: You might want to align the newline characters up to the asm line. Thanks for working on this! Reviewed-by: Matthias Maennich <maennich@google.com> Cheers, Matthias > __KSYMTAB_ENTRY(sym, sec) > > #endif >-- >2.16.4 > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v4] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags 2019-12-12 14:29 ` Matthias Maennich @ 2019-12-16 9:41 ` Jessica Yu 0 siblings, 0 replies; 23+ messages in thread From: Jessica Yu @ 2019-12-16 9:41 UTC (permalink / raw) To: Matthias Maennich Cc: linux-kernel, Masahiro Yamada, Rasmus Villemoes, Arnd Bergmann, Greg Kroah-Hartman +++ Matthias Maennich [12/12/19 14:29 +0000]: >On Thu, Dec 12, 2019 at 03:16:13PM +0100, Jessica Yu wrote: >>Commit c3a6cf19e695 ("export: avoid code duplication in >>include/linux/export.h") refactors export.h quite nicely, but introduces >>a slight increase in memory usage due to using the empty string "" >>instead of NULL to indicate that an exported symbol has no namespace. As >>mentioned in that commit, this meant an increase of 1 byte per exported >>symbol without a namespace. For example, if a kernel configuration has >>about 10k exported symbols, this would mean that the size of >>__ksymtab_strings would increase by roughly 10kB. >> >>We can alleviate this situation by utilizing the SHF_MERGE and >>SHF_STRING section flags. SHF_MERGE|SHF_STRING indicate to the linker >>that the data in the section are null-terminated strings that can be >>merged to eliminate duplication. More specifically, from the binutils >>documentation - "for sections with both M and S, a string which is a >>suffix of a larger string is considered a duplicate. Thus "def" will be >>merged with "abcdef"; A reference to the first "def" will be changed to >>a reference to "abcdef"+3". Thus, all the empty strings would be merged >>as well as any strings that can be merged according to the cited method >>above. For example, "memset" and "__memset" would be merged to just >>"__memset" in __ksymtab_strings. >> >>As of v5.4-rc5, the following statistics were gathered with x86 >>defconfig with approximately 10.7k exported symbols. >> >>Size of __ksymtab_strings in vmlinux: >>------------------------------------- >>v5.4-rc5: 213834 bytes >>v5.4-rc5 with commit c3a6cf19e695: 224455 bytes >>v5.4-rc5 with this patch: 205759 bytes >> >>So, we already see memory savings of ~8kB compared to vanilla -rc5 and >>savings of nearly 18.7kB compared to -rc5 with commit c3a6cf19e695 on top. >> >>Unfortunately, as of this writing, strings will not get deduplicated for >>kernel modules, as ld does not do the deduplication for >>SHF_MERGE|SHF_STRINGS sections for relocatable files (ld -r), which >>kernel modules are. A patch for ld is currently being worked on to >>hopefully allow for string deduplication in relocatable files in the >>future. >> >>Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> >>Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> >>Signed-off-by: Jessica Yu <jeyu@kernel.org> >>--- >>v4: >> - fix the comment above ___EXPORT_SYMBOL to be more specific about what >> entries are being placed in their respective sections. >> >>include/asm-generic/export.h | 8 +++++--- >>include/linux/export.h | 27 ++++++++++++++++++++------- >> >>2 files changed, 25 insertions(+), 10 deletions(-) >> >>diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h >>index afddc5442e92..365345f9a9e3 100644 >>--- a/include/asm-generic/export.h >>+++ b/include/asm-generic/export.h >>@@ -27,9 +27,11 @@ >>.endm >> >>/* >>- * note on .section use: @progbits vs %progbits nastiness doesn't matter, >>- * since we immediately emit into those sections anyway. >>+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >>+ * section flag requires it. Use '%progbits' instead of '@progbits' since the >>+ * former apparently works on all arches according to the binutils source. >> */ >>+ >>.macro ___EXPORT_SYMBOL name,val,sec >>#ifdef CONFIG_MODULES >> .section ___ksymtab\sec+\name,"a" >>@@ -37,7 +39,7 @@ >>__ksymtab_\name: >> __put \val, __kstrtab_\name >> .previous >>- .section __ksymtab_strings,"a" >>+ .section __ksymtab_strings,"aMS",%progbits,1 >>__kstrtab_\name: >> .asciz "\name" >> .previous >>diff --git a/include/linux/export.h b/include/linux/export.h >>index 627841448293..c166d35e3d76 100644 >>--- a/include/linux/export.h >>+++ b/include/linux/export.h >>@@ -82,16 +82,29 @@ struct kernel_symbol { >> >>#else >> >>-/* For every exported symbol, place a struct in the __ksymtab section */ >>+/* >>+ * For every exported symbol, do the following: >>+ * >>+ * - If applicable, place a CRC entry in the __kcrctab section. >>+ * - Put the name of the symbol and namespace (empty string "" for none) in >>+ * __ksymtab_strings. >>+ * - Place a struct kernel_symbol entry in the __ksymtab section. >>+ * >>+ * note on .section use: we specify progbits since usage of the "M" (SHF_MERGE) >>+ * section flag requires it. Use '%progbits' instead of '@progbits' since the >>+ * former apparently works on all arches according to the binutils source. >>+ */ >>#define ___EXPORT_SYMBOL(sym, sec, ns) \ >> extern typeof(sym) sym; \ >>+ extern const char __kstrtab_##sym[]; \ >>+ extern const char __kstrtabns_##sym[]; \ >> __CRC_SYMBOL(sym, sec); \ >>- static const char __kstrtab_##sym[] \ >>- __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ >>- = #sym; \ >>- static const char __kstrtabns_##sym[] \ >>- __attribute__((section("__ksymtab_strings"), used, aligned(1))) \ >>- = ns; \ >>+ asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1\n" \ >>+ "__kstrtab_" #sym ": \n" \ >>+ " .asciz \"" #sym "\" \n" \ >>+ "__kstrtabns_" #sym ": \n" \ >>+ " .asciz \"" ns "\" \n" \ >>+ " .previous \n"); \ > >nit: You might want to align the newline characters up to the asm line. > >Thanks for working on this! > >Reviewed-by: Matthias Maennich <maennich@google.com> I've fixed up the newlines and applied to modules-next. Thanks! Jessica ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2019-12-16 9:41 UTC | newest] Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-11-20 14:51 [PATCH] export.h: reduce __ksymtab_strings string duplication by using "MS" section flags Jessica Yu 2019-11-20 15:02 ` Jessica Yu 2019-11-21 10:51 ` Rasmus Villemoes 2019-11-21 16:09 ` Jessica Yu 2019-11-21 16:53 ` Will Deacon 2019-11-22 11:44 ` Masahiro Yamada 2019-11-25 9:29 ` Rasmus Villemoes 2019-11-25 9:45 ` Jessica Yu 2019-11-25 15:42 ` [PATCH v2] " Jessica Yu 2019-11-26 8:32 ` Masahiro Yamada 2019-11-26 9:55 ` Jessica Yu 2019-11-26 13:56 ` Matthias Maennich 2019-11-26 15:31 ` Jessica Yu 2019-11-26 16:02 ` Matthias Maennich 2019-11-26 16:12 ` Masahiro Yamada 2019-11-26 16:48 ` Jessica Yu 2019-12-05 16:42 ` Matthias Maennich 2019-12-06 12:41 ` [PATCH v3] " Jessica Yu 2019-12-12 6:22 ` Masahiro Yamada 2019-12-12 10:36 ` Jessica Yu 2019-12-12 14:16 ` [PATCH v4] " Jessica Yu 2019-12-12 14:29 ` Matthias Maennich 2019-12-16 9:41 ` Jessica Yu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).