linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] kbuild: avoid split lines in .mod files
@ 2020-12-03 17:55 Masahiro Yamada
  2020-12-03 18:46 ` Sami Tolvanen
  0 siblings, 1 reply; 3+ messages in thread
From: Masahiro Yamada @ 2020-12-03 17:55 UTC (permalink / raw)
  To: linux-kbuild
  Cc: Kees Cook, Sami Tolvanen, Masahiro Yamada, Michal Marek,
	Nathan Chancellor, Nick Desaulniers, clang-built-linux,
	linux-kernel

"xargs echo" is not a safe way to remove line breaks because the input
may exceed the command line limit and xargs may break it up into
multiple invocations of echo. This should never happen because
scripts/gen_autoksyms.sh expects all undefined symbols are placed in
the second line of .mod files.

One possible way is to replace "xargs echo" with
"sed ':x;N;$!bx;s/\n/ /g'" or something, but I rewrote the code by
using awk because it is more readable.

This issue was reported by Sami Tolvanen; in his Clang LTO patch set,
$(multi-used-m) is no longer an ELF object, but a thin archive that
contains LLVM bitcode files. llvm-nm prints out symbols for each
archive member separately, which results a lot of dupications, in some
places, beyond the system-defined limit.

This problem must be fixed irrespective of LTO, and we must ensure
zero possibility of having this issue.

Link: https://lkml.org/lkml/2020/12/1/1658
Reported-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---

 scripts/Makefile.build | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index ae647379b579..4c058f12dd73 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -252,6 +252,9 @@ objtool_dep = $(objtool_obj)					\
 ifdef CONFIG_TRIM_UNUSED_KSYMS
 cmd_gen_ksymdeps = \
 	$(CONFIG_SHELL) $(srctree)/scripts/gen_ksymdeps.sh $@ >> $(dot-target).cmd
+
+# List module undefined symbols
+undefined_syms = $(NM) $< | $(AWK) '$$1 == "U" { printf("%s%s", x++ ? " " : "", $$2) }';
 endif
 
 define rule_cc_o_c
@@ -271,13 +274,6 @@ define rule_as_o_S
 	$(call cmd,modversions_S)
 endef
 
-# List module undefined symbols (or empty line if not enabled)
-ifdef CONFIG_TRIM_UNUSED_KSYMS
-cmd_undef_syms = $(NM) $< | sed -n 's/^  *U //p' | xargs echo
-else
-cmd_undef_syms = echo
-endif
-
 # Built-in and composite module parts
 $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
 	$(call if_changed_rule,cc_o_c)
@@ -285,7 +281,7 @@ $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
 
 cmd_mod = { \
 	echo $(if $($*-objs)$($*-y)$($*-m), $(addprefix $(obj)/, $($*-objs) $($*-y) $($*-m)), $(@:.mod=.o)); \
-	$(cmd_undef_syms); \
+	$(undefined_syms) echo; \
 	} > $@
 
 $(obj)/%.mod: $(obj)/%.o FORCE
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] kbuild: avoid split lines in .mod files
  2020-12-03 17:55 [PATCH] kbuild: avoid split lines in .mod files Masahiro Yamada
@ 2020-12-03 18:46 ` Sami Tolvanen
  2020-12-16  6:19   ` Masahiro Yamada
  0 siblings, 1 reply; 3+ messages in thread
From: Sami Tolvanen @ 2020-12-03 18:46 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: linux-kbuild, Kees Cook, Michal Marek, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, LKML

Hi Masahiro,

On Thu, Dec 3, 2020 at 9:56 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> "xargs echo" is not a safe way to remove line breaks because the input
> may exceed the command line limit and xargs may break it up into
> multiple invocations of echo. This should never happen because
> scripts/gen_autoksyms.sh expects all undefined symbols are placed in
> the second line of .mod files.
>
> One possible way is to replace "xargs echo" with
> "sed ':x;N;$!bx;s/\n/ /g'" or something, but I rewrote the code by
> using awk because it is more readable.
>
> This issue was reported by Sami Tolvanen; in his Clang LTO patch set,
> $(multi-used-m) is no longer an ELF object, but a thin archive that
> contains LLVM bitcode files. llvm-nm prints out symbols for each
> archive member separately, which results a lot of dupications, in some
> places, beyond the system-defined limit.
>
> This problem must be fixed irrespective of LTO, and we must ensure
> zero possibility of having this issue.
>
> Link: https://lkml.org/lkml/2020/12/1/1658
> Reported-by: Sami Tolvanen <samitolvanen@google.com>
> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
> ---
>
>  scripts/Makefile.build | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index ae647379b579..4c058f12dd73 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -252,6 +252,9 @@ objtool_dep = $(objtool_obj)                                        \
>  ifdef CONFIG_TRIM_UNUSED_KSYMS
>  cmd_gen_ksymdeps = \
>         $(CONFIG_SHELL) $(srctree)/scripts/gen_ksymdeps.sh $@ >> $(dot-target).cmd
> +
> +# List module undefined symbols
> +undefined_syms = $(NM) $< | $(AWK) '$$1 == "U" { printf("%s%s", x++ ? " " : "", $$2) }';
>  endif
>
>  define rule_cc_o_c
> @@ -271,13 +274,6 @@ define rule_as_o_S
>         $(call cmd,modversions_S)
>  endef
>
> -# List module undefined symbols (or empty line if not enabled)
> -ifdef CONFIG_TRIM_UNUSED_KSYMS
> -cmd_undef_syms = $(NM) $< | sed -n 's/^  *U //p' | xargs echo
> -else
> -cmd_undef_syms = echo
> -endif
> -
>  # Built-in and composite module parts
>  $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
>         $(call if_changed_rule,cc_o_c)
> @@ -285,7 +281,7 @@ $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
>
>  cmd_mod = { \
>         echo $(if $($*-objs)$($*-y)$($*-m), $(addprefix $(obj)/, $($*-objs) $($*-y) $($*-m)), $(@:.mod=.o)); \
> -       $(cmd_undef_syms); \
> +       $(undefined_syms) echo; \
>         } > $@
>
>  $(obj)/%.mod: $(obj)/%.o FORCE

Thanks for the patch! I confirmed that this works with llvm-nm and
bitcode files, but it does still produce plenty of duplicates, even
though they now stay on one line. I'm not sure if the readability of
the .mod file matters though. Please feel free to add:

Reviewed-by: Sami Tolvanen <samitolvanen@google.com>

Sami

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] kbuild: avoid split lines in .mod files
  2020-12-03 18:46 ` Sami Tolvanen
@ 2020-12-16  6:19   ` Masahiro Yamada
  0 siblings, 0 replies; 3+ messages in thread
From: Masahiro Yamada @ 2020-12-16  6:19 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: linux-kbuild, Kees Cook, Michal Marek, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, LKML

On Fri, Dec 4, 2020 at 3:46 AM 'Sami Tolvanen' via Clang Built Linux
<clang-built-linux@googlegroups.com> wrote:
>
> Hi Masahiro,
>
> On Thu, Dec 3, 2020 at 9:56 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
> >
> > "xargs echo" is not a safe way to remove line breaks because the input
> > may exceed the command line limit and xargs may break it up into
> > multiple invocations of echo. This should never happen because
> > scripts/gen_autoksyms.sh expects all undefined symbols are placed in
> > the second line of .mod files.
> >
> > One possible way is to replace "xargs echo" with
> > "sed ':x;N;$!bx;s/\n/ /g'" or something, but I rewrote the code by
> > using awk because it is more readable.
> >
> > This issue was reported by Sami Tolvanen; in his Clang LTO patch set,
> > $(multi-used-m) is no longer an ELF object, but a thin archive that
> > contains LLVM bitcode files. llvm-nm prints out symbols for each
> > archive member separately, which results a lot of dupications, in some
> > places, beyond the system-defined limit.
> >
> > This problem must be fixed irrespective of LTO, and we must ensure
> > zero possibility of having this issue.
> >
> > Link: https://lkml.org/lkml/2020/12/1/1658
> > Reported-by: Sami Tolvanen <samitolvanen@google.com>
> > Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
> > ---
> >
> >  scripts/Makefile.build | 12 ++++--------
> >  1 file changed, 4 insertions(+), 8 deletions(-)
> >
> > diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> > index ae647379b579..4c058f12dd73 100644
> > --- a/scripts/Makefile.build
> > +++ b/scripts/Makefile.build
> > @@ -252,6 +252,9 @@ objtool_dep = $(objtool_obj)                                        \
> >  ifdef CONFIG_TRIM_UNUSED_KSYMS
> >  cmd_gen_ksymdeps = \
> >         $(CONFIG_SHELL) $(srctree)/scripts/gen_ksymdeps.sh $@ >> $(dot-target).cmd
> > +
> > +# List module undefined symbols
> > +undefined_syms = $(NM) $< | $(AWK) '$$1 == "U" { printf("%s%s", x++ ? " " : "", $$2) }';
> >  endif
> >
> >  define rule_cc_o_c
> > @@ -271,13 +274,6 @@ define rule_as_o_S
> >         $(call cmd,modversions_S)
> >  endef
> >
> > -# List module undefined symbols (or empty line if not enabled)
> > -ifdef CONFIG_TRIM_UNUSED_KSYMS
> > -cmd_undef_syms = $(NM) $< | sed -n 's/^  *U //p' | xargs echo
> > -else
> > -cmd_undef_syms = echo
> > -endif
> > -
> >  # Built-in and composite module parts
> >  $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
> >         $(call if_changed_rule,cc_o_c)
> > @@ -285,7 +281,7 @@ $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
> >
> >  cmd_mod = { \
> >         echo $(if $($*-objs)$($*-y)$($*-m), $(addprefix $(obj)/, $($*-objs) $($*-y) $($*-m)), $(@:.mod=.o)); \
> > -       $(cmd_undef_syms); \
> > +       $(undefined_syms) echo; \
> >         } > $@
> >
> >  $(obj)/%.mod: $(obj)/%.o FORCE
>
> Thanks for the patch! I confirmed that this works with llvm-nm and
> bitcode files, but it does still produce plenty of duplicates,

Actually, the duplication does not matter
because scripts/gen_autoksyms.sh line 46
calls 'sort -u' anyway.
Only the problem is we have bigger .mod files, though.

We do not have a good reason
to move 'sort -u' for now.




> even
> though they now stay on one line. I'm not sure if the readability of
> the .mod file matters though. Please feel free to add:
>
> Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
>
> Sami
>
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/CABCJKufyBcN-foh0kj5kUsn-wiZMJ_a8ZjB72jaTmN2GEVzVNA%40mail.gmail.com.



-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-12-16  6:21 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-03 17:55 [PATCH] kbuild: avoid split lines in .mod files Masahiro Yamada
2020-12-03 18:46 ` Sami Tolvanen
2020-12-16  6:19   ` Masahiro Yamada

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).