linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/2] kbuild: Cache exploratory calls to the compiler
@ 2017-10-11 16:19 Douglas Anderson
  2017-10-11 16:19 ` [PATCH v3 1/2] kbuild: Add a cache for generated variables Douglas Anderson
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Douglas Anderson @ 2017-10-11 16:19 UTC (permalink / raw)
  To: yamada.masahiro
  Cc: groeck, sjg, briannorris, mingo, Douglas Anderson,
	Matthias Kaehlcke, Cao jin, Arnd Bergmann, James Hogan,
	linux-kbuild, Michal Marek, linux-kernel, Mark Charlebois

This two-patch series attempts to speed incremental builds of the
kernel up by a bit.  How much of a speedup you get depends a lot on
your environment, specifically the speed of your workstation and how
fast it takes to invoke the compiler.

In the Chrome OS build environment you get a really big win.  For an
incremental build (via emerge) I measured a speedup from ~1 minute to
~35 seconds.  ...but Chrome OS calls the compiler through a number of
wrapper scripts and also calls the kernel make at least twice for an
emerge (during compile stage and install stage), so it's a bit of a
worst case.

Perhaps a more realistic measure of the speedup others might see is
running "time make help > /dev/null" outside of the Chrome OS build
environment on my system.  When I do this I see that it took more than
1.0 seconds before and less than 0.2 seconds after.  So presumably
this has the ability to shave ~0.8 seconds off an incremental build
for most folks out there.  While 0.8 seconds savings isn't huge, it
does make incremental builds feel a lot snappier.

Ingo Molnar also did some testing of this in his environment and found
that an incremental build of his subsystem sped up from ~.44 seconds
before to ~.15 seconds after.  Clean builds also sped up by a marginal
amount.  :)

Changes in v3:
- Rule to prevent make from trying to generate the cache
- Rule to clean .cache.mk
- No more doc changes
- Moved cache stuff below cc-cross-prefix
- Removed duplicate documentation of try-run (oops)
- Add Tested-by for Ingo and Guenter since v2 and v3 are very similar

Changes in v2:
- Abstract at a different level (like shell-cached) per Masahiro Yamada
- Include ld-version, which I missed the first time

Douglas Anderson (2):
  kbuild: Add a cache for generated variables
  kbuild: Cache a few more calls to the compiler

 Makefile               |  5 +--
 scripts/Kbuild.include | 87 ++++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 76 insertions(+), 16 deletions(-)

-- 
2.15.0.rc0.271.g36b669edcc-goog

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v3 1/2] kbuild: Add a cache for generated variables
  2017-10-11 16:19 [PATCH v3 0/2] kbuild: Cache exploratory calls to the compiler Douglas Anderson
@ 2017-10-11 16:19 ` Douglas Anderson
  2017-10-11 16:19 ` [PATCH v3 2/2] kbuild: Cache a few more calls to the compiler Douglas Anderson
  2017-10-15  7:48 ` [PATCH v3 0/2] kbuild: Cache exploratory " Masahiro Yamada
  2 siblings, 0 replies; 4+ messages in thread
From: Douglas Anderson @ 2017-10-11 16:19 UTC (permalink / raw)
  To: yamada.masahiro
  Cc: groeck, sjg, briannorris, mingo, Douglas Anderson,
	Matthias Kaehlcke, Cao jin, Arnd Bergmann, James Hogan,
	linux-kbuild, Michal Marek, linux-kernel, Mark Charlebois

While timing a "no-op" build of the kernel (incrementally building the
kernel even though nothing changed) in the Chrome OS build system I
found that it was much slower than I expected.

Digging into things a bit, I found that quite a bit of the time was
spent invoking the C compiler even though we weren't actually building
anything.  Currently in the Chrome OS build system the C compiler is
called through a number of wrappers (one of which is written in
python!) and can take upwards of 100 ms to invoke even if we're not
doing anything difficult, so these invocations of the compiler were
taking a lot of time.  Worse the invocations couldn't seem to take
advantage of the multiple cores on my system.

Certainly it seems like we could make the compiler invocations in the
Chrome OS build system faster, but only to a point.  Inherently
invoking a program as big as a C compiler is a fairly heavy
operation.  Thus even if we can speed the compiler calls it made sense
to track down what was happening.

It turned out that all the compiler invocations were coming from
usages like this in the kernel's Makefile:

KBUILD_CFLAGS += $(call cc-option,-fno-delete-null-pointer-checks,)

Due to the way cc-option and similar statements work the above
contains an implicit call to the C compiler.  ...and due to the fact
that we're storing the result in KBUILD_CFLAGS, a simply expanded
variable, the call will happen every time the Makefile is parsed, even
if there are no users of KBUILD_CFLAGS.

Rather than redoing this computation every time, it makes a lot of
sense to cache the result of all of the Makefile's compiler calls just
like we do when we compile a ".c" file to a ".o" file.  Conceptually
this is quite a simple idea.  ...and since the calls to invoke the
compiler and similar tools are centrally located in the Kbuild.include
file this doesn't even need to be super invasive.

Implementing the cache in a simple-to-use and efficient way is not
quite as simple as it first sounds, though.  To get maximum speed we
really want the cache in a format that make can natively understand
and make doesn't really have an ability to load/parse files. ...but
make _can_ import other Makefiles, so the solution is to store the
cache in Makefile format.  This requires coming up with a valid/unique
Makefile variable name for each value to be cached, but that's
solvable with some cleverness.

After this change, we'll automatically create a ".cache.mk" file that
will contain our cached variables.  We'll load this on each invocation
of make and will avoid recomputing anything that's already in our
cache.  The cache is stored in a format that it shouldn't need any
invalidation since anything that might change should affect the "key"
and any old cached value won't be used.

NOTE: This change requires
<https://patchwork.kernel.org/patch/9998651/> ("kbuild: add forward
declaration of default target to Makefile.asm-generic") for
correctness.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
---

Changes in v3:
- Rule to prevent make from trying to generate the cache
- Rule to clean .cache.mk
- No more doc changes
- Moved cache stuff below cc-cross-prefix
- Removed duplicate documentation of try-run (oops)
- Add Tested-by for Ingo and Guenter since v2 and v3 are very similar

Changes in v2:
- Abstract at a different level (like shell-cached) per Masahiro Yamada
- Include ld-version, which I missed the first time

 Makefile               |  1 +
 scripts/Kbuild.include | 87 ++++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 74 insertions(+), 14 deletions(-)

diff --git a/Makefile b/Makefile
index 2835863bdd5a..29c06eacf491 100644
--- a/Makefile
+++ b/Makefile
@@ -1550,6 +1550,7 @@ clean: $(clean-dirs)
 		-o -name '.*.d' -o -name '.*.tmp' -o -name '*.mod.c' \
 		-o -name '*.symtypes' -o -name 'modules.order' \
 		-o -name modules.builtin -o -name '.tmp_*.o.*' \
+		-o -name .cache.mk \
 		-o -name '*.c.[012]*.*' \
 		-o -name '*.ll' \
 		-o -name '*.gcno' \) -type f -print | xargs rm -f
diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 9ffd3dda3889..4203fff8d4c3 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -8,6 +8,8 @@ squote  := '
 empty   :=
 space   := $(empty) $(empty)
 space_escape := _-_SPACE_-_
+right_paren := )
+left_paren := (
 
 ###
 # Name of target with a '.' as filename prefix. foo/bar.o => foo/.bar.o
@@ -80,6 +82,57 @@ cc-cross-prefix =  \
 			echo $(c);                                    \
 		fi)))
 
+# Tools for caching Makefile variables that are "expensive" to compute.
+#
+# Here we want to help deal with variables that take a long time to compute
+# by making it easy to store these variables in a cache.
+#
+# The canonical example here is testing for compiler flags.  On a simple system
+# each call to the compiler takes 10 ms, but on a system with a compiler that's
+# called through various wrappers it can take upwards of 100 ms.  If we have
+# 100 calls to the compiler this can take 1 second (on a simple system) or 10
+# seconds (on a complicated system).
+#
+# The "cache" will be in Makefile syntax and can be directly included.
+# Any time we try to reference a variable that's not in the cache we'll
+# calculate it and store it in the cache for next time.
+
+# Include values from last time
+make-cache := $(if $(KBUILD_EXTMOD),$(KBUILD_EXTMOD)/,$(if $(obj),$(obj)/)).cache.mk
+$(make-cache): ;
+-include $(make-cache)
+
+# Usage: $(call __sanitize-opt,Hello=Hola$(comma)Goodbye Adios)
+#
+# Convert all '$', ')', '(', '\', '=', ' ', ',', ':' to '_'
+__sanitize-opt = $(subst $$,_,$(subst $(right_paren),_,$(subst $(left_paren),_,$(subst \,_,$(subst =,_,$(subst $(space),_,$(subst $(comma),_,$(subst :,_,$(1)))))))))
+
+# Usage:   $(call shell-cached,shell_command)
+# Example: $(call shell-cached,md5sum /usr/bin/gcc)
+#
+# If we've already seen a call to this exact shell command (even in a
+# previous invocation of make!) we'll return the value.  If not, we'll
+# compute it and store the result for future runs.
+#
+# This is a bit of voodoo, but basic explanation is that if the variable
+# was undefined then we'll evaluate the shell command and store the result
+# into the variable.  We'll then store that value in the cache and finally
+# output the value.
+#
+# NOTE: The $$(2) here isn't actually a parameter to __run-and-store.  We
+# happen to know that the caller will have their shell command in $(2) so the
+# result of "call"ing this will produce a reference to that $(2).  The reason
+# for this strangeness is to avoid an extra level of eval (and escaping) of
+# $(2).
+define __run-and-store
+ifeq ($(origin $(1)),undefined)
+  $$(eval $(1) := $$(shell $$(2)))
+  $$(shell echo '$(1) := $$($(1))' >> $(make-cache))
+endif
+endef
+__shell-cached = $(eval $(call __run-and-store,$(1)))$($(1))
+shell-cached = $(call __shell-cached,__cached_$(call __sanitize-opt,$(1)),$(1))
+
 # output directory for tests below
 TMPOUT := $(if $(KBUILD_EXTMOD),$(firstword $(KBUILD_EXTMOD))/)
 
@@ -87,30 +140,36 @@ TMPOUT := $(if $(KBUILD_EXTMOD),$(firstword $(KBUILD_EXTMOD))/)
 # Usage: option = $(call try-run, $(CC)...-o "$$TMP",option-ok,otherwise)
 # Exit code chooses option. "$$TMP" serves as a temporary file and is
 # automatically cleaned up.
-try-run = $(shell set -e;		\
+__try-run = set -e;			\
 	TMP="$(TMPOUT).$$$$.tmp";	\
 	TMPO="$(TMPOUT).$$$$.o";	\
 	if ($(1)) >/dev/null 2>&1;	\
 	then echo "$(2)";		\
 	else echo "$(3)";		\
 	fi;				\
-	rm -f "$$TMP" "$$TMPO")
+	rm -f "$$TMP" "$$TMPO"
+
+try-run = $(shell $(__try-run))
+
+# try-run-cached
+# This works like try-run, but the result is cached.
+try-run-cached = $(call shell-cached,$(__try-run))
 
 # as-option
 # Usage: cflags-y += $(call as-option,-Wa$(comma)-isa=foo,)
 
-as-option = $(call try-run,\
+as-option = $(call try-run-cached,\
 	$(CC) $(KBUILD_CFLAGS) $(1) -c -x assembler /dev/null -o "$$TMP",$(1),$(2))
 
 # as-instr
 # Usage: cflags-y += $(call as-instr,instr,option1,option2)
 
-as-instr = $(call try-run,\
+as-instr = $(call try-run-cached,\
 	printf "%b\n" "$(1)" | $(CC) $(KBUILD_AFLAGS) -c -x assembler -o "$$TMP" -,$(2),$(3))
 
 # __cc-option
 # Usage: MY_CFLAGS += $(call __cc-option,$(CC),$(MY_CFLAGS),-march=winchip-c6,-march=i586)
-__cc-option = $(call try-run,\
+__cc-option = $(call try-run-cached,\
 	$(1) -Werror $(2) $(3) -c -x c /dev/null -o "$$TMP",$(3),$(4))
 
 # Do not attempt to build with gcc plugins during cc-option tests.
@@ -130,23 +189,23 @@ hostcc-option = $(call __cc-option, $(HOSTCC),\
 
 # cc-option-yn
 # Usage: flag := $(call cc-option-yn,-march=winchip-c6)
-cc-option-yn = $(call try-run,\
+cc-option-yn = $(call try-run-cached,\
 	$(CC) -Werror $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) $(1) -c -x c /dev/null -o "$$TMP",y,n)
 
 # cc-disable-warning
 # Usage: cflags-y += $(call cc-disable-warning,unused-but-set-variable)
-cc-disable-warning = $(call try-run,\
+cc-disable-warning = $(call try-run-cached,\
 	$(CC) -Werror $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) -W$(strip $(1)) -c -x c /dev/null -o "$$TMP",-Wno-$(strip $(1)))
 
 # cc-name
 # Expands to either gcc or clang
-cc-name = $(shell $(CC) -v 2>&1 | grep -q "clang version" && echo clang || echo gcc)
+cc-name = $(call shell-cached,$(CC) -v 2>&1 | grep -q "clang version" && echo clang || echo gcc)
 
 # cc-version
-cc-version = $(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-version.sh $(CC))
+cc-version = $(call shell-cached,$(CONFIG_SHELL) $(srctree)/scripts/gcc-version.sh $(CC))
 
 # cc-fullversion
-cc-fullversion = $(shell $(CONFIG_SHELL) \
+cc-fullversion = $(call shell-cached,$(CONFIG_SHELL) \
 	$(srctree)/scripts/gcc-version.sh -p $(CC))
 
 # cc-ifversion
@@ -159,22 +218,22 @@ cc-if-fullversion = $(shell [ $(cc-fullversion) $(1) $(2) ] && echo $(3) || echo
 
 # cc-ldoption
 # Usage: ldflags += $(call cc-ldoption, -Wl$(comma)--hash-style=both)
-cc-ldoption = $(call try-run,\
+cc-ldoption = $(call try-run-cached,\
 	$(CC) $(1) -nostdlib -x c /dev/null -o "$$TMP",$(1),$(2))
 
 # ld-option
 # Usage: LDFLAGS += $(call ld-option, -X)
-ld-option = $(call try-run,\
+ld-option = $(call try-run-cached,\
 	$(CC) -x c /dev/null -c -o "$$TMPO" ; $(LD) $(1) "$$TMPO" -o "$$TMP",$(1),$(2))
 
 # ar-option
 # Usage: KBUILD_ARFLAGS := $(call ar-option,D)
 # Important: no spaces around options
-ar-option = $(call try-run, $(AR) rc$(1) "$$TMP",$(1),$(2))
+ar-option = $(call try-run-cached, $(AR) rc$(1) "$$TMP",$(1),$(2))
 
 # ld-version
 # Note this is mainly for HJ Lu's 3 number binutil versions
-ld-version = $(shell $(LD) --version | $(srctree)/scripts/ld-version.sh)
+ld-version = $(call shell-cached,$(LD) --version | $(srctree)/scripts/ld-version.sh)
 
 # ld-ifversion
 # Usage:  $(call ld-ifversion, -ge, 22252, y)
-- 
2.15.0.rc0.271.g36b669edcc-goog

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v3 2/2] kbuild: Cache a few more calls to the compiler
  2017-10-11 16:19 [PATCH v3 0/2] kbuild: Cache exploratory calls to the compiler Douglas Anderson
  2017-10-11 16:19 ` [PATCH v3 1/2] kbuild: Add a cache for generated variables Douglas Anderson
@ 2017-10-11 16:19 ` Douglas Anderson
  2017-10-15  7:48 ` [PATCH v3 0/2] kbuild: Cache exploratory " Masahiro Yamada
  2 siblings, 0 replies; 4+ messages in thread
From: Douglas Anderson @ 2017-10-11 16:19 UTC (permalink / raw)
  To: yamada.masahiro
  Cc: groeck, sjg, briannorris, mingo, Douglas Anderson, Michal Marek,
	linux-kernel, linux-kbuild

These are a few stragglers that I left out of the original patch to
cache calls to the C compiler ("kbuild: Add a cache for generated
variables") because they bleed out into the main Makefile and thus
uglify things a little bit.  The idea is the same here, though.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
---

Changes in v3:
- Add Tested-by for Ingo and Guenter since v2 and v3 are very similar

Changes in v2:
- Abstract at a different level (like shell-cached) per Masahiro Yamada

 Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 29c06eacf491..517d69cf30f8 100644
--- a/Makefile
+++ b/Makefile
@@ -652,7 +652,7 @@ KBUILD_CFLAGS += $(call cc-ifversion, -lt, 0409, \
 KBUILD_CFLAGS	+= $(call cc-option,--param=allow-store-data-races=0)
 
 # check for 'asm goto'
-ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC) $(KBUILD_CFLAGS)), y)
+ifeq ($(call shell-cached,$(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC) $(KBUILD_CFLAGS)), y)
 	KBUILD_CFLAGS += -DCC_HAVE_ASM_GOTO
 	KBUILD_AFLAGS += -DCC_HAVE_ASM_GOTO
 endif
@@ -788,7 +788,7 @@ KBUILD_CFLAGS	+= $(call cc-option,-fdata-sections,)
 endif
 
 # arch Makefile may override CC so keep this after arch Makefile is included
-NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include)
+NOSTDINC_FLAGS += -nostdinc -isystem $(call shell-cached,$(CC) -print-file-name=include)
 CHECKFLAGS     += $(NOSTDINC_FLAGS)
 
 # warn about C99 declaration after statement
-- 
2.15.0.rc0.271.g36b669edcc-goog

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v3 0/2] kbuild: Cache exploratory calls to the compiler
  2017-10-11 16:19 [PATCH v3 0/2] kbuild: Cache exploratory calls to the compiler Douglas Anderson
  2017-10-11 16:19 ` [PATCH v3 1/2] kbuild: Add a cache for generated variables Douglas Anderson
  2017-10-11 16:19 ` [PATCH v3 2/2] kbuild: Cache a few more calls to the compiler Douglas Anderson
@ 2017-10-15  7:48 ` Masahiro Yamada
  2 siblings, 0 replies; 4+ messages in thread
From: Masahiro Yamada @ 2017-10-15  7:48 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Guenter Roeck, Simon Glass, Brian Norris, Ingo Molnar,
	Matthias Kaehlcke, Cao jin, Arnd Bergmann, James Hogan,
	Linux Kbuild mailing list, Michal Marek,
	Linux Kernel Mailing List, Mark Charlebois

2017-10-12 1:19 GMT+09:00 Douglas Anderson <dianders@chromium.org>:
> This two-patch series attempts to speed incremental builds of the
> kernel up by a bit.  How much of a speedup you get depends a lot on
> your environment, specifically the speed of your workstation and how
> fast it takes to invoke the compiler.
>
> In the Chrome OS build environment you get a really big win.  For an
> incremental build (via emerge) I measured a speedup from ~1 minute to
> ~35 seconds.  ...but Chrome OS calls the compiler through a number of
> wrapper scripts and also calls the kernel make at least twice for an
> emerge (during compile stage and install stage), so it's a bit of a
> worst case.
>
> Perhaps a more realistic measure of the speedup others might see is
> running "time make help > /dev/null" outside of the Chrome OS build
> environment on my system.  When I do this I see that it took more than
> 1.0 seconds before and less than 0.2 seconds after.  So presumably
> this has the ability to shave ~0.8 seconds off an incremental build
> for most folks out there.  While 0.8 seconds savings isn't huge, it
> does make incremental builds feel a lot snappier.
>
> Ingo Molnar also did some testing of this in his environment and found
> that an incremental build of his subsystem sped up from ~.44 seconds
> before to ~.15 seconds after.  Clean builds also sped up by a marginal
> amount.  :)
>
> Changes in v3:
> - Rule to prevent make from trying to generate the cache
> - Rule to clean .cache.mk
> - No more doc changes
> - Moved cache stuff below cc-cross-prefix
> - Removed duplicate documentation of try-run (oops)
> - Add Tested-by for Ingo and Guenter since v2 and v3 are very similar
>
> Changes in v2:
> - Abstract at a different level (like shell-cached) per Masahiro Yamada
> - Include ld-version, which I missed the first time
>
> Douglas Anderson (2):
>   kbuild: Add a cache for generated variables
>   kbuild: Cache a few more calls to the compiler
>
>  Makefile               |  5 +--
>  scripts/Kbuild.include | 87 ++++++++++++++++++++++++++++++++++++++++++--------
>  2 files changed, 76 insertions(+), 16 deletions(-)


Series, applied to linux-kbuild/kbuild.


-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-10-15  7:49 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-11 16:19 [PATCH v3 0/2] kbuild: Cache exploratory calls to the compiler Douglas Anderson
2017-10-11 16:19 ` [PATCH v3 1/2] kbuild: Add a cache for generated variables Douglas Anderson
2017-10-11 16:19 ` [PATCH v3 2/2] kbuild: Cache a few more calls to the compiler Douglas Anderson
2017-10-15  7:48 ` [PATCH v3 0/2] kbuild: Cache exploratory " Masahiro Yamada

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).