From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: git@vger.kernel.org
Cc: "René Scharfe" <l.s.r@web.de>,
"Derrick Stolee" <stolee@gmail.com>,
"SZEDER Gábor" <szeder.dev@gmail.com>
Subject: [PoC] coccinelle: make Coccinelle-related make targets more fine-grained
Date: Thu, 2 Aug 2018 13:55:22 +0200 [thread overview]
Message-ID: <20180802115522.16107-1-szeder.dev@gmail.com> (raw)
In-Reply-To: <20180723135100.24288-1-szeder.dev@gmail.com>
So, I have this PoC patch below on top of this patch series for some
months now; actually this series was all fallout from working on this
patch. It makes 'make -j<N> coccicheck' much faster (speedup of over
99% :) in some scenarios that, in my experience, occur rather
frequently when working on Coccinelle semantic patches (as opposed to
occasionally running 'make coccicheck' to see whether there are any
regressions).
However, it's not ready for inclusion as it is, because,
unfortunately, the improvements are not without some disadvantages, as
explained in the second half of the commit message below.
Anyway, I'm sending this out, because I don't see how I could make it
any better, but other contributors working on semantic patches might
still find it useful to keep this in their own fork, even in its
current form. And perhaps someone will come up with a brilliant idea
to address those disadvantages...
Also available at:
https://github.com/szeder/git make-coccicheck-finegrained
-- >8 --
Subject: [PATCH] [PoC] coccinelle: make Coccinelle-related make targets more fine-grained
When running 'make coccicheck', each semantic patch is applied to all
source files in a separate shell for loop. This approach has the
following disadvantages:
1. Even if you only modified a single C source file, that shell loop
will still iterate over all source files and apply the semantic
patches to all of them, wasting a lot of time.
2. If you apply only a single semantic patch (either implicitly,
after you modified only one of them (and the results of the
previous 'make coccicheck' are still there), or explicitly, by
running e.g. 'make contrib/coccinelle/array.cocci.patch'), then
you won't be able to exploit multiple CPU cores to speed up the
operation, because that shell loop will iterate over all source
files sequentially.
3. 'make coccicheck' can only use as many parallel jobs as the
number of semantic patches, so even if you have more available
CPU cores than that, you can't exploit them all to speed up this
operation.
4. During 'make coccicheck' there is only a single
SPATCH contrib/coccinelle/<whatever>.cocci
line (or as many lines as the number of parallel jobs) of output
when starting to apply each semantic patch, and then comes a long
silence without any indication of progress, because applying some
of our semantic patches to all source files can take over a
minute. This can trick developers new to semantic patches into
thinking that something went wrong, hung, ended up in an endless
loop, etc. It certainly confused my the first time.
Let's add a bit of Makefile metaprogramming to generate finer-grained
make targets applying one semantic patch to only a single source file,
and specify these as dependencies of the targets applying one semantic
patch to all source files. This way that shell loop can be avoided,
semantic patches will only be applied to changed source files, and the
same semantic patch can be applied in parallel to multiple source
files. The only remaining sequential part is aggregating the
suggested transformations from the individual targets into a single
patch file, which is comparatively cheap (especially since ideally
there aren't any suggestions).
This change brings spectacular speedup in the scenario described in
point (1) above. When the results of a previous 'make coccicheck' are
there, the time needed to run
touch apply.c ; time make -j4 coccicheck
went from 3m42s to 1.848s, which is just over 99% speedup, yay!, and
'apply.c' is the second longest source file in our codebase. In the
scenario in point (2), running
touch contrib/coccinelle/array.cocci ; time make -j4 coccicheck
went from 56.364s to 35.883s, which is ~36% speedup.
All the above timings are best-of-five on a machine with 2 physical
and 2 logical cores. I don't have the hardware to bring any numbers
for point (3). The time needed to run 'make -j4 coccicheck' in a
clean state didn't change, it's ~3m42s both with and without this
patch.
Unfortunately, this new approach has some disadvantages compared to
the current situation:
- [RFC]
With this patch 'make coccicheck's output will look like this
('make' apparently doesn't iterate over semantic patches in
lexicographical order):
SPATCH commit.cocci abspath.c
SPATCH commit.cocci advice.c
<... lots of lines ...>
SPATCH array.cocci http-walker.c
SPATCH array.cocci remote-curl.c
which means that the number of lines in the output grows from
"Nr-of-semantic-patches" to "Nr-of-semantic-patches *
Nr-of-source-files".
Now, while this certainly addresses point (4) above, it can be
considered too much, and I'm not sure that (4) is that big an
issue to justify this much output.
OTOH, I couldn't yet figure out a way to print a single 'SPATCH
contrib/coccinelle/<whatever>.cocci' line at the beginning of
applying that semantic patch without triggering unnecessary work,
effectively killing most of the runtime benefits. Or to somehow
iterate over source files in the outer loop and over semantic
patches in the inner loop, so we could have one output line per
file.
- [RFC]
The overhead of applying a semantic patch to all source files
became larger. 'make coccicheck' currently runs only one shell
process and creates two output files for each semantic patch.
With this patch it will run approx. "Nr-of-semantic-patches *
Nr-of-source-files" shell processes and create twice as many
output files.
This overhead can be quantified by measuring the runtime of:
make -j4 SPATCH=true coccicheck
and this patch increases it from 0.142s to 1.479s. While this is
indeed an order of magnitude increase, it's still negligible
compared to the "real" runtime of 3m42s.
So the increased overhead doesn't seem to matter on Linux, but I
would expect that it's worse on OSX and Windows; though I'm not
sure whether Coccinelle (with all its OCaml dependencies) works on
those platforms in the first place.
- [RFC]
This approach uses $(eval), which we haven't used in any of our
Makefiles yet. I wonder whether it's portable enough. And it's
ugly and complicated.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
Makefile | 52 ++++++++++++++++++++++++-----------
contrib/coccinelle/.gitignore | 3 +-
2 files changed, 38 insertions(+), 17 deletions(-)
diff --git a/Makefile b/Makefile
index d616c04125..f516dd93d1 100644
--- a/Makefile
+++ b/Makefile
@@ -2674,25 +2674,44 @@ COCCI_SOURCES = $(filter-out sha1collisiondetection/%,$(C_SOURCES))
else
COCCI_SOURCES = $(filter-out sha1dc/%,$(C_SOURCES))
endif
+COCCI_SEM_PATCHES = $(wildcard contrib/coccinelle/*.cocci)
-%.cocci.patch: %.cocci $(COCCI_SOURCES)
- @echo ' ' SPATCH $<; \
- ret=0; \
- for f in $(COCCI_SOURCES); do \
- $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
- { ret=$$?; break; }; \
- done >$@+ 2>$@.log; \
- if test $$ret != 0; \
+define cocci_template
+$(cocci_sem_patch)_dirs := $$(addprefix $(cocci_sem_patch).patches/,$$(sort $$(dir $$(COCCI_SOURCES))))
+
+$$($(cocci_sem_patch)_dirs):
+ @mkdir -p $$@
+
+# e.g. 'contrib/coccinelle/strbuf.cocci.patches/builtin/commit.c.patch'
+# Applies one semantic patch to _one_ source file.
+$(cocci_sem_patch).patches/%.patch: % $(cocci_sem_patch)
+ @printf ' SPATCH %-25s %s\n' $$(notdir $(cocci_sem_patch)) $$<; \
+ if ! $$(SPATCH) --sp-file $(cocci_sem_patch) $$< $$(SPATCH_FLAGS) >$$@ 2>$$@.log; \
then \
- cat $@.log; \
+ rm -f $$@; \
+ cat $$@.log; \
exit 1; \
- fi; \
- mv $@+ $@; \
- if test -s $@; \
- then \
- echo ' ' SPATCH result: $@; \
fi
-coccicheck: $(addsuffix .patch,$(wildcard contrib/coccinelle/*.cocci))
+
+# e.g. 'contrib/coccinelle/strbuf.cocci.patch'
+# Applies one semantic patch to _all_ source files.
+$(cocci_sem_patch).patch: $(cocci_sem_patch) $$($(cocci_sem_patch)_dirs) $$(patsubst %,$(cocci_sem_patch).patches/%.patch,$(COCCI_SOURCES))
+ @>$$@+; \
+ for f in $$(filter %.patch,$$^); do \
+ if test -s $$$$f; \
+ then \
+ cat $$$$f >>$$@+; \
+ fi \
+ done; \
+ mv $$@+ $$@; \
+ if test -s $$@; then \
+ echo ' ' SPATCH result: $$@; \
+ fi
+endef
+
+$(foreach cocci_sem_patch,$(COCCI_SEM_PATCHES),$(eval $(cocci_template)))
+
+coccicheck: $(addsuffix .patch,$(COCCI_SEM_PATCHES))
.PHONY: coccicheck
@@ -2907,7 +2926,8 @@ profile-clean:
$(RM) $(addsuffix *.gcno,$(addprefix $(PROFILE_DIR)/, $(object_dirs)))
cocciclean:
- $(RM) contrib/coccinelle/*.cocci.patch*
+ $(RM) contrib/coccinelle/*.cocci.patch
+ $(RM) -r contrib/coccinelle/*.cocci.patches/
clean: profile-clean coverage-clean cocciclean
$(RM) *.res
diff --git a/contrib/coccinelle/.gitignore b/contrib/coccinelle/.gitignore
index d3f29646dc..7ae6ffa983 100644
--- a/contrib/coccinelle/.gitignore
+++ b/contrib/coccinelle/.gitignore
@@ -1 +1,2 @@
-*.patch*
+*.cocci.patch
+*.cocci.patches/
--
2.18.0.408.g42635c01bc
next prev parent reply other threads:[~2018-08-02 11:55 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-07-23 13:50 [PATCH 0/5] Misc Coccinelle-related improvements SZEDER Gábor
2018-07-23 13:50 ` [PATCH 1/5] coccinelle: mark the 'coccicheck' make target as .PHONY SZEDER Gábor
2018-07-23 14:36 ` Derrick Stolee
2018-07-23 19:37 ` Junio C Hamano
2018-07-23 13:50 ` [PATCH 2/5] coccinelle: use $(addsuffix) in 'coccicheck' make target SZEDER Gábor
2018-07-23 19:37 ` Junio C Hamano
2018-07-23 13:50 ` [PATCH 3/5] coccinelle: exclude sha1dc source files from static analysis SZEDER Gábor
2018-07-23 18:27 ` Eric Sunshine
2018-07-23 18:43 ` SZEDER Gábor
2018-07-23 18:57 ` Eric Sunshine
2018-07-23 13:50 ` [PATCH 4/5] coccinelle: put sane filenames into output patches SZEDER Gábor
2018-07-23 14:34 ` Derrick Stolee
2018-07-23 13:51 ` [PATCH 5/5] coccinelle: extract dedicated make target to clean Coccinelle's results SZEDER Gábor
2018-07-23 14:38 ` [PATCH 0/5] Misc Coccinelle-related improvements Derrick Stolee
2018-07-23 15:29 ` Duy Nguyen
2018-07-23 16:30 ` René Scharfe
2018-07-23 19:40 ` Junio C Hamano
2018-08-02 11:55 ` SZEDER Gábor [this message]
2018-08-02 13:24 ` [PoC] coccinelle: make Coccinelle-related make targets more fine-grained René Scharfe
2018-08-02 18:01 ` Jeff King
2018-08-02 18:31 ` Jeff King
2018-08-03 6:21 ` Jonathan Nieder
2018-08-03 13:08 ` Jeff King
2018-08-05 23:02 ` Jonathan Nieder
2018-08-02 19:46 ` Eric Sunshine
2018-08-02 21:29 ` Jeff King
2018-08-02 21:45 ` Ævar Arnfjörð Bjarmason
2018-08-03 6:22 ` Julia Lawall
2018-08-03 6:22 ` [Cocci] " Julia Lawall
2018-08-03 6:44 ` Jonathan Nieder
2018-08-03 6:44 ` [Cocci] " Jonathan Nieder
2018-08-03 6:52 ` Julia Lawall
2018-08-03 6:52 ` [Cocci] " Julia Lawall
2018-08-03 6:25 ` Julia Lawall
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180802115522.16107-1-szeder.dev@gmail.com \
--to=szeder.dev@gmail.com \
--cc=git@vger.kernel.org \
--cc=l.s.r@web.de \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.