git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"René Scharfe" <l.s.r@web.de>,
	"SZEDER Gábor" <szeder.dev@gmail.com>, "Eric Wong" <e@80x24.org>
Subject: Re: [PATCH] Makefile: fix bugs in coccicheck and speed it up
Date: Fri, 05 Mar 2021 18:20:47 +0100	[thread overview]
Message-ID: <87im65tt3k.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <YEIGzXMDax83cwAx@coredump.intra.peff.net>


On Fri, Mar 05 2021, Jeff King wrote:

I sent out a v2 that should address your feedback in :
https://lore.kernel.org/git/20210305170724.23859-1-avarab@gmail.com/

Just brief notes:

> On Tue, Mar 02, 2021 at 09:51:03PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> What we were doing before was processing all our *.c files, and for
>> each of those *.c files we'd recursively look around for includes and
>> see if we needed to search/replace in those too.
>> 
>> That we did that dates back to [1] when we were only processing *.c
>> files, and it was always very redundant. We'd e.g. visit the likes of
>> strbuf.h lots of times since it's widely used as an include.
>
> OK. I don't offhand know if processing the includes along with the C
> files buys us anything else. Coccinelle's behavior is often quite a
> mystery, but if we think this produces the same results with less time,
> I'm for it.
>
> BTW, this "dates back to when we were only processing *.c files"
> statement confused me a bit. Doesn't that include the current state
> before this patch?  I.e., this hunk:
>
>> -FOUND_C_SOURCES = $(filter %.c,$(shell $(FIND_SOURCE_FILES)))
>> +FOUND_C_SOURCES = $(filter %.c %.h,$(shell $(FIND_SOURCE_FILES)))
>
> seems to be an important part of the change, without which moving away
> from --all-includes would be wrong.

Updated in the new 2/4 commit message.

>> Then in the most recent attempt to optimize coccicheck in [2] this
>> anti-pattern finally turned into a bug.
>>
>> Namely: before this change, if your coccicheck rule applied to
>> e.g. making a change in strbuf.h itself we'd get *lots* of duplicate
>> hunks applying the exact same change, as concurrent spatch processes
>> invoked by xargs raced one another. In one instance I ended up with 27
>> copies of the same hunk in a strbuf.patch.
>> 
>> Setting SPATCH_BATCH_SIZE=0 and processing all the files in one giant
>> batch mitigated this. I suspect the author of [2] either mostly ran in
>> that mode, or didn't test on changes that impacted widely used header
>> files.
>
> This "turned into a bug" I didn't understand, though. That commit [2]
> just switched from always feeding files one at a time to letting you
> feed more than one.  So yes, feeding all at once (with
> SPATCH_BATCH_SIZE=0) mitigated the bug. But wouldn't any bug that shows
> up with SPATCH_BATCH_SIZE=1 have already existed previously?
>
> IOW, I don't think the batch-size stuff made anything worse. It made one
> specific case better, but that was not even its purpose.
>
> That is maybe splitting hairs, but I want to make sure I understand all
> of what you're claiming here. But also...

Explained in the new 2/4. Your commit introduced a regression, because
spatch does racy magic background locking, running N of them at a time
applying the same rule tripped it up.

I suspect it was racy before, but we didn't have overlapping rules in
multiple *.cocci files.

>> I'm entirely removing SPATCH_BATCH_SIZE. If you want to tweak it you
>> can tweak SPATCH_XARGS_FLAGS to e.g. "-n 256", or "-P 4 -n 128". But
>> in my testing it isn't worth it to tweak SPATCH_XARGS_FLAGS for a full
>> "make coccicheck".
>
> You hard-code this to 32 now. But I'm not sure if that's going to be the
> universal best value.
>
> Applying just part of your patch, but leaving SPATCH_BATCH_SIZE in
> place, like so:
>
> diff --git a/Makefile b/Makefile
> index dfb0f1000f..88cb157547 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1201,7 +1201,7 @@ SP_EXTRA_FLAGS = -Wno-universal-initializer
>  # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
>  # usually result in less CPU usage at the cost of higher peak memory.
>  # Setting it to 0 will feed all files in a single spatch invocation.
> -SPATCH_FLAGS = --all-includes --patch .
> +SPATCH_FLAGS = --no-includes --patch .
>  SPATCH_BATCH_SIZE = 1
>  
>  include config.mak.uname
> @@ -2859,7 +2859,7 @@ check: config-list.h command-list.h
>  		exit 1; \
>  	fi
>  
> -FOUND_C_SOURCES = $(filter %.c,$(shell $(FIND_SOURCE_FILES)))
> +FOUND_C_SOURCES = $(filter %.c %.h,$(shell $(FIND_SOURCE_FILES)))
>  COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
>  
>  %.cocci.patch: %.cocci $(COCCI_SOURCES)
>
> Then I get this with various batch sizes (using "make -j16" on an 8-core
> machine, so all of the patches are being run at the same time):
>
>   size |  real  |  user  | sys
>   -------------------------------
>     1  |  1m28s | 10m00s | 0m56s
>     2  |  1m03s |  7m49s | 0m33s
>     4  |  0m51s |  6m28s | 0m20s
>     8  |  0m45s |  6m02s | 0m14s
>    16  |  0m45s |  6m17s | 0m10s
>    32  |  0m44s |  6m33s | 0m07s
>    64  |  0m45s |  6m48s | 0m06s
>     0  |  1m08s | 10m08s | 0m03s
>
> So there's a sweet spot at 8. Doing "32" isn't that much worse (runtime
> is about the same, but we use more CPU), but it gets progressively worse
> as the batch size increases.
>
> That's the same sweet spot before your patch, too, btw. I know because I
> happened to be timing it the other day, as coccinelle 1.1.0 hit debian
> unstable. When I originally wrote the SPATCH_BATCH_SIZE feature, I think
> I was using 1.0.4 or 1.0.7. Back then SPATCH_BATCH_SIZE=0 was a clear
> win, assuming you had the memory. But in 1.0.8 it ran for many minutes
> without finishing. I found back then that "2" was the sweet spot. But
> now it's "8".
>
> All of which is to say: the timing difference between my "8" and "32"
> cases isn't that exciting. But the performance of spatch has been
> sufficiently baffling to me that I think it's worth keeping the knob.
>
> Your XARGS_FLAGS does accomplish something similar (and is in fact more
> flexible, though at the cost of being less abstract).  I'm OK to replace
> one with the other, but shouldn't it happen in a separate patch? It's
> completely orthogonal to the --no-includes behavior.

*nod*, set to a default of 8 in the new 4/4.

>> I'm also the whole "cat $@.log" introduced in [3]. Since we don't call
>> this in a loop anymore (and xargs will early-exit) we can just rely on
>> standard V=1 for debugging issues.
>
> I think this is missing the point of the cat. We've redirected spatch's
> stderr to the logfile. So if there's an error, you have no idea what it
> was without manually figuring out which file to cat. And V=1 doesn't
> help that.
>
> We could stop doing that redirection, but the problem is that spatch is
> very verbose. So the point was to store the output but only dump it
> when we see an error.
>
> So this part of the patch seems like a pure regression to me. E.g.,
> introduce a syntax error like:
>
> diff --git a/contrib/coccinelle/free.cocci b/contrib/coccinelle/free.cocci
> index 4490069df9..c6c6562a0a 100644
> --- a/contrib/coccinelle/free.cocci
> +++ b/contrib/coccinelle/free.cocci
> @@ -11,7 +11,7 @@ expression E;
>    free(E);
>  
>  @@
> -expression E;
> +expression E
>  @@
>  - free(E);
>  + FREE_AND_NULL(E);
>
> and run "make coccicheck" before and after your patch:
>
>   [before]
>   $ make coccicheck
>   GIT_VERSION = 2.31.0.rc1.8.gbe7935ed8b.dirty
>       SPATCH contrib/coccinelle/free.cocci
>   init_defs_builtins: /usr/bin/../lib/coccinelle/standard.h
>   meta: parse error: 
>     File "contrib/coccinelle/free.cocci", line 15, column 0, charpos = 99
>     around = '@@',
>     whole content = @@
>
>   xargs: spatch: exited with status 255; aborting
>   make: *** [Makefile:2866: contrib/coccinelle/free.cocci.patch] Error 1
>
>   [after]
>   $ make coccicheck
>     SPATCH contrib/coccinelle/free.cocci
>   make: *** [Makefile:2875: contrib/coccinelle/free.cocci.patch] Error 124

The "cat $@.log" is back, but there's still some issues with it on
master now (I didn't change this) it'll spew a lot at you with xargs
since we emit the whole error output for every file, seemingly, before
cat-ing the file:

    make -j1 coccicheck SPATCH_FLAGS=--blahblah V=1 2>&1|grep -i example.*use|wc -l
    464

Well, it's a bit better now on my series with a default batch size of 8:

    $ make -j1 coccicheck SPATCH_FLAGS=--blahblah V=1 2>&1|grep -i example.*use|wc -l
    88

I got tired of dealing with the combination of shellscritp and make for
the day. But maybe something to do as a follow-up if anyone's
interested.

  reply	other threads:[~2021-03-05 17:21 UTC|newest]

Thread overview: 103+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-12 23:48 [RFC PATCH] *.h: remove extern from function declarations Denton Liu
2019-04-13  1:24 ` Jeff King
2019-04-13  5:45   ` Junio C Hamano
2019-04-15 18:24 ` [PATCH v2 0/3] " Denton Liu
2019-04-15 18:24   ` [PATCH v2 1/3] *.[ch]: remove extern from function declarations using spatch Denton Liu
2019-04-15 19:19     ` Thomas Gummerer
2019-04-15 18:24   ` [PATCH v2 2/3] *.[ch]: remove extern from function declarations using sed Denton Liu
2019-04-15 18:24   ` [PATCH v2 3/3] cocci: prevent extern function declarations Denton Liu
2019-04-17  7:58   ` [PATCH v3 0/4] remove extern from " Denton Liu
2019-04-17  7:58     ` [PATCH v3 1/4] *.[ch]: remove extern from function declarations using spatch Denton Liu
2019-04-17  7:58     ` [PATCH v3 2/4] *.[ch]: remove extern from function declarations using sed Denton Liu
2019-04-17  7:58     ` [PATCH v3 3/4] *.[ch]: manually align parameter lists Denton Liu
2019-04-17  7:58     ` [PATCH v3 4/4] cocci: prevent extern function declarations Denton Liu
2019-04-22  5:44     ` [PATCH] cache.h: fix mismerge of 'dl/no-extern-in-func-decl' Denton Liu
2019-04-22  6:30       ` Junio C Hamano
2019-04-22 11:19         ` Junio C Hamano
2019-04-22 21:49     ` [PATCH v3 0/4] remove extern from function declarations Jeff King
2019-04-25 12:07       ` SZEDER Gábor
2019-04-25 18:05         ` Denton Liu
2019-04-30 23:21         ` Johannes Schindelin
2019-05-01 10:01           ` Denton Liu
2019-05-01 18:56             ` Jeff King
2019-05-02  0:04             ` SZEDER Gábor
2019-05-03  9:32               ` Johannes Schindelin
2019-05-03 14:42                 ` SZEDER Gábor
2019-05-03 14:58                   ` SZEDER Gábor
2019-05-03 17:45                   ` Jeff King
2019-05-03 18:44                     ` SZEDER Gábor
2019-05-05  5:28                     ` Junio C Hamano
2019-05-05 18:09                       ` Jacob Keller
2019-05-05 18:08                     ` Jacob Keller
2019-05-06  5:11                       ` [PATCH] coccicheck: optionally process every source file at once Jeff King
2019-05-06  9:34                         ` Duy Nguyen
2019-05-06 23:43                           ` [PATCH] coccicheck: optionally batch spatch invocations Jeff King
2019-05-07  1:41                             ` Jacob Keller
2019-05-07  2:04                               ` Jeff King
2019-05-07  2:42                             ` Junio C Hamano
2019-05-07  2:55                               ` Jeff King
2019-05-07  3:04                                 ` Jacob Keller
2019-05-07  4:52                                 ` Junio C Hamano
2019-05-08  7:07                                   ` Jeff King
2019-05-08 12:36                                     ` Denton Liu
2019-05-08 22:39                                       ` Jeff King
2019-05-07 10:20                             ` Duy Nguyen
2019-05-07 11:19                             ` SZEDER Gábor
2021-03-02 20:51                             ` [PATCH] Makefile: fix bugs in coccicheck and speed it up Ævar Arnfjörð Bjarmason
2021-03-03  9:43                               ` Denton Liu
2021-03-03 11:45                               ` Ævar Arnfjörð Bjarmason
2021-03-04 23:18                               ` Junio C Hamano
2021-03-05 11:17                                 ` Ævar Arnfjörð Bjarmason
2021-03-05 10:24                               ` Jeff King
2021-03-05 17:20                                 ` Ævar Arnfjörð Bjarmason [this message]
2021-03-06 10:59                                   ` Jeff King
2021-03-05 17:07                               ` [PATCH v2 0/4] Makefile/coccicheck: fix bugs " Ævar Arnfjörð Bjarmason
2021-03-05 19:10                                 ` René Scharfe.
     [not found]                                   ` <xmqqim659u57.fsf@gitster.c.googlers.com>
2021-03-06 11:26                                     ` René Scharfe.
2021-03-06 12:43                                       ` René Scharfe.
     [not found]                                       ` <xmqqft16914r.fsf@gitster.c.googlers.com>
2021-03-13 16:10                                         ` René Scharfe.
2021-03-06 17:27                                   ` Ævar Arnfjörð Bjarmason
2021-03-06 17:41                                     ` René Scharfe.
2021-03-06 17:52                                       ` Ævar Arnfjörð Bjarmason
2021-03-06 19:08                                         ` René Scharfe.
2021-03-05 17:07                               ` [PATCH v2 1/4] Makefile/coccicheck: add comment heading for all SPATCH flags Ævar Arnfjörð Bjarmason
2021-03-05 17:07                               ` [PATCH v2 2/4] Makefile/coccicheck: speed up and fix bug with duplicate hunks Ævar Arnfjörð Bjarmason
2021-03-06 10:45                                 ` Jeff King
2021-03-06 19:29                                   ` Ævar Arnfjörð Bjarmason
2021-03-05 17:07                               ` [PATCH v2 3/4] Makefile/coccicheck: allow for setting xargs concurrency Ævar Arnfjörð Bjarmason
2021-03-06 10:51                                 ` Jeff King
2021-03-05 17:07                               ` [PATCH v2 4/4] Makefile/coccicheck: set SPATCH_BATCH_SIZE to 8 Ævar Arnfjörð Bjarmason
2021-03-06 19:25                                 ` [PATCH v2 5/4] Makefile/coccicheck: use --include-headers-for-types Ævar Arnfjörð Bjarmason
2021-03-18 20:49                                   ` SZEDER Gábor
2021-03-19 10:32                                     ` Ævar Arnfjörð Bjarmason
2021-03-22 12:11                                   ` [PATCH v4 0/4] Makefile/coccicheck: fix bugs and speed it up Ævar Arnfjörð Bjarmason
2021-03-22 12:11                                     ` [PATCH v4 1/4] Makefile/coccicheck: add comment heading for all SPATCH flags Ævar Arnfjörð Bjarmason
2021-03-22 18:04                                       ` René Scharfe.
2021-03-22 12:11                                     ` [PATCH v4 2/4] Makefile/coccicheck: speed up and fix bug with duplicate hunks Ævar Arnfjörð Bjarmason
2021-03-22 18:05                                       ` René Scharfe.
2021-03-24 19:19                                         ` Jeff King
2021-03-22 19:09                                       ` Junio C Hamano
2021-03-22 12:11                                     ` [PATCH v4 3/4] Makefile/coccicheck: allow for setting xargs concurrency Ævar Arnfjörð Bjarmason
2021-03-24 19:26                                       ` Jeff King
2021-03-25  2:29                                         ` Ævar Arnfjörð Bjarmason
2021-03-26  4:11                                           ` Jeff King
2021-03-22 12:11                                     ` [PATCH v4 4/4] Makefile/coccicheck: set SPATCH_BATCH_SIZE to 8 Ævar Arnfjörð Bjarmason
2021-03-22 18:05                                       ` René Scharfe.
2021-03-24 19:27                                         ` Jeff King
2021-03-27 17:43                                     ` [PATCH v4 0/4] Makefile/coccicheck: fix bugs and speed it up Junio C Hamano
2021-03-27 19:46                                       ` Ævar Arnfjörð Bjarmason
2019-05-03  9:40               ` [PATCH v3 0/4] remove extern from function declarations Denton Liu
2019-04-23 23:40   ` [PATCH v4 " Denton Liu
2019-04-23 23:40     ` [PATCH v4 1/4] *.[ch]: remove extern from function declarations using spatch Denton Liu
2019-04-23 23:40     ` [PATCH v4 2/4] *.[ch]: remove extern from function declarations using sed Denton Liu
2019-04-24  4:56       ` Junio C Hamano
2019-04-25 19:00         ` Denton Liu
2019-04-23 23:40     ` [PATCH v4 3/4] *.[ch]: manually align parameter lists Denton Liu
2019-04-23 23:40     ` [PATCH v4 4/4] cocci: prevent extern function declarations Denton Liu
2019-04-29  8:28     ` [PATCH v5 0/3] *** SUBJECT HERE *** Denton Liu
2019-04-29  8:28       ` [PATCH v5 1/3] *.[ch]: remove extern from function declarations using spatch Denton Liu
2019-04-29  8:28       ` [PATCH v5 2/3] *.[ch]: remove extern from function declarations using sed Denton Liu
2019-04-29  8:28       ` [PATCH v5 3/3] *.[ch]: manually align parameter lists Denton Liu
2019-04-29  8:30       ` [PATCH v5 0/3] *** SUBJECT HERE *** Denton Liu
2019-05-06 11:03         ` Ævar Arnfjörð Bjarmason
2019-05-06 15:34           ` Denton Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87im65tt3k.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=l.s.r@web.de \
    --cc=peff@peff.net \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).