All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"René Scharfe" <l.s.r@web.de>,
	"SZEDER Gábor" <szeder.dev@gmail.com>, "Eric Wong" <e@80x24.org>
Subject: Re: [PATCH] Makefile: fix bugs in coccicheck and speed it up
Date: Fri, 05 Mar 2021 18:20:47 +0100	[thread overview]
Message-ID: <87im65tt3k.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <YEIGzXMDax83cwAx@coredump.intra.peff.net>


On Fri, Mar 05 2021, Jeff King wrote:

I sent out a v2 that should address your feedback in :
https://lore.kernel.org/git/20210305170724.23859-1-avarab@gmail.com/

Just brief notes:

> On Tue, Mar 02, 2021 at 09:51:03PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> What we were doing before was processing all our *.c files, and for
>> each of those *.c files we'd recursively look around for includes and
>> see if we needed to search/replace in those too.
>> 
>> That we did that dates back to [1] when we were only processing *.c
>> files, and it was always very redundant. We'd e.g. visit the likes of
>> strbuf.h lots of times since it's widely used as an include.
>
> OK. I don't offhand know if processing the includes along with the C
> files buys us anything else. Coccinelle's behavior is often quite a
> mystery, but if we think this produces the same results with less time,
> I'm for it.
>
> BTW, this "dates back to when we were only processing *.c files"
> statement confused me a bit. Doesn't that include the current state
> before this patch?  I.e., this hunk:
>
>> -FOUND_C_SOURCES = $(filter %.c,$(shell $(FIND_SOURCE_FILES)))
>> +FOUND_C_SOURCES = $(filter %.c %.h,$(shell $(FIND_SOURCE_FILES)))
>
> seems to be an important part of the change, without which moving away
> from --all-includes would be wrong.

Updated in the new 2/4 commit message.

>> Then in the most recent attempt to optimize coccicheck in [2] this
>> anti-pattern finally turned into a bug.
>>
>> Namely: before this change, if your coccicheck rule applied to
>> e.g. making a change in strbuf.h itself we'd get *lots* of duplicate
>> hunks applying the exact same change, as concurrent spatch processes
>> invoked by xargs raced one another. In one instance I ended up with 27
>> copies of the same hunk in a strbuf.patch.
>> 
>> Setting SPATCH_BATCH_SIZE=0 and processing all the files in one giant
>> batch mitigated this. I suspect the author of [2] either mostly ran in
>> that mode, or didn't test on changes that impacted widely used header
>> files.
>
> This "turned into a bug" I didn't understand, though. That commit [2]
> just switched from always feeding files one at a time to letting you
> feed more than one.  So yes, feeding all at once (with
> SPATCH_BATCH_SIZE=0) mitigated the bug. But wouldn't any bug that shows
> up with SPATCH_BATCH_SIZE=1 have already existed previously?
>
> IOW, I don't think the batch-size stuff made anything worse. It made one
> specific case better, but that was not even its purpose.
>
> That is maybe splitting hairs, but I want to make sure I understand all
> of what you're claiming here. But also...

Explained in the new 2/4. Your commit introduced a regression, because
spatch does racy magic background locking, running N of them at a time
applying the same rule tripped it up.

I suspect it was racy before, but we didn't have overlapping rules in
multiple *.cocci files.

>> I'm entirely removing SPATCH_BATCH_SIZE. If you want to tweak it you
>> can tweak SPATCH_XARGS_FLAGS to e.g. "-n 256", or "-P 4 -n 128". But
>> in my testing it isn't worth it to tweak SPATCH_XARGS_FLAGS for a full
>> "make coccicheck".
>
> You hard-code this to 32 now. But I'm not sure if that's going to be the
> universal best value.
>
> Applying just part of your patch, but leaving SPATCH_BATCH_SIZE in
> place, like so:
>
> diff --git a/Makefile b/Makefile
> index dfb0f1000f..88cb157547 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1201,7 +1201,7 @@ SP_EXTRA_FLAGS = -Wno-universal-initializer
>  # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
>  # usually result in less CPU usage at the cost of higher peak memory.
>  # Setting it to 0 will feed all files in a single spatch invocation.
> -SPATCH_FLAGS = --all-includes --patch .
> +SPATCH_FLAGS = --no-includes --patch .
>  SPATCH_BATCH_SIZE = 1
>  
>  include config.mak.uname
> @@ -2859,7 +2859,7 @@ check: config-list.h command-list.h
>  		exit 1; \
>  	fi
>  
> -FOUND_C_SOURCES = $(filter %.c,$(shell $(FIND_SOURCE_FILES)))
> +FOUND_C_SOURCES = $(filter %.c %.h,$(shell $(FIND_SOURCE_FILES)))
>  COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
>  
>  %.cocci.patch: %.cocci $(COCCI_SOURCES)
>
> Then I get this with various batch sizes (using "make -j16" on an 8-core
> machine, so all of the patches are being run at the same time):
>
>   size |  real  |  user  | sys
>   -------------------------------
>     1  |  1m28s | 10m00s | 0m56s
>     2  |  1m03s |  7m49s | 0m33s
>     4  |  0m51s |  6m28s | 0m20s
>     8  |  0m45s |  6m02s | 0m14s
>    16  |  0m45s |  6m17s | 0m10s
>    32  |  0m44s |  6m33s | 0m07s
>    64  |  0m45s |  6m48s | 0m06s
>     0  |  1m08s | 10m08s | 0m03s
>
> So there's a sweet spot at 8. Doing "32" isn't that much worse (runtime
> is about the same, but we use more CPU), but it gets progressively worse
> as the batch size increases.
>
> That's the same sweet spot before your patch, too, btw. I know because I
> happened to be timing it the other day, as coccinelle 1.1.0 hit debian
> unstable. When I originally wrote the SPATCH_BATCH_SIZE feature, I think
> I was using 1.0.4 or 1.0.7. Back then SPATCH_BATCH_SIZE=0 was a clear
> win, assuming you had the memory. But in 1.0.8 it ran for many minutes
> without finishing. I found back then that "2" was the sweet spot. But
> now it's "8".
>
> All of which is to say: the timing difference between my "8" and "32"
> cases isn't that exciting. But the performance of spatch has been
> sufficiently baffling to me that I think it's worth keeping the knob.
>
> Your XARGS_FLAGS does accomplish something similar (and is in fact more
> flexible, though at the cost of being less abstract).  I'm OK to replace
> one with the other, but shouldn't it happen in a separate patch? It's
> completely orthogonal to the --no-includes behavior.

*nod*, set to a default of 8 in the new 4/4.

>> I'm also the whole "cat $@.log" introduced in [3]. Since we don't call
>> this in a loop anymore (and xargs will early-exit) we can just rely on
>> standard V=1 for debugging issues.
>
> I think this is missing the point of the cat. We've redirected spatch's
> stderr to the logfile. So if there's an error, you have no idea what it
> was without manually figuring out which file to cat. And V=1 doesn't
> help that.
>
> We could stop doing that redirection, but the problem is that spatch is
> very verbose. So the point was to store the output but only dump it
> when we see an error.
>
> So this part of the patch seems like a pure regression to me. E.g.,
> introduce a syntax error like:
>
> diff --git a/contrib/coccinelle/free.cocci b/contrib/coccinelle/free.cocci
> index 4490069df9..c6c6562a0a 100644
> --- a/contrib/coccinelle/free.cocci
> +++ b/contrib/coccinelle/free.cocci
> @@ -11,7 +11,7 @@ expression E;
>    free(E);
>  
>  @@
> -expression E;
> +expression E
>  @@
>  - free(E);
>  + FREE_AND_NULL(E);
>
> and run "make coccicheck" before and after your patch:
>
>   [before]
>   $ make coccicheck
>   GIT_VERSION = 2.31.0.rc1.8.gbe7935ed8b.dirty
>       SPATCH contrib/coccinelle/free.cocci
>   init_defs_builtins: /usr/bin/../lib/coccinelle/standard.h
>   meta: parse error: 
>     File "contrib/coccinelle/free.cocci", line 15, column 0, charpos = 99
>     around = '@@',
>     whole content = @@
>
>   xargs: spatch: exited with status 255; aborting
>   make: *** [Makefile:2866: contrib/coccinelle/free.cocci.patch] Error 1
>
>   [after]
>   $ make coccicheck
>     SPATCH contrib/coccinelle/free.cocci
>   make: *** [Makefile:2875: contrib/coccinelle/free.cocci.patch] Error 124

The "cat $@.log" is back, but there's still some issues with it on
master now (I didn't change this) it'll spew a lot at you with xargs
since we emit the whole error output for every file, seemingly, before
cat-ing the file:

    make -j1 coccicheck SPATCH_FLAGS=--blahblah V=1 2>&1|grep -i example.*use|wc -l
    464

Well, it's a bit better now on my series with a default batch size of 8:

    $ make -j1 coccicheck SPATCH_FLAGS=--blahblah V=1 2>&1|grep -i example.*use|wc -l
    88

I got tired of dealing with the combination of shellscritp and make for
the day. But maybe something to do as a follow-up if anyone's
interested.

  reply	other threads:[~2021-03-05 17:21 UTC|newest]

Thread overview: 103+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-12 23:48 [RFC PATCH] *.h: remove extern from function declarations Denton Liu
2019-04-13  1:24 ` Jeff King
2019-04-13  5:45   ` Junio C Hamano
2019-04-15 18:24 ` [PATCH v2 0/3] " Denton Liu
2019-04-15 18:24   ` [PATCH v2 1/3] *.[ch]: remove extern from function declarations using spatch Denton Liu
2019-04-15 19:19     ` Thomas Gummerer
2019-04-15 18:24   ` [PATCH v2 2/3] *.[ch]: remove extern from function declarations using sed Denton Liu
2019-04-15 18:24   ` [PATCH v2 3/3] cocci: prevent extern function declarations Denton Liu
2019-04-17  7:58   ` [PATCH v3 0/4] remove extern from " Denton Liu
2019-04-17  7:58     ` [PATCH v3 1/4] *.[ch]: remove extern from function declarations using spatch Denton Liu
2019-04-17  7:58     ` [PATCH v3 2/4] *.[ch]: remove extern from function declarations using sed Denton Liu
2019-04-17  7:58     ` [PATCH v3 3/4] *.[ch]: manually align parameter lists Denton Liu
2019-04-17  7:58     ` [PATCH v3 4/4] cocci: prevent extern function declarations Denton Liu
2019-04-22  5:44     ` [PATCH] cache.h: fix mismerge of 'dl/no-extern-in-func-decl' Denton Liu
2019-04-22  6:30       ` Junio C Hamano
2019-04-22 11:19         ` Junio C Hamano
2019-04-22 21:49     ` [PATCH v3 0/4] remove extern from function declarations Jeff King
2019-04-25 12:07       ` SZEDER Gábor
2019-04-25 18:05         ` Denton Liu
2019-04-30 23:21         ` Johannes Schindelin
2019-05-01 10:01           ` Denton Liu
2019-05-01 18:56             ` Jeff King
2019-05-02  0:04             ` SZEDER Gábor
2019-05-03  9:32               ` Johannes Schindelin
2019-05-03 14:42                 ` SZEDER Gábor
2019-05-03 14:58                   ` SZEDER Gábor
2019-05-03 17:45                   ` Jeff King
2019-05-03 18:44                     ` SZEDER Gábor
2019-05-05  5:28                     ` Junio C Hamano
2019-05-05 18:09                       ` Jacob Keller
2019-05-05 18:08                     ` Jacob Keller
2019-05-06  5:11                       ` [PATCH] coccicheck: optionally process every source file at once Jeff King
2019-05-06  9:34                         ` Duy Nguyen
2019-05-06 23:43                           ` [PATCH] coccicheck: optionally batch spatch invocations Jeff King
2019-05-07  1:41                             ` Jacob Keller
2019-05-07  2:04                               ` Jeff King
2019-05-07  2:42                             ` Junio C Hamano
2019-05-07  2:55                               ` Jeff King
2019-05-07  3:04                                 ` Jacob Keller
2019-05-07  4:52                                 ` Junio C Hamano
2019-05-08  7:07                                   ` Jeff King
2019-05-08 12:36                                     ` Denton Liu
2019-05-08 22:39                                       ` Jeff King
2019-05-07 10:20                             ` Duy Nguyen
2019-05-07 11:19                             ` SZEDER Gábor
2021-03-02 20:51                             ` [PATCH] Makefile: fix bugs in coccicheck and speed it up Ævar Arnfjörð Bjarmason
2021-03-03  9:43                               ` Denton Liu
2021-03-03 11:45                               ` Ævar Arnfjörð Bjarmason
2021-03-04 23:18                               ` Junio C Hamano
2021-03-05 11:17                                 ` Ævar Arnfjörð Bjarmason
2021-03-05 10:24                               ` Jeff King
2021-03-05 17:20                                 ` Ævar Arnfjörð Bjarmason [this message]
2021-03-06 10:59                                   ` Jeff King
2021-03-05 17:07                               ` [PATCH v2 0/4] Makefile/coccicheck: fix bugs " Ævar Arnfjörð Bjarmason
2021-03-05 19:10                                 ` René Scharfe.
     [not found]                                   ` <xmqqim659u57.fsf@gitster.c.googlers.com>
2021-03-06 11:26                                     ` René Scharfe.
2021-03-06 12:43                                       ` René Scharfe.
     [not found]                                       ` <xmqqft16914r.fsf@gitster.c.googlers.com>
2021-03-13 16:10                                         ` René Scharfe.
2021-03-06 17:27                                   ` Ævar Arnfjörð Bjarmason
2021-03-06 17:41                                     ` René Scharfe.
2021-03-06 17:52                                       ` Ævar Arnfjörð Bjarmason
2021-03-06 19:08                                         ` René Scharfe.
2021-03-05 17:07                               ` [PATCH v2 1/4] Makefile/coccicheck: add comment heading for all SPATCH flags Ævar Arnfjörð Bjarmason
2021-03-05 17:07                               ` [PATCH v2 2/4] Makefile/coccicheck: speed up and fix bug with duplicate hunks Ævar Arnfjörð Bjarmason
2021-03-06 10:45                                 ` Jeff King
2021-03-06 19:29                                   ` Ævar Arnfjörð Bjarmason
2021-03-05 17:07                               ` [PATCH v2 3/4] Makefile/coccicheck: allow for setting xargs concurrency Ævar Arnfjörð Bjarmason
2021-03-06 10:51                                 ` Jeff King
2021-03-05 17:07                               ` [PATCH v2 4/4] Makefile/coccicheck: set SPATCH_BATCH_SIZE to 8 Ævar Arnfjörð Bjarmason
2021-03-06 19:25                                 ` [PATCH v2 5/4] Makefile/coccicheck: use --include-headers-for-types Ævar Arnfjörð Bjarmason
2021-03-18 20:49                                   ` SZEDER Gábor
2021-03-19 10:32                                     ` Ævar Arnfjörð Bjarmason
2021-03-22 12:11                                   ` [PATCH v4 0/4] Makefile/coccicheck: fix bugs and speed it up Ævar Arnfjörð Bjarmason
2021-03-22 12:11                                     ` [PATCH v4 1/4] Makefile/coccicheck: add comment heading for all SPATCH flags Ævar Arnfjörð Bjarmason
2021-03-22 18:04                                       ` René Scharfe.
2021-03-22 12:11                                     ` [PATCH v4 2/4] Makefile/coccicheck: speed up and fix bug with duplicate hunks Ævar Arnfjörð Bjarmason
2021-03-22 18:05                                       ` René Scharfe.
2021-03-24 19:19                                         ` Jeff King
2021-03-22 19:09                                       ` Junio C Hamano
2021-03-22 12:11                                     ` [PATCH v4 3/4] Makefile/coccicheck: allow for setting xargs concurrency Ævar Arnfjörð Bjarmason
2021-03-24 19:26                                       ` Jeff King
2021-03-25  2:29                                         ` Ævar Arnfjörð Bjarmason
2021-03-26  4:11                                           ` Jeff King
2021-03-22 12:11                                     ` [PATCH v4 4/4] Makefile/coccicheck: set SPATCH_BATCH_SIZE to 8 Ævar Arnfjörð Bjarmason
2021-03-22 18:05                                       ` René Scharfe.
2021-03-24 19:27                                         ` Jeff King
2021-03-27 17:43                                     ` [PATCH v4 0/4] Makefile/coccicheck: fix bugs and speed it up Junio C Hamano
2021-03-27 19:46                                       ` Ævar Arnfjörð Bjarmason
2019-05-03  9:40               ` [PATCH v3 0/4] remove extern from function declarations Denton Liu
2019-04-23 23:40   ` [PATCH v4 " Denton Liu
2019-04-23 23:40     ` [PATCH v4 1/4] *.[ch]: remove extern from function declarations using spatch Denton Liu
2019-04-23 23:40     ` [PATCH v4 2/4] *.[ch]: remove extern from function declarations using sed Denton Liu
2019-04-24  4:56       ` Junio C Hamano
2019-04-25 19:00         ` Denton Liu
2019-04-23 23:40     ` [PATCH v4 3/4] *.[ch]: manually align parameter lists Denton Liu
2019-04-23 23:40     ` [PATCH v4 4/4] cocci: prevent extern function declarations Denton Liu
2019-04-29  8:28     ` [PATCH v5 0/3] *** SUBJECT HERE *** Denton Liu
2019-04-29  8:28       ` [PATCH v5 1/3] *.[ch]: remove extern from function declarations using spatch Denton Liu
2019-04-29  8:28       ` [PATCH v5 2/3] *.[ch]: remove extern from function declarations using sed Denton Liu
2019-04-29  8:28       ` [PATCH v5 3/3] *.[ch]: manually align parameter lists Denton Liu
2019-04-29  8:30       ` [PATCH v5 0/3] *** SUBJECT HERE *** Denton Liu
2019-05-06 11:03         ` Ævar Arnfjörð Bjarmason
2019-05-06 15:34           ` Denton Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87im65tt3k.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=l.s.r@web.de \
    --cc=peff@peff.net \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.