From: "René Scharfe" <l.s.r@web.de>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Derrick Stolee" <derrickstolee@github.com>
Cc: Git List <git@vger.kernel.org>,
Junio C Hamano <gitster@pobox.com>,
Victoria Dye <vdye@github.com>
Subject: Re: [PATCH] cache-tree: fix strbuf growth in prime_cache_tree_rec()
Date: Sun, 12 Feb 2023 12:20:49 +0100 [thread overview]
Message-ID: <49071578-0007-bd9d-f286-58abfa848991@web.de> (raw)
In-Reply-To: <230206.86a61q7o11.gmgdl@evledraar.gmail.com>
Am 06.02.23 um 17:18 schrieb Ævar Arnfjörð Bjarmason:
>
> On Mon, Feb 06 2023, Derrick Stolee wrote:
>
>> On 2/5/2023 4:12 PM, Ævar Arnfjörð Bjarmason wrote:
>
>>> Or even just:
>>>
>>> strbuf_addf(tree_path, "%*.s/", (int)entry.pathlen, entry.path);
>>
>> Please do not add "addf" functions that can be run in tight loops.
>> It's faster to do strbuf_add() followed by strbuf_addch().
>
> Good point.
>
> I wondered just how much slower, and it's up to 3x! At least according
> to this[1] artificial test case (where I usurped a random test helper).
>
> I wondered if we could just handle some common strbuf_addf() cases
> ourselves, and the benchmark shows (manually annotated, too lazy to set
> up the -n option):
>
> git hyperfine -L rev HEAD~5,HEAD~4,HEAD~3,HEAD~2,HEAD~1,HEAD~0 -s 'make CFLAGS=-O3' './t/helper/test-tool online-cpus' -r 3
> [...]
> Summary
> './t/helper/test-tool online-cpus' in 'HEAD~0' ran <== strbuf_add() + strbuf_addch()
> 1.06 ± 0.11 times faster than './t/helper/test-tool online-cpus' in 'HEAD~1' <== strbuf_addstr() + strbuf_addch()
> 1.18 ± 0.12 times faster than './t/helper/test-tool online-cpus' in 'HEAD~4' <== hand optimized strbuf_addf() for "%sC"
> 1.33 ± 0.18 times faster than './t/helper/test-tool online-cpus' in 'HEAD~2' <== hand optimized strbuf_addf() for "%*sC"
> 2.63 ± 0.05 times faster than './t/helper/test-tool online-cpus' in 'HEAD~5' <== strbuf_addf("%s/")
> 2.92 ± 0.25 times faster than './t/helper/test-tool online-cpus' in 'HEAD~3' <== strbuf_addf("%*s/")
Woah!
> The "hand optimization" just being a very stupid handling of "%sC" for
> arbitrary values of a single char "C", and ditto for "%*sC" (which
> curiously is slower here).
"%*s" adds padding if needed, your version doesn't. Perhaps you thought
of "%.*s"? That might be relevant because for "%*s" vsnprintf(3) needs
to run strlen(3) again on the argument, while for "%.*s" it can stop
when the given length is reached.
> So, for truly hot loops we'd still want to use the add + addch, but if
> anyone's interested (hashtag #leftoverbits) it looks like we could get
> some easy wins (and reduction in code size, as we could stop worrying
> about addf being slow in most cases) with some very dumb minimal
> vaddf(), which could handle these cases (but not anything involving
> padding etc.).
>
> I didn't dig, but wouldn't be surprised if the reason is that C
> libraries need to carry a relatively fat & general sprintf() for all
> those edge cases, locale handling etc, whereas most of our use could
> trivially be represented as some sequence of addstr()/addf() etc.
If that's the reason then resisting the urge to handle ever more cases
in strbuf_addf() would be quite important.
> Another interesting approach (and this is very #leftoverbits) would be
> to perform the same optimization with coccinelle.
>
> I.e. our current use of it is purely "this code X should be written like
> Y, and we should commit Y".
>
> But there's no reason for why we couldn't effectively implement our own
> compiler optimizations for our own APIs with it, so just grab "%s/" etc,
> unpack that in OCaml, then emit strbuf_add() + strbuf_addch(), and that
> would be what the C compiler would see.
Extracting the %s is technically possible using a semantic patch without
scripting:
@@
expression sb, str;
format fmt =~ "^s$";
@@
+ strbuf_addstr(sb, str);
strbuf_addf(sb, "%@fmt@..."
- , str
+ + 2
);
The "+ 2" is ugly and runs afoul of compiler warning -Wstring-plus-int,
though. Resolving this probably requires Python scripting as in
https://github.com/coccinelle/coccinelle/blob/master/demos/format.cocci,
or the OCaml magic you have in mind. I have to admit that I don't even
understand the linked examples, however. :-/
The warning can be avoided by using an array subscription, by the way,
but that's even uglier:
@@
expression sb, str;
format fmt =~ "^s$";
@@
+ strbuf_addstr(sb, str);
strbuf_addf(sb,
+ &
"%@fmt@..."
- , str
+ [2]
);
>
> 1.
>
> 9d23ffb1117 addf + nolen
> diff --git a/t/helper/test-online-cpus.c b/t/helper/test-online-cpus.c
> index 8cb0d53840f..c802ec579d0 100644
> --- a/t/helper/test-online-cpus.c
> +++ b/t/helper/test-online-cpus.c
> @@ -1,9 +1,17 @@
> #include "test-tool.h"
> #include "git-compat-util.h"
> #include "thread-utils.h"
> * +#include "strbuf.h"
>
> int cmd__online_cpus(int argc, const char **argv)
> {
> - printf("%d\n", online_cpus());
> + struct strbuf sb = STRBUF_INIT;
> + const char *const str = "Hello, World";
> +
> + for (size_t i = 0; i < 10000000; i++) {
> + strbuf_reset(&sb);
> + strbuf_addf(&sb, "%s/", str);
> + puts(sb.buf);
> + }
> return 0;
> }
> 9f74eff5623 addf + nolen optimize
> diff --git a/strbuf.c b/strbuf.c
> index c383f41a3c5..750e5e6a5b4 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -332,8 +332,16 @@ void strbuf_addchars(struct strbuf *sb, int c, size_t n)
> void strbuf_addf(struct strbuf *sb, const char *fmt, ...)
> {
> va_list ap;
> +
> va_start(ap, fmt);
> - strbuf_vaddf(sb, fmt, ap);
> + if (*fmt == '%' && *(fmt + 1) == 's' && *(fmt + 2) && !*(fmt + 3)) {
> + const char *arg = va_arg(ap, const char *);
> +
> + strbuf_addstr(sb, arg);
> + strbuf_addch(sb, *(fmt + 2));
> + } else {
> + strbuf_vaddf(sb, fmt, ap);
> + }
> va_end(ap);
> }
>
> ca60bb9b479 addf + len
> diff --git a/t/helper/test-online-cpus.c b/t/helper/test-online-cpus.c
> index c802ec579d0..7257e622015 100644
> --- a/t/helper/test-online-cpus.c
> +++ b/t/helper/test-online-cpus.c
> @@ -7,10 +7,11 @@ int cmd__online_cpus(int argc, const char **argv)
> {
> struct strbuf sb = STRBUF_INIT;
> const char *const str = "Hello, World";
> + const size_t len = strlen(str);
>
> for (size_t i = 0; i < 10000000; i++) {
> strbuf_reset(&sb);
> - strbuf_addf(&sb, "%s/", str);
> + strbuf_addf(&sb, "%*s/", (int)len, str);
> puts(sb.buf);
> }
> return 0;
> 1f47987d095 addf + len optimize
> diff --git a/strbuf.c b/strbuf.c
> index 750e5e6a5b4..88801268f7a 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -334,11 +334,16 @@ void strbuf_addf(struct strbuf *sb, const char *fmt, ...)
> va_list ap;
>
> va_start(ap, fmt);
> - if (*fmt == '%' && *(fmt + 1) == 's' && *(fmt + 2) && !*(fmt + 3)) {
> + if (*fmt == '%' &&
> + *(fmt + 1) == '*' &&
> + *(fmt + 2) == 's' &&
> + *(fmt + 3) &&
> + !*(fmt + 4)) {
> + int len = va_arg(ap, int);
> const char *arg = va_arg(ap, const char *);
>
> - strbuf_addstr(sb, arg);
> - strbuf_addch(sb, *(fmt + 2));
> + strbuf_add(sb, arg, len);
> + strbuf_addch(sb, *(fmt + 3));
> } else {
> strbuf_vaddf(sb, fmt, ap);
> }
> 55c698c0b95 addstr
> diff --git a/t/helper/test-online-cpus.c b/t/helper/test-online-cpus.c
> index 7257e622015..2716b44ca15 100644
> --- a/t/helper/test-online-cpus.c
> +++ b/t/helper/test-online-cpus.c
> @@ -7,11 +7,11 @@ int cmd__online_cpus(int argc, const char **argv)
> {
> struct strbuf sb = STRBUF_INIT;
> const char *const str = "Hello, World";
> - const size_t len = strlen(str);
>
> for (size_t i = 0; i < 10000000; i++) {
> strbuf_reset(&sb);
> - strbuf_addf(&sb, "%*s/", (int)len, str);
> + strbuf_addstr(&sb, str);
> + strbuf_addch(&sb, '/');
> puts(sb.buf);
> }
> return 0;
> b17fb99bf7e (HEAD -> master) add
> diff --git a/t/helper/test-online-cpus.c b/t/helper/test-online-cpus.c
> index 2716b44ca15..5e52b622c4d 100644
> --- a/t/helper/test-online-cpus.c
> +++ b/t/helper/test-online-cpus.c
> @@ -7,10 +7,11 @@ int cmd__online_cpus(int argc, const char **argv)
> {
> struct strbuf sb = STRBUF_INIT;
> const char *const str = "Hello, World";
> + const size_t len = strlen(str);
>
> for (size_t i = 0; i < 10000000; i++) {
> strbuf_reset(&sb);
> - strbuf_addstr(&sb, str);
> + strbuf_add(&sb, str, len);
> strbuf_addch(&sb, '/');
> puts(sb.buf);
> }
next prev parent reply other threads:[~2023-02-12 11:21 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-04 19:10 [PATCH] cache-tree: fix strbuf growth in prime_cache_tree_rec() René Scharfe
2023-02-05 21:12 ` Ævar Arnfjörð Bjarmason
2023-02-06 15:27 ` Derrick Stolee
2023-02-06 16:18 ` Ævar Arnfjörð Bjarmason
2023-02-12 11:20 ` René Scharfe [this message]
2023-02-06 18:55 ` Junio C Hamano
2023-02-10 20:20 ` René Scharfe
2023-02-10 20:20 ` René Scharfe
2023-02-10 20:33 ` Junio C Hamano
2023-02-11 2:15 ` Jeff King
2023-02-11 2:46 ` Junio C Hamano
2023-02-10 20:20 ` [PATCH v2] " René Scharfe
2023-02-13 13:37 ` Derrick Stolee
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=49071578-0007-bd9d-f286-58abfa848991@web.de \
--to=l.s.r@web.de \
--cc=avarab@gmail.com \
--cc=derrickstolee@github.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=vdye@github.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).