git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Rose via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Seija Kijin <doremylover123@gmail.com>
Subject: Re: [PATCH] git: replace strbuf_addstr with strbuf_addch for all strings of length 2
Date: Wed, 18 Jan 2023 08:04:31 -0800	[thread overview]
Message-ID: <xmqqr0vr6d80.fsf@gitster.g> (raw)
In-Reply-To: <pull.1436.git.git.1673992498572.gitgitgadget@gmail.com> (Rose via GitGitGadget's message of "Tue, 17 Jan 2023 21:54:58 +0000")

"Rose via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Seija Kijin <doremylover123@gmail.com>
>
> This helps reduce overhead of calculating the length

Have you measured how much overhead this change is saving, or is
this a 300 line e-mail message that churns code without giving us
any measurable improvement?

There also seem to be some totally unrelated changes of dubious
merit hidden in these patches (see below).

> Subject: Re: [PATCH] git: replace strbuf_addstr with strbuf_addch for all strings of length 2

I think you are touching strings of length 1, not 2.  Running
strlen() on such a string returns will return 1 without counting the
terminating NUL.

> diff --git a/builtin/am.c b/builtin/am.c
> index 7e88d2426d7..c96886e0433 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -329,13 +329,11 @@ static void write_author_script(const struct am_state *state)
>  
>  	strbuf_addstr(&sb, "GIT_AUTHOR_NAME=");
>  	sq_quote_buf(&sb, state->author_name);
> -	strbuf_addch(&sb, '\n');
>  
> -	strbuf_addstr(&sb, "GIT_AUTHOR_EMAIL=");
> +	strbuf_addstr(&sb, "\nGIT_AUTHOR_EMAIL=");
>  	sq_quote_buf(&sb, state->author_email);
> -	strbuf_addch(&sb, '\n');
>  
> -	strbuf_addstr(&sb, "GIT_AUTHOR_DATE=");
> +	strbuf_addstr(&sb, "\nGIT_AUTHOR_DATE=");
>  	sq_quote_buf(&sb, state->author_date);
>  	strbuf_addch(&sb, '\n');

This may reduce the number of lines, but markedly worsens the
readability of the resulting code.  Each of the three-line blocks in
the original used to be logically complete and independent unit, but
now each of them depend on what the last block wants.

In any case, this has nothing to do with "addstr() of constant string
can be replaced with add() with a constant length or addch() of a
sing character to make it unnecessary to compute the length".

By the way, the current implementation of strbuf_addstr() looks like
this:

        static inline void strbuf_addstr(struct strbuf *sb, const char *s)
        {
                strbuf_add(sb, s, strlen(s));
        }

Decent optimizing compilers should be able to see through the code
like this you write:

	strbuf_addstr(&sb, "constant");

which becomes

	strbuf_add(&sb, "constant", strlen("constant"))

when inlined, and realize strlen("constant") can be computed at
compile time, turning it into

	strbuf_add(&sb, "constant", 8);

That way, people can make their strbuf_addstr() calls in the way
they find the most natural, i.e. without having to choose between
_addstr() and _add().  If the compilers can do their unnecessary
thinking for us humans, we should make them do so to help us.

If somebody can prove that between these two

	strbuf_add(&sb, "c", 1);
	strbuf_addch(&sb, 'c');

there is a meaningful difference in overhead to encourage us to
rewrite the former to the latter, perhaps a similar trick can be
employed in the implementation of strbuf_add(), perhaps like:

        static inline void strbuf_add(struct strbuf *sb, const void *data, size_t len)
        {
                if (len == 1) {
                        strbuf_addch(sb, *((char *)sb));
                } else {
                        strbuf_grow(sb, len);
                        memcpy(sb->buf + sb->len, data, len);
                        strbuf_setlen(sb, sb->len + len);
                }
        }

That way, people can make their strbuf_add() and strbuf_addstr()
calls in the way they find the most natural, i.e. without having to
choose between _add() and _addch() depending on the length of the
string.  This makes the code easier to maintain, as we do not have
to change the code all that much when the length of the string we
need to append to a strbuf changes from 1 to more (or the other way
around).

But I somehow doubt it is worth it.

> diff --git a/builtin/blame.c b/builtin/blame.c
> index 71f925e456c..3ab4cc0a56b 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -143,11 +143,9 @@ static void get_ac_line(const char *inbuf, const char *what,
>  
>  	if (split_ident_line(&ident, tmp, len)) {
>  	error_out:
> -		/* Ugh */
> -		tmp = "(unknown)";
> -		strbuf_addstr(name, tmp);
> -		strbuf_addstr(mail, tmp);
> -		strbuf_addstr(tz, tmp);
> +		strbuf_addstr(name, "(unknown)");
> +		strbuf_addstr(mail, "(unknown)");
> +		strbuf_addstr(tz, "(unknown)");

This is another unrelated change that has not much to do with the
theme of the change, to use addch() when the string is of length 1.

>  	if (opt->priv->call_depth) {
> -		strbuf_addchars(dest, ' ', 2);
> -		strbuf_addstr(dest, "From inner merge:");
> +		strbuf_addstr(dest, "  From inner merge:");
>  		strbuf_addchars(dest, ' ', opt->priv->call_depth * 2);

Ditto, even though this is not as horrible as the change to builtin/am.c
we saw earlier.

I'll stop here.

  reply	other threads:[~2023-01-18 16:09 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-17 21:54 [PATCH] git: replace strbuf_addstr with strbuf_addch for all strings of length 2 Rose via GitGitGadget
2023-01-18 16:04 ` Junio C Hamano [this message]
2023-01-18 18:53   ` Eric Sunshine

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqr0vr6d80.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=doremylover123@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).