From: Elijah Newren <newren@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: "Git Mailing List" <git@vger.kernel.org>,
"Junio C Hamano" <gitster@pobox.com>,
"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
"Martin Ågren" <martin.agren@gmail.com>,
"Andrzej Hunt" <ajrhunt@google.com>, "Jeff King" <peff@peff.net>
Subject: Re: [PATCH 06/10] dir.c: get rid of lazy initialization
Date: Mon, 4 Oct 2021 06:45:00 -0700
Message-ID: <CABPp-BFpyyJ-e8p5fbmCvyaEsfUow=RP45Nw0ckiwNEvVC4zrg@mail.gmail.com>
In-Reply-To: <patch-06.10-2b243d91696-20211004T002226Z-avarab@gmail.com>
On Sun, Oct 3, 2021 at 5:46 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> Remove the "Lazy initialization" in prep_exclude() left behind by
> aceb9429b37 (prep_exclude: remove the artificial PATH_MAX limit,
> 2014-07-14).
>
> Now that every caller who sets up a "struct dir_struct" is using the
> DIR_INIT macro we can rely on it to have done the initialization. As
> noted in an analysis of the previous control flow[1] an earlier
> passing of "dir->basebuf.buf" to strncmp() wasn't buggy, as we'd
> only reach that code on subsequent invocations of prep_exclude(),
> i.e. after this strbuf_init() had been run. But keeping track of that
> makes for hard-to-read code. Let's just rely on the initialization
> instead.
Having read through the link previously, this all makes sense to me,
but I'm not sure if this paragraph motivates the change without that
context. Maybe another reader can comment.
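For readers without that context, here is a rough sketch of the two
styles as I understand them. The toy_* names and base_matches*()
helpers are made up for illustration and are not git's API; the real
code uses "struct strbuf", STRBUF_INIT and DIR_INIT.

```c
#include <stddef.h>
#include <string.h>

/* Made-up stand-ins for "struct strbuf" and "struct dir_struct". */
static char toy_slopbuf[1];	/* shared, always-empty string */

struct toy_buf {
	size_t alloc, len;
	char *buf;
};
#define TOY_BUF_INIT { .buf = toy_slopbuf }

struct toy_dir {
	int flags;
	struct toy_buf basebuf;
};
#define TOY_DIR_INIT { .basebuf = TOY_BUF_INIT }

/* Old style: callers memset() the struct, so "buf" may still be NULL. */
static int base_matches_lazy(struct toy_dir *dir, const char *base, size_t len)
{
	if (!dir->basebuf.buf)	/* lazy initialization on first use */
		dir->basebuf.buf = toy_slopbuf;
	return !strncmp(dir->basebuf.buf, base, len);
}

/* New style: callers use TOY_DIR_INIT, so "buf" is always a valid string. */
static int base_matches(const struct toy_dir *dir, const char *base, size_t len)
{
	return !strncmp(dir->basebuf.buf, base, len);
}
```

In the old style every reader of "buf" has to know that the NULL check
ran first; with the initializer that ordering question goes away.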
> This does change the behavior of this code in that it won't be
> pre-growing the strbuf to a size of PATH_MAX. I think that's OK.
>
> That we were using PATH_MAX at all is just a relic from this being a
> fixed buffer from way back in f87f9497486 (git-ls-files: --exclude
> mechanism updates., 2005-07-24).
>
> Pre-allocating PATH_MAX was the opposite of an optimization in this
> case. I logged all "basebuf.buf" values when running the test suite,
> and by far the most common one (around 80%) is "", which we now won't
> allocate at all for, and just use the "strbuf_slopbuf".
>
> The second most common one was "a/", followed by other common cases of
> short relative paths. So using the default "struct strbuf" growth
> pattern is a much better allocation optimization in this case.
>
> 1. https://lore.kernel.org/git/87sfxhohsj.fsf@evledraar.gmail.com/
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
> dir.c | 8 --------
> dir.h | 4 +++-
> 2 files changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/dir.c b/dir.c
> index 39fce3bcba7..efc87c2e405 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1550,14 +1550,6 @@ static void prep_exclude(struct dir_struct *dir,
>  	if (dir->pattern)
>  		return;
>
> -	/*
> -	 * Lazy initialization. All call sites currently just
> -	 * memset(dir, 0, sizeof(*dir)) before use. Changing all of
> -	 * them seems lots of work for little benefit.
> -	 */
> -	if (!dir->basebuf.buf)
> -		strbuf_init(&dir->basebuf, PATH_MAX);
> -
>  	/* Read from the parent directories and push them down. */
>  	current = stk ? stk->baselen : -1;
>  	strbuf_setlen(&dir->basebuf, current < 0 ? 0 : current);
> diff --git a/dir.h b/dir.h
> index ff3b4a7f602..e3757c6099e 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -342,7 +342,9 @@ struct dir_struct {
>  	unsigned visited_directories;
>  };
>
> -#define DIR_INIT { 0 }
> +#define DIR_INIT { \
> +	.basebuf = STRBUF_INIT, \
> +}
>
> struct dirent *readdir_skip_dot_and_dotdot(DIR *dirp);
>
> --
> 2.33.0.1404.g83021034c5d
Wahoo! Nice code cleanup.
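To make the allocation point concrete, here is a small sketch that
compares pre-growing against growing on demand for the two most common
values mentioned above ("" and "a/"). Everything named toy_* or
TOY_* is made up for illustration, TOY_PATH_MAX is an assumed value,
and the growth policy is simplified to "exactly what is needed", so
the byte counts only approximate what the real strbuf code would do.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PATH_MAX 4096	/* assumed; the real PATH_MAX is platform-specific */

static char toy_slopbuf[1];	/* shared "" used while nothing is allocated */

struct toy_buf {
	size_t alloc, len;
	char *buf;
};
#define TOY_BUF_INIT { .buf = toy_slopbuf }

/* Make room for "extra" more bytes plus the trailing NUL. */
static void toy_buf_grow(struct toy_buf *sb, size_t extra)
{
	size_t need = sb->len + extra + 1;
	char *p;

	if (need <= sb->alloc)
		return;
	/* Never realloc() the static slopbuf; start from a fresh block. */
	p = realloc(sb->alloc ? sb->buf : NULL, need);
	if (!p)
		abort();
	sb->buf = p;
	sb->alloc = need;
	sb->buf[sb->len] = '\0';
}

/* Append "s"; an empty string leaves the buffer untouched. */
static void toy_buf_addstr(struct toy_buf *sb, const char *s)
{
	size_t n = strlen(s);

	if (!n)
		return;
	toy_buf_grow(sb, n);
	memcpy(sb->buf + sb->len, s, n + 1);
	sb->len += n;
}

/* Free any heap block and point back at the shared empty string. */
static void toy_buf_release(struct toy_buf *sb)
{
	if (sb->alloc)
		free(sb->buf);
	sb->buf = toy_slopbuf;
	sb->alloc = sb->len = 0;
}

/* Heap bytes held after storing "path", with or without pre-growing. */
static size_t toy_heap_bytes(const char *path, int pregrow)
{
	struct toy_buf sb = TOY_BUF_INIT;
	size_t held;

	if (pregrow)
		toy_buf_grow(&sb, TOY_PATH_MAX);
	toy_buf_addstr(&sb, path);
	held = sb.alloc;
	toy_buf_release(&sb);
	return held;
}
```

In this sketch the empty path holds no heap memory at all when grown on
demand, while the pre-grown variant holds TOY_PATH_MAX + 1 bytes no
matter how short the path is.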