All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH 5/5] load_ref_decorations(): avoid parsing non-tag objects
Date: Tue, 22 Jun 2021 19:06:43 +0200	[thread overview]
Message-ID: <875yy53juu.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <YNILCDz3LpHX7OX0@coredump.intra.peff.net>


On Tue, Jun 22 2021, Jeff King wrote:

> When we load the ref decorations, we parse the object pointed to by each
> ref in order to get a "struct object". This is unnecessarily expensive;
> we really only need the object struct, and don't even look at the parsed
> contents. The exception is tags, which we do need to peel.
>
> We can improve this by looking up the object type first (which is much
> cheaper), and skipping the parse entirely for non-tags. This increases
> the work slightly for annotated tags (which now do a type lookup _and_ a
> parse), but decreases it a lot for other types. On balance, this seems
> to be a good tradeoff.
>
> In my git.git clone, with ~2k refs, most of which are branches, the time
> to run "git log -1 --decorate" drops from 34ms to 11ms. Even on my
> linux.git clone, which contains mostly tags and only a handful of
> branches, the time drops from 30ms to 19ms. And on a more extreme
> real-world case with ~220k refs, mostly non-tags, the time drops from
> 2.6s to 650ms.
>
> That command is a lop-sided example, of course, because it does as
> little non-loading work as possible. But it does show the absolute time
> improvement. Even in something like a full "git log --decorate" on that
> extreme repo, we'd still be saving 2s of CPU time.
>
> Ideally we could push this even further, and avoid parsing even tags, by
> relying on the packed-refs "peel" optimization (which we could do by
> calling peel_iterated_oid() instead of peeling manually). But we can't
> do that here. The packed-refs file only stores the bottom-layer of the
> peel (so in a "tag->tag->commit" chain, it stores only the commit as the
> peel result).  But the decoration code wants to peel the layers
> individually, annotating the middle layers of the chain.
>
> If the packed-refs file ever learns to store all of the peeled layers,
> then we could switch to it. Or even if it stored a flag to indicate the
> peel was not multi-layer (because most of them aren't), then we could
> use it most of the time and fall back to a manual peel for the rare
> cases.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  log-tree.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/log-tree.c b/log-tree.c
> index 7b823786c2..8b700e9c14 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -134,6 +134,7 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid,
>  			      int flags, void *cb_data)
>  {
>  	struct object *obj;
> +	enum object_type objtype;
>  	enum decoration_type type = DECORATION_NONE;
>  	struct decoration_filter *filter = (struct decoration_filter *)cb_data;
>  
> @@ -155,9 +156,10 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid,
>  		return 0;
>  	}
>  
> -	obj = parse_object(the_repository, oid);
> -	if (!obj)
> +	objtype = oid_object_info(the_repository, oid, NULL);
> +	if (type < 0)
>  		return 0;
> +	obj = lookup_object_by_type(the_repository, oid, objtype);

This series looks good. I just wonder if between this and my own
lookup_{blob,tree,tag,commit}_type() in [1] whether exposing some
function between what we have now in parse_object() and
parse_object_buffer() wouldn't also do this for you.

I.e. in my patch if you pass a type into parse_object_buffer() I think
you'll get the same behavior.

To be clear I see nothing wrong with this, it's more of a musing about
how some functions in object.c discover the type on their own, others
allow passing it in, sometimes (worse before that series of mine) we
relay the not-real-but-inferred-type etc.

1. https://lore.kernel.org/git/patch-10.11-a84f670ac24-20210328T021238Z-avarab@gmail.com/

  parent reply	other threads:[~2021-06-22 17:11 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-22 16:03 [PATCH 0/5] some "log --decorate" optimizations Jeff King
2021-06-22 16:03 ` [PATCH 1/5] pretty.h: update and expand docstring for userformat_find_requirements() Jeff King
2021-06-22 16:04 ` [PATCH 2/5] log: avoid loading decorations for userformats that don't need it Jeff King
2021-06-22 16:05 ` [PATCH 3/5] object.h: expand docstring for lookup_unknown_object() Jeff King
2021-06-22 16:06 ` [PATCH 4/5] object.h: add lookup_object_by_type() function Jeff King
2021-06-22 16:08 ` [PATCH 5/5] load_ref_decorations(): avoid parsing non-tag objects Jeff King
2021-06-22 16:35   ` Derrick Stolee
2021-06-22 17:06     ` Jeff King
2021-06-22 17:09       ` Jeff King
2021-06-22 17:25         ` Derrick Stolee
2021-06-22 18:27       ` Ævar Arnfjörð Bjarmason
2021-06-22 19:08         ` Jeff King
2021-06-22 17:06   ` Ævar Arnfjörð Bjarmason [this message]
2021-06-22 18:57     ` Jeff King
2021-06-23  2:46   ` Taylor Blau
2021-06-23 21:51     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=875yy53juu.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    --subject='Re: [PATCH 5/5] load_ref_decorations(): avoid parsing non-tag objects' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.