From: Derrick Stolee <stolee@gmail.com>
To: Abhishek Kumar <abhishekkumar8222@gmail.com>, git@vger.kernel.org
Cc: jnareb@gmail.com
Subject: Re: [GSoC Patch 1/3] commit: introduce helpers for generation slab
Date: Thu, 4 Jun 2020 10:36:49 -0400
Message-ID: <be28ab7b-0ae4-2cc5-7f2b-92075de3723a@gmail.com>
In-Reply-To: <20200604072759.19142-2-abhishekkumar8222@gmail.com>

On 6/4/2020 3:27 AM, Abhishek Kumar wrote:
> The struct member generation refers to "generation number" (or more
> broadly, a reachability index value) used by commit-graph to reduce time
> taken to walk commits. However, generation is not useful in other
> contexts and bloats the struct.
> 
> Let's move it to a commit-slab and shrink the struct by four bytes.
> 
> Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
> ---
>  commit-graph.c | 27 +++++++++++++++++++++++++++
>  commit-graph.h |  5 +++++
>  commit.h       |  3 ---
>  3 files changed, 32 insertions(+), 3 deletions(-)
> 
> diff --git a/commit-graph.c b/commit-graph.c
> index e3420ddcbf..63f419048d 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -87,6 +87,33 @@ static int commit_pos_cmp(const void *va, const void *vb)
>  	       commit_pos_at(&commit_pos, b);
>  }
>  
> +define_commit_slab(generation_slab, uint32_t);
> +static struct generation_slab generation_slab = COMMIT_SLAB_INIT(1, generation_slab);
> +
> +uint32_t generation(const struct commit *c)
> +{
> +	uint32_t *gen = generation_slab_peek(&generation_slab, c);
> +
> +	return gen ? *gen : GENERATION_NUMBER_INFINITY;
> +}

This is good: if we don't have the value, then use INFINITY.
In the header file, perhaps include a warning comment that a
caller _must_ first parse the commit or else we have no guarantee
that the generation slab is populated. This matches the current
expectations before accessing the generation member.

> +static void set_generation(const struct commit *c, const uint32_t generation)
> +{
> +	unsigned int i = generation_slab.slab_count;
> +	uint32_t *gen = generation_slab_at(&generation_slab, c);
> +
> +	/*
> +	 * commit-slab initializes with zero, overwrite this with
> +	 * GENERATION_NUMBER_INFINITY
> +	 */
> +	for (; i < generation_slab.slab_count; ++i) {
> +		memset(generation_slab.slab[i], GENERATION_NUMBER_INFINITY,
> +		       generation_slab.slab_size * sizeof(uint32_t));
> +	}

Here is an example where combining the graph_pos and generation
slabs into one would be helpful. The only reason the generation
would be INFINITY is if graph_pos is COMMIT_NOT_FROM_GRAPH. If
the two values are side-by-side, we could just check graph_pos
first and return INFINITY instead of paying this initialization
cost as the slab grows.
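
A minimal sketch of that idea, not code from the patch: the names
graph_data_sketch and generation_sketch are hypothetical. It assumes
an entry's graph_pos stays COMMIT_NOT_FROM_GRAPH until the commit is
loaded from the commit-graph, and that the caller passes NULL when the
slab has no entry for the commit.

```c
#include <stddef.h>
#include <stdint.h>

#define COMMIT_NOT_FROM_GRAPH 0xFFFFFFFF
#define GENERATION_NUMBER_INFINITY 0xFFFFFFFF

/* Hypothetical combined slab entry: both values side-by-side. */
struct graph_data_sketch {
	uint32_t graph_pos;
	uint32_t generation;
};

/*
 * Return the generation for an entry. The generation field is only
 * read when graph_pos says the commit came from the commit-graph, so
 * it never needs to be pre-filled with INFINITY.
 */
static uint32_t generation_sketch(const struct graph_data_sketch *data)
{
	if (!data || data->graph_pos == COMMIT_NOT_FROM_GRAPH)
		return GENERATION_NUMBER_INFINITY;
	return data->generation;
}
```

With this shape, only graph_pos has to carry a sentinel; whatever the
generation field holds is ignored for commits outside the commit-graph.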

I would also like to avoid initializing the slab if there is
no commit-graph present. I wonder if we can populate the slab
while parsing the commit-graph and check here if the slab is
NULL before doing any other logic? (I'm not sure if this is
possible, but it would be nice.)

> diff --git a/commit-graph.h b/commit-graph.h
> index 4212766a4f..653bd041ad 100644
> --- a/commit-graph.h
> +++ b/commit-graph.h
> @@ -8,6 +8,10 @@
>  #include "object-store.h"
>  #include "oidset.h"
>  
> +#define GENERATION_NUMBER_INFINITY 0xFFFFFFFF
> +#define GENERATION_NUMBER_MAX 0x3FFFFFFF
> +#define GENERATION_NUMBER_ZERO 0
> +
>  #define GIT_TEST_COMMIT_GRAPH "GIT_TEST_COMMIT_GRAPH"
>  #define GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD "GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD"
>  #define GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS "GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS"
> @@ -137,4 +141,5 @@ void free_commit_graph(struct commit_graph *);
>   */
>  void disable_commit_graph(struct repository *r);
>  
> +uint32_t generation(const struct commit *c);
>  #endif
> diff --git a/commit.h b/commit.h
> index 1b2dea5d85..cc610400d5 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -11,9 +11,6 @@
>  #include "commit-slab.h"
>  
>  #define COMMIT_NOT_FROM_GRAPH 0xFFFFFFFF
> -#define GENERATION_NUMBER_INFINITY 0xFFFFFFFF
> -#define GENERATION_NUMBER_MAX 0x3FFFFFFF
> -#define GENERATION_NUMBER_ZERO 0

I appreciate that you are able to relocate these constants to
a more appropriate location.

Thanks,
-Stolee


