Git Mailing List Archive on lore.kernel.org
From: jnareb@gmail.com (Jakub Narębski)
To: Abhishek Kumar <abhishekkumar8222@gmail.com>
Cc: git@vger.kernel.org,
	"Abhishek Kumar via GitGitGadget" <gitgitgadget@gmail.com>,
	"Derrick Stolee" <stolee@gmail.com>,
	"Taylor Blau" <me@ttaylorr.com>,
	"Jakub Narębski" <jnareb@gmail.com>
Subject: Re: [PATCH v3 09/11] commit-graph: use generation v2 only if entire chain does
Date: Wed, 26 Aug 2020 12:38:04 +0200
Message-ID: <857dtld75f.fsf@gmail.com>
In-Reply-To: <20200826071519.GA6805@Abhishek-Arch> (Abhishek Kumar's message of "Wed, 26 Aug 2020 12:45:19 +0530")

Hi Abhishek,

Abhishek Kumar <abhishekkumar8222@gmail.com> writes:
> On Sat, Aug 22, 2020 at 07:14:38PM +0200, Jakub Narębski wrote:
>> Hi Abhishek,
>>
>> ...
>>
>> However the commit message does not say anything about the *writing* side.
>>
>
> Revised the commit message to include the following at the end:
>
> When writing the new layer in a split commit-graph, we write a GDAT chunk
> only if the topmost layer has a GDAT chunk. This guarantees that if a
> layer has a GDAT chunk, all lower layers must have a GDAT chunk as well.
>

All right.

> Rewriting layers follows a similar approach: if the topmost layer below
> the set of layers being rewritten (in the split commit-graph chain) exists,
> and it does not contain a GDAT chunk, then the result of the rewrite does
> not have GDAT chunks either.

All right.

I see that you went with the proposed more complex (but better) solution...

>>
>> ...
>>
>> To be more detailed, without '--split=replace' we would want the following
>> layer merging behavior:
>>
>>    [layer with GDAT][with GDAT][without GDAT][without GDAT][without GDAT]
>>            1              2           3             4            5
>>
>> In the split commit-graph chain above, merging two topmost layers
>> (layers 4 and 5) should create a layer without GDAT; merging three
>> topmost layers (and any other layers, e.g. two middle ones, i.e. 3 and
>> 4) should create a new layer with GDAT.

A simpler solution would be to create a new merged layer without GDAT if
any of the layers being merged does not have GDAT.

In this solution merging 3+4+5, 3+4, and even 2+3 would result in a
layer without GDAT, and only merging 1+2 would result in a layer with GDAT.
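
To make the difference between the two rules concrete, here is a rough
sketch.  It is a toy model and not git code: the has_gdat array and both
function names are made up for illustration, and returning 1 when there
is no layer below is my reading of the revised commit message rather
than something the patch states.

```c
#include <assert.h>

/*
 * Toy model, not git code: has_gdat[i] is 1 if layer i has a GDAT chunk,
 * with i == 0 being the base layer.  The layers lowest..highest (inclusive)
 * are merged into one new layer.
 */

/* Rule from the revised commit message: only the layer just below matters. */
static int merged_gdat_by_layer_below(const int *has_gdat, int lowest)
{
	assert(lowest >= 0);
	if (lowest == 0)
		return 1;	/* assumption: no layer below, so nothing constrains us */
	return has_gdat[lowest - 1];
}

/* Simpler rule: write GDAT only if every merged layer already has it. */
static int merged_gdat_if_all_have(const int *has_gdat, int lowest, int highest)
{
	int i;

	assert(0 <= lowest && lowest <= highest);
	for (i = lowest; i <= highest; i++)
		if (!has_gdat[i])
			return 0;
	return 1;
}
```

With the five-layer chain from the picture, i.e. has_gdat = { 1, 1, 0, 0, 0 },
merging layers 4+5 (indices 3..4) gives no GDAT under both rules, while
merging 3+4+5 (indices 2..4) gives GDAT under the first rule and no GDAT
under the simpler one.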

>>
>>    [layer with GDAT][with GDAT][without GDAT][-------without GDAT-------]
>>            1              2           3               merged
>>
>>    [layer with GDAT][with GDAT][-------------with GDAT------------------]
>>            1              2                    merged
>>
>> I hope those ASCII-art pictures help in understanding it.
>>
>
> Thanks! They were helpful.
>
> While we work as expected in the first scenario, i.e. merging 4 and 5, we
> would *still* write a layer without GDAT in the second scenario.
>
> I have tweaked split_graph_merge_strategy() to fix this:
>
> ----------------------------------------------
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 6d54d9a286..246fad030d 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -1973,6 +1973,9 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
>  		}
>  	}
>
> +	if (!ctx->write_generation_data && g->chunk_generation_data)
> +		ctx->write_generation_data = 1;
> +
>  	if (flags != COMMIT_GRAPH_SPLIT_REPLACE)
>  		ctx->new_base_graph = g;
>  	else if (ctx->num_commit_graphs_after != 1)

...which turned out to be not that complicated.  Nice work!

Though this needs tests that it fulfills the stated condition (because I
am not sure if it is entirely correct: we are not checking the layer
below the current one, are we?... ah, you explain it below).

One possible solution would be to grep `test-tool read-graph` output for
"^chunks: ", then pass it through `uniq` (without `sort`!), check that
the number of lines is less than or equal to 2, and if there are two lines
then check that we get the following contents:

  chunks: oid_fanout oid_lookup commit_metadata generation_data
  chunks: oid_fanout oid_lookup commit_metadata

(assuming that information about layers is added in top-down order).

This test must be run with GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0, which
I think is the default.
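
The logic of that check, written out as a rough sketch: the real test
would be a few lines of shell, and chunk_lines_ok() is a made-up helper
(not part of git) that takes the "chunks: " lines in the order they were
printed, one per layer, assuming the GDAT layers come first as above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git.  Returns 1 if the sequence of
 * "chunks: ..." lines passes the check described above, 0 otherwise.
 */
static int chunk_lines_ok(const char **lines, size_t nr)
{
	const char *distinct[2];
	size_t nr_distinct = 0;
	size_t i;

	assert(lines != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		/* like uniq(1) without sort: drop a line equal to its predecessor */
		if (i > 0 && !strcmp(lines[i], lines[i - 1]))
			continue;
		if (nr_distinct == 2)
			return 0;	/* more than two runs of identical lines */
		distinct[nr_distinct++] = lines[i];
	}

	if (nr_distinct < 2)
		return 1;	/* every layer lists the same chunks */

	/* two runs: the first must list generation_data, the second must not */
	return strstr(distinct[0], " generation_data") != NULL &&
	       strstr(distinct[1], " generation_data") == NULL;
}
```

Collapsing only adjacent duplicates is what makes this catch a chain
where layers with and without GDAT alternate: such a chain produces
three or more runs and fails the check.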

> ----------------------------------------------------
>
> That is, if we were not writing generation data (because of mixed
> generation concerns) but the new topmost layer has a generation data
> chunk, we have merged all layers without GDAT chunk and can now write a
> GDAT chunk safely.

All right.

[...]
>>> diff --git a/commit-graph.h b/commit-graph.h
>>> index f78c892fc0..3cf89d895d 100644
>>> --- a/commit-graph.h
>>> +++ b/commit-graph.h
>>> @@ -63,6 +63,7 @@ struct commit_graph {
>>>  	struct object_directory *odb;
>>>
>>>  	uint32_t num_commits_in_base;
>>> +	uint32_t read_generation_data;
>>>  	struct commit_graph *base_graph;
>>>
>>
>> First, why is `read_generation_data` of uint32_t type, when it stores
>> (as far as I understand it) a "boolean" value of either 0 or 1?
>
> Yes, using unsigned int instead of uint32_t (although in most cases
> it would be the same).  If commit_graph had other flags as well, we could
> have used a bit field.

OK.

>> Second, couldn't we simply set chunk_generation_data to NULL?  Or would
>> that interfere with the case of rewriting, where we want to use existing
>> GDAT data when writing new commit-graph with GDAT chunk?
>
> It interferes with rewriting the split commit-graph, as you might have
> guessed from the above code snippet.

All right.

[...]
>>> @@ -885,6 +908,7 @@ void load_commit_graph_info(struct repository *r, struct commit *item)
>>>  	uint32_t pos;
>>>  	if (!prepare_commit_graph(r))
>>>  		return;
>>> +
>>>  	if (find_commit_in_graph(item, r->objects->commit_graph, &pos))
>>>  		fill_commit_graph_info(item, r->objects->commit_graph, pos);
>>>  }
>>
>> This is an unrelated whitespace fix, a "while at it" in the neighbourhood
>> of changes.  All right then.
>>
>
> Reverted this change, as it's unimportant.

Actually I am not against fixing the whitespace in the neighbourhood of
changes, so you can keep it or revert it (discard).

>>> @@ -2192,6 +2216,9 @@ int write_commit_graph(struct object_directory *odb,
>>
>> ...
>>
>> It would be nice to have an example with merging layers (whether we
>> would handle it in strict or relaxed way).
>>
>
> Sure, will add.

Thanks.


Best,
--
Jakub Narębski

