git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Abhishek Kumar via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: "Jakub Narębski" <jnareb@gmail.com>,
	"Abhishek Kumar" <abhishekkumar8222@gmail.com>,
	"Taylor Blau" <me@ttaylorr.com>
Subject: Re: [PATCH 0/6] [GSoC] Implement Corrected Commit Date
Date: Tue, 28 Jul 2020 12:35:55 -0400	[thread overview]
Message-ID: <2ae8e6e0-9bf7-4e47-2a93-5e5092abe77a@gmail.com> (raw)
In-Reply-To: <pull.676.git.1595927632.gitgitgadget@gmail.com>

On 7/28/2020 5:13 AM, Abhishek Kumar via GitGitGadget wrote:
> This patch series implements the corrected commit date offsets as generation
> number v2, along with other pre-requisites.
> 
> Git uses topological levels in the commit-graph file for commit-graph
> traversal operations like git log --graph. Unfortunately, using topological
> levels can result in a worse performance than without them when compared
> with committer date as a heuristics. For example, git merge-base v4.8 v4.9 
> on the Linux repository walks 635,579 commits using topological levels and
> walks 167,468 using committer date.
> 
> Thus, the need for generation number v2 was born. New generation number
> needed to provide good performance, increment updates, and backward
> compatibility. Due to an unfortunate problem, we also needed a way to
> distinguish between the old and new generation number without incrementing
> graph version.
> 
> Various candidates were examined (https://github.com/derrickstolee/gen-test, 
> https://github.com/abhishekkumar2718/git/pull/1). The proposed generation
> number v2, Corrected Commit Date with Mononotically Increasing Offsets 
> performed much worse than committer date (506,577 vs. 167,468 commits walked
> for git merge-base v4.8 v4.9) and was dropped.
> 
> Using Generation Data chunk (GDAT) relieves the requirement of backward
> compatibility as we would continue to store topological levels in Commit
> Data (CDAT) chunk. Thus, Corrected Commit Date was chosen as generation
> number v2. The Corrected Commit Date is defined as:
> 
> For a commit C, let its corrected commit date be the maximum of the commit
> date of C and the corrected commit dates of its parents. Then corrected
> commit date offset is the difference between corrected commit date of C and
> commit date of C.
> 
> We will introduce an additional commit-graph chunk, Generation Data chunk,
> and store corrected commit date offsets in GDAT chunk while storing
> topological levels in CDAT chunk. The old versions of Git would ignore GDAT
> chunk, using topological levels from CDAT chunk. In contrast, new versions
> of Git would use corrected commit dates, falling back to topological level
> if the generation data chunk is absent in the commit-graph file.
> 
> Here's what left for the PR (which I intend to take on with the second
> version of pull request):
> 
>  1. Add an option to skip writing generation data chunk (to test whether new
>     Git works without GDAT as intended).

This would be a good idea, if only as a GIT_TEST_* environment variable.
I think it important we have a test for the compatibility scenario where
we have an "old" commit-graph with the new code and test that reading and
writing still works properly.

>  2. Handle writing to commit-graph for mismatched version (that is, merging
>     all graphs into a new graph with a GDAT chunk).

This is an excellent thing to do. There are a few options when writing an
incremental commit-graph when the base graphs do not have the GDAT chunk:

   i. Do not write the GDAT chunk unless we are merging all levels
      (based on the merging strategy).

  ii. Merge all levels, then write the GDAT chunk.

>  3. Update technical documentation.

Yes, I was going to ask for a patch that updates
Documentation/technical/commit-graph-format.txt.

This is an excellent v1. A lot of small things, but no
really big issues.

Thanks,
-Stolee

  parent reply	other threads:[~2020-07-28 16:36 UTC|newest]

Thread overview: 211+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-28  9:13 [PATCH 0/6] [GSoC] Implement Corrected Commit Date Abhishek Kumar via GitGitGadget
2020-07-28  9:13 ` [PATCH 1/6] commit-graph: fix regression when computing bloom filter Abhishek Kumar via GitGitGadget
2020-07-28 15:28   ` Taylor Blau
2020-07-30  5:24     ` Abhishek Kumar
2020-08-04  0:46   ` Jakub Narębski
2020-08-04  0:56     ` Taylor Blau
2020-08-04 10:10       ` Jakub Narębski
2020-08-04  7:55     ` Jakub Narębski
2020-07-28  9:13 ` [PATCH 2/6] revision: parse parent in indegree_walk_step() Abhishek Kumar via GitGitGadget
2020-07-28 13:00   ` Derrick Stolee
2020-07-28 15:30     ` Taylor Blau
2020-08-05 23:16   ` Jakub Narębski
2020-07-28  9:13 ` [PATCH 3/6] commit-graph: consolidate fill_commit_graph_info Abhishek Kumar via GitGitGadget
2020-07-28 13:14   ` Derrick Stolee
2020-07-28 15:19     ` René Scharfe
2020-07-28 15:58       ` Derrick Stolee
2020-07-28 16:01     ` Taylor Blau
2020-07-30  6:07     ` Abhishek Kumar
2020-07-28  9:13 ` [PATCH 4/6] commit-graph: consolidate compare_commits_by_gen Abhishek Kumar via GitGitGadget
2020-07-28 16:03   ` Taylor Blau
2020-07-28  9:13 ` [PATCH 5/6] commit-graph: implement generation data chunk Abhishek Kumar via GitGitGadget
2020-07-28 16:12   ` Taylor Blau
2020-07-30  6:52     ` Abhishek Kumar
2020-07-28  9:13 ` [PATCH 6/6] commit-graph: implement corrected commit date offset Abhishek Kumar via GitGitGadget
2020-07-28 15:55   ` Derrick Stolee
2020-07-28 16:23     ` Taylor Blau
2020-07-30  7:27     ` Abhishek Kumar
2020-07-28 14:54 ` [PATCH 0/6] [GSoC] Implement Corrected Commit Date Taylor Blau
2020-07-30  7:47   ` Abhishek Kumar
2020-07-28 16:35 ` Derrick Stolee [this message]
2020-08-09  2:53 ` [PATCH v2 00/10] " Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 01/10] commit-graph: fix regression when computing bloom filter Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 02/10] revision: parse parent in indegree_walk_step() Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 03/10] commit-graph: consolidate fill_commit_graph_info Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 04/10] commit-graph: consolidate compare_commits_by_gen Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 05/10] commit-graph: implement generation data chunk Abhishek Kumar via GitGitGadget
2020-08-10 16:28     ` Derrick Stolee
2020-08-11 11:03       ` Abhishek Kumar
2020-08-11 12:27         ` Derrick Stolee
2020-08-11 18:58           ` Taylor Blau
2020-08-09  2:53   ` [PATCH v2 06/10] commit-graph: return 64-bit generation number Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 07/10] commit-graph: implement corrected commit date Abhishek Kumar via GitGitGadget
2020-08-10 14:23     ` Derrick Stolee
2020-08-14  4:59       ` Abhishek Kumar
2020-08-14 12:24         ` Derrick Stolee
2020-08-09  2:53   ` [PATCH v2 08/10] commit-graph: handle mixed generation commit chains Abhishek Kumar via GitGitGadget
2020-08-10 16:42     ` Derrick Stolee
2020-08-11 11:36       ` Abhishek Kumar
2020-08-11 12:43         ` Derrick Stolee
2020-08-09  2:53   ` [PATCH v2 09/10] commit-reach: use corrected commit dates in paint_down_to_common() Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 10/10] doc: add corrected commit date info Abhishek Kumar via GitGitGadget
2020-08-10 16:47   ` [PATCH v2 00/10] [GSoC] Implement Corrected Commit Date Derrick Stolee
2020-08-15 16:39   ` [PATCH v3 00/11] " Abhishek Kumar via GitGitGadget
2020-08-15 16:39     ` [PATCH v3 01/11] commit-graph: fix regression when computing bloom filter Abhishek Kumar via GitGitGadget
2020-08-17 22:30       ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 02/11] revision: parse parent in indegree_walk_step() Abhishek Kumar via GitGitGadget
2020-08-18 14:18       ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 03/11] commit-graph: consolidate fill_commit_graph_info Abhishek Kumar via GitGitGadget
2020-08-19 17:54       ` Jakub Narębski
2020-08-21  4:11         ` Abhishek Kumar
2020-08-25 11:11           ` Jakub Narębski
2020-09-01 11:35             ` Abhishek Kumar
2020-08-15 16:39     ` [PATCH v3 04/11] commit-graph: consolidate compare_commits_by_gen Abhishek Kumar via GitGitGadget
2020-08-17 13:22       ` Derrick Stolee
2020-08-21 11:05       ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 05/11] commit-graph: return 64-bit generation number Abhishek Kumar via GitGitGadget
2020-08-21 13:14       ` Jakub Narębski
2020-08-25  5:04         ` Abhishek Kumar
2020-08-25 12:18           ` Jakub Narębski
2020-09-01 12:06             ` Abhishek Kumar
2020-09-03 13:42               ` Jakub Narębski
2020-09-05 17:21                 ` Abhishek Kumar
2020-09-13 15:39                   ` Jakub Narębski
2020-09-28 21:48                     ` Jakub Narębski
2020-10-05  5:25                       ` Abhishek Kumar
2020-08-15 16:39     ` [PATCH v3 06/11] commit-graph: add a slab to store topological levels Abhishek Kumar via GitGitGadget
2020-08-21 18:43       ` Jakub Narębski
2020-08-25  6:14         ` Abhishek Kumar
2020-08-25  7:33           ` Jakub Narębski
2020-08-25  7:56             ` Jakub Narębski
2020-09-01 10:26               ` Abhishek Kumar
2020-09-03  9:25                 ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 07/11] commit-graph: implement corrected commit date Abhishek Kumar via GitGitGadget
2020-08-22  0:05       ` Jakub Narębski
2020-08-25  6:49         ` Abhishek Kumar
2020-08-25 10:07           ` Jakub Narębski
2020-09-01 11:01             ` Abhishek Kumar
2020-08-15 16:39     ` [PATCH v3 08/11] commit-graph: implement generation data chunk Abhishek Kumar via GitGitGadget
2020-08-22 13:09       ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 09/11] commit-graph: use generation v2 only if entire chain does Abhishek Kumar via GitGitGadget
2020-08-22 17:14       ` Jakub Narębski
2020-08-26  7:15         ` Abhishek Kumar
2020-08-26 10:38           ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 10/11] commit-reach: use corrected commit dates in paint_down_to_common() Abhishek Kumar via GitGitGadget
2020-08-22 19:09       ` Jakub Narębski
2020-09-01 10:08         ` Abhishek Kumar
2020-09-03 19:11           ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 11/11] doc: add corrected commit date info Abhishek Kumar via GitGitGadget
2020-08-22 22:20       ` Jakub Narębski
2020-08-27  6:39         ` Abhishek Kumar
2020-08-27 12:43           ` Jakub Narębski
2020-08-27 13:15           ` Derrick Stolee
2020-09-01 13:01             ` Abhishek Kumar
2020-08-17  0:13     ` [PATCH v3 00/11] [GSoC] Implement Corrected Commit Date Jakub Narębski
     [not found]       ` <CANQwDwdKp7oKy9BeKdvKhwPUiq0R5MS8TCw-eWGCYCoMGv=G-g@mail.gmail.com>
2020-08-17  1:32         ` Fwd: " Taylor Blau
2020-08-17  7:56           ` Jakub Narębski
2020-08-18  6:12       ` Abhishek Kumar
2020-08-23 15:27       ` Jakub Narębski
2020-08-24  2:49         ` Abhishek Kumar
2020-10-07 14:09     ` [PATCH v4 00/10] " Abhishek Kumar via GitGitGadget
2020-10-07 14:09       ` [PATCH v4 01/10] commit-graph: fix regression when computing Bloom filters Abhishek Kumar via GitGitGadget
2020-10-24 23:16         ` Jakub Narębski
2020-10-25 20:58           ` Taylor Blau
2020-11-03  5:36             ` Abhishek Kumar
2020-10-07 14:09       ` [PATCH v4 02/10] revision: parse parent in indegree_walk_step() Abhishek Kumar via GitGitGadget
2020-10-24 23:41         ` Jakub Narębski
2020-10-07 14:09       ` [PATCH v4 03/10] commit-graph: consolidate fill_commit_graph_info Abhishek Kumar via GitGitGadget
2020-10-25 10:52         ` Jakub Narębski
2020-10-27  6:33           ` Abhishek Kumar
2020-10-07 14:09       ` [PATCH v4 04/10] commit-graph: return 64-bit generation number Abhishek Kumar via GitGitGadget
2020-10-25 13:48         ` Jakub Narębski
2020-11-03  6:40           ` Abhishek Kumar
2020-10-07 14:09       ` [PATCH v4 05/10] commit-graph: add a slab to store topological levels Abhishek Kumar via GitGitGadget
2020-10-25 22:17         ` Jakub Narębski
2020-10-07 14:09       ` [PATCH v4 06/10] commit-graph: implement corrected commit date Abhishek Kumar via GitGitGadget
2020-10-27 18:53         ` Jakub Narębski
2020-11-03 11:44           ` Abhishek Kumar
2020-11-04 16:45             ` Jakub Narębski
2020-11-05 14:05               ` Philip Oakley
2020-11-05 18:22                 ` Junio C Hamano
2020-11-06 18:26                   ` Extending and updating gitglossary (was: Re: [PATCH v4 06/10] commit-graph: implement corrected commit date) Jakub Narębski
2020-11-06 19:33                     ` Extending and updating gitglossary Junio C Hamano
2020-11-08 17:23                     ` Extending and updating gitglossary (was: Re: [PATCH v4 06/10] commit-graph: implement corrected commit date) Philip Oakley
2020-11-10  1:35                       ` Extending and updating gitglossary Jakub Narębski
2020-11-10 14:04                         ` Philip Oakley
2020-11-10 23:52                           ` Jakub Narębski
2020-10-07 14:09       ` [PATCH v4 07/10] commit-graph: implement generation data chunk Abhishek Kumar via GitGitGadget
2020-10-30 12:45         ` Jakub Narębski
2020-11-06 11:25           ` Abhishek Kumar
2020-11-06 17:56             ` Jakub Narębski
2020-10-07 14:09       ` [PATCH v4 08/10] commit-graph: use generation v2 only if entire chain does Abhishek Kumar via GitGitGadget
2020-11-01  0:55         ` Jakub Narębski
2020-11-12 10:01           ` Abhishek Kumar
2020-11-13  9:59             ` Jakub Narębski
2020-10-07 14:09       ` [PATCH v4 09/10] commit-reach: use corrected commit dates in paint_down_to_common() Abhishek Kumar via GitGitGadget
2020-11-03 17:59         ` Jakub Narębski
2020-11-03 18:19           ` Junio C Hamano
2020-11-20 10:33           ` Abhishek Kumar
2020-10-07 14:09       ` [PATCH v4 10/10] doc: add corrected commit date info Abhishek Kumar via GitGitGadget
2020-11-04  1:37         ` Jakub Narębski
2020-11-21  6:30           ` Abhishek Kumar
2020-11-04 23:37       ` [PATCH v4 00/10] [GSoC] Implement Corrected Commit Date Jakub Narębski
2020-11-22  5:31         ` Abhishek Kumar
2020-12-28 11:15       ` [PATCH v5 00/11] " Abhishek Kumar via GitGitGadget
2020-12-28 11:15         ` [PATCH v5 01/11] commit-graph: fix regression when computing Bloom filters Abhishek Kumar via GitGitGadget
2020-12-30  1:35           ` Derrick Stolee
2021-01-08  5:45             ` Abhishek Kumar
2021-01-05  9:45           ` SZEDER Gábor
2021-01-05  9:47             ` SZEDER Gábor
2021-01-08  5:51             ` Abhishek Kumar
2020-12-28 11:15         ` [PATCH v5 02/11] revision: parse parent in indegree_walk_step() Abhishek Kumar via GitGitGadget
2020-12-28 11:16         ` [PATCH v5 03/11] commit-graph: consolidate fill_commit_graph_info Abhishek Kumar via GitGitGadget
2020-12-28 11:16         ` [PATCH v5 04/11] t6600-test-reach: generalize *_three_modes Abhishek Kumar via GitGitGadget
2020-12-28 11:16         ` [PATCH v5 05/11] commit-graph: add a slab to store topological levels Abhishek Kumar via GitGitGadget
2020-12-28 11:16         ` [PATCH v5 06/11] commit-graph: return 64-bit generation number Abhishek Kumar via GitGitGadget
2020-12-28 11:16         ` [PATCH v5 07/11] commit-graph: implement corrected commit date Abhishek Kumar via GitGitGadget
2020-12-30  1:53           ` Derrick Stolee
2021-01-10 12:21             ` Abhishek Kumar
2020-12-28 11:16         ` [PATCH v5 08/11] commit-graph: implement generation data chunk Abhishek Kumar via GitGitGadget
2020-12-28 11:16         ` [PATCH v5 09/11] commit-graph: use generation v2 only if entire chain does Abhishek Kumar via GitGitGadget
2020-12-30  3:23           ` Derrick Stolee
2021-01-10 13:13             ` Abhishek Kumar
2021-01-11 12:43               ` Derrick Stolee
2020-12-28 11:16         ` [PATCH v5 10/11] commit-reach: use corrected commit dates in paint_down_to_common() Abhishek Kumar via GitGitGadget
2020-12-28 11:16         ` [PATCH v5 11/11] doc: add corrected commit date info Abhishek Kumar via GitGitGadget
2020-12-30  4:35         ` [PATCH v5 00/11] [GSoC] Implement Corrected Commit Date Derrick Stolee
2021-01-10 14:06           ` Abhishek Kumar
2021-01-16 18:11         ` [PATCH v6 " Abhishek Kumar via GitGitGadget
2021-01-16 18:11           ` [PATCH v6 01/11] commit-graph: fix regression when computing Bloom filters Abhishek Kumar via GitGitGadget
2021-01-16 18:11           ` [PATCH v6 02/11] revision: parse parent in indegree_walk_step() Abhishek Kumar via GitGitGadget
2021-01-16 18:11           ` [PATCH v6 03/11] commit-graph: consolidate fill_commit_graph_info Abhishek Kumar via GitGitGadget
2021-01-16 18:11           ` [PATCH v6 04/11] t6600-test-reach: generalize *_three_modes Abhishek Kumar via GitGitGadget
2021-01-16 18:11           ` [PATCH v6 05/11] commit-graph: add a slab to store topological levels Abhishek Kumar via GitGitGadget
2021-01-16 18:11           ` [PATCH v6 06/11] commit-graph: return 64-bit generation number Abhishek Kumar via GitGitGadget
2021-01-16 18:11           ` [PATCH v6 07/11] commit-graph: implement corrected commit date Abhishek Kumar via GitGitGadget
2021-01-16 18:11           ` [PATCH v6 08/11] commit-graph: implement generation data chunk Abhishek Kumar via GitGitGadget
2021-01-16 18:11           ` [PATCH v6 09/11] commit-graph: use generation v2 only if entire chain does Abhishek Kumar via GitGitGadget
2021-01-16 18:11           ` [PATCH v6 10/11] commit-reach: use corrected commit dates in paint_down_to_common() Abhishek Kumar via GitGitGadget
2021-01-16 18:11           ` [PATCH v6 11/11] doc: add corrected commit date info Abhishek Kumar via GitGitGadget
2021-01-27  0:04             ` SZEDER Gábor
2021-01-30  5:29               ` Abhishek Kumar
2021-01-31  1:45                 ` Taylor Blau
2021-01-18 21:04           ` [PATCH v6 00/11] [GSoC] Implement Corrected Commit Date Derrick Stolee
2021-01-18 22:00             ` Taylor Blau
2021-01-23 12:11               ` Abhishek Kumar
2021-01-19  0:02             ` Junio C Hamano
2021-01-23 12:07             ` Abhishek Kumar
2021-02-01  6:58           ` [PATCH v7 " Abhishek Kumar via GitGitGadget
2021-02-01  6:58             ` [PATCH v7 01/11] commit-graph: fix regression when computing Bloom filters Abhishek Kumar via GitGitGadget
2021-02-01  6:58             ` [PATCH v7 02/11] revision: parse parent in indegree_walk_step() Abhishek Kumar via GitGitGadget
2021-02-01  6:58             ` [PATCH v7 03/11] commit-graph: consolidate fill_commit_graph_info Abhishek Kumar via GitGitGadget
2021-02-01  6:58             ` [PATCH v7 04/11] t6600-test-reach: generalize *_three_modes Abhishek Kumar via GitGitGadget
2021-02-01  6:58             ` [PATCH v7 05/11] commit-graph: add a slab to store topological levels Abhishek Kumar via GitGitGadget
2021-02-01  6:58             ` [PATCH v7 06/11] commit-graph: return 64-bit generation number Abhishek Kumar via GitGitGadget
2021-02-01  6:58             ` [PATCH v7 07/11] commit-graph: document generation number v2 Abhishek Kumar via GitGitGadget
2021-02-01  6:58             ` [PATCH v7 08/11] commit-graph: implement corrected commit date Abhishek Kumar via GitGitGadget
2021-02-01  6:58             ` [PATCH v7 09/11] commit-graph: implement generation data chunk Abhishek Kumar via GitGitGadget
2021-02-01  6:58             ` [PATCH v7 10/11] commit-graph: use generation v2 only if entire chain does Abhishek Kumar via GitGitGadget
2021-02-01  6:58             ` [PATCH v7 11/11] commit-reach: use corrected commit dates in paint_down_to_common() Abhishek Kumar via GitGitGadget
2021-02-01 13:14             ` [PATCH v7 00/11] [GSoC] Implement Corrected Commit Date Derrick Stolee
2021-02-01 18:26               ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2ae8e6e0-9bf7-4e47-2a93-5e5092abe77a@gmail.com \
    --to=stolee@gmail.com \
    --cc=abhishekkumar8222@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=jnareb@gmail.com \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).