Git Mailing List Archive on lore.kernel.org
 help / color / Atom feed
From: Abhishek Kumar <abhishekkumar8222@gmail.com>
To: "Jakub Narębski" <jnareb@gmail.com>
Cc: abhishekkumar8222@gmail.com, git@vger.kernel.org,
	gitgitgadget@gmail.com, stolee@gmail.com, me@ttaylorr.com
Subject: Re: [PATCH v4 03/10] commit-graph: consolidate fill_commit_graph_info
Date: Tue, 27 Oct 2020 12:03:24 +0530
Message-ID: <20201027063306.GA15674@Abhishek-Arch> (raw)
In-Reply-To: <85blgq4lxh.fsf@gmail.com>

Hello Dr. Narębski,

On Sun, Oct 25, 2020 at 11:52:42AM +0100, Jakub Narębski wrote:
> Hi Abhishek,
> 
> In short: everything is all right, except for the now duplicated test
> names in t5000 after this commit.
> 
> "Abhishek Kumar via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
> > From: Abhishek Kumar <abhishekkumar8222@gmail.com>
> >
> > Both fill_commit_graph_info() and fill_commit_in_graph() parse
> > information present in commit data chunk. Let's simplify the
> > implementation by calling fill_commit_graph_info() within
> > fill_commit_in_graph().
> >
> > fill_commit_graph_info() used to not load committer data from commit data
> > chunk. However, with the corrected committer date, we have to load
> > committer date to calculate generation number value.
> 
> Nice writeup, however the last sentence would in my opinion read better
> in the future tense: we don't use generation number v2 yet.  For
> example:
> 
>   However, with upcoming switch to using corrected committer date as
>   generation number v2, we will have to load committer date to compute
>   generation number value anyway.
> 
> Or something like that - notice the minor addition and changes.
> 

Thanks for the change, it looks better!

> The following is slightly unrelated change, but we agreed that it would
> be better to not separate them; the need for change to the t5000 test is
> caused by the change described above.

> 
> >
> > e51217e15 (t5000: test tar files that overflow ustar headers,
> > 30-06-2016) introduced a test 'generate tar with future mtime' that
> > creates a commit with committer date of (2 ^ 36 + 1) seconds since
> > EPOCH. The CDAT chunk provides 34-bits for storing committer date, thus
> > committer time overflows into generation number (within CDAT chunk) and
> > has undefined behavior.
> >
> > The test used to pass as fill_commit_graph_info() would not set struct
> > member `date` of struct commit and loads committer date from the object
> > database, generating a tar file with the expected mtime.
> 
> I think it should be s/loads/load/, as in "would load", but I am not a
> native English speaker.
> 

That's correct - since I have used "would not set" in the first half of
sentence, the later half should follow suit too.

> >
> > However, with corrected commit date, we will load the committer date
> > from CDAT chunk (truncated to lower 34-bits to populate the generation
> > number. Thus, Git sets date and generates tar file with the truncated
> > mtime.
> >
> > The ustar format (the header format used by most modern tar programs)
> > only has room for 11 (or 12, depending om some implementations) octal
> > digits for the size and mtime of each files.
> >
> > Thus, setting a timestamp of 2 ^ 33 + 1 would overflow the 11-octal
> > digit implementations while still fitting into commit data chunk.
> >
> > Since we want to test 12-octal digit implementations of ustar as well,
> > let's modify the existing test to no longer use commit-graph file.
> 
> The description above is for me does not make it entirely clear that we
> add new test for handling possible 11-octal digit overflow nearly
> identical to the existing one, and turn off use of commit-graph file for
> test that checks handling 12-octal digit overflow.
> 

Revised the last paragraphs to:

  The ustar format (the header format used by most modern tar programs)
  only has room for 11 (or 12, depending on some implementations) octal
  digits for the size and mtime of each file.

  To test the 11-octal digit implementation, we create a future commit
  with committer date of 2^34 - 1, which overflows 11-octal digits
  without overflowing 34-bits of the Commit Data chunk.

  To test the 12-octal digit implementation, the smallest committer date
  possible is 2^36, which overflows the Commit Data chunk and thus
  commit-graph must be disabled for the test.

> > Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
> > ---
> >  commit-graph.c      | 27 ++++++++++-----------------
> >  t/t5000-tar-tree.sh | 20 +++++++++++++++++++-
> >  2 files changed, 29 insertions(+), 18 deletions(-)
> >
> > diff --git a/commit-graph.c b/commit-graph.c
> > index 94503e584b..e8362e144e 100644
> > --- a/commit-graph.c
> > +++ b/commit-graph.c
> > @@ -749,15 +749,24 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
> >  	const unsigned char *commit_data;
> >  	struct commit_graph_data *graph_data;
> >  	uint32_t lex_index;
> > +	uint64_t date_high, date_low;
> >  
> >  	while (pos < g->num_commits_in_base)
> >  		g = g->base_graph;
> >  
> > +	if (pos >= g->num_commits + g->num_commits_in_base)
> > +		die(_("invalid commit position. commit-graph is likely corrupt"));
> > +
> >  	lex_index = pos - g->num_commits_in_base;
> >  	commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index;
> >  
> >  	graph_data = commit_graph_data_at(item);
> >  	graph_data->graph_pos = pos;
> > +
> > +	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
> > +	date_low = get_be32(commit_data + g->hash_len + 12);
> > +	item->date = (timestamp_t)((date_high << 32) | date_low);
> > +
> >  	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
> >  }
> >  
> > @@ -772,38 +781,22 @@ static int fill_commit_in_graph(struct repository *r,
> >  {
> >  	uint32_t edge_value;
> >  	uint32_t *parent_data_ptr;
> > -	uint64_t date_low, date_high;
> >  	struct commit_list **pptr;
> > -	struct commit_graph_data *graph_data;
> >  	const unsigned char *commit_data;
> >  	uint32_t lex_index;
> >  
> >  	while (pos < g->num_commits_in_base)
> >  		g = g->base_graph;
> >  
> > -	if (pos >= g->num_commits + g->num_commits_in_base)
> > -		die(_("invalid commit position. commit-graph is likely corrupt"));
> > +	fill_commit_graph_info(item, g, pos);
> >  
> > -	/*
> > -	 * Store the "full" position, but then use the
> > -	 * "local" position for the rest of the calculation.
> > -	 */
> > -	graph_data = commit_graph_data_at(item);
> > -	graph_data->graph_pos = pos;
> >  	lex_index = pos - g->num_commits_in_base;
> > -
> >  	commit_data = g->chunk_commit_data + (g->hash_len + 16) * lex_index;
> >  
> >  	item->object.parsed = 1;
> >  
> >  	set_commit_tree(item, NULL);
> >  
> > -	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
> > -	date_low = get_be32(commit_data + g->hash_len + 12);
> > -	item->date = (timestamp_t)((date_high << 32) | date_low);
> > -
> > -	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
> > -
> >  	pptr = &item->parents;
> >  
> >  	edge_value = get_be32(commit_data + g->hash_len);
> 
> All right, looks good for me.
> 
> Here second change begins.
> 
> > diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
> > index 3ebb0d3b65..8f41cdc509 100755
> > --- a/t/t5000-tar-tree.sh
> > +++ b/t/t5000-tar-tree.sh
> > @@ -431,11 +431,29 @@ test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can read our huge size' '
> >  	test_cmp expect actual
> >  '
> >  
> > +test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' '
> > +	rm -f .git/index &&
> > +	echo foo >file &&
> > +	git add file &&
> > +	GIT_COMMITTER_DATE="@17179869183 +0000" \
> > +		git commit -m "tempori parendum"
> > +'
> > +
> > +test_expect_success TIME_IS_64BIT 'generate tar with future mtime' '
> > +	git archive HEAD >future.tar
> > +'
> > +
> > +test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can read our future mtime' '
> > +	echo 2514 >expect &&
> > +	tar_info future.tar | cut -d" " -f2 >actual &&
> > +	test_cmp expect actual
> > +'
> > +
> 
> Everything is all right, except we now have duplicated test names.
> 
> Perhaps in the three following tests we should use 'far-far-future
> commit' and 'far future mtime' in place of current 'far-future commit'
> and 'future mtime' for tests checking handling 12-digital ditgits
> overflow, or add description how far the future is, for example
> 'far-future commit (2^11 + 1)', etc.
> 

Changed, thanks for pointing this out.

> >  test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' '
> >  	rm -f .git/index &&
> >  	echo content >file &&
> >  	git add file &&
> > -	GIT_COMMITTER_DATE="@68719476737 +0000" \
> > +	GIT_TEST_COMMIT_GRAPH=0 GIT_COMMITTER_DATE="@68719476737 +0000" \
> >  		git commit -m "tempori parendum"
> >  '
> 
> Best,
> -- 
> Jakub Narębski

Thanks
- Abhishek

  reply index

Thread overview: 153+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-28  9:13 [PATCH 0/6] [GSoC] Implement Corrected Commit Date Abhishek Kumar via GitGitGadget
2020-07-28  9:13 ` [PATCH 1/6] commit-graph: fix regression when computing bloom filter Abhishek Kumar via GitGitGadget
2020-07-28 15:28   ` Taylor Blau
2020-07-30  5:24     ` Abhishek Kumar
2020-08-04  0:46   ` Jakub Narębski
2020-08-04  0:56     ` Taylor Blau
2020-08-04 10:10       ` Jakub Narębski
2020-08-04  7:55     ` Jakub Narębski
2020-07-28  9:13 ` [PATCH 2/6] revision: parse parent in indegree_walk_step() Abhishek Kumar via GitGitGadget
2020-07-28 13:00   ` Derrick Stolee
2020-07-28 15:30     ` Taylor Blau
2020-08-05 23:16   ` Jakub Narębski
2020-07-28  9:13 ` [PATCH 3/6] commit-graph: consolidate fill_commit_graph_info Abhishek Kumar via GitGitGadget
2020-07-28 13:14   ` Derrick Stolee
2020-07-28 15:19     ` René Scharfe
2020-07-28 15:58       ` Derrick Stolee
2020-07-28 16:01     ` Taylor Blau
2020-07-30  6:07     ` Abhishek Kumar
2020-07-28  9:13 ` [PATCH 4/6] commit-graph: consolidate compare_commits_by_gen Abhishek Kumar via GitGitGadget
2020-07-28 16:03   ` Taylor Blau
2020-07-28  9:13 ` [PATCH 5/6] commit-graph: implement generation data chunk Abhishek Kumar via GitGitGadget
2020-07-28 16:12   ` Taylor Blau
2020-07-30  6:52     ` Abhishek Kumar
2020-07-28  9:13 ` [PATCH 6/6] commit-graph: implement corrected commit date offset Abhishek Kumar via GitGitGadget
2020-07-28 15:55   ` Derrick Stolee
2020-07-28 16:23     ` Taylor Blau
2020-07-30  7:27     ` Abhishek Kumar
2020-07-28 14:54 ` [PATCH 0/6] [GSoC] Implement Corrected Commit Date Taylor Blau
2020-07-30  7:47   ` Abhishek Kumar
2020-07-28 16:35 ` Derrick Stolee
2020-08-09  2:53 ` [PATCH v2 00/10] " Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 01/10] commit-graph: fix regression when computing bloom filter Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 02/10] revision: parse parent in indegree_walk_step() Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 03/10] commit-graph: consolidate fill_commit_graph_info Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 04/10] commit-graph: consolidate compare_commits_by_gen Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 05/10] commit-graph: implement generation data chunk Abhishek Kumar via GitGitGadget
2020-08-10 16:28     ` Derrick Stolee
2020-08-11 11:03       ` Abhishek Kumar
2020-08-11 12:27         ` Derrick Stolee
2020-08-11 18:58           ` Taylor Blau
2020-08-09  2:53   ` [PATCH v2 06/10] commit-graph: return 64-bit generation number Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 07/10] commit-graph: implement corrected commit date Abhishek Kumar via GitGitGadget
2020-08-10 14:23     ` Derrick Stolee
2020-08-14  4:59       ` Abhishek Kumar
2020-08-14 12:24         ` Derrick Stolee
2020-08-09  2:53   ` [PATCH v2 08/10] commit-graph: handle mixed generation commit chains Abhishek Kumar via GitGitGadget
2020-08-10 16:42     ` Derrick Stolee
2020-08-11 11:36       ` Abhishek Kumar
2020-08-11 12:43         ` Derrick Stolee
2020-08-09  2:53   ` [PATCH v2 09/10] commit-reach: use corrected commit dates in paint_down_to_common() Abhishek Kumar via GitGitGadget
2020-08-09  2:53   ` [PATCH v2 10/10] doc: add corrected commit date info Abhishek Kumar via GitGitGadget
2020-08-10 16:47   ` [PATCH v2 00/10] [GSoC] Implement Corrected Commit Date Derrick Stolee
2020-08-15 16:39   ` [PATCH v3 00/11] " Abhishek Kumar via GitGitGadget
2020-08-15 16:39     ` [PATCH v3 01/11] commit-graph: fix regression when computing bloom filter Abhishek Kumar via GitGitGadget
2020-08-17 22:30       ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 02/11] revision: parse parent in indegree_walk_step() Abhishek Kumar via GitGitGadget
2020-08-18 14:18       ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 03/11] commit-graph: consolidate fill_commit_graph_info Abhishek Kumar via GitGitGadget
2020-08-19 17:54       ` Jakub Narębski
2020-08-21  4:11         ` Abhishek Kumar
2020-08-25 11:11           ` Jakub Narębski
2020-09-01 11:35             ` Abhishek Kumar
2020-08-15 16:39     ` [PATCH v3 04/11] commit-graph: consolidate compare_commits_by_gen Abhishek Kumar via GitGitGadget
2020-08-17 13:22       ` Derrick Stolee
2020-08-21 11:05       ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 05/11] commit-graph: return 64-bit generation number Abhishek Kumar via GitGitGadget
2020-08-21 13:14       ` Jakub Narębski
2020-08-25  5:04         ` Abhishek Kumar
2020-08-25 12:18           ` Jakub Narębski
2020-09-01 12:06             ` Abhishek Kumar
2020-09-03 13:42               ` Jakub Narębski
2020-09-05 17:21                 ` Abhishek Kumar
2020-09-13 15:39                   ` Jakub Narębski
2020-09-28 21:48                     ` Jakub Narębski
2020-10-05  5:25                       ` Abhishek Kumar
2020-08-15 16:39     ` [PATCH v3 06/11] commit-graph: add a slab to store topological levels Abhishek Kumar via GitGitGadget
2020-08-21 18:43       ` Jakub Narębski
2020-08-25  6:14         ` Abhishek Kumar
2020-08-25  7:33           ` Jakub Narębski
2020-08-25  7:56             ` Jakub Narębski
2020-09-01 10:26               ` Abhishek Kumar
2020-09-03  9:25                 ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 07/11] commit-graph: implement corrected commit date Abhishek Kumar via GitGitGadget
2020-08-22  0:05       ` Jakub Narębski
2020-08-25  6:49         ` Abhishek Kumar
2020-08-25 10:07           ` Jakub Narębski
2020-09-01 11:01             ` Abhishek Kumar
2020-08-15 16:39     ` [PATCH v3 08/11] commit-graph: implement generation data chunk Abhishek Kumar via GitGitGadget
2020-08-22 13:09       ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 09/11] commit-graph: use generation v2 only if entire chain does Abhishek Kumar via GitGitGadget
2020-08-22 17:14       ` Jakub Narębski
2020-08-26  7:15         ` Abhishek Kumar
2020-08-26 10:38           ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 10/11] commit-reach: use corrected commit dates in paint_down_to_common() Abhishek Kumar via GitGitGadget
2020-08-22 19:09       ` Jakub Narębski
2020-09-01 10:08         ` Abhishek Kumar
2020-09-03 19:11           ` Jakub Narębski
2020-08-15 16:39     ` [PATCH v3 11/11] doc: add corrected commit date info Abhishek Kumar via GitGitGadget
2020-08-22 22:20       ` Jakub Narębski
2020-08-27  6:39         ` Abhishek Kumar
2020-08-27 12:43           ` Jakub Narębski
2020-08-27 13:15           ` Derrick Stolee
2020-09-01 13:01             ` Abhishek Kumar
2020-08-17  0:13     ` [PATCH v3 00/11] [GSoC] Implement Corrected Commit Date Jakub Narębski
     [not found]       ` <CANQwDwdKp7oKy9BeKdvKhwPUiq0R5MS8TCw-eWGCYCoMGv=G-g@mail.gmail.com>
2020-08-17  1:32         ` Fwd: " Taylor Blau
2020-08-17  7:56           ` Jakub Narębski
2020-08-18  6:12       ` Abhishek Kumar
2020-08-23 15:27       ` Jakub Narębski
2020-08-24  2:49         ` Abhishek Kumar
2020-10-07 14:09     ` [PATCH v4 00/10] " Abhishek Kumar via GitGitGadget
2020-10-07 14:09       ` [PATCH v4 01/10] commit-graph: fix regression when computing Bloom filters Abhishek Kumar via GitGitGadget
2020-10-24 23:16         ` Jakub Narębski
2020-10-25 20:58           ` Taylor Blau
2020-11-03  5:36             ` Abhishek Kumar
2020-10-07 14:09       ` [PATCH v4 02/10] revision: parse parent in indegree_walk_step() Abhishek Kumar via GitGitGadget
2020-10-24 23:41         ` Jakub Narębski
2020-10-07 14:09       ` [PATCH v4 03/10] commit-graph: consolidate fill_commit_graph_info Abhishek Kumar via GitGitGadget
2020-10-25 10:52         ` Jakub Narębski
2020-10-27  6:33           ` Abhishek Kumar [this message]
2020-10-07 14:09       ` [PATCH v4 04/10] commit-graph: return 64-bit generation number Abhishek Kumar via GitGitGadget
2020-10-25 13:48         ` Jakub Narębski
2020-11-03  6:40           ` Abhishek Kumar
2020-10-07 14:09       ` [PATCH v4 05/10] commit-graph: add a slab to store topological levels Abhishek Kumar via GitGitGadget
2020-10-25 22:17         ` Jakub Narębski
2020-10-07 14:09       ` [PATCH v4 06/10] commit-graph: implement corrected commit date Abhishek Kumar via GitGitGadget
2020-10-27 18:53         ` Jakub Narębski
2020-11-03 11:44           ` Abhishek Kumar
2020-11-04 16:45             ` Jakub Narębski
2020-11-05 14:05               ` Philip Oakley
2020-11-05 18:22                 ` Junio C Hamano
2020-11-06 18:26                   ` Extending and updating gitglossary (was: Re: [PATCH v4 06/10] commit-graph: implement corrected commit date) Jakub Narębski
2020-11-06 19:33                     ` Extending and updating gitglossary Junio C Hamano
2020-11-08 17:23                     ` Extending and updating gitglossary (was: Re: [PATCH v4 06/10] commit-graph: implement corrected commit date) Philip Oakley
2020-11-10  1:35                       ` Extending and updating gitglossary Jakub Narębski
2020-11-10 14:04                         ` Philip Oakley
2020-11-10 23:52                           ` Jakub Narębski
2020-10-07 14:09       ` [PATCH v4 07/10] commit-graph: implement generation data chunk Abhishek Kumar via GitGitGadget
2020-10-30 12:45         ` Jakub Narębski
2020-11-06 11:25           ` Abhishek Kumar
2020-11-06 17:56             ` Jakub Narębski
2020-10-07 14:09       ` [PATCH v4 08/10] commit-graph: use generation v2 only if entire chain does Abhishek Kumar via GitGitGadget
2020-11-01  0:55         ` Jakub Narębski
2020-11-12 10:01           ` Abhishek Kumar
2020-11-13  9:59             ` Jakub Narębski
2020-10-07 14:09       ` [PATCH v4 09/10] commit-reach: use corrected commit dates in paint_down_to_common() Abhishek Kumar via GitGitGadget
2020-11-03 17:59         ` Jakub Narębski
2020-11-03 18:19           ` Junio C Hamano
2020-11-20 10:33           ` Abhishek Kumar
2020-10-07 14:09       ` [PATCH v4 10/10] doc: add corrected commit date info Abhishek Kumar via GitGitGadget
2020-11-04  1:37         ` Jakub Narębski
2020-11-21  6:30           ` Abhishek Kumar
2020-11-04 23:37       ` [PATCH v4 00/10] [GSoC] Implement Corrected Commit Date Jakub Narębski
2020-11-22  5:31         ` Abhishek Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201027063306.GA15674@Abhishek-Arch \
    --to=abhishekkumar8222@gmail.com \
    --cc=85blgq4lxh.fsf@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=jnareb@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Mailing List Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/git/0 git/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 git git/ https://lore.kernel.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.git


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git