All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Keller, Jacob E" <jacob.e.keller@intel.com>
To: Derrick Stolee <derrickstolee@github.com>,
	Jacob Keller <jacob.keller@gmail.com>,
	Junio C Hamano <gitster@pobox.com>
Cc: Johannes Schindelin <johannes.schindelin@gmx.de>,
	Git mailing list <git@vger.kernel.org>
Subject: Re: [PATCH] name-rev: test showing failure with non-monotonic commit dates
Date: Tue, 15 Feb 2022 23:38:41 +0000	[thread overview]
Message-ID: <92fe3661-8025-18a7-a9a3-0fce88da522c@intel.com> (raw)
In-Reply-To: <42d2a9fe-a3f2-f841-dcd1-27a0440521b6@github.com>

On 2/15/2022 6:48 AM, Derrick Stolee wrote:
> On 2/14/2022 5:07 PM, Jacob Keller wrote:
>> On Mon, Feb 14, 2022 at 1:50 PM Junio C Hamano <gitster@pobox.com> wrote:
>>>
>>> Jacob Keller <jacob.e.keller@intel.com> writes:
>>>
>>>> From: Jacob Keller <jacob.keller@gmail.com>
>>>>
>>>> If a commit in a sequence of linear history has a non-monotonically
>>>> increasing commit timestamp, git name-rev will not properly name the
>>>> commit.
>>>>
>>>> However, if you use --annotate-stdin then the commit does actually get
>>>> picked up and named properly.
>>>
>>> IIRC, this is to be expected.
>>>
>>
>> Right. I figured this is somehow expected behavior...
>>
>>> When preparing to answer --annotate-stdin request, the command has
>>> to dig down to the root of the history, which would be too expensive
>>> in some repositories and wants to stop traversal early when it knows
>>> particular commits it needs to describe.
>>>
>>
>> And this method of cutting the search relies on monotonic commit times right?
>>
>> Is there any other method we could use (commit graph?) or perhaps to
>> add an option so that you could go "git name-rev --no-cutoff <commid
>> id>"? That would potentially allow working around this particular
>> problem on repositories where its known to be problematic.
> 
> I initially thought that generation numbers could help. The usual way
> is to use a priority queue that explores by generation, not commit
> date. This approach was immediately stifled by these lines:
> 
> 	memset(&queue, 0, sizeof(queue)); /* Use the prio_queue as LIFO */
> 	prio_queue_put(&queue, start_commit);
> 
> So the queue is really a stack.
> 

Right. A closer look at the name-rev algorithm seems to be that it
starts looking at each tag and scanning down history to see if it can
find the given commit. It uses the commit timestamp as a heuristic to
decide whether or not to stop looking. This can speed up the search
because it prevents scanning the entire history.. but it breaks when
that heuristic is no longer true such as in the particular setup like
mine with a funky timestamp.


>> Alternatively is there some other way to apply the cutoff heuristic
>> only in some cases? I get the sense this is intended to allow cutting
>> off merged branches? i.e. not applying it when history is linear? I'd
>> have to study it further but the existing algorithm seems to break
>> because as it goes up the history it has found an "older" commit, so
>> it stops trying to blame that line...?
> 
> It is still possible that the cutoff could be altered to use generation
> numbers instead of commit dates, but I haven't looked closely enough to
> be sure.
> 

Right. Using generation number would work for this I think.. The real
question being if it satisfies the other requirements.

I think it does, but I'm not 100% sure yet.

> Here is a very basic attempt. With GIT_TEST_COMMIT_GRAPH=1, your
> test_expect_failure turns into a success.
> 
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index 138e3c30a2b..f7ad1dd8b4d 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -9,6 +9,7 @@
>  #include "prio-queue.h"
>  #include "hash-lookup.h"
>  #include "commit-slab.h"
> +#include "commit-graph.h"
>  
>  /*
>   * One day.  See the 'name a rev shortly after epoch' test in t6120 when
> @@ -27,6 +28,7 @@ struct rev_name {
>  define_commit_slab(commit_rev_name, struct rev_name);
>  
>  static timestamp_t cutoff = TIME_MAX;
> +static timestamp_t generation_cutoff = 0;
>  static struct commit_rev_name rev_names;
>  
>  /* How many generations are maximally preferred over _one_ merge traversal? */
> @@ -151,7 +153,10 @@ static void name_rev(struct commit *start_commit,
>  	struct rev_name *start_name;
>  
>  	parse_commit(start_commit);
> -	if (start_commit->date < cutoff)
> +	if (generation_cutoff && generation_cutoff < GENERATION_NUMBER_INFINITY) {
> +		if (commit_graph_generation(start_commit) < generation_cutoff)
> +			return;
> +	} else if (start_commit->date < cutoff)
>  		return;
>  
>  	start_name = create_or_update_name(start_commit, taggerdate, 0, 0,
> @@ -599,6 +604,8 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix)
>  		if (commit) {
>  			if (cutoff > commit->date)
>  				cutoff = commit->date;
> +			if (generation_cutoff > commit_graph_generation(commit))
> +				generation_cutoff = commit_graph_generation(commit);
>  		}
>  
>  		if (peel_tag) {
> 
> Thanks,
> -Stolee


  reply	other threads:[~2022-02-15 23:39 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-14 21:01 [PATCH] name-rev: test showing failure with non-monotonic commit dates Jacob Keller
2022-02-14 21:50 ` Junio C Hamano
2022-02-14 22:07   ` Jacob Keller
2022-02-15 14:48     ` Derrick Stolee
2022-02-15 23:38       ` Keller, Jacob E [this message]
2022-02-16  0:51       ` Junio C Hamano
2022-02-27 22:05         ` Jacob Keller
2022-03-09 21:56   ` Johannes Schindelin
2022-02-15  7:15 ` Ævar Arnfjörð Bjarmason
2022-02-16  1:16   ` Keller, Jacob E

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=92fe3661-8025-18a7-a9a3-0fce88da522c@intel.com \
    --to=jacob.e.keller@intel.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jacob.keller@gmail.com \
    --cc=johannes.schindelin@gmx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.