All of lore.kernel.org
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Sergey Rudyshin via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: Sergey Rudyshin <540555@gmail.com>, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 1/1] Preserve topological ordering after merges This modifies the algorithm of topopological ordering. The problem with the current algoritm is that it displays the graph below as follows
Date: Tue, 7 Jan 2020 07:14:29 -0500	[thread overview]
Message-ID: <6c86f1e9-b70b-10b1-f2c5-589312f73a4c@gmail.com> (raw)
In-Reply-To: <542a02020c8578d0e004cb9c929bced8719b902a.1578393057.git.gitgitgadget@gmail.com>

On 1/7/2020 5:30 AM, Sergey Rudyshin via GitGitGadget wrote:
> From: Sergey Rudyshin <540555@gmail.com>
> 
> * E
> |\
> | * D
> | |\
> | |/
> |/|
> * | C
> | * B
> |/
> * A
> 
> commit B is placed between A and C, which is wrong
> because E stays that D and B comes after C
> 
> the only correct solution here is
> 
> * E
> |\
> | * D
> | |\
> | |/
> |/|
> | * B
> * | C
> |/
> * A
> 
> while it seems that it contradicts to
> D stating that C should be between D and B
> The new algorithm solves this issue

This ordering concern makes sense _somewhat_, because D is the
second parent of D and that wants to say "Show everything in C..D
before showing C". The issues is that since C is the second parent
of D, the topo-ordering says "Show everything in B..C before showing
things reachable from B". It is unfortunate that these constraints
collide.

Perhaps your description could do a better job clarifying this
issue and how your algorithm change fixes the problem.

However, I'm not sure we even want to make the change, as this
is still a valid topological order (parents appear after children).
When we have an ambiguous pair (like B and C) the order can differ.
The --topo-order option tries to group commits by when they were
introduced, and that's the reason for presenting the commits reachable
from the later parents before presenting the commits from earlier
parents.

The only documentation we have is from [1]:

"Show no parents before all of its children are shown, and avoid
showing commits on multiple lines of history intermixed."

The first part of the sentence is still true, and the second part
is ambiguous of how to do that.

[1] https://git-scm.com/docs/git-log#Documentation/git-log.txt---topo-order

> This change makes option "--topo-order" obsolete, because
> there is only one way to order parents of a single commit.
> "--date-order" and "--author-date-order" are preserved and make sense
> only for the case when multiple commits are given
> to be able to sort those commits.

This part of the change needs to be removed. The default sort does
not preserve topological orderings (like --date-order does), and
so is much faster to output, especially without a commit-graph file.

>  void sort_in_topological_order(struct commit_list **list, enum rev_sort_order sort_order)

Since you are only editing this code, and not the incremental topo-order code, your
test changes will likely break when run with GIT_TEST_COMMIT_GRAPH=1. When the commit-graph
exists and generation numbers are calculated, we use a different algorithm for topo-order.

I've been meaning to clean up this "duplicated" logic by deleting this method in favor of
the incremental algorithm in all cases. It needs some perf testing to make sure that
refactor doesn't have too large of a perf hit in the case of no commit-graph.

>  	/* update the indegree */
> @@ -832,51 +886,56 @@ void sort_in_topological_order(struct commit_list **list, enum rev_sort_order so
>  	for (next = orig; next; next = next->next) {
>  		struct commit *commit = next->item;
>  
> -		if (*(indegree_slab_at(&indegree, commit)) == 1)
> -			prio_queue_put(&queue, commit);
> +		if (*(indegree_slab_at(&indegree, commit)) == 1) {
> +			/* also record the author dates, if needed */
> +			if (sort_order == REV_SORT_BY_AUTHOR_DATE)
> +				record_author_date(&author_date, commit);
> +			prio_queue_put(&queue_tips, commit);
> +		}

Your code change looks rather large for what I expected to be a much simpler change.
Likely the only thing we need is to avoid adding to the priority queue if we already
have the commit in the queue (maybe using a hashset storing the commits that we've
added to the queue). I believe the reason C appears before B in your example is that
it was added to the queue a second time, and the queue behaves like a stack in the
topo-order case.

Thanks,
-Stolee

  parent reply	other threads:[~2020-01-07 12:14 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-07 10:30 [PATCH 0/1] Preserve topological ordering after merges Sergey Rudyshin via GitGitGadget
2020-01-07 10:30 ` [PATCH 1/1] Preserve topological ordering after merges This modifies the algorithm of topopological ordering. The problem with the current algoritm is that it displays the graph below as follows Sergey Rudyshin via GitGitGadget
2020-01-07 12:08   ` Johannes Schindelin
2020-01-07 17:01     ` Junio C Hamano
2020-01-08  7:51     ` Sergey Rudyshin
2020-01-07 12:14   ` Derrick Stolee [this message]
2020-01-08 15:11     ` Sergey Rudyshin
2020-01-07 12:11 ` [PATCH 0/1] Preserve topological ordering after merges Johannes Schindelin
2020-01-08 12:50   ` Sergey Rudyshin
2020-01-08 13:37     ` Derrick Stolee
2020-01-08 17:04       ` Sergey Rudyshin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6c86f1e9-b70b-10b1-f2c5-589312f73a4c@gmail.com \
    --to=stolee@gmail.com \
    --cc=540555@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.