All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Tom Clarkson via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Avery Pennarun <apenwarr@gmail.com>,
	Ed Maste <emaste@freebsd.org>, Tom Clarkson <tom@tqclarkson.com>,
	Tom Clarkson <tom@tqclarkson.com>
Subject: Re: [PATCH v2 2/7] subtree: exclude commits predating add from recursive processing
Date: Wed, 7 Oct 2020 17:36:28 +0200 (CEST)	[thread overview]
Message-ID: <nycvar.QRO.7.76.6.2010071211180.50@tvgsbejvaqbjf.bet> (raw)
In-Reply-To: <79b5f4a65197cea26ddc080c19dd2c5c7d424fc1.1602021913.git.gitgitgadget@gmail.com>

Hi Tom,

On Tue, 6 Oct 2020, Tom Clarkson via GitGitGadget wrote:

> From: Tom Clarkson <tom@tqclarkson.com>
>
> Include recursion depth in debug logs so we can see when the recursion is
> getting out of hand.
>
> Making the cache handle null mappings correctly and adding older commits
> to the cache allows the recursive algorithm to terminate at any point on
> mainline rather than needing to reach either the add point or the initial
> commit.

Makes sense.

> diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
> index 9867718503..160bad95c1 100755
> --- a/contrib/subtree/git-subtree.sh
> +++ b/contrib/subtree/git-subtree.sh
> @@ -244,7 +244,7 @@ check_parents () {
>  	do
>  		if ! test -r "$cachedir/notree/$miss"
>  		then
> -			debug "  incorrect order: $miss"
> +			debug "  unprocessed parent commit: $miss ($indent)"

Without any context, it is hard to understand what the `$indent` variable
is supposed to mean, so it is unclear why we need to print it here.

I _guess_ it is the degree removed from the first-parent lineage?

In any case, it does not hurt here, so I trust that it is good to include
it in the debug output.

>  			process_split_commit "$miss" "" "$indent"
>  		fi
>  	done
> @@ -392,6 +392,24 @@ find_existing_splits () {
>  	done
>  }
>
> +find_mainline_ref () {
> +	debug "Looking for first split..."
> +	dir="$1"
> +	revs="$2"

The `git-subtree` script seems to rely on the `local` construct, using it
in plenty of other circumstances. How about using it here, too?

> +
> +	git log --reverse --grep="^git-subtree-dir: $dir/*\$" \
> +		--no-show-signature --pretty=format:'START %H%n%s%n%n%b%nEND%n' $revs |

Since all you are interested in is the `git-subtree-mainline:` trailer,
wouldn't a format like `%(trailers:key=git-subtree-mainline)` instead of
`START %H%n%s%n%n%b%nEND%n`?

See
https://git-scm.com/docs/git-log#Documentation/git-log.txt-emtrailersoptionsem
for more information about pretty formats.

BTW I am super unfamiliar with `git subtree`'s inner workings, and
therefore it would help me incredibly if the commit message talked a bit
about the commit message layout (with a particular eye on
`git-subtree-dir` and `git-subtree-mainline` which I guess are trailers
added by `git subtree`?)...

> +	while read a b junk
> +	do
> +		case "$a" in
> +		git-subtree-mainline:)
> +			echo "$b"
> +			return
> +			;;
> +		esac
> +	done
> +}
> +
>  copy_commit () {
>  	# We're going to set some environment vars here, so
>  	# do it in a subshell to get rid of them safely later
> @@ -646,9 +664,9 @@ process_split_commit () {
>
>  	progress "$revcount/$revmax ($createcount) [$extracount]"
>
> -	debug "Processing commit: $rev"
> +	debug "Processing commit: $rev ($indent)"
>  	exists=$(cache_get "$rev")
> -	if test -n "$exists"
> +	if test -z "$(cache_miss "$rev")"
>  	then
>  		debug "  prior: $exists"

I do not see the `exists` variable being used other than for the debug
statement. Maybe better something like this?

	debug "  prior found for $rev"

>  		return
> @@ -773,6 +791,17 @@ cmd_split () {
>
>  	unrevs="$(find_existing_splits "$dir" "$revs")"
>
> +	mainline="$(find_mainline_ref "$dir" "$revs")"
> +	if test -n "$mainline"
> +	then
> +		debug "Mainline $mainline predates subtree add"
> +		git rev-list --topo-order --skip=1 $mainline |
> +		while read rev
> +		do
> +			cache_set "$rev" ""

Ah, so they are not really "null mappings", but mapped to an empty string.
Makes sense. Maybe adjust the commit message?

> +		done || exit $?
> +	fi
> +
>  	# We can't restrict rev-list to only $dir here, because some of our
>  	# parents have the $dir contents the root, and those won't match.
>  	# (and rev-list --follow doesn't seem to solve this)
> --
> gitgitgadget
>
>

  reply	other threads:[~2020-10-07 15:36 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-11  5:49 [PATCH 0/7] subtree: Fix handling of complex history Tom Clarkson via GitGitGadget
2020-05-11  5:49 ` [PATCH 1/7] subtree: handle multiple parents passed to cache_miss Tom Clarkson via GitGitGadget
2020-05-11  5:49 ` [PATCH 2/7] subtree: exclude commits predating add from recursive processing Tom Clarkson via GitGitGadget
2020-05-11  5:49 ` [PATCH 3/7] subtree: persist cache between split runs Tom Clarkson via GitGitGadget
2020-05-11  5:49 ` [PATCH 4/7] subtree: add git subtree map command Tom Clarkson via GitGitGadget
2020-05-11  5:49 ` [PATCH 5/7] subtree: add git subtree use and ignore commands Tom Clarkson via GitGitGadget
2020-05-11  5:50 ` [PATCH 6/7] subtree: more robustly distinguish subtree and mainline commits Tom Clarkson via GitGitGadget
2020-05-11  5:50 ` [PATCH 7/7] subtree: document new subtree commands Tom Clarkson via GitGitGadget
2020-10-04 17:52 ` [PATCH 0/7] subtree: Fix handling of complex history Ed Maste
2020-10-04 19:27   ` Johannes Schindelin
2020-10-05 16:47     ` Junio C Hamano
2020-10-05 21:37     ` Ed Maste
2020-10-07 16:31       ` Johannes Schindelin
2020-10-06 22:05 ` [PATCH v2 " Tom Clarkson via GitGitGadget
2020-10-06 22:05   ` [PATCH v2 1/7] subtree: handle multiple parents passed to cache_miss Tom Clarkson via GitGitGadget
2020-10-07 13:12     ` Ed Maste
2020-10-06 22:05   ` [PATCH v2 2/7] subtree: exclude commits predating add from recursive processing Tom Clarkson via GitGitGadget
2020-10-07 15:36     ` Johannes Schindelin [this message]
2020-10-06 22:05   ` [PATCH v2 3/7] subtree: persist cache between split runs Tom Clarkson via GitGitGadget
2020-10-07 16:06     ` Johannes Schindelin
2020-10-06 22:05   ` [PATCH v2 4/7] subtree: add git subtree map command Tom Clarkson via GitGitGadget
2020-10-06 22:05   ` [PATCH v2 5/7] subtree: add git subtree use and ignore commands Tom Clarkson via GitGitGadget
2020-10-07 16:29     ` Johannes Schindelin
2020-10-06 22:05   ` [PATCH v2 6/7] subtree: more robustly distinguish subtree and mainline commits Tom Clarkson via GitGitGadget
2020-10-07 19:42     ` Johannes Schindelin
2020-10-06 22:05   ` [PATCH v2 7/7] subtree: document new subtree commands Tom Clarkson via GitGitGadget
2020-10-07 19:43     ` Johannes Schindelin
2020-10-07 19:46   ` [PATCH v2 0/7] subtree: Fix handling of complex history Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.QRO.7.76.6.2010071211180.50@tvgsbejvaqbjf.bet \
    --to=johannes.schindelin@gmx.de \
    --cc=apenwarr@gmail.com \
    --cc=emaste@freebsd.org \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=tom@tqclarkson.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.