All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: "Martin Ågren" <martin.agren@gmail.com>
Cc: git@vger.kernel.org, Isaac Chou <Isaac.Chou@microfocus.com>,
	Junio C Hamano <gitster@pobox.com>,
	Jonathan Tan <jonathantanmy@google.com>
Subject: Re: [PATCH] fast-export: fix regression skipping some merge-commits
Date: Fri, 20 Apr 2018 21:07:32 +0200 (DST)	[thread overview]
Message-ID: <nycvar.QRO.7.76.6.1804202041400.4241@ZVAVAG-6OXH6DA.rhebcr.pbec.zvpebfbsg.pbz> (raw)
In-Reply-To: <20180420181248.2015922-1-martin.agren@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 7227 bytes --]

Hi Martin,

On Fri, 20 Apr 2018, Martin Ågren wrote:

> 7199203937 (object_array: add and use `object_array_pop()`, 2017-09-23)
> noted that the pattern `object = array.objects[--array.nr].item` could
> be abstracted as `object = object_array_pop(&array)`.
> 
> Unfortunately, one of the conversions was horribly wrong. Between
> grabbing the last object (i.e., peeking at it) and decreasing the object
> count, the original code would sometimes return early. The updated code
> on the other hand, will always pop the last element, then maybe do the
> early return before doing anything with the object.
> 
> The end result is that merge commits where all the parents have still
> not been exported will simply be dropped, meaning that they will be
> completely missing from the exported data.

Excellent explanation.

> Reintroduce the pattern of first grabbing the last object (using a new
> function `object_array_peek()`), then later poping it. Using
> `..._peek()` and `..._pop()` makes it clear that we are referring to the
> same item, i.e., we do not grab one element, then remove another one.

Instead of using _peek() and _pop() and having to reason about the
correctness, maybe we should simply re-push? See below for my suggested
alternative.

> My sincerest apologies for the stupid train-wreck that the original
> conversion was. Weird interactions between different components can make
> for fun bugs, but this one is just embarassing.

The only way to fail is by doing something. You did something. That is
much better than not doing anything. So please do not be sorry about
introducing a breakage. You did fix it, which makes you double awesome.

> Isaac, this should solve the problem you are seeing. Unfortunately, I do
> not have any experience with building Git for Windows [1]. I really hope
> that this bug did not take up too much of your time. Or eat your data!

It is as easy as

	git clone --depth=1 https://github.com/git-for-windows/git-sdk-64

(downloading half a gigabyte of objects, but then you have almost
everything except for the Git source and one support repository for Git
for Windows), then starting git-bash.exe in its toplevel directory and
calling

	sdk build git

in there. The `sdk` helper is in its infancy, so I could imagine that a
really neat thing would be to be able to build custom branches and bundle
them in a portable Git. Something like `sdk build portable-git --patch
https://public-inbox.org/git/20180420181248.2015922-1-martin.agren@gmail.com/`.

In the meantime, it should still be doable by calling

	sdk cd git
	sdk init build-extra
	/usr/src/build-extra/apply-from-public-inbox.sh https://public-inbox.org/git/20180420181248.2015922-1-martin.agren@gmail.com/
	make -j15 && make -j15 strip && make -j15 install
	sdk build installer

and then running that installer.

You could also build a portable Git instead by replacing the last line
with

	/usr/src/build-extra/portable/release.sh 0-test

Or if you want to avoid building a portable Git installer, and instead
copy the files directly, you could try this sequence:

	pacman -S --noconfirm rsync
	mkdir ~/my-test-git
	(cd / && rsync -Rau $(ARCH=x86_64 BITNESS=64 \
		usr/src/build-extra/make-file-list.sh) ~/my-test-git/)

Please let me know how it is going, I am always eager to make the Git for
Windows SDK easier to use, as it will ultimately save me time.

> diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
> index 866ddf6058..2b46a83a49 100755
> --- a/t/t9350-fast-export.sh
> +++ b/t/t9350-fast-export.sh
> @@ -540,4 +540,26 @@ test_expect_success 'when using -C, do not declare copy when source of copy is a
>  	test_cmp expected actual
>  '
>  
> +test_expect_success 'todo' '
> +	test_create_repo merging &&
> +	git -C merging commit --allow-empty -m initial &&

I see that you copied the style of the latest test case, but I have to
admit that I would find it much easier to read if it said:

	(
		cd merging &&
		test_commit initial &&
		git checkout -b topic &&
		test_commit on-branch &&
		git checkout master &&
		test_commit on-master &&
		test_tick &&
		git merge --no-ff -m Yeah topic &&

		echo ":1 $(git rev-parse HEAD^^)" >marks &&
		git fast-export --import-marks=marks master >out &&
		grep Yeah out
	)

i.e. using the subshell where you cd into merging/ first thing, and then
making extensive use of `test_commit`.

> +
> +	git -C merging checkout -b topic &&
> +	>merging/topic-file &&
> +	git -C merging add topic-file &&
> +	git -C merging commit -m topic-file &&
> +
> +	git -C merging checkout master &&
> +	>merging/master-file &&
> +	git -C merging add master-file &&
> +	git -C merging commit -m master-file &&
> +
> +	git -C merging merge --no-ff topic -m "merge the topic" &&
> +
> +	oid=$(git -C merging rev-parse HEAD^^) &&
> +	echo :1 $oid >merging/git-marks &&
> +	git -C merging fast-export --import-marks=git-marks refs/heads/master >out &&
> +	grep "merge the topic" out
> +'
> +
>  test_done
> diff --git a/object.h b/object.h
> index f13f85b2a9..4d8ce280d9 100644
> --- a/object.h
> +++ b/object.h
> @@ -129,6 +129,15 @@ void add_object_array_with_path(struct object *obj, const char *name, struct obj
>   */
>  struct object *object_array_pop(struct object_array *array);
>  
> +/*
> + * Returns NULL if the array is empty. Otherwise, returns the last object.
> + * That is, the returned value is what `object_array_pop()` would have returned.
> + */
> +inline struct object *object_array_peek(const struct object_array *array)

I looked, and this would be the first use of `inline` without `static`...

> +{
> +	return array->nr ? array->objects[array->nr - 1].item : NULL;
> +}
> +
>  typedef int (*object_array_each_func_t)(struct object_array_entry *, void *);
>  
>  /*
> diff --git a/builtin/fast-export.c b/builtin/fast-export.c
> index 27b2cc138e..8377d27b46 100644
> --- a/builtin/fast-export.c
> +++ b/builtin/fast-export.c
> @@ -650,9 +650,10 @@ static void handle_tail(struct object_array *commits, struct rev_info *revs,
>  {
>  	struct commit *commit;
>  	while (commits->nr) {
> -		commit = (struct commit *)object_array_pop(commits);
> +		commit = (struct commit *)object_array_peek(commits);
>  		if (has_unshown_parent(commit))
>  			return;
> +		(void)object_array_pop(commits);
>  		handle_commit(commit, revs, paths_of_changed_objects);
>  	}
>  }

As I stated above, I think we can make this a bit easier to reason about
(and less easy to break by future additions) if we avoided the _peek()
function altogether, like this:

 {
 	struct commit *commit;
 	while (commits->nr) {
 		commit = (struct commit *)object_array_pop(commits);
-		if (has_unshown_parent(commit))
+		if (has_unshown_parent(commit)) {
+			/* Queue again, to be handled later */
+			add_object_array(commits, NULL, commit);
 			return;
+		}
 		handle_commit(commit, revs, paths_of_changed_objects);
 	}
 }

(I did not test this, and I was honestly surprised that there is no
object_array_push() counterpart to _pop() ;-) So this might be all wrong.)

Ciao,
Johannes

  parent reply	other threads:[~2018-04-20 19:08 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-19 21:46 [BUG] Git fast-export with import marks file omits merge commits Isaac Chou
2018-04-19 22:26 ` Elijah Newren
2018-04-19 22:48 ` Junio C Hamano
2018-04-20  5:07   ` Martin Ågren
2018-04-20 13:53     ` Isaac Chou
2018-04-20 18:12       ` [PATCH] fast-export: fix regression skipping some merge-commits Martin Ågren
2018-04-20 18:57         ` Isaac Chou
2018-04-20 19:08           ` [PATCH v2] " Martin Ågren
2018-04-20 19:07         ` Johannes Schindelin [this message]
2018-04-20 19:32           ` [PATCH] " Martin Ågren
2018-04-20 21:00             ` Johannes Schindelin
2018-04-20 22:12               ` [PATCH v3] " Martin Ågren
2018-04-20 22:16                 ` Eric Sunshine
2018-04-21  6:58                   ` Martin Ågren
2018-04-21  3:43                 ` Junio C Hamano
2018-04-21  7:00                   ` Martin Ågren
2018-06-01 19:41                     ` Isaac Chou
2018-06-02  6:48                       ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.QRO.7.76.6.1804202041400.4241@ZVAVAG-6OXH6DA.rhebcr.pbec.zvpebfbsg.pbz \
    --to=johannes.schindelin@gmx.de \
    --cc=Isaac.Chou@microfocus.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=martin.agren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.