All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
@ 2017-09-26  3:30 Eric Rannaud
  2017-09-26  4:25 ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Rannaud @ 2017-09-26  3:30 UTC (permalink / raw)
  To: git; +Cc: jeremy.serror, Shawn O . Pearce, Eric Rannaud

The checkpoint command cycles packfiles if object_count != 0, a sensible
test or there would be no pack files to write. Since 820b931012, the
command also dumps branches, tags and marks, but still conditionally.
However, it is possible for a command stream to modify refs or create
marks without creating any new objects.

For example, reset a branch:

	$ git fast-import
	reset refs/heads/master
	from refs/heads/master^

	checkpoint

but refs/heads/master remains unchanged.

Other example: a commit command that re-creates an object that already
exists in the object database.

This fix unconditionally calls dump_{branches,tags,marks}() for all
checkpoint commands. dump_branches() and dump_tags() are cheap to call
in the case of a no-op.

Signed-off-by: Eric Rannaud <e@nanocritical.com>
---
 fast-import.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index 35bf671f12c4..d5e4cf0bad41 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -3189,10 +3189,10 @@ static void checkpoint(void)
 	checkpoint_requested = 0;
 	if (object_count) {
 		cycle_packfile();
-		dump_branches();
-		dump_tags();
-		dump_marks();
 	}
+	dump_branches();
+	dump_tags();
+	dump_marks();
 }
 
 static void parse_checkpoint(void)
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-26  3:30 [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0 Eric Rannaud
@ 2017-09-26  4:25 ` Junio C Hamano
  2017-09-26  9:53   ` [PATCH 1/1] " Eric Rannaud
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2017-09-26  4:25 UTC (permalink / raw)
  To: Eric Rannaud; +Cc: git, jeremy.serror, Shawn O . Pearce

"Eric Rannaud" <e@nanocritical.com> writes:

> The checkpoint command cycles packfiles if object_count != 0, a sensible
> test or there would be no pack files to write. Since 820b931012, the
> command also dumps branches, tags and marks, but still conditionally.
> However, it is possible for a command stream to modify refs or create
> marks without creating any new objects.

That reasoning sounds sensible.  Especially given the discussion of
"checkpoint" and "progress" we can see in "git fast-import --help"
documentation. E.g.

    Placing a progress command immediately after a checkpoint will
    inform the reader when the checkpoint has been completed and it
    can safely access the refs that fast-import updated.

would not be true without this change, I suspect.

Can we also add a new test or two that protect this from future
breakages?

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/1] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-26  4:25 ` Junio C Hamano
@ 2017-09-26  9:53   ` Eric Rannaud
  2017-09-27  3:37     ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Rannaud @ 2017-09-26  9:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, jeremy.serror, Shawn O . Pearce, Eric Rannaud

The checkpoint command cycles packfiles if object_count != 0, a sensible
test or there would be no pack files to write. Since 820b931012, the
command also dumps branches, tags and marks, but still conditionally.
However, it is possible for a command stream to modify refs or create
marks without creating any new objects.

For example, reset a branch (and keep fast-import running):

	$ git fast-import
	reset refs/heads/master
	from refs/heads/master^

	checkpoint

but refs/heads/master remains unchanged.

Other example: a commit command that re-creates an object that already
exists in the object database.

The man page also states that checkpoint "updates the refs" and that
"placing a progress command immediately after a checkpoint will inform
the reader when the checkpoint has been completed and it can safely
access the refs that fast-import updated". This wasn't always true
without this patch.

This fix unconditionally calls dump_{branches,tags,marks}() for all
checkpoint commands. dump_branches() and dump_tags() are cheap to call
in the case of a no-op.

Add tests to t9300 that observe the (non-packfiles) effects of
checkpoint.

Signed-off-by: Eric Rannaud <e@nanocritical.com>
---
 fast-import.c          |   6 +--
 t/t9300-fast-import.sh | 117 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 120 insertions(+), 3 deletions(-)


Any comments on the testing strategy with a background fast-import?

To summarize:
- fast-import is started in the background with a command stream that
  ends with "checkpoint\nprogress checkpoint\n". fast-import keeps
  running after reaching the last command (we don't want fast-import to
  terminate).
- In the meantime, the test is waiting to read "progress checkpoint" in
  the output of fast-import, then it checks the testing conditions.
- Finally, the test ensures that fast-import is still running in the
  background (and thus that it has just observed the effect of
  checkpoint, and not the side effects of fast-import terminating).
- After 10 sec, no matter what, the background fast-import is sent
  "done" and terminates.

I tried to make sure that the test runs quickly and does not (typically) sleep.
Upon failure, the test may take up to 10 sec to fully terminate.

However, the test could break under (very heavy) load if fast-import is unable
to make progress in a reasonable amount of time (either "progress checkpoint"
is not read within 5 sec, or fast-import receives "done" before the testing
conditions are checked). Let me know if this is not acceptable.

Should these test cases be at least separated to a new t9304 to
circumscribe any such nuisance? Alternatively, a FIFO-based approach
could be considered.

(Note: added cases 2 and 4 pass without this patch, but 1 and 3 do not.)


diff --git a/fast-import.c b/fast-import.c
index 35bf671f12c4..d5e4cf0bad41 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -3189,10 +3189,10 @@ static void checkpoint(void)
 	checkpoint_requested = 0;
 	if (object_count) {
 		cycle_packfile();
-		dump_branches();
-		dump_tags();
-		dump_marks();
 	}
+	dump_branches();
+	dump_tags();
+	dump_marks();
 }
 
 static void parse_checkpoint(void)
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 67b8c50a5ab4..b410bf3a3a7a 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -3120,4 +3120,121 @@ test_expect_success 'U: validate root delete result' '
 	compare_diff_raw expect actual
 '
 
+###
+### series V (checkpoint)
+###
+
+# To make sure you're observing the side effects of checkpoint *before*
+# fast-import terminates (and thus writes out its state), check that the
+# fast-import process is still running using background_import_still_running
+# *after* evaluating the test conditions.
+background_import_until_checkpoint () {
+	options=$1
+	input_file=$2
+	( cat $input_file; sleep 10; echo done ) | git fast-import $options >V.output &
+	echo $! >V.pid
+
+	# The loop will poll for approximately 5 seconds before giving up.
+	n=0
+	while ! test "$(cat V.output)" = "progress checkpoint"; do
+		if test $n -gt 55
+		then
+			echo >&2 "no progress checkpoint received"
+			exit 1
+		fi
+
+		# Try to avoid sleeping in the first iterations and poll
+		# aggressively.
+		if test $n -ge 50
+		then
+			sleep 1
+		fi
+
+		n=$(($n + 1))
+	done
+}
+
+background_import_still_running () {
+	if ! ps --pid "$(cat V.pid)"
+	then
+		echo >&2 "background fast-import terminated too early"
+		exit 1
+	fi
+	kill $(cat V.pid)
+}
+
+test_expect_success 'V: checkpoint updates refs after reset' '
+	cat >input <<-INPUT_END &&
+	reset refs/heads/V
+	from refs/heads/U
+
+	checkpoint
+	progress checkpoint
+	INPUT_END
+
+	background_import_until_checkpoint "" input &&
+	test "$(git rev-parse --verify V)" = "$(git rev-parse --verify U)" &&
+	background_import_still_running
+'
+
+test_expect_success 'V: checkpoint updates refs and marks after commit' '
+	cat >input <<-INPUT_END &&
+	commit refs/heads/V
+	mark :1
+	committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data 0
+	from refs/heads/U
+
+	checkpoint
+	progress checkpoint
+	INPUT_END
+
+	background_import_until_checkpoint "--export-marks=marks.actual" input &&
+
+	echo ":1 $(git rev-parse --verify V)" >marks.expected &&
+
+	test "$(git rev-parse --verify V^)" = "$(git rev-parse --verify U)" &&
+	test_cmp marks.expected marks.actual &&
+	background_import_still_running
+'
+
+# Re-create the exact same commit, but on a different branch: no new object is
+# created in the database, but the refs and marks still need to be updated.
+test_expect_success 'V: checkpoint updates refs and marks after commit (no new objects)' '
+	cat >input <<-INPUT_END &&
+	commit refs/heads/V2
+	mark :2
+	committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data 0
+	from refs/heads/U
+
+	checkpoint
+	progress checkpoint
+	INPUT_END
+
+	background_import_until_checkpoint "--export-marks=marks.actual" input &&
+
+	echo ":2 $(git rev-parse --verify V2)" >marks.expected &&
+
+	test "$(git rev-parse --verify V2)" = "$(git rev-parse --verify V)" &&
+	test_cmp marks.expected marks.actual &&
+	background_import_still_running
+'
+
+test_expect_success 'V: checkpoint updates tags after tag' '
+	cat >input <<-INPUT_END &&
+	tag Vtag
+	from refs/heads/V
+	tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data 0
+
+	checkpoint
+	progress checkpoint
+	INPUT_END
+
+	background_import_until_checkpoint "" input &&
+	git show-ref -d Vtag &&
+	background_import_still_running
+'
+
 test_done
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/1] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-26  9:53   ` [PATCH 1/1] " Eric Rannaud
@ 2017-09-27  3:37     ` Junio C Hamano
  2017-09-27 19:46       ` [PATCH] " Eric Rannaud
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2017-09-27  3:37 UTC (permalink / raw)
  To: Eric Rannaud; +Cc: git, jeremy.serror, Shawn O . Pearce

"Eric Rannaud" <e@nanocritical.com> writes:

> Any comments on the testing strategy with a background fast-import?
>
> To summarize:
> - fast-import is started in the background with a command stream that
>   ends with "checkpoint\nprogress checkpoint\n". fast-import keeps
>   running after reaching the last command (we don't want fast-import to
>   terminate).
> - In the meantime, the test is waiting to read "progress checkpoint" in
>   the output of fast-import, then it checks the testing conditions.
> - Finally, the test ensures that fast-import is still running in the
>   background (and thus that it has just observed the effect of
>   checkpoint, and not the side effects of fast-import terminating).
> - After 10 sec, no matter what, the background fast-import is sent
>   "done" and terminates.
>
> I tried to make sure that the test runs quickly and does not (typically) sleep.
> Upon failure, the test may take up to 10 sec to fully terminate.

Hmmmm, it certainly looks a bit too brittle with many tweakables
like 10, 50, 55, etc. that heavily depend on the load on the system.

Sorry for asking you to go through the hoops.

> diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
> index 67b8c50a5ab4..b410bf3a3a7a 100755
> --- a/t/t9300-fast-import.sh
> +++ b/t/t9300-fast-import.sh
> @@ -3120,4 +3120,121 @@ test_expect_success 'U: validate root delete result' '
>  	compare_diff_raw expect actual
>  '
>  
> +###
> +### series V (checkpoint)
> +###
> +
> +# To make sure you're observing the side effects of checkpoint *before*
> +# fast-import terminates (and thus writes out its state), check that the
> +# fast-import process is still running using background_import_still_running
> +# *after* evaluating the test conditions.
> +background_import_until_checkpoint () {
> +	options=$1
> +	input_file=$2
> +	( cat $input_file; sleep 10; echo done ) | git fast-import $options >V.output &
> +	echo $! >V.pid
> +
> +	# The loop will poll for approximately 5 seconds before giving up.
> +	n=0
> +	while ! test "$(cat V.output)" = "progress checkpoint"; do

Micronit.  Just like you have if/then on different lines, have
while/do on different lines, i.e.

	while test "$(cat V.output)" != "progress checkpoint"
	do

> +		if test $n -gt 55
> +		then
> +...

> +background_import_still_running () {
> +	if ! ps --pid "$(cat V.pid)"

"ps --pid" is probably not portable (I think we'd do "kill -0 $pid"
instead in our existing tests---and it should be kosher according to
POSIX [*1*, *2*]).

> +test_expect_success 'V: checkpoint updates refs after reset' '
> +	cat >input <<-INPUT_END &&

It is not wrong per-se but we quote INPUT_END when there is no
interpolation necessary in the body of here document to help
readers, like so:

	cat >input <<-\INPUT_END &&

> +	reset refs/heads/V
> +	from refs/heads/U
> +
> +	checkpoint
> +	progress checkpoint
> +	INPUT_END

> +test_expect_success 'V: checkpoint updates refs and marks after commit' '
> +	cat >input <<-INPUT_END &&

This we do want interpolation and the above is correct.


[Footnotes]

*1* http://pubs.opengroup.org/onlinepubs/9699919799/utilities/kill.html 
    lists '0' as an allowed signal number to be sent, and defers to the
    description of the kill() function to explain what happens.

*2* http://pubs.opengroup.org/onlinepubs/9699919799/functions/kill.html
    says """If sig is 0 (the null signal), error checking is
    performed but no signal is actually sent. The null signal can be
    used to check the validity of pid."""

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-27  3:37     ` Junio C Hamano
@ 2017-09-27 19:46       ` Eric Rannaud
  2017-09-27 23:19         ` Ramsay Jones
  2017-09-28  3:48         ` Junio C Hamano
  0 siblings, 2 replies; 20+ messages in thread
From: Eric Rannaud @ 2017-09-27 19:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, jeremy.serror, Shawn O . Pearce, Eric Rannaud

The checkpoint command cycles packfiles if object_count != 0, a sensible
test or there would be no pack files to write. Since 820b931012, the
command also dumps branches, tags and marks, but still conditionally.
However, it is possible for a command stream to modify refs or create
marks without creating any new objects.

For example, reset a branch (and keep fast-import running):

	$ git fast-import
	reset refs/heads/master
	from refs/heads/master^

	checkpoint

but refs/heads/master remains unchanged.

Other example: a commit command that re-creates an object that already
exists in the object database.

The man page also states that checkpoint "updates the refs" and that
"placing a progress command immediately after a checkpoint will inform
the reader when the checkpoint has been completed and it can safely
access the refs that fast-import updated". This wasn't always true
without this patch.

This fix unconditionally calls dump_{branches,tags,marks}() for all
checkpoint commands. dump_branches() and dump_tags() are cheap to call
in the case of a no-op.

Add tests to t9300 that observe the (non-packfiles) effects of
checkpoint.

Signed-off-by: Eric Rannaud <e@nanocritical.com>
---
 fast-import.c          |   6 +--
 t/t9300-fast-import.sh | 129 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 132 insertions(+), 3 deletions(-)


Use named pipes instead of the polling approach. Also incorporate your other
comments.


diff --git a/fast-import.c b/fast-import.c
index 35bf671f12c4..d5e4cf0bad41 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -3189,10 +3189,10 @@ static void checkpoint(void)
 	checkpoint_requested = 0;
 	if (object_count) {
 		cycle_packfile();
-		dump_branches();
-		dump_tags();
-		dump_marks();
 	}
+	dump_branches();
+	dump_tags();
+	dump_marks();
 }
 
 static void parse_checkpoint(void)
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 67b8c50a5ab4..9aa3470d895b 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -3120,4 +3120,133 @@ test_expect_success 'U: validate root delete result' '
 	compare_diff_raw expect actual
 '
 
+###
+### series V (checkpoint)
+###
+
+# To make sure you're observing the side effects of checkpoint *before*
+# fast-import terminates (and thus writes out its state), check that the
+# fast-import process is still running using background_import_still_running
+# *after* evaluating the test conditions.
+background_import_until_checkpoint () {
+	options=$1
+	input_file=$2
+
+	mkfifo V.input
+	exec 8<>V.input
+	rm V.input
+
+	mkfifo V.output
+	exec 9<>V.output
+	rm V.output
+
+	cat $input_file >&8
+	git fast-import $options <&8 >&9 &
+	echo $! >V.pid
+	test_when_finished "kill $(cat V.pid) || true"
+
+	error=0
+	if read output <&9
+	then
+		if ! test "$output" = "progress checkpoint"
+		then
+			echo >&2 "no progress checkpoint received: $output"
+			error=1
+		fi
+	else
+		echo >&2 "failed to read fast-import output"
+		error=1
+	fi
+
+	exec 8>&-
+	exec 9>&-
+
+	if test $error -eq 1
+	then
+		exit 1
+	fi
+}
+
+background_import_still_running () {
+	if ! kill -0 "$(cat V.pid)"
+	then
+		echo >&2 "background fast-import terminated too early"
+		exit 1
+	fi
+}
+
+test_expect_success 'V: checkpoint updates refs after reset' '
+	cat >input <<-\INPUT_END &&
+	reset refs/heads/V
+	from refs/heads/U
+
+	checkpoint
+	progress checkpoint
+	INPUT_END
+
+	background_import_until_checkpoint "" input &&
+	test "$(git rev-parse --verify V)" = "$(git rev-parse --verify U)" &&
+	background_import_still_running
+'
+
+test_expect_success 'V: checkpoint updates refs and marks after commit' '
+	cat >input <<-INPUT_END &&
+	commit refs/heads/V
+	mark :1
+	committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data 0
+	from refs/heads/U
+
+	checkpoint
+	progress checkpoint
+	INPUT_END
+
+	background_import_until_checkpoint "--export-marks=marks.actual" input &&
+
+	echo ":1 $(git rev-parse --verify V)" >marks.expected &&
+
+	test "$(git rev-parse --verify V^)" = "$(git rev-parse --verify U)" &&
+	test_cmp marks.expected marks.actual &&
+	background_import_still_running
+'
+
+# Re-create the exact same commit, but on a different branch: no new object is
+# created in the database, but the refs and marks still need to be updated.
+test_expect_success 'V: checkpoint updates refs and marks after commit (no new objects)' '
+	cat >input <<-INPUT_END &&
+	commit refs/heads/V2
+	mark :2
+	committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data 0
+	from refs/heads/U
+
+	checkpoint
+	progress checkpoint
+	INPUT_END
+
+	background_import_until_checkpoint "--export-marks=marks.actual" input &&
+
+	echo ":2 $(git rev-parse --verify V2)" >marks.expected &&
+
+	test "$(git rev-parse --verify V2)" = "$(git rev-parse --verify V)" &&
+	test_cmp marks.expected marks.actual &&
+	background_import_still_running
+'
+
+test_expect_success 'V: checkpoint updates tags after tag' '
+	cat >input <<-INPUT_END &&
+	tag Vtag
+	from refs/heads/V
+	tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data 0
+
+	checkpoint
+	progress checkpoint
+	INPUT_END
+
+	background_import_until_checkpoint "" input &&
+	git show-ref -d Vtag &&
+	background_import_still_running
+'
+
 test_done
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-27 19:46       ` [PATCH] " Eric Rannaud
@ 2017-09-27 23:19         ` Ramsay Jones
  2017-09-28  3:48         ` Junio C Hamano
  1 sibling, 0 replies; 20+ messages in thread
From: Ramsay Jones @ 2017-09-27 23:19 UTC (permalink / raw)
  To: Eric Rannaud, Junio C Hamano; +Cc: git, jeremy.serror, Shawn O . Pearce



On 27/09/17 20:46, Eric Rannaud wrote:
> The checkpoint command cycles packfiles if object_count != 0, a sensible
> test or there would be no pack files to write. Since 820b931012, the
> command also dumps branches, tags and marks, but still conditionally.
> However, it is possible for a command stream to modify refs or create
> marks without creating any new objects.
> 
> For example, reset a branch (and keep fast-import running):
> 
> 	$ git fast-import
> 	reset refs/heads/master
> 	from refs/heads/master^
> 
> 	checkpoint
> 
> but refs/heads/master remains unchanged.
> 
> Other example: a commit command that re-creates an object that already
> exists in the object database.
> 
> The man page also states that checkpoint "updates the refs" and that
> "placing a progress command immediately after a checkpoint will inform
> the reader when the checkpoint has been completed and it can safely
> access the refs that fast-import updated". This wasn't always true
> without this patch.
> 
> This fix unconditionally calls dump_{branches,tags,marks}() for all
> checkpoint commands. dump_branches() and dump_tags() are cheap to call
> in the case of a no-op.
> 
> Add tests to t9300 that observe the (non-packfiles) effects of
> checkpoint.
> 
> Signed-off-by: Eric Rannaud <e@nanocritical.com>
> ---
>  fast-import.c          |   6 +--
>  t/t9300-fast-import.sh | 129 +++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 132 insertions(+), 3 deletions(-)
> 
> 
> Use named pipes instead of the polling approach. Also incorporate your other
> comments.
> 
> 
> diff --git a/fast-import.c b/fast-import.c
> index 35bf671f12c4..d5e4cf0bad41 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -3189,10 +3189,10 @@ static void checkpoint(void)
>  	checkpoint_requested = 0;
>  	if (object_count) {
>  		cycle_packfile();
> -		dump_branches();
> -		dump_tags();
> -		dump_marks();
>  	}
> +	dump_branches();
> +	dump_tags();
> +	dump_marks();
>  }
>  
>  static void parse_checkpoint(void)
> diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
> index 67b8c50a5ab4..9aa3470d895b 100755
> --- a/t/t9300-fast-import.sh
> +++ b/t/t9300-fast-import.sh
> @@ -3120,4 +3120,133 @@ test_expect_success 'U: validate root delete result' '
>  	compare_diff_raw expect actual
>  '
>  
> +###
> +### series V (checkpoint)
> +###
> +
> +# To make sure you're observing the side effects of checkpoint *before*
> +# fast-import terminates (and thus writes out its state), check that the
> +# fast-import process is still running using background_import_still_running
> +# *after* evaluating the test conditions.
> +background_import_until_checkpoint () {
> +	options=$1
> +	input_file=$2
> +
> +	mkfifo V.input

Since you are using mkfifo here ...

> +	exec 8<>V.input
> +	rm V.input
> +
> +	mkfifo V.output
> +	exec 9<>V.output
> +	rm V.output
> +
> +	cat $input_file >&8
> +	git fast-import $options <&8 >&9 &
> +	echo $! >V.pid
> +	test_when_finished "kill $(cat V.pid) || true"
> +
> +	error=0
> +	if read output <&9
> +	then
> +		if ! test "$output" = "progress checkpoint"
> +		then
> +			echo >&2 "no progress checkpoint received: $output"
> +			error=1
> +		fi
> +	else
> +		echo >&2 "failed to read fast-import output"
> +		error=1
> +	fi
> +
> +	exec 8>&-
> +	exec 9>&-
> +
> +	if test $error -eq 1
> +	then
> +		exit 1
> +	fi
> +}
> +
> +background_import_still_running () {
> +	if ! kill -0 "$(cat V.pid)"
> +	then
> +		echo >&2 "background fast-import terminated too early"
> +		exit 1
> +	fi
> +}
> +

... you need to set the PIPE prerequisite on all of your new tests.

> +test_expect_success 'V: checkpoint updates refs after reset' '
> +	cat >input <<-\INPUT_END &&
> +	reset refs/heads/V
> +	from refs/heads/U
> +
> +	checkpoint
> +	progress checkpoint
> +	INPUT_END
> +
> +	background_import_until_checkpoint "" input &&
> +	test "$(git rev-parse --verify V)" = "$(git rev-parse --verify U)" &&
> +	background_import_still_running
> +'

[snip]

ATB,
Ramsay Jones



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-27 19:46       ` [PATCH] " Eric Rannaud
  2017-09-27 23:19         ` Ramsay Jones
@ 2017-09-28  3:48         ` Junio C Hamano
  2017-09-28  4:56           ` Eric Rannaud
  1 sibling, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2017-09-28  3:48 UTC (permalink / raw)
  To: Eric Rannaud; +Cc: git, jeremy.serror, Shawn O . Pearce

"Eric Rannaud" <e@nanocritical.com> writes:

> diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
> index 67b8c50a5ab4..9aa3470d895b 100755
> --- a/t/t9300-fast-import.sh
> +++ b/t/t9300-fast-import.sh
> @@ -3120,4 +3120,133 @@ test_expect_success 'U: validate root delete result' '
>  	compare_diff_raw expect actual
>  '
>  
> +###
> +### series V (checkpoint)
> +###
> +
> +# To make sure you're observing the side effects of checkpoint *before*
> +# fast-import terminates (and thus writes out its state), check that the
> +# fast-import process is still running using background_import_still_running
> +# *after* evaluating the test conditions.
> +background_import_until_checkpoint () {
> +	options=$1
> +	input_file=$2
> +
> +	mkfifo V.input
> +	exec 8<>V.input
> +	rm V.input
> +
> +	mkfifo V.output
> +	exec 9<>V.output
> +	rm V.output
> +
> +	cat $input_file >&8

It probably is a good idea to quote "$input_file" in case other
people later use a full path to the file or something; for now this
is OK.

fd#8 at this point does not have a reader; unless the contents of
the $input_file is small enough, wouldn't this "cat" block until
somebody else comes and reads from it to drain?  Should we instead
start fast-import first in the background, arrange it to be killed
when we are done with it, and then start feeding the input?

> +	git fast-import $options <&8 >&9 &
> +	echo $! >V.pid
> +	test_when_finished "kill $(cat V.pid) || true"

This '|| true' is here because the process might already have died
on its own, which sounds like a sensible precaution.

> +	error=0
> +	if read output <&9
> +	then
> +		if ! test "$output" = "progress checkpoint"
> +		then
> +			echo >&2 "no progress checkpoint received: $output"
> +			error=1
> +		fi
> +	else
> +		echo >&2 "failed to read fast-import output"
> +		error=1
> +	fi

And we expect "progress checkpoint" would be the first and only
output after fast-import consumes all the input stream up to the
"progress" thing we feed, so this is not "read and discard until
we see 'progress checkpoint'" but is "read one and that must be
'progress checkpoint'".  Makes sense to me.

If this script is (and will be in the future) all about issuing a
checkpoint command and observing its effect, we can reasonably
expect that the input file _must_ end with "checkpoint" followed by
"progress checkpoint", no?  If that is the case, perhaps feeding
these two from this helper function to >&8, instead of forcing the
caller to prepare the input file to always end with these two, may
be a better organization.

> +	exec 8>&-
> +	exec 9>&-

These are to make sure that nobody (after fast-import dies) has
these file descriptors hanging open for writing.  Makes one wonder
what happens to the reader side of the file descriptor, though ;-)

Before we return from this function, we expect (as the comment
before the function says) that fast-import is still running, waiting
further input.  Wouldn't closing the other side of the pipe here
like these make it notice that there is no more data by causing
read_next_command() find EOF?  IOW, is "use import_until_checkout,
test the outcome and then make sure import_still_running reports that
the outcome was not due to the process terminating and flushing"
somewhat racy?

Or are we closing these file descriptors for different reason
(i.e. not to tell fast-import we are done feeding it input) and I am
reading the code incorrectly?  Puzzled.

> +	if test $error -eq 1
> +	then
> +		exit 1
> +	fi
> +}
> +
> +background_import_still_running () {
> +	if ! kill -0 "$(cat V.pid)"
> +	then
> +		echo >&2 "background fast-import terminated too early"
> +		exit 1
> +	fi
> +}

I suspect these "exit 1" above should be "false", to give the calling
test_expect_success a chance to notice the failure and react to it.

> +test_expect_success 'V: checkpoint updates refs after reset' '
> +	cat >input <<-\INPUT_END &&
> +	reset refs/heads/V
> +	from refs/heads/U
> +
> +	checkpoint
> +	progress checkpoint
> +	INPUT_END
> +
> +	background_import_until_checkpoint "" input &&
> +	test "$(git rev-parse --verify V)" = "$(git rev-parse --verify U)" &&
> +	background_import_still_running
> +'

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-28  3:48         ` Junio C Hamano
@ 2017-09-28  4:56           ` Eric Rannaud
  2017-09-28  5:07             ` [PATCH 1/1] " Eric Rannaud
  2017-09-28  6:02             ` Junio C Hamano
  0 siblings, 2 replies; 20+ messages in thread
From: Eric Rannaud @ 2017-09-28  4:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy SERROR, Shawn O . Pearce

On Wed, Sep 27, 2017 at 8:48 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> +     cat $input_file >&8
>
> It probably is a good idea to quote "$input_file" in case other
> people later use a full path to the file or something; for now this
> is OK.

Right.


> fd#8 at this point does not have a reader; unless the contents of
> the $input_file is small enough, wouldn't this "cat" block until
> somebody else comes and reads from it to drain?  Should we instead
> start fast-import first in the background, arrange it to be killed
> when we are done with it, and then start feeding the input?

Good point, I will swap the order.


>> +     git fast-import $options <&8 >&9 &
>> +     echo $! >V.pid
>> +     test_when_finished "kill $(cat V.pid) || true"
>
> This '|| true' is here because the process might already have died
> on its own, which sounds like a sensible precaution.

I added a comment.


>> +     error=0
>> +     if read output <&9
>> +     then
>> +             if ! test "$output" = "progress checkpoint"
>> +             then
>> +                     echo >&2 "no progress checkpoint received: $output"
>> +                     error=1
>> +             fi
>> +     else
>> +             echo >&2 "failed to read fast-import output"
>> +             error=1
>> +     fi
>
> And we expect "progress checkpoint" would be the first and only
> output after fast-import consumes all the input stream up to the
> "progress" thing we feed, so this is not "read and discard until
> we see 'progress checkpoint'" but is "read one and that must be
> 'progress checkpoint'".  Makes sense to me.
>
> If this script is (and will be in the future) all about issuing a
> checkpoint command and observing its effect, we can reasonably
> expect that the input file _must_ end with "checkpoint" followed by
> "progress checkpoint", no?  If that is the case, perhaps feeding
> these two from this helper function to >&8, instead of forcing the
> caller to prepare the input file to always end with these two, may
> be a better organization.

Agreed. Renamed the function background_import_then_checkpoint to
reflect the change.


>> +     exec 8>&-
>> +     exec 9>&-
>
> These are to make sure that nobody (after fast-import dies) has
> these file descriptors hanging open for writing.  Makes one wonder
> what happens to the reader side of the file descriptor, though ;-)
>
> Before we return from this function, we expect (as the comment
> before the function says) that fast-import is still running, waiting
> further input.  Wouldn't closing the other side of the pipe here
> like these make it notice that there is no more data by causing
> read_next_command() find EOF?  IOW, is "use import_until_checkout,
> test the outcome and then make sure import_still_running reports that
> the outcome was not due to the process terminating and flushing"
> somewhat racy?
>
> Or are we closing these file descriptors for different reason
> (i.e. not to tell fast-import we are done feeding it input) and I am
> reading the code incorrectly?  Puzzled.

Closing 8 and 9 was just housekeeping on my part. But you raise a good
point: what happens then to the stdin of fast-import?

Doesn't fast-import get a copy of 8 (open for both reading and
writing), as a child process, and exec 8>&- only closes the copy of
the file descriptor in the parent shell, so the named pipe remains
open for writing somewhere (in the fast-import process itself, in
fact), therefore fast-import will not find EOF on its stdin?

But in any case, it is sensible to delay the closing of 8 and 9 to
test_when_finished.


>> +     if test $error -eq 1
>> +     then
>> +             exit 1
>> +     fi
>> +}
>> +
>> +background_import_still_running () {
>> +     if ! kill -0 "$(cat V.pid)"
>> +     then
>> +             echo >&2 "background fast-import terminated too early"
>> +             exit 1
>> +     fi
>> +}
>
> I suspect these "exit 1" above should be "false", to give the calling
> test_expect_success a chance to notice the failure and react to it.

True.

Will follow-up with an updated patch.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/1] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-28  4:56           ` Eric Rannaud
@ 2017-09-28  5:07             ` Eric Rannaud
  2017-09-28 10:35               ` Junio C Hamano
  2017-09-28 12:59               ` Adam Dinwoodie
  2017-09-28  6:02             ` Junio C Hamano
  1 sibling, 2 replies; 20+ messages in thread
From: Eric Rannaud @ 2017-09-28  5:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, jeremy.serror, Shawn O . Pearce, Ramsay Jones, Eric Rannaud

The checkpoint command cycles packfiles if object_count != 0, a sensible
test or there would be no pack files to write. Since 820b931012, the
command also dumps branches, tags and marks, but still conditionally.
However, it is possible for a command stream to modify refs or create
marks without creating any new objects.

For example, reset a branch (and keep fast-import running):

	$ git fast-import
	reset refs/heads/master
	from refs/heads/master^

	checkpoint

but refs/heads/master remains unchanged.

Other example: a commit command that re-creates an object that already
exists in the object database.

The man page also states that checkpoint "updates the refs" and that
"placing a progress command immediately after a checkpoint will inform
the reader when the checkpoint has been completed and it can safely
access the refs that fast-import updated". This wasn't always true
without this patch.

This fix unconditionally calls dump_{branches,tags,marks}() for all
checkpoint commands. dump_branches() and dump_tags() are cheap to call
in the case of a no-op.

Add tests to t9300 that observe the (non-packfiles) effects of
checkpoint.

Signed-off-by: Eric Rannaud <e@nanocritical.com>
---
 fast-import.c          |   6 +--
 t/t9300-fast-import.sh | 126 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 129 insertions(+), 3 deletions(-)


Updated to include Junio's latest remarks.

Also adding the necessary PIPE prereq, as pointed out by Ramsay Jones.


diff --git a/fast-import.c b/fast-import.c
index 35bf671f12c4..d5e4cf0bad41 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -3189,10 +3189,10 @@ static void checkpoint(void)
 	checkpoint_requested = 0;
 	if (object_count) {
 		cycle_packfile();
-		dump_branches();
-		dump_tags();
-		dump_marks();
 	}
+	dump_branches();
+	dump_tags();
+	dump_marks();
 }
 
 static void parse_checkpoint(void)
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 67b8c50a5ab4..b8d394548520 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -3120,4 +3120,130 @@ test_expect_success 'U: validate root delete result' '
 	compare_diff_raw expect actual
 '
 
+###
+### series V (checkpoint)
+###
+
+# The commands in input_file should not produce any output on the file
+# descriptor set with --cat-blob-fd (or stdout if unspecified).
+#
+# To make sure you're observing the side effects of checkpoint *before*
+# fast-import terminates (and thus writes out its state), check that the
+# fast-import process is still running using background_import_still_running
+# *after* evaluating the test conditions.
+background_import_then_checkpoint () {
+	options=$1
+	input_file=$2
+
+	mkfifo V.input
+	exec 8<>V.input
+	rm V.input
+
+	mkfifo V.output
+	exec 9<>V.output
+	rm V.output
+
+	git fast-import $options <&8 >&9 &
+	echo $! >V.pid
+	# We don't mind if fast-import has already died by the time the test
+	# ends.
+	test_when_finished "exec 8>&-; exec 9>&-; kill $(cat V.pid) || true"
+
+	cat "$input_file" >&8
+	echo "checkpoint" >&8
+	echo "progress checkpoint" >&8
+
+	error=0
+	if read output <&9
+	then
+		if ! test "$output" = "progress checkpoint"
+		then
+			echo >&2 "no progress checkpoint received: $output"
+			error=1
+		fi
+	else
+		echo >&2 "failed to read fast-import output"
+		error=1
+	fi
+
+	if test $error -eq 1
+	then
+		false
+	fi
+}
+
+background_import_still_running () {
+	if ! kill -0 "$(cat V.pid)"
+	then
+		echo >&2 "background fast-import terminated too early"
+		false
+	fi
+}
+
+test_expect_success PIPE 'V: checkpoint updates refs after reset' '
+	cat >input <<-\INPUT_END &&
+	reset refs/heads/V
+	from refs/heads/U
+
+	INPUT_END
+
+	background_import_then_checkpoint "" input &&
+	test "$(git rev-parse --verify V)" = "$(git rev-parse --verify U)" &&
+	background_import_still_running
+'
+
+test_expect_success PIPE 'V: checkpoint updates refs and marks after commit' '
+	cat >input <<-INPUT_END &&
+	commit refs/heads/V
+	mark :1
+	committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data 0
+	from refs/heads/U
+
+	INPUT_END
+
+	background_import_then_checkpoint "--export-marks=marks.actual" input &&
+
+	echo ":1 $(git rev-parse --verify V)" >marks.expected &&
+
+	test "$(git rev-parse --verify V^)" = "$(git rev-parse --verify U)" &&
+	test_cmp marks.expected marks.actual &&
+	background_import_still_running
+'
+
+# Re-create the exact same commit, but on a different branch: no new object is
+# created in the database, but the refs and marks still need to be updated.
+test_expect_success PIPE 'V: checkpoint updates refs and marks after commit (no new objects)' '
+	cat >input <<-INPUT_END &&
+	commit refs/heads/V2
+	mark :2
+	committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data 0
+	from refs/heads/U
+
+	INPUT_END
+
+	background_import_then_checkpoint "--export-marks=marks.actual" input &&
+
+	echo ":2 $(git rev-parse --verify V2)" >marks.expected &&
+
+	test "$(git rev-parse --verify V2)" = "$(git rev-parse --verify V)" &&
+	test_cmp marks.expected marks.actual &&
+	background_import_still_running
+'
+
+test_expect_success PIPE 'V: checkpoint updates tags after tag' '
+	cat >input <<-INPUT_END &&
+	tag Vtag
+	from refs/heads/V
+	tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data 0
+
+	INPUT_END
+
+	background_import_then_checkpoint "" input &&
+	git show-ref -d Vtag &&
+	background_import_still_running
+'
+
 test_done
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-28  4:56           ` Eric Rannaud
  2017-09-28  5:07             ` [PATCH 1/1] " Eric Rannaud
@ 2017-09-28  6:02             ` Junio C Hamano
  2017-09-28  6:44               ` Eric Rannaud
  1 sibling, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2017-09-28  6:02 UTC (permalink / raw)
  To: Eric Rannaud; +Cc: git, Jeremy SERROR, Shawn O . Pearce

Eric Rannaud <e@nanocritical.com> writes:

> Doesn't fast-import get a copy of 8 (open for both reading and
> writing), as a child process, and exec 8>&- only closes the copy of
> the file descriptor in the parent shell, so the named pipe remains
> open for writing somewhere (in the fast-import process itself, in
> fact), therefore fast-import will not find EOF on its stdin?

AHHHHhhhh.  If that was done intentionally, well, I really have to
marvel at the cleverness of the solution!  It makes sense now to me.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-28  6:02             ` Junio C Hamano
@ 2017-09-28  6:44               ` Eric Rannaud
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Rannaud @ 2017-09-28  6:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy SERROR, Shawn O . Pearce

On Wed, Sep 27, 2017 at 11:02 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Eric Rannaud <e@nanocritical.com> writes:
>
>> Doesn't fast-import get a copy of 8 (open for both reading and
>> writing), as a child process, and exec 8>&- only closes the copy of
>> the file descriptor in the parent shell, so the named pipe remains
>> open for writing somewhere (in the fast-import process itself, in
>> fact), therefore fast-import will not find EOF on its stdin?
>
> AHHHHhhhh.  If that was done intentionally, well, I really have to
> marvel at the cleverness of the solution!  It makes sense now to me.

I plead the Fifth.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/1] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-28  5:07             ` [PATCH 1/1] " Eric Rannaud
@ 2017-09-28 10:35               ` Junio C Hamano
  2017-09-28 20:30                 ` Eric Rannaud
  2017-09-28 12:59               ` Adam Dinwoodie
  1 sibling, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2017-09-28 10:35 UTC (permalink / raw)
  To: Eric Rannaud; +Cc: git, jeremy.serror, Shawn O . Pearce, Ramsay Jones

"Eric Rannaud" <e@nanocritical.com> writes:

> +# The commands in input_file should not produce any output on the file
> +# descriptor set with --cat-blob-fd (or stdout if unspecified).

Thanks for documenting this.  Swapping the order of starting
fast-import and feeding its input (which is one change in this
version relative to the previous one) alone would not help, because
in the updated order in this patch, nobody is reading from
fast-import until the parent process finishes feeding it.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/1] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-28  5:07             ` [PATCH 1/1] " Eric Rannaud
  2017-09-28 10:35               ` Junio C Hamano
@ 2017-09-28 12:59               ` Adam Dinwoodie
  2017-09-28 21:03                 ` [PATCH] " Eric Rannaud
  2017-09-29  2:44                 ` [PATCH 1/1] " Junio C Hamano
  1 sibling, 2 replies; 20+ messages in thread
From: Adam Dinwoodie @ 2017-09-28 12:59 UTC (permalink / raw)
  To: Eric Rannaud
  Cc: Junio C Hamano, git, jeremy.serror, Shawn O . Pearce, Ramsay Jones

On Wed, Sep 27, 2017 at 10:07:41PM -0700, Eric Rannaud wrote:
> The checkpoint command cycles packfiles if object_count != 0, a sensible
> test or there would be no pack files to write. Since 820b931012, the
> command also dumps branches, tags and marks, but still conditionally.
> However, it is possible for a command stream to modify refs or create
> marks without creating any new objects.
> 
> For example, reset a branch (and keep fast-import running):
> 
> 	$ git fast-import
> 	reset refs/heads/master
> 	from refs/heads/master^
> 
> 	checkpoint
> 
> but refs/heads/master remains unchanged.
> 
> Other example: a commit command that re-creates an object that already
> exists in the object database.
> 
> The man page also states that checkpoint "updates the refs" and that
> "placing a progress command immediately after a checkpoint will inform
> the reader when the checkpoint has been completed and it can safely
> access the refs that fast-import updated". This wasn't always true
> without this patch.
> 
> This fix unconditionally calls dump_{branches,tags,marks}() for all
> checkpoint commands. dump_branches() and dump_tags() are cheap to call
> in the case of a no-op.
> 
> Add tests to t9300 that observe the (non-packfiles) effects of
> checkpoint.
> 
> Signed-off-by: Eric Rannaud <e@nanocritical.com>
> ---
>  fast-import.c          |   6 +--
>  t/t9300-fast-import.sh | 126 +++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 129 insertions(+), 3 deletions(-)
> 
> 
> Updated to include Junio's latest remarks.
> 
> Also adding the necessary PIPE prereq, as pointed out by Ramsay Jones.

Cygwin doesn't have the PIPE prereq; I've just confirmed that the
previous version of this patch has t9300 failing on Cygwin, but this
version passes.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/1] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-28 10:35               ` Junio C Hamano
@ 2017-09-28 20:30                 ` Eric Rannaud
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Rannaud @ 2017-09-28 20:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy SERROR, Shawn O . Pearce, Ramsay Jones

On Thu, Sep 28, 2017 at 3:35 AM, Junio C Hamano <gitster@pobox.com> wrote:
> "Eric Rannaud" <e@nanocritical.com> writes:
>
>> +# The commands in input_file should not produce any output on the file
>> +# descriptor set with --cat-blob-fd (or stdout if unspecified).
>
> Thanks for documenting this.  Swapping the order of starting
> fast-import and feeding its input (which is one change in this
> version relative to the previous one) alone would not help, because
> in the updated order in this patch, nobody is reading from
> fast-import until the parent process finishes feeding it.

Darn. That's correct, on a platform with blocking pipes it would not
work. I will start the input cat in the background (so it may block
while writing), before reading from the output of fast-import.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-28 12:59               ` Adam Dinwoodie
@ 2017-09-28 21:03                 ` Eric Rannaud
  2017-09-29  2:44                 ` [PATCH 1/1] " Junio C Hamano
  1 sibling, 0 replies; 20+ messages in thread
From: Eric Rannaud @ 2017-09-28 21:03 UTC (permalink / raw)
  To: Adam Dinwoodie
  Cc: Junio C Hamano, git, jeremy.serror, Shawn O . Pearce, Ramsay Jones

On Thu, Sep 28, 2017 at 5:59 AM, Adam Dinwoodie <adam@dinwoodie.org> wrote:
> On Wed, Sep 27, 2017 at 10:07:41PM -0700, Eric Rannaud wrote:
>>
>> Also adding the necessary PIPE prereq, as pointed out by Ramsay Jones.
>
> Cygwin doesn't have the PIPE prereq; I've just confirmed that the
> previous version of this patch has t9300 failing on Cygwin, but this
> version passes.

What's the preferred solution here? I can avoid using named pipes entirely:

	read_checkpoint () {
		if read output
		then
			if ! test "$output" = "progress checkpoint"
			then
				echo >&2 "no progress checkpoint received: $output"
				echo 1 > V.result
			else
				echo 0 > V.result
			fi
		else
			echo >&2 "failed to read fast-import output"
			echo 1 > V.result
		fi
	}
	
	# The commands in input_file should not produce any output on the file
	# descriptor set with --cat-blob-fd (or stdout if unspecified).
	#
	# To make sure you're observing the side effects of checkpoint *before*
	# fast-import terminates (and thus writes out its state), check that the
	# fast-import process is still running using background_import_still_running
	# *after* evaluating the test conditions.
	background_import_then_checkpoint () {
		options=$1
		input_file=$2
	
		rm -f V.result
	
		( cat "$input_file"
		echo "checkpoint"
		echo "progress checkpoint"
		sleep 3600 &
		echo $! >V.pid
		wait ) | git fast-import $options | read_checkpoint &
	
		# We don't mind if the pipeline has already died by the time the test
		# ends.
		test_when_finished "kill $(cat V.pid) || true"
	
		while ! test -f V.result
		do
			# Try to sleep less than a second, if supported.
			sleep .1 2>/dev/null || sleep 1
		done
		return $(cat V.result)
	}

Do we like this better?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/1] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-28 12:59               ` Adam Dinwoodie
  2017-09-28 21:03                 ` [PATCH] " Eric Rannaud
@ 2017-09-29  2:44                 ` Junio C Hamano
  2017-09-29  3:09                   ` [PATCH] " Eric Rannaud
  1 sibling, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2017-09-29  2:44 UTC (permalink / raw)
  To: Adam Dinwoodie
  Cc: Eric Rannaud, git, jeremy.serror, Shawn O . Pearce, Ramsay Jones

Adam Dinwoodie <adam@dinwoodie.org> writes:

>> Also adding the necessary PIPE prereq, as pointed out by Ramsay Jones.
>
> Cygwin doesn't have the PIPE prereq; I've just confirmed that the
> previous version of this patch has t9300 failing on Cygwin, but this
> version passes.

Thanks for a report.  So the patch should be good to go, it seems to
me.

Eric, thanks again for working on this one.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-29  2:44                 ` [PATCH 1/1] " Junio C Hamano
@ 2017-09-29  3:09                   ` Eric Rannaud
  2017-09-29  3:51                     ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Rannaud @ 2017-09-29  3:09 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Adam Dinwoodie, git, jeremy.serror, Shawn O . Pearce,
	Ramsay Jones, Eric Rannaud

The checkpoint command cycles packfiles if object_count != 0, a sensible
test or there would be no pack files to write. Since 820b931012, the
command also dumps branches, tags and marks, but still conditionally.
However, it is possible for a command stream to modify refs or create
marks without creating any new objects.

For example, reset a branch (and keep fast-import running):

	$ git fast-import
	reset refs/heads/master
	from refs/heads/master^

	checkpoint

but refs/heads/master remains unchanged.

Other example: a commit command that re-creates an object that already
exists in the object database.

The man page also states that checkpoint "updates the refs" and that
"placing a progress command immediately after a checkpoint will inform
the reader when the checkpoint has been completed and it can safely
access the refs that fast-import updated". This wasn't always true
without this patch.

This fix unconditionally calls dump_{branches,tags,marks}() for all
checkpoint commands. dump_branches() and dump_tags() are cheap to call
in the case of a no-op.

Add tests to t9300 that observe the (non-packfiles) effects of
checkpoint.

Signed-off-by: Eric Rannaud <e@nanocritical.com>
---
 fast-import.c          |   6 +--
 t/t9300-fast-import.sh | 130 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 133 insertions(+), 3 deletions(-)


Junio, this last version addresses your last remark regarding the
potential for the cat $input_file sequence to block when the FIFOs are
unbuffered.

The concern is mainly theoretical (*if* the shell function is used
correctly): fast-import will not start writing to its own output until
it has fully consumed its input (as the only command generating output
should be the last one). Nevertheless, in this version the write is done
in the background.

Thanks!


diff --git a/fast-import.c b/fast-import.c
index 35bf671f12c4..d5e4cf0bad41 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -3189,10 +3189,10 @@ static void checkpoint(void)
 	checkpoint_requested = 0;
 	if (object_count) {
 		cycle_packfile();
-		dump_branches();
-		dump_tags();
-		dump_marks();
 	}
+	dump_branches();
+	dump_tags();
+	dump_marks();
 }
 
 static void parse_checkpoint(void)
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 67b8c50a5ab4..8f583e8a22c1 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -3120,4 +3120,134 @@ test_expect_success 'U: validate root delete result' '
 	compare_diff_raw expect actual
 '
 
+###
+### series V (checkpoint)
+###
+
+# The commands in input_file should not produce any output on the file
+# descriptor set with --cat-blob-fd (or stdout if unspecified).
+#
+# To make sure you're observing the side effects of checkpoint *before*
+# fast-import terminates (and thus writes out its state), check that the
+# fast-import process is still running using background_import_still_running
+# *after* evaluating the test conditions.
+background_import_then_checkpoint () {
+	options=$1
+	input_file=$2
+
+	mkfifo V.input
+	exec 8<>V.input
+	rm V.input
+
+	mkfifo V.output
+	exec 9<>V.output
+	rm V.output
+
+	git fast-import $options <&8 >&9 &
+	echo $! >V.pid
+	# We don't mind if fast-import has already died by the time the test
+	# ends.
+	test_when_finished "exec 8>&-; exec 9>&-; kill $(cat V.pid) || true"
+
+	# Start in the background to ensure we adhere strictly to (blocking)
+	# pipes writing sequence. We want to assume that the write below could
+	# block, e.g. if fast-import blocks writing its own output to &9
+	# because there is no reader on &9 yet.
+	( cat "$input_file"
+	echo "checkpoint"
+	echo "progress checkpoint" ) >&8 &
+
+	error=0
+	if read output <&9
+	then
+		if ! test "$output" = "progress checkpoint"
+		then
+			echo >&2 "no progress checkpoint received: $output"
+			error=1
+		fi
+	else
+		echo >&2 "failed to read fast-import output"
+		error=1
+	fi
+
+	if test $error -eq 1
+	then
+		false
+	fi
+}
+
+background_import_still_running () {
+	if ! kill -0 "$(cat V.pid)"
+	then
+		echo >&2 "background fast-import terminated too early"
+		false
+	fi
+}
+
+test_expect_success PIPE 'V: checkpoint updates refs after reset' '
+	cat >input <<-\INPUT_END &&
+	reset refs/heads/V
+	from refs/heads/U
+
+	INPUT_END
+
+	background_import_then_checkpoint "" input &&
+	test "$(git rev-parse --verify V)" = "$(git rev-parse --verify U)" &&
+	background_import_still_running
+'
+
+test_expect_success PIPE 'V: checkpoint updates refs and marks after commit' '
+	cat >input <<-INPUT_END &&
+	commit refs/heads/V
+	mark :1
+	committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data 0
+	from refs/heads/U
+
+	INPUT_END
+
+	background_import_then_checkpoint "--export-marks=marks.actual" input &&
+
+	echo ":1 $(git rev-parse --verify V)" >marks.expected &&
+
+	test "$(git rev-parse --verify V^)" = "$(git rev-parse --verify U)" &&
+	test_cmp marks.expected marks.actual &&
+	background_import_still_running
+'
+
+# Re-create the exact same commit, but on a different branch: no new object is
+# created in the database, but the refs and marks still need to be updated.
+test_expect_success PIPE 'V: checkpoint updates refs and marks after commit (no new objects)' '
+	cat >input <<-INPUT_END &&
+	commit refs/heads/V2
+	mark :2
+	committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data 0
+	from refs/heads/U
+
+	INPUT_END
+
+	background_import_then_checkpoint "--export-marks=marks.actual" input &&
+
+	echo ":2 $(git rev-parse --verify V2)" >marks.expected &&
+
+	test "$(git rev-parse --verify V2)" = "$(git rev-parse --verify V)" &&
+	test_cmp marks.expected marks.actual &&
+	background_import_still_running
+'
+
+test_expect_success PIPE 'V: checkpoint updates tags after tag' '
+	cat >input <<-INPUT_END &&
+	tag Vtag
+	from refs/heads/V
+	tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data 0
+
+	INPUT_END
+
+	background_import_then_checkpoint "" input &&
+	git show-ref -d Vtag &&
+	background_import_still_running
+'
+
 test_done
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-29  3:09                   ` [PATCH] " Eric Rannaud
@ 2017-09-29  3:51                     ` Junio C Hamano
  2017-09-29  5:40                       ` Eric Rannaud
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2017-09-29  3:51 UTC (permalink / raw)
  To: Eric Rannaud
  Cc: Adam Dinwoodie, git, jeremy.serror, Shawn O . Pearce, Ramsay Jones

"Eric Rannaud" <e@nanocritical.com> writes:

> Junio, this last version addresses your last remark regarding the
> potential for the cat $input_file sequence to block when the FIFOs are
> unbuffered.
>
> The concern is mainly theoretical (*if* the shell function is used
> correctly): fast-import will not start writing to its own output until
> it has fully consumed its input (as the only command generating output
> should be the last one). Nevertheless, in this version the write is done
> in the background.

I agree that the concern is theoretical.  Without this fix, we may
not be able to feed the input fully up to the final 'progress
checkpoint' command---because the other side is not reading, we may
get stuck while attempting to write "checkpoint" or "progress
checkpoint", and may never get to read what fast-import says to get
us unstuck.

But if we wanted to solve the theoretical issue (i.e. the command
sequence the user of this helper shell function gives us _might_
trigger an output from fast-import) fully, I do not think
backgrounding the feeding side is sufficient.  The code that reads
fd#9 would need to become a while loop that reads and discards extra
output until we see the "progress checkpoint", at least, perhaps
like the attached patch.

But even with such a fix, we are still at the mercy of the caller of
the helper---the helper will get broken if the input happened to
have a "progress checkpoint" command itself.  There has to be a
"good enough" place to stop.

I think that your patch the last round that feeds fd#8 in the
foreground (i.e. fully trusting that the caller is sensibly giving
input that produces no output) is already a good place to stop.

Your patch this round that feeds fd#8 in the background, plus the
attached patch (i.e. not trusting the caller as much and allowing it
to use commands that outputs something, within reason), would also
be a good place to stop.

But I am not sure your patch this round alone is a good place to
stop.  It somehow feels halfway either way.

 t/t9300-fast-import.sh | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 74ba70874b..d47560b634 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -3159,18 +3159,17 @@ background_import_then_checkpoint () {
 		echo "progress checkpoint"
 	) >&8 &
 
-	error=0
-	if read output <&9
-	then
-		if ! test "$output" = "progress checkpoint"
+	error=1 ;# assume the worst
+	while read output <&9
+	do
+		if test "$output" = "progress checkpoint"
 		then
-			echo >&2 "no progress checkpoint received: $output"
-			error=1
+			error=0
+			break
 		fi
-	else
-		echo >&2 "failed to read fast-import output"
-		error=1
-	fi
+		# otherwise ignore cruft
+		echo >&2 "cruft: $output"
+	done
 
 	if test $error -eq 1
 	then
@@ -3186,6 +3185,17 @@ background_import_still_running () {
 	fi
 }
 
+test_expect_success PIPE 'V: checkpoint helper does not get stuck with extra output' '
+	cat >input <<-INPUT_END &&
+	progress foo
+	progress bar
+
+	INPUT_END
+
+	background_import_then_checkpoint "" input &&
+	background_import_still_running
+'
+
 test_expect_success PIPE 'V: checkpoint updates refs after reset' '
 	cat >input <<-\INPUT_END &&
 	reset refs/heads/V
-- 
2.14.2-820-gd7428e091c




^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-29  3:51                     ` Junio C Hamano
@ 2017-09-29  5:40                       ` Eric Rannaud
  2017-09-29  9:35                         ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Rannaud @ 2017-09-29  5:40 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Adam Dinwoodie, git, Jeremy SERROR, Shawn O . Pearce, Ramsay Jones

On Thu, Sep 28, 2017 at 8:51 PM, Junio C Hamano <gitster@pobox.com> wrote:
> I think that your patch the last round that feeds fd#8 in the
> foreground (i.e. fully trusting that the caller is sensibly giving
> input that produces no output) is already a good place to stop.
>
> Your patch this round that feeds fd#8 in the background, plus the
> attached patch (i.e. not trusting the caller as much and allowing it
> to use commands that outputs something, within reason), would also
> be a good place to stop.
>
> But I am not sure your patch this round alone is a good place to
> stop.  It somehow feels halfway either way.

I agree. If we're coding defensively against the caller, we do have to
include your patch to be effective, you're right. I reckon we likely
don't need to be quite so paranoid, at least until this has more
users.

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0
  2017-09-29  5:40                       ` Eric Rannaud
@ 2017-09-29  9:35                         ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2017-09-29  9:35 UTC (permalink / raw)
  To: Eric Rannaud
  Cc: Adam Dinwoodie, git, Jeremy SERROR, Shawn O . Pearce, Ramsay Jones

Eric Rannaud <e@nanocritical.com> writes:

> On Thu, Sep 28, 2017 at 8:51 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> I think that your patch the last round that feeds fd#8 in the
>> foreground (i.e. fully trusting that the caller is sensibly giving
>> input that produces no output) is already a good place to stop.
>>
>> Your patch this round that feeds fd#8 in the background, plus the
>> attached patch (i.e. not trusting the caller as much and allowing it
>> to use commands that outputs something, within reason), would also
>> be a good place to stop.
>>
>> But I am not sure your patch this round alone is a good place to
>> stop.  It somehow feels halfway either way.
>
> I agree. If we're coding defensively against the caller, we do have to
> include your patch to be effective, you're right. I reckon we likely
> don't need to be quite so paranoid, at least until this has more
> users.

OK, let's then pick the (not too excessively) defensive version by
taking your last one and suggested "while" loop squashed into it.

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2017-09-29  9:35 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-26  3:30 [PATCH] fast-import: checkpoint: dump branches/tags/marks even if object_count==0 Eric Rannaud
2017-09-26  4:25 ` Junio C Hamano
2017-09-26  9:53   ` [PATCH 1/1] " Eric Rannaud
2017-09-27  3:37     ` Junio C Hamano
2017-09-27 19:46       ` [PATCH] " Eric Rannaud
2017-09-27 23:19         ` Ramsay Jones
2017-09-28  3:48         ` Junio C Hamano
2017-09-28  4:56           ` Eric Rannaud
2017-09-28  5:07             ` [PATCH 1/1] " Eric Rannaud
2017-09-28 10:35               ` Junio C Hamano
2017-09-28 20:30                 ` Eric Rannaud
2017-09-28 12:59               ` Adam Dinwoodie
2017-09-28 21:03                 ` [PATCH] " Eric Rannaud
2017-09-29  2:44                 ` [PATCH 1/1] " Junio C Hamano
2017-09-29  3:09                   ` [PATCH] " Eric Rannaud
2017-09-29  3:51                     ` Junio C Hamano
2017-09-29  5:40                       ` Eric Rannaud
2017-09-29  9:35                         ` Junio C Hamano
2017-09-28  6:02             ` Junio C Hamano
2017-09-28  6:44               ` Eric Rannaud

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.