git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 00/10] The final building block for a faster rebase -i
@ 2017-07-14 14:44 Johannes Schindelin
  2017-07-14 14:44 ` [PATCH v6 01/10] t3415: verify that an empty instructionFormat is handled as before Johannes Schindelin
                   ` (11 more replies)
  0 siblings, 12 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-14 14:44 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

This patch series reimplements the expensive pre- and post-processing of
the todo script in C.

And it concludes the work I did to accelerate rebase -i so far.

I am still unwilling to replace a compile-time safe way to pass the
options to the revision machinery by the alternative (which I am still
flabbergasted about) proposed by Junio. This will not change.

I know that we want to concentrate on bug fixes on `master`, but this
patch series will most likely take a couple of months to get there, anyway.
So I may just as well send this iteration now. It's also not like I haven't
contributed any bug fixes lately...

Changes since v5:

- replaced a get_sha1() call by a get_oid() call already.

- adjusted to hashmap API changes


Johannes Schindelin (10):
  t3415: verify that an empty instructionFormat is handled as before
  rebase -i: generate the script via rebase--helper
  rebase -i: remove useless indentation
  rebase -i: do not invent onelines when expanding/collapsing SHA-1s
  rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  t3404: relax rebase.missingCommitsCheck tests
  rebase -i: check for missing commits in the rebase--helper
  rebase -i: skip unnecessary picks using the rebase--helper
  t3415: test fixup with wrapped oneline
  rebase -i: rearrange fixup/squash lines using the rebase--helper

 Documentation/git-rebase.txt  |  16 +-
 builtin/rebase--helper.c      |  29 ++-
 git-rebase--interactive.sh    | 373 ++++-------------------------
 sequencer.c                   | 531 ++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                   |   8 +
 t/t3404-rebase-interactive.sh |  22 +-
 t/t3415-rebase-autosquash.sh  |  28 ++-
 7 files changed, 647 insertions(+), 360 deletions(-)


base-commit: f3da2b79be9565779e4f76dc5812c68e156afdf0
Based-On: rebase--helper at https://github.com/dscho/git
Fetch-Base-Via: git fetch https://github.com/dscho/git rebase--helper
Published-As: https://github.com/dscho/git/releases/tag/rebase-i-extra-v6
Fetch-It-Via: git fetch https://github.com/dscho/git rebase-i-extra-v6

Interdiff vs v5:
 diff --git a/sequencer.c b/sequencer.c
 index 8713cc8d1d5..c54596f9699 100644
 --- a/sequencer.c
 +++ b/sequencer.c
 @@ -2654,7 +2654,7 @@ int skip_unnecessary_picks(void)
  
  	if (!read_oneliner(&buf, rebase_path_onto(), 0))
  		return error(_("could not read 'onto'"));
 -	if (get_sha1(buf.buf, onto_oid.hash)) {
 +	if (get_oid(buf.buf, &onto_oid)) {
  		strbuf_release(&buf);
  		return error(_("need a HEAD to fixup"));
  	}
 @@ -2756,8 +2756,9 @@ struct subject2item_entry {
  	char subject[FLEX_ARRAY];
  };
  
 -static int subject2item_cmp(const struct subject2item_entry *a,
 -	const struct subject2item_entry *b, const void *key)
 +static int subject2item_cmp(const void *fndata,
 +			    const struct subject2item_entry *a,
 +			    const struct subject2item_entry *b, const void *key)
  {
  	return key ? strcmp(a->subject, key) : strcmp(a->subject, b->subject);
  }
 @@ -2802,7 +2803,7 @@ int rearrange_squash(void)
  	 * be moved to appear after the i'th.
  	 */
  	hashmap_init(&subject2item, (hashmap_cmp_fn) subject2item_cmp,
 -		     todo_list.nr);
 +		     NULL, todo_list.nr);
  	ALLOC_ARRAY(next, todo_list.nr);
  	ALLOC_ARRAY(tail, todo_list.nr);
  	ALLOC_ARRAY(subjects, todo_list.nr);
-- 
2.13.3.windows.1.13.gaf0c2223da0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 01/10] t3415: verify that an empty instructionFormat is handled as before
  2017-07-14 14:44 [PATCH v6 00/10] The final building block for a faster rebase -i Johannes Schindelin
@ 2017-07-14 14:44 ` Johannes Schindelin
  2017-07-14 14:44 ` [PATCH v6 02/10] rebase -i: generate the script via rebase--helper Johannes Schindelin
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-14 14:44 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

An upcoming patch will move the todo list generation into the
rebase--helper. An early version of that patch regressed on an empty
rebase.instructionFormat value (the shell version could not discern
between an empty one and a non-existing one, but the C version used the
empty one as if that was intended to skip the oneline from the `pick
<hash>` lines).

Let's verify that this still works as before.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3415-rebase-autosquash.sh | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 5848949ec37..6d99f624b62 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -271,6 +271,18 @@ test_expect_success C_LOCALE_OUTPUT 'autosquash with custom inst format' '
 	test 2 = $(git cat-file commit HEAD^ | grep squash | wc -l)
 '
 
+test_expect_success 'autosquash with empty custom instructionFormat' '
+	git reset --hard base &&
+	test_commit empty-instructionFormat-test &&
+	(
+		set_cat_todo_editor &&
+		test_must_fail git -c rebase.instructionFormat= \
+			rebase --autosquash  --force -i HEAD^ >actual &&
+		git log -1 --format="pick %h %s" >expect &&
+		test_cmp expect actual
+	)
+'
+
 set_backup_editor () {
 	write_script backup-editor.sh <<-\EOF
 	cp "$1" .git/backup-"$(basename "$1")"
-- 
2.13.3.windows.1.13.gaf0c2223da0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 02/10] rebase -i: generate the script via rebase--helper
  2017-07-14 14:44 [PATCH v6 00/10] The final building block for a faster rebase -i Johannes Schindelin
  2017-07-14 14:44 ` [PATCH v6 01/10] t3415: verify that an empty instructionFormat is handled as before Johannes Schindelin
@ 2017-07-14 14:44 ` Johannes Schindelin
  2017-07-14 22:50   ` Stefan Beller
  2017-07-14 14:45 ` [PATCH v6 03/10] rebase -i: remove useless indentation Johannes Schindelin
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-14 14:44 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

The first step of an interactive rebase is to generate the so-called "todo
script", to be stored in the state directory as "git-rebase-todo" and to
be edited by the user.

Originally, we adjusted the output of `git log <options>` using a simple
sed script. Over the course of the years, the code became more
complicated. We now use shell scripting to edit the output of `git log`
conditionally, depending whether to keep "empty" commits (i.e. commits
that do not change any files).

On platforms where shell scripting is not native, this can be a serious
drag. And it opens the door for incompatibilities between platforms when
it comes to shell scripting or to Unix-y commands.

Let's just re-implement the todo script generation in plain C, using the
revision machinery directly.

This is substantially faster, improving the speed relative to the
shell script version of the interactive rebase from 2x to 3x on Windows.

Note that the rearrange_squash() function in git-rebase--interactive
relied on the fact that we set the "format" variable to the config setting
rebase.instructionFormat. Relying on a side effect like this is no good,
hence we explicitly perform that assignment (possibly again) in
rearrange_squash().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   |  8 +++++++-
 git-rebase--interactive.sh | 44 +++++++++++++++++++++--------------------
 sequencer.c                | 49 ++++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |  3 +++
 4 files changed, 82 insertions(+), 22 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index c82b4dce683..613053276e6 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -12,15 +12,19 @@ static const char * const builtin_rebase_helper_usage[] = {
 int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 {
 	struct replay_opts opts = REPLAY_OPTS_INIT;
+	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT
+		CONTINUE = 1, ABORT, MAKE_SCRIPT
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
+		OPT_BOOL(0, "keep-empty", &keep_empty, N_("keep empty commits")),
 		OPT_CMDMODE(0, "continue", &command, N_("continue rebase"),
 				CONTINUE),
 		OPT_CMDMODE(0, "abort", &command, N_("abort rebase"),
 				ABORT),
+		OPT_CMDMODE(0, "make-script", &command,
+			N_("make rebase script"), MAKE_SCRIPT),
 		OPT_END()
 	};
 
@@ -37,5 +41,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!sequencer_continue(&opts);
 	if (command == ABORT && argc == 1)
 		return !!sequencer_remove_state(&opts);
+	if (command == MAKE_SCRIPT && argc > 1)
+		return !!sequencer_make_script(keep_empty, stdout, argc, argv);
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 90b1fbe9cf6..05766835de1 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -785,6 +785,7 @@ collapse_todo_ids() {
 # each log message will be re-retrieved in order to normalize the
 # autosquash arrangement
 rearrange_squash () {
+	format=$(git config --get rebase.instructionFormat)
 	# extract fixup!/squash! lines and resolve any referenced sha1's
 	while read -r pick sha1 message
 	do
@@ -1210,26 +1211,27 @@ else
 	revisions=$onto...$orig_head
 	shortrevisions=$shorthead
 fi
-format=$(git config --get rebase.instructionFormat)
-# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
-git rev-list $merges_option --format="%m%H ${format:-%s}" \
-	--reverse --left-right --topo-order \
-	$revisions ${restrict_revision+^$restrict_revision} | \
-	sed -n "s/^>//p" |
-while read -r sha1 rest
-do
-
-	if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
-	then
-		comment_out="$comment_char "
-	else
-		comment_out=
-	fi
+if test t != "$preserve_merges"
+then
+	git rebase--helper --make-script ${keep_empty:+--keep-empty} \
+		$revisions ${restrict_revision+^$restrict_revision} >"$todo"
+else
+	format=$(git config --get rebase.instructionFormat)
+	# the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse
+	git rev-list $merges_option --format="%m%H ${format:-%s}" \
+		--reverse --left-right --topo-order \
+		$revisions ${restrict_revision+^$restrict_revision} | \
+		sed -n "s/^>//p" |
+	while read -r sha1 rest
+	do
+
+		if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1
+		then
+			comment_out="$comment_char "
+		else
+			comment_out=
+		fi
 
-	if test t != "$preserve_merges"
-	then
-		printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
-	else
 		if test -z "$rebase_root"
 		then
 			preserve=t
@@ -1248,8 +1250,8 @@ do
 			touch "$rewritten"/$sha1
 			printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo"
 		fi
-	fi
-done
+	done
+fi
 
 # Watch for commits that been dropped by --cherry-pick
 if test t = "$preserve_merges"
diff --git a/sequencer.c b/sequencer.c
index 3010faf8639..afcb3d00a32 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2413,3 +2413,52 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)
 
 	strbuf_release(&sob);
 }
+
+int sequencer_make_script(int keep_empty, FILE *out,
+		int argc, const char **argv)
+{
+	char *format = NULL;
+	struct pretty_print_context pp = {0};
+	struct strbuf buf = STRBUF_INIT;
+	struct rev_info revs;
+	struct commit *commit;
+
+	init_revisions(&revs, NULL);
+	revs.verbose_header = 1;
+	revs.max_parents = 1;
+	revs.cherry_pick = 1;
+	revs.limited = 1;
+	revs.reverse = 1;
+	revs.right_only = 1;
+	revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
+	revs.topo_order = 1;
+
+	revs.pretty_given = 1;
+	git_config_get_string("rebase.instructionFormat", &format);
+	if (!format || !*format) {
+		free(format);
+		format = xstrdup("%s");
+	}
+	get_commit_format(format, &revs);
+	free(format);
+	pp.fmt = revs.commit_format;
+	pp.output_encoding = get_log_output_encoding();
+
+	if (setup_revisions(argc, argv, &revs, NULL) > 1)
+		return error(_("make_script: unhandled options"));
+
+	if (prepare_revision_walk(&revs) < 0)
+		return error(_("make_script: error preparing revisions"));
+
+	while ((commit = get_revision(&revs))) {
+		strbuf_reset(&buf);
+		if (!keep_empty && is_original_commit_empty(commit))
+			strbuf_addf(&buf, "%c ", comment_line_char);
+		strbuf_addf(&buf, "pick %s ", oid_to_hex(&commit->object.oid));
+		pretty_print_commit(&pp, commit, &buf);
+		strbuf_addch(&buf, '\n');
+		fputs(buf.buf, out);
+	}
+	strbuf_release(&buf);
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index f885b68395f..83f2943b7a9 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -45,6 +45,9 @@ int sequencer_continue(struct replay_opts *opts);
 int sequencer_rollback(struct replay_opts *opts);
 int sequencer_remove_state(struct replay_opts *opts);
 
+int sequencer_make_script(int keep_empty, FILE *out,
+		int argc, const char **argv);
+
 extern const char sign_off_header[];
 
 void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
-- 
2.13.3.windows.1.13.gaf0c2223da0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 03/10] rebase -i: remove useless indentation
  2017-07-14 14:44 [PATCH v6 00/10] The final building block for a faster rebase -i Johannes Schindelin
  2017-07-14 14:44 ` [PATCH v6 01/10] t3415: verify that an empty instructionFormat is handled as before Johannes Schindelin
  2017-07-14 14:44 ` [PATCH v6 02/10] rebase -i: generate the script via rebase--helper Johannes Schindelin
@ 2017-07-14 14:45 ` Johannes Schindelin
  2017-07-14 14:45 ` [PATCH v6 04/10] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-14 14:45 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

The commands used to be indented, and it is nice to look at, but when we
transform the SHA-1s, the indentation is removed. So let's do away with it.

For the moment, at least: when we will use the upcoming rebase--helper
to transform the SHA-1s, we *will* keep the indentation and can
reintroduce it. Yet, to be able to validate the rebase--helper against
the output of the current shell script version, we need to remove the
extra indentation.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-rebase--interactive.sh | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 05766835de1..93372c62b2e 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -155,13 +155,13 @@ reschedule_last_action () {
 append_todo_help () {
 	gettext "
 Commands:
- p, pick = use commit
- r, reword = use commit, but edit the commit message
- e, edit = use commit, but stop for amending
- s, squash = use commit, but meld into previous commit
- f, fixup = like \"squash\", but discard this commit's log message
- x, exec = run command (the rest of the line) using shell
- d, drop = remove commit
+p, pick = use commit
+r, reword = use commit, but edit the commit message
+e, edit = use commit, but stop for amending
+s, squash = use commit, but meld into previous commit
+f, fixup = like \"squash\", but discard this commit's log message
+x, exec = run command (the rest of the line) using shell
+d, drop = remove commit
 
 These lines can be re-ordered; they are executed from top to bottom.
 " | git stripspace --comment-lines >>"$todo"
-- 
2.13.3.windows.1.13.gaf0c2223da0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 04/10] rebase -i: do not invent onelines when expanding/collapsing SHA-1s
  2017-07-14 14:44 [PATCH v6 00/10] The final building block for a faster rebase -i Johannes Schindelin
                   ` (2 preceding siblings ...)
  2017-07-14 14:45 ` [PATCH v6 03/10] rebase -i: remove useless indentation Johannes Schindelin
@ 2017-07-14 14:45 ` Johannes Schindelin
  2017-07-14 14:45 ` [PATCH v6 05/10] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-14 14:45 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

To avoid problems with short SHA-1s that become non-unique during the
rebase, we rewrite the todo script with short/long SHA-1s before and
after letting the user edit the script. Since SHA-1s are not intuitive
for humans, rebase -i also provides the onelines (commit message
subjects) in the script, purely for the user's convenience.

It is very possible to generate a todo script via different means than
rebase -i and then to let rebase -i run with it; In this case, these
onelines are not required.

And this is where the expand/collapse machinery has a bug: it *expects*
that oneline, and failing to find one reuses the previous SHA-1 as
"oneline".

It was most likely an oversight, and made implementation in the (quite
limiting) shell script language less convoluted. However, we are about
to reimplement performance-critical parts in C (and due to spawning a
git.exe process for every single line of the todo script, the
expansion/collapsing of the SHA-1s *is* performance-hampering on
Windows), therefore let's fix this bug to make cross-validation with the
C version of that functionality possible.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-rebase--interactive.sh | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 93372c62b2e..9d65212b7f1 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -760,7 +760,12 @@ transform_todo_ids () {
 			;;
 		*)
 			sha1=$(git rev-parse --verify --quiet "$@" ${rest%%[	 ]*}) &&
-			rest="$sha1 ${rest#*[	 ]}"
+			if test "a$rest" = "a${rest#*[	 ]}"
+			then
+				rest=$sha1
+			else
+				rest="$sha1 ${rest#*[	 ]}"
+			fi
 			;;
 		esac
 		printf '%s\n' "$command${rest:+ }$rest"
-- 
2.13.3.windows.1.13.gaf0c2223da0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 05/10] rebase -i: also expand/collapse the SHA-1s via the rebase--helper
  2017-07-14 14:44 [PATCH v6 00/10] The final building block for a faster rebase -i Johannes Schindelin
                   ` (3 preceding siblings ...)
  2017-07-14 14:45 ` [PATCH v6 04/10] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
@ 2017-07-14 14:45 ` Johannes Schindelin
  2017-07-14 14:45 ` [PATCH v6 06/10] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-14 14:45 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

This is crucial to improve performance on Windows, as the speed is now
mostly dominated by the SHA-1 transformation (because it spawns a new
rev-parse process for *every* line, and spawning processes is pretty
slow from Git for Windows' MSYS2 Bash).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   | 10 +++++++-
 git-rebase--interactive.sh | 27 ++--------------------
 sequencer.c                | 57 ++++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |  2 ++
 4 files changed, 70 insertions(+), 26 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 613053276e6..791b62901c3 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -14,7 +14,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	struct replay_opts opts = REPLAY_OPTS_INIT;
 	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT, MAKE_SCRIPT
+		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -25,6 +25,10 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 				ABORT),
 		OPT_CMDMODE(0, "make-script", &command,
 			N_("make rebase script"), MAKE_SCRIPT),
+		OPT_CMDMODE(0, "shorten-ids", &command,
+			N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S),
+		OPT_CMDMODE(0, "expand-ids", &command,
+			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
 		OPT_END()
 	};
 
@@ -43,5 +47,9 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!sequencer_remove_state(&opts);
 	if (command == MAKE_SCRIPT && argc > 1)
 		return !!sequencer_make_script(keep_empty, stdout, argc, argv);
+	if (command == SHORTEN_SHA1S && argc == 1)
+		return !!transform_todo_ids(1);
+	if (command == EXPAND_SHA1S && argc == 1)
+		return !!transform_todo_ids(0);
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 9d65212b7f1..d5df02435ae 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -750,35 +750,12 @@ skip_unnecessary_picks () {
 		die "$(gettext "Could not skip unnecessary pick commands")"
 }
 
-transform_todo_ids () {
-	while read -r command rest
-	do
-		case "$command" in
-		"$comment_char"* | exec)
-			# Be careful for oddball commands like 'exec'
-			# that do not have a SHA-1 at the beginning of $rest.
-			;;
-		*)
-			sha1=$(git rev-parse --verify --quiet "$@" ${rest%%[	 ]*}) &&
-			if test "a$rest" = "a${rest#*[	 ]}"
-			then
-				rest=$sha1
-			else
-				rest="$sha1 ${rest#*[	 ]}"
-			fi
-			;;
-		esac
-		printf '%s\n' "$command${rest:+ }$rest"
-	done <"$todo" >"$todo.new" &&
-	mv -f "$todo.new" "$todo"
-}
-
 expand_todo_ids() {
-	transform_todo_ids
+	git rebase--helper --expand-ids
 }
 
 collapse_todo_ids() {
-	transform_todo_ids --short
+	git rebase--helper --shorten-ids
 }
 
 # Rearrange the todo list that has both "pick sha1 msg" and
diff --git a/sequencer.c b/sequencer.c
index afcb3d00a32..36ed45d9050 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2462,3 +2462,60 @@ int sequencer_make_script(int keep_empty, FILE *out,
 	strbuf_release(&buf);
 	return 0;
 }
+
+
+int transform_todo_ids(int shorten_ids)
+{
+	const char *todo_file = rebase_path_todo();
+	struct todo_list todo_list = TODO_LIST_INIT;
+	int fd, res, i;
+	FILE *out;
+
+	strbuf_reset(&todo_list.buf);
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0)
+		return error_errno(_("could not open '%s'"), todo_file);
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("could not read '%s'."), todo_file);
+	}
+	close(fd);
+
+	res = parse_insn_buffer(todo_list.buf.buf, &todo_list);
+	if (res) {
+		todo_list_release(&todo_list);
+		return error(_("unusable todo list: '%s'"), todo_file);
+	}
+
+	out = fopen(todo_file, "w");
+	if (!out) {
+		todo_list_release(&todo_list);
+		return error(_("unable to open '%s' for writing"), todo_file);
+	}
+	for (i = 0; i < todo_list.nr; i++) {
+		struct todo_item *item = todo_list.items + i;
+		int bol = item->offset_in_buf;
+		const char *p = todo_list.buf.buf + bol;
+		int eol = i + 1 < todo_list.nr ?
+			todo_list.items[i + 1].offset_in_buf :
+			todo_list.buf.len;
+
+		if (item->command >= TODO_EXEC && item->command != TODO_DROP)
+			fwrite(p, eol - bol, 1, out);
+		else {
+			const char *id = shorten_ids ?
+				short_commit_name(item->commit) :
+				oid_to_hex(&item->commit->object.oid);
+			int len;
+
+			p += strspn(p, " \t"); /* left-trim command */
+			len = strcspn(p, " \t"); /* length of command */
+
+			fprintf(out, "%.*s %s %.*s\n",
+				len, p, id, item->arg_len, item->arg);
+		}
+	}
+	fclose(out);
+	todo_list_release(&todo_list);
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index 83f2943b7a9..71d25374afe 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -48,6 +48,8 @@ int sequencer_remove_state(struct replay_opts *opts);
 int sequencer_make_script(int keep_empty, FILE *out,
 		int argc, const char **argv);
 
+int transform_todo_ids(int shorten_ids);
+
 extern const char sign_off_header[];
 
 void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag);
-- 
2.13.3.windows.1.13.gaf0c2223da0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 06/10] t3404: relax rebase.missingCommitsCheck tests
  2017-07-14 14:44 [PATCH v6 00/10] The final building block for a faster rebase -i Johannes Schindelin
                   ` (4 preceding siblings ...)
  2017-07-14 14:45 ` [PATCH v6 05/10] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
@ 2017-07-14 14:45 ` Johannes Schindelin
  2017-07-14 14:45 ` [PATCH v6 07/10] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-14 14:45 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

These tests were a bit anal about the *exact* warning/error message
printed by git rebase. But those messages are intended for the *end
user*, therefore it does not make sense to test so rigidly for the
*exact* wording.

In the following, we will reimplement the missing commits check in
the sequencer, with slightly different words.

So let's just test for the parts in the warning/error message that
we *really* care about, nothing more, nothing less.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3404-rebase-interactive.sh | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 37821d24543..3704dbb2ecf 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1249,20 +1249,13 @@ test_expect_success 'rebase -i respects rebase.missingCommitsCheck = error' '
 	test B = $(git cat-file commit HEAD^ | sed -ne \$p)
 '
 
-cat >expect <<EOF
-Warning: the command isn't recognized in the following line:
- - badcmd $(git rev-list --oneline -1 master~1)
-
-You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.
-Or you can abort the rebase with 'git rebase --abort'.
-EOF
-
 test_expect_success 'static check of bad command' '
 	rebase_setup_and_clean bad-cmd &&
 	set_fake_editor &&
 	test_must_fail env FAKE_LINES="1 2 3 bad 4 5" \
 		git rebase -i --root 2>actual &&
-	test_i18ncmp expect actual &&
+	test_i18ngrep "badcmd $(git rev-list --oneline -1 master~1)" actual &&
+	test_i18ngrep "You can fix this with .git rebase --edit-todo.." actual &&
 	FAKE_LINES="1 2 3 drop 4 5" git rebase --edit-todo &&
 	git rebase --continue &&
 	test E = $(git cat-file commit HEAD | sed -ne \$p) &&
@@ -1284,20 +1277,13 @@ test_expect_success 'tabs and spaces are accepted in the todolist' '
 	test E = $(git cat-file commit HEAD | sed -ne \$p)
 '
 
-cat >expect <<EOF
-Warning: the SHA-1 is missing or isn't a commit in the following line:
- - edit XXXXXXX False commit
-
-You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.
-Or you can abort the rebase with 'git rebase --abort'.
-EOF
-
 test_expect_success 'static check of bad SHA-1' '
 	rebase_setup_and_clean bad-sha &&
 	set_fake_editor &&
 	test_must_fail env FAKE_LINES="1 2 edit fakesha 3 4 5 #" \
 		git rebase -i --root 2>actual &&
-	test_i18ncmp expect actual &&
+	test_i18ngrep "edit XXXXXXX False commit" actual &&
+	test_i18ngrep "You can fix this with .git rebase --edit-todo.." actual &&
 	FAKE_LINES="1 2 4 5 6" git rebase --edit-todo &&
 	git rebase --continue &&
 	test E = $(git cat-file commit HEAD | sed -ne \$p)
-- 
2.13.3.windows.1.13.gaf0c2223da0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 07/10] rebase -i: check for missing commits in the rebase--helper
  2017-07-14 14:44 [PATCH v6 00/10] The final building block for a faster rebase -i Johannes Schindelin
                   ` (5 preceding siblings ...)
  2017-07-14 14:45 ` [PATCH v6 06/10] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
@ 2017-07-14 14:45 ` Johannes Schindelin
  2017-07-14 14:45 ` [PATCH v6 08/10] rebase -i: skip unnecessary picks using " Johannes Schindelin
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-14 14:45 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   |   7 +-
 git-rebase--interactive.sh | 164 ++-------------------------------------------
 sequencer.c                | 122 +++++++++++++++++++++++++++++++++
 sequencer.h                |   1 +
 4 files changed, 134 insertions(+), 160 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 791b62901c3..305adb5b707 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -14,7 +14,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	struct replay_opts opts = REPLAY_OPTS_INIT;
 	int keep_empty = 0;
 	enum {
-		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S
+		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
+		CHECK_TODO_LIST
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -29,6 +30,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S),
 		OPT_CMDMODE(0, "expand-ids", &command,
 			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
+		OPT_CMDMODE(0, "check-todo-list", &command,
+			N_("check the todo list"), CHECK_TODO_LIST),
 		OPT_END()
 	};
 
@@ -51,5 +54,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!transform_todo_ids(1);
 	if (command == EXPAND_SHA1S && argc == 1)
 		return !!transform_todo_ids(0);
+	if (command == CHECK_TODO_LIST && argc == 1)
+		return !!check_todo_list();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index d5df02435ae..c8cad318fa4 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -867,96 +867,6 @@ add_exec_commands () {
 	mv "$1.new" "$1"
 }
 
-# Check if the SHA-1 passed as an argument is a
-# correct one, if not then print $2 in "$todo".badsha
-# $1: the SHA-1 to test
-# $2: the line number of the input
-# $3: the input filename
-check_commit_sha () {
-	badsha=0
-	if test -z "$1"
-	then
-		badsha=1
-	else
-		sha1_verif="$(git rev-parse --verify --quiet $1^{commit})"
-		if test -z "$sha1_verif"
-		then
-			badsha=1
-		fi
-	fi
-
-	if test $badsha -ne 0
-	then
-		line="$(sed -n -e "${2}p" "$3")"
-		warn "$(eval_gettext "\
-Warning: the SHA-1 is missing or isn't a commit in the following line:
- - \$line")"
-		warn
-	fi
-
-	return $badsha
-}
-
-# prints the bad commits and bad commands
-# from the todolist in stdin
-check_bad_cmd_and_sha () {
-	retval=0
-	lineno=0
-	while read -r command rest
-	do
-		lineno=$(( $lineno + 1 ))
-		case $command in
-		"$comment_char"*|''|noop|x|exec)
-			# Doesn't expect a SHA-1
-			;;
-		"$cr")
-			# Work around CR left by "read" (e.g. with Git for
-			# Windows' Bash).
-			;;
-		pick|p|drop|d|reword|r|edit|e|squash|s|fixup|f)
-			if ! check_commit_sha "${rest%%[ 	]*}" "$lineno" "$1"
-			then
-				retval=1
-			fi
-			;;
-		*)
-			line="$(sed -n -e "${lineno}p" "$1")"
-			warn "$(eval_gettext "\
-Warning: the command isn't recognized in the following line:
- - \$line")"
-			warn
-			retval=1
-			;;
-		esac
-	done <"$1"
-	return $retval
-}
-
-# Print the list of the SHA-1 of the commits
-# from stdin to stdout
-todo_list_to_sha_list () {
-	git stripspace --strip-comments |
-	while read -r command sha1 rest
-	do
-		case $command in
-		"$comment_char"*|''|noop|x|"exec")
-			;;
-		*)
-			long_sha=$(git rev-list --no-walk "$sha1" 2>/dev/null)
-			printf "%s\n" "$long_sha"
-			;;
-		esac
-	done
-}
-
-# Use warn for each line in stdin
-warn_lines () {
-	while read -r line
-	do
-		warn " - $line"
-	done
-}
-
 # Switch to the branch in $into and notify it in the reflog
 checkout_onto () {
 	GIT_REFLOG_ACTION="$GIT_REFLOG_ACTION: checkout $onto_name"
@@ -971,74 +881,6 @@ get_missing_commit_check_level () {
 	printf '%s' "$check_level" | tr 'A-Z' 'a-z'
 }
 
-# Check if the user dropped some commits by mistake
-# Behaviour determined by rebase.missingCommitsCheck.
-# Check if there is an unrecognized command or a
-# bad SHA-1 in a command.
-check_todo_list () {
-	raise_error=f
-
-	check_level=$(get_missing_commit_check_level)
-
-	case "$check_level" in
-	warn|error)
-		# Get the SHA-1 of the commits
-		todo_list_to_sha_list <"$todo".backup >"$todo".oldsha1
-		todo_list_to_sha_list <"$todo" >"$todo".newsha1
-
-		# Sort the SHA-1 and compare them
-		sort -u "$todo".oldsha1 >"$todo".oldsha1+
-		mv "$todo".oldsha1+ "$todo".oldsha1
-		sort -u "$todo".newsha1 >"$todo".newsha1+
-		mv "$todo".newsha1+ "$todo".newsha1
-		comm -2 -3 "$todo".oldsha1 "$todo".newsha1 >"$todo".miss
-
-		# Warn about missing commits
-		if test -s "$todo".miss
-		then
-			test "$check_level" = error && raise_error=t
-
-			warn "$(gettext "\
-Warning: some commits may have been dropped accidentally.
-Dropped commits (newer to older):")"
-
-			# Make the list user-friendly and display
-			opt="--no-walk=sorted --format=oneline --abbrev-commit --stdin"
-			git rev-list $opt <"$todo".miss | warn_lines
-
-			warn "$(gettext "\
-To avoid this message, use \"drop\" to explicitly remove a commit.
-
-Use 'git config rebase.missingCommitsCheck' to change the level of warnings.
-The possible behaviours are: ignore, warn, error.")"
-			warn
-		fi
-		;;
-	ignore)
-		;;
-	*)
-		warn "$(eval_gettext "Unrecognized setting \$check_level for option rebase.missingCommitsCheck. Ignoring.")"
-		;;
-	esac
-
-	if ! check_bad_cmd_and_sha "$todo"
-	then
-		raise_error=t
-	fi
-
-	if test $raise_error = t
-	then
-		# Checkout before the first commit of the
-		# rebase: this way git rebase --continue
-		# will work correctly as it expects HEAD to be
-		# placed before the commit of the next action
-		checkout_onto
-
-		warn "$(gettext "You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.")"
-		die "$(gettext "Or you can abort the rebase with 'git rebase --abort'.")"
-	fi
-}
-
 # The whole contents of this file is run by dot-sourcing it from
 # inside a shell function.  It used to be that "return"s we see
 # below were not inside any function, and expected to return
@@ -1299,7 +1141,11 @@ git_sequence_editor "$todo" ||
 has_action "$todo" ||
 	return 2
 
-check_todo_list
+git rebase--helper --check-todo-list || {
+	ret=$?
+	checkout_onto
+	exit $ret
+}
 
 expand_todo_ids
 
diff --git a/sequencer.c b/sequencer.c
index 36ed45d9050..15107de1e1c 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2519,3 +2519,125 @@ int transform_todo_ids(int shorten_ids)
 	todo_list_release(&todo_list);
 	return 0;
 }
+
+enum check_level {
+	CHECK_IGNORE = 0, CHECK_WARN, CHECK_ERROR
+};
+
+static enum check_level get_missing_commit_check_level(void)
+{
+	const char *value;
+
+	if (git_config_get_value("rebase.missingcommitscheck", &value) ||
+			!strcasecmp("ignore", value))
+		return CHECK_IGNORE;
+	if (!strcasecmp("warn", value))
+		return CHECK_WARN;
+	if (!strcasecmp("error", value))
+		return CHECK_ERROR;
+	warning(_("unrecognized setting %s for option"
+		  "rebase.missingCommitsCheck. Ignoring."), value);
+	return CHECK_IGNORE;
+}
+
+/*
+ * Check if the user dropped some commits by mistake
+ * Behaviour determined by rebase.missingCommitsCheck.
+ * Check if there is an unrecognized command or a
+ * bad SHA-1 in a command.
+ */
+int check_todo_list(void)
+{
+	enum check_level check_level = get_missing_commit_check_level();
+	struct strbuf todo_file = STRBUF_INIT;
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct strbuf missing = STRBUF_INIT;
+	int advise_to_edit_todo = 0, res = 0, fd, i;
+
+	strbuf_addstr(&todo_file, rebase_path_todo());
+	fd = open(todo_file.buf, O_RDONLY);
+	if (fd < 0) {
+		res = error_errno(_("could not open '%s'"), todo_file.buf);
+		goto leave_check;
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		res = error(_("could not read '%s'."), todo_file.buf);
+		goto leave_check;
+	}
+	close(fd);
+	advise_to_edit_todo = res =
+		parse_insn_buffer(todo_list.buf.buf, &todo_list);
+
+	if (res || check_level == CHECK_IGNORE)
+		goto leave_check;
+
+	/* Mark the commits in git-rebase-todo as seen */
+	for (i = 0; i < todo_list.nr; i++) {
+		struct commit *commit = todo_list.items[i].commit;
+		if (commit)
+			commit->util = (void *)1;
+	}
+
+	todo_list_release(&todo_list);
+	strbuf_addstr(&todo_file, ".backup");
+	fd = open(todo_file.buf, O_RDONLY);
+	if (fd < 0) {
+		res = error_errno(_("could not open '%s'"), todo_file.buf);
+		goto leave_check;
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		res = error(_("could not read '%s'."), todo_file.buf);
+		goto leave_check;
+	}
+	close(fd);
+	strbuf_release(&todo_file);
+	res = !!parse_insn_buffer(todo_list.buf.buf, &todo_list);
+
+	/* Find commits in git-rebase-todo.backup yet unseen */
+	for (i = todo_list.nr - 1; i >= 0; i--) {
+		struct todo_item *item = todo_list.items + i;
+		struct commit *commit = item->commit;
+		if (commit && !commit->util) {
+			strbuf_addf(&missing, " - %s %.*s\n",
+				    short_commit_name(commit),
+				    item->arg_len, item->arg);
+			commit->util = (void *)1;
+		}
+	}
+
+	/* Warn about missing commits */
+	if (!missing.len)
+		goto leave_check;
+
+	if (check_level == CHECK_ERROR)
+		advise_to_edit_todo = res = 1;
+
+	fprintf(stderr,
+		_("Warning: some commits may have been dropped accidentally.\n"
+		"Dropped commits (newer to older):\n"));
+
+	/* Make the list user-friendly and display */
+	fputs(missing.buf, stderr);
+	strbuf_release(&missing);
+
+	fprintf(stderr, _("To avoid this message, use \"drop\" to "
+		"explicitly remove a commit.\n\n"
+		"Use 'git config rebase.missingCommitsCheck' to change "
+		"the level of warnings.\n"
+		"The possible behaviours are: ignore, warn, error.\n\n"));
+
+leave_check:
+	strbuf_release(&todo_file);
+	todo_list_release(&todo_list);
+
+	if (advise_to_edit_todo)
+		fprintf(stderr,
+			_("You can fix this with 'git rebase --edit-todo' "
+			  "and then run 'git rebase --continue'.\n"
+			  "Or you can abort the rebase with 'git rebase"
+			  " --abort'.\n"));
+
+	return res;
+}
diff --git a/sequencer.h b/sequencer.h
index 71d25374afe..878dd296f8c 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -49,6 +49,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 		int argc, const char **argv);
 
 int transform_todo_ids(int shorten_ids);
+int check_todo_list(void);
 
 extern const char sign_off_header[];
 
-- 
2.13.3.windows.1.13.gaf0c2223da0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 08/10] rebase -i: skip unnecessary picks using the rebase--helper
  2017-07-14 14:44 [PATCH v6 00/10] The final building block for a faster rebase -i Johannes Schindelin
                   ` (6 preceding siblings ...)
  2017-07-14 14:45 ` [PATCH v6 07/10] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
@ 2017-07-14 14:45 ` Johannes Schindelin
  2017-07-14 14:45 ` [PATCH v6 09/10] t3415: test fixup with wrapped oneline Johannes Schindelin
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-14 14:45 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.

Note: The original code did not try to skip unnecessary picks of root
commits but punts instead (probably --root was not considered common
enough of a use case to bother optimizing). We do the same, for now.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/rebase--helper.c   |   6 ++-
 git-rebase--interactive.sh |  41 ++---------------
 sequencer.c                | 107 +++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                |   1 +
 4 files changed, 116 insertions(+), 39 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 305adb5b707..057ae7102ff 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -15,7 +15,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	int keep_empty = 0;
 	enum {
 		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
-		CHECK_TODO_LIST
+		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -32,6 +32,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
 		OPT_CMDMODE(0, "check-todo-list", &command,
 			N_("check the todo list"), CHECK_TODO_LIST),
+		OPT_CMDMODE(0, "skip-unnecessary-picks", &command,
+			N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
 		OPT_END()
 	};
 
@@ -56,5 +58,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!transform_todo_ids(0);
 	if (command == CHECK_TODO_LIST && argc == 1)
 		return !!check_todo_list();
+	if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
+		return !!skip_unnecessary_picks();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index c8cad318fa4..af8d7bd77fb 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -713,43 +713,6 @@ do_rest () {
 	done
 }
 
-# skip picking commits whose parents are unchanged
-skip_unnecessary_picks () {
-	fd=3
-	while read -r command rest
-	do
-		# fd=3 means we skip the command
-		case "$fd,$command" in
-		3,pick|3,p)
-			# pick a commit whose parent is current $onto -> skip
-			sha1=${rest%% *}
-			case "$(git rev-parse --verify --quiet "$sha1"^)" in
-			"$onto"*)
-				onto=$sha1
-				;;
-			*)
-				fd=1
-				;;
-			esac
-			;;
-		3,"$comment_char"*|3,)
-			# copy comments
-			;;
-		*)
-			fd=1
-			;;
-		esac
-		printf '%s\n' "$command${rest:+ }$rest" >&$fd
-	done <"$todo" >"$todo.new" 3>>"$done" &&
-	mv -f "$todo".new "$todo" &&
-	case "$(peek_next_command)" in
-	squash|s|fixup|f)
-		record_in_rewritten "$onto"
-		;;
-	esac ||
-		die "$(gettext "Could not skip unnecessary pick commands")"
-}
-
 expand_todo_ids() {
 	git rebase--helper --expand-ids
 }
@@ -1149,7 +1112,9 @@ git rebase--helper --check-todo-list || {
 
 expand_todo_ids
 
-test -d "$rewritten" || test -n "$force_rebase" || skip_unnecessary_picks
+test -d "$rewritten" || test -n "$force_rebase" ||
+onto="$(git rebase--helper --skip-unnecessary-picks)" ||
+die "Could not skip unnecessary pick commands"
 
 checkout_onto
 if test -z "$rebase_root" && test ! -d "$rewritten"
diff --git a/sequencer.c b/sequencer.c
index 15107de1e1c..96d43aec764 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2641,3 +2641,110 @@ int check_todo_list(void)
 
 	return res;
 }
+
+/* skip picking commits whose parents are unchanged */
+int skip_unnecessary_picks(void)
+{
+	const char *todo_file = rebase_path_todo();
+	struct strbuf buf = STRBUF_INIT;
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct object_id onto_oid, *oid = &onto_oid, *parent_oid;
+	int fd, i;
+
+	if (!read_oneliner(&buf, rebase_path_onto(), 0))
+		return error(_("could not read 'onto'"));
+	if (get_oid(buf.buf, &onto_oid)) {
+		strbuf_release(&buf);
+		return error(_("need a HEAD to fixup"));
+	}
+	strbuf_release(&buf);
+
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0) {
+		return error_errno(_("could not open '%s'"), todo_file);
+	}
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("could not read '%s'."), todo_file);
+	}
+	close(fd);
+	if (parse_insn_buffer(todo_list.buf.buf, &todo_list) < 0) {
+		todo_list_release(&todo_list);
+		return -1;
+	}
+
+	for (i = 0; i < todo_list.nr; i++) {
+		struct todo_item *item = todo_list.items + i;
+
+		if (item->command >= TODO_NOOP)
+			continue;
+		if (item->command != TODO_PICK)
+			break;
+		if (parse_commit(item->commit)) {
+			todo_list_release(&todo_list);
+			return error(_("could not parse commit '%s'"),
+				oid_to_hex(&item->commit->object.oid));
+		}
+		if (!item->commit->parents)
+			break; /* root commit */
+		if (item->commit->parents->next)
+			break; /* merge commit */
+		parent_oid = &item->commit->parents->item->object.oid;
+		if (hashcmp(parent_oid->hash, oid->hash))
+			break;
+		oid = &item->commit->object.oid;
+	}
+	if (i > 0) {
+		int offset = i < todo_list.nr ?
+			todo_list.items[i].offset_in_buf : todo_list.buf.len;
+		const char *done_path = rebase_path_done();
+
+		fd = open(done_path, O_CREAT | O_WRONLY | O_APPEND, 0666);
+		if (fd < 0) {
+			error_errno(_("could not open '%s' for writing"),
+				    done_path);
+			todo_list_release(&todo_list);
+			return -1;
+		}
+		if (write_in_full(fd, todo_list.buf.buf, offset) < 0) {
+			error_errno(_("could not write to '%s'"), done_path);
+			todo_list_release(&todo_list);
+			close(fd);
+			return -1;
+		}
+		close(fd);
+
+		fd = open(rebase_path_todo(), O_WRONLY, 0666);
+		if (fd < 0) {
+			error_errno(_("could not open '%s' for writing"),
+				    rebase_path_todo());
+			todo_list_release(&todo_list);
+			return -1;
+		}
+		if (write_in_full(fd, todo_list.buf.buf + offset,
+				todo_list.buf.len - offset) < 0) {
+			error_errno(_("could not write to '%s'"),
+				    rebase_path_todo());
+			close(fd);
+			todo_list_release(&todo_list);
+			return -1;
+		}
+		if (ftruncate(fd, todo_list.buf.len - offset) < 0) {
+			error_errno(_("could not truncate '%s'"),
+				    rebase_path_todo());
+			todo_list_release(&todo_list);
+			close(fd);
+			return -1;
+		}
+		close(fd);
+
+		todo_list.current = i;
+		if (is_fixup(peek_command(&todo_list, 0)))
+			record_in_rewritten(oid, peek_command(&todo_list, 0));
+	}
+
+	todo_list_release(&todo_list);
+	printf("%s\n", oid_to_hex(oid));
+
+	return 0;
+}
diff --git a/sequencer.h b/sequencer.h
index 878dd296f8c..04a57e09a1d 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -50,6 +50,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 
 int transform_todo_ids(int shorten_ids);
 int check_todo_list(void);
+int skip_unnecessary_picks(void);
 
 extern const char sign_off_header[];
 
-- 
2.13.3.windows.1.13.gaf0c2223da0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 09/10] t3415: test fixup with wrapped oneline
  2017-07-14 14:44 [PATCH v6 00/10] The final building block for a faster rebase -i Johannes Schindelin
                   ` (7 preceding siblings ...)
  2017-07-14 14:45 ` [PATCH v6 08/10] rebase -i: skip unnecessary picks using " Johannes Schindelin
@ 2017-07-14 14:45 ` Johannes Schindelin
  2017-07-14 14:45 ` [PATCH v6 10/10] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-14 14:45 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

The `git commit --fixup` command unwraps wrapped onelines when
constructing the commit message, without wrapping the result.

We need to make sure that `git rebase --autosquash` keeps handling such
cases correctly, in particular since we are about to move the autosquash
handling into the rebase--helper.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3415-rebase-autosquash.sh | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 6d99f624b62..62cb977e4ec 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -316,4 +316,18 @@ test_expect_success 'extra spaces after fixup!' '
 	test $base = $parent
 '
 
+test_expect_success 'wrapped original subject' '
+	if test -d .git/rebase-merge; then git rebase --abort; fi &&
+	base=$(git rev-parse HEAD) &&
+	echo "wrapped subject" >wrapped &&
+	git add wrapped &&
+	test_tick &&
+	git commit --allow-empty -m "$(printf "To\nfixup")" &&
+	test_tick &&
+	git commit --allow-empty -m "fixup! To fixup" &&
+	git rebase -i --autosquash --keep-empty HEAD~2 &&
+	parent=$(git rev-parse HEAD^) &&
+	test $base = $parent
+'
+
 test_done
-- 
2.13.3.windows.1.13.gaf0c2223da0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 10/10] rebase -i: rearrange fixup/squash lines using the rebase--helper
  2017-07-14 14:44 [PATCH v6 00/10] The final building block for a faster rebase -i Johannes Schindelin
                   ` (8 preceding siblings ...)
  2017-07-14 14:45 ` [PATCH v6 09/10] t3415: test fixup with wrapped oneline Johannes Schindelin
@ 2017-07-14 14:45 ` Johannes Schindelin
  2017-07-14 17:27 ` [PATCH v6 00/10] The final building block for a faster rebase -i Stefan Beller
  2017-07-20 21:38 ` Junio C Hamano
  11 siblings, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-14 14:45 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

This operation has quadratic complexity, which is especially painful
on Windows, where shell scripts are *already* slow (mainly due to the
overhead of the POSIX emulation layer).

Let's reimplement this with linear complexity (using a hash map to
match the commits' subject lines) for the common case; Sadly, the
fixup/squash feature's design neglected performance considerations,
allowing arbitrary prefixes (read: `fixup! hell` will match the
commit subject `hello world`), which means that we are stuck with
quadratic performance in the worst case.

The reimplemented logic also happens to fix a bug where commented-out
lines (representing empty patches) were dropped by the previous code.

While at it, clarify how the fixup/squash feature works in `git rebase
-i`'s man page.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-rebase.txt |  16 ++--
 builtin/rebase--helper.c     |   6 +-
 git-rebase--interactive.sh   |  90 +-------------------
 sequencer.c                  | 196 +++++++++++++++++++++++++++++++++++++++++++
 sequencer.h                  |   1 +
 t/t3415-rebase-autosquash.sh |   2 +-
 6 files changed, 213 insertions(+), 98 deletions(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 4f6bed61a92..a3c01dfdc8a 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -430,13 +430,15 @@ without an explicit `--interactive`.
 --autosquash::
 --no-autosquash::
 	When the commit log message begins with "squash! ..." (or
-	"fixup! ..."), and there is a commit whose title begins with
-	the same ..., automatically modify the todo list of rebase -i
-	so that the commit marked for squashing comes right after the
-	commit to be modified, and change the action of the moved
-	commit from `pick` to `squash` (or `fixup`).  Ignores subsequent
-	"fixup! " or "squash! " after the first, in case you referred to an
-	earlier fixup/squash with `git commit --fixup/--squash`.
+	"fixup! ..."), and there is already a commit in the todo list that
+	matches the same `...`, automatically modify the todo list of rebase
+	-i so that the commit marked for squashing comes right after the
+	commit to be modified, and change the action of the moved commit
+	from `pick` to `squash` (or `fixup`).  A commit matches the `...` if
+	the commit subject matches, or if the `...` refers to the commit's
+	hash. As a fall-back, partial matches of the commit subject work,
+	too.  The recommended way to create fixup/squash commits is by using
+	the `--fixup`/`--squash` options of linkgit:git-commit[1].
 +
 This option is only valid when the `--interactive` option is used.
 +
diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 057ae7102ff..f8519363a39 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -15,7 +15,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 	int keep_empty = 0;
 	enum {
 		CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
-		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS
+		CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH
 	} command = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -34,6 +34,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 			N_("check the todo list"), CHECK_TODO_LIST),
 		OPT_CMDMODE(0, "skip-unnecessary-picks", &command,
 			N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
+		OPT_CMDMODE(0, "rearrange-squash", &command,
+			N_("rearrange fixup/squash lines"), REARRANGE_SQUASH),
 		OPT_END()
 	};
 
@@ -60,5 +62,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 		return !!check_todo_list();
 	if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
 		return !!skip_unnecessary_picks();
+	if (command == REARRANGE_SQUASH && argc == 1)
+		return !!rearrange_squash();
 	usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index af8d7bd77fb..3b0340e7cc9 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -721,94 +721,6 @@ collapse_todo_ids() {
 	git rebase--helper --shorten-ids
 }
 
-# Rearrange the todo list that has both "pick sha1 msg" and
-# "pick sha1 fixup!/squash! msg" appears in it so that the latter
-# comes immediately after the former, and change "pick" to
-# "fixup"/"squash".
-#
-# Note that if the config has specified a custom instruction format
-# each log message will be re-retrieved in order to normalize the
-# autosquash arrangement
-rearrange_squash () {
-	format=$(git config --get rebase.instructionFormat)
-	# extract fixup!/squash! lines and resolve any referenced sha1's
-	while read -r pick sha1 message
-	do
-		test -z "${format}" || message=$(git log -n 1 --format="%s" ${sha1})
-		case "$message" in
-		"squash! "*|"fixup! "*)
-			action="${message%%!*}"
-			rest=$message
-			prefix=
-			# skip all squash! or fixup! (but save for later)
-			while :
-			do
-				case "$rest" in
-				"squash! "*|"fixup! "*)
-					prefix="$prefix${rest%%!*},"
-					rest="${rest#*! }"
-					;;
-				*)
-					break
-					;;
-				esac
-			done
-			printf '%s %s %s %s\n' "$sha1" "$action" "$prefix" "$rest"
-			# if it's a single word, try to resolve to a full sha1 and
-			# emit a second copy. This allows us to match on both message
-			# and on sha1 prefix
-			if test "${rest#* }" = "$rest"; then
-				fullsha="$(git rev-parse -q --verify "$rest" 2>/dev/null)"
-				if test -n "$fullsha"; then
-					# prefix the action to uniquely identify this line as
-					# intended for full sha1 match
-					echo "$sha1 +$action $prefix $fullsha"
-				fi
-			fi
-		esac
-	done >"$1.sq" <"$1"
-	test -s "$1.sq" || return
-
-	used=
-	while read -r pick sha1 message
-	do
-		case " $used" in
-		*" $sha1 "*) continue ;;
-		esac
-		printf '%s\n' "$pick $sha1 $message"
-		test -z "${format}" || message=$(git log -n 1 --format="%s" ${sha1})
-		used="$used$sha1 "
-		while read -r squash action msg_prefix msg_content
-		do
-			case " $used" in
-			*" $squash "*) continue ;;
-			esac
-			emit=0
-			case "$action" in
-			+*)
-				action="${action#+}"
-				# full sha1 prefix test
-				case "$msg_content" in "$sha1"*) emit=1;; esac ;;
-			*)
-				# message prefix test
-				case "$message" in "$msg_content"*) emit=1;; esac ;;
-			esac
-			if test $emit = 1; then
-				if test -n "${format}"
-				then
-					msg_content=$(git log -n 1 --format="${format}" ${squash})
-				else
-					msg_content="$(echo "$msg_prefix" | sed "s/,/! /g")$msg_content"
-				fi
-				printf '%s\n' "$action $squash $msg_content"
-				used="$used$squash "
-			fi
-		done <"$1.sq"
-	done >"$1.rearranged" <"$1"
-	cat "$1.rearranged" >"$1"
-	rm -f "$1.sq" "$1.rearranged"
-}
-
 # Add commands after a pick or after a squash/fixup serie
 # in the todo list.
 add_exec_commands () {
@@ -1068,7 +980,7 @@ then
 fi
 
 test -s "$todo" || echo noop >> "$todo"
-test -n "$autosquash" && rearrange_squash "$todo"
+test -z "$autosquash" || git rebase--helper --rearrange-squash || exit
 test -n "$cmd" && add_exec_commands "$todo"
 
 todocount=$(git stripspace --strip-comments <"$todo" | wc -l)
diff --git a/sequencer.c b/sequencer.c
index 96d43aec764..c54596f9699 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -20,6 +20,7 @@
 #include "trailer.h"
 #include "log-tree.h"
 #include "wt-status.h"
+#include "hashmap.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -2748,3 +2749,198 @@ int skip_unnecessary_picks(void)
 
 	return 0;
 }
+
+struct subject2item_entry {
+	struct hashmap_entry entry;
+	int i;
+	char subject[FLEX_ARRAY];
+};
+
+static int subject2item_cmp(const void *fndata,
+			    const struct subject2item_entry *a,
+			    const struct subject2item_entry *b, const void *key)
+{
+	return key ? strcmp(a->subject, key) : strcmp(a->subject, b->subject);
+}
+
+/*
+ * Rearrange the todo list that has both "pick commit-id msg" and "pick
+ * commit-id fixup!/squash! msg" in it so that the latter is put immediately
+ * after the former, and change "pick" to "fixup"/"squash".
+ *
+ * Note that if the config has specified a custom instruction format, each log
+ * message will have to be retrieved from the commit (as the oneline in the
+ * script cannot be trusted) in order to normalize the autosquash arrangement.
+ */
+int rearrange_squash(void)
+{
+	const char *todo_file = rebase_path_todo();
+	struct todo_list todo_list = TODO_LIST_INIT;
+	struct hashmap subject2item;
+	int res = 0, rearranged = 0, *next, *tail, fd, i;
+	char **subjects;
+
+	fd = open(todo_file, O_RDONLY);
+	if (fd < 0)
+		return error_errno(_("could not open '%s'"), todo_file);
+	if (strbuf_read(&todo_list.buf, fd, 0) < 0) {
+		close(fd);
+		return error(_("could not read '%s'."), todo_file);
+	}
+	close(fd);
+	if (parse_insn_buffer(todo_list.buf.buf, &todo_list) < 0) {
+		todo_list_release(&todo_list);
+		return -1;
+	}
+
+	/*
+	 * The hashmap maps onelines to the respective todo list index.
+	 *
+	 * If any items need to be rearranged, the next[i] value will indicate
+	 * which item was moved directly after the i'th.
+	 *
+	 * In that case, last[i] will indicate the index of the latest item to
+	 * be moved to appear after the i'th.
+	 */
+	hashmap_init(&subject2item, (hashmap_cmp_fn) subject2item_cmp,
+		     NULL, todo_list.nr);
+	ALLOC_ARRAY(next, todo_list.nr);
+	ALLOC_ARRAY(tail, todo_list.nr);
+	ALLOC_ARRAY(subjects, todo_list.nr);
+	for (i = 0; i < todo_list.nr; i++) {
+		struct strbuf buf = STRBUF_INIT;
+		struct todo_item *item = todo_list.items + i;
+		const char *commit_buffer, *subject, *p;
+		size_t subject_len;
+		int i2 = -1;
+		struct subject2item_entry *entry;
+
+		next[i] = tail[i] = -1;
+		if (item->command >= TODO_EXEC) {
+			subjects[i] = NULL;
+			continue;
+		}
+
+		if (is_fixup(item->command)) {
+			todo_list_release(&todo_list);
+			return error(_("the script was already rearranged."));
+		}
+
+		item->commit->util = item;
+
+		parse_commit(item->commit);
+		commit_buffer = get_commit_buffer(item->commit, NULL);
+		find_commit_subject(commit_buffer, &subject);
+		format_subject(&buf, subject, " ");
+		subject = subjects[i] = strbuf_detach(&buf, &subject_len);
+		unuse_commit_buffer(item->commit, commit_buffer);
+		if ((skip_prefix(subject, "fixup! ", &p) ||
+		     skip_prefix(subject, "squash! ", &p))) {
+			struct commit *commit2;
+
+			for (;;) {
+				while (isspace(*p))
+					p++;
+				if (!skip_prefix(p, "fixup! ", &p) &&
+				    !skip_prefix(p, "squash! ", &p))
+					break;
+			}
+
+			if ((entry = hashmap_get_from_hash(&subject2item,
+							   strhash(p), p)))
+				/* found by title */
+				i2 = entry->i;
+			else if (!strchr(p, ' ') &&
+				 (commit2 =
+				  lookup_commit_reference_by_name(p)) &&
+				 commit2->util)
+				/* found by commit name */
+				i2 = (struct todo_item *)commit2->util
+					- todo_list.items;
+			else {
+				/* copy can be a prefix of the commit subject */
+				for (i2 = 0; i2 < i; i2++)
+					if (subjects[i2] &&
+					    starts_with(subjects[i2], p))
+						break;
+				if (i2 == i)
+					i2 = -1;
+			}
+		}
+		if (i2 >= 0) {
+			rearranged = 1;
+			todo_list.items[i].command =
+				starts_with(subject, "fixup!") ?
+				TODO_FIXUP : TODO_SQUASH;
+			if (next[i2] < 0)
+				next[i2] = i;
+			else
+				next[tail[i2]] = i;
+			tail[i2] = i;
+		} else if (!hashmap_get_from_hash(&subject2item,
+						strhash(subject), subject)) {
+			FLEX_ALLOC_MEM(entry, subject, subject, subject_len);
+			entry->i = i;
+			hashmap_entry_init(entry, strhash(entry->subject));
+			hashmap_put(&subject2item, entry);
+		}
+	}
+
+	if (rearranged) {
+		struct strbuf buf = STRBUF_INIT;
+
+		for (i = 0; i < todo_list.nr; i++) {
+			enum todo_command command = todo_list.items[i].command;
+			int cur = i;
+
+			/*
+			 * Initially, all commands are 'pick's. If it is a
+			 * fixup or a squash now, we have rearranged it.
+			 */
+			if (is_fixup(command))
+				continue;
+
+			while (cur >= 0) {
+				int offset = todo_list.items[cur].offset_in_buf;
+				int end_offset = cur + 1 < todo_list.nr ?
+					todo_list.items[cur + 1].offset_in_buf :
+					todo_list.buf.len;
+				char *bol = todo_list.buf.buf + offset;
+				char *eol = todo_list.buf.buf + end_offset;
+
+				/* replace 'pick', by 'fixup' or 'squash' */
+				command = todo_list.items[cur].command;
+				if (is_fixup(command)) {
+					strbuf_addstr(&buf,
+						todo_command_info[command].str);
+					bol += strcspn(bol, " \t");
+				}
+
+				strbuf_add(&buf, bol, eol - bol);
+
+				cur = next[cur];
+			}
+		}
+
+		fd = open(todo_file, O_WRONLY);
+		if (fd < 0)
+			res = error_errno(_("could not open '%s'"), todo_file);
+		else if (write(fd, buf.buf, buf.len) < 0)
+			res = error_errno(_("could not read '%s'."), todo_file);
+		else if (ftruncate(fd, buf.len) < 0)
+			res = error_errno(_("could not finish '%s'"),
+					   todo_file);
+		close(fd);
+		strbuf_release(&buf);
+	}
+
+	free(next);
+	free(tail);
+	for (i = 0; i < todo_list.nr; i++)
+		free(subjects[i]);
+	free(subjects);
+	hashmap_free(&subject2item, 1);
+	todo_list_release(&todo_list);
+
+	return res;
+}
diff --git a/sequencer.h b/sequencer.h
index 04a57e09a1d..6f3d3df82c0 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -51,6 +51,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 int transform_todo_ids(int shorten_ids);
 int check_todo_list(void);
 int skip_unnecessary_picks(void);
+int rearrange_squash(void);
 
 extern const char sign_off_header[];
 
diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 62cb977e4ec..e364c12622f 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -290,7 +290,7 @@ set_backup_editor () {
 	test_set_editor "$PWD/backup-editor.sh"
 }
 
-test_expect_failure 'autosquash with multiple empty patches' '
+test_expect_success 'autosquash with multiple empty patches' '
 	test_tick &&
 	git commit --allow-empty -m "empty" &&
 	test_tick &&
-- 
2.13.3.windows.1.13.gaf0c2223da0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 00/10] The final building block for a faster rebase -i
  2017-07-14 14:44 [PATCH v6 00/10] The final building block for a faster rebase -i Johannes Schindelin
                   ` (9 preceding siblings ...)
  2017-07-14 14:45 ` [PATCH v6 10/10] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
@ 2017-07-14 17:27 ` Stefan Beller
  2017-07-14 20:39   ` Johannes Schindelin
  2017-07-20 21:38 ` Junio C Hamano
  11 siblings, 1 reply; 17+ messages in thread
From: Stefan Beller @ 2017-07-14 17:27 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: git, Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

On Fri, Jul 14, 2017 at 7:44 AM, Johannes Schindelin
<johannes.schindelin@gmx.de> wrote:
> This patch series reimplements the expensive pre- and post-processing of
> the todo script in C.
>
> And it concludes the work I did to accelerate rebase -i so far.
>
> I am still unwilling to replace a compile-time safe way to pass the
> options to the revision machinery by the alternative (which I am still
> flabbergasted about) proposed by Junio. This will not change.

There is new input for this discussion, though. :)

https://public-inbox.org/git/20170706202739.6056-1-sbeller@google.com/
was sent out to gauge interest if we want to pull through with removing
all the submodule hack, such as 'add_submodule_odb' in submodule.c
which just adds the submodule objects as an alternate object store, such
that we can do some things in-process (check if a submodule has certain
commits, merge_submodule).

For one of these ('merge_submodule') I would have to add the
'struct repository' as another parameter to pass around to the
diff and revision walking machinery. But the API for these subsystems
is traditionally only operated via an array of strings instead of passing
assigning a member field to a value.

So If I were to follow the "use arrays of strings only to operate the
revision machinery" I would:
* pass a string indicating which repo to use
  (probably the path to git dir?)
* the repository objects are cached, so we can lookup e.g.
  "the_repository" via the correct string.
* use that repo object inside the revision machinery.

That sounds cumbersome and I would prefer to pass in
the repository object directory. So maybe we want to have some
other way opposed to "an array of strings" to operate the revision
machinery.

>  -static int subject2item_cmp(const struct subject2item_entry *a,
>  -      const struct subject2item_entry *b, const void *key)
>  +static int subject2item_cmp(const void *fndata,

This could also be named unused_fndata.
Please see origin/sb/hashmap-cleanup, if that makes sense as well
(have all arguments const void and cast them inside the function, such
that we can avoid the cast to hashmap_cmp_fn in hashmap_init)

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 00/10] The final building block for a faster rebase -i
  2017-07-14 17:27 ` [PATCH v6 00/10] The final building block for a faster rebase -i Stefan Beller
@ 2017-07-14 20:39   ` Johannes Schindelin
  0 siblings, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-14 20:39 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

Hi Stefan,

On Fri, 14 Jul 2017, Stefan Beller wrote:

> On Fri, Jul 14, 2017 at 7:44 AM, Johannes Schindelin
> <johannes.schindelin@gmx.de> wrote:
>
> >  -static int subject2item_cmp(const struct subject2item_entry *a,
> >  -      const struct subject2item_entry *b, const void *key)
> >  +static int subject2item_cmp(const void *fndata,
> 
> This could also be named unused_fndata.

Sure. I simply took the version Junio used when he merged the previous
patch series iteration into `pu`.

The function is short enough, though, that I feel it obvious that the
parameter is unused.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 02/10] rebase -i: generate the script via rebase--helper
  2017-07-14 14:44 ` [PATCH v6 02/10] rebase -i: generate the script via rebase--helper Johannes Schindelin
@ 2017-07-14 22:50   ` Stefan Beller
  2017-07-15 12:56     ` Johannes Schindelin
  0 siblings, 1 reply; 17+ messages in thread
From: Stefan Beller @ 2017-07-14 22:50 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: git, Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

On Fri, Jul 14, 2017 at 7:44 AM, Johannes Schindelin
<johannes.schindelin@gmx.de> wrote:
> The first step of an interactive rebase is to generate the so-called "todo
> script", to be stored in the state directory as "git-rebase-todo" and to
> be edited by the user.
>
> Originally, we adjusted the output of `git log <options>` using a simple
> sed script. Over the course of the years, the code became more
> complicated. We now use shell scripting to edit the output of `git log`
> conditionally, depending whether to keep "empty" commits (i.e. commits
> that do not change any files).
>
> On platforms where shell scripting is not native, this can be a serious
> drag. And it opens the door for incompatibilities between platforms when
> it comes to shell scripting or to Unix-y commands.
>
> Let's just re-implement the todo script generation in plain C, using the
> revision machinery directly.
>
> This is substantially faster, improving the speed relative to the
> shell script version of the interactive rebase from 2x to 3x on Windows.

Thanks for working on this


> +int sequencer_make_script(int keep_empty, FILE *out,
> +               int argc, const char **argv)
> +{

> +       init_revisions(&revs, NULL);
> +       revs.verbose_header = 1;
> +       revs.max_parents = 1;
> +       revs.cherry_pick = 1;
> +       revs.limited = 1;
> +       revs.reverse = 1;
> +       revs.right_only = 1;
> +       revs.sort_order = REV_SORT_IN_GRAPH_ORDER;
> +       revs.topo_order = 1;
> +
> +       revs.pretty_given = 1;
> +       git_config_get_string("rebase.instructionFormat", &format);
> +       if (!format || !*format) {
> +               free(format);
> +               format = xstrdup("%s");
> +       }

https://public-inbox.org/git/xmqqvapqo4i8.fsf@gitster.mtv.corp.google.com/

So this is the core part that you and Junio have differing opinions on.

> All of the above feels like inviting unnecessary future breakages by
> knowing too much about the implementation the current version of
> revision.c happens to use.  A more careful implementation would be
> to allocate our own av[] and prepare "--reverse", "--left-right",
> "--cherry-pick", etc. to be parsed by setup_revisions() call we see
> below.  The parsing is not an expensive part of the operation
> anyway, and that way we do not have to worry about one less thing.

Allow me go through each of the options which may help
us finding a consensus (at least it helps me having a more
informed opinion).
List of options used outside of revision.c, which in the ideal
world of Git are parsed in e.g. handle_revision_opt called
from setup_revisions:

.verbose_header
  bisect.c:       opt.verbose_header = 1;
  builtin/commit.c:       rev.verbose_header = 1;
  builtin/log.c:  rev->verbose_header = 1;
  builtin/log.c:  rev.verbose_header = 1;
  builtin/log.c:  rev.verbose_header = 1;

.max_parents
  builtin/log.c:  check_rev.max_parents = 1;
  builtin/log.c:  revs.max_parents = 1;
  builtin/log.c:  rev.max_parents = 1;
  builtin/log.c:  revs.max_parents = 1;

.cherry_pick
  is clean!

.limited
  ref-filter.c:   revs.limited = 1;

.reverse
  seems clean.

.right_only:
.sort_order:
  is clean!

.topo_order:
  builtin/fast-export.c:  revs.topo_order = 1;
  builtin/log.c:  revs.topo_order = 1;

.pretty_given
  builtin/log.c:  if (!rev->show_notes_given && (!rev->pretty_given || w.notes))
  builtin/log.c:  if (rev->pretty_given && rev->commit_format == CMIT_FMT_RAW) {

There are two conflicting messages I get:
* only a few fields seem to be polluted (verbose_header,
  max_parents), much fewer than I thought
* we do use these undocumented ways already,
  but not at the scale that DScho is trying to here.

In the reply to the cover letter I outlined that we may have
a problem with integrating the repository struct when using
string arrays only.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 02/10] rebase -i: generate the script via rebase--helper
  2017-07-14 22:50   ` Stefan Beller
@ 2017-07-15 12:56     ` Johannes Schindelin
  0 siblings, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-15 12:56 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Junio C Hamano, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

Hi Stefan,

On Fri, 14 Jul 2017, Stefan Beller wrote:

> There are two conflicting messages I get:
> * only a few fields seem to be polluted (verbose_header,
>   max_parents), much fewer than I thought
> * we do use these undocumented ways already,
>   but not at the scale that DScho is trying to here.

We use mostly undocumented ways, because there is simply no API. It was
decided early on that libgit was never intended to be a public API. So
that's that.

TBH a much better way to handle this would have been to say: this is
fragile, as the pretty-printing is not properly abstracted, and hence we
may need to break the contract at some stage (which would not cause
compile errors, though), maybe you, Dscho, can fix that after this patch
series was applied by abstracting it properly?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 00/10] The final building block for a faster rebase -i
  2017-07-14 14:44 [PATCH v6 00/10] The final building block for a faster rebase -i Johannes Schindelin
                   ` (10 preceding siblings ...)
  2017-07-14 17:27 ` [PATCH v6 00/10] The final building block for a faster rebase -i Stefan Beller
@ 2017-07-20 21:38 ` Junio C Hamano
  2017-07-22 11:44   ` Johannes Schindelin
  11 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2017-07-20 21:38 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: git, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> Changes since v5:
>
> - replaced a get_sha1() call by a get_oid() call already.
>
> - adjusted to hashmap API changes

Applying this to the tip of 'master' yields exactly the same result
as merging the previous round js/rebase-i-final to the tip of
'master' and then applying merge-fix/js/rebase-i-final to adjust to
the codebase, so the net effect of this reroll is none.  Which is a
good sign, as it means there wasn't any rebase mistake and the evil
merge we've been carrying was a good one.

But at the same time, I prefer to avoid rebasing to newer 'master'
until the codebase starts drifting too far apart, or until a new
feature release is made out of newer 'master'.  This is primarily
because I want dates on commits to mean something---namely, "this
change hasn't seen a need to be updated for 'oops, that was wrong'
since this date".  This use of commit dates as 'priority date'
matters much less for a topic not in 'next', but as a general
principle, my workflow tries to preserve commit dates for all
topics.

For the above reason, I may hold onto this patch series in my inbox
without actually updating js/rebase-i-final topic until the current
cycle is over; please do not mistake it as this new reroll being
ignored.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 00/10] The final building block for a faster rebase -i
  2017-07-20 21:38 ` Junio C Hamano
@ 2017-07-22 11:44   ` Johannes Schindelin
  0 siblings, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-07-22 11:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Philip Oakley, Jeff King, Phillip Wood, Liam Beguin

Hi Junio,

On Thu, 20 Jul 2017, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > Changes since v5:
> >
> > - replaced a get_sha1() call by a get_oid() call already.
> >
> > - adjusted to hashmap API changes
> 
> Applying this to the tip of 'master' yields exactly the same result
> as merging the previous round js/rebase-i-final to the tip of
> 'master' and then applying merge-fix/js/rebase-i-final to adjust to
> the codebase, so the net effect of this reroll is none.  Which is a
> good sign, as it means there wasn't any rebase mistake and the evil
> merge we've been carrying was a good one.

Good.

> But at the same time, I prefer to avoid rebasing to newer 'master'
> until the codebase starts drifting too far apart, or until a new
> feature release is made out of newer 'master'.  This is primarily
> because I want dates on commits to mean something---namely, "this
> change hasn't seen a need to be updated for 'oops, that was wrong'
> since this date".  This use of commit dates as 'priority date'
> matters much less for a topic not in 'next', but as a general
> principle, my workflow tries to preserve commit dates for all
> topics.

By that token, commit message updates would also be inappropriate, in
particular when they came from somebody else than the patch author ;-P

As to avoiding a rebase: we can add that to the growing list of things on
which we disagree.

If the author dates really meant anything, we would also have to avoid v2,
v3, v4, ... v226 of patch series. So that flies in the face of trying to
keep the meaning of author dates.

In addition, the development flow I prefer is one that is in harmony with
the modern Continuous Integration style, where topic branches are merged
into a single, always-ready-to-release integration branch.

That means that I always work off of `master`, unless there is a good
reason to base off of `next` or even `pu`. That's to avoid merge
conflicts, to see what really gets applied.

I am *especially* adamant about rebasing to a newer upstream commit when
there are merge conflicts.

Such as is the case here.

> For the above reason, I may hold onto this patch series in my inbox
> without actually updating js/rebase-i-final topic until the current
> cycle is over; please do not mistake it as this new reroll being
> ignored.

You do as you want, of course. But please note that I will not rebase my
topic branches to an ancient revision, especially if that would cause merge
conflicts with the current `master`.

And if there should be another iteration of this wallflower patch series,
I will rebase it to the then-current `master` again [*1*].

Ciao,
Dscho

Footnote *1*: in general, I try to abide by the wishes of maintainers when
contributing code, unless those wishes are contrary to what I consider
correct software development. Like, when in Rome, I will do as the Romans
do. Except when I see them looting a parking meter.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-07-22 11:45 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-14 14:44 [PATCH v6 00/10] The final building block for a faster rebase -i Johannes Schindelin
2017-07-14 14:44 ` [PATCH v6 01/10] t3415: verify that an empty instructionFormat is handled as before Johannes Schindelin
2017-07-14 14:44 ` [PATCH v6 02/10] rebase -i: generate the script via rebase--helper Johannes Schindelin
2017-07-14 22:50   ` Stefan Beller
2017-07-15 12:56     ` Johannes Schindelin
2017-07-14 14:45 ` [PATCH v6 03/10] rebase -i: remove useless indentation Johannes Schindelin
2017-07-14 14:45 ` [PATCH v6 04/10] rebase -i: do not invent onelines when expanding/collapsing SHA-1s Johannes Schindelin
2017-07-14 14:45 ` [PATCH v6 05/10] rebase -i: also expand/collapse the SHA-1s via the rebase--helper Johannes Schindelin
2017-07-14 14:45 ` [PATCH v6 06/10] t3404: relax rebase.missingCommitsCheck tests Johannes Schindelin
2017-07-14 14:45 ` [PATCH v6 07/10] rebase -i: check for missing commits in the rebase--helper Johannes Schindelin
2017-07-14 14:45 ` [PATCH v6 08/10] rebase -i: skip unnecessary picks using " Johannes Schindelin
2017-07-14 14:45 ` [PATCH v6 09/10] t3415: test fixup with wrapped oneline Johannes Schindelin
2017-07-14 14:45 ` [PATCH v6 10/10] rebase -i: rearrange fixup/squash lines using the rebase--helper Johannes Schindelin
2017-07-14 17:27 ` [PATCH v6 00/10] The final building block for a faster rebase -i Stefan Beller
2017-07-14 20:39   ` Johannes Schindelin
2017-07-20 21:38 ` Junio C Hamano
2017-07-22 11:44   ` Johannes Schindelin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).