Git Mailing List Archive on lore.kernel.org
* [PATCH] rebase --merge: optionally skip upstreamed commits
@ 2020-03-09 20:55 Jonathan Tan
  2020-03-10  2:10 ` Taylor Blau
                   ` (4 more replies)
  0 siblings, 5 replies; 78+ messages in thread
From: Jonathan Tan @ 2020-03-09 20:55 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, stolee, git

When rebasing against an upstream that has had many commits since the
original branch was created:

 O -- O -- ... -- O -- O (upstream)
  \
   -- O (my-dev-branch)

because "git rebase" attempts to exclude commits that are duplicates of
upstream ones, it must read the contents of every novel upstream commit,
in addition to the tip of the upstream and the merge base. This can be a
significant performance hit, especially in a partial clone, wherein a
read of an object may end up being a fetch.

Add a flag to "git rebase" to allow suppression of this feature. This
flag only works when using the "merge" backend.

This flag changes the behavior of sequencer_make_script(), called from
do_interactive_rebase() <- run_rebase_interactive() <-
run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
(indirectly called from sequencer_make_script() through
prepare_revision_walk()) will no longer call cherry_pick_list(), and
thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
means that the intermediate commits in upstream are no longer read (as
shown by the test) and means that no PATCHSAME-caused skipping of
commits is done by sequencer_make_script(), either directly or through
make_script_with_merges().

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
More improvements for partial clone, but this is a benefit for
non-partial-clone as well, hence the way I wrote the commit message (not
focusing too much on partial clone) and the documentation.

I've chosen --skip-already-present and --no-skip-already-present to
reuse the language already existing in the documentation and to avoid a
double negative (e.g. --avoid-checking-if-already-present and
--no-avoid-checking-if-already-present) but this causes some clumsiness
in the documentation and in the code. Any suggestions for the name are
welcome.

I've only implemented this for the "merge" backend since I think that
there is an effort to migrate "rebase" to use the "merge" backend by
default, and also because "merge" uses diff internally which already has
the (per-commit) blob batch prefetching.
---
 Documentation/git-rebase.txt | 12 +++++-
 builtin/rebase.c             | 10 ++++-
 sequencer.c                  |  3 +-
 sequencer.h                  |  2 +-
 t/t3402-rebase-merge.sh      | 77 ++++++++++++++++++++++++++++++++++++
 5 files changed, 100 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 0c4f038dd6..f73a82b4a9 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -318,6 +318,15 @@ See also INCOMPATIBLE OPTIONS below.
 +
 See also INCOMPATIBLE OPTIONS below.
 
+--skip-already-present::
+--no-skip-already-present::
+	Skip commits that are already present in the new upstream.
+	This is the default.
++
+If the skip-if-already-present feature is unnecessary or undesired,
+`--no-skip-already-present` may improve performance since it avoids
+the need to read the contents of every commit in the new upstream.
+
 --rerere-autoupdate::
 --no-rerere-autoupdate::
 	Allow the rerere mechanism to update the index with the
@@ -866,7 +875,8 @@ Only works if the changes (patch IDs based on the diff contents) on
 'subsystem' did.
 
 In that case, the fix is easy because 'git rebase' knows to skip
-changes that are already present in the new upstream.  So if you say
+changes that are already present in the new upstream (unless
+`--no-skip-already-present` is given). So if you say
 (assuming you're on 'topic')
 ------------
     $ git rebase subsystem
diff --git a/builtin/rebase.c b/builtin/rebase.c
index 6154ad8fa5..943211e5bb 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -88,13 +88,15 @@ struct rebase_options {
 	struct strbuf git_format_patch_opt;
 	int reschedule_failed_exec;
 	int use_legacy_rebase;
+	int skip_already_present;
 };
 
 #define REBASE_OPTIONS_INIT {			  	\
 		.type = REBASE_UNSPECIFIED,	  	\
 		.flags = REBASE_NO_QUIET, 		\
 		.git_am_opts = ARGV_ARRAY_INIT,		\
-		.git_format_patch_opt = STRBUF_INIT	\
+		.git_format_patch_opt = STRBUF_INIT,	\
+		.skip_already_present =	1		\
 	}
 
 static struct replay_opts get_replay_opts(const struct rebase_options *opts)
@@ -373,6 +375,7 @@ static int run_rebase_interactive(struct rebase_options *opts,
 	flags |= opts->rebase_cousins > 0 ? TODO_LIST_REBASE_COUSINS : 0;
 	flags |= opts->root_with_onto ? TODO_LIST_ROOT_WITH_ONTO : 0;
 	flags |= command == ACTION_SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;
+	flags |= opts->skip_already_present ? TODO_LIST_SKIP_ALREADY_PRESENT : 0;
 
 	switch (command) {
 	case ACTION_NONE: {
@@ -1507,6 +1510,8 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "reschedule-failed-exec",
 			 &reschedule_failed_exec,
 			 N_("automatically re-schedule any `exec` that fails")),
+		OPT_BOOL(0, "skip-already-present", &options.skip_already_present,
+			 N_("skip changes that are already present in the new upstream")),
 		OPT_END(),
 	};
 	int i;
@@ -1840,6 +1845,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 			      "interactive or merge options"));
 	}
 
+	if (!options.skip_already_present && !is_interactive(&options))
+		die(_("--no-skip-already-present does not work with the 'am' backend"));
+
 	if (options.signoff) {
 		if (options.type == REBASE_PRESERVE_MERGES)
 			die("cannot combine '--signoff' with "
diff --git a/sequencer.c b/sequencer.c
index ba90a513b9..752580c017 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -4797,12 +4797,13 @@ int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
 	int keep_empty = flags & TODO_LIST_KEEP_EMPTY;
 	const char *insn = flags & TODO_LIST_ABBREVIATE_CMDS ? "p" : "pick";
 	int rebase_merges = flags & TODO_LIST_REBASE_MERGES;
+	int skip_already_present = !!(flags & TODO_LIST_SKIP_ALREADY_PRESENT);
 
 	repo_init_revisions(r, &revs, NULL);
 	revs.verbose_header = 1;
 	if (!rebase_merges)
 		revs.max_parents = 1;
-	revs.cherry_mark = 1;
+	revs.cherry_mark = skip_already_present;
 	revs.limited = 1;
 	revs.reverse = 1;
 	revs.right_only = 1;
diff --git a/sequencer.h b/sequencer.h
index 393571e89a..39bb12f624 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -149,7 +149,7 @@ int sequencer_remove_state(struct replay_opts *opts);
  * `--onto`, we do not want to re-generate the root commits.
  */
 #define TODO_LIST_ROOT_WITH_ONTO (1U << 6)
-
+#define TODO_LIST_SKIP_ALREADY_PRESENT (1U << 7)
 
 int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
 			  const char **argv, unsigned flags);
diff --git a/t/t3402-rebase-merge.sh b/t/t3402-rebase-merge.sh
index a1ec501a87..9b52739a10 100755
--- a/t/t3402-rebase-merge.sh
+++ b/t/t3402-rebase-merge.sh
@@ -162,4 +162,81 @@ test_expect_success 'rebase --skip works with two conflicts in a row' '
 	git rebase --skip
 '
 
+test_expect_success '--no-skip-already-present' '
+	git init repo &&
+
+	# O(1-10) -- O(1-11) -- O(0-10) master
+	#        \
+	#         -- O(1-11) -- O(1-12) otherbranch
+
+	printf "Line %d\n" $(test_seq 1 10) >repo/file.txt &&
+	git -C repo add file.txt &&
+	git -C repo commit -m "base commit" &&
+
+	printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
+	git -C repo commit -a -m "add 11" &&
+
+	printf "Line %d\n" $(test_seq 0 10) >repo/file.txt &&
+	git -C repo commit -a -m "add 0 delete 11" &&
+
+	git -C repo checkout -b otherbranch HEAD^^ &&
+	printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
+	git -C repo commit -a -m "add 11 in another branch" &&
+
+	printf "Line %d\n" $(test_seq 1 12) >repo/file.txt &&
+	git -C repo commit -a -m "add 12 in another branch" &&
+
+	# Regular rebase fails, because the 1-11 commit is deduplicated
+	test_must_fail git -C repo rebase --merge master 2> err &&
+	test_i18ngrep "error: could not apply.*add 12 in another branch" err &&
+	git -C repo rebase --abort &&
+
+	# With --no-skip-already-present, it works
+	git -C repo rebase --merge --no-skip-already-present master
+'
+
+test_expect_success '--no-skip-already-present refrains from reading unneeded blobs' '
+	git init server &&
+
+	# O(1-10) -- O(1-11) -- O(1-12) master
+	#        \
+	#         -- O(0-10) otherbranch
+
+	printf "Line %d\n" $(test_seq 1 10) >server/file.txt &&
+	git -C server add file.txt &&
+	git -C server commit -m "merge base" &&
+
+	printf "Line %d\n" $(test_seq 1 11) >server/file.txt &&
+	git -C server commit -a -m "add 11" &&
+
+	printf "Line %d\n" $(test_seq 1 12) >server/file.txt &&
+	git -C server commit -a -m "add 12" &&
+
+	git -C server checkout -b otherbranch HEAD^^ &&
+	printf "Line %d\n" $(test_seq 0 10) >server/file.txt &&
+	git -C server commit -a -m "add 0" &&
+
+	test_config -C server uploadpack.allowfilter 1 &&
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+
+	git clone --filter=blob:none "file://$(pwd)/server" client &&
+	git -C client checkout origin/master &&
+	git -C client checkout origin/otherbranch &&
+
+	# Sanity check to ensure that the blobs from the merge base and "add
+	# 11" are missing
+	git -C client rev-list --objects --all --missing=print >missing_list &&
+	MERGE_BASE_BLOB=$(git -C server rev-parse master^^:file.txt) &&
+	ADD_11_BLOB=$(git -C server rev-parse master^:file.txt) &&
+	grep "\\?$MERGE_BASE_BLOB" missing_list &&
+	grep "\\?$ADD_11_BLOB" missing_list &&
+
+	git -C client rebase --merge --no-skip-already-present origin/master &&
+
+	# The blob from the merge base had to be fetched, but not "add 11"
+	git -C client rev-list --objects --all --missing=print >missing_list &&
+	! grep "\\?$MERGE_BASE_BLOB" missing_list &&
+	grep "\\?$ADD_11_BLOB" missing_list
+'
+
 test_done
-- 
2.25.1.481.gfbce0eb801-goog



* Re: [PATCH] rebase --merge: optionally skip upstreamed commits
  2020-03-09 20:55 [PATCH] rebase --merge: optionally skip upstreamed commits Jonathan Tan
@ 2020-03-10  2:10 ` Taylor Blau
  2020-03-10 15:51   ` Jonathan Tan
  2020-03-10 12:17 ` Johannes Schindelin
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 78+ messages in thread
From: Taylor Blau @ 2020-03-10  2:10 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, stolee, git

Hi Jonathan,

This patch makes good sense to me. I left a few notes below, but they
are relatively minor, and this seems to be all in a good direction.

As a (somewhat) interesting aside, this feature would be useful to me
outside of partial clones, since I often have this workflow in my local
development wherein 'git rebase' spends quite a bit of time comparing
patches on my branch to everything new upstream.

On Mon, Mar 09, 2020 at 01:55:23PM -0700, Jonathan Tan wrote:
> When rebasing against an upstream that has had many commits since the
> original branch was created:
>
>  O -- O -- ... -- O -- O (upstream)
>   \
>    -- O (my-dev-branch)
>
> because "git rebase" attempts to exclude commits that are duplicates of
> upstream ones, it must read the contents of every novel upstream commit,
> in addition to the tip of the upstream and the merge base.

This sentence is a little confusing if you skip over the graph, since it
reads: "When rebasing against an because ... because ...". It may be
clearer if you swap the order of the last two clauses to instead be:

  it must read the contents of every novel upstream commit, in addition to
  the tip of the upstream and the merge base, because "git rebase"
  attempts to exclude commits that are duplicates of upstream ones.

> This can be a significant performance hit, especially in a partial
> clone, wherein a read of an object may end up being a fetch.
>
> Add a flag to "git rebase" to allow suppression of this feature. This
> flag only works when using the "merge" backend.
>
> This flag changes the behavior of sequencer_make_script(), called from
> do_interactive_rebase() <- run_rebase_interactive() <-
> run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
> (indirectly called from sequencer_make_script() through
> prepare_revision_walk()) will no longer call cherry_pick_list(), and
> thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
> means that the intermediate commits in upstream are no longer read (as
> shown by the test) and means that no PATCHSAME-caused skipping of
> commits is done by sequencer_make_script(), either directly or through
> make_script_with_merges().

This all sounds good to me.

> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> More improvements for partial clone, but this is a benefit for
> non-partial-clone as well, hence the way I wrote the commit message (not
> focusing too much on partial clone) and the documentation.
>
> I've chosen --skip-already-present and --no-skip-already-present to
> reuse the language already existing in the documentation and to avoid a
> double negative (e.g. --avoid-checking-if-already-present and
> --no-avoid-checking-if-already-present) but this causes some clumsiness
> in the documentation and in the code. Any suggestions for the name are
> welcome.
>
> I've only implemented this for the "merge" backend since I think that
> there is an effort to migrate "rebase" to use the "merge" backend by
> default, and also because "merge" uses diff internally which already has
> the (per-commit) blob batch prefetching.

This also makes sense to me.

> ---
>  Documentation/git-rebase.txt | 12 +++++-
>  builtin/rebase.c             | 10 ++++-
>  sequencer.c                  |  3 +-
>  sequencer.h                  |  2 +-
>  t/t3402-rebase-merge.sh      | 77 ++++++++++++++++++++++++++++++++++++
>  5 files changed, 100 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index 0c4f038dd6..f73a82b4a9 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -318,6 +318,15 @@ See also INCOMPATIBLE OPTIONS below.
>  +
>  See also INCOMPATIBLE OPTIONS below.
>
> +--skip-already-present::
> +--no-skip-already-present::
> +	Skip commits that are already present in the new upstream.
> +	This is the default.

I believe that you mean '--skip-already-present' is the default, here,
but the placement makes it ambiguous, since it is in a paragraph with a
header that contains both the positive and negated version of this flag.

Maybe this could be changed to: 's/This/--skip-already-present/'.

> ++
> +If the skip-if-already-present feature is unnecessary or undesired,
> +`--no-skip-already-present` may improve performance since it avoids
> +the need to read the contents of every commit in the new upstream.
> +
>  --rerere-autoupdate::
>  --no-rerere-autoupdate::
>  	Allow the rerere mechanism to update the index with the
> @@ -866,7 +875,8 @@ Only works if the changes (patch IDs based on the diff contents) on
>  'subsystem' did.
>
>  In that case, the fix is easy because 'git rebase' knows to skip
> -changes that are already present in the new upstream.  So if you say
> +changes that are already present in the new upstream (unless
> +`--no-skip-already-present` is given). So if you say

Extremely minor nit: there is a whitespace change on this line where the
original has two spaces between the '.' and 'So', and the new version
has only one.

>  (assuming you're on 'topic')
>  ------------
>      $ git rebase subsystem
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 6154ad8fa5..943211e5bb 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -88,13 +88,15 @@ struct rebase_options {
>  	struct strbuf git_format_patch_opt;
>  	int reschedule_failed_exec;
>  	int use_legacy_rebase;
> +	int skip_already_present;
>  };
>
>  #define REBASE_OPTIONS_INIT {			  	\
>  		.type = REBASE_UNSPECIFIED,	  	\
>  		.flags = REBASE_NO_QUIET, 		\
>  		.git_am_opts = ARGV_ARRAY_INIT,		\
> -		.git_format_patch_opt = STRBUF_INIT	\
> +		.git_format_patch_opt = STRBUF_INIT,	\
> +		.skip_already_present =	1		\
>  	}
>
>  static struct replay_opts get_replay_opts(const struct rebase_options *opts)
> @@ -373,6 +375,7 @@ static int run_rebase_interactive(struct rebase_options *opts,
>  	flags |= opts->rebase_cousins > 0 ? TODO_LIST_REBASE_COUSINS : 0;
>  	flags |= opts->root_with_onto ? TODO_LIST_ROOT_WITH_ONTO : 0;
>  	flags |= command == ACTION_SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;
> +	flags |= opts->skip_already_present ? TODO_LIST_SKIP_ALREADY_PRESENT : 0;
>
>  	switch (command) {
>  	case ACTION_NONE: {
> @@ -1507,6 +1510,8 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  		OPT_BOOL(0, "reschedule-failed-exec",
>  			 &reschedule_failed_exec,
>  			 N_("automatically re-schedule any `exec` that fails")),
> +		OPT_BOOL(0, "skip-already-present", &options.skip_already_present,
> +			 N_("skip changes that are already present in the new upstream")),

I scratched my head a little bit about why we weren't using OPT_BIT and
&flags directly here, but it matches the pattern in the surrounding
code, so I think that 'OPT_BOOL' with the target
'&options.skip_already_present' is right here.

>  		OPT_END(),
>  	};
>  	int i;
> @@ -1840,6 +1845,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  			      "interactive or merge options"));
>  	}
>
> +	if (!options.skip_already_present && !is_interactive(&options))
> +		die(_("--no-skip-already-present does not work with the 'am' backend"));
> +
>  	if (options.signoff) {
>  		if (options.type == REBASE_PRESERVE_MERGES)
>  			die("cannot combine '--signoff' with "
> diff --git a/sequencer.c b/sequencer.c
> index ba90a513b9..752580c017 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -4797,12 +4797,13 @@ int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
>  	int keep_empty = flags & TODO_LIST_KEEP_EMPTY;
>  	const char *insn = flags & TODO_LIST_ABBREVIATE_CMDS ? "p" : "pick";
>  	int rebase_merges = flags & TODO_LIST_REBASE_MERGES;
> +	int skip_already_present = !!(flags & TODO_LIST_SKIP_ALREADY_PRESENT);
>
>  	repo_init_revisions(r, &revs, NULL);
>  	revs.verbose_header = 1;
>  	if (!rebase_merges)
>  		revs.max_parents = 1;
> -	revs.cherry_mark = 1;
> +	revs.cherry_mark = skip_already_present;

:-). All of that plumbing just to poke at this variable. Looks good to
me.
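
The plumbing in question can be sketched in isolation. This is a minimal
illustration, not code from git: `struct rebase_opts_sketch`,
`struct rev_info_sketch`, `make_flags()` and `setup_walk()` are
hypothetical names, and only `TODO_LIST_SKIP_ALREADY_PRESENT` and its
value are taken from the patch. The sketch assumes the walk setting is a
one-bit field, which is why the `!!` normalization matters: storing the
raw masked value (128) in a one-bit unsigned field would keep only the
low bit and yield 0.

```c
#include <assert.h>

/* Flag bit as defined by the patch in sequencer.h. */
#define TODO_LIST_SKIP_ALREADY_PRESENT (1U << 7)

/* Hypothetical stand-ins for the option and walk structures. */
struct rebase_opts_sketch {
	int skip_already_present;	/* 1 by default, 0 after --no-skip-already-present */
};

struct rev_info_sketch {
	unsigned int cherry_mark:1;	/* assumed to be a one-bit field */
};

/* Step 1: fold the parsed boolean option into the flag word. */
unsigned make_flags(const struct rebase_opts_sketch *opts)
{
	unsigned flags = 0;

	assert(opts);
	flags |= opts->skip_already_present ? TODO_LIST_SKIP_ALREADY_PRESENT : 0;
	return flags;
}

/* Step 2: turn the flag bit back into a 0/1 value for the walk. */
void setup_walk(struct rev_info_sketch *revs, unsigned flags)
{
	assert(revs);
	/* "!!" maps 128 to 1; without it the one-bit field would store 0. */
	revs->cherry_mark = !!(flags & TODO_LIST_SKIP_ALREADY_PRESENT);
}
```

Calling `setup_walk(&revs, make_flags(&opts))` with the default options
leaves `revs.cherry_mark` set, and clears it when
`opts.skip_already_present` is 0.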

>  	revs.limited = 1;
>  	revs.reverse = 1;
>  	revs.right_only = 1;
> diff --git a/sequencer.h b/sequencer.h
> index 393571e89a..39bb12f624 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -149,7 +149,7 @@ int sequencer_remove_state(struct replay_opts *opts);
>   * `--onto`, we do not want to re-generate the root commits.
>   */
>  #define TODO_LIST_ROOT_WITH_ONTO (1U << 6)
> -
> +#define TODO_LIST_SKIP_ALREADY_PRESENT (1U << 7)

This was another spot that I thought could maybe be turned into an enum,
but it's clearly not the fault of your patch, and could easily be turned
into #leftoverbits.

>  int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
>  			  const char **argv, unsigned flags);
> diff --git a/t/t3402-rebase-merge.sh b/t/t3402-rebase-merge.sh
> index a1ec501a87..9b52739a10 100755
> --- a/t/t3402-rebase-merge.sh
> +++ b/t/t3402-rebase-merge.sh
> @@ -162,4 +162,81 @@ test_expect_success 'rebase --skip works with two conflicts in a row' '
>  	git rebase --skip
>  '
>
> +test_expect_success '--no-skip-already-present' '
> +	git init repo &&
> +
> +	# O(1-10) -- O(1-11) -- O(0-10) master
> +	#        \
> +	#         -- O(1-11) -- O(1-12) otherbranch
> +
> +	printf "Line %d\n" $(test_seq 1 10) >repo/file.txt &&
> +	git -C repo add file.txt &&
> +	git -C repo commit -m "base commit" &&
> +
> +	printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
> +	git -C repo commit -a -m "add 11" &&
> +
> +	printf "Line %d\n" $(test_seq 0 10) >repo/file.txt &&
> +	git -C repo commit -a -m "add 0 delete 11" &&
> +
> +	git -C repo checkout -b otherbranch HEAD^^ &&
> +	printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
> +	git -C repo commit -a -m "add 11 in another branch" &&
> +
> +	printf "Line %d\n" $(test_seq 1 12) >repo/file.txt &&
> +	git -C repo commit -a -m "add 12 in another branch" &&
> +
> +	# Regular rebase fails, because the 1-11 commit is deduplicated
> +	test_must_fail git -C repo rebase --merge master 2> err &&
> +	test_i18ngrep "error: could not apply.*add 12 in another branch" err &&
> +	git -C repo rebase --abort &&
> +
> +	# With --no-skip-already-present, it works
> +	git -C repo rebase --merge --no-skip-already-present master
> +'
> +
> +test_expect_success '--no-skip-already-present refrains from reading unneeded blobs' '
> +	git init server &&
> +
> +	# O(1-10) -- O(1-11) -- O(1-12) master
> +	#        \
> +	#         -- O(0-10) otherbranch
> +
> +	printf "Line %d\n" $(test_seq 1 10) >server/file.txt &&
> +	git -C server add file.txt &&
> +	git -C server commit -m "merge base" &&
> +
> +	printf "Line %d\n" $(test_seq 1 11) >server/file.txt &&
> +	git -C server commit -a -m "add 11" &&
> +
> +	printf "Line %d\n" $(test_seq 1 12) >server/file.txt &&
> +	git -C server commit -a -m "add 12" &&
> +
> +	git -C server checkout -b otherbranch HEAD^^ &&
> +	printf "Line %d\n" $(test_seq 0 10) >server/file.txt &&
> +	git -C server commit -a -m "add 0" &&
> +
> +	test_config -C server uploadpack.allowfilter 1 &&
> +	test_config -C server uploadpack.allowanysha1inwant 1 &&
> +
> +	git clone --filter=blob:none "file://$(pwd)/server" client &&
> +	git -C client checkout origin/master &&
> +	git -C client checkout origin/otherbranch &&
> +
> +	# Sanity check to ensure that the blobs from the merge base and "add
> +	# 11" are missing
> +	git -C client rev-list --objects --all --missing=print >missing_list &&
> +	MERGE_BASE_BLOB=$(git -C server rev-parse master^^:file.txt) &&
> +	ADD_11_BLOB=$(git -C server rev-parse master^:file.txt) &&
> +	grep "\\?$MERGE_BASE_BLOB" missing_list &&
> +	grep "\\?$ADD_11_BLOB" missing_list &&
> +
> +	git -C client rebase --merge --no-skip-already-present origin/master &&
> +
> +	# The blob from the merge base had to be fetched, but not "add 11"
> +	git -C client rev-list --objects --all --missing=print >missing_list &&
> +	! grep "\\?$MERGE_BASE_BLOB" missing_list &&
> +	grep "\\?$ADD_11_BLOB" missing_list
> +'
> +
>  test_done
> --
> 2.25.1.481.gfbce0eb801-goog

The tests look good to me. Thanks for working on this!

Thanks,
Taylor


* Re: [PATCH] rebase --merge: optionally skip upstreamed commits
  2020-03-09 20:55 [PATCH] rebase --merge: optionally skip upstreamed commits Jonathan Tan
  2020-03-10  2:10 ` Taylor Blau
@ 2020-03-10 12:17 ` Johannes Schindelin
  2020-03-10 16:00   ` Jonathan Tan
  2020-03-10 18:56 ` Elijah Newren
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 78+ messages in thread
From: Johannes Schindelin @ 2020-03-10 12:17 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, stolee, git

Hi Jonathan,

On Mon, 9 Mar 2020, Jonathan Tan wrote:

> When rebasing against an upstream that has had many commits since the
> original branch was created:
>
>  O -- O -- ... -- O -- O (upstream)
>   \
>    -- O (my-dev-branch)
>
> because "git rebase" attempts to exclude commits that are duplicates of
> upstream ones, it must read the contents of every novel upstream commit,
> in addition to the tip of the upstream and the merge base. This can be a
> significant performance hit, especially in a partial clone, wherein a
> read of an object may end up being a fetch.
>
> Add a flag to "git rebase" to allow suppression of this feature. This
> flag only works when using the "merge" backend.

I wonder whether we can make this a bit more user-friendly by defaulting
to `--right-only` if there are no promised objects in the symmetric range,
and if there _are_ promised objects, to skip `--right-only`, possibly with
an advice that we did that and how to force it to download the promised
objects?

Ciao,
Dscho


* Re: [PATCH] rebase --merge: optionally skip upstreamed commits
  2020-03-10  2:10 ` Taylor Blau
@ 2020-03-10 15:51   ` Jonathan Tan
  0 siblings, 0 replies; 78+ messages in thread
From: Jonathan Tan @ 2020-03-10 15:51 UTC (permalink / raw)
  To: me; +Cc: jonathantanmy, git, stolee, git

> Hi Jonathan,
> 
> This patch makes good sense to me. I left a few notes below, but they
> are relatively minor, and this seems to be all in a good direction.
> 
> As a (somewhat) interesting aside, this feature would be useful to me
> outside of partial clones, since I often have this workflow in my local
> development wherein 'git rebase' spends quite a bit of time comparing
> patches on my branch to everything new upstream.

Thanks for your review, and it's great to know of a use case that this
helps.

> This sentence is a little confusing if you skip over the graph, since it
> reads: "When rebasing against an because ... because ...". It may be
> clearer if you swap the order of the last two clauses to instead be:
> 
>   it must read the contents of every novel upstream commit, in addition to
>   the tip of the upstream and the merge base, because "git rebase"
>   attempts to exclude commits that are duplicates of upstream ones.

Sounds good; will do.

> > +--skip-already-present::
> > +--no-skip-already-present::
> > +	Skip commits that are already present in the new upstream.
> > +	This is the default.
> 
> I believe that you mean '--skip-already-present' is the default, here,
> but the placement makes it ambiguous, since it is in a paragraph with a
> header that contains both the positive and negated version of this flag.
> 
> Maybe this could changed to: s/This/--skip-already-present/'.

Will do.

> >  In that case, the fix is easy because 'git rebase' knows to skip
> > -changes that are already present in the new upstream.  So if you say
> > +changes that are already present in the new upstream (unless
> > +`--no-skip-already-present` is given). So if you say
> 
> Extremely minor nit: there is a whitespace change on this line where the
> original has two spaces between the '.' and 'So', and the new version
> has only one.

OK - I'll change it to 2 spaces.

> > diff --git a/sequencer.h b/sequencer.h
> > index 393571e89a..39bb12f624 100644
> > --- a/sequencer.h
> > +++ b/sequencer.h
> > @@ -149,7 +149,7 @@ int sequencer_remove_state(struct replay_opts *opts);
> >   * `--onto`, we do not want to re-generate the root commits.
> >   */
> >  #define TODO_LIST_ROOT_WITH_ONTO (1U << 6)
> > -
> > +#define TODO_LIST_SKIP_ALREADY_PRESENT (1U << 7)
> 
> This was another spot that I thought could maybe be turned into an enum,
> but it's clearly not the fault of your patch, and could easily be turned
> into #leftoverbits.

There was a recent discussion on the list [1] about whether bitsets
should be enums, and we decided against it. But anyway we can revisit
this later if need be.

[1] https://lore.kernel.org/git/20191016193750.258148-1-jonathantanmy@google.com/

The changes Taylor suggested were minor, so I'll hold off sending
another version until there are more substantial changes requested.


* Re: [PATCH] rebase --merge: optionally skip upstreamed commits
  2020-03-10 12:17 ` Johannes Schindelin
@ 2020-03-10 16:00   ` Jonathan Tan
  0 siblings, 0 replies; 78+ messages in thread
From: Jonathan Tan @ 2020-03-10 16:00 UTC (permalink / raw)
  To: Johannes.Schindelin; +Cc: jonathantanmy, git, stolee, git

> I wonder whether we can make this a bit more user-friendly by defaulting
> to `--right-only` if there are no promised objects in the symmetric range,
> and if there _are_ promised objects, to skip `--right-only`, possibly with
> an advice that we did that and how to force it to download the promised
> objects?

Thanks for your suggestion. I'm inclined to think that in a partial
clone, whether an object is missing or not should not affect the
behavior of a Git command (except for lazy fetching and performance),
but I can see how this is useful, at least for the purposes of
discoverability and ease of use (good diagnostics for the
non-partial-clone case, and better performance for the partial clone
case).

But in any case, I think that this can be built later on top of my
patch. Even if we have automatic detection of missing objects and
automatic selection of functionality, we will still need the CLI
arguments for manual override, so the CLI flags and functionality in
this patch are still useful.
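
One way to picture how automatic selection could sit on top of the
command-line flag is a tri-state option that stays unspecified unless the
user passes `--skip-already-present` or `--no-skip-already-present`.
This is a hypothetical sketch, not part of this patch:
`enum skip_choice`, `resolve_skip_already_present()` and the
`range_has_missing_objects` input are assumed names, and how missing
objects would actually be detected is left open.

```c
#include <assert.h>

/* Hypothetical tri-state: -1 means the user gave neither flag. */
enum skip_choice {
	SKIP_UNSPECIFIED = -1,
	SKIP_OFF = 0,
	SKIP_ON = 1
};

/* Decide whether the already-present detection should run. */
int resolve_skip_already_present(enum skip_choice cli_value,
				 int range_has_missing_objects)
{
	assert(cli_value >= SKIP_UNSPECIFIED && cli_value <= SKIP_ON);

	/* An explicit flag is a manual override and always wins. */
	if (cli_value != SKIP_UNSPECIFIED)
		return cli_value;

	/* Otherwise avoid the detection when it would trigger fetches. */
	return range_has_missing_objects ? SKIP_OFF : SKIP_ON;
}
```

With this shape, the flags from this patch keep their meaning and the
automatic choice only applies when neither flag is given.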


* Re: [PATCH] rebase --merge: optionally skip upstreamed commits
  2020-03-09 20:55 [PATCH] rebase --merge: optionally skip upstreamed commits Jonathan Tan
  2020-03-10  2:10 ` Taylor Blau
  2020-03-10 12:17 ` Johannes Schindelin
@ 2020-03-10 18:56 ` Elijah Newren
  2020-03-10 22:56   ` Jonathan Tan
  2020-03-18 17:30 ` [PATCH v2] " Jonathan Tan
  2020-03-30  4:06 ` [PATCH v3] rebase --merge: optionally skip upstreamed commits Jonathan Tan
  4 siblings, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2020-03-10 18:56 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Git Mailing List, Derrick Stolee, Jeff Hostetler

On Mon, Mar 9, 2020 at 1:58 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> When rebasing against an upstream that has had many commits since the
> original branch was created:
>
>  O -- O -- ... -- O -- O (upstream)
>   \
>    -- O (my-dev-branch)
>
> because "git rebase" attempts to exclude commits that are duplicates of
> upstream ones, it must read the contents of every novel upstream commit,
> in addition to the tip of the upstream and the merge base. This can be a
> significant performance hit, especially in a partial clone, wherein a
> read of an object may end up being a fetch.

Does this suggest that the cherry-pick detection is suboptimal and
needs to be improved?  When rebasing, it is typical that you are just
rebasing a small number of patches compared to how many exist
upstream.  As such, any upstream patch modifying files outside the set
of files modified on the rebased side is known to not be PATCHSAME
without looking at those new files.  Or is the issue just that the
number of upstream commits that modify only files also modified on the
rebased side is large?
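
The path-based shortcut described above can be sketched as follows. This
is only an illustration of the idea, not how git's cherry-pick detection
is implemented: `struct path_set`, `path_in_set()` and
`may_be_patchsame()` are hypothetical names, and the sketch assumes the
list of paths each commit modifies has already been computed. An
upstream commit that touches any path outside the union of paths
modified on the rebased side cannot be patch-identical to a rebased
commit, so its contents would not need to be read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: the paths modified by one commit, or by a set of commits. */
struct path_set {
	const char **paths;
	size_t nr;
};

/* Return 1 if "path" is in "set" (a linear scan keeps the sketch short). */
int path_in_set(const struct path_set *set, const char *path)
{
	size_t i;

	for (i = 0; i < set->nr; i++)
		if (!strcmp(set->paths[i], path))
			return 1;
	return 0;
}

/*
 * Return 0 if the upstream commit modifies a path that no rebased commit
 * modifies; such a commit can be ruled out without comparing patch IDs.
 * Return 1 if it still needs the full comparison.
 */
int may_be_patchsame(const struct path_set *rebased_union,
		     const struct path_set *upstream)
{
	size_t i;

	assert(rebased_union && upstream);
	for (i = 0; i < upstream->nr; i++)
		if (!path_in_set(rebased_union, upstream->paths[i]))
			return 0;
	return 1;
}
```

As the question itself notes, a filter like this would not help when
many upstream commits modify only files that the rebased side also
modifies.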

> Add a flag to "git rebase" to allow suppression of this feature. This
> flag only works when using the "merge" backend.

Interesting.  A little over a year ago we discussed not only making
such a change in behavior, but making it the default; see
https://public-inbox.org/git/nycvar.QRO.7.76.6.1901211635080.41@tvgsbejvaqbjf.bet/
(from "Ooh, that's interesting" to "VFS for Git").

Since that time, we did indeed change the handling of commits that
become empty (so that they now default to dropping), which certainly
goes well with the new behavior to skip the cherry-pick detection.

One note: I think that old thread was wrong about the apply versus the
merge backends (which were referred to as "am" and "interactive" at
the time): both the apply and the merge backends have done the
cherry-pick detection so it wasn't a difference between the two.

> This flag changes the behavior of sequencer_make_script(), called from
> do_interactive_rebase() <- run_rebase_interactive() <-
> run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
> (indirectly called from sequencer_make_script() through
> prepare_revision_walk()) will no longer call cherry_pick_list(), and
> thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
> means that the intermediate commits in upstream are no longer read (as
> shown by the test) and means that no PATCHSAME-caused skipping of
> commits is done by sequencer_make_script(), either directly or through
> make_script_with_merges().
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> More improvements for partial clone, but this is a benefit for
> non-partial-clone as well, hence the way I wrote the commit message (not
> focusing too much on partial clone) and the documentation.

Yes, I've wanted to kill that performance overhead too even without
partial clones, though I was slightly worried about cases of a known
cherry-pick no longer cleanly applying and thus forcing the user to
detect that it has become empty.  I guess that's why it's a flag
instead of the new default, but there's something inside of me that
asks why this special case is detected for the user when other
conflict cases aren't...  Not sure if I'm just pigeonholing on
performance too much.

> I've chosen --skip-already-present and --no-skip-already-present to
> reuse the language already existing in the documentation and to avoid a
> double negative (e.g. --avoid-checking-if-already-present and
> --no-avoid-checking-if-already-present) but this causes some clumsiness
> in the documentation and in the code. Any suggestions for the name are
> welcome.

I'll add comments on this below.

> I've only implemented this for the "merge" backend since I think that
> there is an effort to migrate "rebase" to use the "merge" backend by
> default, and also because "merge" uses diff internally which already has
> the (per-commit) blob batch prefetching.

I understand the first half of your reason here, but I don't follow
the second half.  The apply backend uses diff to generate the patches,
but diff isn't the relevant operation here; it's the rev-list walking,
and both call the exact same rev-list walk the last time I checked so
I'm not sure what the difference is here.  Am I misunderstanding one
or more things?

> ---
>  Documentation/git-rebase.txt | 12 +++++-
>  builtin/rebase.c             | 10 ++++-
>  sequencer.c                  |  3 +-
>  sequencer.h                  |  2 +-
>  t/t3402-rebase-merge.sh      | 77 ++++++++++++++++++++++++++++++++++++
>  5 files changed, 100 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index 0c4f038dd6..f73a82b4a9 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -318,6 +318,15 @@ See also INCOMPATIBLE OPTIONS below.
>  +
>  See also INCOMPATIBLE OPTIONS below.
>
> +--skip-already-present::
> +--no-skip-already-present::
> +       Skip commits that are already present in the new upstream.
> +       This is the default.
> ++
> +If the skip-if-already-present feature is unnecessary or undesired,
> +`--no-skip-already-present` may improve performance since it avoids
> +the need to read the contents of every commit in the new upstream.
> +

I'm afraid the naming might be pretty opaque and confusing to users.
Even if we do keep the names, it might help to be clearer about the
ramifications.  And there's a missing reference to the option
incompatibility.  Perhaps something like:

--skip-cherry-pick-detection
--no-skip-cherry-pick-detection

    Whether rebase tries to determine if commits are already present
upstream, i.e. if there are commits which are cherry-picks.  If such
detection is done, any commits being rebased which are cherry-picks
will be dropped, since those commits are already found upstream.  If
such detection is not done, those commits will be re-applied, which
most likely will result in no new changes (as the changes are already
upstream) and result in the commit being dropped anyway.  cherry-pick
detection is the default, but can be expensive in repos with a large
number of upstream commits that need to be read.

See also INCOMPATIBLE OPTIONS below.

>  --rerere-autoupdate::
>  --no-rerere-autoupdate::
>         Allow the rerere mechanism to update the index with the
> @@ -866,7 +875,8 @@ Only works if the changes (patch IDs based on the diff contents) on
>  'subsystem' did.
>
>  In that case, the fix is easy because 'git rebase' knows to skip
> -changes that are already present in the new upstream.  So if you say
> +changes that are already present in the new upstream (unless
> +`--no-skip-already-present` is given). So if you say
>  (assuming you're on 'topic')
>  ------------
>      $ git rebase subsystem
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 6154ad8fa5..943211e5bb 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -88,13 +88,15 @@ struct rebase_options {
>         struct strbuf git_format_patch_opt;
>         int reschedule_failed_exec;
>         int use_legacy_rebase;
> +       int skip_already_present;
>  };
>
>  #define REBASE_OPTIONS_INIT {                          \
>                 .type = REBASE_UNSPECIFIED,             \
>                 .flags = REBASE_NO_QUIET,               \
>                 .git_am_opts = ARGV_ARRAY_INIT,         \
> -               .git_format_patch_opt = STRBUF_INIT     \
> +               .git_format_patch_opt = STRBUF_INIT,    \
> +               .skip_already_present = 1               \
>         }
>
>  static struct replay_opts get_replay_opts(const struct rebase_options *opts)
> @@ -373,6 +375,7 @@ static int run_rebase_interactive(struct rebase_options *opts,
>         flags |= opts->rebase_cousins > 0 ? TODO_LIST_REBASE_COUSINS : 0;
>         flags |= opts->root_with_onto ? TODO_LIST_ROOT_WITH_ONTO : 0;
>         flags |= command == ACTION_SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;
> +       flags |= opts->skip_already_present ? TODO_LIST_SKIP_ALREADY_PRESENT : 0;
>
>         switch (command) {
>         case ACTION_NONE: {
> @@ -1507,6 +1510,8 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>                 OPT_BOOL(0, "reschedule-failed-exec",
>                          &reschedule_failed_exec,
>                          N_("automatically re-schedule any `exec` that fails")),
> +               OPT_BOOL(0, "skip-already-present", &options.skip_already_present,
> +                        N_("skip changes that are already present in the new upstream")),
>                 OPT_END(),
>         };
>         int i;
> @@ -1840,6 +1845,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>                               "interactive or merge options"));
>         }
>
> +       if (!options.skip_already_present && !is_interactive(&options))
> +               die(_("--no-skip-already-present does not work with the 'am' backend"));
> +

with the *apply* backend, not the 'am' one (the backend was renamed in
commit 10cdb9f38a ("rebase: rename the two primary rebase backends",
2020-02-15))

>         if (options.signoff) {
>                 if (options.type == REBASE_PRESERVE_MERGES)
>                         die("cannot combine '--signoff' with "
> diff --git a/sequencer.c b/sequencer.c
> index ba90a513b9..752580c017 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -4797,12 +4797,13 @@ int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
>         int keep_empty = flags & TODO_LIST_KEEP_EMPTY;
>         const char *insn = flags & TODO_LIST_ABBREVIATE_CMDS ? "p" : "pick";
>         int rebase_merges = flags & TODO_LIST_REBASE_MERGES;
> +       int skip_already_present = !!(flags & TODO_LIST_SKIP_ALREADY_PRESENT);
>
>         repo_init_revisions(r, &revs, NULL);
>         revs.verbose_header = 1;
>         if (!rebase_merges)
>                 revs.max_parents = 1;
> -       revs.cherry_mark = 1;
> +       revs.cherry_mark = skip_already_present;
>         revs.limited = 1;
>         revs.reverse = 1;
>         revs.right_only = 1;
> diff --git a/sequencer.h b/sequencer.h
> index 393571e89a..39bb12f624 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -149,7 +149,7 @@ int sequencer_remove_state(struct replay_opts *opts);
>   * `--onto`, we do not want to re-generate the root commits.
>   */
>  #define TODO_LIST_ROOT_WITH_ONTO (1U << 6)
> -
> +#define TODO_LIST_SKIP_ALREADY_PRESENT (1U << 7)
>
>  int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
>                           const char **argv, unsigned flags);
> diff --git a/t/t3402-rebase-merge.sh b/t/t3402-rebase-merge.sh
> index a1ec501a87..9b52739a10 100755
> --- a/t/t3402-rebase-merge.sh
> +++ b/t/t3402-rebase-merge.sh
> @@ -162,4 +162,81 @@ test_expect_success 'rebase --skip works with two conflicts in a row' '
>         git rebase --skip
>  '
>
> +test_expect_success '--no-skip-already-present' '
> +       git init repo &&
> +
> +       # O(1-10) -- O(1-11) -- O(0-10) master
> +       #        \
> +       #         -- O(1-11) -- O(1-12) otherbranch
> +
> +       printf "Line %d\n" $(test_seq 1 10) >repo/file.txt &&
> +       git -C repo add file.txt &&
> +       git -C repo commit -m "base commit" &&
> +
> +       printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
> +       git -C repo commit -a -m "add 11" &&
> +
> +       printf "Line %d\n" $(test_seq 0 10) >repo/file.txt &&
> +       git -C repo commit -a -m "add 0 delete 11" &&
> +
> +       git -C repo checkout -b otherbranch HEAD^^ &&
> +       printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
> +       git -C repo commit -a -m "add 11 in another branch" &&
> +
> +       printf "Line %d\n" $(test_seq 1 12) >repo/file.txt &&
> +       git -C repo commit -a -m "add 12 in another branch" &&
> +
> +       # Regular rebase fails, because the 1-11 commit is deduplicated
> +       test_must_fail git -C repo rebase --merge master 2> err &&
> +       test_i18ngrep "error: could not apply.*add 12 in another branch" err &&
> +       git -C repo rebase --abort &&
> +
> +       # With --no-skip-already-present, it works
> +       git -C repo rebase --merge --no-skip-already-present master
> +'
> +
> +test_expect_success '--no-skip-already-present refrains from reading unneeded blobs' '
> +       git init server &&
> +
> +       # O(1-10) -- O(1-11) -- O(1-12) master
> +       #        \
> +       #         -- O(0-10) otherbranch
> +
> +       printf "Line %d\n" $(test_seq 1 10) >server/file.txt &&
> +       git -C server add file.txt &&
> +       git -C server commit -m "merge base" &&
> +
> +       printf "Line %d\n" $(test_seq 1 11) >server/file.txt &&
> +       git -C server commit -a -m "add 11" &&
> +
> +       printf "Line %d\n" $(test_seq 1 12) >server/file.txt &&
> +       git -C server commit -a -m "add 12" &&
> +
> +       git -C server checkout -b otherbranch HEAD^^ &&
> +       printf "Line %d\n" $(test_seq 0 10) >server/file.txt &&
> +       git -C server commit -a -m "add 0" &&
> +
> +       test_config -C server uploadpack.allowfilter 1 &&
> +       test_config -C server uploadpack.allowanysha1inwant 1 &&
> +
> +       git clone --filter=blob:none "file://$(pwd)/server" client &&
> +       git -C client checkout origin/master &&
> +       git -C client checkout origin/otherbranch &&
> +
> +       # Sanity check to ensure that the blobs from the merge base and "add
> +       # 11" are missing
> +       git -C client rev-list --objects --all --missing=print >missing_list &&
> +       MERGE_BASE_BLOB=$(git -C server rev-parse master^^:file.txt) &&
> +       ADD_11_BLOB=$(git -C server rev-parse master^:file.txt) &&
> +       grep "\\?$MERGE_BASE_BLOB" missing_list &&
> +       grep "\\?$ADD_11_BLOB" missing_list &&
> +
> +       git -C client rebase --merge --no-skip-already-present origin/master &&
> +
> +       # The blob from the merge base had to be fetched, but not "add 11"
> +       git -C client rev-list --objects --all --missing=print >missing_list &&
> +       ! grep "\\?$MERGE_BASE_BLOB" missing_list &&
> +       grep "\\?$ADD_11_BLOB" missing_list
> +'
> +
>  test_done
> --
> 2.25.1.481.gfbce0eb801-goog

Should there be a config setting to flip the default?  And should
feature.experimental and/or feature.manyFiles enable it by default?

* Re: [PATCH] rebase --merge: optionally skip upstreamed commits
  2020-03-10 18:56 ` Elijah Newren
@ 2020-03-10 22:56   ` Jonathan Tan
  2020-03-12 18:04     ` Jonathan Tan
  0 siblings, 1 reply; 78+ messages in thread
From: Jonathan Tan @ 2020-03-10 22:56 UTC (permalink / raw)
  To: newren; +Cc: jonathantanmy, git, stolee, git

> On Mon, Mar 9, 2020 at 1:58 PM Jonathan Tan <jonathantanmy@google.com> wrote:
> >
> > When rebasing against an upstream that has had many commits since the
> > original branch was created:
> >
> >  O -- O -- ... -- O -- O (upstream)
> >   \
> >    -- O (my-dev-branch)
> >
> > because "git rebase" attempts to exclude commits that are duplicates of
> > upstream ones, it must read the contents of every novel upstream commit,
> > in addition to the tip of the upstream and the merge base. This can be a
> > significant performance hit, especially in a partial clone, wherein a
> > read of an object may end up being a fetch.
> 
> Does this suggest that the cherry-pick detection is suboptimal and
> needs to be improved?  When rebasing, it is typical that you are just
> rebasing a small number of patches compared to how many exist
> upstream.  As such, any upstream patch modifying files outside the set
> of files modified on the rebased side is known to not be PATCHSAME
> without looking at those new files.

That's true - and this would drastically reduce the fetches necessary in
partial clone, perhaps enough that we no longer need this check.

In the absence of partial clone, this also might improve performance
sufficiently, such that we no longer need my new option. (Or it might
not.)

> Or is the issue just that the
> number of upstream commits that modify only the files also modified on
> the rebased side is large?
> 
> > Add a flag to "git rebase" to allow suppression of this feature. This
> > flag only works when using the "merge" backend.
> 
> Interesting.  A little over a year ago we discussed not only making
> such a change in behavior, but making it the default; see
> https://public-inbox.org/git/nycvar.QRO.7.76.6.1901211635080.41@tvgsbejvaqbjf.bet/
> (from "Ooh, that's interesting" to "VFS for Git").

Thanks for the pointer! Dscho did propose again to make it the default
[1] and I replied that this can be done later [2].

[1] https://lore.kernel.org/git/nycvar.QRO.7.76.6.2003101315100.46@tvgsbejvaqbjf.bet/
[2] https://lore.kernel.org/git/20200310160035.20252-1-jonathantanmy@google.com/

> Yes, I've wanted to kill that performance overhead too even without
> partial clones, though I was slightly worried about cases of a known
> cherry-pick no longer cleanly applying and thus forcing the user to
> detect that it has become empty.  I guess that's why it's a flag
> instead of the new default, but there's something inside of me that
> asks why this special case is detected for the user when other
> conflict cases aren't...  Not sure if I'm just pigeonholing on
> performance too much.

I haven't dug into this, but the email you linked [3] shows that this
behavior was once-upon-a-time relied upon ("For example, when I did not
use GitGitGadget yet to submit patches..."). So I don't think we should
change it.

[3] https://public-inbox.org/git/nycvar.QRO.7.76.6.1901211635080.41@tvgsbejvaqbjf.bet/


> > I've only implemented this for the "merge" backend since I think that
> > there is an effort to migrate "rebase" to use the "merge" backend by
> > default, and also because "merge" uses diff internally which already has
> > the (per-commit) blob batch prefetching.
> 
> I understand the first half of your reason here, but I don't follow
> the second half.  The apply backend uses diff to generate the patches,
> but diff isn't the relevant operation here; it's the rev-list walking,
> and both call the exact same rev-list walk the last time I checked so
> I'm not sure what the difference is here.  Am I misunderstanding one
> or more things?

Maybe just ignore the second half :-)

I thought and wrote the second half because I noticed that somewhere in
the "am"-related code, blobs were being fetched one by one, but no such
thing was happening when I used the "merge" backend. The rev-list
walking doesn't access blobs, I believe.

> > +--skip-already-present::
> > +--no-skip-already-present::
> > +       Skip commits that are already present in the new upstream.
> > +       This is the default.
> > ++
> > +If the skip-if-already-present feature is unnecessary or undesired,
> > +`--no-skip-already-present` may improve performance since it avoids
> > +the need to read the contents of every commit in the new upstream.
> > +
> 
> I'm afraid the naming might be pretty opaque and confusing to users.
> Even if we do keep the names, it might help to be clearer about the
> ramifications.  And there's a missing reference to the option
> incompatibility.  Perhaps something like:
> 
> --skip-cherry-pick-detection
> --no-skip-cherry-pick-detection
> 
>     Whether rebase tries to determine if commits are already present
> upstream, i.e. if there are commits which are cherry-picks.  If such
> detection is done, any commits being rebased which are cherry-picks
> will be dropped, since those commits are already found upstream.  If
> such detection is not done, those commits will be re-applied, which
> most likely will result in no new changes (as the changes are already
> upstream) and result in the commit being dropped anyway.  cherry-pick
> detection is the default, but can be expensive in repos with a large
> number of upstream commits that need to be read.
> 
> See also INCOMPATIBLE OPTIONS below.

I understand that commits being already present in upstream is usually
due to cherry-picking, but I don't think that's always the case, so
perhaps there is some imprecision here. But this might be better - in
particular, documentation and code will not be so clumsy (the "no", or
0, is the status quo, and the lack of "no", or 1, requires special
handling).

> > +       if (!options.skip_already_present && !is_interactive(&options))
> > +               die(_("--no-skip-already-present does not work with the 'am' backend"));
> > +
> 
> with the *apply* backend, not the 'am' one (the backend was renamed in
> commit 10cdb9f38a ("rebase: rename the two primary rebase backends",
> 2020-02-15))

Thanks. Will do.

> Should there be a config setting to flip the default?  And should
> feature.experimental and/or feature.manyFiles enable it by default?

As above, I think this can be done in a separate patch.

* Re: [PATCH] rebase --merge: optionally skip upstreamed commits
  2020-03-10 22:56   ` Jonathan Tan
@ 2020-03-12 18:04     ` Jonathan Tan
  2020-03-12 22:40       ` Elijah Newren
  0 siblings, 1 reply; 78+ messages in thread
From: Jonathan Tan @ 2020-03-12 18:04 UTC (permalink / raw)
  To: jonathantanmy; +Cc: newren, git, stolee, git

> > Does this suggest that the cherry-pick detection is suboptimal and
> > needs to be improved?  When rebasing, it is typical that you are just
> > rebasing a small number of patches compared to how many exist
> > upstream.  As such, any upstream patch modifying files outside the set
> > of files modified on the rebased side is known to not be PATCHSAME
> > without looking at those new files.
> 
> That's true - and this would drastically reduce the fetches necessary in
> partial clone, perhaps enough that we no longer need this check.
> 
> In the absence of partial clone, this also might improve performance
> sufficiently, such that we no longer need my new option. (Or it might
> not.)

I took a further look at this. patch-ids.c and its caller
(cherry_pick_list() in revision.c) implement duplicate checking by first
generating full diff outputs for the commits in the shorter side,
putting them in a hashmap keyed by the SHA-1 of the diff output (and
values being the commit itself), and then generating full diff outputs
for the commits in the longer side and checking them against the
hashmap. When processing the shorter side, we could also generate
filename-only diffs and put their hashes into a hashset; so when
processing the longer side, we could generate the filename-only diff
first (without reading any blobs) and check it against our new
hashset, and only if it appears there would we generate the full diff
(thus reading blobs).

One issue with this is unpredictability to the user (since which blobs
get read depends on which side is longer), but that seems resolvable by
not doing any length checks but always reading the blobs on the right
side (that is, the non-upstream side).
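
To make the proposal concrete, here is a toy model in Python (not git's
C code, and not how patch-ids.c is actually written). Everything in it
is an assumption made for illustration: the BlobStore class, the commit
dictionaries, and the simplified patch ID, which hashes only the added
and removed lines of a difflib diff rather than what git really hashes.
It always reads blobs for the non-upstream (topic) side, and uses the
filename-only key to decide whether an upstream commit's blobs need to
be read at all.

```python
import difflib
import hashlib


class BlobStore:
    """Hypothetical object store that counts blob reads.

    In a partial clone, each read could turn into a fetch."""

    def __init__(self, blobs):
        self.blobs = blobs
        self.reads = 0

    def read(self, blob_id):
        self.reads += 1
        return self.blobs[blob_id]


def path_key(commit):
    # Hash of the filename-only diff: needs the changed paths, no blobs.
    paths = sorted(commit["changes"])
    return hashlib.sha1("\0".join(paths).encode()).hexdigest()


def full_patch_id(commit, store):
    # Simplified stand-in for a patch ID: hash each path plus the
    # added/removed lines of its diff. This reads two blobs per path.
    h = hashlib.sha1()
    for path in sorted(commit["changes"]):
        old_id, new_id = commit["changes"][path]
        old = store.read(old_id).splitlines()
        new = store.read(new_id).splitlines()
        # Drop the two "---"/"+++" header lines that unified_diff emits.
        diff = list(difflib.unified_diff(old, new, lineterm=""))[2:]
        h.update(path.encode() + b"\0")
        for line in diff:
            if line.startswith(("+", "-")):
                h.update(line.encode() + b"\n")
    return h.hexdigest()


def find_duplicates(topic, upstream, store):
    """Return names of topic commits whose patch already exists upstream."""
    path_keys = set()
    patch_ids = {}
    # Always read blobs on the non-upstream side, regardless of length.
    for commit in topic:
        path_keys.add(path_key(commit))
        patch_ids[full_patch_id(commit, store)] = commit["name"]
    duplicates = set()
    for commit in upstream:
        # Cheap pre-check: a different set of paths cannot be the same patch.
        if path_key(commit) not in path_keys:
            continue
        patch_id = full_patch_id(commit, store)
        if patch_id in patch_ids:
            duplicates.add(patch_ids[patch_id])
    return duplicates


store = BlobStore({
    "a0": "one\n", "a1": "one\ntwo\n",
    "b0": "x\n", "b1": "x\ny\n",
    "c0": "p\n", "c1": "p\nq\n",
})
topic = [{"name": "T1", "changes": {"a.txt": ("a0", "a1")}}]
upstream = [
    {"name": "U1", "changes": {"b.txt": ("b0", "b1")}},
    {"name": "U2", "changes": {"c.txt": ("c0", "c1")}},
    {"name": "U3", "changes": {"a.txt": ("a0", "a1")}},
]
dups = find_duplicates(topic, upstream, store)
print(sorted(dups), store.reads)
```

In this example only one of the three upstream commits touches the same
set of files as the topic commit, so the model reads 4 blobs where a
full diff of every commit would have read 8.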

So I would say that yes, the cherry-pick detection is suboptimal and
could be improved. So the question is...what to do with my patch? An
argument could be made that my patch should be dropped because an
improvement in cherry-pick detection would eliminate the need for the
option I'm introducing in my patch, but after some thought, I think that
this option will still be useful even with improved cherry-pick detection. If we
move in a direction where not only blobs but also trees (or even
commits) are omitted, we'll definitely want this new option. And even if
a user is not using partial clone at all, I think it is still useful both
to suppress the filtering of commits (e.g. when upstream has a commit and
then its revert, it would be reasonable to cherry-pick the same commit on
top) and to reduce disk reads (although I don't know if this will be the
case in practice).
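
The commit-then-revert case can be illustrated with a small Python
model. This is only a sketch under loud assumptions: a "patch" here is
a hypothetical (operation, line) tuple that doubles as its own patch
ID, and rebase() is a made-up function, not git's sequencer. It shows
how skipping already-present patches drops a change that upstream has
since reverted, while replaying everything brings the change back.

```python
def apply_patch(state, patch):
    """Apply a toy patch to a list of lines and return the new list."""
    op, line = patch
    if op == "add":
        # Adding a line that is already there changes nothing, which
        # models a commit that becomes empty when re-applied.
        return state if line in state else state + [line]
    return [existing for existing in state if existing != line]


def rebase(topic, upstream, base, skip_already_present):
    """Replay topic patches on top of upstream; return (todo, result)."""
    tip = base
    for patch in upstream:
        tip = apply_patch(tip, patch)
    # With detection on, any topic patch that also appears upstream is
    # dropped from the todo list, even if upstream later reverted it.
    upstream_ids = set(upstream) if skip_already_present else set()
    todo = [patch for patch in topic if patch not in upstream_ids]
    for patch in todo:
        tip = apply_patch(tip, patch)
    return todo, tip


base = ["Line 1"]
fix = ("add", "Line 2")
revert_of_fix = ("remove", "Line 2")

upstream = [fix, revert_of_fix]   # upstream applied the fix, then reverted it
topic = [fix]                     # the topic branch carries the same fix

with_detection = rebase(topic, upstream, base, skip_already_present=True)
without_detection = rebase(topic, upstream, base, skip_already_present=False)
print(with_detection)
print(without_detection)
```

With detection enabled the todo list ends up empty and the fix is lost;
with it disabled the fix is re-applied on top of the revert.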

* Re: [PATCH] rebase --merge: optionally skip upstreamed commits
  2020-03-12 18:04     ` Jonathan Tan
@ 2020-03-12 22:40       ` Elijah Newren
  2020-03-14  8:04         ` Elijah Newren
  0 siblings, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2020-03-12 22:40 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Git Mailing List, Derrick Stolee, Jeff Hostetler

On Thu, Mar 12, 2020 at 11:04 AM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> > > Does this suggest that the cherry-pick detection is suboptimal and
> > > needs to be improved?  When rebasing, it is typical that you are just
> > > rebasing a small number of patches compared to how many exist
> > > upstream.  As such, any upstream patch modifying files outside the set
> > > of files modified on the rebased side is known to not be PATCHSAME
> > > without looking at those new files.
> >
> > That's true - and this would drastically reduce the fetches necessary in
> > partial clone, perhaps enough that we no longer need this check.
> >
> > In the absence of partial clone, this also might improve performance
> > sufficiently, such that we no longer need my new option. (Or it might
> > not.)
>
> I took a further look at this. patch-ids.c and its caller
> (cherry_pick_list() in revision.c) implement duplicate checking by first
> generating full diff outputs for the commits in the shorter side,
> putting them in a hashmap keyed by the SHA-1 of the diff output (and
> values being the commit itself), and then generating full diff outputs
> for the commits in the longer side and checking them against the
> hashmap. When processing the shorter side, we could also generate
> filename-only diffs and put their hashes into a hashset; so when
> processing the longer side, we could generate the filename-only diff
> first (without reading any blobs) and check it against our new
> hashset, and only if it appears there would we generate the full diff
> (thus reading blobs).
>
> One issue with this is unpredictability to the user (since which blobs
> get read depends on which side is longer), but that seems resolvable by
> not doing any length checks but always reading the blobs on the right
> side (that is, the non-upstream side).
>
> So I would say that yes, the cherry-pick detection is suboptimal and
> could be improved.

Sweet, thanks for doing this investigative work!  Sounds promising.

> So the question is...what to do with my patch? An
> argument could be made that my patch should be dropped because an
> improvement in cherry-pick detection would eliminate the need for the
> option I'm introducing in my patch, but after some thought, I think that
> this option will still be useful even with improved cherry-pick detection.

The option may be totally justified here, but can I go on a possible
tangent and vent for a little bit?  We seem to introduce options an
awful lot.  While options can be valuable they also have a cost, and I
think we tend not to acknowledge that.  Some of the negatives:

- Developers add options when they run into bugs instead of fixing
bugs (usually they don't realize that the behavior in question is a
bug, but that's exacerbated by a willingness to add options and not
consider costs)
- Developers add options without considering combinations of options
and what they mean (though it's hard to fault them because considering
combinations becomes harder and harder with the more options we have;
it's a vicious cycle)
- Growth in number of options leads to code that is hard or impossible
to refactor based on a maze of competing options with myriad edge and
corner cases that are fundamentally broken
- Users get overloaded by the sheer number of options and minor distinctions

The fourth case is probably obvious, so let me just include some
examples of the first three cases above:
* Commits b00bf1c9a8 ("git-rebase: make --allow-empty-message the
default", 2018-06-27), 22a69fda19 ("git-rebase.txt: update description
of --allow-empty-message", 2020-01-16), and d48e5e21da ("rebase
(interactive-backend): make --keep-empty the default", 2020-02-15)
noted that options that previously existed were just workarounds to
buggy behavior and the flags should have been always on.
* Commit e86bbcf987 ("clean: disambiguate the definition of -d",
2019-09-17) showed a pretty hairy case where the combination of
options led to cases where I not only didn't know how to implement
correct behavior, I didn't even know how to state what the desired
behavior for end-users was.  Despite a few different reports over a
year and a half that I had a series that fixed some known issues for
users, the series languished because I couldn't get an answer on what
was right.  See also
https://lore.kernel.org/git/20190905154735.29784-1-newren@gmail.com/
* See the huge "Behavioral differences" section of the git-rebase
manpage, and a combination of rants from me on dir.c:
  - https://lore.kernel.org/git/20190905154735.29784-1-newren@gmail.com/
  - https://lore.kernel.org/git/CABPp-BFG3FkTkC=L1v97LUksndkOmCN8ZhNJh5eoNdquE7v9DA@mail.gmail.com/
  - https://lore.kernel.org/git/pull.676.v3.git.git.1576571586.gitgitgadget@gmail.com/
  - The commit message of
https://lore.kernel.org/git/d3136ef52f3306d465a5a6004cdc9ba5b1ae4148.1580495486.git.gitgitgadget@gmail.com/

>  If we
> move in a direction where not only blobs but also trees (or even
> commits) are omitted, we'll definitely want this new option.

Why would this new option be needed if we omitted trees?   If trees
are omitted based on something like sparse-checkouts, then they are
omitted based on path; shouldn't we be able to avoid walking trees
just by noting they modified some path outside a requested sparse
checkout?

I want grep, log, etc. to behave within the cone of a sparse checkout,
which means that I need trees of upstream branches within the relevant
paths anyway.  But theoretically I should certainly be able to avoid
walking trees outside those paths.
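
As a rough illustration of that idea, here is a toy tree diff in
Python that only descends into directories relevant to the sparse
cone. It is a sketch, not git's tree-walking code: trees are modeled
as nested dicts, comparing two dicts stands in for comparing tree
object IDs, and the names relevant() and diff_in_cone() as well as the
"skipped" counter are all hypothetical.

```python
def relevant(dir_path, cone_dirs):
    """True if dir_path is inside a cone directory or is an ancestor of one."""
    d = dir_path + "/"
    return any(
        d.startswith(cone + "/") or (cone + "/").startswith(d)
        for cone in cone_dirs
    )


def diff_in_cone(old, new, cone_dirs, stats, prefix=""):
    """Return changed file paths, skipping subtrees outside the cone."""
    changed = []
    for name in sorted(set(old) | set(new)):
        a, b = old.get(name), new.get(name)
        if a == b:
            continue  # same entry on both sides: nothing to walk
        path = prefix + name
        if isinstance(a, dict) or isinstance(b, dict):
            if not relevant(path, cone_dirs):
                # The subtree changed, but it is outside the cone, so we
                # note that and never look inside it.
                stats["skipped"] += 1
                continue
            changed += diff_in_cone(
                a if isinstance(a, dict) else {},
                b if isinstance(b, dict) else {},
                cone_dirs, stats, path + "/",
            )
        else:
            changed.append(path)
    return changed


old_tree = {
    "docs": {"a.txt": "1"},
    "src": {"lib": {"x.c": "1"}, "app": {"main.c": "1"}},
}
new_tree = {
    "docs": {"a.txt": "2"},
    "src": {"lib": {"x.c": "2"}, "app": {"main.c": "2"}},
}
stats = {"skipped": 0}
changed = diff_in_cone(old_tree, new_tree, ["src/lib"], stats)
print(changed, stats)
```

Here the changes under docs/ and src/app/ are detected at the parent
level but never walked, so only src/lib/x.c is reported.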

>  And even if
> a user is not using partial clone at all, I think it is still useful both
> to suppress the filtering of commits (e.g. when upstream has a commit and
> then its revert, it would be reasonable to cherry-pick the same commit on
> top) and to reduce disk reads (although I don't know if this will be the
> case in practice).

That sounds like yet another argument that the behavior you're arguing
for should be the default, not a flag we make the users pick to
work around bugs.  Yes, sometimes weird behaviors beget use cases (cue
link to xkcd's comic on emacs spacebar overheating) and we need to
provide transition plans, but I think this might be a case where
transitioning makes sense.  From a high level, here's my guess
(emphasis on guess) at the history:

* am checked for upstream patches, because apply would get confused
trying to apply an already applied patch
* legacy-interactive-rebase would check for upstream patches as a
performance optimization because having to shell out to a separate
cherry-pick process for each commit is slow (and may have also been
done partially to match am, even though am did it as a workaround)

And now we're in the state where:
* The check-for-upstream bits hurt performance, significantly enough
that we have three different reports of folks not liking it (you, me,
and Taylor)
* It actively does the wrong thing in cases such as revert + re-apply
sequences, which exist in practice (and exist a lot more than they
should, but they absolutely do exist)

We've made changes in other places (e.g. opening an editor for merge
or rebase, push.default, etc.); is there any reason a similar change
wouldn't be justified here?

* Re: [PATCH] rebase --merge: optionally skip upstreamed commits
  2020-03-12 22:40       ` Elijah Newren
@ 2020-03-14  8:04         ` Elijah Newren
  2020-03-17  3:03           ` Jonathan Tan
  0 siblings, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2020-03-14  8:04 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Git Mailing List, Derrick Stolee, Jeff Hostetler

On Thu, Mar 12, 2020 at 3:40 PM Elijah Newren <newren@gmail.com> wrote:
>
>  On Thu, Mar 12, 2020 at 11:04 AM Jonathan Tan <jonathantanmy@google.com> wrote:
> >
> > > > Does this suggest that the cherry-pick detection is suboptimal and
> > > > needs to be improved?  When rebasing, it is typical that you are just
> > > > rebasing a small number of patches compared to how many exist
> > > > upstream.  As such, any upstream patch modifying files outside the set
> > > > of files modified on the rebased side is known to not be PATCHSAME
> > > > without looking at those new files.
> > >
> > > That's true - and this would drastically reduce the fetches necessary in
> > > partial clone, perhaps enough that we no longer need this check.
> > >
> > > In the absence of partial clone, this also might improve performance
> > > sufficiently, such that we no longer need my new option. (Or it might
> > > not.)
> >
> > I took a further look at this. patch-ids.c and its caller
> > (cherry_pick_list() in revision.c) implement duplicate checking by first
> > generating full diff outputs for the commits in the shorter side,
> > putting them in a hashmap keyed by the SHA-1 of the diff output (and
> > values being the commit itself), and then generating full diff outputs
> > for the commits in the longer side and checking them against the
> > hashmap. When processing the shorter side, we could also generate
> > filename-only diffs and put their hashes into a hashset; so when
> > processing the longer side, we could generate the filename-only diff
> > first (without reading any blobs) and check it against our new
> > hashset, and only if it appears there would we generate the full diff
> > (thus reading blobs).
> >
> > One issue with this is unpredictability to the user (since which blobs
> > get read depends on which side is longer), but that seems resolvable by
> > not doing any length checks but always reading the blobs on the right
> > side (that is, the non-upstream side).
> >
> > So I would say that yes, the cherry-pick detection is suboptimal and
> > could be improved.
>
> Sweet, thanks for doing this investigative work!  Sounds promising.
>
> > So the question is...what to do with my patch? An
> > argument could be made that my patch should be dropped because an
> > improvement in cherry-pick detection would eliminate the need for the
> > option I'm introducing in my patch, but after some thought, I think that
> > this option will still be useful even with cherry-pick detection.
>
> The option may be totally justified here, but can I go on a possible
> tangent and vent for a little bit?  We seem to introduce options an
> awful lot.  While options can be valuable they also have a cost, and I
> think we tend not to acknowledge that.  Some of the negatives:
>
> - Developers add options when they run into bugs instead of fixing
> bugs (usually they don't realize that the behavior in question is a
> bug, but that's exacerbated by a willingness to add options and not
> consider costs)
> - Developers add options without considering combinations of options
> and what they mean (though it's hard to fault them because considering
> combinations becomes harder and harder with the more options we have;
> it's a negative feedback cycle)
> - Growth in number of options leads to code that is hard or impossible
> to refactor based on a maze of competing options with myriad edge and
> corner cases that are fundamentally broken
> - Users get overloaded by the sheer number of options and minor distinctions
>
> The fourth case is probably obvious, so let me just include some
> examples of the first three cases above:
> * Commits b00bf1c9a8 ("git-rebase: make --allow-empty-message the
> default", 2018-06-27), 22a69fda19 ("git-rebase.txt: update description
> of --allow-empty-message", 2020-01-16), and d48e5e21da ("rebase
> (interactive-backend): make --keep-empty the default", 2020-02-15)
> noted that options that previously existed were just workarounds to
> buggy behavior and the flags should have been always on.
> * Commit e86bbcf987 ("clean: disambiguate the definition of -d",
> 2019-09-17) showed a pretty hairy case where the combination of
> options led to cases where I not only didn't know how to implement
> correct behavior, I didn't even know how to state what the desired
> behavior for end-users was.  Despite a few different reports over a
> year and a half that I had a series that fixed some known issues for
> users the series languished because I couldn't get an answer on what
> was right.  See also
> https://lore.kernel.org/git/20190905154735.29784-1-newren@gmail.com/
> * See the huge "Behavioral differences" section of the git-rebase
> manpage, and a combination of rants from me on dir.c:
>   - https://lore.kernel.org/git/20190905154735.29784-1-newren@gmail.com/
>   - https://lore.kernel.org/git/CABPp-BFG3FkTkC=L1v97LUksndkOmCN8ZhNJh5eoNdquE7v9DA@mail.gmail.com/
>   - https://lore.kernel.org/git/pull.676.v3.git.git.1576571586.gitgitgadget@gmail.com/
>   - The commit message of
> https://lore.kernel.org/git/d3136ef52f3306d465a5a6004cdc9ba5b1ae4148.1580495486.git.gitgitgadget@gmail.com/
>
> >  If we
> > move in a direction where not only blobs but also trees (or even
> > commits) are omitted, we'll definitely want this new option.
>
> Why would this new option be needed if we omitted trees?   If trees
> are omitted based on something like sparse-checkouts, then they are
> omitted based on path; shouldn't we be able to avoid walking trees
> just by noting they modified some path outside a requested sparse
> checkout?
>
> I want grep, log, etc. to behave within the cone of a sparse checkout,
> which means that I need trees of upstream branches within the relevant
> paths anyway.  But theoretically I should certainly be able to avoid
> walking trees outside those paths.
>
> >  And even if
> > a user is not using partial clone at all, I think it is still useful to
> > suppress both the filtering of commits (e.g. when upstream has a commit
> > then revert, it would be reasonable to cherry-pick the same commit on
> > top) and reduce disk reads (although I don't know if this will be the
> > case in practice).
>
> That sounds like yet another argument that the behavior you're arguing
> for should be the default, not a flag we make the users pick to
> workaround bugs.  Yes, sometimes weird behaviors beget usecases (cue
> link to xkcd's comic on emacs spacebar overheating) and we need to
> provide transition plans, but I think this might be a case where
> transitioning makes sense.  From a high level, here's my guess
> (emphasis on guess) at the history:
>
> * am checked for upstream patches, because apply would get confused
> trying to apply an already applied patch
> * legacy-interactive-rebase would check for upstream patches as a
> performance optimization because having to shell out to a separate
> cherry-pick process for each commit is slow (and may have also been
> done partially to match am, even though am did it as a workaround)

Or maybe that wasn't the reasoning?  I'm having a hard time parsing the
history to verify:

a6ec3c1599 ("git-rebase: Use --ignore-if-in-upstream option when
            executing git-format-patch.", 2006-10-03)
  '''This reduces the number of conflicts when rebasing after a series of
     patches to the same piece of code is committed upstream.'''

96ffe892e3 ("rebase -i: ignore patches that are already in the
            upstream", 2007-08-01)
  '''Non-interactive rebase had this from the beginning -- match it by
     using --cherry-pick option to rev-list.'''

and related: 1e0dacdbdb ("rebase: omit patch-identical commits with
    --fork-point", 2014-07-16)

> And now we're in the state where:
> * The check-for-upstream bits hurt performance, significantly enough
> that we have three different reports of folks not liking it (you, me,
> and Taylor)
> * It actively does the wrong thing in cases such as revert + re-apply
> sequences, which exist in practice (and exist a lot more than they
> should, but they absolutely do exist)

Though these are definitely still problems with
check-if-upstream-already behavior.

> We've made changes in other places (e.g. opening an editor for merge
> or rebase, push.default, etc.); is there any reason a similar change
> wouldn't be justified here?

After another day of thought, and my attempt to figure out the reason
above, perhaps my assumptions about the reason behind the original
behavior, and my assumptions about the sanity of switching the default
might not be as grounded as I thought.  Thus, my worries about yet
another flag may be overstated as well, at least in this case.


* Re: [PATCH] rebase --merge: optionally skip upstreamed commits
  2020-03-14  8:04         ` Elijah Newren
@ 2020-03-17  3:03           ` Jonathan Tan
  0 siblings, 0 replies; 78+ messages in thread
From: Jonathan Tan @ 2020-03-17  3:03 UTC (permalink / raw)
  To: newren; +Cc: jonathantanmy, git, stolee, git

> > We've made changes in other places (e.g. opening an editor for merge
> > or rebase, push.default, etc.); is there any reason a similar change
> > wouldn't be justified here?
> 
> After another day of thought, and my attempt to figure out the reason
> above, perhaps my assumptions about the reason behind the original
> behavior, and my assumptions about the sanity of switching the default
> might not be as grounded as I thought.  Thus, my worries about yet
> another flag may be overstated as well, at least in this case.

Thanks for the further investigation. For what it's worth, I do agree
that we should think of the cost of introducing an option before
introducing it. Admittedly in my write-up [1], I mentioned my
investigation into speeding up existing behavior enough to not need my
new feature, but didn't mention the possibility of just changing the
existing behavior. (But it seemed to me that this existing behavior is
presented as a desirable feature, so I didn't think of changing it.)

[1] https://lore.kernel.org/git/20200312180427.192096-1-jonathantanmy@google.com/

This question (from your other email [2]) is probably moot if we're
going to introduce this option anyway, but just to answer it:

> >  If we
> > move in a direction where not only blobs but also trees (or even
> > commits) are omitted, we'll definitely want this new option.
> 
> Why would this new option be needed if we omitted trees?   If trees
> are omitted based on something like sparse-checkouts, then they are
> omitted based on path; shouldn't we be able to avoid walking trees
> just by noting they modified some path outside a requested sparse
> checkout?
> 
> I want grep, log, etc. to behave within the cone of a sparse checkout,
> which means that I need trees of upstream branches within the relevant
> paths anyway.  But theoretically I should certainly be able to avoid
> walking trees outside those paths.

I haven't given much thought to it, but the diffing mechanism will need
to receive a whitelist of paths and, if it ever needs to traverse
outside those, will need to abort with "there's a difference outside
this whitelist". I don't know if it supports such a thing now.
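
A minimal sketch of that idea, not Git's diff machinery: walk the
changed paths and abort at the first one outside the allowed set. The
names `OutsideAllowlist` and `changed_paths_within` are hypothetical,
and the input is assumed to be a plain list of changed paths rather
than real tree objects.

```python
class OutsideAllowlist(Exception):
    """Raised when a changed path falls outside the allowed prefixes."""


def changed_paths_within(changed_paths, allowed_prefixes):
    """Return the changed paths, aborting at the first one outside the allowlist.

    Hypothetical helper for illustration only. A prefix matches a path
    when the path equals it or lives underneath it as a directory.
    """
    seen = []
    for path in changed_paths:
        inside = any(
            path == prefix or path.startswith(prefix.rstrip("/") + "/")
            for prefix in allowed_prefixes
        )
        if not inside:
            # Stop immediately: nothing past this point needs to be read.
            raise OutsideAllowlist(path)
        seen.append(path)
    return seen
```

A caller that catches `OutsideAllowlist` already knows the two commits
differ outside the requested paths and can stop without reading
anything further.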

[2] https://lore.kernel.org/git/CABPp-BE83ZhezkgmwatxAhqh4rptMUggcjSwBeiSByyPTUi6Lw@mail.gmail.com/

I'll give some time for others to respond, and will send out a v2 with
your and Taylor's suggestions implemented.


* [PATCH v2] rebase --merge: optionally skip upstreamed commits
  2020-03-09 20:55 [PATCH] rebase --merge: optionally skip upstreamed commits Jonathan Tan
                   ` (2 preceding siblings ...)
  2020-03-10 18:56 ` Elijah Newren
@ 2020-03-18 17:30 ` Jonathan Tan
  2020-03-18 18:47   ` Junio C Hamano
                     ` (3 more replies)
  2020-03-30  4:06 ` [PATCH v3] rebase --merge: optionally skip upstreamed commits Jonathan Tan
  4 siblings, 4 replies; 78+ messages in thread
From: Jonathan Tan @ 2020-03-18 17:30 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, me, Johannes.Schindelin, newren

When rebasing against an upstream that has had many commits since the
original branch was created:

 O -- O -- ... -- O -- O (upstream)
  \
   -- O (my-dev-branch)

it must read the contents of every novel upstream commit, in addition to
the tip of the upstream and the merge base, because "git rebase"
attempts to exclude commits that are duplicates of upstream ones. This
can be a significant performance hit, especially in a partial clone,
wherein a read of an object may end up being a fetch.

Add a flag to "git rebase" to allow suppression of this feature. This
flag only works when using the "merge" backend.

This flag changes the behavior of sequencer_make_script(), called from
do_interactive_rebase() <- run_rebase_interactive() <-
run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
(indirectly called from sequencer_make_script() through
prepare_revision_walk()) will no longer call cherry_pick_list(), and
thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
means that the intermediate commits in upstream are no longer read (as
shown by the test) and means that no PATCHSAME-caused skipping of
commits is done by sequencer_make_script(), either directly or through
make_script_with_merges().

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
New in V2: changed parameter name, used Taylor's commit message
suggestions, and used Elijah's documentation suggestions.
---
 Documentation/git-rebase.txt | 20 +++++++++-
 builtin/rebase.c             |  7 ++++
 sequencer.c                  |  3 +-
 sequencer.h                  |  2 +-
 t/t3402-rebase-merge.sh      | 77 ++++++++++++++++++++++++++++++++++++
 5 files changed, 106 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 0c4f038dd6..4629eb573f 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -318,6 +318,20 @@ See also INCOMPATIBLE OPTIONS below.
 +
 See also INCOMPATIBLE OPTIONS below.
 
+--skip-cherry-pick-detection::
+--no-skip-cherry-pick-detection::
+	Whether rebase tries to determine if commits are already present
+	upstream, i.e. if there are commits which are cherry-picks.  If such
+	detection is done, any commits being rebased which are cherry-picks
+	will be dropped, since those commits are already found upstream.  If
+	such detection is not done, those commits will be re-applied, which
+	most likely will result in no new changes (as the changes are already
+	upstream) and result in the commit being dropped anyway.  cherry-pick
+	detection is the default, but can be expensive in repos with a large
+	number of upstream commits that need to be read.
++
+See also INCOMPATIBLE OPTIONS below.
+
 --rerere-autoupdate::
 --no-rerere-autoupdate::
 	Allow the rerere mechanism to update the index with the
@@ -568,6 +582,9 @@ In addition, the following pairs of options are incompatible:
  * --keep-base and --onto
  * --keep-base and --root
 
+Also, the --skip-cherry-pick-detection option requires the use of the merge
+backend (e.g., through --merge).
+
 BEHAVIORAL DIFFERENCES
 -----------------------
 
@@ -866,7 +883,8 @@ Only works if the changes (patch IDs based on the diff contents) on
 'subsystem' did.
 
 In that case, the fix is easy because 'git rebase' knows to skip
-changes that are already present in the new upstream.  So if you say
+changes that are already present in the new upstream (unless
+`--skip-cherry-pick-detection` is given). So if you say
 (assuming you're on 'topic')
 ------------
     $ git rebase subsystem
diff --git a/builtin/rebase.c b/builtin/rebase.c
index 6154ad8fa5..100b8872af 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -88,6 +88,7 @@ struct rebase_options {
 	struct strbuf git_format_patch_opt;
 	int reschedule_failed_exec;
 	int use_legacy_rebase;
+	int skip_cherry_pick_detection;
 };
 
 #define REBASE_OPTIONS_INIT {			  	\
@@ -373,6 +374,7 @@ static int run_rebase_interactive(struct rebase_options *opts,
 	flags |= opts->rebase_cousins > 0 ? TODO_LIST_REBASE_COUSINS : 0;
 	flags |= opts->root_with_onto ? TODO_LIST_ROOT_WITH_ONTO : 0;
 	flags |= command == ACTION_SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;
+	flags |= opts->skip_cherry_pick_detection ? TODO_LIST_SKIP_CHERRY_PICK_DETECTION : 0;
 
 	switch (command) {
 	case ACTION_NONE: {
@@ -1507,6 +1509,8 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "reschedule-failed-exec",
 			 &reschedule_failed_exec,
 			 N_("automatically re-schedule any `exec` that fails")),
+		OPT_BOOL(0, "skip-cherry-pick-detection", &options.skip_cherry_pick_detection,
+			 N_("skip changes that are already present in the new upstream")),
 		OPT_END(),
 	};
 	int i;
@@ -1840,6 +1844,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 			      "interactive or merge options"));
 	}
 
+	if (options.skip_cherry_pick_detection && !is_interactive(&options))
+		die(_("--skip-cherry-pick-detection does not work with the 'apply' backend"));
+
 	if (options.signoff) {
 		if (options.type == REBASE_PRESERVE_MERGES)
 			die("cannot combine '--signoff' with "
diff --git a/sequencer.c b/sequencer.c
index ba90a513b9..8b2cae3b69 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -4797,12 +4797,13 @@ int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
 	int keep_empty = flags & TODO_LIST_KEEP_EMPTY;
 	const char *insn = flags & TODO_LIST_ABBREVIATE_CMDS ? "p" : "pick";
 	int rebase_merges = flags & TODO_LIST_REBASE_MERGES;
+	int skip_cherry_pick_detection = flags & TODO_LIST_SKIP_CHERRY_PICK_DETECTION;
 
 	repo_init_revisions(r, &revs, NULL);
 	revs.verbose_header = 1;
 	if (!rebase_merges)
 		revs.max_parents = 1;
-	revs.cherry_mark = 1;
+	revs.cherry_mark = !skip_cherry_pick_detection;
 	revs.limited = 1;
 	revs.reverse = 1;
 	revs.right_only = 1;
diff --git a/sequencer.h b/sequencer.h
index 393571e89a..a54ea696c2 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -149,7 +149,7 @@ int sequencer_remove_state(struct replay_opts *opts);
  * `--onto`, we do not want to re-generate the root commits.
  */
 #define TODO_LIST_ROOT_WITH_ONTO (1U << 6)
-
+#define TODO_LIST_SKIP_CHERRY_PICK_DETECTION (1U << 7)
 
 int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
 			  const char **argv, unsigned flags);
diff --git a/t/t3402-rebase-merge.sh b/t/t3402-rebase-merge.sh
index a1ec501a87..290c79e0f6 100755
--- a/t/t3402-rebase-merge.sh
+++ b/t/t3402-rebase-merge.sh
@@ -162,4 +162,81 @@ test_expect_success 'rebase --skip works with two conflicts in a row' '
 	git rebase --skip
 '
 
+test_expect_success '--skip-cherry-pick-detection' '
+	git init repo &&
+
+	# O(1-10) -- O(1-11) -- O(0-10) master
+	#        \
+	#         -- O(1-11) -- O(1-12) otherbranch
+
+	printf "Line %d\n" $(test_seq 1 10) >repo/file.txt &&
+	git -C repo add file.txt &&
+	git -C repo commit -m "base commit" &&
+
+	printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
+	git -C repo commit -a -m "add 11" &&
+
+	printf "Line %d\n" $(test_seq 0 10) >repo/file.txt &&
+	git -C repo commit -a -m "add 0 delete 11" &&
+
+	git -C repo checkout -b otherbranch HEAD^^ &&
+	printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
+	git -C repo commit -a -m "add 11 in another branch" &&
+
+	printf "Line %d\n" $(test_seq 1 12) >repo/file.txt &&
+	git -C repo commit -a -m "add 12 in another branch" &&
+
+	# Regular rebase fails, because the 1-11 commit is deduplicated
+	test_must_fail git -C repo rebase --merge master 2> err &&
+	test_i18ngrep "error: could not apply.*add 12 in another branch" err &&
+	git -C repo rebase --abort &&
+
+	# With --skip-cherry-pick-detection, it works
+	git -C repo rebase --merge --skip-cherry-pick-detection master
+'
+
+test_expect_success '--skip-cherry-pick-detection refrains from reading unneeded blobs' '
+	git init server &&
+
+	# O(1-10) -- O(1-11) -- O(1-12) master
+	#        \
+	#         -- O(0-10) otherbranch
+
+	printf "Line %d\n" $(test_seq 1 10) >server/file.txt &&
+	git -C server add file.txt &&
+	git -C server commit -m "merge base" &&
+
+	printf "Line %d\n" $(test_seq 1 11) >server/file.txt &&
+	git -C server commit -a -m "add 11" &&
+
+	printf "Line %d\n" $(test_seq 1 12) >server/file.txt &&
+	git -C server commit -a -m "add 12" &&
+
+	git -C server checkout -b otherbranch HEAD^^ &&
+	printf "Line %d\n" $(test_seq 0 10) >server/file.txt &&
+	git -C server commit -a -m "add 0" &&
+
+	test_config -C server uploadpack.allowfilter 1 &&
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+
+	git clone --filter=blob:none "file://$(pwd)/server" client &&
+	git -C client checkout origin/master &&
+	git -C client checkout origin/otherbranch &&
+
+	# Sanity check to ensure that the blobs from the merge base and "add
+	# 11" are missing
+	git -C client rev-list --objects --all --missing=print >missing_list &&
+	MERGE_BASE_BLOB=$(git -C server rev-parse master^^:file.txt) &&
+	ADD_11_BLOB=$(git -C server rev-parse master^:file.txt) &&
+	grep "\\?$MERGE_BASE_BLOB" missing_list &&
+	grep "\\?$ADD_11_BLOB" missing_list &&
+
+	git -C client rebase --merge --skip-cherry-pick-detection origin/master &&
+
+	# The blob from the merge base had to be fetched, but not "add 11"
+	git -C client rev-list --objects --all --missing=print >missing_list &&
+	! grep "\\?$MERGE_BASE_BLOB" missing_list &&
+	grep "\\?$ADD_11_BLOB" missing_list
+'
+
 test_done
-- 
2.25.1.481.gfbce0eb801-goog



* Re: [PATCH v2] rebase --merge: optionally skip upstreamed commits
  2020-03-18 17:30 ` [PATCH v2] " Jonathan Tan
@ 2020-03-18 18:47   ` Junio C Hamano
  2020-03-18 19:28     ` Jonathan Tan
  2020-03-18 20:20   ` Junio C Hamano
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2020-03-18 18:47 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, me, Johannes.Schindelin, newren

Jonathan Tan <jonathantanmy@google.com> writes:

> When rebasing against an upstream that has had many commits since the
> original branch was created:
>
>  O -- O -- ... -- O -- O (upstream)
>   \
>    -- O (my-dev-branch)
>
> it must read the contents of every novel upstream commit, in addition to
> the tip of the upstream and the merge base, because "git rebase"
> attempts to exclude commits that are duplicates of upstream ones. This
> can be a significant performance hit, especially in a partial clone,
> wherein a read of an object may end up being a fetch.

OK.  I presume that we do this by comparing patch IDs?

Total disabling is of course OK as a feature, especially for
the first cut, but I wonder if it would be a reasonable idea to use
some heuristic to keep the current "filter the same change" feature
as much as possible but optimize it by filtering the novel upstream
commits without hitting their trees and blobs (I am assuming that
you at least are aware of and have the commit objects on the
upstream side).

The most false-negative-prone approach is just to compare the
<author ident, author timestamp> of a candidate upstream commit with
what you have---if that author does not appear on my-dev-branch, it
is very unlikely that your change has been accepted upstream.  Of
course, two people independently discovering the same solution is
not all that rare, so taking too few clues from the commits does risk
false negatives, but at least it is not worse than
what you are proposing here ;-)  And if one of your commits on
my-dev-branch _might_ be identical to one of the novel upstream ones,
at that point, we could dig deeper to actually compute the patch ID
by fetching the upstream's tree.
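
A rough sketch of that prefilter, assuming only commit metadata is
available locally. `CommitMeta` and `candidates_for_patch_id` are
hypothetical names, not Git internals. Only upstream commits whose
<author ident, author timestamp> pair also appears on the branch being
rebased are kept as candidates for the expensive patch-ID computation.

```python
from collections import namedtuple

# Hypothetical stand-in for the commit header fields the heuristic needs.
CommitMeta = namedtuple("CommitMeta", ["oid", "author", "author_time"])


def candidates_for_patch_id(upstream, local):
    """Return upstream commits worth a full patch-ID comparison.

    Only commit objects are consulted here; no trees or blobs are read.
    """
    local_keys = {(c.author, c.author_time) for c in local}
    return [c for c in upstream if (c.author, c.author_time) in local_keys]
```

Every upstream commit that is filtered out here is treated as "not a
duplicate", which is where the false negatives mentioned above come
from.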

That's all totally outside the scope of this patch.  It is just a
random thought to see if anybody wants to pursue to make the topic
even better, possibly after it lands.

> New in V2: changed parameter name, used Taylor's commit message
> suggestions, and used Elijah's documentation suggestions.

Hmph, what was it called earlier?  My gut reaction without much
thinking finds --no-skip-* a bit confusing double-negation and
suspect "--[no-]detect-cherry-pick" (which defaults to true for
backward compatibility) may feel more natural, but I suspect (I do
not recall details of the discussion on v1) it has been already
discussed and people found --no-skip-* is OK (in which case I won't
object)?

I also wonder if --detect-cherry-pick=(yes|no|auto) may give a
better end-user experience, with "auto" meaning "do run patch-ID
based filtering, but if we know it will be expensive (e.g. the
repository is sparsely cloned), please skip it".  That way, there
may appear other reasons that makes patch-ID computation expensive
now or in the future, and the users are automatically covered.



* Re: [PATCH v2] rebase --merge: optionally skip upstreamed commits
  2020-03-18 18:47   ` Junio C Hamano
@ 2020-03-18 19:28     ` Jonathan Tan
  2020-03-18 19:55       ` Junio C Hamano
  0 siblings, 1 reply; 78+ messages in thread
From: Jonathan Tan @ 2020-03-18 19:28 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git, me, Johannes.Schindelin, newren

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > When rebasing against an upstream that has had many commits since the
> > original branch was created:
> >
> >  O -- O -- ... -- O -- O (upstream)
> >   \
> >    -- O (my-dev-branch)
> >
> > it must read the contents of every novel upstream commit, in addition to
> > the tip of the upstream and the merge base, because "git rebase"
> > attempts to exclude commits that are duplicates of upstream ones. This
> > can be a significant performance hit, especially in a partial clone,
> > wherein a read of an object may end up being a fetch.
> 
> OK.  I presume that we do this by comparing patch IDs?

Yes.

> Total disabling is of course OK as a feature, especially for
> the first cut, but I wonder if it would be a reasonable idea to use
> some heuristic to keep the current "filter the same change" feature
> as much as possible but optimize it by filtering the novel upstream
> commits without hitting their trees and blobs (I am assuming that
> you at least are aware of and have the commit objects on the
> upstream side).
> 
> The most false-negative-prone approach is just to compare the
> <author ident, author timestamp> of a candidate upstream commit with
> what you have---if that author does not appear on my-dev-branch, it
> is very unlikely that your change has been accepted upstream.  Of
> course, two people independently discovering the same solution is
> not all that rare, so taking too few clues from the commits does risk
> false negatives, but at least it is not worse than
> what you are proposing here ;-)  And if one of your commits on
> my-dev-branch _might_ be identical to one of the novel upstream ones,
> at that point, we could dig deeper to actually compute the patch ID
> by fetching the upstream's tree.

As far as I know, the existing patch ID behavior is only based on the
patch contents, so if there was any author name or time rewriting (or if
two people independently discovered the same solution, as you wrote),
then the behavior would be different. Apart from that, this does sound
like a cheap thing to compare before comparing the diff.

Elijah Newren suggested and I investigated another approach of using a
filename-only diff as a first approximation. The relevant quotations and
explanations are in my email here [1].

[1] https://lore.kernel.org/git/20200312180427.192096-1-jonathantanmy@google.com/
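
For illustration, the two-stage check could look roughly like the
sketch below. This is not the code in patch-ids.c. The function names,
the `(commit_id, paths)` tuples and the `load_diff` callback are all
assumptions, and the diff text is hashed as-is, whereas real patch IDs
normalize the diff first. Blobs are always read for the non-upstream
side, and for an upstream commit only when its filename-only key
matches.

```python
import hashlib


def paths_key(paths):
    """Hash of the sorted list of touched file names (no blobs needed)."""
    joined = "\0".join(sorted(paths)).encode()
    return hashlib.sha1(joined).hexdigest()


def patch_key(diff_text):
    """Hash of the full diff text; stands in for a real patch ID."""
    return hashlib.sha1(diff_text.encode()).hexdigest()


def find_duplicates(local, upstream, load_diff):
    """Return (upstream_id, local_id) pairs whose full diffs match.

    local and upstream are lists of (commit_id, paths) tuples.
    load_diff(commit_id) returns the full diff and is assumed to be
    the expensive step (it may trigger a fetch in a partial clone).
    """
    path_keys = set()
    patch_keys = {}
    for commit_id, paths in local:
        path_keys.add(paths_key(paths))
        patch_keys[patch_key(load_diff(commit_id))] = commit_id

    duplicates = []
    for commit_id, paths in upstream:
        if paths_key(paths) not in path_keys:
            continue  # cheap rejection: touched files differ, skip full diff
        match = patch_keys.get(patch_key(load_diff(commit_id)))
        if match is not None:
            duplicates.append((commit_id, match))
    return duplicates
```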

> That's all totally outside the scope of this patch.  It is just a
> random thought to see if anybody wants to pursue to make the topic
> even better, possibly after it lands.

OK.

> > New in V2: changed parameter name, used Taylor's commit message
> > suggestions, and used Elijah's documentation suggestions.
> 
> Hmph, what was it called earlier?  My gut reaction without much
> thinking finds --no-skip-* a bit confusing double-negation and
> suspect "--[no-]detect-cherry-pick" (which defaults to true for
> backward compatibility) may feel more natural, but I suspect (I do
> not recall details of the discussion on v1) it has been already
> discussed and people found --no-skip-* is OK (in which case I won't
> object)?

It was earlier called "--{,no-}skip-already-present" (with the opposite
meaning, and thus, --skip-already-present is the default), so the double
negative has always existed. "--detect-cherry-pick" might be a better
idea...I'll wait to see if anybody else has an opinion.

> I also wonder if --detect-cherry-pick=(yes|no|auto) may give a
> better end-user experience, with "auto" meaning "do run patch-ID
> based filtering, but if we know it will be expensive (e.g. the
> repository is sparsely cloned), please skip it".  That way, there
> may appear other reasons that makes patch-ID computation expensive
> now or in the future, and the users are automatically covered.

It might be better to have predictability, and for "auto", I don't know
if we can have a simple and explainable set of rules as to when to use
patch-ID-based filtering - for example, in a partial clone with no
blobs, I would normally want no patch-ID-based filtering, but in a
partial clone with only a blob size limit, I probably will still want
patch-ID-based filtering.


* Re: [PATCH v2] rebase --merge: optionally skip upstreamed commits
  2020-03-18 19:28     ` Jonathan Tan
@ 2020-03-18 19:55       ` Junio C Hamano
  2020-03-18 20:41         ` Elijah Newren
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2020-03-18 19:55 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, me, Johannes.Schindelin, newren

Jonathan Tan <jonathantanmy@google.com> writes:

>> Hmph, what was it called earlier?  My gut reaction without much
>> thinking finds --no-skip-* a bit confusing double-negation and
>> suspect "--[no-]detect-cherry-pick" (which defaults to true for
>> backward compatibility) may feel more natural, but I suspect (I do
>> not recall details of the discussion on v1) it has been already
>> discussed and people found --no-skip-* is OK (in which case I won't
>> object)?
>
> It was earlier called "--{,no-}skip-already-present" (with the opposite
> meaning, and thus, --skip-already-present is the default), so the double
> negative has always existed. "--detect-cherry-pick" might be a better
> idea...I'll wait to see if anybody else has an opinion.

While "--[no-]detect-cherry-pick" is much better in avoiding double
negation, it is a horrible name---we do not tell the users what we
do after we detect cherry pick ("--[no-]skip-cherry-pick-detection"
does not tell us, either).  

Compared to them, "--[no-]skip-already-present" is much better, even
though there is double negation.  

How about a name along the lines of "--[no-]keep-duplicate", then?

>> I also wonder if --detect-cherry-pick=(yes|no|auto) may give a
>> better end-user experience, with "auto" meaning "do run patch-ID
>> based filtering, but if we know it will be expensive (e.g. the
>> repository is sparsely cloned), please skip it".  That way, there
>> may appear other reasons that makes patch-ID computation expensive
>> now or in the future, and the users are automatically covered.
>
> It might be better to have predictability, and for "auto", I don't know
> if we can have a simple and explainable set of rules as to when to use
> patch-ID-based filtering - for example, in a partial clone with no
> blobs, I would normally want no patch-ID-based filtering, but in a
> partial clone with only a blob size limit, I probably will still want
> patch-ID-based filtering.

Perhaps.  You could have something more specific than "auto".  The
main point was that, compared to "--[no-]$knob", "--$knob=(yes|no|...)"
is much easier to extend.  I simply do not know if we will see a need
to extend the vocabulary in the near future (you guys who are more
interested in sparse clones would have much better insight into that
than I do).
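
To make the extensibility point concrete, here is a small sketch of a
value-taking knob. It is not Git's option parser; `DetectMode`,
`parse_detect` and `should_detect` are hypothetical, and treating a
bare option as "yes" is an assumption. Adding another mode later only
means adding an enum member and one branch.

```python
from enum import Enum


class DetectMode(Enum):
    YES = "yes"
    NO = "no"
    AUTO = "auto"


def parse_detect(arg):
    """Parse the value of a hypothetical --$knob=(yes|no|auto) option.

    arg is None when the option is given without a value; this sketch
    assumes that means "yes".
    """
    if arg is None:
        return DetectMode.YES
    try:
        return DetectMode(arg)
    except ValueError:
        allowed = ", ".join(mode.value for mode in DetectMode)
        raise ValueError(f"invalid value {arg!r}; expected one of: {allowed}")


def should_detect(mode, is_expensive):
    """Decide whether to run patch-ID filtering.

    is_expensive is a placeholder for whatever rule "auto" would use;
    the thread leaves that rule open.
    """
    if mode is DetectMode.AUTO:
        return not is_expensive
    return mode is DetectMode.YES
```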

Thanks.


* Re: [PATCH v2] rebase --merge: optionally skip upstreamed commits
  2020-03-18 17:30 ` [PATCH v2] " Jonathan Tan
  2020-03-18 18:47   ` Junio C Hamano
@ 2020-03-18 20:20   ` Junio C Hamano
  2020-03-26 17:50   ` Jonathan Tan
  2020-03-29 10:12   ` [PATCH v2 4/4] t3402: use POSIX compliant regex(7) Đoàn Trần Công Danh
  3 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2020-03-18 20:20 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, me, Johannes.Schindelin, newren

Jonathan Tan <jonathantanmy@google.com> writes:

> @@ -1840,6 +1844,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  			      "interactive or merge options"));
>  	}
>  
> +	if (options.skip_cherry_pick_detection && !is_interactive(&options))
> +		die(_("--skip-cherry-pick-detection does not work with the 'apply' backend"));
> +

I presume this is, as before, built directly on v2.25.0; thanks for
keeping the original base while iterating.

Just a note to myself and those who are experimenting with the
patch.  When merged to the more recent codebase, is_interactive()
here will have to become is_merge().


* Re: [PATCH v2] rebase --merge: optionally skip upstreamed commits
  2020-03-18 19:55       ` Junio C Hamano
@ 2020-03-18 20:41         ` Elijah Newren
  2020-03-18 23:39           ` Junio C Hamano
  0 siblings, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2020-03-18 20:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Tan, Git Mailing List, Taylor Blau, Johannes Schindelin

On Wed, Mar 18, 2020 at 12:55 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jonathan Tan <jonathantanmy@google.com> writes:
>
> >> Hmph, what was it called earlier?  My gut reaction without much
> >> thinking finds --no-skip-* a bit confusing double-negation and
> >> suspect "--[no-]detect-cherry-pick" (which defaults to true for
> >> backward compatibility) may feel more natural, but I suspect (I do
> >> not recall details of the discussion on v1) it has been already
> >> discussed and people found --no-skip-* is OK (in which case I won't
> >> object)?
> >
> > It was earlier called "--{,no-}skip-already-present" (with the opposite
> > meaning, and thus, --skip-already-present is the default), so the double
> > negative has always existed. "--detect-cherry-pick" might be a better
> > idea...I'll wait to see if anybody else has an opinion.
>
> While "--[no-]detect-cherry-pick" is much better in avoiding double
> negation, it is a horrible name---we do not tell the users what we
> do after we detect cherry pick ("--[no-]skip-cherry-pick-detection"
> does not tell us, either).

I like --[no-]detect-cherry-pick.  I'm on board with using "keep"
instead of "skip" to avoid double negation.

> Compared to them, "--[no-]skip-already-present" is much better, even
> though there is double negation.

This one seems especially bad from a discoverability and
understandability viewpoint though.  It's certainly nice if options
are fully self-documenting, but sometimes that would require full
paragraphs for the option name.  Focusing on what is done with the
option at the expense of discovering which options are relevant to
your case or at the expense of enabling users to create a mental model
for when options might be meaningful is something that I think is very
detrimental to usability.  I think users who want such an option would
have a very hard time finding this based on its name, and people who
want completely unrelated features would be confused enough by it that
they feel compelled to read its description and attempt to parse it
and guess how it's related.  In contrast, --[no-]detect-cherry-pick is
a bit clearer to both groups of people for whether it is useful, and
the group who wants it can read up the description to get the details.

> How about a name along the lines of "--[no-]keep-duplicate", then?

This name is much better than --[no-]keep-already-present would be
because "duplicate" is a far better indicator than "already-present"
of the intended meaning.  But I'm still worried the name "duplicate"
isn't going to be enough of a clue to individuals about whether they
will need this option or not.  Perhaps --[no-]keep-cherry-pick?

> >> I also wonder if --detect-cherry-pick=(yes|no|auto) may give a
> >> better end-user experience, with "auto" meaning "do run patch-ID
> >> based filtering, but if we know it will be expensive (e.g. the
> >> repository is sparsely cloned), please skip it".  That way, there
> >> may appear other reasons that make patch-ID computation expensive
> >> now or in the future, and the users are automatically covered.
> >
> > It might be better to have predictability, and for "auto", I don't know
> > if we can have a simple and explainable set of rules as to when to use
> > patch-ID-based filtering - for example, in a partial clone with no
> > blobs, I would normally want no patch-ID-based filtering, but in a
> > partial clone with only a blob size limit, I probably will still want
> > patch-ID-based filtering.
>
> Perhaps.  You could have something more specific than "auto".  The
> main point was instead of "--[no-]$knob", "--$knob=(yes|no|...)" is
> much easier to extend.  I simply do not know if we will see need to
> extend the vocabulary in the near future (to which you guys who are
> more interested in sparse clones would have much better insight than
> I do).

I also struggle to understand when auto would be used.  But beyond
that, I'm still a little uneasy with where we seem to be ending up
(even if no fault of this patch):

1) Behavior has long been --keep-cherry-pick, and in various cases
that behavior can reduce conflicts users have to deal with.
2) Both Junio and I independently guessed that the cherry-pick
detection logic is poorly performing and could be improved; Jonathan
confirmed with some investigative work.  We've all suggested punting
for now, though.
3) I think we can make the sequencer machinery fast enough that the
cherry-pick detection is going to be the slowest part by a good margin
even in normal cases, not just sparse clones or the cases Taylor or I
had in mind.  So I think it's going to stick out like a sore thumb for
a lot more people (though maybe they're all happy because it's faster
overall?).
4) Jonathan provided some good examples of cases where the
--keep-cherry-pick behavior isn't just slow, but leads to actually
wrong answers (a revert followed by an un-revert).

I particularly don't like the idea of something being the default when
it can both cause wrong behavior and present a huge performance
problem that folks have to learn to work around, especially when based
only on the tradeoff of sometimes reducing the amount of work we push
back on the user.  Maybe that's just inevitable, but does anyone have
any words that will make me feel better about this?

Elijah

* Re: [PATCH v2] rebase --merge: optionally skip upstreamed commits
  2020-03-18 20:41         ` Elijah Newren
@ 2020-03-18 23:39           ` Junio C Hamano
  2020-03-19  0:17             ` Elijah Newren
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2020-03-18 23:39 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Jonathan Tan, Git Mailing List, Taylor Blau, Johannes Schindelin

Elijah Newren <newren@gmail.com> writes:

> 4) Jonathan provided some good examples of cases where the
> --keep-cherry-pick behavior isn't just slow, but leads to actually
> wrong answers (a revert followed by an un-revert).

That one cuts both ways, doesn't it?  If your change that upstream
once thought was good (and got accepted) turned out to be bad and
they reverted, you do not want to blindly reapply it to break the
codebase again, and with the "drop duplicate" logic, it would lead
to a wrong answer silently.

So from a correctness point of view, I do not think you can make any
argument either way.


* Re: [PATCH v2] rebase --merge: optionally skip upstreamed commits
  2020-03-18 23:39           ` Junio C Hamano
@ 2020-03-19  0:17             ` Elijah Newren
  0 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2020-03-19  0:17 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Tan, Git Mailing List, Taylor Blau, Johannes Schindelin

On Wed, Mar 18, 2020 at 4:39 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > 4) Jonathan provided some good examples of cases where the
> > --keep-cherry-pick behavior isn't just slow, but leads to actually
> > wrong answers (a revert followed by an un-revert).
>
> That one cuts both ways, doesn't it?  If your change that upstream
> once thought was good (and got accepted) turned out to be bad and
> they reverted, you do not want to blindly reapply it to break the
> codebase again, and with the "drop duplicate" logic, it would lead
> to a wrong answer silently.
>
> So from a correctness point of view, I do not think you can make any
> argument either way.

Good point.  Thanks, that helps.

* [PATCH 0/3] add travis job for linux with musl libc
@ 2020-03-26  7:35 Đoàn Trần Công Danh
  2020-03-26  7:35 ` [PATCH 1/3] ci: libify logic for usage and checking CI_USER Đoàn Trần Công Danh
                   ` (4 more replies)
  0 siblings, 5 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-03-26  7:35 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

Recently, we've un-broken git for Linux with musl libc,
and we have a series to fix false negatives with busybox shell utils.

There is a sample travis build for this series applied on top of master:
https://travis-ci.org/github/sgn/git/builds/667111988

And, after merging with my v4 to fix busybox false negative:
https://travis-ci.org/github/sgn/git/builds/667112197


Đoàn Trần Công Danh (3):
  ci: libify logic for usage and checking CI_USER
  ci: refactor docker runner script
  travis: build and test on Linux with musl libc and busybox

 .travis.yml                                 | 10 +++++-
 azure-pipelines.yml                         |  4 +--
 ci/lib-docker.sh                            | 37 +++++++++++++++++++++
 ci/run-alpine-build.sh                      | 31 +++++++++++++++++
 ci/{run-linux32-docker.sh => run-docker.sh} | 20 +++++++----
 ci/run-linux32-build.sh                     | 35 +------------------
 6 files changed, 94 insertions(+), 43 deletions(-)
 create mode 100644 ci/lib-docker.sh
 create mode 100755 ci/run-alpine-build.sh
 rename ci/{run-linux32-docker.sh => run-docker.sh} (46%)

-- 
2.26.0.rc2.357.g1e1ba0441d


* [PATCH 1/3] ci: libify logic for usage and checking CI_USER
  2020-03-26  7:35 [PATCH 0/3] add travis job for linux with musl libc Đoàn Trần Công Danh
@ 2020-03-26  7:35 ` Đoàn Trần Công Danh
  2020-03-26  7:35 ` [PATCH 2/3] ci: refactor docker runner script Đoàn Trần Công Danh
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-03-26  7:35 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

This logic will be reused for the alpine docker job later.

Merge it into a single chunk, since the pieces are always used together.

While we're at it, add a comment telling people to run as root inside
a podman container.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---

This patch is easier to review with "git diff --color-moved"

 ci/lib-docker.sh        | 37 +++++++++++++++++++++++++++++++++++++
 ci/run-linux32-build.sh | 35 +----------------------------------
 2 files changed, 38 insertions(+), 34 deletions(-)
 create mode 100644 ci/lib-docker.sh

diff --git a/ci/lib-docker.sh b/ci/lib-docker.sh
new file mode 100644
index 0000000000..ac155ace54
--- /dev/null
+++ b/ci/lib-docker.sh
@@ -0,0 +1,37 @@
+# Library of functions shared by all CI scripts run inside docker
+
+if test $# -ne 1 || test -z "$1"
+then
+	echo >&2 "usage: $0 <host-user-id>"
+	exit 1
+fi
+
+# If this script runs inside a docker container, then all commands are
+# usually executed as root. Consequently, the host user might not be
+# able to access the test output files.
+# If a non 0 host user id is given, then create a user "ci" with that
+# user id to make everything accessible to the host user.
+HOST_UID=$1
+if test $HOST_UID -eq 0
+then
+	# Just in case someone does want to run the test suite as root.
+	# or podman is used in place of docker
+	CI_USER=root
+else
+	CI_USER=ci
+	if test "$(id -u $CI_USER 2>/dev/null)" = $HOST_UID
+	then
+		echo "user '$CI_USER' already exists with the requested ID $HOST_UID"
+	else
+		useradd -u $HOST_UID $CI_USER
+	fi
+
+	# Due to a bug the test suite was run as root in the past, so
+	# a prove state file created back then is only accessible by
+	# root.  Now that bug is fixed, the test suite is run as a
+	# regular user, but the prove state file coming from Travis
+	# CI's cache might still be owned by root.
+	# Make sure that this user has rights to any cached files,
+	# including an existing prove state file.
+	test -n "$cache_dir" && chown -R $HOST_UID:$HOST_UID "$cache_dir"
+fi
diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index e3a193adbc..81296cdd19 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -8,11 +8,7 @@
 
 set -ex
 
-if test $# -ne 1 || test -z "$1"
-then
-	echo >&2 "usage: run-linux32-build.sh <host-user-id>"
-	exit 1
-fi
+. "${0%/*}/lib-docker.sh"
 
 # Update packages to the latest available versions
 linux32 --32bit i386 sh -c '
@@ -21,35 +17,6 @@ linux32 --32bit i386 sh -c '
 	libexpat-dev gettext python >/dev/null
 '
 
-# If this script runs inside a docker container, then all commands are
-# usually executed as root. Consequently, the host user might not be
-# able to access the test output files.
-# If a non 0 host user id is given, then create a user "ci" with that
-# user id to make everything accessible to the host user.
-HOST_UID=$1
-if test $HOST_UID -eq 0
-then
-	# Just in case someone does want to run the test suite as root.
-	CI_USER=root
-else
-	CI_USER=ci
-	if test "$(id -u $CI_USER 2>/dev/null)" = $HOST_UID
-	then
-		echo "user '$CI_USER' already exists with the requested ID $HOST_UID"
-	else
-		useradd -u $HOST_UID $CI_USER
-	fi
-
-	# Due to a bug the test suite was run as root in the past, so
-	# a prove state file created back then is only accessible by
-	# root.  Now that bug is fixed, the test suite is run as a
-	# regular user, but the prove state file coming from Travis
-	# CI's cache might still be owned by root.
-	# Make sure that this user has rights to any cached files,
-	# including an existing prove state file.
-	test -n "$cache_dir" && chown -R $HOST_UID:$HOST_UID "$cache_dir"
-fi
-
 # Build and test
 linux32 --32bit i386 su -m -l $CI_USER -c '
 	set -ex
-- 
2.26.0.rc2.357.g1e1ba0441d


* [PATCH 2/3] ci: refactor docker runner script
  2020-03-26  7:35 [PATCH 0/3] add travis job for linux with musl libc Đoàn Trần Công Danh
  2020-03-26  7:35 ` [PATCH 1/3] ci: libify logic for usage and checking CI_USER Đoàn Trần Công Danh
@ 2020-03-26  7:35 ` Đoàn Trần Công Danh
  2020-03-26 16:06   ` Eric Sunshine
  2020-03-28 17:53   ` SZEDER Gábor
  2020-03-26  7:35 ` [PATCH 3/3] travis: build and test on Linux with musl libc and busybox Đoàn Trần Công Danh
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-03-26  7:35 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

We will support alpine check in docker later in this serie.

While we're at it, tell people to run as root in podman.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 .travis.yml                                 |  2 +-
 azure-pipelines.yml                         |  4 ++--
 ci/{run-linux32-docker.sh => run-docker.sh} | 19 +++++++++++++------
 3 files changed, 16 insertions(+), 9 deletions(-)
 rename ci/{run-linux32-docker.sh => run-docker.sh} (48%)

diff --git a/.travis.yml b/.travis.yml
index fc5730b085..32e80e2670 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -32,7 +32,7 @@ matrix:
       services:
         - docker
       before_install:
-      script: ci/run-linux32-docker.sh
+      script: ci/run-docker.sh linux32
     - env: jobname=StaticAnalysis
       os: linux
       compiler:
diff --git a/azure-pipelines.yml b/azure-pipelines.yml
index 675c3a43c9..ef504ff29f 100644
--- a/azure-pipelines.yml
+++ b/azure-pipelines.yml
@@ -489,14 +489,14 @@ jobs:
        test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
 
        res=0
-       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" bash -lxc ci/run-linux32-docker.sh || res=1
+       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" bash -lxc ci/run-docker.sh linux32 || res=1
 
        sudo chmod a+r t/out/TEST-*.xml
        test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts
 
        test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || res=1
        exit $res
-    displayName: 'ci/run-linux32-docker.sh'
+    displayName: 'ci/run-docker.sh linux32'
     env:
       GITFILESHAREPWD: $(gitfileshare.pwd)
   - task: PublishTestResults@2
diff --git a/ci/run-linux32-docker.sh b/ci/run-docker.sh
similarity index 48%
rename from ci/run-linux32-docker.sh
rename to ci/run-docker.sh
index 751acfcf8a..c8dff9d41a 100755
--- a/ci/run-linux32-docker.sh
+++ b/ci/run-docker.sh
@@ -1,15 +1,22 @@
 #!/bin/sh
 #
-# Download and run Docker image to build and test 32-bit Git
+# Download and run Docker image to build and test git
 #
 
 . ${0%/*}/lib.sh
 
-docker pull daald/ubuntu32:xenial
+CI_TARGET=${1:-linux32}
+case "$CI_TARGET" in
+linux32) CI_CONTAINER="daald/ubuntu32:xenial" ;;
+*)       exit 1 ;;
+esac
+
+docker pull "$CI_CONTAINER"
 
 # Use the following command to debug the docker build locally:
-# $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/bash daald/ubuntu32:xenial
-# root@container:/# /usr/src/git/ci/run-linux32-build.sh <host-user-id>
+# <host-user-id> must be 0 if podman is used in place of docker
+# $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/sh "$CI_CONTAINER"
+# root@container:/# /usr/src/git/ci/run-$CI_TARGET-build.sh <host-user-id>
 
 container_cache_dir=/tmp/travis-cache
 
@@ -23,8 +30,8 @@ docker run \
 	--env cache_dir="$container_cache_dir" \
 	--volume "${PWD}:/usr/src/git" \
 	--volume "$cache_dir:$container_cache_dir" \
-	daald/ubuntu32:xenial \
-	/usr/src/git/ci/run-linux32-build.sh $(id -u $USER)
+	"$CI_CONTAINER" \
+	"/usr/src/git/ci/run-$CI_TARGET-build.sh" $(id -u $USER)
 
 check_unignored_build_artifacts
 
-- 
2.26.0.rc2.357.g1e1ba0441d


* [PATCH 3/3] travis: build and test on Linux with musl libc and busybox
  2020-03-26  7:35 [PATCH 0/3] add travis job for linux with musl libc Đoàn Trần Công Danh
  2020-03-26  7:35 ` [PATCH 1/3] ci: libify logic for usage and checking CI_USER Đoàn Trần Công Danh
  2020-03-26  7:35 ` [PATCH 2/3] ci: refactor docker runner script Đoàn Trần Công Danh
@ 2020-03-26  7:35 ` Đoàn Trần Công Danh
  2020-03-29  5:49 ` [PATCH 0/3] add travis job for linux with musl libc Junio C Hamano
  2020-03-29 10:12 ` [PATCH v2 0/4] Travis + Azure jobs " Đoàn Trần Công Danh
  4 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-03-26  7:35 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 .travis.yml            |  8 ++++++++
 ci/run-alpine-build.sh | 31 +++++++++++++++++++++++++++++++
 ci/run-docker.sh       |  1 +
 3 files changed, 40 insertions(+)
 create mode 100755 ci/run-alpine-build.sh

diff --git a/.travis.yml b/.travis.yml
index 32e80e2670..a2927dd120 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -33,6 +33,14 @@ matrix:
         - docker
       before_install:
       script: ci/run-docker.sh linux32
+    - env: jobname=linux-musl-busybox
+      os: linux
+      compiler:
+      addons:
+      services:
+        - docker
+      before_install:
+      script: ci/run-docker.sh alpine
     - env: jobname=StaticAnalysis
       os: linux
       compiler:
diff --git a/ci/run-alpine-build.sh b/ci/run-alpine-build.sh
new file mode 100755
index 0000000000..c83df536e4
--- /dev/null
+++ b/ci/run-alpine-build.sh
@@ -0,0 +1,31 @@
+#!/bin/sh
+#
+# Build and test Git in Alpine Linux
+#
+# Usage:
+#   run-alpine-build.sh <host-user-id>
+#
+
+set -ex
+
+useradd () {
+	adduser -D "$@"
+}
+
+. "${0%/*}/lib-docker.sh"
+
+# Update packages to the latest available versions
+apk add --update autoconf build-base curl-dev openssl-dev expat-dev \
+	gettext pcre2-dev python3 musl-libintl >/dev/null
+
+# Build and test
+su -m -l $CI_USER -c '
+	set -ex
+	cd /usr/src/git
+	test -n "$cache_dir" && ln -s "$cache_dir/.prove" t/.prove
+	autoconf
+	echo "PYTHON_PATH=/usr/bin/python3" >config.mak
+	./configure --with-libpcre
+	make
+	make test
+'
diff --git a/ci/run-docker.sh b/ci/run-docker.sh
index c8dff9d41a..47affcd6d3 100755
--- a/ci/run-docker.sh
+++ b/ci/run-docker.sh
@@ -8,6 +8,7 @@
 CI_TARGET=${1:-linux32}
 case "$CI_TARGET" in
 linux32) CI_CONTAINER="daald/ubuntu32:xenial" ;;
+alpine)  CI_CONTAINER="alpine" ;;
 *)       exit 1 ;;
 esac
 
-- 
2.26.0.rc2.357.g1e1ba0441d


* Re: [PATCH 2/3] ci: refactor docker runner script
  2020-03-26  7:35 ` [PATCH 2/3] ci: refactor docker runner script Đoàn Trần Công Danh
@ 2020-03-26 16:06   ` Eric Sunshine
  2020-03-28 17:53   ` SZEDER Gábor
  1 sibling, 0 replies; 78+ messages in thread
From: Eric Sunshine @ 2020-03-26 16:06 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: Git List

On Thu, Mar 26, 2020 at 3:35 AM Đoàn Trần Công Danh
<congdanhqx@gmail.com> wrote:
> We will support alpine check in docker later in this serie.

s/serie/series/

> While we're at it, tell people to run as root in podman.
>
> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>

* Re: [PATCH v2] rebase --merge: optionally skip upstreamed commits
  2020-03-18 17:30 ` [PATCH v2] " Jonathan Tan
  2020-03-18 18:47   ` Junio C Hamano
  2020-03-18 20:20   ` Junio C Hamano
@ 2020-03-26 17:50   ` Jonathan Tan
  2020-03-26 19:17     ` Elijah Newren
  2020-03-26 19:27     ` Junio C Hamano
  2020-03-29 10:12   ` [PATCH v2 4/4] t3402: use POSIX compliant regex(7) Đoàn Trần Công Danh
  3 siblings, 2 replies; 78+ messages in thread
From: Jonathan Tan @ 2020-03-26 17:50 UTC (permalink / raw)
  To: jonathantanmy; +Cc: git, newren, gitster

> New in V2: changed parameter name, used Taylor's commit message
> suggestions, and used Elijah's documentation suggestions.

I think the discussion has shifted away from whether this functionality
is desirable (or desirable and we should implement this functionality
without any CLI option) to the name and nature of the CLI option. Before
I send out a new version, what do you think of using this name and
documenting it this way:

  --keep-cherry-pick=(always|never)::
          Control rebase's behavior towards commits in the working
          branch that are already present upstream, i.e. cherry-picks.
  +
  If 'never', these commits will be dropped. Because this necessitates
  reading all upstream commits, this can be expensive in repos with a
  large number of upstream commits that need to be read.
  +
  If 'always', all commits (including these) will be re-applied. This
  allows rebase to forgo reading all upstream commits, potentially 
  improving performance.
  +
  The default is 'never'.
  +
  See also INCOMPATIBLE OPTIONS below.

I've tried to use everyone's suggestions: Junio's suggestions to use the
"keep" name (instead of "detect", so that we also communicate what we do
with the result of our detection) and the non-boolean option (for
extensibility later if we need it), and Elijah's suggestion to use
"cherry-pick" instead of "duplicate". If this sounds good, I'll update
the patch and send out a new version.
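
To make the proposed values concrete, here is a minimal shell sketch of
how a front end might validate a --keep-cherry-pick=<value> argument.
It is an illustration only, not part of the patch: the function name
parse_keep_cherry_pick and the "keep"/"drop" output strings are
hypothetical, and the real option would be parsed in C by cmd_rebase().

```shell
#!/bin/sh
# Hypothetical helper: map the proposed --keep-cherry-pick=<value>
# argument to the behavior described in the draft documentation.
parse_keep_cherry_pick () {
	case "${1-}" in
	--keep-cherry-pick=always)
		# Re-apply every commit; upstream commits need not be read.
		echo "keep"
		;;
	--keep-cherry-pick=never)
		# Drop commits already present upstream (the proposed default).
		echo "drop"
		;;
	--keep-cherry-pick=*)
		# Any other value is rejected rather than silently ignored.
		echo >&2 "error: unknown value '${1#--keep-cherry-pick=}'"
		return 1
		;;
	*)
		# Option not given: fall back to the default, 'never'.
		echo "drop"
		;;
	esac
}
```

With this shape, a third value (should one ever be needed) would only
require one more case arm, which is the extensibility argument for a
non-boolean option.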

* Re: [PATCH v2] rebase --merge: optionally skip upstreamed commits
  2020-03-26 17:50   ` Jonathan Tan
@ 2020-03-26 19:17     ` Elijah Newren
  2020-03-26 19:27     ` Junio C Hamano
  1 sibling, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2020-03-26 19:17 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Git Mailing List, Junio C Hamano

On Thu, Mar 26, 2020 at 10:50 AM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> > New in V2: changed parameter name, used Taylor's commit message
> > suggestions, and used Elijah's documentation suggestions.
>
> I think the discussion has shifted away from whether this functionality
> is desirable (or desirable and we should implement this functionality
> without any CLI option) to the name and nature of the CLI option. Before
> I send out a new version, what do you think of using this name and
> documenting it this way:
>
>   --keep-cherry-pick=(always|never)::
>           Control rebase's behavior towards commits in the working
>           branch that are already present upstream, i.e. cherry-picks.
>   +
>   If 'never', these commits will be dropped. Because this necessitates
>   reading all upstream commits, this can be expensive in repos with a
>   large number of upstream commits that need to be read.
>   +
>   If 'always', all commits (including these) will be re-applied. This
>   allows rebase to forgo reading all upstream commits, potentially
>   improving performance.
>   +
>   The default is 'never'.
>   +
>   See also INCOMPATIBLE OPTIONS below.
>
> I've tried to use everyone's suggestions: Junio's suggestions to use the
> "keep" name (instead of "detect", so that we also communicate what we do
> with the result of our detection) and the non-boolean option (for
> extensibility later if we need it), and Elijah's suggestion to use
> "cherry-pick" instead of "duplicate". If this sounds good, I'll update
> the patch and send out a new version.

Sounds good to me.  Thanks!

* Re: [PATCH v2] rebase --merge: optionally skip upstreamed commits
  2020-03-26 17:50   ` Jonathan Tan
  2020-03-26 19:17     ` Elijah Newren
@ 2020-03-26 19:27     ` Junio C Hamano
  1 sibling, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2020-03-26 19:27 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, newren

Jonathan Tan <jonathantanmy@google.com> writes:

>> New in V2: changed parameter name, used Taylor's commit message
>> suggestions, and used Elijah's documentation suggestions.
>
> I think the discussion has shifted away from whether this functionality
> is desirable (or desirable and we should implement this functionality
> without any CLI option) to the name and nature of the CLI option. Before
> I send out a new version, what do you think of using this name and
> documenting it this way:
>
>   --keep-cherry-pick=(always|never)::
>   ...
>   The default is 'never'.
>   +
>   See also INCOMPATIBLE OPTIONS below.

Sounds much better to me.  I do not mind --[no-]keep-cherry-pick,
either, by the way.  I know I raised the possibility of having to
make it non-bool later, but since then I haven't thought of a good
third option myself anyway, so...

Thanks for keeping the ball rolling.


* Re: [PATCH 2/3] ci: refactor docker runner script
  2020-03-26  7:35 ` [PATCH 2/3] ci: refactor docker runner script Đoàn Trần Công Danh
  2020-03-26 16:06   ` Eric Sunshine
@ 2020-03-28 17:53   ` SZEDER Gábor
  2020-03-29  6:36     ` Danh Doan
  1 sibling, 1 reply; 78+ messages in thread
From: SZEDER Gábor @ 2020-03-28 17:53 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

On Thu, Mar 26, 2020 at 02:35:18PM +0700, Đoàn Trần Công Danh wrote:
> We will support alpine check in docker later in this serie.
> 
> While we're at it, tell people to run as root in podman.

Why tell people that?  Please clarify what podman is and why we
should care.

> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> ---
>  .travis.yml                                 |  2 +-
>  azure-pipelines.yml                         |  4 ++--
>  ci/{run-linux32-docker.sh => run-docker.sh} | 19 +++++++++++++------
>  3 files changed, 16 insertions(+), 9 deletions(-)
>  rename ci/{run-linux32-docker.sh => run-docker.sh} (48%)
> 
> diff --git a/.travis.yml b/.travis.yml
> index fc5730b085..32e80e2670 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -32,7 +32,7 @@ matrix:
>        services:
>          - docker
>        before_install:
> -      script: ci/run-linux32-docker.sh
> +      script: ci/run-docker.sh linux32

The name of the 'Linux32' build job starts with a capital 'L'; please
be consistent with that.

>      - env: jobname=StaticAnalysis
>        os: linux
>        compiler:
> diff --git a/azure-pipelines.yml b/azure-pipelines.yml
> index 675c3a43c9..ef504ff29f 100644
> --- a/azure-pipelines.yml
> +++ b/azure-pipelines.yml
> @@ -489,14 +489,14 @@ jobs:
>         test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
>  
>         res=0
> -       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" bash -lxc ci/run-linux32-docker.sh || res=1
> +       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" bash -lxc ci/run-docker.sh linux32 || res=1
>  
>         sudo chmod a+r t/out/TEST-*.xml
>         test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts
>  
>         test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || res=1
>         exit $res
> -    displayName: 'ci/run-linux32-docker.sh'
> +    displayName: 'ci/run-docker.sh linux32'
>      env:
>        GITFILESHAREPWD: $(gitfileshare.pwd)
>    - task: PublishTestResults@2
> diff --git a/ci/run-linux32-docker.sh b/ci/run-docker.sh
> similarity index 48%
> rename from ci/run-linux32-docker.sh
> rename to ci/run-docker.sh
> index 751acfcf8a..c8dff9d41a 100755
> --- a/ci/run-linux32-docker.sh
> +++ b/ci/run-docker.sh
> @@ -1,15 +1,22 @@
>  #!/bin/sh
>  #
> -# Download and run Docker image to build and test 32-bit Git
> +# Download and run Docker image to build and test git

s/git/Git/

>  #
>  
>  . ${0%/*}/lib.sh
>  
> -docker pull daald/ubuntu32:xenial
> +CI_TARGET=${1:-linux32}
> +case "$CI_TARGET" in
> +linux32) CI_CONTAINER="daald/ubuntu32:xenial" ;;
> +*)       exit 1 ;;
> +esac
> +
> +docker pull "$CI_CONTAINER"
>  
>  # Use the following command to debug the docker build locally:
> -# $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/bash daald/ubuntu32:xenial
> -# root@container:/# /usr/src/git/ci/run-linux32-build.sh <host-user-id>
> +# <host-user-id> must be 0 if podman is used in place of docker
> +# $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/sh "$CI_CONTAINER"
> +# root@container:/# /usr/src/git/ci/run-$CI_TARGET-build.sh <host-user-id>
>  
>  container_cache_dir=/tmp/travis-cache
>  
> @@ -23,8 +30,8 @@ docker run \
>  	--env cache_dir="$container_cache_dir" \
>  	--volume "${PWD}:/usr/src/git" \
>  	--volume "$cache_dir:$container_cache_dir" \
> -	daald/ubuntu32:xenial \
> -	/usr/src/git/ci/run-linux32-build.sh $(id -u $USER)
> +	"$CI_CONTAINER" \
> +	"/usr/src/git/ci/run-$CI_TARGET-build.sh" $(id -u $USER)
>  
>  check_unignored_build_artifacts
>  
> -- 
> 2.26.0.rc2.357.g1e1ba0441d
> 

* Re: [PATCH 0/3] add travis job for linux with musl libc
  2020-03-26  7:35 [PATCH 0/3] add travis job for linux with musl libc Đoàn Trần Công Danh
                   ` (2 preceding siblings ...)
  2020-03-26  7:35 ` [PATCH 3/3] travis: build and test on Linux with musl libc and busybox Đoàn Trần Công Danh
@ 2020-03-29  5:49 ` Junio C Hamano
  2020-03-29 10:12 ` [PATCH v2 0/4] Travis + Azure jobs " Đoàn Trần Công Danh
  4 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2020-03-29  5:49 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> Recently, we've un-broken git for Linux with musl libc,
> and we have a series to fix false negatives with busybox shell utils.
>
> There is a sample travis build for this series applied on top of master:
> https://travis-ci.org/github/sgn/git/builds/667111988
>
> And, after merging with my v4 to fix busybox false negative:
> https://travis-ci.org/github/sgn/git/builds/667112197

I have this topic near the tip of 'pu' and it seems to be finding
issues not in your build with 'master'.  I'd expect that we'll be
seeing patches to various parts of the system from you or others
with musl libc to correct them, after which we can merge the topic
down.

Thanks.


* Re: [PATCH 2/3] ci: refactor docker runner script
  2020-03-28 17:53   ` SZEDER Gábor
@ 2020-03-29  6:36     ` Danh Doan
  0 siblings, 0 replies; 78+ messages in thread
From: Danh Doan @ 2020-03-29  6:36 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git

On 2020-03-28 18:53:29+0100, SZEDER Gábor <szeder.dev@gmail.com> wrote:
> On Thu, Mar 26, 2020 at 02:35:18PM +0700, Đoàn Trần Công Danh wrote:
> > We will support alpine check in docker later in this serie.
> > 
> > While we're at it, tell people to run as root in podman.
> 
> Why tell that to people?  Please clarify what podman is and why should
> we care.

podman is a drop-in replacement for docker.
I used it instead of docker to develop this series.

docker requires a running service and the user to be in the docker
system group; podman requires neither.

The root user in a podman container is mapped to the host user.
I ran into trouble while developing this series on my local machine.
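
As a rough sketch of the uid mapping described above (the
`host_uid_for` helper and the `CONTAINER_RUNTIME` variable are
hypothetical and not part of this series), the <host-user-id> argument
for the in-container build script could be chosen like this:

```shell
#!/bin/sh
# Hypothetical helper: pick the <host-user-id> argument for the
# in-container build script.  With podman, root inside the container
# is mapped to the host user, so 0 is passed; with docker, the
# invoking user's own id is passed.
host_uid_for () {
	case "$1" in
	podman) echo 0 ;;
	*)      id -u ;;
	esac
}

# CONTAINER_RUNTIME is an assumed variable name; default to docker.
CONTAINER_RUNTIME=${CONTAINER_RUNTIME:-docker}
echo "host user id for $CONTAINER_RUNTIME: $(host_uid_for "$CONTAINER_RUNTIME")"
```

This only illustrates why the comment in the script says the id must
be 0 under podman; the series itself leaves that choice to the person
running the container by hand.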

> > Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> > ---
> >  .travis.yml                                 |  2 +-
> >  azure-pipelines.yml                         |  4 ++--
> >  ci/{run-linux32-docker.sh => run-docker.sh} | 19 +++++++++++++------
> >  3 files changed, 16 insertions(+), 9 deletions(-)
> >  rename ci/{run-linux32-docker.sh => run-docker.sh} (48%)
> > 
> > diff --git a/.travis.yml b/.travis.yml
> > index fc5730b085..32e80e2670 100644
> > --- a/.travis.yml
> > +++ b/.travis.yml
> > @@ -32,7 +32,7 @@ matrix:
> >        services:
> >          - docker
> >        before_install:
> > -      script: ci/run-linux32-docker.sh
> > +      script: ci/run-docker.sh linux32
> 
> The name of the 'Linux32' build job starts with a capital 'L'; please
> be consistent with that.

The old name of the script is run-linux32-docker, so
I think it's better to rename the job to all lowercase.
All other jobs, except Documentation and static analysis, are in
lowercase.

> 
> >      - env: jobname=StaticAnalysis
> >        os: linux
> >        compiler:
> > diff --git a/azure-pipelines.yml b/azure-pipelines.yml
> > index 675c3a43c9..ef504ff29f 100644
> > --- a/azure-pipelines.yml
> > +++ b/azure-pipelines.yml
> > @@ -489,14 +489,14 @@ jobs:
> >         test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
> >  
> >         res=0
> > -       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" bash -lxc ci/run-linux32-docker.sh || res=1
> > +       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" bash -lxc ci/run-docker.sh linux32 || res=1
> >  
> >         sudo chmod a+r t/out/TEST-*.xml
> >         test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts
> >  
> >         test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || res=1
> >         exit $res
> > -    displayName: 'ci/run-linux32-docker.sh'
> > +    displayName: 'ci/run-docker.sh linux32'
> >      env:
> >        GITFILESHAREPWD: $(gitfileshare.pwd)
> >    - task: PublishTestResults@2
> > diff --git a/ci/run-linux32-docker.sh b/ci/run-docker.sh
> > similarity index 48%
> > rename from ci/run-linux32-docker.sh
> > rename to ci/run-docker.sh
> > index 751acfcf8a..c8dff9d41a 100755
> > --- a/ci/run-linux32-docker.sh
> > +++ b/ci/run-docker.sh
> > @@ -1,15 +1,22 @@
> >  #!/bin/sh
> >  #
> > -# Download and run Docker image to build and test 32-bit Git
> > +# Download and run Docker image to build and test git
> 
> s/git/Git/

Will change

-- 
Danh


* [PATCH v2 0/4] Travis + Azure jobs for linux with musl libc
  2020-03-26  7:35 [PATCH 0/3] add travis job for linux with musl libc Đoàn Trần Công Danh
                   ` (3 preceding siblings ...)
  2020-03-29  5:49 ` [PATCH 0/3] add travis job for linux with musl libc Junio C Hamano
@ 2020-03-29 10:12 ` Đoàn Trần Công Danh
  2020-03-29 10:12   ` [PATCH v2 1/4] ci: libify logic for usage and checking CI_USER Đoàn Trần Công Danh
                     ` (5 more replies)
  4 siblings, 6 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-03-29 10:12 UTC (permalink / raw)
  To: git
  Cc: Đoàn Trần Công Danh, Jonathan Tan,
	SZEDER Gábor, Junio C Hamano, Eric Sunshine

Recently, we've un-broken git for Linux with musl libc,
and we have a series to fix false negatives with busybox shell utils.

Add a CI job on Travis and Azure to make sure we won't break it again.

There is a sample travis build for this series applied on top of:
jt/rebase-allow-duplicate
https://travis-ci.org/github/sgn/git/builds/668300819

And, after merging with Junio's pu to fix busybox false negative:
https://travis-ci.org/github/sgn/git/builds/668300919
https://dev.azure.com/git/git/_build/results?buildId=1910&view=results

Change from v1:
- fix spelling
- run-docker.sh: use "jobname" environment variable instead of passing argument
- add linux-musl job on Azure
- Add 4th patch for jt/rebase-allow-duplicate (feel free to squash into
 jt/rebase-allow-duplicate)

The first 3 patches could be applied on top of master,
but the last patch needs to be applied on top of jt/rebase-allow-duplicate


Đoàn Trần Công Danh (4):
  ci: libify logic for usage and checking CI_USER
  ci: refactor docker runner script
  travis: build and test on Linux with musl libc and busybox
  t3402: use POSIX compliant regex(7)

 .travis.yml                                 | 10 +++++-
 azure-pipelines.yml                         | 39 +++++++++++++++++++--
 ci/lib-docker.sh                            | 37 +++++++++++++++++++
 ci/run-alpine-build.sh                      | 31 ++++++++++++++++
 ci/{run-linux32-docker.sh => run-docker.sh} | 26 ++++++++++----
 ci/run-linux32-build.sh                     | 35 +-----------------
 t/t3402-rebase-merge.sh                     |  8 ++---
 7 files changed, 139 insertions(+), 47 deletions(-)
 create mode 100644 ci/lib-docker.sh
 create mode 100755 ci/run-alpine-build.sh
 rename ci/{run-linux32-docker.sh => run-docker.sh} (46%)

Range-diff against v1:
1:  f23f2a563a = 1:  1ec7c2024d ci: libify logic for usage and checking CI_USER
2:  6fd1370678 ! 2:  140e0ef390 ci: refactor docker runner script
    @@ Metadata
      ## Commit message ##
         ci: refactor docker runner script
     
    -    We will support alpine check in docker later in this serie.
    +    We will support alpine check in docker later in this series.
     
         While we're at it, tell people to run as root in podman.
     
    @@ .travis.yml: matrix:
              - docker
            before_install:
     -      script: ci/run-linux32-docker.sh
    -+      script: ci/run-docker.sh linux32
    ++      script: ci/run-docker.sh
          - env: jobname=StaticAnalysis
            os: linux
            compiler:
    @@ azure-pipelines.yml: jobs:
      
             res=0
     -       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" bash -lxc ci/run-linux32-docker.sh || res=1
    -+       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" bash -lxc ci/run-docker.sh linux32 || res=1
    ++       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" jobname=Linux32 bash -lxc ci/run-docker.sh || res=1
      
             sudo chmod a+r t/out/TEST-*.xml
             test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts
    @@ azure-pipelines.yml: jobs:
             test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || res=1
             exit $res
     -    displayName: 'ci/run-linux32-docker.sh'
    -+    displayName: 'ci/run-docker.sh linux32'
    ++    displayName: 'jobname=Linux32 ci/run-docker.sh'
          env:
            GITFILESHAREPWD: $(gitfileshare.pwd)
        - task: PublishTestResults@2
    @@ ci/run-docker.sh (new)
     @@
     +#!/bin/sh
     +#
    -+# Download and run Docker image to build and test git
    ++# Download and run Docker image to build and test Git
     +#
     +
     +. ${0%/*}/lib.sh
     +
    -+CI_TARGET=${1:-linux32}
    -+case "$CI_TARGET" in
    -+linux32) CI_CONTAINER="daald/ubuntu32:xenial" ;;
    -+*)       exit 1 ;;
    ++case "$jobname" in
    ++Linux32)
    ++	CI_TARGET=linux32
    ++	CI_CONTAINER="daald/ubuntu32:xenial"
    ++	;;
    ++*)
    ++	exit 1 ;;
     +esac
     +
     +docker pull "$CI_CONTAINER"
3:  2f68e65fb7 ! 3:  6cf6400f2e travis: build and test on Linux with musl libc and busybox
    @@ .travis.yml
     @@ .travis.yml: matrix:
              - docker
            before_install:
    -       script: ci/run-docker.sh linux32
    -+    - env: jobname=linux-musl-busybox
    +       script: ci/run-docker.sh
    ++    - env: jobname=linux-musl
     +      os: linux
     +      compiler:
     +      addons:
     +      services:
     +        - docker
     +      before_install:
    -+      script: ci/run-docker.sh alpine
    ++      script: ci/run-docker.sh
          - env: jobname=StaticAnalysis
            os: linux
            compiler:
     
    + ## azure-pipelines.yml ##
    +@@ azure-pipelines.yml: jobs:
    +       PathtoPublish: t/failed-test-artifacts
    +       ArtifactName: failed-test-artifacts
    + 
    ++- job: linux_musl
    ++  displayName: linux-musl
    ++  condition: succeeded()
    ++  pool:
    ++    vmImage: ubuntu-latest
    ++  steps:
    ++  - bash: |
    ++       test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
    ++
    ++       res=0
    ++       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" jobname=linux-musl bash -lxc ci/run-docker.sh || res=1
    ++
    ++       sudo chmod a+r t/out/TEST-*.xml
    ++       test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts
    ++
    ++       test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || res=1
    ++       exit $res
    ++    displayName: 'jobname=linux-musl ci/run-docker.sh'
    ++    env:
    ++      GITFILESHAREPWD: $(gitfileshare.pwd)
    ++  - task: PublishTestResults@2
    ++    displayName: 'Publish Test Results **/TEST-*.xml'
    ++    inputs:
    ++      mergeTestResults: true
    ++      testRunTitle: 'musl'
    ++      platform: Linux
    ++      publishRunAttachments: false
    ++    condition: succeededOrFailed()
    ++  - task: PublishBuildArtifacts@1
    ++    displayName: 'Publish trash directories of failed tests'
    ++    condition: failed()
    ++    inputs:
    ++      PathtoPublish: t/failed-test-artifacts
    ++      ArtifactName: failed-test-artifacts
    ++
    + - job: static_analysis
    +   displayName: StaticAnalysis
    +   condition: succeeded()
    +
      ## ci/run-alpine-build.sh (new) ##
     @@
     +#!/bin/sh
    @@ ci/run-alpine-build.sh (new)
     +'
     
      ## ci/run-docker.sh ##
    -@@
    - CI_TARGET=${1:-linux32}
    - case "$CI_TARGET" in
    - linux32) CI_CONTAINER="daald/ubuntu32:xenial" ;;
    -+alpine)  CI_CONTAINER="alpine" ;;
    - *)       exit 1 ;;
    +@@ ci/run-docker.sh: Linux32)
    + 	CI_TARGET=linux32
    + 	CI_CONTAINER="daald/ubuntu32:xenial"
    + 	;;
    ++linux-musl)
    ++	CI_TARGET=alpine
    ++	CI_CONTAINER=alpine
    ++	;;
    + *)
    + 	exit 1 ;;
      esac
    - 
-:  ---------- > 4:  a4eacb4362 t3402: use POSIX compliant regex(7)
-- 
2.26.0.302.g234993491e



* [PATCH v2 1/4] ci: libify logic for usage and checking CI_USER
  2020-03-29 10:12 ` [PATCH v2 0/4] Travis + Azure jobs " Đoàn Trần Công Danh
@ 2020-03-29 10:12   ` Đoàn Trần Công Danh
  2020-03-29 10:12   ` [PATCH v2 2/4] ci: refactor docker runner script Đoàn Trần Công Danh
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-03-29 10:12 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

This logic will be reused for the Alpine docker job later.

Merge these pieces of logic into a single chunk since they will be
used together.

While we're at it, add a comment telling people to run as root inside
a podman container.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
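
A minimal sketch of the sourcing idiom this patch relies on; the
`script_dir` helper is hypothetical and only wraps the `${0%/*}`
expansion so that it can be shown on a fixed path:

```shell
#!/bin/sh
# ${0%/*} strips the shortest trailing "/<name>" from $0, leaving the
# directory of the running script, so that a sibling library can be
# sourced regardless of the current working directory.
script_dir () {
	# $1: a script path as it would appear in $0
	echo "${1%/*}"
}

script_dir /usr/src/git/ci/run-alpine-build.sh

# A file sourced with "." and no extra arguments sees the caller's
# positional parameters, which is why lib-docker.sh can check "$#"
# and "$1" on behalf of each build script:
#   . "$(script_dir "$0")/lib-docker.sh"
```
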
 ci/lib-docker.sh        | 37 +++++++++++++++++++++++++++++++++++++
 ci/run-linux32-build.sh | 35 +----------------------------------
 2 files changed, 38 insertions(+), 34 deletions(-)
 create mode 100644 ci/lib-docker.sh

diff --git a/ci/lib-docker.sh b/ci/lib-docker.sh
new file mode 100644
index 0000000000..ac155ace54
--- /dev/null
+++ b/ci/lib-docker.sh
@@ -0,0 +1,37 @@
+# Library of functions shared by all CI scripts run inside docker
+
+if test $# -ne 1 || test -z "$1"
+then
+	echo >&2 "usage: $0 <host-user-id>"
+	exit 1
+fi
+
+# If this script runs inside a docker container, then all commands are
+# usually executed as root. Consequently, the host user might not be
+# able to access the test output files.
+# If a non 0 host user id is given, then create a user "ci" with that
+# user id to make everything accessible to the host user.
+HOST_UID=$1
+if test $HOST_UID -eq 0
+then
+	# Just in case someone does want to run the test suite as root.
+	# or podman is used in place of docker
+	CI_USER=root
+else
+	CI_USER=ci
+	if test "$(id -u $CI_USER 2>/dev/null)" = $HOST_UID
+	then
+		echo "user '$CI_USER' already exists with the requested ID $HOST_UID"
+	else
+		useradd -u $HOST_UID $CI_USER
+	fi
+
+	# Due to a bug the test suite was run as root in the past, so
+	# a prove state file created back then is only accessible by
+	# root.  Now that bug is fixed, the test suite is run as a
+	# regular user, but the prove state file coming from Travis
+	# CI's cache might still be owned by root.
+	# Make sure that this user has rights to any cached files,
+	# including an existing prove state file.
+	test -n "$cache_dir" && chown -R $HOST_UID:$HOST_UID "$cache_dir"
+fi
diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index e3a193adbc..81296cdd19 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -8,11 +8,7 @@
 
 set -ex
 
-if test $# -ne 1 || test -z "$1"
-then
-	echo >&2 "usage: run-linux32-build.sh <host-user-id>"
-	exit 1
-fi
+. "${0%/*}/lib-docker.sh"
 
 # Update packages to the latest available versions
 linux32 --32bit i386 sh -c '
@@ -21,35 +17,6 @@ linux32 --32bit i386 sh -c '
 	libexpat-dev gettext python >/dev/null
 '
 
-# If this script runs inside a docker container, then all commands are
-# usually executed as root. Consequently, the host user might not be
-# able to access the test output files.
-# If a non 0 host user id is given, then create a user "ci" with that
-# user id to make everything accessible to the host user.
-HOST_UID=$1
-if test $HOST_UID -eq 0
-then
-	# Just in case someone does want to run the test suite as root.
-	CI_USER=root
-else
-	CI_USER=ci
-	if test "$(id -u $CI_USER 2>/dev/null)" = $HOST_UID
-	then
-		echo "user '$CI_USER' already exists with the requested ID $HOST_UID"
-	else
-		useradd -u $HOST_UID $CI_USER
-	fi
-
-	# Due to a bug the test suite was run as root in the past, so
-	# a prove state file created back then is only accessible by
-	# root.  Now that bug is fixed, the test suite is run as a
-	# regular user, but the prove state file coming from Travis
-	# CI's cache might still be owned by root.
-	# Make sure that this user has rights to any cached files,
-	# including an existing prove state file.
-	test -n "$cache_dir" && chown -R $HOST_UID:$HOST_UID "$cache_dir"
-fi
-
 # Build and test
 linux32 --32bit i386 su -m -l $CI_USER -c '
 	set -ex
-- 
2.26.0.302.g234993491e



* [PATCH v2 2/4] ci: refactor docker runner script
  2020-03-29 10:12 ` [PATCH v2 0/4] Travis + Azure jobs " Đoàn Trần Công Danh
  2020-03-29 10:12   ` [PATCH v2 1/4] ci: libify logic for usage and checking CI_USER Đoàn Trần Công Danh
@ 2020-03-29 10:12   ` Đoàn Trần Công Danh
  2020-04-01 21:51     ` SZEDER Gábor
  2020-03-29 10:12   ` [PATCH v2 3/4] travis: build and test on Linux with musl libc and busybox Đoàn Trần Công Danh
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-03-29 10:12 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

We will support alpine check in docker later in this series.

While we're at it, tell people to run as root in podman.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 .travis.yml                                 |  2 +-
 azure-pipelines.yml                         |  4 ++--
 ci/{run-linux32-docker.sh => run-docker.sh} | 22 +++++++++++++++------
 3 files changed, 19 insertions(+), 9 deletions(-)
 rename ci/{run-linux32-docker.sh => run-docker.sh} (48%)

diff --git a/.travis.yml b/.travis.yml
index fc5730b085..069aeeff3c 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -32,7 +32,7 @@ matrix:
       services:
         - docker
       before_install:
-      script: ci/run-linux32-docker.sh
+      script: ci/run-docker.sh
     - env: jobname=StaticAnalysis
       os: linux
       compiler:
diff --git a/azure-pipelines.yml b/azure-pipelines.yml
index af2a5ea484..f6dcc35ad4 100644
--- a/azure-pipelines.yml
+++ b/azure-pipelines.yml
@@ -478,14 +478,14 @@ jobs:
        test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
 
        res=0
-       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" bash -lxc ci/run-linux32-docker.sh || res=1
+       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" jobname=Linux32 bash -lxc ci/run-docker.sh || res=1
 
        sudo chmod a+r t/out/TEST-*.xml
        test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts
 
        test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || res=1
        exit $res
-    displayName: 'ci/run-linux32-docker.sh'
+    displayName: 'jobname=Linux32 ci/run-docker.sh'
     env:
       GITFILESHAREPWD: $(gitfileshare.pwd)
   - task: PublishTestResults@2
diff --git a/ci/run-linux32-docker.sh b/ci/run-docker.sh
similarity index 48%
rename from ci/run-linux32-docker.sh
rename to ci/run-docker.sh
index 751acfcf8a..be698817cb 100755
--- a/ci/run-linux32-docker.sh
+++ b/ci/run-docker.sh
@@ -1,15 +1,25 @@
 #!/bin/sh
 #
-# Download and run Docker image to build and test 32-bit Git
+# Download and run Docker image to build and test Git
 #
 
 . ${0%/*}/lib.sh
 
-docker pull daald/ubuntu32:xenial
+case "$jobname" in
+Linux32)
+	CI_TARGET=linux32
+	CI_CONTAINER="daald/ubuntu32:xenial"
+	;;
+*)
+	exit 1 ;;
+esac
+
+docker pull "$CI_CONTAINER"
 
 # Use the following command to debug the docker build locally:
-# $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/bash daald/ubuntu32:xenial
-# root@container:/# /usr/src/git/ci/run-linux32-build.sh <host-user-id>
+# <host-user-id> must be 0 if podman is used in place of docker
+# $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/sh "$CI_CONTAINER"
+# root@container:/# /usr/src/git/ci/run-$CI_TARGET-build.sh <host-user-id>
 
 container_cache_dir=/tmp/travis-cache
 
@@ -23,8 +33,8 @@ docker run \
 	--env cache_dir="$container_cache_dir" \
 	--volume "${PWD}:/usr/src/git" \
 	--volume "$cache_dir:$container_cache_dir" \
-	daald/ubuntu32:xenial \
-	/usr/src/git/ci/run-linux32-build.sh $(id -u $USER)
+	"$CI_CONTAINER" \
+	"/usr/src/git/ci/run-$CI_TARGET-build.sh" $(id -u $USER)
 
 check_unignored_build_artifacts
 
-- 
2.26.0.302.g234993491e



* [PATCH v2 3/4] travis: build and test on Linux with musl libc and busybox
  2020-03-29 10:12 ` [PATCH v2 0/4] Travis + Azure jobs " Đoàn Trần Công Danh
  2020-03-29 10:12   ` [PATCH v2 1/4] ci: libify logic for usage and checking CI_USER Đoàn Trần Công Danh
  2020-03-29 10:12   ` [PATCH v2 2/4] ci: refactor docker runner script Đoàn Trần Công Danh
@ 2020-03-29 10:12   ` Đoàn Trần Công Danh
  2020-04-01 22:18     ` SZEDER Gábor
  2020-03-29 16:23   ` [PATCH v2 0/4] Travis + Azure jobs for linux with musl libc Junio C Hamano
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-03-29 10:12 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 .travis.yml            |  8 ++++++++
 azure-pipelines.yml    | 35 +++++++++++++++++++++++++++++++++++
 ci/run-alpine-build.sh | 31 +++++++++++++++++++++++++++++++
 ci/run-docker.sh       |  4 ++++
 4 files changed, 78 insertions(+)
 create mode 100755 ci/run-alpine-build.sh

diff --git a/.travis.yml b/.travis.yml
index 069aeeff3c..0cfc3c3428 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -33,6 +33,14 @@ matrix:
         - docker
       before_install:
       script: ci/run-docker.sh
+    - env: jobname=linux-musl
+      os: linux
+      compiler:
+      addons:
+      services:
+        - docker
+      before_install:
+      script: ci/run-docker.sh
     - env: jobname=StaticAnalysis
       os: linux
       compiler:
diff --git a/azure-pipelines.yml b/azure-pipelines.yml
index f6dcc35ad4..615289167b 100644
--- a/azure-pipelines.yml
+++ b/azure-pipelines.yml
@@ -503,6 +503,41 @@ jobs:
       PathtoPublish: t/failed-test-artifacts
       ArtifactName: failed-test-artifacts
 
+- job: linux_musl
+  displayName: linux-musl
+  condition: succeeded()
+  pool:
+    vmImage: ubuntu-latest
+  steps:
+  - bash: |
+       test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
+
+       res=0
+       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" jobname=linux-musl bash -lxc ci/run-docker.sh || res=1
+
+       sudo chmod a+r t/out/TEST-*.xml
+       test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts
+
+       test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || res=1
+       exit $res
+    displayName: 'jobname=linux-musl ci/run-docker.sh'
+    env:
+      GITFILESHAREPWD: $(gitfileshare.pwd)
+  - task: PublishTestResults@2
+    displayName: 'Publish Test Results **/TEST-*.xml'
+    inputs:
+      mergeTestResults: true
+      testRunTitle: 'musl'
+      platform: Linux
+      publishRunAttachments: false
+    condition: succeededOrFailed()
+  - task: PublishBuildArtifacts@1
+    displayName: 'Publish trash directories of failed tests'
+    condition: failed()
+    inputs:
+      PathtoPublish: t/failed-test-artifacts
+      ArtifactName: failed-test-artifacts
+
 - job: static_analysis
   displayName: StaticAnalysis
   condition: succeeded()
diff --git a/ci/run-alpine-build.sh b/ci/run-alpine-build.sh
new file mode 100755
index 0000000000..c83df536e4
--- /dev/null
+++ b/ci/run-alpine-build.sh
@@ -0,0 +1,31 @@
+#!/bin/sh
+#
+# Build and test Git in Alpine Linux
+#
+# Usage:
+#   run-alpine-build.sh <host-user-id>
+#
+
+set -ex
+
+useradd () {
+	adduser -D "$@"
+}
+
+. "${0%/*}/lib-docker.sh"
+
+# Update packages to the latest available versions
+apk add --update autoconf build-base curl-dev openssl-dev expat-dev \
+	gettext pcre2-dev python3 musl-libintl >/dev/null
+
+# Build and test
+su -m -l $CI_USER -c '
+	set -ex
+	cd /usr/src/git
+	test -n "$cache_dir" && ln -s "$cache_dir/.prove" t/.prove
+	autoconf
+	echo "PYTHON_PATH=/usr/bin/python3" >config.mak
+	./configure --with-libpcre
+	make
+	make test
+'
diff --git a/ci/run-docker.sh b/ci/run-docker.sh
index be698817cb..f203db03cf 100755
--- a/ci/run-docker.sh
+++ b/ci/run-docker.sh
@@ -10,6 +10,10 @@ Linux32)
 	CI_TARGET=linux32
 	CI_CONTAINER="daald/ubuntu32:xenial"
 	;;
+linux-musl)
+	CI_TARGET=alpine
+	CI_CONTAINER=alpine
+	;;
 *)
 	exit 1 ;;
 esac
-- 
2.26.0.302.g234993491e



* [PATCH v2 4/4] t3402: use POSIX compliant regex(7)
  2020-03-18 17:30 ` [PATCH v2] " Jonathan Tan
                     ` (2 preceding siblings ...)
  2020-03-26 17:50   ` Jonathan Tan
@ 2020-03-29 10:12   ` Đoàn Trần Công Danh
  3 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-03-29 10:12 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh, Jonathan Tan

`\?` is undefined for POSIX BRE, from:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_02

> The interpretation of an ordinary character preceded
> by an unescaped <backslash> ( '\\' ) is undefined, except for:
> - The characters ')', '(', '{', and '}'
> - The digits 1 to 9 inclusive
> - A character inside a bracket expression

This test is failing with busybox grep.

Fix it by using a bracket expression, `[?]`, instead.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
Needs to be applied on top of jt/rebase-allow-duplicate 

Cc: Jonathan Tan <jonathantanmy@google.com>
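
For illustration only, a small sketch of the portable pattern outside
the test suite; the `is_missing` helper and the object ids below are
made up, and the input merely imitates the "?<object-id>" lines that
"git rev-list --missing=print" emits for missing objects:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the object id in $1 is listed as
# missing ("?<object-id>") in the rev-list output read from stdin.
# "[?]" is a bracket expression that matches a literal question mark,
# which POSIX BRE defines, unlike "\?".
is_missing () {
	grep "[?]$1" >/dev/null
}

if printf '%s\n' "?0123abcd" "4567ef89" | is_missing 0123abcd
then
	echo "0123abcd is missing"
fi
```
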

 t/t3402-rebase-merge.sh | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/t/t3402-rebase-merge.sh b/t/t3402-rebase-merge.sh
index 290c79e0f6..e8bab8102d 100755
--- a/t/t3402-rebase-merge.sh
+++ b/t/t3402-rebase-merge.sh
@@ -228,15 +228,15 @@ test_expect_success '--skip-cherry-pick-detection refrains from reading unneeded
 	git -C client rev-list --objects --all --missing=print >missing_list &&
 	MERGE_BASE_BLOB=$(git -C server rev-parse master^^:file.txt) &&
 	ADD_11_BLOB=$(git -C server rev-parse master^:file.txt) &&
-	grep "\\?$MERGE_BASE_BLOB" missing_list &&
-	grep "\\?$ADD_11_BLOB" missing_list &&
+	grep "[?]$MERGE_BASE_BLOB" missing_list &&
+	grep "[?]$ADD_11_BLOB" missing_list &&
 
 	git -C client rebase --merge --skip-cherry-pick-detection origin/master &&
 
 	# The blob from the merge base had to be fetched, but not "add 11"
 	git -C client rev-list --objects --all --missing=print >missing_list &&
-	! grep "\\?$MERGE_BASE_BLOB" missing_list &&
-	grep "\\?$ADD_11_BLOB" missing_list
+	! grep "[?]$MERGE_BASE_BLOB" missing_list &&
+	grep "[?]$ADD_11_BLOB" missing_list
 '
 
 test_done
-- 
2.26.0.302.g234993491e



* Re: [PATCH v2 0/4] Travis + Azure jobs for linux with musl libc
  2020-03-29 10:12 ` [PATCH v2 0/4] Travis + Azure jobs " Đoàn Trần Công Danh
                     ` (2 preceding siblings ...)
  2020-03-29 10:12   ` [PATCH v2 3/4] travis: build and test on Linux with musl libc and busybox Đoàn Trần Công Danh
@ 2020-03-29 16:23   ` Junio C Hamano
  2020-04-02 13:03   ` [PATCH v3 0/6] " Đoàn Trần Công Danh
  2020-04-04  1:08   ` [PATCH v4 0/6] Travis " Đoàn Trần Công Danh
  5 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2020-03-29 16:23 UTC (permalink / raw)
  To: Đoàn Trần Công Danh
  Cc: git, Jonathan Tan, SZEDER Gábor, Eric Sunshine

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> Change from v1:
> - fix spelling
> - run-docker.sh: use "jobname" environment variable instead of passing argument
> - add linux-musl job on Azure
> - Add 4th patch for jt/rebase-allow-duplicate (feel free to squash into
>  jt/rebase-allow-duplicate)
>
> The first 3 patches could be applied on top of master,
> but the last patch needs to be applied on top of jt/rebase-allow-duplicate

This note was very helpful.  Very much appreciated.

Please keep this a three-patch series ([1/4], [2/4] and [3/4] become
[1/3], [2/3] and [3/3]), and make the fourth one a separate fix to
the other topic.  Even if we were not going to take this topic, the
last one is an independently useful improvement.

I'll update jt/rebase-allow-duplicate with the last one, so no real
harm done, but keeping the topics separate on the list would help
reduce confusion.

Thanks.


* [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-03-09 20:55 [PATCH] rebase --merge: optionally skip upstreamed commits Jonathan Tan
                   ` (3 preceding siblings ...)
  2020-03-18 17:30 ` [PATCH v2] " Jonathan Tan
@ 2020-03-30  4:06 ` Jonathan Tan
  2020-03-30  5:09   ` Junio C Hamano
                     ` (3 more replies)
  4 siblings, 4 replies; 78+ messages in thread
From: Jonathan Tan @ 2020-03-30  4:06 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, congdanhqx, newren, gitster

When rebasing against an upstream that has had many commits since the
original branch was created:

 O -- O -- ... -- O -- O (upstream)
  \
   -- O (my-dev-branch)

it must read the contents of every novel upstream commit, in addition to
the tip of the upstream and the merge base, because "git rebase"
attempts to exclude commits that are duplicates of upstream ones. This
can be a significant performance hit, especially in a partial clone,
wherein a read of an object may end up being a fetch.

Add a flag to "git rebase" to allow suppression of this feature. This
flag only works when using the "merge" backend.

This flag changes the behavior of sequencer_make_script(), called from
do_interactive_rebase() <- run_rebase_interactive() <-
run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
(indirectly called from sequencer_make_script() through
prepare_revision_walk()) will no longer call cherry_pick_list(), and
thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
means that the intermediate commits in upstream are no longer read (as
shown by the test) and means that no PATCHSAME-caused skipping of
commits is done by sequencer_make_script(), either directly or through
make_script_with_merges().

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
This commit contains Junio's sign-off because I based it on
jt/rebase-allow-duplicate.

This does not include the fix by Đoàn Trần Công Danh. If we want all
commits to pass all tests (whether run by Busybox or not) it seems like
we should squash that patch instead of having it as a separate commit.
If we do squash, maybe include a "Helped-by" with Đoàn Trần Công Danh's
name.

Junio wrote [1]:

> Sounds much better to me.  I do not mind --[no-]keep-cherry-pick,
> either, by the way.  I know I raised the possibility of having to
> make it non-bool later, but since then I haven't thought of a good
> third option myself anyway, so...

In that case, I think it's better to stick to bool. This also means that
the change from the version in jt/rebase-allow-duplicate is very small,
hopefully aiding reviewers - mostly a replacement of
--skip-cherry-pick-detection with --keep-cherry-pick (which mean the
same thing).

[1] https://lore.kernel.org/git/xmqq4kuakjcn.fsf@gitster.c.googlers.com/
---
 Documentation/git-rebase.txt | 21 +++++++++-
 builtin/rebase.c             |  7 ++++
 sequencer.c                  |  3 +-
 sequencer.h                  |  2 +-
 t/t3402-rebase-merge.sh      | 77 ++++++++++++++++++++++++++++++++++++
 5 files changed, 107 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 0c4f038dd6..f4f8afeb9a 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -318,6 +318,21 @@ See also INCOMPATIBLE OPTIONS below.
 +
 See also INCOMPATIBLE OPTIONS below.
 
+--keep-cherry-pick::
+--no-keep-cherry-pick::
+	Control rebase's behavior towards commits in the working
+	branch that are already present upstream, i.e. cherry-picks.
++
+By default, these commits will be dropped. Because this necessitates
+reading all upstream commits, this can be expensive in repos with a
+large number of upstream commits that need to be read.
++
+If `--keep-cherry-pick is given`, all commits (including these) will be
+re-applied. This allows rebase to forgo reading all upstream commits,
+potentially improving performance.
++
+See also INCOMPATIBLE OPTIONS below.
+
 --rerere-autoupdate::
 --no-rerere-autoupdate::
 	Allow the rerere mechanism to update the index with the
@@ -568,6 +583,9 @@ In addition, the following pairs of options are incompatible:
  * --keep-base and --onto
  * --keep-base and --root
 
+Also, the --keep-cherry-pick option requires the use of the merge backend
+(e.g., through --merge).
+
 BEHAVIORAL DIFFERENCES
 -----------------------
 
@@ -866,7 +884,8 @@ Only works if the changes (patch IDs based on the diff contents) on
 'subsystem' did.
 
 In that case, the fix is easy because 'git rebase' knows to skip
-changes that are already present in the new upstream.  So if you say
+changes that are already present in the new upstream (unless
+`--keep-cherry-pick` is given). So if you say
 (assuming you're on 'topic')
 ------------
     $ git rebase subsystem
diff --git a/builtin/rebase.c b/builtin/rebase.c
index 8081741f8a..626549b0b2 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -88,6 +88,7 @@ struct rebase_options {
 	struct strbuf git_format_patch_opt;
 	int reschedule_failed_exec;
 	int use_legacy_rebase;
+	int keep_cherry_pick;
 };
 
 #define REBASE_OPTIONS_INIT {			  	\
@@ -381,6 +382,7 @@ static int run_rebase_interactive(struct rebase_options *opts,
 	flags |= opts->rebase_cousins > 0 ? TODO_LIST_REBASE_COUSINS : 0;
 	flags |= opts->root_with_onto ? TODO_LIST_ROOT_WITH_ONTO : 0;
 	flags |= command == ACTION_SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;
+	flags |= opts->keep_cherry_pick ? TODO_LIST_KEEP_CHERRY_PICK : 0;
 
 	switch (command) {
 	case ACTION_NONE: {
@@ -1515,6 +1517,8 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "reschedule-failed-exec",
 			 &reschedule_failed_exec,
 			 N_("automatically re-schedule any `exec` that fails")),
+		OPT_BOOL(0, "keep-cherry-pick", &options.keep_cherry_pick,
+			 N_("apply all changes, even those already present upstream")),
 		OPT_END(),
 	};
 	int i;
@@ -1848,6 +1852,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 			      "interactive or merge options"));
 	}
 
+	if (options.keep_cherry_pick && !is_interactive(&options))
+		die(_("--keep-cherry-pick does not work with the 'apply' backend"));
+
 	if (options.signoff) {
 		if (options.type == REBASE_PRESERVE_MERGES)
 			die("cannot combine '--signoff' with "
diff --git a/sequencer.c b/sequencer.c
index b9dbf1adb0..7bbb63f444 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -4800,12 +4800,13 @@ int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
 	int keep_empty = flags & TODO_LIST_KEEP_EMPTY;
 	const char *insn = flags & TODO_LIST_ABBREVIATE_CMDS ? "p" : "pick";
 	int rebase_merges = flags & TODO_LIST_REBASE_MERGES;
+	int keep_cherry_pick = flags & TODO_LIST_KEEP_CHERRY_PICK;
 
 	repo_init_revisions(r, &revs, NULL);
 	revs.verbose_header = 1;
 	if (!rebase_merges)
 		revs.max_parents = 1;
-	revs.cherry_mark = 1;
+	revs.cherry_mark = !keep_cherry_pick;
 	revs.limited = 1;
 	revs.reverse = 1;
 	revs.right_only = 1;
diff --git a/sequencer.h b/sequencer.h
index 9f9ae291e3..298b7de1c8 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -148,7 +148,7 @@ int sequencer_remove_state(struct replay_opts *opts);
  * `--onto`, we do not want to re-generate the root commits.
  */
 #define TODO_LIST_ROOT_WITH_ONTO (1U << 6)
-
+#define TODO_LIST_KEEP_CHERRY_PICK (1U << 7)
 
 int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
 			  const char **argv, unsigned flags);
diff --git a/t/t3402-rebase-merge.sh b/t/t3402-rebase-merge.sh
index a1ec501a87..64200c5f20 100755
--- a/t/t3402-rebase-merge.sh
+++ b/t/t3402-rebase-merge.sh
@@ -162,4 +162,81 @@ test_expect_success 'rebase --skip works with two conflicts in a row' '
 	git rebase --skip
 '
 
+test_expect_success '--keep-cherry-pick' '
+	git init repo &&
+
+	# O(1-10) -- O(1-11) -- O(0-10) master
+	#        \
+	#         -- O(1-11) -- O(1-12) otherbranch
+
+	printf "Line %d\n" $(test_seq 1 10) >repo/file.txt &&
+	git -C repo add file.txt &&
+	git -C repo commit -m "base commit" &&
+
+	printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
+	git -C repo commit -a -m "add 11" &&
+
+	printf "Line %d\n" $(test_seq 0 10) >repo/file.txt &&
+	git -C repo commit -a -m "add 0 delete 11" &&
+
+	git -C repo checkout -b otherbranch HEAD^^ &&
+	printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
+	git -C repo commit -a -m "add 11 in another branch" &&
+
+	printf "Line %d\n" $(test_seq 1 12) >repo/file.txt &&
+	git -C repo commit -a -m "add 12 in another branch" &&
+
+	# Regular rebase fails, because the 1-11 commit is deduplicated
+	test_must_fail git -C repo rebase --merge master 2> err &&
+	test_i18ngrep "error: could not apply.*add 12 in another branch" err &&
+	git -C repo rebase --abort &&
+
+	# With --keep-cherry-pick, it works
+	git -C repo rebase --merge --keep-cherry-pick master
+'
+
+test_expect_success '--keep-cherry-pick refrains from reading unneeded blobs' '
+	git init server &&
+
+	# O(1-10) -- O(1-11) -- O(1-12) master
+	#        \
+	#         -- O(0-10) otherbranch
+
+	printf "Line %d\n" $(test_seq 1 10) >server/file.txt &&
+	git -C server add file.txt &&
+	git -C server commit -m "merge base" &&
+
+	printf "Line %d\n" $(test_seq 1 11) >server/file.txt &&
+	git -C server commit -a -m "add 11" &&
+
+	printf "Line %d\n" $(test_seq 1 12) >server/file.txt &&
+	git -C server commit -a -m "add 12" &&
+
+	git -C server checkout -b otherbranch HEAD^^ &&
+	printf "Line %d\n" $(test_seq 0 10) >server/file.txt &&
+	git -C server commit -a -m "add 0" &&
+
+	test_config -C server uploadpack.allowfilter 1 &&
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+
+	git clone --filter=blob:none "file://$(pwd)/server" client &&
+	git -C client checkout origin/master &&
+	git -C client checkout origin/otherbranch &&
+
+	# Sanity check to ensure that the blobs from the merge base and "add
+	# 11" are missing
+	git -C client rev-list --objects --all --missing=print >missing_list &&
+	MERGE_BASE_BLOB=$(git -C server rev-parse master^^:file.txt) &&
+	ADD_11_BLOB=$(git -C server rev-parse master^:file.txt) &&
+	grep "\\?$MERGE_BASE_BLOB" missing_list &&
+	grep "\\?$ADD_11_BLOB" missing_list &&
+
+	git -C client rebase --merge --keep-cherry-pick origin/master &&
+
+	# The blob from the merge base had to be fetched, but not "add 11"
+	git -C client rev-list --objects --all --missing=print >missing_list &&
+	! grep "\\?$MERGE_BASE_BLOB" missing_list &&
+	grep "\\?$ADD_11_BLOB" missing_list
+'
+
 test_done
-- 
2.26.0.rc2.310.g2932bb562d-goog



* Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-03-30  4:06 ` [PATCH v3] rebase --merge: optionally skip upstreamed commits Jonathan Tan
@ 2020-03-30  5:09   ` Junio C Hamano
  2020-03-30  5:22   ` Danh Doan
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2020-03-30  5:09 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, congdanhqx, newren

Jonathan Tan <jonathantanmy@google.com> writes:

> When rebasing against an upstream that has had many commits since the
> original branch was created:
>
>  O -- O -- ... -- O -- O (upstream)
>   \
>    -- O (my-dev-branch)
>
> it must read the contents of every novel upstream commit, in addition to
> the tip of the upstream and the merge base, because "git rebase"
> attempts to exclude commits that are duplicates of upstream ones. This
> can be a significant performance hit, especially in a partial clone,
> wherein a read of an object may end up being a fetch.
>
> Add a flag to "git rebase" to allow suppression of this feature. This
> flag only works when using the "merge" backend.
>
> This flag changes the behavior of sequencer_make_script(), called from
> do_interactive_rebase() <- run_rebase_interactive() <-
> run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
> (indirectly called from sequencer_make_script() through
> prepare_revision_walk()) will no longer call cherry_pick_list(), and
> thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
> means that the intermediate commits in upstream are no longer read (as
> shown by the test) and means that no PATCHSAME-caused skipping of
> commits is done by sequencer_make_script(), either directly or through
> make_script_with_merges().
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> This commit contains Junio's sign-off because I based it on
> jt/rebase-allow-duplicate.
>
> This does not include the fix by Đoàn Trần Công Danh. If we want all
> commits to pass all tests (whether run by Busybox or not) it seems like
> we should squash that patch instead of having it as a separate commit.
> If we do squash, maybe include a "Helped-by" with Đoàn Trần Công Danh's
> name.

Yup, I think Đoàn already said it is fine to squash in, so please do
that.

Thanks.


* Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-03-30  4:06 ` [PATCH v3] rebase --merge: optionally skip upstreamed commits Jonathan Tan
  2020-03-30  5:09   ` Junio C Hamano
@ 2020-03-30  5:22   ` Danh Doan
  2020-03-30 12:13   ` Derrick Stolee
  2020-03-31 16:27   ` Elijah Newren
  3 siblings, 0 replies; 78+ messages in thread
From: Danh Doan @ 2020-03-30  5:22 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, newren, gitster

On 2020-03-29 21:06:21-0700, Jonathan Tan <jonathantanmy@google.com> wrote:
> When rebasing against an upstream that has had many commits since the
> original branch was created:
> 
>  O -- O -- ... -- O -- O (upstream)
>   \
>    -- O (my-dev-branch)
> 
> it must read the contents of every novel upstream commit, in addition to
> the tip of the upstream and the merge base, because "git rebase"
> attempts to exclude commits that are duplicates of upstream ones. This
> can be a significant performance hit, especially in a partial clone,
> wherein a read of an object may end up being a fetch.
> 
> Add a flag to "git rebase" to allow suppression of this feature. This
> flag only works when using the "merge" backend.
> 
> This flag changes the behavior of sequencer_make_script(), called from
> do_interactive_rebase() <- run_rebase_interactive() <-
> run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
> (indirectly called from sequencer_make_script() through
> prepare_revision_walk()) will no longer call cherry_pick_list(), and
> thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
> means that the intermediate commits in upstream are no longer read (as
> shown by the test) and means that no PATCHSAME-caused skipping of
> commits is done by sequencer_make_script(), either directly or through
> make_script_with_merges().
> 
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> This commit contains Junio's sign-off because I based it on
> jt/rebase-allow-duplicate.
> 
> This does not include the fix by Đoàn Trần Công Danh. If we want all
> commits to pass all tests (whether run by Busybox or not) it seems like
> we should squash that patch instead of having it as a separate commit.
> If we do squash, maybe include a "Helped-by" with Đoàn Trần Công Danh's
> name.

Hi Jonathan,

Feel free to squash it in.


-- 
Danh


* Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-03-30  4:06 ` [PATCH v3] rebase --merge: optionally skip upstreamed commits Jonathan Tan
  2020-03-30  5:09   ` Junio C Hamano
  2020-03-30  5:22   ` Danh Doan
@ 2020-03-30 12:13   ` Derrick Stolee
  2020-03-30 16:49     ` Junio C Hamano
  2020-03-30 16:57     ` Jonathan Tan
  2020-03-31 16:27   ` Elijah Newren
  3 siblings, 2 replies; 78+ messages in thread
From: Derrick Stolee @ 2020-03-30 12:13 UTC (permalink / raw)
  To: Jonathan Tan, git; +Cc: congdanhqx, newren, gitster

On 3/30/2020 12:06 AM, Jonathan Tan wrote:
> When rebasing against an upstream that has had many commits since the
> original branch was created:
> 
>  O -- O -- ... -- O -- O (upstream)
>   \
>    -- O (my-dev-branch)
> 
> it must read the contents of every novel upstream commit, in addition to
> the tip of the upstream and the merge base, because "git rebase"
> attempts to exclude commits that are duplicates of upstream ones. This
> can be a significant performance hit, especially in a partial clone,
> wherein a read of an object may end up being a fetch.
> 
> Add a flag to "git rebase" to allow suppression of this feature. This
> flag only works when using the "merge" backend.

So this is the behavior that already exists, and you are providing a way
to suppress it. However, you also change the default in this patch, which
may surprise users expecting this behavior to continue.

> This flag changes the behavior of sequencer_make_script(), called from
> do_interactive_rebase() <- run_rebase_interactive() <-
> run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
> (indirectly called from sequencer_make_script() through
> prepare_revision_walk()) will no longer call cherry_pick_list(), and
> thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
> means that the intermediate commits in upstream are no longer read (as
> shown by the test) and means that no PATCHSAME-caused skipping of
> commits is done by sequencer_make_script(), either directly or through
> make_script_with_merges().
> 
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> This commit contains Junio's sign-off because I based it on
> jt/rebase-allow-duplicate.
> 
> This does not include the fix by Đoàn Trần Công Danh. If we want all
> commits to pass all tests (whether run by Busybox or not) it seems like
> we should squash that patch instead of having it as a separate commit.
> If we do squash, maybe include a "Helped-by" with Đoàn Trần Công Danh's
> name.
> 
> Junio wrote [1]:
> 
>> Sounds much better to me.  I do not mind --[no-]keep-cherry-pick,
>> either, by the way.  I know I raised the possibility of having to
>> make it non-bool later, but since then I haven't thought of a good
>> third option myself anyway, so...
> 
> In that case, I think it's better to stick to bool. This also means that
> the change from the version in jt/rebase-allow-duplicate is very small,
> hopefully aiding reviewers - mostly a replacement of
> --skip-cherry-pick-detection with --keep-cherry-pick (which mean the
> same thing).
> 
> [1] https://lore.kernel.org/git/xmqq4kuakjcn.fsf@gitster.c.googlers.com/
> ---
>  Documentation/git-rebase.txt | 21 +++++++++-
>  builtin/rebase.c             |  7 ++++
>  sequencer.c                  |  3 +-
>  sequencer.h                  |  2 +-
>  t/t3402-rebase-merge.sh      | 77 ++++++++++++++++++++++++++++++++++++
>  5 files changed, 107 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index 0c4f038dd6..f4f8afeb9a 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -318,6 +318,21 @@ See also INCOMPATIBLE OPTIONS below.
>  +
>  See also INCOMPATIBLE OPTIONS below.
>  
> +--keep-cherry-pick::
> +--no-keep-cherry-pick::

I noticed that this _could_ have been simplified to

	--[no-]keep-cherry-pick::

but I also see several uses of either in our documentation. Do we
have a preference? By inspecting the lines before a "no-" string,
I see that some have these two lines, some use the [no-] pattern,
and others highlight the --no-<option> flag completely separately.

> +	Control rebase's behavior towards commits in the working
> +	branch that are already present upstream, i.e. cherry-picks.

I think the "already present upstream" is misleading. We don't rebase
things that are _reachable_ already, but this is probably better as

	Specify if rebase should include commits in the working branch
	that have diffs equivalent to other commits upstream. For example,
	a cherry-picked commit has an equivalent diff.

> ++
> +By default, these commits will be dropped. Because this necessitates
> +reading all upstream commits, this can be expensive in repos with a
> +large number of upstream commits that need to be read.

Now I'm confused. Are they dropped by default? Which option does what?
--keep-cherry-pick makes me think that cherry-picked commits will come
along for the rebase, so we will not check for them. But you have documented
that --no-keep-cherry-pick is the default.

(Also, I keep writing "--[no-]keep-cherry-picks" (plural) because that
seems more natural to me. Then I go back and fix it when I notice.)

> ++
> +If `--keep-cherry-pick is given`, all commits (including these) will be

Bad tick marks: "`--keep-cherry-pick` is given"

> +re-applied. This allows rebase to forgo reading all upstream commits,
> +potentially improving performance.

This reasoning is good. Could you also introduce a config option to make
--keep-cherry-pick the default? I would like to enable that option by
default in Scalar, but could also see partial clones wanting to enable that
by default, too.

> ++
> +See also INCOMPATIBLE OPTIONS below.
> +

Could we just say that this only applies with the --merge option? Why require
the jump to the end of the options section? (After writing this, I go look
at the rest of the doc file and see this is a common pattern.)

>  --rerere-autoupdate::
>  --no-rerere-autoupdate::
>  	Allow the rerere mechanism to update the index with the
> @@ -568,6 +583,9 @@ In addition, the following pairs of options are incompatible:
>   * --keep-base and --onto
>   * --keep-base and --root
>  
> +Also, the --keep-cherry-pick option requires the use of the merge backend
> +(e.g., through --merge).
> +

Will the command _fail_ if someone says --keep-cherry-pick without the merge
backend, or just have no effect? Also, specify the option with ticks, as

	`--[no-]keep-cherry-pick`

It seems that none of the options in this section are back-ticked, which I think
is a doc bug.

>  BEHAVIORAL DIFFERENCES
>  -----------------------
>  
> @@ -866,7 +884,8 @@ Only works if the changes (patch IDs based on the diff contents) on
>  'subsystem' did.
>  
>  In that case, the fix is easy because 'git rebase' knows to skip
> -changes that are already present in the new upstream.  So if you say
> +changes that are already present in the new upstream (unless
> +`--keep-cherry-pick` is given). So if you say
>  (assuming you're on 'topic')
>  ------------
>      $ git rebase subsystem
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 8081741f8a..626549b0b2 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -88,6 +88,7 @@ struct rebase_options {
>  	struct strbuf git_format_patch_opt;
>  	int reschedule_failed_exec;
>  	int use_legacy_rebase;
> +	int keep_cherry_pick;
>  };
>  
>  #define REBASE_OPTIONS_INIT {			  	\
> @@ -381,6 +382,7 @@ static int run_rebase_interactive(struct rebase_options *opts,
>  	flags |= opts->rebase_cousins > 0 ? TODO_LIST_REBASE_COUSINS : 0;
>  	flags |= opts->root_with_onto ? TODO_LIST_ROOT_WITH_ONTO : 0;
>  	flags |= command == ACTION_SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;
> +	flags |= opts->keep_cherry_pick ? TODO_LIST_KEEP_CHERRY_PICK : 0;

Since opts->keep_cherry_pick is initialized as zero, did you change the default
behavior? Do we not have a test that verifies this behavior when using the merge
backend and no "--keep-cherry-pick" option?

If you initialize it to -1, then you can tell if the --no-keep-cherry-pick option
is specified, which is relevant to my concern below.

>  
>  	switch (command) {
>  	case ACTION_NONE: {
> @@ -1515,6 +1517,8 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  		OPT_BOOL(0, "reschedule-failed-exec",
>  			 &reschedule_failed_exec,
>  			 N_("automatically re-schedule any `exec` that fails")),
> +		OPT_BOOL(0, "keep-cherry-pick", &options.keep_cherry_pick,
> +			 N_("apply all changes, even those already present upstream")),
>  		OPT_END(),
>  	};
>  	int i;
> @@ -1848,6 +1852,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  			      "interactive or merge options"));
>  	}
>  
> +	if (options.keep_cherry_pick && !is_interactive(&options))
> +		die(_("--keep-cherry-pick does not work with the 'apply' backend"));
> +

I see you are failing here. Is this the right decision?

The apply backend will "keep" cherry-picks because it will not look for them upstream.
If anything, shouldn't it be that "--no-keep-cherry-pick" is incompatible?

>  	if (options.signoff) {
>  		if (options.type == REBASE_PRESERVE_MERGES)
>  			die("cannot combine '--signoff' with "
> diff --git a/sequencer.c b/sequencer.c
> index b9dbf1adb0..7bbb63f444 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -4800,12 +4800,13 @@ int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
>  	int keep_empty = flags & TODO_LIST_KEEP_EMPTY;
>  	const char *insn = flags & TODO_LIST_ABBREVIATE_CMDS ? "p" : "pick";
>  	int rebase_merges = flags & TODO_LIST_REBASE_MERGES;
> +	int keep_cherry_pick = flags & TODO_LIST_KEEP_CHERRY_PICK;
>  
>  	repo_init_revisions(r, &revs, NULL);
>  	revs.verbose_header = 1;
>  	if (!rebase_merges)
>  		revs.max_parents = 1;
> -	revs.cherry_mark = 1;
> +	revs.cherry_mark = !keep_cherry_pick;
>  	revs.limited = 1;
>  	revs.reverse = 1;
>  	revs.right_only = 1;
> diff --git a/sequencer.h b/sequencer.h
> index 9f9ae291e3..298b7de1c8 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -148,7 +148,7 @@ int sequencer_remove_state(struct replay_opts *opts);
>   * `--onto`, we do not want to re-generate the root commits.
>   */
>  #define TODO_LIST_ROOT_WITH_ONTO (1U << 6)
> -
> +#define TODO_LIST_KEEP_CHERRY_PICK (1U << 7)
>  
>  int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
>  			  const char **argv, unsigned flags);
> diff --git a/t/t3402-rebase-merge.sh b/t/t3402-rebase-merge.sh
> index a1ec501a87..64200c5f20 100755
> --- a/t/t3402-rebase-merge.sh
> +++ b/t/t3402-rebase-merge.sh
> @@ -162,4 +162,81 @@ test_expect_success 'rebase --skip works with two conflicts in a row' '
>  	git rebase --skip
>  '
>  
> +test_expect_success '--keep-cherry-pick' '
> +	git init repo &&
> +
> +	# O(1-10) -- O(1-11) -- O(0-10) master
> +	#        \
> +	#         -- O(1-11) -- O(1-12) otherbranch
> +
> +	printf "Line %d\n" $(test_seq 1 10) >repo/file.txt &&
> +	git -C repo add file.txt &&
> +	git -C repo commit -m "base commit" &&
> +
> +	printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
> +	git -C repo commit -a -m "add 11" &&
> +
> +	printf "Line %d\n" $(test_seq 0 10) >repo/file.txt &&
> +	git -C repo commit -a -m "add 0 delete 11" &&
> +
> +	git -C repo checkout -b otherbranch HEAD^^ &&
> +	printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
> +	git -C repo commit -a -m "add 11 in another branch" &&
> +
> +	printf "Line %d\n" $(test_seq 1 12) >repo/file.txt &&
> +	git -C repo commit -a -m "add 12 in another branch" &&
> +
> +	# Regular rebase fails, because the 1-11 commit is deduplicated
> +	test_must_fail git -C repo rebase --merge master 2> err &&
> +	test_i18ngrep "error: could not apply.*add 12 in another branch" err &&
> +	git -C repo rebase --abort &&

OK. So here you are demonstrating that the --no-keep-cherry-pick is the
new default. Just trying to be sure that this was intended.

> +
> +	# With --keep-cherry-pick, it works
> +	git -C repo rebase --merge --keep-cherry-pick master
> +'
> +
> +test_expect_success '--keep-cherry-pick refrains from reading unneeded blobs' '
> +	git init server &&
> +
> +	# O(1-10) -- O(1-11) -- O(1-12) master
> +	#        \
> +	#         -- O(0-10) otherbranch
> +
> +	printf "Line %d\n" $(test_seq 1 10) >server/file.txt &&
> +	git -C server add file.txt &&
> +	git -C server commit -m "merge base" &&
> +
> +	printf "Line %d\n" $(test_seq 1 11) >server/file.txt &&
> +	git -C server commit -a -m "add 11" &&
> +
> +	printf "Line %d\n" $(test_seq 1 12) >server/file.txt &&
> +	git -C server commit -a -m "add 12" &&
> +
> +	git -C server checkout -b otherbranch HEAD^^ &&
> +	printf "Line %d\n" $(test_seq 0 10) >server/file.txt &&
> +	git -C server commit -a -m "add 0" &&
> +
> +	test_config -C server uploadpack.allowfilter 1 &&
> +	test_config -C server uploadpack.allowanysha1inwant 1 &&
> +
> +	git clone --filter=blob:none "file://$(pwd)/server" client &&
> +	git -C client checkout origin/master &&
> +	git -C client checkout origin/otherbranch &&
> +
> +	# Sanity check to ensure that the blobs from the merge base and "add
> +	# 11" are missing
> +	git -C client rev-list --objects --all --missing=print >missing_list &&
> +	MERGE_BASE_BLOB=$(git -C server rev-parse master^^:file.txt) &&
> +	ADD_11_BLOB=$(git -C server rev-parse master^:file.txt) &&
> +	grep "\\?$MERGE_BASE_BLOB" missing_list &&
> +	grep "\\?$ADD_11_BLOB" missing_list &&
> +
> +	git -C client rebase --merge --keep-cherry-pick origin/master &&
> +
> +	# The blob from the merge base had to be fetched, but not "add 11"
> +	git -C client rev-list --objects --all --missing=print >missing_list &&
> +	! grep "\\?$MERGE_BASE_BLOB" missing_list &&
> +	grep "\\?$ADD_11_BLOB" missing_list
> +'

I appreciate this test showing that this is accomplishing the goal in
a partial clone. 

Thanks,
-Stolee




* Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-03-30 12:13   ` Derrick Stolee
@ 2020-03-30 16:49     ` Junio C Hamano
  2020-03-30 16:57     ` Jonathan Tan
  1 sibling, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2020-03-30 16:49 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Jonathan Tan, git, congdanhqx, newren

Derrick Stolee <stolee@gmail.com> writes:

>> +--keep-cherry-pick::
>> +--no-keep-cherry-pick::
>
> I noticed that this _could_ have been simplified to
>
> 	--[no-]keep-cherry-pick::
>
> but I also see several uses of either in our documentation. Do we
> have a preference? By inspecting the lines before a "no-" string,
> I see that some have these two lines, some use the [no-] pattern,
> and others highlight the --no-<option> flag completely separately.

"git log -S'--[no-]' Documentation/" (and its "-S'--no-'" variant)
tell us that many of our recent commits do prefer the single-line
form, but then in d333f672 (git-checkout.txt: spell out --no-option,
2019-03-29), we see we turned a handful of "--[no-]option" into
"--option" followed by "--no-option" deliberately  [*1*].

So, we do not seem to have a strong consensus.

I think all the new ones that spell --no-option:: out are the ones
where --option:: and --no-option:: have their own paragraph, e.g.
"--sign/--no-sign" of "git-tag".

As the differences do not matter all that much, I do not mind
declaring (and one of the tasks of the maintainer is to make a
declaration on such a choice, where it matters more that we pick
one and all stick to it than which choice we make) that we'd
prefer the expanded two-liner form (which when
formatted would become a single line with two things on it) and
mark the task to convert from '--[no-]option' as #leftoverbit.

Thanks for your attention to the details.


[Footnote]

*1* The justification given was that it is easier to
search that way and it is less cryptic.  Personally I do not think
it matters that much.  Even when trying to learn what the negated
form does, nobody would look for "--no-keep-ch" to find the above
paragraph.  "keep-cherry-pick" would be what they would look for,
with or without leading double-dashes.


* Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-03-30 12:13   ` Derrick Stolee
  2020-03-30 16:49     ` Junio C Hamano
@ 2020-03-30 16:57     ` Jonathan Tan
  2020-03-31 11:55       ` Derrick Stolee
  1 sibling, 1 reply; 78+ messages in thread
From: Jonathan Tan @ 2020-03-30 16:57 UTC (permalink / raw)
  To: stolee; +Cc: jonathantanmy, git, congdanhqx, newren, gitster

> > Add a flag to "git rebase" to allow suppression of this feature. This
> > flag only works when using the "merge" backend.
> 
> So this is the behavior that already exists, and you are providing a way
> to suppress it. However, you also change the default in this patch, which
> may surprise users expecting this behavior to continue.

First of all, thanks for looking at this.

I'm not changing the default - was there anything in the commit message
that made you believe it? If yes, I could change it.

Looking back to the title, maybe it should be "rebase --merge: make
skipping of upstreamed commits optional" (although that would exceed 50
characters, so I would have to think of a way to shorten it).

> > +--keep-cherry-pick::
> > +--no-keep-cherry-pick::
> 
> I noticed that this _could_ have been simplified to
> 
> 	--[no-]keep-cherry-pick::
> 
> but I also see several uses of either in our documentation. Do we
> have a preference? By inspecting the lines before a "no-" string,
> I see that some have these two lines, some use the [no-] pattern,
> and others highlight the --no-<option> flag completely separately.

I was following the existing options in this file.

> > +	Control rebase's behavior towards commits in the working
> > +	branch that are already present upstream, i.e. cherry-picks.
> 
> I think the "already present upstream" is misleading. We don't rebase
> things that are _reachable_ already, but this is probably better as
> 
> 	Specify if rebase should include commits in the working branch
> 	that have diffs equivalent to other commits upstream. For example,
> 	a cherry-picked commit has an equivalent diff.

OK, I'll use this.

> > +By default, these commits will be dropped. Because this necessitates
> > +reading all upstream commits, this can be expensive in repos with a
> > +large number of upstream commits that need to be read.
> 
> Now I'm confused. Are they dropped by default? Which option does what?
> --keep-cherry-pick makes me think that cherry-picked commits will come
> along for the rebase, so we will not check for them. But you have documented
> that --no-keep-cherry-pick is the default.

This part is followed by "If --keep-cherry-pick is given", so I thought
it would be clear that this is the "--no-keep-cherry-pick" part (or if
nothing is given), but I'll s/By default/By default, or if
--no-keep-cherry-pick is given/.

> (Also, I keep writing "--[no-]keep-cherry-picks" (plural) because that
> seems more natural to me. Then I go back and fix it when I notice.)

OK, let's see if others have opinions on this. Admittedly,
--keep-cherry-picks with the "s" at the end now sounds more natural to
me.

> > +If `--keep-cherry-pick is given`, all commits (including these) will be
> 
> Bad tick marks: "`--keep-cherry-pick` is given"

Thanks.

> > +re-applied. This allows rebase to forgo reading all upstream commits,
> > +potentially improving performance.
> 
> This reasoning is good. Could you also introduce a config option to make
> --keep-cherry-pick the default? I would like to enable that option by
> default in Scalar, but could also see partial clones wanting to enable that
> by default, too.

Maybe this could be done in another patch. This sounds like a good idea.

> > +See also INCOMPATIBLE OPTIONS below.
> > +
> 
> Could we just say that this only applies with the --merge option? Why require
> the jump to the end of the options section? (After writing this, I go look
> at the rest of the doc file and see this is a common pattern.)

Yes, I'm following the pattern.

> > +Also, the --keep-cherry-pick option requires the use of the merge backend
> > +(e.g., through --merge).
> > +
> 
> Will the command _fail_ if someone says --keep-cherry-pick without the merge
> backend, or just have no effect? Also, specify the option with ticks and
> 
> 	`--[no-]keep-cherry-pick`
> 
> It seems that none of the options in this section are back-ticked, which I think
> is a doc bug.

It will fail. I'll figure out how to add a test for that (which might be
difficult since the default merge backend is changing).

I'll add the ticks. (The "no-" is fine with either backend, since it
just invokes the current behavior.)

> > @@ -381,6 +382,7 @@ static int run_rebase_interactive(struct rebase_options *opts,
> >  	flags |= opts->rebase_cousins > 0 ? TODO_LIST_REBASE_COUSINS : 0;
> >  	flags |= opts->root_with_onto ? TODO_LIST_ROOT_WITH_ONTO : 0;
> >  	flags |= command == ACTION_SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;
> > +	flags |= opts->keep_cherry_pick ? TODO_LIST_KEEP_CHERRY_PICK : 0;
> 
> Since opts->keep_cherry_pick is initialized as zero, did you change the default
> behavior? 

I did not change it - keep_cherry_pick being 0 means that we do not keep
any cherry-picks, so we have to read every upstream commit in order to
know which are cherry-picks (which is the current behavior).

> Do we not have a test that verifies this behavior when using the merge
> backend and no "--keep-cherry-pick" option?

Yes, there are existing tests that already check the deduplicating
behavior of "git rebase --merge".

> > +	if (options.keep_cherry_pick && !is_interactive(&options))
> > +		die(_("--keep-cherry-pick does not work with the 'apply' backend"));
> > +
> 
> I see you are failing here. Is this the right decision?
> 
> The apply backend will "keep" cherry-picks because it will not look for them upstream.
> If anything, shouldn't it be that "--no-keep-cherry-pick" is incompatible?

I haven't delved deeply into the "apply" backend, but it seems to me
that it still performs some sort of cherry-pick detection (that is, it
does not keep cherry-picks, thus --no-keep-cherry-pick). In this patch,
I have a test with the lines:

  # Regular rebase fails, because the 1-11 commit is deduplicated
  test_must_fail git -C repo rebase --merge master 2> err &&

When I remove "--merge" from this line, the rebase still fails (with a
different error message, so indeed another backend is used).

> > +test_expect_success '--keep-cherry-pick' '
> > +	git init repo &&
> > +
> > +	# O(1-10) -- O(1-11) -- O(0-10) master
> > +	#        \
> > +	#         -- O(1-11) -- O(1-12) otherbranch
> > +
> > +	printf "Line %d\n" $(test_seq 1 10) >repo/file.txt &&
> > +	git -C repo add file.txt &&
> > +	git -C repo commit -m "base commit" &&
> > +
> > +	printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
> > +	git -C repo commit -a -m "add 11" &&
> > +
> > +	printf "Line %d\n" $(test_seq 0 10) >repo/file.txt &&
> > +	git -C repo commit -a -m "add 0 delete 11" &&
> > +
> > +	git -C repo checkout -b otherbranch HEAD^^ &&
> > +	printf "Line %d\n" $(test_seq 1 11) >repo/file.txt &&
> > +	git -C repo commit -a -m "add 11 in another branch" &&
> > +
> > +	printf "Line %d\n" $(test_seq 1 12) >repo/file.txt &&
> > +	git -C repo commit -a -m "add 12 in another branch" &&
> > +
> > +	# Regular rebase fails, because the 1-11 commit is deduplicated
> > +	test_must_fail git -C repo rebase --merge master 2> err &&
> > +	test_i18ngrep "error: could not apply.*add 12 in another branch" err &&
> > +	git -C repo rebase --abort &&
> 
> OK. So here you are demonstrating that the --no-keep-cherry-pick is the
> new default. Just trying to be sure that this was intended.

Yes, --no-keep-cherry-pick is the default, but it has the same behavior
as if the flag was omitted. (The existing tests that test the
cherry-pick deduplication behavior all still work.)
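
For readers following the quoted test: its fixture files are built with
printf, which reuses the format string once per argument. A minimal
sketch of that idiom follows; "seq" stands in for Git's test_seq helper
(an assumption, since test_seq exists only inside the test suite), and
the variable name "fixture" is hypothetical.

```shell
# seq is a stand-in for test_seq, which is only defined inside Git's
# test suite; the range 1..3 is shortened from the test's 1..10.
fixture=$(printf "Line %d\n" $(seq 1 3))

# printf ran the format once per number, so this prints three lines:
# Line 1, Line 2, Line 3.
echo "$fixture"
```

Changing only the range (1-10, 1-11, 0-10, 1-12) is enough to produce
the small, predictable diffs that the test's commit graph relies on.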


* Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-03-30 16:57     ` Jonathan Tan
@ 2020-03-31 11:55       ` Derrick Stolee
  0 siblings, 0 replies; 78+ messages in thread
From: Derrick Stolee @ 2020-03-31 11:55 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, congdanhqx, newren, gitster

On 3/30/2020 12:57 PM, Jonathan Tan wrote:
>>> Add a flag to "git rebase" to allow suppression of this feature. This
>>> flag only works when using the "merge" backend.
>>
>> So this is the behavior that already exists, and you are providing a way
>> to suppress it. However, you also change the default in this patch, which
>> may surprise users expecting this behavior to continue.
> 
> First of all, thanks for looking at this.
> 
> I'm not changing the default - was there anything in the commit message
> that made you believe it? If yes, I could change it.

It's not your fault. My confusion is all. You make it very clear, but
I got flipped around several times while reading the patch. Here is
your message again:

> When rebasing against an upstream that has had many commits since the
> original branch was created:
> 
>  O -- O -- ... -- O -- O (upstream)
>   \
>    -- O (my-dev-branch)
> 
> it must read the contents of every novel upstream commit, in addition to
> the tip of the upstream and the merge base, because "git rebase"
> attempts to exclude commits that are duplicates of upstream ones. This
> can be a significant performance hit, especially in a partial clone,
> wherein a read of an object may end up being a fetch.

So by default, it "attempts to exclude commits that are duplicates of
upstream ones." So that's the --no-keep-cherry-pick option, which is
the default.

> Add a flag to "git rebase" to allow suppression of this feature. This
> flag only works when using the "merge" backend.

Perhaps this is how I got confused. "suppression of this feature" probably
got associated with the "--no-" version of the flag in my head. But that's
not your fault. I'm probably biased from my experience with the
--no-show-forced-updates flag in "git fetch". There, the "--no-" option
disables the default behavior.

Maybe I wouldn't be as confused if the flag was reversed and called
"--no-skip-cherry-picks" or something. That direction would make it
more clear that this is a performance optimization with a possible
behavior side-effect. I doubt users will be lining up to "keep cherry-picks."
There is a reason we remove them by default, but it is also atypical
for the check to actually change the outcome.

But if we have a config option to change the default (as a follow-up)
then all of my complaints are reduced, because users will not need to
think about this very often.

> This flag changes the behavior of sequencer_make_script(), called from
> do_interactive_rebase() <- run_rebase_interactive() <-
> run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
> (indirectly called from sequencer_make_script() through
> prepare_revision_walk()) will no longer call cherry_pick_list(), and
> thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
> means that the intermediate commits in upstream are no longer read (as
> shown by the test) and means that no PATCHSAME-caused skipping of
> commits is done by sequencer_make_script(), either directly or through
> make_script_with_merges().

Perhaps the only change I would recommend for the commit message is to
be very clear about what "this flag" means. You are talking about the
"--keep-cherry-pick(s)" flag in this paragraph.

Thanks,
-Stolee


* Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-03-30  4:06 ` [PATCH v3] rebase --merge: optionally skip upstreamed commits Jonathan Tan
                     ` (2 preceding siblings ...)
  2020-03-30 12:13   ` Derrick Stolee
@ 2020-03-31 16:27   ` Elijah Newren
  2020-03-31 18:34     ` Junio C Hamano
  2020-04-10 22:27     ` Jonathan Tan
  3 siblings, 2 replies; 78+ messages in thread
From: Elijah Newren @ 2020-03-31 16:27 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Git Mailing List, congdanhqx, Junio C Hamano, Derrick Stolee

Hi Jonathan,

On Sun, Mar 29, 2020 at 9:06 PM Jonathan Tan <jonathantanmy@google.com> wrote:
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index 0c4f038dd6..f4f8afeb9a 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -318,6 +318,21 @@ See also INCOMPATIBLE OPTIONS below.
>  +
>  See also INCOMPATIBLE OPTIONS below.
>
> +--keep-cherry-pick::
> +--no-keep-cherry-pick::

I like the plural form Derrick elsewhere in this thread suggested
(--keep-cherry-picks), but it's not a strong preference.  However,
with fresh eyes I'm slightly worried about "keep".  More on that
below...

> +       Control rebase's behavior towards commits in the working
> +       branch that are already present upstream, i.e. cherry-picks.
> ++
> +By default, these commits will be dropped. Because this necessitates
> +reading all upstream commits, this can be expensive in repos with a
> +large number of upstream commits that need to be read.
> ++
> +If `--keep-cherry-pick is given`, all commits (including these) will be
> +re-applied. This allows rebase to forgo reading all upstream commits,
> +potentially improving performance.

I'm slightly worried that "keep" is setting up an incorrect
expectation for users; in most cases, a reapplied cherry-pick will
result in the merge machinery applying no new changes (they were
already applied) and then rebase's default of dropping commits which
become empty will kick in and drop the commit.

Maybe the name is fine and we just need to be more clear in the text
on the expected behavior and advantages and disadvantages of this
option:

If `--keep-cherry-picks` is given, all commits (including these) will be
re-applied.  Note that cherry picks are likely to result in no changes
when being reapplied and thus are likely to be dropped anyway (assuming
the default --empty=drop behavior).  The advantage of this option is that it
allows rebase to forgo reading all upstream commits, potentially
improving performance.  The disadvantage of this option is that in some
cases, the code has drifted such that reapplying a cherry-pick is not
detectable as a no-op, and instead results in conflicts for the user to
manually resolve (usually via `git rebase --skip`).

It may also be helpful to prevent users from making a false inference
by renaming these options to --[no-]reapply-cherry-pick[s].  Sorry to
bring this up so late after earlier saying --[no-]keep-cherry-pick[s]
was fine; didn't occur to me then.  If you want to keep the name, the
extended paragraph should be good enough.

> ++
> +See also INCOMPATIBLE OPTIONS below.
> +
>  --rerere-autoupdate::
>  --no-rerere-autoupdate::
>         Allow the rerere mechanism to update the index with the
> @@ -568,6 +583,9 @@ In addition, the following pairs of options are incompatible:
>   * --keep-base and --onto
>   * --keep-base and --root
>
> +Also, the --keep-cherry-pick option requires the use of the merge backend
> +(e.g., through --merge).

Why not just list --keep-cherry-pick[s] in the list of options that
require use of the merge backend (i.e. the list containing '--merge')
instead of adding another sentence here?

> +
>  BEHAVIORAL DIFFERENCES
>  -----------------------
>
> @@ -866,7 +884,8 @@ Only works if the changes (patch IDs based on the diff contents) on
>  'subsystem' did.
>
>  In that case, the fix is easy because 'git rebase' knows to skip
> -changes that are already present in the new upstream.  So if you say
> +changes that are already present in the new upstream (unless
> +`--keep-cherry-pick` is given). So if you say
>  (assuming you're on 'topic')
>  ------------
>      $ git rebase subsystem
> diff --git a/builtin/rebase.c b/builtin/rebase.c
...
> @@ -1848,6 +1852,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>                               "interactive or merge options"));
>         }
>
> +       if (options.keep_cherry_pick && !is_interactive(&options))

You're building on an old version of git.  Do you want to rebase and
make this use the new is_merge() instead so Junio has fewer conflicts
to handle?

> +               die(_("--keep-cherry-pick does not work with the 'apply' backend"));
> +
>         if (options.signoff) {
>                 if (options.type == REBASE_PRESERVE_MERGES)
>                         die("cannot combine '--signoff' with "
...

Thanks for working on this!


* Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-03-31 16:27   ` Elijah Newren
@ 2020-03-31 18:34     ` Junio C Hamano
  2020-03-31 18:43       ` Junio C Hamano
  2020-04-10 22:27     ` Jonathan Tan
  1 sibling, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2020-03-31 18:34 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Jonathan Tan, Git Mailing List, congdanhqx, Derrick Stolee

Elijah Newren <newren@gmail.com> writes:

>> +--keep-cherry-pick::
>> +--no-keep-cherry-pick::
> ...
> I'm slightly worried that "keep" is setting up an incorrect
> expectation for users; in most cases, a reapplied cherry-pick will
> result in the merge machinery applying no new changes (they were
> already applied) and then rebase's default of dropping commits which
> become empty will kick in and drop the commit.

Yes.

> Maybe the name is fine and we just need to be more clear in the text
> on the expected behavior and advantages and disadvantages of this
> option:
>
> If `--keep-cherry-picks` is given, all commits (including these) will be
> re-applied.  Note that cherry picks are likely to result in no changes
> when being reapplied and thus are likely to be dropped anyway (assuming
> the default --empty=drop behavior).  The advantage of this option is that it
> allows rebase to forgo reading all upstream commits, potentially
> improving performance.  The disadvantage of this option is that in some
> cases, the code has drifted such that reapplying a cherry-pick is not
> detectable as a no-op, and instead results in conflicts for the user to
> manually resolve (usually via `git rebase --skip`).

True.  So instead of letting the machine match up commits on both
sides, the end-user who is rebasing will find matches (or near
matches) and manually handle them.  It would be a good idea to
describe the pros and cons for the option (which I think has already
been written fairly clearly in the proposed patch).

> It may also be helpful to prevent users from making a false inference
> by renaming these options to --[no-]reapply-cherry-pick[s].

Hmm, yeah, that may not be a bad name.


* Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-03-31 18:34     ` Junio C Hamano
@ 2020-03-31 18:43       ` Junio C Hamano
  0 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2020-03-31 18:43 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Jonathan Tan, Git Mailing List, congdanhqx, Derrick Stolee

Junio C Hamano <gitster@pobox.com> writes:

> True.  So instead of letting the machine match up commits on both
> sides, the end-user who is rebasing will find matches (or near
> matches) and manually handle them.  It would be a good idea to
> describe the pros and cons for the option (which I think has already
> been written fairly clearly in the proposed patch).

Sorry, strike the part in (parentheses) out.  I was looking at the
description in an earlier version with --skip-cherry-pick-detection,
which had a good explanation on "if such detection is not done,
then..." but the latest one seems to have lost the description
altogether.  Minimally something like this from the earlier round
should probably be resurrected.

    ... these duplicates will be re-applied, which will likely
    result in no new changes (as they are already in upstream) and
    drop them from the resulting history.

Thanks.


* Re: [PATCH v2 2/4] ci: refactor docker runner script
  2020-03-29 10:12   ` [PATCH v2 2/4] ci: refactor docker runner script Đoàn Trần Công Danh
@ 2020-04-01 21:51     ` SZEDER Gábor
  0 siblings, 0 replies; 78+ messages in thread
From: SZEDER Gábor @ 2020-04-01 21:51 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

On Sun, Mar 29, 2020 at 05:12:30PM +0700, Đoàn Trần Công Danh wrote:
> We will support alpine check in docker later in this series.
> 
> While we're at it, tell people to run as root in podman.

Ok, I'll try to be more specific :)

Please clarify *in the commit message* why we should tell this to
people, i.e. what podman is and why we should care.



* Re: [PATCH v2 3/4] travis: build and test on Linux with musl libc and busybox
  2020-03-29 10:12   ` [PATCH v2 3/4] travis: build and test on Linux with musl libc and busybox Đoàn Trần Công Danh
@ 2020-04-01 22:18     ` SZEDER Gábor
  2020-04-02  1:42       ` Danh Doan
  2020-04-07 14:53       ` Johannes Schindelin
  0 siblings, 2 replies; 78+ messages in thread
From: SZEDER Gábor @ 2020-04-01 22:18 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

On Sun, Mar 29, 2020 at 05:12:31PM +0700, Đoàn Trần Công Danh wrote:
> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>

> diff --git a/ci/run-alpine-build.sh b/ci/run-alpine-build.sh
> new file mode 100755
> index 0000000000..c83df536e4
> --- /dev/null
> +++ b/ci/run-alpine-build.sh
> @@ -0,0 +1,31 @@
> +#!/bin/sh
> +#
> +# Build and test Git in Alpine Linux
> +#
> +# Usage:
> +#   run-alpine-build.sh <host-user-id>
> +#
> +
> +set -ex
> +
> +useradd () {
> +	adduser -D "$@"
> +}
> +
> +. "${0%/*}/lib-docker.sh"
> +
> +# Update packages to the latest available versions
> +apk add --update autoconf build-base curl-dev openssl-dev expat-dev \
> +	gettext pcre2-dev python3 musl-libintl >/dev/null

In 'ci/run-docker.sh' we run 'docker run' with a bunch of '--env ...'
options to make some important environment variables available
inside the container.  At this point in this script all those
variables are set to the expected values, but ...

> +# Build and test
> +su -m -l $CI_USER -c '

... but here, for some reason, those environment variables are not set
anymore.  This is bad, because this CI job then builds Git
sequentially, runs the tests sequentially, runs the tests with 'make'
instead of 'prove', and runs the tests without '-V -x'.  IOW, it's
slow, it produces a lot of useless output, it doesn't report all the
failures, and doesn't tell us anything about the failures.

> +	set -ex
> +	cd /usr/src/git
> +	test -n "$cache_dir" && ln -s "$cache_dir/.prove" t/.prove
> +	autoconf
> +	echo "PYTHON_PATH=/usr/bin/python3" >config.mak
> +	./configure --with-libpcre

The recommended way to build Git is without autoconf and configure.
You can set the PYTHON_PATH and USE_LIBPCRE Makefile knobs in
MAKEFLAGS in 'ci/lib.sh', but to be able to access MAKEFLAGS in the
container you'll need to build this patch series on top of:

  https://public-inbox.org/git/20200401212151.15164-1-szeder.dev@gmail.com/
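
A minimal sketch of what that could look like, assuming a per-job case
statement like the one already in 'ci/lib.sh'. The job name, the
starting MAKEFLAGS value, and the knob value "Yes" are assumptions made
for illustration (any non-empty value is assumed to enable the knob);
this is not the code that was eventually applied.

```shell
# Hypothetical job name; in the real CI it comes from the environment.
jobname=linux-musl
# Stand-in for whatever MAKEFLAGS ci/lib.sh has accumulated so far.
MAKEFLAGS="--jobs=2"

case "$jobname" in
linux-musl)
	# Makefile knobs taking the place of config.mak and
	# "./configure --with-libpcre".
	MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3 USE_LIBPCRE=Yes"
	;;
esac
export MAKEFLAGS

echo "$MAKEFLAGS"
```

With the knobs carried in MAKEFLAGS, a plain 'make' inside the container
would pick them up without any autoconf step, provided MAKEFLAGS actually
reaches the inner shell.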

> +	make
> +	make test
> +'
> diff --git a/ci/run-docker.sh b/ci/run-docker.sh
> index be698817cb..f203db03cf 100755
> --- a/ci/run-docker.sh
> +++ b/ci/run-docker.sh
> @@ -10,6 +10,10 @@ Linux32)
>  	CI_TARGET=linux32
>  	CI_CONTAINER="daald/ubuntu32:xenial"
>  	;;
> +linux-musl)
> +	CI_TARGET=alpine
> +	CI_CONTAINER=alpine
> +	;;
>  *)
>  	exit 1 ;;
>  esac
> -- 
> 2.26.0.302.g234993491e
> 


* Re: [PATCH v2 3/4] travis: build and test on Linux with musl libc and busybox
  2020-04-01 22:18     ` SZEDER Gábor
@ 2020-04-02  1:42       ` Danh Doan
  2020-04-07 14:53       ` Johannes Schindelin
  1 sibling, 0 replies; 78+ messages in thread
From: Danh Doan @ 2020-04-02  1:42 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git

On 2020-04-02 00:18:35+0200, SZEDER Gábor <szeder.dev@gmail.com> wrote:
> On Sun, Mar 29, 2020 at 05:12:31PM +0700, Đoàn Trần Công Danh wrote:
> > Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> 
> > diff --git a/ci/run-alpine-build.sh b/ci/run-alpine-build.sh
> > new file mode 100755
> > index 0000000000..c83df536e4
> > --- /dev/null
> > +++ b/ci/run-alpine-build.sh
> > @@ -0,0 +1,31 @@
> > +#!/bin/sh
> > +#
> > +# Build and test Git in Alpine Linux
> > +#
> > +# Usage:
> > +#   run-alpine-build.sh <host-user-id>
> > +#
> > +
> > +set -ex
> > +
> > +useradd () {
> > +	adduser -D "$@"
> > +}
> > +
> > +. "${0%/*}/lib-docker.sh"
> > +
> > +# Update packages to the latest available versions
> > +apk add --update autoconf build-base curl-dev openssl-dev expat-dev \
> > +	gettext pcre2-dev python3 musl-libintl >/dev/null
> 
> In 'ci/run-docker.sh' we run 'docker run' with a bunch of '--env ...'
> options to make some important environment variables available
> inside the container.  At this point in this script all those
> variables are set to the expected values, but ...
> 
> > +# Build and test
> > +su -m -l $CI_USER -c '
> 
> ... but here, for some reason, those environment variables are not set
> anymore.  This is bad, because this CI job then builds Git
> sequentially, runs the tests sequentially, runs the tests with 'make'
> instead of 'prove', and runs the tests without '-V -x'.  IOW, it's
> slow, it produces a lot of useless output, it doesn't report all the
> failures, and doesn't tell us anything about the failures.

At this point, I am tempted to change this to

	su -m -l $CI_USER -c /usr/src/git/ci/run-build-and-tests.sh

instead.

But after digging into ci/lib.sh, I found too much setup for "$CI_*",
so let's take your approach instead.

> > +	set -ex
> > +	cd /usr/src/git
> > +	test -n "$cache_dir" && ln -s "$cache_dir/.prove" t/.prove
> > +	autoconf
> > +	echo "PYTHON_PATH=/usr/bin/python3" >config.mak
> > +	./configure --with-libpcre
> 
> The recommended way to build Git is without autoconf and configure.
> You can set the PYTHON_PATH and USE_LIBPCRE Makefile knobs in
> MAKEFLAGS in 'ci/lib.sh', but to be able to access MAKEFLAGS in the
> container you'll need to build this patch series on top of:
> 
>   https://public-inbox.org/git/20200401212151.15164-1-szeder.dev@gmail.com/
> 
> > +	make
> > +	make test
> > +'
> > diff --git a/ci/run-docker.sh b/ci/run-docker.sh
> > index be698817cb..f203db03cf 100755
> > --- a/ci/run-docker.sh
> > +++ b/ci/run-docker.sh
> > @@ -10,6 +10,10 @@ Linux32)
> >  	CI_TARGET=linux32
> >  	CI_CONTAINER="daald/ubuntu32:xenial"
> >  	;;
> > +linux-musl)
> > +	CI_TARGET=alpine
> > +	CI_CONTAINER=alpine
> > +	;;
> >  *)
> >  	exit 1 ;;
> >  esac
> > -- 
> > 2.26.0.302.g234993491e
> > 

-- 
Danh


* [PATCH v3 0/6] Travis + Azure jobs for linux with musl libc
  2020-03-29 10:12 ` [PATCH v2 0/4] Travis + Azure jobs " Đoàn Trần Công Danh
                     ` (3 preceding siblings ...)
  2020-03-29 16:23   ` [PATCH v2 0/4] Travis + Azure jobs for linux with musl libc Junio C Hamano
@ 2020-04-02 13:03   ` Đoàn Trần Công Danh
  2020-04-02 13:04     ` [PATCH v3 1/6] ci: make MAKEFLAGS available inside the Docker container in the Linux32 job Đoàn Trần Công Danh
                       ` (6 more replies)
  2020-04-04  1:08   ` [PATCH v4 0/6] Travis " Đoàn Trần Công Danh
  5 siblings, 7 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-02 13:03 UTC (permalink / raw)
  To: git
  Cc: Đoàn Trần Công Danh, Junio C Hamano,
	SZEDER Gábor, Eric Sunshine, Johannes Schindelin

Recently, we've un-broken git for Linux with musl libc,
and we have a series to fix false negatives with busybox shell utils.

Add a CI job on Travis and Azure to make sure we won't break it again.

This is nearly a rewrite of this series, because GitHub Actions
allows running directly inside a container.

=> I rewrote this series to prepare as much as possible for the GitHub
Actions series.
=> No range-diff

The first patch comes from Szeder; Junio hasn't picked it up yet,
and this series depends on it.

Sample build without busybox fix series:
https://travis-ci.org/github/sgn/git/builds/670097222

With busybox fix:
https://travis-ci.org/github/sgn/git/builds/670103249


SZEDER Gábor (1):
  ci: make MAKEFLAGS available inside the Docker container in the
    Linux32 job

Đoàn Trần Công Danh (5):
  ci/lib-docker: preserve required environment variables
  ci/linux32: parameterise command to switch arch
  ci: refactor docker runner script
  ci/linux32: libify install-dependencies step
  travis: build and test on Linux with musl libc and busybox

 .travis.yml                                   | 10 ++++-
 azure-pipelines.yml                           | 39 ++++++++++++++++++-
 ci/install-docker-dependencies.sh             | 18 +++++++++
 ci/lib.sh                                     |  8 ++++
 ...n-linux32-build.sh => run-docker-build.sh} | 39 +++++++++++++------
 ci/{run-linux32-docker.sh => run-docker.sh}   | 28 ++++++++++---
 6 files changed, 121 insertions(+), 21 deletions(-)
 create mode 100755 ci/install-docker-dependencies.sh
 rename ci/{run-linux32-build.sh => run-docker-build.sh} (63%)
 rename ci/{run-linux32-docker.sh => run-docker.sh} (43%)

-- 
2.26.0.334.g6536db25bb



* [PATCH v3 1/6] ci: make MAKEFLAGS available inside the Docker container in the Linux32 job
  2020-04-02 13:03   ` [PATCH v3 0/6] " Đoàn Trần Công Danh
@ 2020-04-02 13:04     ` Đoàn Trần Công Danh
  2020-04-02 13:04     ` [PATCH v3 2/6] ci/lib-docker: preserve required environment variables Đoàn Trần Công Danh
                       ` (5 subsequent siblings)
  6 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-02 13:04 UTC (permalink / raw)
  To: git; +Cc: SZEDER Gábor, Đoàn Trần Công Danh

From: SZEDER Gábor <szeder.dev@gmail.com>

Once upon a time we ran 'make --jobs=2 ...' to build Git, its
documentation, or to apply Coccinelle semantic patches.  Then commit
eaa62291ff (ci: inherit --jobs via MAKEFLAGS in run-build-and-tests,
2019-01-27) came along, and started using the MAKEFLAGS environment
variable to centralize setting the number of parallel jobs in
'ci/lib.sh'.  Alas, it forgot to update 'ci/run-linux32-docker.sh' to
make MAKEFLAGS available inside the Docker container running the 32
bit Linux job, and, consequently, since then that job builds Git
sequentially, and it ignores any Makefile knobs that we might set in
MAKEFLAGS (though we don't set any for the 32 bit Linux job at the
moment).

So update the 'docker run' invocation in 'ci/run-linux32-docker.sh' to
make MAKEFLAGS available inside the Docker container as well.  Set
CC=gcc for the 32 bit Linux job, because that's the compiler installed
in the 32 bit Linux Docker image that we use (Travis CI nowadays sets
CC=clang by default, but clang is not installed in this image).

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 ci/lib.sh                | 3 +++
 ci/run-linux32-docker.sh | 1 +
 2 files changed, 4 insertions(+)

diff --git a/ci/lib.sh b/ci/lib.sh
index c3a8cd2104..d637825222 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -198,6 +198,9 @@ osx-clang|osx-gcc)
 GIT_TEST_GETTEXT_POISON)
 	export GIT_TEST_GETTEXT_POISON=true
 	;;
+Linux32)
+	CC=gcc
+	;;
 esac
 
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
diff --git a/ci/run-linux32-docker.sh b/ci/run-linux32-docker.sh
index 751acfcf8a..ebb18fa747 100755
--- a/ci/run-linux32-docker.sh
+++ b/ci/run-linux32-docker.sh
@@ -20,6 +20,7 @@ docker run \
 	--env GIT_PROVE_OPTS \
 	--env GIT_TEST_OPTS \
 	--env GIT_TEST_CLONE_2GB \
+	--env MAKEFLAGS \
 	--env cache_dir="$container_cache_dir" \
 	--volume "${PWD}:/usr/src/git" \
 	--volume "$cache_dir:$container_cache_dir" \
-- 
2.26.0.334.g6536db25bb
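
To illustrate how the new Linux32 arm interacts with the existing
"CC=${CC:-cc}" fallback, here is a small sketch. The function
makeflags_for_job and the "--jobs=2" prefix are hypothetical; only the
Linux32 arm and the fallback expansion are taken from the hunk above.

```shell
# Hypothetical helper wrapping the case statement and the MAKEFLAGS
# line from the ci/lib.sh hunk; only the Linux32 arm is reproduced.
makeflags_for_job () {
	jobname=$1
	case "$jobname" in
	Linux32)
		CC=gcc
		;;
	esac
	# ${CC:-cc} expands to "cc" when CC is unset or empty.
	echo "--jobs=2 CC=${CC:-cc}"
}

# Each call runs in a subshell so the CC settings do not leak.
unset_other=$(unset CC; makeflags_for_job linux-gcc)
clang_other=$(CC=clang; makeflags_for_job linux-clang)
clang_linux32=$(CC=clang; makeflags_for_job Linux32)

echo "$unset_other"
echo "$clang_other"
echo "$clang_linux32"
```

The last call is the case the commit message describes: even when the CI
provider presets CC=clang, the Linux32 job ends up with CC=gcc.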



* [PATCH v3 2/6] ci/lib-docker: preserve required environment variables
  2020-04-02 13:03   ` [PATCH v3 0/6] " Đoàn Trần Công Danh
  2020-04-02 13:04     ` [PATCH v3 1/6] ci: make MAKEFLAGS available inside the Docker container in the Linux32 job Đoàn Trần Công Danh
@ 2020-04-02 13:04     ` Đoàn Trần Công Danh
  2020-04-03  8:22       ` SZEDER Gábor
  2020-04-02 13:04     ` [PATCH v3 3/6] ci/linux32: parameterise command to switch arch Đoàn Trần Công Danh
                       ` (4 subsequent siblings)
  6 siblings, 1 reply; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-02 13:04 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

We're using "su -m" to preserve environment variables in the shell run
by "su". But that option is ignored when "-l" (aka "--login") is
specified.

Since we are not interested in all environment variables,
pass only the necessary variables to the inner script.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 ci/run-linux32-build.sh | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index e3a193adbc..7f985615c2 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -51,10 +51,17 @@ else
 fi
 
 # Build and test
-linux32 --32bit i386 su -m -l $CI_USER -c '
+linux32 --32bit i386 su -m -l $CI_USER -c "
 	set -ex
+	export DEVELOPER='$DEVELOPER'
+	export DEFAULT_TEST_TARGET='$DEFAULT_TEST_TARGET'
+	export GIT_PROVE_OPTS='$GIT_PROVE_OPTS'
+	export GIT_TEST_OPTS='$GIT_TEST_OPTS'
+	export GIT_TEST_CLONE_2GB='$GIT_TEST_CLONE_2GB'
+	export MAKEFLAGS='$MAKEFLAGS'
+	export cache_dir='$cache_dir'
 	cd /usr/src/git
-	test -n "$cache_dir" && ln -s "$cache_dir/.prove" t/.prove
+	test -n '$cache_dir' && ln -s '$cache_dir/.prove' t/.prove
 	make
 	make test
-'
+"
-- 
2.26.0.334.g6536db25bb



* [PATCH v3 3/6] ci/linux32: parameterise command to switch arch
  2020-04-02 13:03   ` [PATCH v3 0/6] " Đoàn Trần Công Danh
  2020-04-02 13:04     ` [PATCH v3 1/6] ci: make MAKEFLAGS available inside the Docker container in the Linux32 job Đoàn Trần Công Danh
  2020-04-02 13:04     ` [PATCH v3 2/6] ci/lib-docker: preserve required environment variables Đoàn Trần Công Danh
@ 2020-04-02 13:04     ` Đoàn Trần Công Danh
  2020-04-02 13:04     ` [PATCH v3 4/6] ci: refactor docker runner script Đoàn Trần Công Danh
                       ` (3 subsequent siblings)
  6 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-02 13:04 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

In a later patch, the rest of this command will be reused for the
CI job for Linux with musl libc.

Allow customisation of the emulator now.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
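Not part of the patch: a small sketch of how an unquoted "$switch_cmd"
behaves when it holds a wrapper command and when it is empty. The helper
name run_in_arch and the "env LC_ALL=C" wrapper are assumptions chosen so
the example runs without "linux32" installed.

```shell
#!/bin/sh
run_in_arch () {
	# $switch_cmd is deliberately unquoted: it is split into words when
	# set and disappears entirely when it is empty.
	command $switch_cmd sh -c 'echo "ran: $1"' sh "$1"
}

# Stand-in for "linux32 --32bit i386" (hypothetical wrapper).
switch_cmd="env LC_ALL=C"
with_wrapper=$(run_in_arch wrapped)

# An empty value runs the inner command directly.
switch_cmd=
without_wrapper=$(run_in_arch bare)

echo "$with_wrapper"
echo "$without_wrapper"
```

This depends on POSIX field splitting of unquoted expansions, so it holds
for "sh" but not for shells that do not split by default.
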
 ci/run-linux32-build.sh  | 13 +++++++++++--
 ci/run-linux32-docker.sh |  2 ++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index 7f985615c2..44bb332f64 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -14,8 +14,17 @@ then
 	exit 1
 fi
 
+case "$jobname" in
+Linux32)
+	switch_cmd="linux32 --32bit i386"
+	;;
+*)
+	exit 1
+	;;
+esac
+
 # Update packages to the latest available versions
-linux32 --32bit i386 sh -c '
+command $switch_cmd sh -c '
     apt update >/dev/null &&
     apt install -y build-essential libcurl4-openssl-dev libssl-dev \
 	libexpat-dev gettext python >/dev/null
@@ -51,7 +60,7 @@ else
 fi
 
 # Build and test
-linux32 --32bit i386 su -m -l $CI_USER -c "
+command $switch_cmd su -m -l $CI_USER -c "
 	set -ex
 	export DEVELOPER='$DEVELOPER'
 	export DEFAULT_TEST_TARGET='$DEFAULT_TEST_TARGET'
diff --git a/ci/run-linux32-docker.sh b/ci/run-linux32-docker.sh
index ebb18fa747..54186b6aa7 100755
--- a/ci/run-linux32-docker.sh
+++ b/ci/run-linux32-docker.sh
@@ -9,6 +9,7 @@ docker pull daald/ubuntu32:xenial
 
 # Use the following command to debug the docker build locally:
 # $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/bash daald/ubuntu32:xenial
+# root@container:/# export jobname=<jobname>
 # root@container:/# /usr/src/git/ci/run-linux32-build.sh <host-user-id>
 
 container_cache_dir=/tmp/travis-cache
@@ -21,6 +22,7 @@ docker run \
 	--env GIT_TEST_OPTS \
 	--env GIT_TEST_CLONE_2GB \
 	--env MAKEFLAGS \
+	--env jobname \
 	--env cache_dir="$container_cache_dir" \
 	--volume "${PWD}:/usr/src/git" \
 	--volume "$cache_dir:$container_cache_dir" \
-- 
2.26.0.334.g6536db25bb



* [PATCH v3 4/6] ci: refactor docker runner script
  2020-04-02 13:03   ` [PATCH v3 0/6] " Đoàn Trần Công Danh
                       ` (2 preceding siblings ...)
  2020-04-02 13:04     ` [PATCH v3 3/6] ci/linux32: parameterise command to switch arch Đoàn Trần Công Danh
@ 2020-04-02 13:04     ` Đoàn Trần Công Danh
  2020-04-02 13:04     ` [PATCH v3 5/6] ci/linux32: libify install-dependencies step Đoàn Trần Công Danh
                       ` (2 subsequent siblings)
  6 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-02 13:04 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

We will add support for an Alpine Linux check in Docker later in this
series.

While we're at it, tell people to run as root in podman
if podman is used as a drop-in replacement for docker,
because podman maps the host user to the container's root
and therefore maps their permissions.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
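Not part of the patch: a sketch of the jobname-to-image selection as a
function. The function name ci_container_for and the error message are
assumptions; the patch itself exits with status 1 without printing
anything for an unknown jobname.

```shell
#!/bin/sh
# Map a job name to a container image (hypothetical helper).
ci_container_for () {
	case "$1" in
	Linux32)
		echo "daald/ubuntu32:xenial"
		;;
	*)
		# Reporting the reason is an addition made for this sketch.
		echo >&2 "unsupported jobname: '$1'"
		return 1
		;;
	esac
}

image=$(ci_container_for Linux32)

if ci_container_for no-such-job 2>/dev/null
then
	unknown_rejected=no
else
	unknown_rejected=yes
fi

echo "$image"
echo "$unknown_rejected"
```
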
 .travis.yml                                   |  2 +-
 azure-pipelines.yml                           |  4 ++--
 ...n-linux32-build.sh => run-docker-build.sh} |  6 ++---
 ci/{run-linux32-docker.sh => run-docker.sh}   | 22 ++++++++++++++-----
 4 files changed, 22 insertions(+), 12 deletions(-)
 rename ci/{run-linux32-build.sh => run-docker-build.sh} (93%)
 rename ci/{run-linux32-docker.sh => run-docker.sh} (53%)

diff --git a/.travis.yml b/.travis.yml
index fc5730b085..069aeeff3c 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -32,7 +32,7 @@ matrix:
       services:
         - docker
       before_install:
-      script: ci/run-linux32-docker.sh
+      script: ci/run-docker.sh
     - env: jobname=StaticAnalysis
       os: linux
       compiler:
diff --git a/azure-pipelines.yml b/azure-pipelines.yml
index 675c3a43c9..11413f66f8 100644
--- a/azure-pipelines.yml
+++ b/azure-pipelines.yml
@@ -489,14 +489,14 @@ jobs:
        test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
 
        res=0
-       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" bash -lxc ci/run-linux32-docker.sh || res=1
+       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" jobname=Linux32 bash -lxc ci/run-docker.sh || res=1
 
        sudo chmod a+r t/out/TEST-*.xml
        test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts
 
        test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || res=1
        exit $res
-    displayName: 'ci/run-linux32-docker.sh'
+    displayName: 'jobname=Linux32 ci/run-docker.sh'
     env:
       GITFILESHAREPWD: $(gitfileshare.pwd)
   - task: PublishTestResults@2
diff --git a/ci/run-linux32-build.sh b/ci/run-docker-build.sh
similarity index 93%
rename from ci/run-linux32-build.sh
rename to ci/run-docker-build.sh
index 44bb332f64..a05b48c559 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-docker-build.sh
@@ -1,16 +1,16 @@
 #!/bin/sh
 #
-# Build and test Git in a 32-bit environment
+# Build and test Git inside container
 #
 # Usage:
-#   run-linux32-build.sh <host-user-id>
+#   run-docker-build.sh <host-user-id>
 #
 
 set -ex
 
 if test $# -ne 1 || test -z "$1"
 then
-	echo >&2 "usage: run-linux32-build.sh <host-user-id>"
+	echo >&2 "usage: run-docker-build.sh <host-user-id>"
 	exit 1
 fi
 
diff --git a/ci/run-linux32-docker.sh b/ci/run-docker.sh
similarity index 53%
rename from ci/run-linux32-docker.sh
rename to ci/run-docker.sh
index 54186b6aa7..3881f99b53 100755
--- a/ci/run-linux32-docker.sh
+++ b/ci/run-docker.sh
@@ -1,16 +1,26 @@
 #!/bin/sh
 #
-# Download and run Docker image to build and test 32-bit Git
+# Download and run Docker image to build and test Git
 #
 
 . ${0%/*}/lib.sh
 
-docker pull daald/ubuntu32:xenial
+case "$jobname" in
+Linux32)
+	CI_CONTAINER="daald/ubuntu32:xenial"
+	;;
+*)
+	exit 1
+	;;
+esac
+
+docker pull "$CI_CONTAINER"
 
 # Use the following command to debug the docker build locally:
-# $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/bash daald/ubuntu32:xenial
+# <host-user-id> must be 0 if podman is used as drop-in replacement for docker
+# $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/sh "$CI_CONTAINER"
 # root@container:/# export jobname=<jobname>
-# root@container:/# /usr/src/git/ci/run-linux32-build.sh <host-user-id>
+# root@container:/# /usr/src/git/ci/run-docker-build.sh <host-user-id>
 
 container_cache_dir=/tmp/travis-cache
 
@@ -26,8 +36,8 @@ docker run \
 	--env cache_dir="$container_cache_dir" \
 	--volume "${PWD}:/usr/src/git" \
 	--volume "$cache_dir:$container_cache_dir" \
-	daald/ubuntu32:xenial \
-	/usr/src/git/ci/run-linux32-build.sh $(id -u $USER)
+	"$CI_CONTAINER" \
+	/usr/src/git/ci/run-docker-build.sh $(id -u $USER)
 
 check_unignored_build_artifacts
 
-- 
2.26.0.334.g6536db25bb



* [PATCH v3 5/6] ci/linux32: libify install-dependencies step
  2020-04-02 13:03   ` [PATCH v3 0/6] " Đoàn Trần Công Danh
                       ` (3 preceding siblings ...)
  2020-04-02 13:04     ` [PATCH v3 4/6] ci: refactor docker runner script Đoàn Trần Công Danh
@ 2020-04-02 13:04     ` Đoàn Trần Công Danh
  2020-04-02 13:04     ` [PATCH v3 6/6] travis: build and test on Linux with musl libc and busybox Đoàn Trần Công Danh
  2020-04-02 17:53     ` [PATCH v3 0/6] Travis + Azure jobs for linux with musl libc Junio C Hamano
  6 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-02 13:04 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

In a later patch, we will add a new Travis job for linux-musl.
Most of the other code in this file can be reused for that job.

Move the code that installs dependencies to a common script.
Should we add a new CI system that can run directly in a container,
we can reuse this script for the installation step.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
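Not part of the patch: a sketch of how "${0%/*}" locates the sibling
script. The paths are hypothetical and stand in for "$0".

```shell
#!/bin/sh
# Hypothetical invocation path, used only to show the expansion.
invoked_as=/usr/src/git/ci/run-docker-build.sh
script_dir=${invoked_as%/*}
helper="$script_dir/install-docker-dependencies.sh"

# Without a slash in the path there is nothing to strip, so the
# expansion returns the name unchanged.
bare_name=run-docker-build.sh
bare_dir=${bare_name%/*}

echo "$helper"
echo "$bare_dir"
```

The helper is therefore found only when the script is invoked through a
path that contains a slash, as ci/run-docker.sh does.
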
 ci/install-docker-dependencies.sh | 14 ++++++++++++++
 ci/run-docker-build.sh            |  7 +------
 2 files changed, 15 insertions(+), 6 deletions(-)
 create mode 100755 ci/install-docker-dependencies.sh

diff --git a/ci/install-docker-dependencies.sh b/ci/install-docker-dependencies.sh
new file mode 100755
index 0000000000..a104c61d29
--- /dev/null
+++ b/ci/install-docker-dependencies.sh
@@ -0,0 +1,14 @@
+#!/bin/sh
+#
+# Install dependencies required to build and test Git inside container
+#
+
+case "$jobname" in
+Linux32)
+	linux32 --32bit i386 sh -c '
+		apt update >/dev/null &&
+		apt install -y build-essential libcurl4-openssl-dev \
+			libssl-dev libexpat-dev gettext python >/dev/null
+	'
+	;;
+esac
diff --git a/ci/run-docker-build.sh b/ci/run-docker-build.sh
index a05b48c559..4a153492ba 100755
--- a/ci/run-docker-build.sh
+++ b/ci/run-docker-build.sh
@@ -23,12 +23,7 @@ Linux32)
 	;;
 esac
 
-# Update packages to the latest available versions
-command $switch_cmd sh -c '
-    apt update >/dev/null &&
-    apt install -y build-essential libcurl4-openssl-dev libssl-dev \
-	libexpat-dev gettext python >/dev/null
-'
+"${0%/*}/install-docker-dependencies.sh"
 
 # If this script runs inside a docker container, then all commands are
 # usually executed as root. Consequently, the host user might not be
-- 
2.26.0.334.g6536db25bb



* [PATCH v3 6/6] travis: build and test on Linux with musl libc and busybox
  2020-04-02 13:03   ` [PATCH v3 0/6] " Đoàn Trần Công Danh
                       ` (4 preceding siblings ...)
  2020-04-02 13:04     ` [PATCH v3 5/6] ci/linux32: libify install-dependencies step Đoàn Trần Công Danh
@ 2020-04-02 13:04     ` Đoàn Trần Công Danh
  2020-04-02 17:53     ` [PATCH v3 0/6] Travis + Azure jobs for linux with musl libc Junio C Hamano
  6 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-02 13:04 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 .travis.yml                       |  8 +++++++
 azure-pipelines.yml               | 35 +++++++++++++++++++++++++++++++
 ci/install-docker-dependencies.sh |  4 ++++
 ci/lib.sh                         |  5 +++++
 ci/run-docker-build.sh            |  4 ++++
 ci/run-docker.sh                  |  3 +++
 6 files changed, 59 insertions(+)

diff --git a/.travis.yml b/.travis.yml
index 069aeeff3c..0cfc3c3428 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -33,6 +33,14 @@ matrix:
         - docker
       before_install:
       script: ci/run-docker.sh
+    - env: jobname=linux-musl
+      os: linux
+      compiler:
+      addons:
+      services:
+        - docker
+      before_install:
+      script: ci/run-docker.sh
     - env: jobname=StaticAnalysis
       os: linux
       compiler:
diff --git a/azure-pipelines.yml b/azure-pipelines.yml
index 11413f66f8..84ecad76ec 100644
--- a/azure-pipelines.yml
+++ b/azure-pipelines.yml
@@ -514,6 +514,41 @@ jobs:
       PathtoPublish: t/failed-test-artifacts
       ArtifactName: failed-test-artifacts
 
+- job: linux_musl
+  displayName: linux-musl
+  condition: succeeded()
+  pool:
+    vmImage: ubuntu-latest
+  steps:
+  - bash: |
+       test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
+
+       res=0
+       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" jobname=linux-musl bash -lxc ci/run-docker.sh || res=1
+
+       sudo chmod a+r t/out/TEST-*.xml
+       test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts
+
+       test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || res=1
+       exit $res
+    displayName: 'jobname=linux-musl ci/run-docker.sh'
+    env:
+      GITFILESHAREPWD: $(gitfileshare.pwd)
+  - task: PublishTestResults@2
+    displayName: 'Publish Test Results **/TEST-*.xml'
+    inputs:
+      mergeTestResults: true
+      testRunTitle: 'musl'
+      platform: Linux
+      publishRunAttachments: false
+    condition: succeededOrFailed()
+  - task: PublishBuildArtifacts@1
+    displayName: 'Publish trash directories of failed tests'
+    condition: failed()
+    inputs:
+      PathtoPublish: t/failed-test-artifacts
+      ArtifactName: failed-test-artifacts
+
 - job: static_analysis
   displayName: StaticAnalysis
   condition: succeeded()
diff --git a/ci/install-docker-dependencies.sh b/ci/install-docker-dependencies.sh
index a104c61d29..26a6689766 100755
--- a/ci/install-docker-dependencies.sh
+++ b/ci/install-docker-dependencies.sh
@@ -11,4 +11,8 @@ Linux32)
 			libssl-dev libexpat-dev gettext python >/dev/null
 	'
 	;;
+linux-musl)
+	apk add --update build-base curl-dev openssl-dev expat-dev gettext \
+		pcre2-dev python3 musl-libintl perl-utils ncurses >/dev/null
+	;;
 esac
diff --git a/ci/lib.sh b/ci/lib.sh
index d637825222..87cd29bab6 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -201,6 +201,11 @@ GIT_TEST_GETTEXT_POISON)
 Linux32)
 	CC=gcc
 	;;
+linux-musl)
+	CC=gcc
+	MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3 USE_LIBPCRE2=Yes"
+	MAKEFLAGS="$MAKEFLAGS NO_REGEX=Yes ICONV_OMITS_BOM=Yes"
+	;;
 esac
 
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
diff --git a/ci/run-docker-build.sh b/ci/run-docker-build.sh
index 4a153492ba..8d47a5fda3 100755
--- a/ci/run-docker-build.sh
+++ b/ci/run-docker-build.sh
@@ -18,6 +18,10 @@ case "$jobname" in
 Linux32)
 	switch_cmd="linux32 --32bit i386"
 	;;
+linux-musl)
+	switch_cmd=
+	useradd () { adduser -D "$@"; }
+	;;
 *)
 	exit 1
 	;;
diff --git a/ci/run-docker.sh b/ci/run-docker.sh
index 3881f99b53..37fa372052 100755
--- a/ci/run-docker.sh
+++ b/ci/run-docker.sh
@@ -9,6 +9,9 @@ case "$jobname" in
 Linux32)
 	CI_CONTAINER="daald/ubuntu32:xenial"
 	;;
+linux-musl)
+	CI_CONTAINER=alpine
+	;;
 *)
 	exit 1
 	;;
-- 
2.26.0.334.g6536db25bb



* Re: [PATCH v3 0/6] Travis + Azure jobs for linux with musl libc
  2020-04-02 13:03   ` [PATCH v3 0/6] " Đoàn Trần Công Danh
                       ` (5 preceding siblings ...)
  2020-04-02 13:04     ` [PATCH v3 6/6] travis: build and test on Linux with musl libc and busybox Đoàn Trần Công Danh
@ 2020-04-02 17:53     ` Junio C Hamano
  2020-04-03  0:23       ` Danh Doan
  6 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2020-04-02 17:53 UTC (permalink / raw)
  To: Đoàn Trần Công Danh
  Cc: git, SZEDER Gábor, Eric Sunshine, Johannes Schindelin

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> This is a nearly rewrite of this series, because there're GitHub Action
> allow running directly inside container.
>
> => I rewrite this series to prepare as much as possible for the GitHub
> Action series.
> ...
>  .travis.yml                                   | 10 ++++-
>  azure-pipelines.yml                           | 39 ++++++++++++++++++-
>  ci/install-docker-dependencies.sh             | 18 +++++++++
>  ci/lib.sh                                     |  8 ++++
>  ...n-linux32-build.sh => run-docker-build.sh} | 39 +++++++++++++------
>  ci/{run-linux32-docker.sh => run-docker.sh}   | 28 ++++++++++---
>  6 files changed, 121 insertions(+), 21 deletions(-)
>  create mode 100755 ci/install-docker-dependencies.sh
>  rename ci/{run-linux32-build.sh => run-docker-build.sh} (63%)
>  rename ci/{run-linux32-docker.sh => run-docker.sh} (43%)

Thanks.  The above diffstat makes me wonder if it makes more sense
to do the topic from Dscho first to migrate existing CI targets to
GitHub Actions and then add musl job to the ci suite on top?  That
way, we won't have to worry about azure-pipelines.yml at all here.





* Re: [PATCH v3 0/6] Travis + Azure jobs for linux with musl libc
  2020-04-02 17:53     ` [PATCH v3 0/6] Travis + Azure jobs for linux with musl libc Junio C Hamano
@ 2020-04-03  0:23       ` Danh Doan
  0 siblings, 0 replies; 78+ messages in thread
From: Danh Doan @ 2020-04-03  0:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, SZEDER Gábor, Eric Sunshine, Johannes Schindelin

On 2020-04-02 10:53:35-0700, Junio C Hamano <gitster@pobox.com> wrote:
> Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:
> 
> > This is a nearly rewrite of this series, because there're GitHub Action
> > allow running directly inside container.
> >
> > => I rewrite this series to prepare as much as possible for the GitHub
> > Action series.
> > ...
> >  .travis.yml                                   | 10 ++++-
> >  azure-pipelines.yml                           | 39 ++++++++++++++++++-
> >  ci/install-docker-dependencies.sh             | 18 +++++++++
> >  ci/lib.sh                                     |  8 ++++
> >  ...n-linux32-build.sh => run-docker-build.sh} | 39 +++++++++++++------
> >  ci/{run-linux32-docker.sh => run-docker.sh}   | 28 ++++++++++---
> >  6 files changed, 121 insertions(+), 21 deletions(-)
> >  create mode 100755 ci/install-docker-dependencies.sh
> >  rename ci/{run-linux32-build.sh => run-docker-build.sh} (63%)
> >  rename ci/{run-linux32-docker.sh => run-docker.sh} (43%)
> 
> Thanks.  The above diffstat makes me wonder if it makes more sense
> to do the topic from Dscho first to migrate existing CI targets to
> GitHub Actions and then add musl job to the ci suite on top?  That
> way, we won't have to worry about azure-pipelines.yml at all here.

You can ignore the change to azure-pipelines.yml in 6/6
to reduce the noise about Azure (it'll be deleted by the next series anyway),
and declare that this series works
for Travis only (the same intention as v1). New diffstat:
---------------8<------------------
 .travis.yml                                      | 10 +++++-
 azure-pipelines.yml                              |  4 +--
 ci/install-docker-dependencies.sh                | 18 +++++++++++
 ci/lib.sh                                        |  8 +++++
 ci/{run-linux32-build.sh => run-docker-build.sh} | 39 ++++++++++++++++--------
 ci/{run-linux32-docker.sh => run-docker.sh}      | 28 +++++++++++++----
 6 files changed, 86 insertions(+), 21 deletions(-)

---------------->8----------------

In _my_ opinion, I'd still prefer to have this series first.

But, if we prefer to have GitHub Actions first:
- We'll need to move 5/6 to that series
- In the rebased version of this series, we'll change about 10 lines in the GitHub Actions yml.

If people think it's better that way, please tell me and I can re-order it.

-- 
Danh


* Re: [PATCH v3 2/6] ci/lib-docker: preserve required environment variables
  2020-04-02 13:04     ` [PATCH v3 2/6] ci/lib-docker: preserve required environment variables Đoàn Trần Công Danh
@ 2020-04-03  8:22       ` SZEDER Gábor
  2020-04-03 10:09         ` Danh Doan
  0 siblings, 1 reply; 78+ messages in thread
From: SZEDER Gábor @ 2020-04-03  8:22 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

On Thu, Apr 02, 2020 at 08:04:01PM +0700, Đoàn Trần Công Danh wrote:
> We're using "su -m" to preserve environment variables in the shell run
> by "su". But, that options will be ignored while "-l" (aka "--login") is
> specified.

This is not true.  See any previous runs of the 32 bit Linux job,
which worked as expected, because none of these environment variables
were cleared.

> Since we don't have interest in all environment variables,
> pass only those necessary variables to the inner script.
> 
> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> ---
>  ci/run-linux32-build.sh | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
> index e3a193adbc..7f985615c2 100755
> --- a/ci/run-linux32-build.sh
> +++ b/ci/run-linux32-build.sh
> @@ -51,10 +51,17 @@ else
>  fi
>  
>  # Build and test
> -linux32 --32bit i386 su -m -l $CI_USER -c '
> +linux32 --32bit i386 su -m -l $CI_USER -c "
>  	set -ex
> +	export DEVELOPER='$DEVELOPER'
> +	export DEFAULT_TEST_TARGET='$DEFAULT_TEST_TARGET'
> +	export GIT_PROVE_OPTS='$GIT_PROVE_OPTS'
> +	export GIT_TEST_OPTS='$GIT_TEST_OPTS'
> +	export GIT_TEST_CLONE_2GB='$GIT_TEST_CLONE_2GB'
> +	export MAKEFLAGS='$MAKEFLAGS'
> +	export cache_dir='$cache_dir'
>  	cd /usr/src/git
> -	test -n "$cache_dir" && ln -s "$cache_dir/.prove" t/.prove
> +	test -n '$cache_dir' && ln -s '$cache_dir/.prove' t/.prove
>  	make
>  	make test
> -'
> +"
> -- 
> 2.26.0.334.g6536db25bb
> 


* Re: [PATCH v3 2/6] ci/lib-docker: preserve required environment variables
  2020-04-03  8:22       ` SZEDER Gábor
@ 2020-04-03 10:09         ` Danh Doan
  2020-04-03 19:55           ` SZEDER Gábor
  0 siblings, 1 reply; 78+ messages in thread
From: Danh Doan @ 2020-04-03 10:09 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git

On 2020-04-03 10:22:54+0200, SZEDER Gábor <szeder.dev@gmail.com> wrote:
> On Thu, Apr 02, 2020 at 08:04:01PM +0700, Đoàn Trần Công Danh wrote:
> > We're using "su -m" to preserve environment variables in the shell run
> > by "su". But, that options will be ignored while "-l" (aka "--login") is
> > specified.
> 
> This is not true.  See any previous runs of the 32 bit Linux job,
> which worked as expected, because none of these environment variables
> were cleared.

Different su implementations behave differently when "-m" and "-l" are
combined.

util-linux's su has had this since at least 60541961f (docs: improve grammar,
wording and formatting of su man page, 2013-10-12):

       -m, -p, --preserve-environment
              Preserve the entire environment, i.e., it does not set HOME,
              SHELL, USER nor LOGNAME.  This option is ignored if the option
              --login is specified.

https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git/tree/login-utils/su.1#n120

Ubuntu (our Linux32 builder) ships su from shadow-utils:

	Note that the default behavior for the environment is the following:
		The $HOME, $SHELL, $USER, $LOGNAME, $PATH,
		and $IFS environment variables are reset.

		If --login is not used, the environment is copied,
		except for the variables above.

		If --login is used, the $TERM, $COLORTERM, $DISPLAY, and
		$XAUTHORITY environment variables are copied if they were set.

There's no mention of other variables; I _think_ our Linux32 job works
by accident.

Alpine ships su from busybox; this su ignores "-m" if "-l" is set.

-- 
Danh


* Re: [PATCH v3 2/6] ci/lib-docker: preserve required environment variables
  2020-04-03 10:09         ` Danh Doan
@ 2020-04-03 19:55           ` SZEDER Gábor
  0 siblings, 0 replies; 78+ messages in thread
From: SZEDER Gábor @ 2020-04-03 19:55 UTC (permalink / raw)
  To: Danh Doan; +Cc: git

On Fri, Apr 03, 2020 at 05:09:31PM +0700, Danh Doan wrote:
> On 2020-04-03 10:22:54+0200, SZEDER Gábor <szeder.dev@gmail.com> wrote:
> > On Thu, Apr 02, 2020 at 08:04:01PM +0700, Đoàn Trần Công Danh wrote:
> > > We're using "su -m" to preserve environment variables in the shell run
> > > by "su". But, that options will be ignored while "-l" (aka "--login") is
> > > specified.
> > 
> > This is not true.  See any previous runs of the 32 bit Linux job,
> > which worked as expected, because none of these environment variables
> > were cleared.
> 
> Different su have different behavior when combine "-m" and "-l"
> 
> util-linux's su has this as far as 60541961f, (docs: improve grammar,
> wording and formatting of su man page, 2013-10-12)
> 
>        -m, -p, --preserve-environment
>               Preserve the entire environment, i.e., it does not set HOME,
>               SHELL, USER nor LOGNAME.  This option is ignored if the option
>               --login is specified.
> 
> https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git/tree/login-utils/su.1#n120
> 
> Ubuntu (our Linux32 builder), ships su by shadow-utils:
> 
> 	Note that the default behavior for the environment is the following:
> 		The $HOME, $SHELL, $USER, $LOGNAME, $PATH,
> 		and $IFS environment variables are reset.
> 
> 		If --login is not used, the environment is copied,
> 		except for the variables above.
> 
> 		If --login is used, the $TERM, $COLORTERM, $DISPLAY, and
> 		$XAUTHORITY environment variables are copied if they were set.
> 
> There're no mentions of other variables, I _think_ our Linux32 works
> by accident.

We do know which image we use, and we do know how its 'su' behaves.  I
think relying on that is fine, and it's not just "works by accident".

> Alpine ships su from busybox, this su ignores "-m" if "-l" is set.

Then the commit message should specifically mention these behavior
differences between different 'su' variants.



* [PATCH v4 0/6] Travis jobs for linux with musl libc
  2020-03-29 10:12 ` [PATCH v2 0/4] Travis + Azure jobs " Đoàn Trần Công Danh
                     ` (4 preceding siblings ...)
  2020-04-02 13:03   ` [PATCH v3 0/6] " Đoàn Trần Công Danh
@ 2020-04-04  1:08   ` Đoàn Trần Công Danh
  2020-04-04  1:08     ` [PATCH v4 1/6] ci: make MAKEFLAGS available inside the Docker container in the Linux32 job Đoàn Trần Công Danh
                       ` (6 more replies)
  5 siblings, 7 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-04  1:08 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh, Junio C Hamano

Cc: Junio C Hamano <gitster@pobox.com>,
 SZEDER Gábor <szeder.dev@gmail.com>,
 Eric Sunshine <sunshine@sunshineco.com>,
 Johannes Schindelin <johannes.schindelin@gmx.de>

Recently, we've un-broken git for Linux with musl libc,
and we have a series to fix false negatives with busybox shell utils.

Add a CI job on Travis to make sure we won't break it again.

Changes from v3:
* reword 2/6: add the su behavior in util-linux and busybox
* 6/6: Drop the change to azure-pipelines.yml, and declare that this series will
  be used for Travis only, since Azure will be replaced by GitHub Actions
  in a later series.

Hi Junio,
The series for GitHub Actions will need to be rebased on this series again.
6/6 in that series will have UD conflicts.
Please "git rm azure-pipelines.yml" to fix the conflicts.
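
For illustration only, a sketch of that resolution in a throwaway
repository. It assumes a plain "git merge" as a stand-in for the rebase,
and the branch names and file contents are made up; only the file name
comes from this series.

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo one >azure-pipelines.yml
git add azure-pipelines.yml
git commit -q -m "add pipeline file"

# One side deletes the file...
git checkout -q -b delete-side
git rm -q azure-pipelines.yml
git commit -q -m "drop pipeline file"

# ...while the side we stay on modifies it.
git checkout -q -b modify-side HEAD^
echo two >>azure-pipelines.yml
git commit -q -a -m "touch pipeline file"

# The merge stops: modified by us, deleted by them ("UD").
git merge delete-side >/dev/null 2>&1 || true
status_before=$(git status --porcelain)

# Resolve in favour of the deletion and conclude the merge.
git rm -q azure-pipelines.yml
git commit -q -m "merge: keep the deletion"
status_after=$(git status --porcelain)

echo "$status_before"
```
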

SZEDER Gábor (1):
  ci: make MAKEFLAGS available inside the Docker container in the
    Linux32 job

Đoàn Trần Công Danh (5):
  ci/lib-docker: preserve required environment variables
  ci/linux32: parameterise command to switch arch
  ci: refactor docker runner script
  ci/linux32: libify install-dependencies step
  travis: build and test on Linux with musl libc and busybox

 .travis.yml                                   | 10 ++++-
 azure-pipelines.yml                           |  4 +-
 ci/install-docker-dependencies.sh             | 18 +++++++++
 ci/lib.sh                                     |  8 ++++
 ...n-linux32-build.sh => run-docker-build.sh} | 39 +++++++++++++------
 ci/{run-linux32-docker.sh => run-docker.sh}   | 28 ++++++++++---
 6 files changed, 86 insertions(+), 21 deletions(-)
 create mode 100755 ci/install-docker-dependencies.sh
 rename ci/{run-linux32-build.sh => run-docker-build.sh} (63%)
 rename ci/{run-linux32-docker.sh => run-docker.sh} (43%)

Range-diff against v3:
1:  2fdce60075 = 1:  2fdce60075 ci: make MAKEFLAGS available inside the Docker container in the Linux32 job
2:  b7b079f559 ! 2:  db574b3ff9 ci/lib-docker: preserve required environment variables
    @@ Commit message
     
         We're using "su -m" to preserve environment variables in the shell run
         by "su". But, that options will be ignored while "-l" (aka "--login") is
    -    specified.
    +    specified in util-linux and busybox's su.
    +
    +    In a later patch this script will be reused for checking Git for Linux
    +    with musl libc on Alpine Linux, Alpine Linux uses "su" from busybox.
     
         Since we don't have interest in all environment variables,
         pass only those necessary variables to the inner script.
3:  8c8cf3eb24 = 3:  a13715245f ci/linux32: parameterise command to switch arch
4:  22cc7960c3 = 4:  b5de868c1e ci: refactor docker runner script
5:  2e0d54f81e = 5:  c39451ffe5 ci/linux32: libify install-dependencies step
6:  b61ed50cf6 ! 6:  231affae83 travis: build and test on Linux with musl libc and busybox
    @@ .travis.yml: matrix:
            os: linux
            compiler:
     
    - ## azure-pipelines.yml ##
    -@@ azure-pipelines.yml: jobs:
    -       PathtoPublish: t/failed-test-artifacts
    -       ArtifactName: failed-test-artifacts
    - 
    -+- job: linux_musl
    -+  displayName: linux-musl
    -+  condition: succeeded()
    -+  pool:
    -+    vmImage: ubuntu-latest
    -+  steps:
    -+  - bash: |
    -+       test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
    -+
    -+       res=0
    -+       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" jobname=linux-musl bash -lxc ci/run-docker.sh || res=1
    -+
    -+       sudo chmod a+r t/out/TEST-*.xml
    -+       test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts
    -+
    -+       test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || res=1
    -+       exit $res
    -+    displayName: 'jobname=linux-musl ci/run-docker.sh'
    -+    env:
    -+      GITFILESHAREPWD: $(gitfileshare.pwd)
    -+  - task: PublishTestResults@2
    -+    displayName: 'Publish Test Results **/TEST-*.xml'
    -+    inputs:
    -+      mergeTestResults: true
    -+      testRunTitle: 'musl'
    -+      platform: Linux
    -+      publishRunAttachments: false
    -+    condition: succeededOrFailed()
    -+  - task: PublishBuildArtifacts@1
    -+    displayName: 'Publish trash directories of failed tests'
    -+    condition: failed()
    -+    inputs:
    -+      PathtoPublish: t/failed-test-artifacts
    -+      ArtifactName: failed-test-artifacts
    -+
    - - job: static_analysis
    -   displayName: StaticAnalysis
    -   condition: succeeded()
    -
      ## ci/install-docker-dependencies.sh ##
     @@ ci/install-docker-dependencies.sh: Linux32)
      			libssl-dev libexpat-dev gettext python >/dev/null
-- 
2.26.0.334.g6536db25bb


^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCH v4 1/6] ci: make MAKEFLAGS available inside the Docker container in the Linux32 job
  2020-04-04  1:08   ` [PATCH v4 0/6] Travis " Đoàn Trần Công Danh
@ 2020-04-04  1:08     ` Đoàn Trần Công Danh
  2020-04-04  1:08     ` [PATCH v4 2/6] ci/lib-docker: preserve required environment variables Đoàn Trần Công Danh
                       ` (5 subsequent siblings)
  6 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-04  1:08 UTC (permalink / raw)
  To: git; +Cc: SZEDER Gábor, Đoàn Trần Công Danh

From: SZEDER Gábor <szeder.dev@gmail.com>

Once upon a time we ran 'make --jobs=2 ...' to build Git, its
documentation, or to apply Coccinelle semantic patches.  Then commit
eaa62291ff (ci: inherit --jobs via MAKEFLAGS in run-build-and-tests,
2019-01-27) came along, and started using the MAKEFLAGS environment
variable to centralize setting the number of parallel jobs in
'ci/lib.sh'.  Alas, it forgot to update 'ci/run-linux32-docker.sh' to
make MAKEFLAGS available inside the Docker container running the 32
bit Linux job, and, consequently, since then that job builds Git
sequentially, and it ignores any Makefile knobs that we might set in
MAKEFLAGS (though we don't set any for the 32 bit Linux job at the
moment).

So update the 'docker run' invocation in 'ci/run-linux32-docker.sh' to
make MAKEFLAGS available inside the Docker container as well.  Set
CC=gcc for the 32 bit Linux job, because that's the compiler installed
in the 32 bit Linux Docker image that we use (Travis CI nowadays sets
CC=clang by default, but clang is not installed in this image).
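
A minimal sketch of how the two hunks below are meant to combine, assuming
a hypothetical starting value of MAKEFLAGS="--jobs=2" and a CI-provided
CC=clang (neither value is taken from the actual CI configuration): the
job-specific case arm overrides the compiler, and the final assignment
folds it into MAKEFLAGS, which 'docker run --env MAKEFLAGS' can then
forward into the container.

```shell
# Hypothetical starting values, standing in for what the CI system provides.
MAKEFLAGS="--jobs=2"
CC=clang
jobname=Linux32

# Job-specific override, mirroring the case arm added to ci/lib.sh.
case "$jobname" in
Linux32)
	CC=gcc
	;;
esac

# Same assignment as in ci/lib.sh: fall back to "cc" when CC is unset.
MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
echo "$MAKEFLAGS"
```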

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 ci/lib.sh                | 3 +++
 ci/run-linux32-docker.sh | 1 +
 2 files changed, 4 insertions(+)

diff --git a/ci/lib.sh b/ci/lib.sh
index c3a8cd2104..d637825222 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -198,6 +198,9 @@ osx-clang|osx-gcc)
 GIT_TEST_GETTEXT_POISON)
 	export GIT_TEST_GETTEXT_POISON=true
 	;;
+Linux32)
+	CC=gcc
+	;;
 esac
 
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
diff --git a/ci/run-linux32-docker.sh b/ci/run-linux32-docker.sh
index 751acfcf8a..ebb18fa747 100755
--- a/ci/run-linux32-docker.sh
+++ b/ci/run-linux32-docker.sh
@@ -20,6 +20,7 @@ docker run \
 	--env GIT_PROVE_OPTS \
 	--env GIT_TEST_OPTS \
 	--env GIT_TEST_CLONE_2GB \
+	--env MAKEFLAGS \
 	--env cache_dir="$container_cache_dir" \
 	--volume "${PWD}:/usr/src/git" \
 	--volume "$cache_dir:$container_cache_dir" \
-- 
2.26.0.334.g6536db25bb



* [PATCH v4 2/6] ci/lib-docker: preserve required environment variables
  2020-04-04  1:08   ` [PATCH v4 0/6] Travis " Đoàn Trần Công Danh
  2020-04-04  1:08     ` [PATCH v4 1/6] ci: make MAKEFLAGS available inside the Docker container in the Linux32 job Đoàn Trần Công Danh
@ 2020-04-04  1:08     ` Đoàn Trần Công Danh
  2020-04-04  1:08     ` [PATCH v4 3/6] ci/linux32: parameterise command to switch arch Đoàn Trần Công Danh
                       ` (4 subsequent siblings)
  6 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-04  1:08 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

We're using "su -m" to preserve environment variables in the shell run
by "su". But that option is ignored when "-l" (aka "--login") is
specified, in both util-linux's and busybox's su.

In a later patch this script will be reused for checking Git for Linux
with musl libc on Alpine Linux, which uses "su" from busybox.

Since we don't have interest in all environment variables,
pass only those necessary variables to the inner script.
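
A minimal sketch of the quoting technique used in the hunk below, under
stated assumptions: the variable values are made up, and "env -i" is only a
stand-in for the environment reset performed by "su -l". The script is
passed in double quotes so the outer shell expands each value, and each
value is wrapped in single quotes so the inner shell receives it as one
word.

```shell
# Hypothetical values; in the patch these come from the CI environment.
GIT_TEST_OPTS='--verbose-log -x'
cache_dir=/tmp/travis-cache

# "env -i" starts the inner shell with an empty environment, so anything the
# inner script sees must have been written into the script text itself.
inner_output=$(env -i /bin/sh -c "
	export GIT_TEST_OPTS='$GIT_TEST_OPTS'
	export cache_dir='$cache_dir'
	echo \"\$GIT_TEST_OPTS|\$cache_dir\"
")
echo "$inner_output"
```

Note that this approach assumes none of the values contains a single
quote; such a value would end the quoted string early.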

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 ci/run-linux32-build.sh | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index e3a193adbc..7f985615c2 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -51,10 +51,17 @@ else
 fi
 
 # Build and test
-linux32 --32bit i386 su -m -l $CI_USER -c '
+linux32 --32bit i386 su -m -l $CI_USER -c "
 	set -ex
+	export DEVELOPER='$DEVELOPER'
+	export DEFAULT_TEST_TARGET='$DEFAULT_TEST_TARGET'
+	export GIT_PROVE_OPTS='$GIT_PROVE_OPTS'
+	export GIT_TEST_OPTS='$GIT_TEST_OPTS'
+	export GIT_TEST_CLONE_2GB='$GIT_TEST_CLONE_2GB'
+	export MAKEFLAGS='$MAKEFLAGS'
+	export cache_dir='$cache_dir'
 	cd /usr/src/git
-	test -n "$cache_dir" && ln -s "$cache_dir/.prove" t/.prove
+	test -n '$cache_dir' && ln -s '$cache_dir/.prove' t/.prove
 	make
 	make test
-'
+"
-- 
2.26.0.334.g6536db25bb



* [PATCH v4 3/6] ci/linux32: parameterise command to switch arch
  2020-04-04  1:08   ` [PATCH v4 0/6] Travis " Đoàn Trần Công Danh
  2020-04-04  1:08     ` [PATCH v4 1/6] ci: make MAKEFLAGS available inside the Docker container in the Linux32 job Đoàn Trần Công Danh
  2020-04-04  1:08     ` [PATCH v4 2/6] ci/lib-docker: preserve required environment variables Đoàn Trần Công Danh
@ 2020-04-04  1:08     ` Đoàn Trần Công Danh
  2020-04-04  1:08     ` [PATCH v4 4/6] ci: refactor docker runner script Đoàn Trần Công Danh
                       ` (3 subsequent siblings)
  6 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-04  1:08 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

In a later patch, the remainder of this command will be reused for the
CI job for Linux with musl libc.

Allow customisation of the emulator, now.
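
A minimal sketch of the pattern, wrapped in a hypothetical helper named
run_in_arch (the patch itself inlines this logic). The "linux-musl" arm
with an empty switch_cmd is borrowed from the last patch of this series;
the unquoted $switch_cmd is split into words when set and disappears
entirely when empty.

```shell
# Hypothetical helper: run a command under the per-job architecture switch.
run_in_arch () {
	case "$jobname" in
	Linux32)
		switch_cmd="linux32 --32bit i386"
		;;
	linux-musl)
		# No emulator needed; the command runs directly.
		switch_cmd=
		;;
	*)
		echo >&2 "unknown jobname: $jobname"
		return 1
		;;
	esac
	# Deliberately unquoted: an empty switch_cmd expands to nothing.
	command $switch_cmd "$@"
}

jobname=linux-musl
run_in_arch sh -c 'echo built'
```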

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 ci/run-linux32-build.sh  | 13 +++++++++++--
 ci/run-linux32-docker.sh |  2 ++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index 7f985615c2..44bb332f64 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -14,8 +14,17 @@ then
 	exit 1
 fi
 
+case "$jobname" in
+Linux32)
+	switch_cmd="linux32 --32bit i386"
+	;;
+*)
+	exit 1
+	;;
+esac
+
 # Update packages to the latest available versions
-linux32 --32bit i386 sh -c '
+command $switch_cmd sh -c '
     apt update >/dev/null &&
     apt install -y build-essential libcurl4-openssl-dev libssl-dev \
 	libexpat-dev gettext python >/dev/null
@@ -51,7 +60,7 @@ else
 fi
 
 # Build and test
-linux32 --32bit i386 su -m -l $CI_USER -c "
+command $switch_cmd su -m -l $CI_USER -c "
 	set -ex
 	export DEVELOPER='$DEVELOPER'
 	export DEFAULT_TEST_TARGET='$DEFAULT_TEST_TARGET'
diff --git a/ci/run-linux32-docker.sh b/ci/run-linux32-docker.sh
index ebb18fa747..54186b6aa7 100755
--- a/ci/run-linux32-docker.sh
+++ b/ci/run-linux32-docker.sh
@@ -9,6 +9,7 @@ docker pull daald/ubuntu32:xenial
 
 # Use the following command to debug the docker build locally:
 # $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/bash daald/ubuntu32:xenial
+# root@container:/# export jobname=<jobname>
 # root@container:/# /usr/src/git/ci/run-linux32-build.sh <host-user-id>
 
 container_cache_dir=/tmp/travis-cache
@@ -21,6 +22,7 @@ docker run \
 	--env GIT_TEST_OPTS \
 	--env GIT_TEST_CLONE_2GB \
 	--env MAKEFLAGS \
+	--env jobname \
 	--env cache_dir="$container_cache_dir" \
 	--volume "${PWD}:/usr/src/git" \
 	--volume "$cache_dir:$container_cache_dir" \
-- 
2.26.0.334.g6536db25bb



* [PATCH v4 4/6] ci: refactor docker runner script
  2020-04-04  1:08   ` [PATCH v4 0/6] Travis " Đoàn Trần Công Danh
                       ` (2 preceding siblings ...)
  2020-04-04  1:08     ` [PATCH v4 3/6] ci/linux32: parameterise command to switch arch Đoàn Trần Công Danh
@ 2020-04-04  1:08     ` Đoàn Trần Công Danh
  2020-04-04  1:08     ` [PATCH v4 5/6] ci/linux32: libify install-dependencies step Đoàn Trần Công Danh
                       ` (2 subsequent siblings)
  6 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-04  1:08 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

We will add support for an Alpine-based job running in Docker later in
this series.

While we're at it, tell people to run as root in podman,
if podman is used as a drop-in replacement for docker,
because podman maps the host user to the container's root,
thereby mapping their permissions.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 .travis.yml                                   |  2 +-
 azure-pipelines.yml                           |  4 ++--
 ...n-linux32-build.sh => run-docker-build.sh} |  6 ++---
 ci/{run-linux32-docker.sh => run-docker.sh}   | 22 ++++++++++++++-----
 4 files changed, 22 insertions(+), 12 deletions(-)
 rename ci/{run-linux32-build.sh => run-docker-build.sh} (93%)
 rename ci/{run-linux32-docker.sh => run-docker.sh} (53%)

diff --git a/.travis.yml b/.travis.yml
index fc5730b085..069aeeff3c 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -32,7 +32,7 @@ matrix:
       services:
         - docker
       before_install:
-      script: ci/run-linux32-docker.sh
+      script: ci/run-docker.sh
     - env: jobname=StaticAnalysis
       os: linux
       compiler:
diff --git a/azure-pipelines.yml b/azure-pipelines.yml
index 675c3a43c9..11413f66f8 100644
--- a/azure-pipelines.yml
+++ b/azure-pipelines.yml
@@ -489,14 +489,14 @@ jobs:
        test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1
 
        res=0
-       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" bash -lxc ci/run-linux32-docker.sh || res=1
+       sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" jobname=Linux32 bash -lxc ci/run-docker.sh || res=1
 
        sudo chmod a+r t/out/TEST-*.xml
        test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts
 
        test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || res=1
        exit $res
-    displayName: 'ci/run-linux32-docker.sh'
+    displayName: 'jobname=Linux32 ci/run-docker.sh'
     env:
       GITFILESHAREPWD: $(gitfileshare.pwd)
   - task: PublishTestResults@2
diff --git a/ci/run-linux32-build.sh b/ci/run-docker-build.sh
similarity index 93%
rename from ci/run-linux32-build.sh
rename to ci/run-docker-build.sh
index 44bb332f64..a05b48c559 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-docker-build.sh
@@ -1,16 +1,16 @@
 #!/bin/sh
 #
-# Build and test Git in a 32-bit environment
+# Build and test Git inside container
 #
 # Usage:
-#   run-linux32-build.sh <host-user-id>
+#   run-docker-build.sh <host-user-id>
 #
 
 set -ex
 
 if test $# -ne 1 || test -z "$1"
 then
-	echo >&2 "usage: run-linux32-build.sh <host-user-id>"
+	echo >&2 "usage: run-docker-build.sh <host-user-id>"
 	exit 1
 fi
 
diff --git a/ci/run-linux32-docker.sh b/ci/run-docker.sh
similarity index 53%
rename from ci/run-linux32-docker.sh
rename to ci/run-docker.sh
index 54186b6aa7..3881f99b53 100755
--- a/ci/run-linux32-docker.sh
+++ b/ci/run-docker.sh
@@ -1,16 +1,26 @@
 #!/bin/sh
 #
-# Download and run Docker image to build and test 32-bit Git
+# Download and run Docker image to build and test Git
 #
 
 . ${0%/*}/lib.sh
 
-docker pull daald/ubuntu32:xenial
+case "$jobname" in
+Linux32)
+	CI_CONTAINER="daald/ubuntu32:xenial"
+	;;
+*)
+	exit 1
+	;;
+esac
+
+docker pull "$CI_CONTAINER"
 
 # Use the following command to debug the docker build locally:
-# $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/bash daald/ubuntu32:xenial
+# <host-user-id> must be 0 if podman is used as drop-in replacement for docker
+# $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/sh "$CI_CONTAINER"
 # root@container:/# export jobname=<jobname>
-# root@container:/# /usr/src/git/ci/run-linux32-build.sh <host-user-id>
+# root@container:/# /usr/src/git/ci/run-docker-build.sh <host-user-id>
 
 container_cache_dir=/tmp/travis-cache
 
@@ -26,8 +36,8 @@ docker run \
 	--env cache_dir="$container_cache_dir" \
 	--volume "${PWD}:/usr/src/git" \
 	--volume "$cache_dir:$container_cache_dir" \
-	daald/ubuntu32:xenial \
-	/usr/src/git/ci/run-linux32-build.sh $(id -u $USER)
+	"$CI_CONTAINER" \
+	/usr/src/git/ci/run-docker-build.sh $(id -u $USER)
 
 check_unignored_build_artifacts
 
-- 
2.26.0.334.g6536db25bb



* [PATCH v4 5/6] ci/linux32: libify install-dependencies step
  2020-04-04  1:08   ` [PATCH v4 0/6] Travis " Đoàn Trần Công Danh
                       ` (3 preceding siblings ...)
  2020-04-04  1:08     ` [PATCH v4 4/6] ci: refactor docker runner script Đoàn Trần Công Danh
@ 2020-04-04  1:08     ` Đoàn Trần Công Danh
  2020-04-04  1:08     ` [PATCH v4 6/6] travis: build and test on Linux with musl libc and busybox Đoàn Trần Công Danh
  2020-04-05 20:39     ` [PATCH v4 0/6] Travis jobs for linux with musl libc Junio C Hamano
  6 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-04  1:08 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

In a later patch, we will add a new Travis job for linux-musl.
Most of the other code in this file can be reused for that job.

Move the code that installs dependencies to a common script.
Should we add a new CI system that can run directly in a container,
we can reuse this script for the installation step.
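
A small sketch of how the call site in the hunk below locates the common
script: "${0%/*}" strips the last path component from the running script's
path, leaving its directory. Here a plain variable named "script" stands in
for $0 (an assumption made so the snippet can run outside the container).

```shell
# Path taken from the 'docker run' invocation in this series; any path with
# a directory component behaves the same way.
script=/usr/src/git/ci/run-docker-build.sh

# Remove the shortest suffix matching "/*", i.e. the file name.
helper="${script%/*}/install-docker-dependencies.sh"
echo "$helper"
```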

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 ci/install-docker-dependencies.sh | 14 ++++++++++++++
 ci/run-docker-build.sh            |  7 +------
 2 files changed, 15 insertions(+), 6 deletions(-)
 create mode 100755 ci/install-docker-dependencies.sh

diff --git a/ci/install-docker-dependencies.sh b/ci/install-docker-dependencies.sh
new file mode 100755
index 0000000000..a104c61d29
--- /dev/null
+++ b/ci/install-docker-dependencies.sh
@@ -0,0 +1,14 @@
+#!/bin/sh
+#
+# Install dependencies required to build and test Git inside container
+#
+
+case "$jobname" in
+Linux32)
+	linux32 --32bit i386 sh -c '
+		apt update >/dev/null &&
+		apt install -y build-essential libcurl4-openssl-dev \
+			libssl-dev libexpat-dev gettext python >/dev/null
+	'
+	;;
+esac
diff --git a/ci/run-docker-build.sh b/ci/run-docker-build.sh
index a05b48c559..4a153492ba 100755
--- a/ci/run-docker-build.sh
+++ b/ci/run-docker-build.sh
@@ -23,12 +23,7 @@ Linux32)
 	;;
 esac
 
-# Update packages to the latest available versions
-command $switch_cmd sh -c '
-    apt update >/dev/null &&
-    apt install -y build-essential libcurl4-openssl-dev libssl-dev \
-	libexpat-dev gettext python >/dev/null
-'
+"${0%/*}/install-docker-dependencies.sh"
 
 # If this script runs inside a docker container, then all commands are
 # usually executed as root. Consequently, the host user might not be
-- 
2.26.0.334.g6536db25bb



* [PATCH v4 6/6] travis: build and test on Linux with musl libc and busybox
  2020-04-04  1:08   ` [PATCH v4 0/6] Travis " Đoàn Trần Công Danh
                       ` (4 preceding siblings ...)
  2020-04-04  1:08     ` [PATCH v4 5/6] ci/linux32: libify install-dependencies step Đoàn Trần Công Danh
@ 2020-04-04  1:08     ` Đoàn Trần Công Danh
  2020-04-05 20:39     ` [PATCH v4 0/6] Travis jobs for linux with musl libc Junio C Hamano
  6 siblings, 0 replies; 78+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-04  1:08 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 .travis.yml                       | 8 ++++++++
 ci/install-docker-dependencies.sh | 4 ++++
 ci/lib.sh                         | 5 +++++
 ci/run-docker-build.sh            | 4 ++++
 ci/run-docker.sh                  | 3 +++
 5 files changed, 24 insertions(+)

diff --git a/.travis.yml b/.travis.yml
index 069aeeff3c..0cfc3c3428 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -33,6 +33,14 @@ matrix:
         - docker
       before_install:
       script: ci/run-docker.sh
+    - env: jobname=linux-musl
+      os: linux
+      compiler:
+      addons:
+      services:
+        - docker
+      before_install:
+      script: ci/run-docker.sh
     - env: jobname=StaticAnalysis
       os: linux
       compiler:
diff --git a/ci/install-docker-dependencies.sh b/ci/install-docker-dependencies.sh
index a104c61d29..26a6689766 100755
--- a/ci/install-docker-dependencies.sh
+++ b/ci/install-docker-dependencies.sh
@@ -11,4 +11,8 @@ Linux32)
 			libssl-dev libexpat-dev gettext python >/dev/null
 	'
 	;;
+linux-musl)
+	apk add --update build-base curl-dev openssl-dev expat-dev gettext \
+		pcre2-dev python3 musl-libintl perl-utils ncurses >/dev/null
+	;;
 esac
diff --git a/ci/lib.sh b/ci/lib.sh
index d637825222..87cd29bab6 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -201,6 +201,11 @@ GIT_TEST_GETTEXT_POISON)
 Linux32)
 	CC=gcc
 	;;
+linux-musl)
+	CC=gcc
+	MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3 USE_LIBPCRE2=Yes"
+	MAKEFLAGS="$MAKEFLAGS NO_REGEX=Yes ICONV_OMITS_BOM=Yes"
+	;;
 esac
 
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
diff --git a/ci/run-docker-build.sh b/ci/run-docker-build.sh
index 4a153492ba..8d47a5fda3 100755
--- a/ci/run-docker-build.sh
+++ b/ci/run-docker-build.sh
@@ -18,6 +18,10 @@ case "$jobname" in
 Linux32)
 	switch_cmd="linux32 --32bit i386"
 	;;
+linux-musl)
+	switch_cmd=
+	useradd () { adduser -D "$@"; }
+	;;
 *)
 	exit 1
 	;;
diff --git a/ci/run-docker.sh b/ci/run-docker.sh
index 3881f99b53..37fa372052 100755
--- a/ci/run-docker.sh
+++ b/ci/run-docker.sh
@@ -9,6 +9,9 @@ case "$jobname" in
 Linux32)
 	CI_CONTAINER="daald/ubuntu32:xenial"
 	;;
+linux-musl)
+	CI_CONTAINER=alpine
+	;;
 *)
 	exit 1
 	;;
-- 
2.26.0.334.g6536db25bb



* Re: [PATCH v4 0/6] Travis jobs for linux with musl libc
  2020-04-04  1:08   ` [PATCH v4 0/6] Travis " Đoàn Trần Công Danh
                       ` (5 preceding siblings ...)
  2020-04-04  1:08     ` [PATCH v4 6/6] travis: build and test on Linux with musl libc and busybox Đoàn Trần Công Danh
@ 2020-04-05 20:39     ` Junio C Hamano
  2020-04-07 14:55       ` Johannes Schindelin
  6 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2020-04-05 20:39 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> The series for GitHub Actions will need to be rebased on this series again.
> 6/6 in that series will have UD conflicts.
> Please "git rm azure-pipelines.yml" to fix conflicts.

If you want to keep doing this, please take over the ownership of
both series and build one on top of the other.  I asked you and
Dscho to coordinate and work together, but Dscho seems to be
comfortable with the idea of letting you touch his series, so doing
so would still count as you two working together ;-).  I do not have
a strong opinion which parts should come first (it is something that
can be decided between you two which way is cleaner).

Thanks.



* Re: [PATCH v2 3/4] travis: build and test on Linux with musl libc and busybox
  2020-04-01 22:18     ` SZEDER Gábor
  2020-04-02  1:42       ` Danh Doan
@ 2020-04-07 14:53       ` Johannes Schindelin
  2020-04-07 21:35         ` Junio C Hamano
  1 sibling, 1 reply; 78+ messages in thread
From: Johannes Schindelin @ 2020-04-07 14:53 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Đoàn Trần Công Danh, git



Hi Gábor,

On Thu, 2 Apr 2020, SZEDER Gábor wrote:

> On Sun, Mar 29, 2020 at 05:12:31PM +0700, Đoàn Trần Công Danh wrote:
> > Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
>
> > +	set -ex
> > +	cd /usr/src/git
> > +	test -n "$cache_dir" && ln -s "$cache_dir/.prove" t/.prove
> > +	autoconf
> > +	echo "PYTHON_PATH=/usr/bin/python3" >config.mak
> > +	./configure --with-libpcre
>
> The recommended way to build Git is without autoconf and configure.

That is news to me.

My understanding still is that `make` is the recommended way to build Git,
and `./configure` is only for those who want to use autoconf.

It seems that the `INSTALL` file agrees with my understanding:

-- snip --

                Git installation

Normally you can just do "make" followed by "make install", and that
will install the git programs in your own ~/bin/ directory.  If you want
to do a global install, you can do

        $ make prefix=/usr all doc info ;# as yourself
        # make prefix=/usr install install-doc install-html install-info
        # ;# as root

(or prefix=/usr/local, of course).  Just like any program suite
that uses $prefix, the built results have some paths encoded,
which are derived from $prefix, so "make all; make prefix=/usr
install" would not work.

The beginning of the Makefile documents many variables that affect the way
git is built.  You can override them either from the command line, or in a
config.mak file.

Alternatively you can use autoconf generated ./configure script to
set up install paths (via config.mak.autogen), so you can write instead

        $ make configure ;# as yourself
        $ ./configure --prefix=/usr ;# as yourself
        $ make all doc ;# as yourself
        # make install install-doc install-html;# as root
-- snap --

If you think that I am wrong, I invite you to change the recommendation by
proposing a patch to `INSTALL`, to change the current recommendation.

Ciao,
Dscho


* Re: [PATCH v4 0/6] Travis jobs for linux with musl libc
  2020-04-05 20:39     ` [PATCH v4 0/6] Travis jobs for linux with musl libc Junio C Hamano
@ 2020-04-07 14:55       ` Johannes Schindelin
  2020-04-07 19:25         ` Junio C Hamano
  0 siblings, 1 reply; 78+ messages in thread
From: Johannes Schindelin @ 2020-04-07 14:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Đoàn Trần Công Danh, git



Hi Junio,

On Sun, 5 Apr 2020, Junio C Hamano wrote:

> Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:
>
> > The series for GitHub Actions will need to be rebased on this series again.
> > 6/6 in that series will have UD conflicts.
> > Please "git rm azure-pipelines.yml" to fix conflicts.
>
> If you want to keep doing this, please take over the ownership of
> both series and build one on top of the other.  I asked you and
> Dscho to coordinate and work together, but Dscho seems to be
> comfortable with the idea of letting you touch his series, so doing
> so would still count as you two working together ;-).  I do not have
> a strong opinion which parts should come first (it is something that
> can be decided between you two which way is cleaner).

For the record, Danh and I _are_ working together, and yes, I am totally
comfortable with him taking the lead on submitting the patch series'
iterations.

Thanks,
Dscho


* Re: [PATCH v4 0/6] Travis jobs for linux with musl libc
  2020-04-07 14:55       ` Johannes Schindelin
@ 2020-04-07 19:25         ` Junio C Hamano
  0 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2020-04-07 19:25 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Đoàn Trần Công Danh, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Junio,
>
> On Sun, 5 Apr 2020, Junio C Hamano wrote:
>
>> Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:
>>
>> > The series for GitHub Actions will need to be rebased on this series again.
>> > 6/6 in that seriess will have UD conflicts.
>> > Please "git rm azure-pipelines.yml" to fix conflicts.
>>
>> If you want to keep doing this, please take over the ownership of
>> both series and build one on top of the other.  I asked you and
>> Dscho to coordinate and work together, but Dscho seems to be
>> comfortable with the idea of letting you touch his series, so doing
>> so would still count as you two working together ;-).  I do not have
>> a strong opinion which parts should come first (it is something that
>> can be decided between you two which way is cleaner).
>
> For the record, Danh and I _are_ working together, and yes, I am totally
> comfortable with him taking the lead on submitting the patch series'
> iterations.

Good.  Thanks, both.


* Re: [PATCH v2 3/4] travis: build and test on Linux with musl libc and busybox
  2020-04-07 14:53       ` Johannes Schindelin
@ 2020-04-07 21:35         ` Junio C Hamano
  2020-04-10 13:38           ` Johannes Schindelin
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2020-04-07 21:35 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: SZEDER Gábor, Đoàn Trần Công Danh, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> The recommended way to build Git is without autoconf and configure.
>
> That is news to me.
>
> My understanding still is that `make` is the recommended way to build Git,
> and `./configure` is only for those who want to use autoconf.
>
> It seems that the `INSTALL` file agrees with my understanding:

Did you misread the sentence you quoted?  It says "without", not
"with", so I think you two are on the same page.


* Re: [PATCH v2 3/4] travis: build and test on Linux with musl libc and busybox
  2020-04-07 21:35         ` Junio C Hamano
@ 2020-04-10 13:38           ` Johannes Schindelin
  0 siblings, 0 replies; 78+ messages in thread
From: Johannes Schindelin @ 2020-04-10 13:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: SZEDER Gábor, Đoàn Trần Công Danh, git

Hi Junio,

On Tue, 7 Apr 2020, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> >> The recommended way to build Git is without autoconf and configure.
> >
> > That is news to me.
> >
> > My understanding still is that `make` is the recommended way to build Git,
> > and `./configure` is only for those who want to use autoconf.
> >
> > It seems that the `INSTALL` file agrees with my understanding:
>
> Did you misread the sentence you quoted?  It says "without", not
> "with", so I think you two are on the same page.

Yes, I misread it. Thanks for pointing that out.

Ciao,
Dscho


* Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-03-31 16:27   ` Elijah Newren
  2020-03-31 18:34     ` Junio C Hamano
@ 2020-04-10 22:27     ` Jonathan Tan
  2020-04-11  0:06       ` Elijah Newren
  1 sibling, 1 reply; 78+ messages in thread
From: Jonathan Tan @ 2020-04-10 22:27 UTC (permalink / raw)
  To: newren; +Cc: jonathantanmy, git, congdanhqx, gitster, stolee

> > +If `--keep-cherry-pick is given`, all commits (including these) will be
> > +re-applied. This allows rebase to forgo reading all upstream commits,
> > +potentially improving performance.
> 
> I'm slightly worried that "keep" is setting up an incorrect
> expectation for users; in most cases, a reapplied cherry-pick will
> result in the merge machinery applying no new changes (they were
> already applied) and then rebase's default of dropping commits which
> become empty will kick in and drop the commit.
> 
> Maybe the name is fine and we just need to be more clear in the text
> on the expected behavior and advantages and disadvantages of this
> option:
> 
> If `--keep-cherry-picks` is given, all commits (including these) will be
> re-applied.  Note that cherry picks are likely to result in no changes
> when being reapplied and thus are likely to be dropped anyway (assuming
> the default --empty=drop behavior).  The advantage of this option, is it
> allows rebase to forgo reading all upstream commits, potentially
> improving performance.  The disadvantage of this option is that in some
> cases, the code has drifted such that reapplying a cherry-pick is not
> detectable as a no-op, and instead results in conflicts for the user to
> manually resolve (usually via `git rebase --skip`).
> 
> It may also be helpful to prevent users from making a false inference
> by renaming these options to --[no-]reapply-cherry-pick[s].  Sorry to
> bring this up so late after earlier saying --[no-]keep-cherry-pick[s]
> was fine; didn't occur to me then.  If you want to keep the name, the
> extended paragraph should be good enough.

Sorry for getting back to this so late. After some thought, I'm liking
--reapply-cherry-picks too. Perhaps documented like this:

  Reapply all clean cherry-picks of any upstream commit instead of
  dropping them. (If these commits then become empty after rebasing,
  because they contain a subset of already upstream changes, the
  behavior towards them is controlled by the `--empty` flag.)

  By default (or if `--no-reapply-cherry-picks` is given), these commits
  will be automatically dropped. Because this necessitates reading all
  upstream commits, this can be expensive in repos with a large number
  of upstream commits that need to be read.

  `--reapply-cherry-picks` allows rebase to forgo reading all upstream
  commits, potentially improving performance.

  See also INCOMPATIBLE OPTIONS below.

This also makes me realize that we probably need to change the "--empty"
documentation too. Maybe:

   --empty={drop,keep,ask}::
  -       How to handle commits that are not empty to start and are not
  -       clean cherry-picks of any upstream commit, but which become
  +       How to handle commits that become
          empty after rebasing (because they contain a subset of already
          upstream changes).  With drop (the default), commits that
          become empty are dropped.  With keep, such commits are kept.
          With ask (implied by --interactive), the rebase will halt when
          an empty commit is applied allowing you to choose whether to
          drop it, edit files more, or just commit the empty changes.
          Other options, like --exec, will use the default of drop unless
          -i/--interactive is explicitly specified.
   +
  -Note that commits which start empty are kept, and commits which are
  -clean cherry-picks (as determined by `git log --cherry-mark ...`) are
  -always dropped.
  +Commits that start empty are always kept.
  ++
  +Commits that are clean cherry-picks of any upstream commit (as determined by
  +`git log --cherry-mark ...`) are always dropped, unless
  +`--reapply-cherry-picks` is set, in which case they are reapplied. If they
  +become empty after rebasing, `--empty` determines what happens to them.
   +
   See also INCOMPATIBLE OPTIONS below.

If this works, I'll send out a new version containing Elijah's patches
and mine in whatever branch my patch shows up in [1].

[1] https://lore.kernel.org/git/xmqqd08fhvx5.fsf@gitster.c.googlers.com/

> > @@ -568,6 +583,9 @@ In addition, the following pairs of options are incompatible:
> >   * --keep-base and --onto
> >   * --keep-base and --root
> >
> > +Also, the --keep-cherry-pick option requires the use of the merge backend
> > +(e.g., through --merge).
> 
> Why not just list --keep-cherry-pick[s] in the list of options that
> require use of the merge backend (i.e. the list containing '--merge')
> instead of adding another sentence here?

My reading of the list containing "--merge" is that they *trigger* the
merge backend, not require the merge backend. My new option requires but
does not trigger it (unless we want to change it to do so, which I'm
fine with).

* Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-04-10 22:27     ` Jonathan Tan
@ 2020-04-11  0:06       ` Elijah Newren
  2020-04-11  1:11         ` Jonathan Tan
  0 siblings, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2020-04-11  0:06 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Git Mailing List, congdanhqx, Junio C Hamano, Derrick Stolee

On Fri, Apr 10, 2020 at 3:27 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> > > +If `--keep-cherry-pick` is given, all commits (including these) will be
> > > +re-applied. This allows rebase to forgo reading all upstream commits,
> > > +potentially improving performance.
> >
> > I'm slightly worried that "keep" is setting up an incorrect
> > expectation for users; in most cases, a reapplied cherry-pick will
> > result in the merge machinery applying no new changes (they were
> > already applied) and then rebase's default of dropping commits which
> > become empty will kick in and drop the commit.
> >
> > Maybe the name is fine and we just need to be more clear in the text
> > on the expected behavior and advantages and disadvantages of this
> > option:
> >
> > If `--keep-cherry-picks` is given, all commits (including these) will be
> > re-applied.  Note that cherry picks are likely to result in no changes
> > when being reapplied and thus are likely to be dropped anyway (assuming
> > the default --empty=drop behavior).  The advantage of this option is that
> > it allows rebase to forgo reading all upstream commits, potentially
> > improving performance.  The disadvantage of this option is that in some
> > cases, the code has drifted such that reapplying a cherry-pick is not
> > detectable as a no-op, and instead results in conflicts for the user to
> > manually resolve (usually via `git rebase --skip`).
> >
> > It may also be helpful to prevent users from making a false inference
> > by renaming these options to --[no-]reapply-cherry-pick[s].  Sorry to
> > bring this up so late after earlier saying --[no-]keep-cherry-pick[s]
> > was fine; didn't occur to me then.  If you want to keep the name, the
> > extended paragraph should be good enough.
>
> Sorry for getting back to this so late. After some thought, I'm liking
> --reapply-cherry-picks too. Perhaps documented like this:
>
>   Reapply all clean cherry-picks of any upstream commit instead of
>   dropping them. (If these commits then become empty after rebasing,
>   because they contain a subset of already upstream changes, the
>   behavior towards them is controlled by the `--empty` flag.)

Perhaps add "preemptively" in there, so that it reads "...instead of
preemptively dropping them..."?

>   By default (or if `--no-reapply-cherry-picks` is given), these commits
>   will be automatically dropped. Because this necessitates reading all
>   upstream commits, this can be expensive in repos with a large number
>   of upstream commits that need to be read.
>
>   `--reapply-cherry-picks` allows rebase to forgo reading all upstream
>   commits, potentially improving performance.
>
>   See also INCOMPATIBLE OPTIONS below.

Otherwise, this description looks good to me.

> This also makes me realize that we probably need to change the "--empty"
> documentation too. Maybe:
>
>    --empty={drop,keep,ask}::
>   -       How to handle commits that are not empty to start and are not
>   -       clean cherry-picks of any upstream commit, but which become
>   +       How to handle commits that become
>           empty after rebasing (because they contain a subset of already
>           upstream changes).  With drop (the default), commits that
>           become empty are dropped.  With keep, such commits are kept.
>           With ask (implied by --interactive), the rebase will halt when
>           an empty commit is applied allowing you to choose whether to
>           drop it, edit files more, or just commit the empty changes.
>           Other options, like --exec, will use the default of drop unless
>           -i/--interactive is explicitly specified.
>    +
>   -Note that commits which start empty are kept, and commits which are
>   -clean cherry-picks (as determined by `git log --cherry-mark ...`) are
>   -always dropped.
>   +Commits that start empty are always kept.
>   ++
>   +Commits that are clean cherry-picks of any upstream commit (as determined by
>   +`git log --cherry-mark ...`) are always dropped, unless
>   +`--reapply-cherry-picks` is set, in which case they are reapplied. If they
>   +become empty after rebasing, `--empty` determines what happens to them.
>    +
>    See also INCOMPATIBLE OPTIONS below.
>
> If this works, I'll send out a new version containing Elijah's patches
> and mine in whatever branch my patch shows up in [1].
>
> [1] https://lore.kernel.org/git/xmqqd08fhvx5.fsf@gitster.c.googlers.com/

Yeah, I was making changes to this exact same area in my series to
reference your flags.[2]

[2] https://lore.kernel.org/git/e15c599c874956f1a297424c68fe28e04c71807b.1586541094.git.gitgitgadget@gmail.com/

Would you mind if I took your proposed changes, put them in your
patch, and then rebased your patch on top of my series and touched up
the wording in the manpage to have the options reference each other?

> > > @@ -568,6 +583,9 @@ In addition, the following pairs of options are incompatible:
> > >   * --keep-base and --onto
> > >   * --keep-base and --root
> > >
> > > +Also, the --keep-cherry-pick option requires the use of the merge backend
> > > +(e.g., through --merge).
> >
> > Why not just list --keep-cherry-pick[s] in the list of options that
> > require use of the merge backend (i.e. the list containing '--merge')
> > instead of adding another sentence here?
>
> My reading of the list containing "--merge" is that they *trigger* the
> merge backend, not require the merge backend. My new option requires but
> does not trigger it (unless we want to change it to do so, which I'm
> fine with).

Interesting; what part of the man page comes across that way?  That
may just be poor wording.

However, if an option requires a certain backend, is there a reason
why we would want to require the user to manually specify that backend
for their chosen option to work?  We know exactly which backend they
need, so we could just trigger it.  For every other case in rebase I
can think of, whenever a certain backend was required for an option we
always made the option trigger that backend (or throw an error if a
different backend had already been requested).
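
The "trigger the backend, or error out if a different one was already requested" rule can be sketched as a small shell function. This is a hypothetical illustration, not git's implementation: `imply_backend`, `pick_backend`, the error text, and the handful of options handled are all made up for the example.

```shell
# Hypothetical sketch; names and messages are illustrative only.
backend=""   # empty means "no backend requested yet"

# imply_backend WANT OPTION: select WANT, or fail if a different
# backend was already requested by an earlier option.
imply_backend() {
    if [ -n "$backend" ] && [ "$backend" != "$1" ]; then
        echo "error: $2 requires the $1 backend" >&2
        return 1
    fi
    backend=$1
}

# pick_backend ARGS...: walk the options and print the chosen backend.
pick_backend() {
    backend=""
    for arg in "$@"; do
        case $arg in
        --apply)
            imply_backend apply "$arg" || return 1 ;;
        --merge|--reapply-cherry-picks)
            imply_backend merge "$arg" || return 1 ;;
        esac
    done
    echo "${backend:-default}"
}

pick_backend --reapply-cherry-picks
pick_backend --apply --reapply-cherry-picks || echo "rejected"
```

Under this scheme the user never has to name the backend by hand; an explicit conflicting request is the only case that fails.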

* Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-04-11  0:06       ` Elijah Newren
@ 2020-04-11  1:11         ` Jonathan Tan
  2020-04-11  2:46           ` Elijah Newren
  0 siblings, 1 reply; 78+ messages in thread
From: Jonathan Tan @ 2020-04-11  1:11 UTC (permalink / raw)
  To: newren; +Cc: jonathantanmy, git, congdanhqx, gitster, stolee

> >   Reapply all clean cherry-picks of any upstream commit instead of
> >   dropping them. (If these commits then become empty after rebasing,
> >   because they contain a subset of already upstream changes, the
> >   behavior towards them is controlled by the `--empty` flag.)
> 
> Perhaps add "preemptively" in there, so that it reads "...instead of
> preemptively dropping them..."?

Sounds good. Yes, I can do this.

> > If this works, I'll send out a new version containing Elijah's patches
> > and mine in whatever branch my patch shows up in [1].
> >
> > [1] https://lore.kernel.org/git/xmqqd08fhvx5.fsf@gitster.c.googlers.com/
> 
> Yeah, I was making changes to this exact same area in my series to
> reference your flags.[2]
> 
> [2] https://lore.kernel.org/git/e15c599c874956f1a297424c68fe28e04c71807b.1586541094.git.gitgitgadget@gmail.com/
> 
> Would you mind if I took your proposed changes, put them in your
> patch, and then rebased your patch on top of my series and touched up
> the wording in the manpage to have the options reference each other?

Go ahead! Thanks.

> > > Why not just list --keep-cherry-pick[s] in the list of options that
> > > require use of the merge backend (i.e. the list containing '--merge')
> > > instead of adding another sentence here?
> >
> > My reading of the list containing "--merge" is that they *trigger* the
> > merge backend, not require the merge backend. My new option requires but
> > does not trigger it (unless we want to change it to do so, which I'm
> > fine with).
> 
> Interesting; what part of the man page comes across that way?  That
> may just be poor wording.

"--merge" is documented as "Use merging strategies to rebase", which I
interpret as triggering the merge backend. There are other things in the
list like "--strategy" and "--interactive", which seem to be things that
trigger the merge backend too, so I concluded that the list is about
triggering the merge backend, not requiring it.

> However, if an option requires a certain backend, is there a reason
> why we would want to require the user to manually specify that backend
> for their chosen option to work?  We know exactly which backend they
> need, so we could just trigger it.  For every other case in rebase I
> can think of, whenever a certain backend was required for an option we
> always made the option trigger that backend (or throw an error if a
> different backend had already been requested).

I guess I wanted to leave open the option to have the same feature in
the "apply" (formerly "am") backend. The use cases I am thinking of
won't need that in the near future (for partial clone to make use of it
in the "apply" backend, the "apply" backend would have to be further
improved to batch fetching of missing blobs), though, so it might be
best to just require and trigger "merge" (like the other cases you
mention). I'll do that in the next version.

* Re: [PATCH v3] rebase --merge: optionally skip upstreamed commits
  2020-04-11  1:11         ` Jonathan Tan
@ 2020-04-11  2:46           ` Elijah Newren
  0 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2020-04-11  2:46 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Git Mailing List, congdanhqx, Junio C Hamano, Derrick Stolee

On Fri, Apr 10, 2020 at 6:11 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> > >   Reapply all clean cherry-picks of any upstream commit instead of
> > >   dropping them. (If these commits then become empty after rebasing,
> > >   because they contain a subset of already upstream changes, the
> > >   behavior towards them is controlled by the `--empty` flag.)
> >
> > Perhaps add "preemptively" in there, so that it reads "...instead of
> > preemptively dropping them..."?
>
> Sounds good. Yes I can do this.
>
> > > If this works, I'll send out a new version containing Elijah's patches
> > > and mine in whatever branch my patch shows up in [1].
> > >
> > > [1] https://lore.kernel.org/git/xmqqd08fhvx5.fsf@gitster.c.googlers.com/
> >
> > Yeah, I was making changes to this exact same area in my series to
> > reference your flags.[2]
> >
> > [2] https://lore.kernel.org/git/e15c599c874956f1a297424c68fe28e04c71807b.1586541094.git.gitgitgadget@gmail.com/
> >
> > Would you mind if I took your proposed changes, put them in your
> > patch, and then rebased your patch on top of my series and touched up
> > the wording in the manpage to have the options reference each other?
>
> Go ahead! Thanks.

Cool, please double check that I made the changes as you expected:

https://lore.kernel.org/git/20d3a50f5a4bf91223c1b849d91e790683d70d66.1586573068.git.gitgitgadget@gmail.com/

> > > > Why not just list --keep-cherry-pick[s] in the list of options that
> > > > require use of the merge backend (i.e. the list containing '--merge')
> > > > instead of adding another sentence here?
> > >
> > > My reading of the list containing "--merge" is that they *trigger* the
> > > merge backend, not require the merge backend. My new option requires but
> > > does not trigger it (unless we want to change it to do so, which I'm
> > > fine with).
> >
> > Interesting; what part of the man page comes across that way?  That
> > may just be poor wording.
>
> "--merge" is documented as "Use merging strategies to rebase", which I
> interpret as triggering the merge backend. There are other things in the
> list like "--strategy" and "--interactive", which seem to be things that
> trigger the merge backend too, so I concluded that the list is about
> triggering the merge backend, not requiring it.
>
> > However, if an option requires a certain backend, is there a reason
> > why we would want to require the user to manually specify that backend
> > for their chosen option to work?  We know exactly which backend they
> > need, so we could just trigger it.  For every other case in rebase I
> > can think of, whenever a certain backend was required for an option we
> > always made the option trigger that backend (or throw an error if a
> > different backend had already been requested).
>
> I guess I wanted to leave open the option to have the same feature in
> the "apply" (formerly "am") backend. The use cases I am thinking of
> won't need that in the near future (for partial clone to make use of it
> in the "apply" backend, the "apply" backend would have to be further
> improved to batch fetching of missing blobs), though, so it might be
> best to just require and trigger "merge" (like the other cases you
> mention). I'll do that in the next version.

Putting options in the list doesn't mean that they're designed to work
with only one backend; the list just reflects what the current
requirements/incompatibilities are.  We've removed options from the
list before when we implemented them in the other backend(s).
