git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Allow --ancestry-path to take an argument
@ 2022-08-17  2:48 Elijah Newren via GitGitGadget
  2022-08-17  2:48 ` [PATCH 1/2] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-08-17  2:48 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

This came out of a previous thread[1], where I wanted to be able to run
something like

git log --oneline --ancestry-path=ab/submodule-cleanup main..seen


and see the commits in main..seen which contained ab/submodule-cleanup in
their ancestry path. Let me start by defining the terminology "X is in a
commit's ancestry path". By that, I just mean that either the commit is X,
the commit is an ancestor of X, or the commit is a descendant of X. With
that definition...

The command

git log --ancestry-path A..B


means find the commits in A..B which contain A in their ancestry path. I
sometimes still want to use A..B to get the basic range, but would like to
use a commit other than A for specifying which ancestry path is of interest.
So, for example, I might want to use

git log --ancestry-path=C A..B


to mean find the commits in A..B which contain C in their ancestry path, or
use

git log --ancestry-path=C --ancestry-path=D A..B


to mean find the commits in A..B which contain either C or D in their
ancestry path.

This series implements this request, by allowing --ancestry-path to take an
optional argument. With it, I can find the answer to my question in the
thread at [1] within the git.git repository (replacing branch names with
actual hashes since the branches have since moved on):

$ git log --oneline --ancestry-path=5b893f7d81 8168d5e9c2..ac0248bfba | wc -l
36


This returns the answer I want, whereas dropping the '=5b893f7d81' from the
command line gives me 192 unwanted commits (228 total), and various other
command line flags (--first-parent, --boundary, etc.) also fail to give me
the set of commits I am looking for.

[1]
https://lore.kernel.org/git/CABPp-BF+8aqysioP_e27Q9kJ02rE2SuSqXu+XphzKWnk5a_Q+A@mail.gmail.com/

Elijah Newren (2):
  rev-list-options.txt: fix simple typo
  revision: allow --ancestry-path to take an argument

 Documentation/rev-list-options.txt | 47 +++++++++++++----
 object.h                           |  2 +-
 revision.c                         | 83 ++++++++++++++++++++----------
 revision.h                         |  3 ++
 t/t6019-rev-list-ancestry-path.sh  | 47 ++++++++++++++++-
 5 files changed, 140 insertions(+), 42 deletions(-)


base-commit: 6a475b71f8c4ce708d69fdc9317aefbde3769e25
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1303%2Fnewren%2Fancestry-path-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1303/newren/ancestry-path-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1303
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/2] rev-list-options.txt: fix simple typo
  2022-08-17  2:48 [PATCH 0/2] Allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
@ 2022-08-17  2:48 ` Elijah Newren via GitGitGadget
  2022-08-17  2:48 ` [PATCH 2/2] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
  2022-08-18  6:17 ` [PATCH v2 0/2] Allow " Elijah Newren via GitGitGadget
  2 siblings, 0 replies; 29+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-08-17  2:48 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/rev-list-options.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 195e74eec63..2f85726745a 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -659,7 +659,7 @@ Here, the merge commits `O` and `P` contribute extra noise, as they did
 not actually contribute a change to `file.txt`. They only merged a topic
 that was based on an older version of `file.txt`. This is a common
 issue in repositories using a workflow where many contributors work in
-parallel and merge their topic branches along a single trunk: manu
+parallel and merge their topic branches along a single trunk: many
 unrelated merges appear in the `--full-history` results.
 
 When using the `--simplify-merges` option, the commits `O` and `P`
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/2] revision: allow --ancestry-path to take an argument
  2022-08-17  2:48 [PATCH 0/2] Allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
  2022-08-17  2:48 ` [PATCH 1/2] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
@ 2022-08-17  2:48 ` Elijah Newren via GitGitGadget
  2022-08-17 22:42   ` Junio C Hamano
  2022-08-18  6:17 ` [PATCH v2 0/2] Allow " Elijah Newren via GitGitGadget
  2 siblings, 1 reply; 29+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-08-17  2:48 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

We have long allowed users to run e.g.
    git log --ancestry-path next..seen
which shows all commits which satisfy all three of these criteria:
  * are an ancestor of seen
  * are not an ancestor next
  * have next as an ancestor

This commit allows another variant:
    git log --ancestry-path=$TOPIC next..seen
which shows all commits which satisfy all of these criteria:
  * are an ancestor of seen
  * are not an ancestor of next
  * have $TOPIC in their ancestry-path
that last bullet can be defined as commits meeting any of these
criteria:
    * are an ancestor of $TOPIC
    * have $TOPIC as an ancestor
    * are $TOPIC

This also allows multiple --ancestry-path arguments, which can be
used to find commits with any of the given topics in their ancestry
path.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/rev-list-options.txt | 45 ++++++++++++----
 object.h                           |  2 +-
 revision.c                         | 83 ++++++++++++++++++++----------
 revision.h                         |  3 ++
 t/t6019-rev-list-ancestry-path.sh  | 47 ++++++++++++++++-
 5 files changed, 139 insertions(+), 41 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 2f85726745a..001e49cee55 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -389,12 +389,15 @@ Default mode::
 	merges from the resulting history, as there are no selected
 	commits contributing to this merge.
 
---ancestry-path::
+--ancestry-path[=<commit>]::
 	When given a range of commits to display (e.g. 'commit1..commit2'
-	or 'commit2 {caret}commit1'), only display commits that exist
-	directly on the ancestry chain between the 'commit1' and
-	'commit2', i.e. commits that are both descendants of 'commit1',
-	and ancestors of 'commit2'.
+	or 'commit2 {caret}commit1'), only display commits in that
+	range where <commit> is part of the ancestry chain.  By "part of
+	the ancestry chain", we mean including <commit> itself and
+	commits that are either ancestors or descendants of <commit>.
+	If no commit is specified, use 'commit1' (the excluded part of
+	the range) as <commit>.  Can be passed multiple times to look for
+	commits in the ancestry range of multiple commits.
 
 A more detailed explanation follows.
 
@@ -568,11 +571,10 @@ Note the major differences in `N`, `P`, and `Q` over `--full-history`:
 
 There is another simplification mode available:
 
---ancestry-path::
-	Limit the displayed commits to those directly on the ancestry
-	chain between the ``from'' and ``to'' commits in the given commit
-	range. I.e. only display commits that are ancestor of the ``to''
-	commit and descendants of the ``from'' commit.
+--ancestry-path[=<commit>]::
+	Limit the displayed commits to those containing <commit> in their
+	ancestry path.  I.e. only display <commit> and commits which have
+	<commit> as either a direct ancestor or descendant.
 +
 As an example use case, consider the following commit history:
 +
@@ -604,6 +606,29 @@ option does. Applied to the 'D..M' range, it results in:
 			       \
 				L--M
 -----------------------------------------------------------------------
++
+We can also use `--ancestry-path=D` instead of `--ancestry-path` which
+means the same thing when applied to the 'D..M' range but is just more
+explicit.
++
+If we instead are interested in a given topic within this range, and all
+commits affected by that topic, we may only want to view the subset of
+`D..M` which contain that topic in their ancestry path.  So, using
+`--ancestry-path=H D..M` for example would result in:
++
+-----------------------------------------------------------------------
+		E
+		 \
+		  G---H---I---J
+			       \
+				L--M
+-----------------------------------------------------------------------
++
+Whereas `--ancestry-path=K D..M` would result in
++
+-----------------------------------------------------------------------
+		K---------------L--M
+-----------------------------------------------------------------------
 
 Before discussing another option, `--show-pulls`, we need to
 create a new example history.
diff --git a/object.h b/object.h
index a2219464c2b..9caef89f1f0 100644
--- a/object.h
+++ b/object.h
@@ -59,7 +59,7 @@ struct object_array {
 
 /*
  * object flag allocation:
- * revision.h:               0---------10         15             23------26
+ * revision.h:               0---------10         15             23------27
  * fetch-pack.c:             01    67
  * negotiator/default.c:       2--5
  * walker.c:                 0-2
diff --git a/revision.c b/revision.c
index 0c6e26cd9c8..660f1dd1b9b 100644
--- a/revision.c
+++ b/revision.c
@@ -1105,7 +1105,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 			   struct commit_list **list, struct prio_queue *queue)
 {
 	struct commit_list *parent = commit->parents;
-	unsigned left_flag;
+	unsigned left_flag, ancestry_flag;
 
 	if (commit->object.flags & ADDED)
 		return 0;
@@ -1161,6 +1161,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 		return 0;
 
 	left_flag = (commit->object.flags & SYMMETRIC_LEFT);
+	ancestry_flag = (commit->object.flags & ANCESTRY_PATH);
 
 	for (parent = commit->parents; parent; parent = parent->next) {
 		struct commit *p = parent->item;
@@ -1181,6 +1182,8 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 			if (!*slot)
 				*slot = *revision_sources_at(revs->sources, commit);
 		}
+		if (revs->ancestry_path)
+			p->object.flags |= ancestry_flag;
 		p->object.flags |= left_flag;
 		if (!(p->object.flags & SEEN)) {
 			p->object.flags |= (SEEN | NOT_USER_GIVEN);
@@ -1304,13 +1307,20 @@ static int still_interesting(struct commit_list *src, timestamp_t date, int slop
 }
 
 /*
- * "rev-list --ancestry-path A..B" computes commits that are ancestors
+ * "rev-list --ancestry-path=C A..B" computes commits that are ancestors
  * of B but not ancestors of A but further limits the result to those
- * that are descendants of A.  This takes the list of bottom commits and
- * the result of "A..B" without --ancestry-path, and limits the latter
- * further to the ones that can reach one of the commits in "bottom".
+ * that have C in their ancestry path (i.e. are either ancestors of C,
+ * descendants of C, or are C).  If multiple --ancestry-path=$COMMITTISH
+ * arguments are supplied, we limit the result to those that have at
+ * least one of those COMMITTISH in their ancestry path. If
+ * --ancestry-path is specified with no commit, we use all bottom
+ * commits for C.
+ *
+ * Before this function is called, ancestors of C will have already been
+ * marked with ANCESTRY_PATH previously, so we just need to also mark
+ * the descendants here, then collect both sets of commits.
  */
-static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *list)
+static void limit_to_ancestry(struct commit_list *bottoms, struct commit_list *list)
 {
 	struct commit_list *p;
 	struct commit_list *rlist = NULL;
@@ -1323,7 +1333,7 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
 	for (p = list; p; p = p->next)
 		commit_list_insert(p->item, &rlist);
 
-	for (p = bottom; p; p = p->next)
+	for (p = bottoms; p; p = p->next)
 		p->item->object.flags |= TMP_MARK;
 
 	/*
@@ -1356,38 +1366,39 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
 	 */
 
 	/*
-	 * The ones that are not marked with TMP_MARK are uninteresting
+	 * The ones that are not marked with either TMP_MARK or
+	 * ANCESTRY_PATH are uninteresting
 	 */
 	for (p = list; p; p = p->next) {
 		struct commit *c = p->item;
-		if (c->object.flags & TMP_MARK)
+		if (c->object.flags & (TMP_MARK | ANCESTRY_PATH))
 			continue;
 		c->object.flags |= UNINTERESTING;
 	}
 
-	/* We are done with the TMP_MARK */
+	/* We are done with TMP_MARK and ANCESTRY_PATH */
 	for (p = list; p; p = p->next)
-		p->item->object.flags &= ~TMP_MARK;
-	for (p = bottom; p; p = p->next)
-		p->item->object.flags &= ~TMP_MARK;
+		p->item->object.flags &= ~(TMP_MARK | ANCESTRY_PATH);
+	for (p = bottoms; p; p = p->next)
+		p->item->object.flags &= ~(TMP_MARK | ANCESTRY_PATH);
 	free_commit_list(rlist);
 }
 
 /*
- * Before walking the history, keep the set of "negative" refs the
- * caller has asked to exclude.
+ * Before walking the history, add the set of "negative" refs the
+ * caller has asked to exclude to the bottom list.
  *
  * This is used to compute "rev-list --ancestry-path A..B", as we need
  * to filter the result of "A..B" further to the ones that can actually
  * reach A.
  */
-static struct commit_list *collect_bottom_commits(struct commit_list *list)
+static void collect_bottom_commits(struct commit_list *list,
+				   struct commit_list **bottom)
 {
-	struct commit_list *elem, *bottom = NULL;
+	struct commit_list *elem;
 	for (elem = list; elem; elem = elem->next)
 		if (elem->item->object.flags & BOTTOM)
-			commit_list_insert(elem->item, &bottom);
-	return bottom;
+			commit_list_insert(elem->item, bottom);
 }
 
 /* Assumes either left_only or right_only is set */
@@ -1414,12 +1425,12 @@ static int limit_list(struct rev_info *revs)
 	struct commit_list *original_list = revs->commits;
 	struct commit_list *newlist = NULL;
 	struct commit_list **p = &newlist;
-	struct commit_list *bottom = NULL;
 	struct commit *interesting_cache = NULL;
 
-	if (revs->ancestry_path) {
-		bottom = collect_bottom_commits(original_list);
-		if (!bottom)
+	if (revs->ancestry_path_need_bottoms) {
+		collect_bottom_commits(original_list,
+				       &revs->ancestry_path_bottoms);
+		if (!revs->ancestry_path_bottoms)
 			die("--ancestry-path given but there are no bottom commits");
 	}
 
@@ -1464,9 +1475,8 @@ static int limit_list(struct rev_info *revs)
 	if (revs->left_only || revs->right_only)
 		limit_left_right(newlist, revs);
 
-	if (bottom)
-		limit_to_ancestry(bottom, newlist);
-	free_commit_list(bottom);
+	if (revs->ancestry_path)
+		limit_to_ancestry(revs->ancestry_path_bottoms, newlist);
 
 	/*
 	 * Check if any commits have become TREESAME by some of their parents
@@ -2213,7 +2223,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 			       const struct setup_revision_opt* opt)
 {
 	const char *arg = argv[0];
-	const char *optarg;
+	const char *optarg = NULL;
 	int argcount;
 	const unsigned hexsz = the_hash_algo->hexsz;
 
@@ -2280,10 +2290,26 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->first_parent_only = 1;
 	} else if (!strcmp(arg, "--exclude-first-parent-only")) {
 		revs->exclude_first_parent_only = 1;
-	} else if (!strcmp(arg, "--ancestry-path")) {
+	} else if (!strcmp(arg, "--ancestry-path") ||
+		   skip_prefix(arg, "--ancestry-path=", &optarg)) {
 		revs->ancestry_path = 1;
 		revs->simplify_history = 0;
 		revs->limited = 1;
+		if (optarg) {
+			struct commit *c;
+			struct object_id oid;
+			const char *msg = _("could not get commit for ancestry-path argument %s");
+
+			if (repo_get_oid_committish(revs->repo, optarg, &oid))
+				return error(msg, optarg);
+			get_reference(revs, optarg, &oid, ANCESTRY_PATH);
+			c = lookup_commit_reference(revs->repo, &oid);
+			if (!c)
+				return error(msg, optarg);
+			commit_list_insert(c, &revs->ancestry_path_bottoms);
+		} else {
+			revs->ancestry_path_need_bottoms = 1;
+		}
 	} else if (!strcmp(arg, "-g") || !strcmp(arg, "--walk-reflogs")) {
 		init_reflog_walk(&revs->reflog_info);
 	} else if (!strcmp(arg, "--default")) {
@@ -2991,6 +3017,7 @@ static void release_revisions_topo_walk_info(struct topo_walk_info *info);
 void release_revisions(struct rev_info *revs)
 {
 	free_commit_list(revs->commits);
+	free_commit_list(revs->ancestry_path_bottoms);
 	object_array_clear(&revs->pending);
 	object_array_clear(&revs->boundary_commits);
 	release_revisions_cmdline(&revs->cmdline);
diff --git a/revision.h b/revision.h
index e576845cdd1..d86241ad8fc 100644
--- a/revision.h
+++ b/revision.h
@@ -48,6 +48,7 @@
  */
 #define NOT_USER_GIVEN	(1u<<25)
 #define TRACK_LINEAR	(1u<<26)
+#define ANCESTRY_PATH	(1u<<27)
 #define ALL_REV_FLAGS	(((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR | PULL_MERGE)
 
 #define DECORATE_SHORT_REFS	1
@@ -164,6 +165,7 @@ struct rev_info {
 			cherry_mark:1,
 			bisect:1,
 			ancestry_path:1,
+			ancestry_path_need_bottoms:1,
 			first_parent_only:1,
 			exclude_first_parent_only:1,
 			line_level_traverse:1,
@@ -306,6 +308,7 @@ struct rev_info {
 	struct saved_parents *saved_parents_slab;
 
 	struct commit_list *previous_parents;
+	struct commit_list *ancestry_path_bottoms;
 	const char *break_bar;
 
 	struct revision_sources *sources;
diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
index af57a04b7ff..d3657737fa6 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -8,8 +8,13 @@ test_description='--ancestry-path'
 #   /                     \
 #  A-------K---------------L--M
 #
-#  D..M                 == E F G H I J K L M
-#  --ancestry-path D..M == E F H I J L M
+#  D..M                                     == E F G H I J K L M
+#  --ancestry-path                     D..M == E F   H I J   L M
+#  --ancestry-path=F                   D..M == E F       J   L M
+#  --ancestry-path=G                   D..M ==     G H I J   L M
+#  --ancestry-path=H                   D..M == E   G H I J   L M
+#  --ancestry-path=K                   D..M ==             K L M
+#  --ancestry-path=K --ancestry-path=F D..M == E F       J K L M
 #
 #  D..M -- M.t                 == M
 #  --ancestry-path D..M -- M.t == M
@@ -66,6 +71,44 @@ test_expect_success 'rev-list --ancestry-path D..M' '
 	test_cmp expect actual
 '
 
+test_expect_success 'rev-list --ancestry-path=F D..M' '
+	test_write_lines E F J L M >expect &&
+	git rev-list --ancestry-path=F --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+test_expect_success 'rev-list --ancestry-path=G D..M' '
+	test_write_lines G H I J L M >expect &&
+	git rev-list --ancestry-path=G --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+test_expect_success 'rev-list --ancestry-path=H D..M' '
+	test_write_lines E G H I J L M >expect &&
+	git rev-list --ancestry-path=H --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'rev-list --ancestry-path=K D..M' '
+	test_write_lines K L M >expect &&
+	git rev-list --ancestry-path=K --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'rev-list --ancestry-path=F --ancestry-path=K D..M' '
+	test_write_lines E F J K L M >expect &&
+	git rev-list --ancestry-path=F --ancestry-path=K --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'rev-list D..M -- M.t' '
 	echo M >expect &&
 	git rev-list --format=%s D..M -- M.t |
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] revision: allow --ancestry-path to take an argument
  2022-08-17  2:48 ` [PATCH 2/2] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
@ 2022-08-17 22:42   ` Junio C Hamano
  2022-08-18  4:01     ` Elijah Newren
  0 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2022-08-17 22:42 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> We have long allowed users to run e.g.
>     git log --ancestry-path next..seen
> which shows all commits which satisfy all three of these criteria:
>   * are an ancestor of seen
>   * are not an ancestor next
>   * have next as an ancestor

Is it a very good example, though?  Nothing builds on next, and next
is not an ancestor of seen, so the command without --ancestry-path
does give us individual commits that are not in 'next' yet, plus all
the merge commits in master..seen but with --ancestry-path the
answer is most likely an empty set.

If you replace 'next' with 'master', it does start to make sense,
but that is a bit too straight first-parent merge chain that may not
be all that interesting.

> This commit allows another variant:
>     git log --ancestry-path=$TOPIC next..seen
> which shows all commits which satisfy all of these criteria:
>   * are an ancestor of seen
>   * are not an ancestor of next
>   * have $TOPIC in their ancestry-path
> that last bullet can be defined as commits meeting any of these
> criteria:
>     * are an ancestor of $TOPIC
>     * have $TOPIC as an ancestor
>     * are $TOPIC

So, I have en/ancestry-path-in-a-range topic merged somewhere
between 'master' and 'seen'.  Here is what I see:

    $ seen/git log --oneline --ancestry-path=en/ancestry-path-in-a-range master..seen
    21aef6c754 Merge branch 'ab/submodule-helper-leakfix' into seen
    2a57fcc25e Merge branch 'ab/submodule-helper-prep' into seen
    72ff5f5d3a ###
    edb5cf4c31 Merge branch 'cw/remote-object-info' into seen
    cc8f65a665 Merge branch 'ag/merge-strategies-in-c' into seen
    c1bacacabf Merge branch 'es/mark-gc-cruft-as-experimental' into seen
    fdf2d207d2 Merge branch 'js/bisect-in-c' into seen
    2a1bbfc016 Merge branch 'po/glossary-around-traversal' into seen
    7ecf004b9e Merge branch 'vd/scalar-enables-fsmonitor' into jch
    9dba189986 Merge branch 'en/ancestry-path-in-a-range' into jch
    4461e34d7d revision: allow --ancestry-path to take an argument
    0605b4aad9 rev-list-options.txt: fix simple typo

which is very much expected.  Two commits are what we want to
highlight, and its merge into the first-parent-chain that leads to
'seen', and all its descendants on 'seen' are shown.

Due to the way "ancestry-path" is defined, replacing the value to
"--ancestry-path" from 4461e34d7d to 0605b4aad9 would not change the
output, which is also expected.  If this were a three-commit topic,
giving the middle commit would find both the first one (i.e. the
ancestor) and the third one (i.e. the descendant) in the topic, while
excluding the much-less-interesting base commit and its ancestors
the topic builds on. 

I am not exactly sure when this feature is useful, though.  It is
handy to be able to enumerate descendants of a given commit, so
perhaps the user knows about 0605b4aad9 and is trying to find other
commits on the same topic, or something?  But then the merges nearer
the tip of 'seen' than 9dba189986 are not very useful for that
purpose.  It somehow feels like a solution in search of a problem.

> diff --git a/revision.c b/revision.c
> index 0c6e26cd9c8..660f1dd1b9b 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -1105,7 +1105,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
>  			   struct commit_list **list, struct prio_queue *queue)
>  {
>  	struct commit_list *parent = commit->parents;
> -	unsigned left_flag;
> +	unsigned left_flag, ancestry_flag;
>  
>  	if (commit->object.flags & ADDED)
>  		return 0;
> @@ -1161,6 +1161,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
>  		return 0;
>  
>  	left_flag = (commit->object.flags & SYMMETRIC_LEFT);
> +	ancestry_flag = (commit->object.flags & ANCESTRY_PATH);

Wouldn't we want

	if (revs->ancestry_path)
		ancestry_flag = (commit->object.flags & ANCESTRY_PATH);

instead, so that the propagation of contaminated flag bits ...

>  	for (parent = commit->parents; parent; parent = parent->next) {
>  		struct commit *p = parent->item;
> @@ -1181,6 +1182,8 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
>  			if (!*slot)
>  				*slot = *revision_sources_at(revs->sources, commit);
>  		}
> +		if (revs->ancestry_path)
> +			p->object.flags |= ancestry_flag;
>  		p->object.flags |= left_flag;

... can become a simple

		p->object.flags |= ancestry_flag;

here?  Or even just use a single variable to compute the set of
flags to pass down i.e.

	pass_flags = commit->object.flags & (SYMMETRIC_LEFT | ANCESTRY_PATH);

before the loop, and then pass these two bits down at once, i.e.

-  		p->object.flags |= left_flag;
+  		p->object.flags |= pass_flags;

taking advantage of the fact that ANCESTRY_PATH and SYMMETRIC_LEFT
bits can be set to any object only when these features are in use?

Or did I misread the patch and sometimes ANCESTRY_PATH bit is set on
objects even when revs->ancestry_path is not in use?

> +static void limit_to_ancestry(struct commit_list *bottoms, struct commit_list *list)
>  {
>  	struct commit_list *p;
>  	struct commit_list *rlist = NULL;
> @@ -1323,7 +1333,7 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
>  	for (p = list; p; p = p->next)
>  		commit_list_insert(p->item, &rlist);
>  
> -	for (p = bottom; p; p = p->next)
> +	for (p = bottoms; p; p = p->next)
>  		p->item->object.flags |= TMP_MARK;
>  
>  	/*
> @@ -1356,38 +1366,39 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
>  	 */
>  
>  	/*
> -	 * The ones that are not marked with TMP_MARK are uninteresting
> +	 * The ones that are not marked with either TMP_MARK or
> +	 * ANCESTRY_PATH are uninteresting
>  	 */
>  	for (p = list; p; p = p->next) {
>  		struct commit *c = p->item;
> -		if (c->object.flags & TMP_MARK)
> +		if (c->object.flags & (TMP_MARK | ANCESTRY_PATH))
>  			continue;
>  		c->object.flags |= UNINTERESTING;
>  	}
>  
> -	/* We are done with the TMP_MARK */
> +	/* We are done with TMP_MARK and ANCESTRY_PATH */
>  	for (p = list; p; p = p->next)
> -		p->item->object.flags &= ~TMP_MARK;
> -	for (p = bottom; p; p = p->next)
> -		p->item->object.flags &= ~TMP_MARK;
> +		p->item->object.flags &= ~(TMP_MARK | ANCESTRY_PATH);
> +	for (p = bottoms; p; p = p->next)
> +		p->item->object.flags &= ~(TMP_MARK | ANCESTRY_PATH);
>  	free_commit_list(rlist);
>  }

We have called process_parents() to paint ancestor commits that can
be reached from the commit(s) of interest.  This helper function is
called after that is done, and propagates the ancestry_path bit in
the reverse direction, i.e. from parent to child.

Once we are done with this processing, we no longer need
ANCESTRY_PATH bit because the surviving ones without UNINTERESTING
bit set are the commits on the ancestry_path.  OK.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] revision: allow --ancestry-path to take an argument
  2022-08-17 22:42   ` Junio C Hamano
@ 2022-08-18  4:01     ` Elijah Newren
  0 siblings, 0 replies; 29+ messages in thread
From: Elijah Newren @ 2022-08-18  4:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Elijah Newren via GitGitGadget, Git Mailing List

On Wed, Aug 17, 2022 at 3:42 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > We have long allowed users to run e.g.
> >     git log --ancestry-path next..seen
> > which shows all commits which satisfy all three of these criteria:
> >   * are an ancestor of seen
> >   * are not an ancestor next
> >   * have next as an ancestor
>
> Is it a very good example, though?  Nothing builds on next, and next
> is not an ancestor of seen, so the command without --ancestry-path
> does give us individual commits that are not in 'next' yet, plus all
> the merge commits in master..seen but with --ancestry-path the
> answer is most likely an empty set.
>
> If you replace 'next' with 'master', it does start to make sense,
> but that is a bit too straight first-parent merge chain that may not
> be all that interesting.

Yeah, I should have written master..seen.

> > This commit allows another variant:
> >     git log --ancestry-path=$TOPIC next..seen
> > which shows all commits which satisfy all of these criteria:
> >   * are an ancestor of seen
> >   * are not an ancestor of next
> >   * have $TOPIC in their ancestry-path
> > that last bullet can be defined as commits meeting any of these
> > criteria:
> >     * are an ancestor of $TOPIC
> >     * have $TOPIC as an ancestor
> >     * are $TOPIC
>
> So, I have en/ancestry-path-in-a-range topic merged somewhere
> between 'master' and 'seen'.  Here is what I see:
>
>     $ seen/git log --oneline --ancestry-path=en/ancestry-path-in-a-range master..seen
>     21aef6c754 Merge branch 'ab/submodule-helper-leakfix' into seen
>     2a57fcc25e Merge branch 'ab/submodule-helper-prep' into seen
>     72ff5f5d3a ###
>     edb5cf4c31 Merge branch 'cw/remote-object-info' into seen
>     cc8f65a665 Merge branch 'ag/merge-strategies-in-c' into seen
>     c1bacacabf Merge branch 'es/mark-gc-cruft-as-experimental' into seen
>     fdf2d207d2 Merge branch 'js/bisect-in-c' into seen
>     2a1bbfc016 Merge branch 'po/glossary-around-traversal' into seen
>     7ecf004b9e Merge branch 'vd/scalar-enables-fsmonitor' into jch
>     9dba189986 Merge branch 'en/ancestry-path-in-a-range' into jch
>     4461e34d7d revision: allow --ancestry-path to take an argument
>     0605b4aad9 rev-list-options.txt: fix simple typo
>
> which is very much expected.  Two commits are what we want to
> highlight, and its merge into the first-parent-chain that leads to
> 'seen', and all its descendants on 'seen' are shown.
>
> Due to the way "ancestry-path" is defined, replacing the value to
> "--ancestry-path" from 4461e34d7d to 0605b4aad9 would not change the
> output, which is also expected.  If this were a three-commit topic,
> giving the middle commit would find both the first one (i.e. the
> ancestor) and the third one (i.e. the descendant) in the topic, while
> excluding the much-less-interesting base commit and its ancestors
> the topic builds on.

Yes.

> I am not exactly sure when this feature is useful, though.  It is
> handy to be able to enumerate descendants of a given commit, so
> perhaps the user knows about 0605b4aad9 and is trying to find other
> commits on the same topic, or something?  But then the merges nearer
> the tip of 'seen' than 9dba189986 are not very useful for that
> purpose.  It somehow feels like a solution in search of a problem.

Here's my usecase:
    git replay --i --keep-base --contained --ancestry-path=$TOPIC master..seen

Explanation of usecase: I want to be able to replay commits, much like
the current interactive rebase feature.  Unlike rebase, the selection
of commits is done via a standard <revision range>, for flexibility.
Now, replaying of commits in $TOPIC implies I probably want all the
topics that depend on it to be updated, and all the merges that depend
on any of those to be updated.  But, to achieve that, I don't want to
have to manually run N rebases and manually remerge things and whatnot
-- I want it all done in one command.  (And yes, replaying of merges
needs to handle squashed-in semantic fixes and replay those.)  So, I
want this command to replay all commits in the ancestry path of $TOPIC
in the range master..seen, keeping the current base (--keep-base), and
I want it to also update any --contained branches (meaning any
branches pointing to commits in the range being replayed).

Alternatives: For this usecase,

  * I cannot just use $TOPIC_TO_EDIT..seen, because that excludes the
    commits I want to tweak.

  * I could technically use `--ancestry-path master..seen`, but it's
    almost uselessly suboptimal as it would put hundreds of irrelevant
    commits in the TODO list (making it hard for the user to find what
    they want to edit) and then wastes time replaying all those commits
    unnecessarily as well.

  * As far as I can tell, there are no good alternatives here; and my
    question about it on the list didn't spur any satisfying answers[1].

If you'd rather I waited to submit the patch as part of a much larger
series implementing git-replay, I can do that.  I just thought it was
useful to send upstream independently.  But then I ran into the problem
that I thought it was weird to describe a usecase depending on something
that isn't yet very usable, and tried to explain it without that.

[1] https://lore.kernel.org/git/CABPp-BEqWX3Nr2HDxwS9d-_QjcKb_jS=fSjsP_Pbutw7-P5gbg@mail.gmail.com/

> > diff --git a/revision.c b/revision.c
> > index 0c6e26cd9c8..660f1dd1b9b 100644
> > --- a/revision.c
> > +++ b/revision.c
> > @@ -1105,7 +1105,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
> >                          struct commit_list **list, struct prio_queue *queue)
> >  {
> >       struct commit_list *parent = commit->parents;
> > -     unsigned left_flag;
> > +     unsigned left_flag, ancestry_flag;
> >
> >       if (commit->object.flags & ADDED)
> >               return 0;
> > @@ -1161,6 +1161,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
> >               return 0;
> >
> >       left_flag = (commit->object.flags & SYMMETRIC_LEFT);
> > +     ancestry_flag = (commit->object.flags & ANCESTRY_PATH);
>
> Wouldn't we want
>
>         if (revs->ancestry_path)
>                 ancestry_flag = (commit->object.flags & ANCESTRY_PATH);
>
> instead, so that the propagation of contaminated flag bits ...
>
> >       for (parent = commit->parents; parent; parent = parent->next) {
> >               struct commit *p = parent->item;
> > @@ -1181,6 +1182,8 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
> >                       if (!*slot)
> >                               *slot = *revision_sources_at(revs->sources, commit);
> >               }
> > +             if (revs->ancestry_path)
> > +                     p->object.flags |= ancestry_flag;
> >               p->object.flags |= left_flag;
>
> ... can become a simple
>
>                 p->object.flags |= ancestry_flag;
>
> here?  Or even just use a single variable to compute the set of
> flags to pass down i.e.
>
>         pass_flags = commit->object.flags & (SYMMETRIC_LEFT | ANCESTRY_PATH);
>
> before the loop, and then pass these two bits down at once, i.e.
>
> -               p->object.flags |= left_flag;
> +               p->object.flags |= pass_flags;
>
> taking advantage of the fact that ANCESTRY_PATH and SYMMETRIC_LEFT
> bits can be set to any object only when these features are in use?

Ooh, that does seem better.  I'll make that change.

> Or did I misread the patch and sometimes ANCESTRY_PATH bit is set on
> objects even when revs->ancestry_path is not in use?
>
> > +static void limit_to_ancestry(struct commit_list *bottoms, struct commit_list *list)
> >  {
> >       struct commit_list *p;
> >       struct commit_list *rlist = NULL;
> > @@ -1323,7 +1333,7 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
> >       for (p = list; p; p = p->next)
> >               commit_list_insert(p->item, &rlist);
> >
> > -     for (p = bottom; p; p = p->next)
> > +     for (p = bottoms; p; p = p->next)
> >               p->item->object.flags |= TMP_MARK;
> >
> >       /*
> > @@ -1356,38 +1366,39 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
> >        */
> >
> >       /*
> > -      * The ones that are not marked with TMP_MARK are uninteresting
> > +      * The ones that are not marked with either TMP_MARK or
> > +      * ANCESTRY_PATH are uninteresting
> >        */
> >       for (p = list; p; p = p->next) {
> >               struct commit *c = p->item;
> > -             if (c->object.flags & TMP_MARK)
> > +             if (c->object.flags & (TMP_MARK | ANCESTRY_PATH))
> >                       continue;
> >               c->object.flags |= UNINTERESTING;
> >       }
> >
> > -     /* We are done with the TMP_MARK */
> > +     /* We are done with TMP_MARK and ANCESTRY_PATH */
> >       for (p = list; p; p = p->next)
> > -             p->item->object.flags &= ~TMP_MARK;
> > -     for (p = bottom; p; p = p->next)
> > -             p->item->object.flags &= ~TMP_MARK;
> > +             p->item->object.flags &= ~(TMP_MARK | ANCESTRY_PATH);
> > +     for (p = bottoms; p; p = p->next)
> > +             p->item->object.flags &= ~(TMP_MARK | ANCESTRY_PATH);
> >       free_commit_list(rlist);
> >  }
>
> We have called process_parents() to paint ancestor commits that can
> be reached from the commit(s) of interest.  This helper function is
> called after that is done, and propagates the ancestry_path bit in
> the reverse direction, i.e. from parent to child.
>
> Once we are done with this processing, we no longer need
> ANCESTRY_PATH bit because the surviving ones without UNINTERESTING
> bit set are the commits on the ancestry_path.  OK.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 0/2] Allow --ancestry-path to take an argument
  2022-08-17  2:48 [PATCH 0/2] Allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
  2022-08-17  2:48 ` [PATCH 1/2] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
  2022-08-17  2:48 ` [PATCH 2/2] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
@ 2022-08-18  6:17 ` Elijah Newren via GitGitGadget
  2022-08-18  6:17   ` [PATCH v2 1/2] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
                     ` (3 more replies)
  2 siblings, 4 replies; 29+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-08-18  6:17 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

Changes since v1:

 * Tweaked the commit message, and incorporated Junio's suggestion to update
   left_flag and ancestry_flag together.

Series description:

This came out of a previous thread[1], where I wanted to be able to run
something like

git log --oneline --ancestry-path=ab/submodule-cleanup main..seen


and see the commits in main..seen which contained ab/submodule-cleanup in
their ancestry path. Let me start by defining the terminology "X is in a
commit's ancestry path". By that, I just mean that either the commit is X,
the commit is an ancestor of X, or the commit is a descendant of X. With
that definition...

The command

git log --ancestry-path A..B


means find the commits in A..B which contain A in their ancestry path. I
sometimes still want to use A..B to get the basic range, but would like to
use a commit other than A for specifying which ancestry path is of interest.
So, for example, I might want to use

git log --ancestry-path=C A..B


to mean find the commits in A..B which contain C in their ancestry path, or
use

git log --ancestry-path=C --ancestry-path=D A..B


to mean find the commits in A..B which contain either C or D in their
ancestry path.

This series implements this request, by allowing --ancestry-path to take an
optional argument. With it, I can find the answer to my question in the
thread at [1] within the git.git repository (replacing branch names with
actual hashes since the branches have since moved on):

$ git log --oneline --ancestry-path=5b893f7d81 8168d5e9c2..ac0248bfba | wc -l
36


This returns the answer I want, whereas dropping the '=5b893f7d81' from the
command line gives me 192 unwanted commits (228 total), and various other
command line flags (--first-parent, --boundary, etc.) also fail to give me
the set of commits I am looking for.

[1]
https://lore.kernel.org/git/CABPp-BF+8aqysioP_e27Q9kJ02rE2SuSqXu+XphzKWnk5a_Q+A@mail.gmail.com/

Elijah Newren (2):
  rev-list-options.txt: fix simple typo
  revision: allow --ancestry-path to take an argument

 Documentation/rev-list-options.txt | 47 +++++++++++++----
 object.h                           |  2 +-
 revision.c                         | 84 +++++++++++++++++++-----------
 revision.h                         |  3 ++
 t/t6019-rev-list-ancestry-path.sh  | 47 ++++++++++++++++-
 5 files changed, 139 insertions(+), 44 deletions(-)


base-commit: 6a475b71f8c4ce708d69fdc9317aefbde3769e25
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1303%2Fnewren%2Fancestry-path-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1303/newren/ancestry-path-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1303

Range-diff vs v1:

 1:  68ab719d99c = 1:  68ab719d99c rev-list-options.txt: fix simple typo
 2:  99287b67fd1 ! 2:  f580ec6d060 revision: allow --ancestry-path to take an argument
     @@ Commit message
          revision: allow --ancestry-path to take an argument
      
          We have long allowed users to run e.g.
     -        git log --ancestry-path next..seen
     +        git log --ancestry-path master..seen
          which shows all commits which satisfy all three of these criteria:
            * are an ancestor of seen
     -      * are not an ancestor next
     -      * have next as an ancestor
     +      * are not an ancestor master
     +      * have master as an ancestor
      
          This commit allows another variant:
     -        git log --ancestry-path=$TOPIC next..seen
     +        git log --ancestry-path=$TOPIC master..seen
          which shows all commits which satisfy all of these criteria:
            * are an ancestor of seen
     -      * are not an ancestor of next
     +      * are not an ancestor of master
            * have $TOPIC in their ancestry-path
          that last bullet can be defined as commits meeting any of these
          criteria:
     @@ revision.c: static int process_parents(struct rev_info *revs, struct commit *com
       {
       	struct commit_list *parent = commit->parents;
      -	unsigned left_flag;
     -+	unsigned left_flag, ancestry_flag;
     ++	unsigned pass_flags;
       
       	if (commit->object.flags & ADDED)
       		return 0;
      @@ revision.c: static int process_parents(struct rev_info *revs, struct commit *commit,
     + 	if (revs->no_walk)
       		return 0;
       
     - 	left_flag = (commit->object.flags & SYMMETRIC_LEFT);
     -+	ancestry_flag = (commit->object.flags & ANCESTRY_PATH);
     +-	left_flag = (commit->object.flags & SYMMETRIC_LEFT);
     ++	pass_flags = (commit->object.flags & (SYMMETRIC_LEFT | ANCESTRY_PATH));
       
       	for (parent = commit->parents; parent; parent = parent->next) {
       		struct commit *p = parent->item;
     @@ revision.c: static int process_parents(struct rev_info *revs, struct commit *com
       			if (!*slot)
       				*slot = *revision_sources_at(revs->sources, commit);
       		}
     -+		if (revs->ancestry_path)
     -+			p->object.flags |= ancestry_flag;
     - 		p->object.flags |= left_flag;
     +-		p->object.flags |= left_flag;
     ++		p->object.flags |= pass_flags;
       		if (!(p->object.flags & SEEN)) {
       			p->object.flags |= (SEEN | NOT_USER_GIVEN);
     + 			if (list)
      @@ revision.c: static int still_interesting(struct commit_list *src, timestamp_t date, int slop
       }
       

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 1/2] rev-list-options.txt: fix simple typo
  2022-08-18  6:17 ` [PATCH v2 0/2] Allow " Elijah Newren via GitGitGadget
@ 2022-08-18  6:17   ` Elijah Newren via GitGitGadget
  2022-08-18  6:17   ` [PATCH v2 2/2] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-08-18  6:17 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/rev-list-options.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 195e74eec63..2f85726745a 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -659,7 +659,7 @@ Here, the merge commits `O` and `P` contribute extra noise, as they did
 not actually contribute a change to `file.txt`. They only merged a topic
 that was based on an older version of `file.txt`. This is a common
 issue in repositories using a workflow where many contributors work in
-parallel and merge their topic branches along a single trunk: manu
+parallel and merge their topic branches along a single trunk: many
 unrelated merges appear in the `--full-history` results.
 
 When using the `--simplify-merges` option, the commits `O` and `P`
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 2/2] revision: allow --ancestry-path to take an argument
  2022-08-18  6:17 ` [PATCH v2 0/2] Allow " Elijah Newren via GitGitGadget
  2022-08-18  6:17   ` [PATCH v2 1/2] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
@ 2022-08-18  6:17   ` Elijah Newren via GitGitGadget
  2022-08-18 15:30     ` Derrick Stolee
  2022-08-18 22:24     ` Jonathan Tan
  2022-08-18 16:32   ` [PATCH v2 0/2] Allow " Junio C Hamano
  2022-08-19  4:28   ` [PATCH v3 0/3] " Elijah Newren via GitGitGadget
  3 siblings, 2 replies; 29+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-08-18  6:17 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

We have long allowed users to run e.g.
    git log --ancestry-path master..seen
which shows all commits which satisfy all three of these criteria:
  * are an ancestor of seen
  * are not an ancestor master
  * have master as an ancestor

This commit allows another variant:
    git log --ancestry-path=$TOPIC master..seen
which shows all commits which satisfy all of these criteria:
  * are an ancestor of seen
  * are not an ancestor of master
  * have $TOPIC in their ancestry-path
that last bullet can be defined as commits meeting any of these
criteria:
    * are an ancestor of $TOPIC
    * have $TOPIC as an ancestor
    * are $TOPIC

This also allows multiple --ancestry-path arguments, which can be
used to find commits with any of the given topics in their ancestry
path.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/rev-list-options.txt | 45 ++++++++++++----
 object.h                           |  2 +-
 revision.c                         | 84 +++++++++++++++++++-----------
 revision.h                         |  3 ++
 t/t6019-rev-list-ancestry-path.sh  | 47 ++++++++++++++++-
 5 files changed, 138 insertions(+), 43 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 2f85726745a..001e49cee55 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -389,12 +389,15 @@ Default mode::
 	merges from the resulting history, as there are no selected
 	commits contributing to this merge.
 
---ancestry-path::
+--ancestry-path[=<commit>]::
 	When given a range of commits to display (e.g. 'commit1..commit2'
-	or 'commit2 {caret}commit1'), only display commits that exist
-	directly on the ancestry chain between the 'commit1' and
-	'commit2', i.e. commits that are both descendants of 'commit1',
-	and ancestors of 'commit2'.
+	or 'commit2 {caret}commit1'), only display commits in that
+	range where <commit> is part of the ancestry chain.  By "part of
+	the ancestry chain", we mean including <commit> itself and
+	commits that are either ancestors or descendants of <commit>.
+	If no commit is specified, use 'commit1' (the excluded part of
+	the range) as <commit>.  Can be passed multiple times to look for
+	commits in the ancestry range of multiple commits.
 
 A more detailed explanation follows.
 
@@ -568,11 +571,10 @@ Note the major differences in `N`, `P`, and `Q` over `--full-history`:
 
 There is another simplification mode available:
 
---ancestry-path::
-	Limit the displayed commits to those directly on the ancestry
-	chain between the ``from'' and ``to'' commits in the given commit
-	range. I.e. only display commits that are ancestor of the ``to''
-	commit and descendants of the ``from'' commit.
+--ancestry-path[=<commit>]::
+	Limit the displayed commits to those containing <commit> in their
+	ancestry path.  I.e. only display <commit> and commits which have
+	<commit> as either a direct ancestor or descendant.
 +
 As an example use case, consider the following commit history:
 +
@@ -604,6 +606,29 @@ option does. Applied to the 'D..M' range, it results in:
 			       \
 				L--M
 -----------------------------------------------------------------------
++
+We can also use `--ancestry-path=D` instead of `--ancestry-path` which
+means the same thing when applied to the 'D..M' range but is just more
+explicit.
++
+If we instead are interested in a given topic within this range, and all
+commits affected by that topic, we may only want to view the subset of
+`D..M` which contain that topic in their ancestry path.  So, using
+`--ancestry-path=H D..M` for example would result in:
++
+-----------------------------------------------------------------------
+		E
+		 \
+		  G---H---I---J
+			       \
+				L--M
+-----------------------------------------------------------------------
++
+Whereas `--ancestry-path=K D..M` would result in
++
+-----------------------------------------------------------------------
+		K---------------L--M
+-----------------------------------------------------------------------
 
 Before discussing another option, `--show-pulls`, we need to
 create a new example history.
diff --git a/object.h b/object.h
index a2219464c2b..9caef89f1f0 100644
--- a/object.h
+++ b/object.h
@@ -59,7 +59,7 @@ struct object_array {
 
 /*
  * object flag allocation:
- * revision.h:               0---------10         15             23------26
+ * revision.h:               0---------10         15             23------27
  * fetch-pack.c:             01    67
  * negotiator/default.c:       2--5
  * walker.c:                 0-2
diff --git a/revision.c b/revision.c
index 0c6e26cd9c8..4ebec71677c 100644
--- a/revision.c
+++ b/revision.c
@@ -1105,7 +1105,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 			   struct commit_list **list, struct prio_queue *queue)
 {
 	struct commit_list *parent = commit->parents;
-	unsigned left_flag;
+	unsigned pass_flags;
 
 	if (commit->object.flags & ADDED)
 		return 0;
@@ -1160,7 +1160,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 	if (revs->no_walk)
 		return 0;
 
-	left_flag = (commit->object.flags & SYMMETRIC_LEFT);
+	pass_flags = (commit->object.flags & (SYMMETRIC_LEFT | ANCESTRY_PATH));
 
 	for (parent = commit->parents; parent; parent = parent->next) {
 		struct commit *p = parent->item;
@@ -1181,7 +1181,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 			if (!*slot)
 				*slot = *revision_sources_at(revs->sources, commit);
 		}
-		p->object.flags |= left_flag;
+		p->object.flags |= pass_flags;
 		if (!(p->object.flags & SEEN)) {
 			p->object.flags |= (SEEN | NOT_USER_GIVEN);
 			if (list)
@@ -1304,13 +1304,20 @@ static int still_interesting(struct commit_list *src, timestamp_t date, int slop
 }
 
 /*
- * "rev-list --ancestry-path A..B" computes commits that are ancestors
+ * "rev-list --ancestry-path=C A..B" computes commits that are ancestors
  * of B but not ancestors of A but further limits the result to those
- * that are descendants of A.  This takes the list of bottom commits and
- * the result of "A..B" without --ancestry-path, and limits the latter
- * further to the ones that can reach one of the commits in "bottom".
+ * that have C in their ancestry path (i.e. are either ancestors of C,
+ * descendants of C, or are C).  If multiple --ancestry-path=$COMMITTISH
+ * arguments are supplied, we limit the result to those that have at
+ * least one of those COMMITTISH in their ancestry path. If
+ * --ancestry-path is specified with no commit, we use all bottom
+ * commits for C.
+ *
+ * Before this function is called, ancestors of C will have already been
+ * marked with ANCESTRY_PATH previously, so we just need to also mark
+ * the descendants here, then collect both sets of commits.
  */
-static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *list)
+static void limit_to_ancestry(struct commit_list *bottoms, struct commit_list *list)
 {
 	struct commit_list *p;
 	struct commit_list *rlist = NULL;
@@ -1323,7 +1330,7 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
 	for (p = list; p; p = p->next)
 		commit_list_insert(p->item, &rlist);
 
-	for (p = bottom; p; p = p->next)
+	for (p = bottoms; p; p = p->next)
 		p->item->object.flags |= TMP_MARK;
 
 	/*
@@ -1356,38 +1363,39 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
 	 */
 
 	/*
-	 * The ones that are not marked with TMP_MARK are uninteresting
+	 * The ones that are not marked with either TMP_MARK or
+	 * ANCESTRY_PATH are uninteresting
 	 */
 	for (p = list; p; p = p->next) {
 		struct commit *c = p->item;
-		if (c->object.flags & TMP_MARK)
+		if (c->object.flags & (TMP_MARK | ANCESTRY_PATH))
 			continue;
 		c->object.flags |= UNINTERESTING;
 	}
 
-	/* We are done with the TMP_MARK */
+	/* We are done with TMP_MARK and ANCESTRY_PATH */
 	for (p = list; p; p = p->next)
-		p->item->object.flags &= ~TMP_MARK;
-	for (p = bottom; p; p = p->next)
-		p->item->object.flags &= ~TMP_MARK;
+		p->item->object.flags &= ~(TMP_MARK | ANCESTRY_PATH);
+	for (p = bottoms; p; p = p->next)
+		p->item->object.flags &= ~(TMP_MARK | ANCESTRY_PATH);
 	free_commit_list(rlist);
 }
 
 /*
- * Before walking the history, keep the set of "negative" refs the
- * caller has asked to exclude.
+ * Before walking the history, add the set of "negative" refs the
+ * caller has asked to exclude to the bottom list.
  *
  * This is used to compute "rev-list --ancestry-path A..B", as we need
  * to filter the result of "A..B" further to the ones that can actually
  * reach A.
  */
-static struct commit_list *collect_bottom_commits(struct commit_list *list)
+static void collect_bottom_commits(struct commit_list *list,
+				   struct commit_list **bottom)
 {
-	struct commit_list *elem, *bottom = NULL;
+	struct commit_list *elem;
 	for (elem = list; elem; elem = elem->next)
 		if (elem->item->object.flags & BOTTOM)
-			commit_list_insert(elem->item, &bottom);
-	return bottom;
+			commit_list_insert(elem->item, bottom);
 }
 
 /* Assumes either left_only or right_only is set */
@@ -1414,12 +1422,12 @@ static int limit_list(struct rev_info *revs)
 	struct commit_list *original_list = revs->commits;
 	struct commit_list *newlist = NULL;
 	struct commit_list **p = &newlist;
-	struct commit_list *bottom = NULL;
 	struct commit *interesting_cache = NULL;
 
-	if (revs->ancestry_path) {
-		bottom = collect_bottom_commits(original_list);
-		if (!bottom)
+	if (revs->ancestry_path_need_bottoms) {
+		collect_bottom_commits(original_list,
+				       &revs->ancestry_path_bottoms);
+		if (!revs->ancestry_path_bottoms)
 			die("--ancestry-path given but there are no bottom commits");
 	}
 
@@ -1464,9 +1472,8 @@ static int limit_list(struct rev_info *revs)
 	if (revs->left_only || revs->right_only)
 		limit_left_right(newlist, revs);
 
-	if (bottom)
-		limit_to_ancestry(bottom, newlist);
-	free_commit_list(bottom);
+	if (revs->ancestry_path)
+		limit_to_ancestry(revs->ancestry_path_bottoms, newlist);
 
 	/*
 	 * Check if any commits have become TREESAME by some of their parents
@@ -2213,7 +2220,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 			       const struct setup_revision_opt* opt)
 {
 	const char *arg = argv[0];
-	const char *optarg;
+	const char *optarg = NULL;
 	int argcount;
 	const unsigned hexsz = the_hash_algo->hexsz;
 
@@ -2280,10 +2287,26 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->first_parent_only = 1;
 	} else if (!strcmp(arg, "--exclude-first-parent-only")) {
 		revs->exclude_first_parent_only = 1;
-	} else if (!strcmp(arg, "--ancestry-path")) {
+	} else if (!strcmp(arg, "--ancestry-path") ||
+		   skip_prefix(arg, "--ancestry-path=", &optarg)) {
 		revs->ancestry_path = 1;
 		revs->simplify_history = 0;
 		revs->limited = 1;
+		if (optarg) {
+			struct commit *c;
+			struct object_id oid;
+			const char *msg = _("could not get commit for ancestry-path argument %s");
+
+			if (repo_get_oid_committish(revs->repo, optarg, &oid))
+				return error(msg, optarg);
+			get_reference(revs, optarg, &oid, ANCESTRY_PATH);
+			c = lookup_commit_reference(revs->repo, &oid);
+			if (!c)
+				return error(msg, optarg);
+			commit_list_insert(c, &revs->ancestry_path_bottoms);
+		} else {
+			revs->ancestry_path_need_bottoms = 1;
+		}
 	} else if (!strcmp(arg, "-g") || !strcmp(arg, "--walk-reflogs")) {
 		init_reflog_walk(&revs->reflog_info);
 	} else if (!strcmp(arg, "--default")) {
@@ -2991,6 +3014,7 @@ static void release_revisions_topo_walk_info(struct topo_walk_info *info);
 void release_revisions(struct rev_info *revs)
 {
 	free_commit_list(revs->commits);
+	free_commit_list(revs->ancestry_path_bottoms);
 	object_array_clear(&revs->pending);
 	object_array_clear(&revs->boundary_commits);
 	release_revisions_cmdline(&revs->cmdline);
diff --git a/revision.h b/revision.h
index e576845cdd1..d86241ad8fc 100644
--- a/revision.h
+++ b/revision.h
@@ -48,6 +48,7 @@
  */
 #define NOT_USER_GIVEN	(1u<<25)
 #define TRACK_LINEAR	(1u<<26)
+#define ANCESTRY_PATH	(1u<<27)
 #define ALL_REV_FLAGS	(((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR | PULL_MERGE)
 
 #define DECORATE_SHORT_REFS	1
@@ -164,6 +165,7 @@ struct rev_info {
 			cherry_mark:1,
 			bisect:1,
 			ancestry_path:1,
+			ancestry_path_need_bottoms:1,
 			first_parent_only:1,
 			exclude_first_parent_only:1,
 			line_level_traverse:1,
@@ -306,6 +308,7 @@ struct rev_info {
 	struct saved_parents *saved_parents_slab;
 
 	struct commit_list *previous_parents;
+	struct commit_list *ancestry_path_bottoms;
 	const char *break_bar;
 
 	struct revision_sources *sources;
diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
index af57a04b7ff..d3657737fa6 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -8,8 +8,13 @@ test_description='--ancestry-path'
 #   /                     \
 #  A-------K---------------L--M
 #
-#  D..M                 == E F G H I J K L M
-#  --ancestry-path D..M == E F H I J L M
+#  D..M                                     == E F G H I J K L M
+#  --ancestry-path                     D..M == E F   H I J   L M
+#  --ancestry-path=F                   D..M == E F       J   L M
+#  --ancestry-path=G                   D..M ==     G H I J   L M
+#  --ancestry-path=H                   D..M == E   G H I J   L M
+#  --ancestry-path=K                   D..M ==             K L M
+#  --ancestry-path=K --ancestry-path=F D..M == E F       J K L M
 #
 #  D..M -- M.t                 == M
 #  --ancestry-path D..M -- M.t == M
@@ -66,6 +71,44 @@ test_expect_success 'rev-list --ancestry-path D..M' '
 	test_cmp expect actual
 '
 
+test_expect_success 'rev-list --ancestry-path=F D..M' '
+	test_write_lines E F J L M >expect &&
+	git rev-list --ancestry-path=F --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+test_expect_success 'rev-list --ancestry-path=G D..M' '
+	test_write_lines G H I J L M >expect &&
+	git rev-list --ancestry-path=G --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+test_expect_success 'rev-list --ancestry-path=H D..M' '
+	test_write_lines E G H I J L M >expect &&
+	git rev-list --ancestry-path=H --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'rev-list --ancestry-path=K D..M' '
+	test_write_lines K L M >expect &&
+	git rev-list --ancestry-path=K --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'rev-list --ancestry-path=F --ancestry-path=K D..M' '
+	test_write_lines E F J K L M >expect &&
+	git rev-list --ancestry-path=F --ancestry-path=K --format=%s D..M |
+	sed -e "/^commit /d" |
+	sort >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'rev-list D..M -- M.t' '
 	echo M >expect &&
 	git rev-list --format=%s D..M -- M.t |
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] revision: allow --ancestry-path to take an argument
  2022-08-18  6:17   ` [PATCH v2 2/2] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
@ 2022-08-18 15:30     ` Derrick Stolee
  2022-08-18 15:50       ` Ævar Arnfjörð Bjarmason
                         ` (2 more replies)
  2022-08-18 22:24     ` Jonathan Tan
  1 sibling, 3 replies; 29+ messages in thread
From: Derrick Stolee @ 2022-08-18 15:30 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget, git; +Cc: Elijah Newren

On 8/18/2022 2:17 AM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>

Code looks good!

> diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
> index af57a04b7ff..d3657737fa6 100755
> --- a/t/t6019-rev-list-ancestry-path.sh
> +++ b/t/t6019-rev-list-ancestry-path.sh
> @@ -8,8 +8,13 @@ test_description='--ancestry-path'
>  #   /                     \
>  #  A-------K---------------L--M
>  #
> -#  D..M                 == E F G H I J K L M
> -#  --ancestry-path D..M == E F H I J L M
> +#  D..M                                     == E F G H I J K L M
> +#  --ancestry-path                     D..M == E F   H I J   L M
> +#  --ancestry-path=F                   D..M == E F       J   L M
> +#  --ancestry-path=G                   D..M ==     G H I J   L M
> +#  --ancestry-path=H                   D..M == E   G H I J   L M
> +#  --ancestry-path=K                   D..M ==             K L M
> +#  --ancestry-path=K --ancestry-path=F D..M == E F       J K L M

These are good examples, because they help clarify what I had initially
been confused about: you include things in _both_ directions. In
particular, "--ancestry-path=K --ancestry-path=f D..M" you are kind of
taking the union of the following queries:

	--ancestry-path D..K
	--ancestry-path K..M
	--ancestry-path D..F
	--ancestry-path F..M

I did check just in case, but specifying multiple ranges such as

	--ancestry-path D..K K..M

does not do what is expected.

> +test_expect_success 'rev-list --ancestry-path=F D..M' '
> +	test_write_lines E F J L M >expect &&
> +	git rev-list --ancestry-path=F --format=%s D..M |
> +	sed -e "/^commit /d" |
> +	sort >actual &&
> +	test_cmp expect actual
> +'

These tests follow the patterns from other tests in this file, but
it also has bad patterns. Specifically, the 'git rev-list' command
is fed directly into a pipe. I include a patch below that applies
directly on this one to rewrite these tests. If you want, you could
rebase to have that test refactor happen before you add your new
--ancestry-path=<X> option tests.

> +test_expect_success 'rev-list --ancestry-path=G D..M' '
> +	test_write_lines G H I J L M >expect &&
> +	git rev-list --ancestry-path=G --format=%s D..M |
> +	sed -e "/^commit /d" |
> +	sort >actual &&
> +	test_cmp expect actual
> +'
> +test_expect_success 'rev-list --ancestry-path=H D..M' '

nit: needs extra whitespace between tests. The above test pair
needs one, too. This becomes moot with the patch I provide.

Thanks,
-Stolee

--- >8 ---

From 9ac4e81dba0d7801513a09f5fe307d01357123dd Mon Sep 17 00:00:00 2001
From: Derrick Stolee <derrickstolee@github.com>
Date: Thu, 18 Aug 2022 11:25:04 -0400
Subject: [PATCH] t6019: modernize tests with helper

The tests in t6019 are repetive, so create a helper that greatly
simplifies the test script.

In addition, update the common pattern that places 'git rev-list' on the
left side of a pipe, which can hide some exit codes. Send the output to
a 'raw' file that is then consumed by other tools so the Git exit code
is verified as zero.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 t/t6019-rev-list-ancestry-path.sh | 131 +++++++-----------------------
 1 file changed, 31 insertions(+), 100 deletions(-)

diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
index d3657737fa60..18941a80ce67 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -55,111 +55,42 @@ test_expect_success setup '
 	test_commit M
 '
 
-test_expect_success 'rev-list D..M' '
-	test_write_lines E F G H I J K L M >expect &&
-	git rev-list --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list --ancestry-path D..M' '
-	test_write_lines E F H I J L M >expect &&
-	git rev-list --ancestry-path --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list --ancestry-path=F D..M' '
-	test_write_lines E F J L M >expect &&
-	git rev-list --ancestry-path=F --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-test_expect_success 'rev-list --ancestry-path=G D..M' '
-	test_write_lines G H I J L M >expect &&
-	git rev-list --ancestry-path=G --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-test_expect_success 'rev-list --ancestry-path=H D..M' '
-	test_write_lines E G H I J L M >expect &&
-	git rev-list --ancestry-path=H --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list --ancestry-path=K D..M' '
-	test_write_lines K L M >expect &&
-	git rev-list --ancestry-path=K --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list --ancestry-path=F --ancestry-path=K D..M' '
-	test_write_lines E F J K L M >expect &&
-	git rev-list --ancestry-path=F --ancestry-path=K --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list D..M -- M.t' '
-	echo M >expect &&
-	git rev-list --format=%s D..M -- M.t |
-	sed -e "/^commit /d" >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
-	echo M >expect &&
-	git rev-list --ancestry-path --format=%s D..M -- M.t |
-	sed -e "/^commit /d" >actual &&
-	test_cmp expect actual
-'
+test_ancestry () {
+	args=$1
+	expected=$2
+	test_expect_success "rev-list $args" "
+		test_write_lines $expected >expect &&
+		git rev-list --format=%s $args >raw &&
+
+		if test -n \"$expected\"
+		then
+			sed -e \"/^commit /d\" raw | sort >actual &&
+			test_cmp expect actual || return 1
+		else
+			test_must_be_empty raw
+		fi
+	"
+}
 
-test_expect_success 'rev-list F...I' '
-	test_write_lines F G H I >expect &&
-	git rev-list --format=%s F...I |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
+test_ancestry "D..M" "E F G H I J K L M"
 
-test_expect_success 'rev-list --ancestry-path F...I' '
-	test_write_lines F H I >expect &&
-	git rev-list --ancestry-path --format=%s F...I |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
+test_ancestry "--ancestry-path D..M" "E F H I J L M"
+test_ancestry "--ancestry-path D..M" "E F H I J L M"
+test_ancestry "--ancestry-path=F D..M" "E F J L M"
+test_ancestry "--ancestry-path=G D..M" "G H I J L M"
+test_ancestry "--ancestry-path=H D..M" "E G H I J L M"
+test_ancestry "--ancestry-path=K D..M" "K L M"
+test_ancestry "--ancestry-path=K --ancestry-path=F D..M" "E F J K L M"
 
-# G.t is dropped in an "-s ours" merge
-test_expect_success 'rev-list G..M -- G.t' '
-	git rev-list --format=%s G..M -- G.t |
-	sed -e "/^commit /d" >actual &&
-	test_must_be_empty actual
-'
+test_ancestry "D..M -- M.t" "M"
+test_ancestry "--ancestry-path D..M -- M.t" "M"
 
-test_expect_success 'rev-list --ancestry-path G..M -- G.t' '
-	echo L >expect &&
-	git rev-list --ancestry-path --format=%s G..M -- G.t |
-	sed -e "/^commit /d" >actual &&
-	test_cmp expect actual
-'
+test_ancestry "F...I" "F G H I"
+test_ancestry "--ancestry-path F...I" "F H I"
 
-test_expect_success 'rev-list --ancestry-path --simplify-merges G^..M -- G.t' '
-	test_write_lines G L >expect &&
-	git rev-list --ancestry-path --simplify-merges --format=%s G^..M -- G.t |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
+test_ancestry "G..M -- G.t" ""
+test_ancestry "--ancestry-path G..M -- G.t" "L"
+test_ancestry "--ancestry-path --simplify-merges G^..M -- G.t" "G L"
 
 #   b---bc
 #  / \ /
-- 
2.37.1.vfs.0.0.rebase





^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] revision: allow --ancestry-path to take an argument
  2022-08-18 15:30     ` Derrick Stolee
@ 2022-08-18 15:50       ` Ævar Arnfjörð Bjarmason
  2022-08-18 16:51         ` Derrick Stolee
                           ` (2 more replies)
  2022-08-18 16:53       ` Eric Sunshine
  2022-08-19  1:01       ` Elijah Newren
  2 siblings, 3 replies; 29+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-08-18 15:50 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Elijah Newren via GitGitGadget, git, Elijah Newren


On Thu, Aug 18 2022, Derrick Stolee wrote:

> On 8/18/2022 2:17 AM, Elijah Newren via GitGitGadget wrote:
>> +test_expect_success 'rev-list --ancestry-path=F D..M' '
>> +	test_write_lines E F J L M >expect &&
>> +	git rev-list --ancestry-path=F --format=%s D..M |
>> +	sed -e "/^commit /d" |
>> +	sort >actual &&
>> +	test_cmp expect actual
>> +'
>
> These tests follow the patterns from other tests in this file, but
> it also has bad patterns. Specifically, the 'git rev-list' command
> is fed directly into a pipe. I include a patch below that applies
> directly on this one to rewrite these tests. If you want, you could
> rebase to have that test refactor happen before you add your new
> --ancestry-path=<X> option tests.

Thanks, I was going to comment on the same, but your solution is much
better (I was just going to suggest using intermediate files).

> [...]
> -test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
> -	echo M >expect &&
> -	git rev-list --ancestry-path --format=%s D..M -- M.t |
> -	sed -e "/^commit /d" >actual &&
> -	test_cmp expect actual
> -'
> +test_ancestry () {
> +	args=$1
> +	expected=$2

Maybe add &&-chaining here (we do it in some cases, but I'm not sure
when such assignments would ever fail).

> +	test_expect_success "rev-list $args" "
> +		test_write_lines $expected >expect &&
> +		git rev-list --format=%s $args >raw &&
> +
> +		if test -n \"$expected\"

Aren't you making things harder for yourself here than required by using
""-quoting for the body of the test.

We eval it into existence, so you can use ''-quotes, and then you don't
need to escape e.g. the "" quotes here for expected, no?

> +		then
> +			sed -e \"/^commit /d\" raw | sort >actual &&

nit for debuggability (and not correctness), maybe using intermediate
files here would be nicer? And then perhaps call them "actual" and
"actual.sorted".


> +			test_cmp expect actual || return 1

No need for a "return 1" here when we're not in a loop. It's redundant,
and makes the -x output on failure confusing ("why didn't I fail on the
test_cmp, but on this stray return?...").

...

> +		else
> +			test_must_be_empty raw

...which would also allow you to extract much of this if/else at the
cost of not using test_must_be_empty, i.e. just make the "expected"
empty unless "$expected" is provided. Just a thought/nit, we could also
leave this as-is :)

Also does the "compare rev" part of this want test_cmp_rev instead?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 0/2] Allow --ancestry-path to take an argument
  2022-08-18  6:17 ` [PATCH v2 0/2] Allow " Elijah Newren via GitGitGadget
  2022-08-18  6:17   ` [PATCH v2 1/2] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
  2022-08-18  6:17   ` [PATCH v2 2/2] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
@ 2022-08-18 16:32   ` Junio C Hamano
  2022-08-19  4:28   ` [PATCH v3 0/3] " Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 29+ messages in thread
From: Junio C Hamano @ 2022-08-18 16:32 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Changes since v1:
>
>  * Tweaked the commit message, and incorporated Junio's suggestion to update
>    left_flag and ancestry_flag together.

Thanks, will replace and queue.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] revision: allow --ancestry-path to take an argument
  2022-08-18 15:50       ` Ævar Arnfjörð Bjarmason
@ 2022-08-18 16:51         ` Derrick Stolee
  2022-08-18 16:56         ` Eric Sunshine
  2022-08-19  1:12         ` Elijah Newren
  2 siblings, 0 replies; 29+ messages in thread
From: Derrick Stolee @ 2022-08-18 16:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, git, Elijah Newren

On 8/18/2022 11:50 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Aug 18 2022, Derrick Stolee wrote:
> 
>> On 8/18/2022 2:17 AM, Elijah Newren via GitGitGadget wrote:
>>> +test_expect_success 'rev-list --ancestry-path=F D..M' '
>>> +	test_write_lines E F J L M >expect &&
>>> +	git rev-list --ancestry-path=F --format=%s D..M |
>>> +	sed -e "/^commit /d" |
>>> +	sort >actual &&
>>> +	test_cmp expect actual
>>> +'
>>
>> These tests follow the patterns from other tests in this file, but
>> it also has bad patterns. Specifically, the 'git rev-list' command
>> is fed directly into a pipe. I include a patch below that applies
>> directly on this one to rewrite these tests. If you want, you could
>> rebase to have that test refactor happen before you add your new
>> --ancestry-path=<X> option tests.
> 
> Thanks, I was going to comment on the same, but your solution is much
> better (I was just going to suggest using intermediate files).
> 
>> [...]
>> -test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
>> -	echo M >expect &&
>> -	git rev-list --ancestry-path --format=%s D..M -- M.t |
>> -	sed -e "/^commit /d" >actual &&
>> -	test_cmp expect actual
>> -'
>> +test_ancestry () {
>> +	args=$1
>> +	expected=$2
> 
> Maybe add &&-chaining here (we do it in some cases, but I'm not sure
> when such assignments would ever fail).

These are outside of a test_expect_success, so they are not important.
 
>> +	test_expect_success "rev-list $args" "
>> +		test_write_lines $expected >expect &&
>> +		git rev-list --format=%s $args >raw &&
>> +
>> +		if test -n \"$expected\"
> 
> Aren't you making things harder for yourself here than required by using
> ""-quoting for the body of the test.
> 
> We eval it into existence, so you can use ''-quotes, and then you don't
> need to escape e.g. the "" quotes here for expected, no?

Are "args" and "expected" in-scope if I use single quotes? I don't think
they are unless we export them. I could be wrong.

>> +		then
>> +			sed -e \"/^commit /d\" raw | sort >actual &&
> 
> nit for debuggability (and not correctness), maybe using intermediate
> files here would be nicer? And then perhaps call them "actual" and
> "actual.sorted".

I don't think that level of debuggability is important. We have the
raw file if we want to see what the Git output was.

> 
>> +			test_cmp expect actual || return 1
> 
> No need for a "return 1" here when we're not in a loop. It's redundant,
> and makes the -x output on failure confusing ("why didn't I fail on the
> test_cmp, but on this stray return?...").

Sure. I did this more out of habit, but it makes sense that we don't
need it for an if block.

> ...
> 
>> +		else
>> +			test_must_be_empty raw
> 
> ...which would also allow you to extract much of this if/else at the
> cost of not using test_must_be_empty, i.e. just make the "expected"
> empty unless "$expected" is provided. Just a thought/nit, we could also
> leave this as-is :)

*shrug* either way is fine by me.

> Also does the "compare rev" part of this want test_cmp_rev instead?

I don't know what you mean here. We are not comparing revisions anywhere,
but we are using the commit messages to create an easy comparison for the
output.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] revision: allow --ancestry-path to take an argument
  2022-08-18 15:30     ` Derrick Stolee
  2022-08-18 15:50       ` Ævar Arnfjörð Bjarmason
@ 2022-08-18 16:53       ` Eric Sunshine
  2022-08-19  1:01       ` Elijah Newren
  2 siblings, 0 replies; 29+ messages in thread
From: Eric Sunshine @ 2022-08-18 16:53 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Elijah Newren via GitGitGadget, Git List, Elijah Newren

On Thu, Aug 18, 2022 at 11:50 AM Derrick Stolee
<derrickstolee@github.com> wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> Subject: [PATCH] t6019: modernize tests with helper
>
> The tests in t6019 are repetive, so create a helper that greatly
> simplifies the test script.

s/repetive/repetitive/

> In addition, update the common pattern that places 'git rev-list' on the
> left side of a pipe, which can hide some exit codes. Send the output to
> a 'raw' file that is then consumed by other tools so the Git exit code
> is verified as zero.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] revision: allow --ancestry-path to take an argument
  2022-08-18 15:50       ` Ævar Arnfjörð Bjarmason
  2022-08-18 16:51         ` Derrick Stolee
@ 2022-08-18 16:56         ` Eric Sunshine
  2022-08-19  1:12         ` Elijah Newren
  2 siblings, 0 replies; 29+ messages in thread
From: Eric Sunshine @ 2022-08-18 16:56 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Derrick Stolee, Elijah Newren via GitGitGadget, Git List, Elijah Newren

On Thu, Aug 18, 2022 at 12:01 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Thu, Aug 18 2022, Derrick Stolee wrote:
> > +test_ancestry () {
> > +     args=$1
> > +     expected=$2
>
> Maybe add &&-chaining here (we do it in some cases, but I'm not sure
> when such assignments would ever fail).

Assignment shouldn't fail, but keeping the &&-chain intact here
protects us against the unlikely event of someone inserting &&-chained
code above these assignments and not realizing that the &&-chain is
not intact at the assignment lines.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] revision: allow --ancestry-path to take an argument
  2022-08-18  6:17   ` [PATCH v2 2/2] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
  2022-08-18 15:30     ` Derrick Stolee
@ 2022-08-18 22:24     ` Jonathan Tan
  2022-08-19  1:23       ` Elijah Newren
  1 sibling, 1 reply; 29+ messages in thread
From: Jonathan Tan @ 2022-08-18 22:24 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: Jonathan Tan, git, Elijah Newren

Thanks - overall this looks good. I only have some minor textual
comments and one small code comment.

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Elijah Newren <newren@gmail.com>
> 
> We have long allowed users to run e.g.
>     git log --ancestry-path master..seen
> which shows all commits which satisfy all three of these criteria:
>   * are an ancestor of seen
>   * are not an ancestor master

are not an ancestor *of* master

> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> index 2f85726745a..001e49cee55 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -389,12 +389,15 @@ Default mode::
>  	merges from the resulting history, as there are no selected
>  	commits contributing to this merge.
>  
> ---ancestry-path::
> +--ancestry-path[=<commit>]::
>  	When given a range of commits to display (e.g. 'commit1..commit2'
> -	or 'commit2 {caret}commit1'), only display commits that exist
> -	directly on the ancestry chain between the 'commit1' and
> -	'commit2', i.e. commits that are both descendants of 'commit1',
> -	and ancestors of 'commit2'.
> +	or 'commit2 {caret}commit1'), only display commits in that
> +	range where <commit> is part of the ancestry chain.  By "part of
> +	the ancestry chain", we mean including <commit> itself and
> +	commits that are either ancestors or descendants of <commit>.
> +	If no commit is specified, use 'commit1' (the excluded part of
> +	the range) as <commit>.  Can be passed multiple times to look for
> +	commits in the ancestry range of multiple commits.

"Ancestry chain" seems to be used multiple times in the Git codebase,
apparently with slightly different definitions, so probably best to
clear up at least this part by not reusing the term. It's also probably
not worth introducing a new term "ancestry range". Maybe rewrite to
say:

  When given a range of commits to display (e.g. 'commit1..commit2'
  or 'commit2 {caret}commit1'), only display commits in that
  range that are ancestors of <commit>, descendants of <commit>, or
  <commit> itself. If no commit is specified, use 'commit1' (the
  excluded part of the range) as <commit>. Can be passed multiple times;
  if so, a commit is included if it is any of the commits given or if it
  is an ancestor or descendant of one of them.

> @@ -568,11 +571,10 @@ Note the major differences in `N`, `P`, and `Q` over `--full-history`:
>  
>  There is another simplification mode available:
>  
> ---ancestry-path::
> -	Limit the displayed commits to those directly on the ancestry
> -	chain between the ``from'' and ``to'' commits in the given commit
> -	range. I.e. only display commits that are ancestor of the ``to''
> -	commit and descendants of the ``from'' commit.
> +--ancestry-path[=<commit>]::
> +	Limit the displayed commits to those containing <commit> in their
> +	ancestry path.  I.e. only display <commit> and commits which have
> +	<commit> as either a direct ancestor or descendant.

Can we refer back to the documentation of --ancestry-path instead?

> @@ -1304,13 +1304,20 @@ static int still_interesting(struct commit_list *src, timestamp_t date, int slop
>  }
>  
>  /*
> - * "rev-list --ancestry-path A..B" computes commits that are ancestors
> + * "rev-list --ancestry-path=C A..B" computes commits that are ancestors
>   * of B but not ancestors of A but further limits the result to those
> - * that are descendants of A.  This takes the list of bottom commits and
> - * the result of "A..B" without --ancestry-path, and limits the latter
> - * further to the ones that can reach one of the commits in "bottom".
> + * that have C in their ancestry path (i.e. are either ancestors of C,
> + * descendants of C, or are C).  If multiple --ancestry-path=$COMMITTISH
> + * arguments are supplied, we limit the result to those that have at
> + * least one of those COMMITTISH in their ancestry path. If
> + * --ancestry-path is specified with no commit, we use all bottom
> + * commits for C.
> + *
> + * Before this function is called, ancestors of C will have already been
> + * marked with ANCESTRY_PATH previously, so we just need to also mark
> + * the descendants here, then collect both sets of commits.
>   */
> -static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *list)
> +static void limit_to_ancestry(struct commit_list *bottoms, struct commit_list *list)

I thought the original description of this function ("This takes the
list...") to be clear and it would be nice to retain it. So, e.g.:

  "rev-list --ancestry-path=C_0 [--ancestry-path=C_1 ...] A..B" computes commits
  that are ancestors of B but not ancestors of A but further limits the
  result to those that have any of C in their ancestry path (i.e. are
  either ancestors of any of C, descendants of any of C, or are any of
  C). If --ancestry-path is specified with no commit, we use all bottom
  commits for C.
  
  Before this function is called, ancestors of C will have already been
  marked with ANCESTRY_PATH previously.

  This takes the list of bottom commits and the result of "A..B" without
  --ancestry-path, and limits the latter further to the ones that have
  any of C in their ancestry path. Since the ancestors of C have already
  been marked (a prerequisite of this function), we just need to mark
  the descendants, then exclude any commit that does not have any of
  these marks.

Optional: Besides that, from what I can tell, sometimes the C commits
themselves are marked with ANCESTRY_PATH (when they are explicitly
specified) and sometimes they are not (when they are not explicitly
specified). It's not a bug here, but it might be worth handling that in
the ancestry_path_need_bottoms codepath (instead of explicitly setting
TMP_MARK on the bottoms in limit_to_ancestry() - that way, I think we
can also use ANCESTRY_PATH instead of TMP_MARK throughout the ancestry
path codepaths, but I haven't tested it), at least to prevent possible
future bugs.

> @@ -2213,7 +2220,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
>  			       const struct setup_revision_opt* opt)
>  {
>  	const char *arg = argv[0];
> -	const char *optarg;
> +	const char *optarg = NULL;
>  	int argcount;
>  	const unsigned hexsz = the_hash_algo->hexsz;

[snip]

> @@ -2280,10 +2287,26 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
>  		revs->first_parent_only = 1;
>  	} else if (!strcmp(arg, "--exclude-first-parent-only")) {
>  		revs->exclude_first_parent_only = 1;
> -	} else if (!strcmp(arg, "--ancestry-path")) {
> +	} else if (!strcmp(arg, "--ancestry-path") ||
> +		   skip_prefix(arg, "--ancestry-path=", &optarg)) {

This and the above hunk might cause bugs if --ancestry-path was first
specified with a commit and then specified without. Probably best to
break this up into separate "else if" branches, even though there is a
bit of code duplication (and also remove the "= NULL" addition in the
above hunk).

> @@ -164,6 +165,7 @@ struct rev_info {
>  			cherry_mark:1,
>  			bisect:1,
>  			ancestry_path:1,
> +			ancestry_path_need_bottoms:1,

Might be better named as ancestry_path_implicit_bottoms? And probably
worth documenting, e.g.

  True if --ancestry-path was specified without an argument. The bottom
  revisions are implicitly the arguments in this case.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] revision: allow --ancestry-path to take an argument
  2022-08-18 15:30     ` Derrick Stolee
  2022-08-18 15:50       ` Ævar Arnfjörð Bjarmason
  2022-08-18 16:53       ` Eric Sunshine
@ 2022-08-19  1:01       ` Elijah Newren
  2 siblings, 0 replies; 29+ messages in thread
From: Elijah Newren @ 2022-08-19  1:01 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Elijah Newren via GitGitGadget, Git Mailing List

On Thu, Aug 18, 2022 at 8:30 AM Derrick Stolee <derrickstolee@github.com> wrote:
>
> On 8/18/2022 2:17 AM, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <newren@gmail.com>
>
> Code looks good!
>
> > diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
> > index af57a04b7ff..d3657737fa6 100755
> > --- a/t/t6019-rev-list-ancestry-path.sh
> > +++ b/t/t6019-rev-list-ancestry-path.sh
> > @@ -8,8 +8,13 @@ test_description='--ancestry-path'
> >  #   /                     \
> >  #  A-------K---------------L--M
> >  #
> > -#  D..M                 == E F G H I J K L M
> > -#  --ancestry-path D..M == E F H I J L M
> > +#  D..M                                     == E F G H I J K L M
> > +#  --ancestry-path                     D..M == E F   H I J   L M
> > +#  --ancestry-path=F                   D..M == E F       J   L M
> > +#  --ancestry-path=G                   D..M ==     G H I J   L M
> > +#  --ancestry-path=H                   D..M == E   G H I J   L M
> > +#  --ancestry-path=K                   D..M ==             K L M
> > +#  --ancestry-path=K --ancestry-path=F D..M == E F       J K L M
>
> These are good examples, because they help clarify what I had initially
> been confused about: you include things in _both_ directions. In
> particular, "--ancestry-path=K --ancestry-path=f D..M" you are kind of
> taking the union of the following queries:
>
>         --ancestry-path D..K
>         --ancestry-path K..M
>         --ancestry-path D..F
>         --ancestry-path F..M
>
> I did check just in case, but specifying multiple ranges such as
>
>         --ancestry-path D..K K..M
>
> does not do what is expected.

Right, because there's no such thing as multiple ranges.  Quoting from the docs:

"""
Commands that are specifically designed to take two distinct ranges
(e.g. "git range-diff R1 R2" to compare two ranges) do exist, but
they are exceptions.  Unless otherwise noted, all "git" commands
that operate on a set of commits work on a single revision range.
In other words, writing two "two-dot range notation" next to each
other, e.g.

    $ git log A..B C..D

does *not* specify two revision ranges for most commands.  Instead
it will name a single connected set of commits, i.e. those that are
reachable from either B or D but are reachable from neither A or C.
"""

>
> > +test_expect_success 'rev-list --ancestry-path=F D..M' '
> > +     test_write_lines E F J L M >expect &&
> > +     git rev-list --ancestry-path=F --format=%s D..M |
> > +     sed -e "/^commit /d" |
> > +     sort >actual &&
> > +     test_cmp expect actual
> > +'
>
> These tests follow the patterns from other tests in this file, but
> it also has bad patterns. Specifically, the 'git rev-list' command
> is fed directly into a pipe. I include a patch below that applies
> directly on this one to rewrite these tests. If you want, you could
> rebase to have that test refactor happen before you add your new
> --ancestry-path=<X> option tests.

Ooh, I like it.  I'll rebase and include this patch earlier in my
series.  Thanks!

>
> > +test_expect_success 'rev-list --ancestry-path=G D..M' '
> > +     test_write_lines G H I J L M >expect &&
> > +     git rev-list --ancestry-path=G --format=%s D..M |
> > +     sed -e "/^commit /d" |
> > +     sort >actual &&
> > +     test_cmp expect actual
> > +'
> > +test_expect_success 'rev-list --ancestry-path=H D..M' '
>
> nit: needs extra whitespace between tests. The above test pair
> needs one, too. This becomes moot with the patch I provide.
>
> Thanks,
> -Stolee
>
> --- >8 ---
>
> From 9ac4e81dba0d7801513a09f5fe307d01357123dd Mon Sep 17 00:00:00 2001
> From: Derrick Stolee <derrickstolee@github.com>
> Date: Thu, 18 Aug 2022 11:25:04 -0400
> Subject: [PATCH] t6019: modernize tests with helper
>
> The tests in t6019 are repetive, so create a helper that greatly
> simplifies the test script.
>
> In addition, update the common pattern that places 'git rev-list' on the
> left side of a pipe, which can hide some exit codes. Send the output to
> a 'raw' file that is then consumed by other tools so the Git exit code
> is verified as zero.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  t/t6019-rev-list-ancestry-path.sh | 131 +++++++-----------------------
>  1 file changed, 31 insertions(+), 100 deletions(-)
>
> diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
> index d3657737fa60..18941a80ce67 100755
> --- a/t/t6019-rev-list-ancestry-path.sh
> +++ b/t/t6019-rev-list-ancestry-path.sh
> @@ -55,111 +55,42 @@ test_expect_success setup '
>         test_commit M
>  '
>
> -test_expect_success 'rev-list D..M' '
> -       test_write_lines E F G H I J K L M >expect &&
> -       git rev-list --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'rev-list --ancestry-path D..M' '
> -       test_write_lines E F H I J L M >expect &&
> -       git rev-list --ancestry-path --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'rev-list --ancestry-path=F D..M' '
> -       test_write_lines E F J L M >expect &&
> -       git rev-list --ancestry-path=F --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -test_expect_success 'rev-list --ancestry-path=G D..M' '
> -       test_write_lines G H I J L M >expect &&
> -       git rev-list --ancestry-path=G --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -test_expect_success 'rev-list --ancestry-path=H D..M' '
> -       test_write_lines E G H I J L M >expect &&
> -       git rev-list --ancestry-path=H --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'rev-list --ancestry-path=K D..M' '
> -       test_write_lines K L M >expect &&
> -       git rev-list --ancestry-path=K --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'rev-list --ancestry-path=F --ancestry-path=K D..M' '
> -       test_write_lines E F J K L M >expect &&
> -       git rev-list --ancestry-path=F --ancestry-path=K --format=%s D..M |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'rev-list D..M -- M.t' '
> -       echo M >expect &&
> -       git rev-list --format=%s D..M -- M.t |
> -       sed -e "/^commit /d" >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
> -       echo M >expect &&
> -       git rev-list --ancestry-path --format=%s D..M -- M.t |
> -       sed -e "/^commit /d" >actual &&
> -       test_cmp expect actual
> -'
> +test_ancestry () {
> +       args=$1
> +       expected=$2
> +       test_expect_success "rev-list $args" "
> +               test_write_lines $expected >expect &&
> +               git rev-list --format=%s $args >raw &&
> +
> +               if test -n \"$expected\"
> +               then
> +                       sed -e \"/^commit /d\" raw | sort >actual &&
> +                       test_cmp expect actual || return 1
> +               else
> +                       test_must_be_empty raw
> +               fi
> +       "
> +}
>
> -test_expect_success 'rev-list F...I' '
> -       test_write_lines F G H I >expect &&
> -       git rev-list --format=%s F...I |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> +test_ancestry "D..M" "E F G H I J K L M"
>
> -test_expect_success 'rev-list --ancestry-path F...I' '
> -       test_write_lines F H I >expect &&
> -       git rev-list --ancestry-path --format=%s F...I |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> +test_ancestry "--ancestry-path D..M" "E F H I J L M"
> +test_ancestry "--ancestry-path D..M" "E F H I J L M"
> +test_ancestry "--ancestry-path=F D..M" "E F J L M"
> +test_ancestry "--ancestry-path=G D..M" "G H I J L M"
> +test_ancestry "--ancestry-path=H D..M" "E G H I J L M"
> +test_ancestry "--ancestry-path=K D..M" "K L M"
> +test_ancestry "--ancestry-path=K --ancestry-path=F D..M" "E F J K L M"
>
> -# G.t is dropped in an "-s ours" merge
> -test_expect_success 'rev-list G..M -- G.t' '
> -       git rev-list --format=%s G..M -- G.t |
> -       sed -e "/^commit /d" >actual &&
> -       test_must_be_empty actual
> -'
> +test_ancestry "D..M -- M.t" "M"
> +test_ancestry "--ancestry-path D..M -- M.t" "M"
>
> -test_expect_success 'rev-list --ancestry-path G..M -- G.t' '
> -       echo L >expect &&
> -       git rev-list --ancestry-path --format=%s G..M -- G.t |
> -       sed -e "/^commit /d" >actual &&
> -       test_cmp expect actual
> -'
> +test_ancestry "F...I" "F G H I"
> +test_ancestry "--ancestry-path F...I" "F H I"
>
> -test_expect_success 'rev-list --ancestry-path --simplify-merges G^..M -- G.t' '
> -       test_write_lines G L >expect &&
> -       git rev-list --ancestry-path --simplify-merges --format=%s G^..M -- G.t |
> -       sed -e "/^commit /d" |
> -       sort >actual &&
> -       test_cmp expect actual
> -'
> +test_ancestry "G..M -- G.t" ""
> +test_ancestry "--ancestry-path G..M -- G.t" "L"
> +test_ancestry "--ancestry-path --simplify-merges G^..M -- G.t" "G L"
>
>  #   b---bc
>  #  / \ /
> --
> 2.37.1.vfs.0.0.rebase
>
>
>
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] revision: allow --ancestry-path to take an argument
  2022-08-18 15:50       ` Ævar Arnfjörð Bjarmason
  2022-08-18 16:51         ` Derrick Stolee
  2022-08-18 16:56         ` Eric Sunshine
@ 2022-08-19  1:12         ` Elijah Newren
  2022-08-19  2:45           ` Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 29+ messages in thread
From: Elijah Newren @ 2022-08-19  1:12 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Derrick Stolee, Elijah Newren via GitGitGadget, Git Mailing List

On Thu, Aug 18, 2022 at 8:57 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Thu, Aug 18 2022, Derrick Stolee wrote:
>
> > On 8/18/2022 2:17 AM, Elijah Newren via GitGitGadget wrote:
> >> +test_expect_success 'rev-list --ancestry-path=F D..M' '
> >> +    test_write_lines E F J L M >expect &&
> >> +    git rev-list --ancestry-path=F --format=%s D..M |
> >> +    sed -e "/^commit /d" |
> >> +    sort >actual &&
> >> +    test_cmp expect actual
> >> +'
> >
> > These tests follow the patterns from other tests in this file, but
> > it also has bad patterns. Specifically, the 'git rev-list' command
> > is fed directly into a pipe. I include a patch below that applies
> > directly on this one to rewrite these tests. If you want, you could
> > rebase to have that test refactor happen before you add your new
> > --ancestry-path=<X> option tests.
>
> Thanks, I was going to comment on the same, but your solution is much
> better (I was just going to suggest using intermediate files).
>
> > [...]
> > -test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
> > -     echo M >expect &&
> > -     git rev-list --ancestry-path --format=%s D..M -- M.t |
> > -     sed -e "/^commit /d" >actual &&
> > -     test_cmp expect actual
> > -'
> > +test_ancestry () {
> > +     args=$1
> > +     expected=$2
>
> Maybe add &&-chaining here (we do it in some cases, but I'm not sure
> when such assignments would ever fail).
>
> > +     test_expect_success "rev-list $args" "
> > +             test_write_lines $expected >expect &&
> > +             git rev-list --format=%s $args >raw &&
> > +
> > +             if test -n \"$expected\"
>
> Aren't you making things harder for yourself here than required by using
> ""-quoting for the body of the test.
>
> We eval it into existence, so you can use ''-quotes, and then you don't
> need to escape e.g. the "" quotes here for expected, no?
>
> > +             then
> > +                     sed -e \"/^commit /d\" raw | sort >actual &&
>
> nit for debuggability (and not correctness), maybe using intermediate
> files here would be nicer? And then perhaps call them "actual" and
> "actual.sorted".

Would be better to just nuke the sed by replacing 'rev-list' with
'log' (the line already has a --format option, so might as well get
the output we want).

> > +                     test_cmp expect actual || return 1
>
> No need for a "return 1" here when we're not in a loop. It's redundant,
> and makes the -x output on failure confusing ("why didn't I fail on the
> test_cmp, but on this stray return?...").
>
> ...
>
> > +             else
> > +                     test_must_be_empty raw
>
> ...which would also allow you to extract much of this if/else at the
> cost of not using test_must_be_empty, i.e. just make the "expected"
> empty unless "$expected" is provided. Just a thought/nit, we could also
> leave this as-is :)
>
> Also does the "compare rev" part of this want test_cmp_rev instead?

Um, I don't see any "compare rev" part of this, or any revision
comparing.  What are you referring to?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] revision: allow --ancestry-path to take an argument
  2022-08-18 22:24     ` Jonathan Tan
@ 2022-08-19  1:23       ` Elijah Newren
  2022-08-19 17:25         ` Jonathan Tan
  0 siblings, 1 reply; 29+ messages in thread
From: Elijah Newren @ 2022-08-19  1:23 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Elijah Newren via GitGitGadget, Git Mailing List

On Thu, Aug 18, 2022 at 3:24 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> Thanks - overall this looks good. I only have some minor textual
> comments and one small code comment.
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > From: Elijah Newren <newren@gmail.com>
> >
> > We have long allowed users to run e.g.
> >     git log --ancestry-path master..seen
> > which shows all commits which satisfy all three of these criteria:
> >   * are an ancestor of seen
> >   * are not an ancestor master
>
> are not an ancestor *of* master

Thanks, good catch.

> > diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> > index 2f85726745a..001e49cee55 100644
> > --- a/Documentation/rev-list-options.txt
> > +++ b/Documentation/rev-list-options.txt
> > @@ -389,12 +389,15 @@ Default mode::
> >       merges from the resulting history, as there are no selected
> >       commits contributing to this merge.
> >
> > ---ancestry-path::
> > +--ancestry-path[=<commit>]::
> >       When given a range of commits to display (e.g. 'commit1..commit2'
> > -     or 'commit2 {caret}commit1'), only display commits that exist
> > -     directly on the ancestry chain between the 'commit1' and
> > -     'commit2', i.e. commits that are both descendants of 'commit1',
> > -     and ancestors of 'commit2'.
> > +     or 'commit2 {caret}commit1'), only display commits in that
> > +     range where <commit> is part of the ancestry chain.  By "part of
> > +     the ancestry chain", we mean including <commit> itself and
> > +     commits that are either ancestors or descendants of <commit>.
> > +     If no commit is specified, use 'commit1' (the excluded part of
> > +     the range) as <commit>.  Can be passed multiple times to look for
> > +     commits in the ancestry range of multiple commits.
>
> "Ancestry chain" seems to be used multiple times in the Git codebase,
> apparently with slightly different definitions, so probably best to
> clear up at least this part by not reusing the term. It's also probably
> not worth introducing a new term "ancestry range". Maybe rewrite to
> say:
>
>   When given a range of commits to display (e.g. 'commit1..commit2'
>   or 'commit2 {caret}commit1'), only display commits in that
>   range that are ancestors of <commit>, descendants of <commit>, or
>   <commit> itself. If no commit is specified, use 'commit1' (the
>   excluded part of the range) as <commit>. Can be passed multiple times;
>   if so, a commit is included if it is any of the commits given or if it
>   is an ancestor or descendant of one of them.

Works for me; thanks.

> > @@ -568,11 +571,10 @@ Note the major differences in `N`, `P`, and `Q` over `--full-history`:
> >
> >  There is another simplification mode available:
> >
> > ---ancestry-path::
> > -     Limit the displayed commits to those directly on the ancestry
> > -     chain between the ``from'' and ``to'' commits in the given commit
> > -     range. I.e. only display commits that are ancestor of the ``to''
> > -     commit and descendants of the ``from'' commit.
> > +--ancestry-path[=<commit>]::
> > +     Limit the displayed commits to those containing <commit> in their
> > +     ancestry path.  I.e. only display <commit> and commits which have
> > +     <commit> as either a direct ancestor or descendant.
>
> Can we refer back to the documentation of --ancestry-path instead?

I had the same thought, especially since the nearby text also
duplicates definitions for --dense, --sparse, --simplify-merges, and
--show-pulls, each of which might also benefit from just referring
back to previous definitions to avoid drift.  I think we should handle
this whole "History Simplification" section consistently, so if we
just refer back to the previous definition somehow for this flag then
we should also do the same for the others.  But some of the others
appear to intentionally avoid using the same wording in order to draw
graphs and pictorially explain it.

So it feels like a can of worms that I'm not sure how to solve, and
what I currently have is the best solution available.

> > @@ -1304,13 +1304,20 @@ static int still_interesting(struct commit_list *src, timestamp_t date, int slop
> >  }
> >
> >  /*
> > - * "rev-list --ancestry-path A..B" computes commits that are ancestors
> > + * "rev-list --ancestry-path=C A..B" computes commits that are ancestors
> >   * of B but not ancestors of A but further limits the result to those
> > - * that are descendants of A.  This takes the list of bottom commits and
> > - * the result of "A..B" without --ancestry-path, and limits the latter
> > - * further to the ones that can reach one of the commits in "bottom".
> > + * that have C in their ancestry path (i.e. are either ancestors of C,
> > + * descendants of C, or are C).  If multiple --ancestry-path=$COMMITTISH
> > + * arguments are supplied, we limit the result to those that have at
> > + * least one of those COMMITTISH in their ancestry path. If
> > + * --ancestry-path is specified with no commit, we use all bottom
> > + * commits for C.
> > + *
> > + * Before this function is called, ancestors of C will have already been
> > + * marked with ANCESTRY_PATH previously, so we just need to also mark
> > + * the descendants here, then collect both sets of commits.
> >   */
> > -static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *list)
> > +static void limit_to_ancestry(struct commit_list *bottoms, struct commit_list *list)
>
> I thought the original description of this function ("This takes the
> list...") to be clear and it would be nice to retain it. So, e.g.:
>
>   "rev-list --ancestry-path=C_0 [--ancestry-path=C_1 ...] A..B" computes commits
>   that are ancestors of B but not ancestors of A but further limits the
>   result to those that have any of C in their ancestry path (i.e. are
>   either ancestors of any of C, descendants of any of C, or are any of
>   C). If --ancestry-path is specified with no commit, we use all bottom
>   commits for C.
>
>   Before this function is called, ancestors of C will have already been
>   marked with ANCESTRY_PATH previously.
>
>   This takes the list of bottom commits and the result of "A..B" without
>   --ancestry-path, and limits the latter further to the ones that have
>   any of C in their ancestry path. Since the ancestors of C have already
>   been marked (a prerequisite of this function), we just need to mark
>   the descendants, then exclude any commit that does not have any of
>   these marks.

Sounds good to me; I'm happy to adopt this wording.

> Optional: Besides that, from what I can tell, sometimes the C commits
> themselves are marked with ANCESTRY_PATH (when they are explicitly
> specified) and sometimes they are not (when they are not explicitly
> specified). It's not a bug here, but it might be worth handling that in
> the ancestry_path_need_bottoms codepath (instead of explicitly setting
> TMP_MARK on the bottoms in limit_to_ancestry() - that way, I think we
> can also use ANCESTRY_PATH instead of TMP_MARK throughout the ancestry
> path codepaths, but I haven't tested it), at least to prevent possible
> future bugs.

That sounds like you're trying to duplicate the bug in my first
attempt at this patch.  If you try to coalesce ANCESTRY_PATH and
TMP_MARK, then you not only get all descendants of C, you also get all
descendants of any ancestor of C, which defeats the whole point of my
changes.

It's true that I don't mark implicit C commits with ANCESTRY_PATH, but
those are always bottom commits that are the excluded end of a range
anyway.  While those could be marked without causing problems, it
would always be a waste of effort.

> > @@ -2213,7 +2220,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
> >                              const struct setup_revision_opt* opt)
> >  {
> >       const char *arg = argv[0];
> > -     const char *optarg;
> > +     const char *optarg = NULL;
> >       int argcount;
> >       const unsigned hexsz = the_hash_algo->hexsz;
>
> [snip]
>
> > @@ -2280,10 +2287,26 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
> >               revs->first_parent_only = 1;
> >       } else if (!strcmp(arg, "--exclude-first-parent-only")) {
> >               revs->exclude_first_parent_only = 1;
> > -     } else if (!strcmp(arg, "--ancestry-path")) {
> > +     } else if (!strcmp(arg, "--ancestry-path") ||
> > +                skip_prefix(arg, "--ancestry-path=", &optarg)) {
>
> This and the above hunk might cause bugs if --ancestry-path was first
> specified with a commit and then specified without. Probably best to
> break this up into separate "else if" branches, even though there is a
> bit of code duplication (and also remove the "= NULL" addition in the
> above hunk).

Ah, good catch.

> > @@ -164,6 +165,7 @@ struct rev_info {
> >                       cherry_mark:1,
> >                       bisect:1,
> >                       ancestry_path:1,
> > +                     ancestry_path_need_bottoms:1,
>
> Might be better named as ancestry_path_implicit_bottoms? And probably
> worth documenting, e.g.
>
>   True if --ancestry-path was specified without an argument. The bottom
>   revisions are implicitly the arguments in this case.

Sure, sounds good.

Thanks for the careful review!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] revision: allow --ancestry-path to take an argument
  2022-08-19  1:12         ` Elijah Newren
@ 2022-08-19  2:45           ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 29+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-08-19  2:45 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Derrick Stolee, Elijah Newren via GitGitGadget, Git Mailing List


On Thu, Aug 18 2022, Elijah Newren wrote:

> On Thu, Aug 18, 2022 at 8:57 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> [...]
>> Also does the "compare rev" part of this want test_cmp_rev instead?
>
> Um, I don't see any "compare rev" part of this, or any revision
> comparing.  What are you referring to?

I just misread this part while skimming it, sorry.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 0/3] Allow --ancestry-path to take an argument
  2022-08-18  6:17 ` [PATCH v2 0/2] Allow " Elijah Newren via GitGitGadget
                     ` (2 preceding siblings ...)
  2022-08-18 16:32   ` [PATCH v2 0/2] Allow " Junio C Hamano
@ 2022-08-19  4:28   ` Elijah Newren via GitGitGadget
  2022-08-19  4:28     ` [PATCH v3 1/3] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
                       ` (3 more replies)
  3 siblings, 4 replies; 29+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-08-19  4:28 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Jonathan Tan, Elijah Newren

Changes since v2:

 * Incorporated Stolee's suggested patch as a preliminary cleanup of t6019.
   Slightly modified his patch by:
   * fixing the "repetitive" typo pointed out by Eric
   * removed the unnecessary "return 1" pointed out by Ævar
   * switched "rev-list" to "log" since we are using --format anyway, in
     order to remove the need to call "sed" afterward
 * lots of wording improvements suggested by Jonathan
 * fixed an issue with argument parsing pointed out by Jonathan

Changes since v1:

 * Tweaked the commit message, and incorporated Junio's suggestion to update
   left_flag and ancestry_flag together.

Series description:

This came out of a previous thread[1], where I wanted to be able to run
something like

git log --oneline --ancestry-path=ab/submodule-cleanup main..seen


and see the commits in main..seen which contained ab/submodule-cleanup in
their ancestry path. Let me start by defining the terminology "X is in a
commit's ancestry path". By that, I just mean that either the commit is X,
the commit is an ancestor of X, or the commit is a descendant of X. With
that definition...

The command

git log --ancestry-path A..B


means find the commits in A..B which contain A in their ancestry path. I
sometimes still want to use A..B to get the basic range, but would like to
use a commit other than A for specifying which ancestry path is of interest.
So, for example, I might want to use

git log --ancestry-path=C A..B


to mean find the commits in A..B which contain C in their ancestry path, or
use

git log --ancestry-path=C --ancestry-path=D A..B


to mean find the commits in A..B which contain either C or D in their
ancestry path.

This series implements this request, by allowing --ancestry-path to take an
optional argument. With it, I can find the answer to my question in the
thread at [1] within the git.git repository (replacing branch names with
actual hashes since the branches have since moved on):

$ git log --oneline --ancestry-path=5b893f7d81 8168d5e9c2..ac0248bfba | wc -l
36


This returns the answer I want, whereas dropping the '=5b893f7d81' from the
command line gives me 192 unwanted commits (228 total), and various other
command line flags (--first-parent, --boundary, etc.) also fail to give me
the set of commits I am looking for.

[1]
https://lore.kernel.org/git/CABPp-BF+8aqysioP_e27Q9kJ02rE2SuSqXu+XphzKWnk5a_Q+A@mail.gmail.com/

Derrick Stolee (1):
  t6019: modernize tests with helper

Elijah Newren (2):
  rev-list-options.txt: fix simple typo
  revision: allow --ancestry-path to take an argument

 Documentation/rev-list-options.txt |  46 +++++++++----
 object.h                           |   2 +-
 revision.c                         |  89 ++++++++++++++++---------
 revision.h                         |   9 +++
 t/t6019-rev-list-ancestry-path.sh  | 101 +++++++++++------------------
 5 files changed, 141 insertions(+), 106 deletions(-)


base-commit: 6a475b71f8c4ce708d69fdc9317aefbde3769e25
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1303%2Fnewren%2Fancestry-path-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1303/newren/ancestry-path-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1303

Range-diff vs v2:

 1:  68ab719d99c = 1:  68ab719d99c rev-list-options.txt: fix simple typo
 -:  ----------- > 2:  5226da2123e t6019: modernize tests with helper
 2:  f580ec6d060 ! 3:  b810b3c8a2a revision: allow --ancestry-path to take an argument
     @@ Commit message
              git log --ancestry-path master..seen
          which shows all commits which satisfy all three of these criteria:
            * are an ancestor of seen
     -      * are not an ancestor master
     +      * are not an ancestor of master
            * have master as an ancestor
      
          This commit allows another variant:
     @@ Documentation/rev-list-options.txt: Default mode::
      -	directly on the ancestry chain between the 'commit1' and
      -	'commit2', i.e. commits that are both descendants of 'commit1',
      -	and ancestors of 'commit2'.
     -+	or 'commit2 {caret}commit1'), only display commits in that
     -+	range where <commit> is part of the ancestry chain.  By "part of
     -+	the ancestry chain", we mean including <commit> itself and
     -+	commits that are either ancestors or descendants of <commit>.
     -+	If no commit is specified, use 'commit1' (the excluded part of
     -+	the range) as <commit>.  Can be passed multiple times to look for
     -+	commits in the ancestry range of multiple commits.
     ++	or 'commit2 {caret}commit1'), only display commits in that range
     ++	that are ancestors of <commit>, descendants of <commit>, or
     ++	<commit> itself.  If no commit is specified, use 'commit1' (the
     ++	excluded part of the range) as <commit>.  Can be passed multiple
     ++	times; if so, a commit is included if it is any of the commits
     ++	given or if it is an ancestor or descendant of one of them.
       
       A more detailed explanation follows.
       
     @@ Documentation/rev-list-options.txt: Note the major differences in `N`, `P`, and
      -	range. I.e. only display commits that are ancestor of the ``to''
      -	commit and descendants of the ``from'' commit.
      +--ancestry-path[=<commit>]::
     -+	Limit the displayed commits to those containing <commit> in their
     -+	ancestry path.  I.e. only display <commit> and commits which have
     -+	<commit> as either a direct ancestor or descendant.
     ++	Limit the displayed commits to those which are an ancestor of
     ++	<commit>, or which are a descendant of <commit>, or are <commit>
     ++	itself.
       +
       As an example use case, consider the following commit history:
       +
     @@ revision.c: static int still_interesting(struct commit_list *src, timestamp_t da
       
       /*
      - * "rev-list --ancestry-path A..B" computes commits that are ancestors
     -+ * "rev-list --ancestry-path=C A..B" computes commits that are ancestors
     -  * of B but not ancestors of A but further limits the result to those
     +- * of B but not ancestors of A but further limits the result to those
      - * that are descendants of A.  This takes the list of bottom commits and
      - * the result of "A..B" without --ancestry-path, and limits the latter
      - * further to the ones that can reach one of the commits in "bottom".
     -+ * that have C in their ancestry path (i.e. are either ancestors of C,
     -+ * descendants of C, or are C).  If multiple --ancestry-path=$COMMITTISH
     -+ * arguments are supplied, we limit the result to those that have at
     -+ * least one of those COMMITTISH in their ancestry path. If
     -+ * --ancestry-path is specified with no commit, we use all bottom
     -+ * commits for C.
     ++ * "rev-list --ancestry-path=C_0 [--ancestry-path=C_1 ...] A..B"
     ++ * computes commits that are ancestors of B but not ancestors of A but
     ++ * further limits the result to those that have any of C in their
     ++ * ancestry path (i.e. are either ancestors of any of C, descendants
     ++ * of any of C, or are any of C). If --ancestry-path is specified with
     ++ * no commit, we use all bottom commits for C.
     ++ *
     ++ * Before this function is called, ancestors of C will have already
     ++ * been marked with ANCESTRY_PATH previously.
      + *
     -+ * Before this function is called, ancestors of C will have already been
     -+ * marked with ANCESTRY_PATH previously, so we just need to also mark
     -+ * the descendants here, then collect both sets of commits.
     ++ * This takes the list of bottom commits and the result of "A..B"
     ++ * without --ancestry-path, and limits the latter further to the ones
     ++ * that have any of C in their ancestry path. Since the ancestors of C
     ++ * have already been marked (a prerequisite of this function), we just
     ++ * need to mark the descendants, then exclude any commit that does not
     ++ * have any of these marks.
        */
      -static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *list)
      +static void limit_to_ancestry(struct commit_list *bottoms, struct commit_list *list)
     @@ revision.c: static int limit_list(struct rev_info *revs)
      -	if (revs->ancestry_path) {
      -		bottom = collect_bottom_commits(original_list);
      -		if (!bottom)
     -+	if (revs->ancestry_path_need_bottoms) {
     ++	if (revs->ancestry_path_implicit_bottoms) {
      +		collect_bottom_commits(original_list,
      +				       &revs->ancestry_path_bottoms);
      +		if (!revs->ancestry_path_bottoms)
     @@ revision.c: static int handle_revision_opt(struct rev_info *revs, int argc, cons
       	const unsigned hexsz = the_hash_algo->hexsz;
       
      @@ revision.c: static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
     - 		revs->first_parent_only = 1;
     - 	} else if (!strcmp(arg, "--exclude-first-parent-only")) {
     - 		revs->exclude_first_parent_only = 1;
     --	} else if (!strcmp(arg, "--ancestry-path")) {
     -+	} else if (!strcmp(arg, "--ancestry-path") ||
     -+		   skip_prefix(arg, "--ancestry-path=", &optarg)) {
       		revs->ancestry_path = 1;
       		revs->simplify_history = 0;
       		revs->limited = 1;
     -+		if (optarg) {
     -+			struct commit *c;
     -+			struct object_id oid;
     -+			const char *msg = _("could not get commit for ancestry-path argument %s");
     ++		revs->ancestry_path_implicit_bottoms = 1;
     ++	} else if (skip_prefix(arg, "--ancestry-path=", &optarg)) {
     ++		struct commit *c;
     ++		struct object_id oid;
     ++		const char *msg = _("could not get commit for ancestry-path argument %s");
     ++
     ++		revs->ancestry_path = 1;
     ++		revs->simplify_history = 0;
     ++		revs->limited = 1;
      +
     -+			if (repo_get_oid_committish(revs->repo, optarg, &oid))
     -+				return error(msg, optarg);
     -+			get_reference(revs, optarg, &oid, ANCESTRY_PATH);
     -+			c = lookup_commit_reference(revs->repo, &oid);
     -+			if (!c)
     -+				return error(msg, optarg);
     -+			commit_list_insert(c, &revs->ancestry_path_bottoms);
     -+		} else {
     -+			revs->ancestry_path_need_bottoms = 1;
     -+		}
     ++		if (repo_get_oid_committish(revs->repo, optarg, &oid))
     ++			return error(msg, optarg);
     ++		get_reference(revs, optarg, &oid, ANCESTRY_PATH);
     ++		c = lookup_commit_reference(revs->repo, &oid);
     ++		if (!c)
     ++			return error(msg, optarg);
     ++		commit_list_insert(c, &revs->ancestry_path_bottoms);
       	} else if (!strcmp(arg, "-g") || !strcmp(arg, "--walk-reflogs")) {
       		init_reflog_walk(&revs->reflog_info);
       	} else if (!strcmp(arg, "--default")) {
     @@ revision.h: struct rev_info {
       			cherry_mark:1,
       			bisect:1,
       			ancestry_path:1,
     -+			ancestry_path_need_bottoms:1,
     ++
     ++			/* True if --ancestry-path was specified without an
     ++			 * argument. The bottom revisions are implicitly
     ++			 * the arguments in this case.
     ++			 */
     ++			ancestry_path_implicit_bottoms:1,
     ++
       			first_parent_only:1,
       			exclude_first_parent_only:1,
       			line_level_traverse:1,
     @@ t/t6019-rev-list-ancestry-path.sh: test_description='--ancestry-path'
       #
       #  D..M -- M.t                 == M
       #  --ancestry-path D..M -- M.t == M
     -@@ t/t6019-rev-list-ancestry-path.sh: test_expect_success 'rev-list --ancestry-path D..M' '
     - 	test_cmp expect actual
     - '
     +@@ t/t6019-rev-list-ancestry-path.sh: test_ancestry () {
     + test_ancestry "D..M" "E F G H I J K L M"
       
     -+test_expect_success 'rev-list --ancestry-path=F D..M' '
     -+	test_write_lines E F J L M >expect &&
     -+	git rev-list --ancestry-path=F --format=%s D..M |
     -+	sed -e "/^commit /d" |
     -+	sort >actual &&
     -+	test_cmp expect actual
     -+'
     -+test_expect_success 'rev-list --ancestry-path=G D..M' '
     -+	test_write_lines G H I J L M >expect &&
     -+	git rev-list --ancestry-path=G --format=%s D..M |
     -+	sed -e "/^commit /d" |
     -+	sort >actual &&
     -+	test_cmp expect actual
     -+'
     -+test_expect_success 'rev-list --ancestry-path=H D..M' '
     -+	test_write_lines E G H I J L M >expect &&
     -+	git rev-list --ancestry-path=H --format=%s D..M |
     -+	sed -e "/^commit /d" |
     -+	sort >actual &&
     -+	test_cmp expect actual
     -+'
     -+
     -+test_expect_success 'rev-list --ancestry-path=K D..M' '
     -+	test_write_lines K L M >expect &&
     -+	git rev-list --ancestry-path=K --format=%s D..M |
     -+	sed -e "/^commit /d" |
     -+	sort >actual &&
     -+	test_cmp expect actual
     -+'
     -+
     -+test_expect_success 'rev-list --ancestry-path=F --ancestry-path=K D..M' '
     -+	test_write_lines E F J K L M >expect &&
     -+	git rev-list --ancestry-path=F --ancestry-path=K --format=%s D..M |
     -+	sed -e "/^commit /d" |
     -+	sort >actual &&
     -+	test_cmp expect actual
     -+'
     -+
     - test_expect_success 'rev-list D..M -- M.t' '
     - 	echo M >expect &&
     - 	git rev-list --format=%s D..M -- M.t |
     + test_ancestry "--ancestry-path D..M" "E F H I J L M"
     ++test_ancestry "--ancestry-path=F D..M" "E F J L M"
     ++test_ancestry "--ancestry-path=G D..M" "G H I J L M"
     ++test_ancestry "--ancestry-path=H D..M" "E G H I J L M"
     ++test_ancestry "--ancestry-path=K D..M" "K L M"
     ++test_ancestry "--ancestry-path=F --ancestry-path=K D..M" "E F J K L M"
     + 
     + test_ancestry "D..M -- M.t" "M"
     + test_ancestry "--ancestry-path D..M -- M.t" "M"

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 1/3] rev-list-options.txt: fix simple typo
  2022-08-19  4:28   ` [PATCH v3 0/3] " Elijah Newren via GitGitGadget
@ 2022-08-19  4:28     ` Elijah Newren via GitGitGadget
  2022-08-19  4:28     ` [PATCH v3 2/3] t6019: modernize tests with helper Derrick Stolee via GitGitGadget
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-08-19  4:28 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Jonathan Tan, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/rev-list-options.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 195e74eec63..2f85726745a 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -659,7 +659,7 @@ Here, the merge commits `O` and `P` contribute extra noise, as they did
 not actually contribute a change to `file.txt`. They only merged a topic
 that was based on an older version of `file.txt`. This is a common
 issue in repositories using a workflow where many contributors work in
-parallel and merge their topic branches along a single trunk: manu
+parallel and merge their topic branches along a single trunk: many
 unrelated merges appear in the `--full-history` results.
 
 When using the `--simplify-merges` option, the commits `O` and `P`
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 2/3] t6019: modernize tests with helper
  2022-08-19  4:28   ` [PATCH v3 0/3] " Elijah Newren via GitGitGadget
  2022-08-19  4:28     ` [PATCH v3 1/3] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
@ 2022-08-19  4:28     ` Derrick Stolee via GitGitGadget
  2022-08-19  4:28     ` [PATCH v3 3/3] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
  2022-08-19 12:53     ` [PATCH v3 0/3] Allow " Derrick Stolee
  3 siblings, 0 replies; 29+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-08-19  4:28 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Jonathan Tan, Elijah Newren, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The tests in t6019 are repetitive, so create a helper that greatly
simplifies the test script.

In addition, update the common pattern that places 'git rev-list' on the
left side of a pipe, which can hide some exit codes. Send the output to
a 'raw' file that is then consumed by other tools so the Git exit code
is verified as zero.  And since we're using --format anyway, switch to
`git log`, so that we get the desired format and can avoid using sed.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6019-rev-list-ancestry-path.sh | 87 +++++++++----------------------
 1 file changed, 25 insertions(+), 62 deletions(-)

diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
index af57a04b7ff..5bd787a3c0a 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -50,73 +50,36 @@ test_expect_success setup '
 	test_commit M
 '
 
-test_expect_success 'rev-list D..M' '
-	test_write_lines E F G H I J K L M >expect &&
-	git rev-list --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list --ancestry-path D..M' '
-	test_write_lines E F H I J L M >expect &&
-	git rev-list --ancestry-path --format=%s D..M |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list D..M -- M.t' '
-	echo M >expect &&
-	git rev-list --format=%s D..M -- M.t |
-	sed -e "/^commit /d" >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
-	echo M >expect &&
-	git rev-list --ancestry-path --format=%s D..M -- M.t |
-	sed -e "/^commit /d" >actual &&
-	test_cmp expect actual
-'
+test_ancestry () {
+	args=$1
+	expected=$2
+	test_expect_success "log $args" "
+		test_write_lines $expected >expect &&
+		git log --format=%s $args >raw &&
+
+		if test -n \"$expected\"
+		then
+			sort raw >actual &&
+			test_cmp expect actual
+		else
+			test_must_be_empty raw
+		fi
+	"
+}
 
-test_expect_success 'rev-list F...I' '
-	test_write_lines F G H I >expect &&
-	git rev-list --format=%s F...I |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
+test_ancestry "D..M" "E F G H I J K L M"
 
-test_expect_success 'rev-list --ancestry-path F...I' '
-	test_write_lines F H I >expect &&
-	git rev-list --ancestry-path --format=%s F...I |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
+test_ancestry "--ancestry-path D..M" "E F H I J L M"
 
-# G.t is dropped in an "-s ours" merge
-test_expect_success 'rev-list G..M -- G.t' '
-	git rev-list --format=%s G..M -- G.t |
-	sed -e "/^commit /d" >actual &&
-	test_must_be_empty actual
-'
+test_ancestry "D..M -- M.t" "M"
+test_ancestry "--ancestry-path D..M -- M.t" "M"
 
-test_expect_success 'rev-list --ancestry-path G..M -- G.t' '
-	echo L >expect &&
-	git rev-list --ancestry-path --format=%s G..M -- G.t |
-	sed -e "/^commit /d" >actual &&
-	test_cmp expect actual
-'
+test_ancestry "F...I" "F G H I"
+test_ancestry "--ancestry-path F...I" "F H I"
 
-test_expect_success 'rev-list --ancestry-path --simplify-merges G^..M -- G.t' '
-	test_write_lines G L >expect &&
-	git rev-list --ancestry-path --simplify-merges --format=%s G^..M -- G.t |
-	sed -e "/^commit /d" |
-	sort >actual &&
-	test_cmp expect actual
-'
+test_ancestry "G..M -- G.t" ""
+test_ancestry "--ancestry-path G..M -- G.t" "L"
+test_ancestry "--ancestry-path --simplify-merges G^..M -- G.t" "G L"
 
 #   b---bc
 #  / \ /
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 3/3] revision: allow --ancestry-path to take an argument
  2022-08-19  4:28   ` [PATCH v3 0/3] " Elijah Newren via GitGitGadget
  2022-08-19  4:28     ` [PATCH v3 1/3] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
  2022-08-19  4:28     ` [PATCH v3 2/3] t6019: modernize tests with helper Derrick Stolee via GitGitGadget
@ 2022-08-19  4:28     ` Elijah Newren via GitGitGadget
  2022-08-19 17:54       ` Junio C Hamano
  2022-08-19 12:53     ` [PATCH v3 0/3] Allow " Derrick Stolee
  3 siblings, 1 reply; 29+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-08-19  4:28 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Jonathan Tan, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

We have long allowed users to run e.g.
    git log --ancestry-path master..seen
which shows all commits which satisfy all three of these criteria:
  * are an ancestor of seen
  * are not an ancestor of master
  * have master as an ancestor

This commit allows another variant:
    git log --ancestry-path=$TOPIC master..seen
which shows all commits which satisfy all of these criteria:
  * are an ancestor of seen
  * are not an ancestor of master
  * have $TOPIC in their ancestry-path
that last bullet can be defined as commits meeting any of these
criteria:
    * are an ancestor of $TOPIC
    * have $TOPIC as an ancestor
    * are $TOPIC

This also allows multiple --ancestry-path arguments, which can be
used to find commits with any of the given topics in their ancestry
path.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/rev-list-options.txt | 44 +++++++++++----
 object.h                           |  2 +-
 revision.c                         | 89 ++++++++++++++++++++----------
 revision.h                         |  9 +++
 t/t6019-rev-list-ancestry-path.sh  | 14 ++++-
 5 files changed, 115 insertions(+), 43 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 2f85726745a..aed486dc309 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -389,12 +389,14 @@ Default mode::
 	merges from the resulting history, as there are no selected
 	commits contributing to this merge.
 
---ancestry-path::
+--ancestry-path[=<commit>]::
 	When given a range of commits to display (e.g. 'commit1..commit2'
-	or 'commit2 {caret}commit1'), only display commits that exist
-	directly on the ancestry chain between the 'commit1' and
-	'commit2', i.e. commits that are both descendants of 'commit1',
-	and ancestors of 'commit2'.
+	or 'commit2 {caret}commit1'), only display commits in that range
+	that are ancestors of <commit>, descendants of <commit>, or
+	<commit> itself.  If no commit is specified, use 'commit1' (the
+	excluded part of the range) as <commit>.  Can be passed multiple
+	times; if so, a commit is included if it is any of the commits
+	given or if it is an ancestor or descendant of one of them.
 
 A more detailed explanation follows.
 
@@ -568,11 +570,10 @@ Note the major differences in `N`, `P`, and `Q` over `--full-history`:
 
 There is another simplification mode available:
 
---ancestry-path::
-	Limit the displayed commits to those directly on the ancestry
-	chain between the ``from'' and ``to'' commits in the given commit
-	range. I.e. only display commits that are ancestor of the ``to''
-	commit and descendants of the ``from'' commit.
+--ancestry-path[=<commit>]::
+	Limit the displayed commits to those which are an ancestor of
+	<commit>, or which are a descendant of <commit>, or are <commit>
+	itself.
 +
 As an example use case, consider the following commit history:
 +
@@ -604,6 +605,29 @@ option does. Applied to the 'D..M' range, it results in:
 			       \
 				L--M
 -----------------------------------------------------------------------
++
+We can also use `--ancestry-path=D` instead of `--ancestry-path` which
+means the same thing when applied to the 'D..M' range but is just more
+explicit.
++
+If we instead are interested in a given topic within this range, and all
+commits affected by that topic, we may only want to view the subset of
+`D..M` which contain that topic in their ancestry path.  So, using
+`--ancestry-path=H D..M` for example would result in:
++
+-----------------------------------------------------------------------
+		E
+		 \
+		  G---H---I---J
+			       \
+				L--M
+-----------------------------------------------------------------------
++
+Whereas `--ancestry-path=K D..M` would result in
++
+-----------------------------------------------------------------------
+		K---------------L--M
+-----------------------------------------------------------------------
 
 Before discussing another option, `--show-pulls`, we need to
 create a new example history.
diff --git a/object.h b/object.h
index a2219464c2b..9caef89f1f0 100644
--- a/object.h
+++ b/object.h
@@ -59,7 +59,7 @@ struct object_array {
 
 /*
  * object flag allocation:
- * revision.h:               0---------10         15             23------26
+ * revision.h:               0---------10         15             23------27
  * fetch-pack.c:             01    67
  * negotiator/default.c:       2--5
  * walker.c:                 0-2
diff --git a/revision.c b/revision.c
index 0c6e26cd9c8..19d76da235c 100644
--- a/revision.c
+++ b/revision.c
@@ -1105,7 +1105,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 			   struct commit_list **list, struct prio_queue *queue)
 {
 	struct commit_list *parent = commit->parents;
-	unsigned left_flag;
+	unsigned pass_flags;
 
 	if (commit->object.flags & ADDED)
 		return 0;
@@ -1160,7 +1160,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 	if (revs->no_walk)
 		return 0;
 
-	left_flag = (commit->object.flags & SYMMETRIC_LEFT);
+	pass_flags = (commit->object.flags & (SYMMETRIC_LEFT | ANCESTRY_PATH));
 
 	for (parent = commit->parents; parent; parent = parent->next) {
 		struct commit *p = parent->item;
@@ -1181,7 +1181,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 			if (!*slot)
 				*slot = *revision_sources_at(revs->sources, commit);
 		}
-		p->object.flags |= left_flag;
+		p->object.flags |= pass_flags;
 		if (!(p->object.flags & SEEN)) {
 			p->object.flags |= (SEEN | NOT_USER_GIVEN);
 			if (list)
@@ -1304,13 +1304,24 @@ static int still_interesting(struct commit_list *src, timestamp_t date, int slop
 }
 
 /*
- * "rev-list --ancestry-path A..B" computes commits that are ancestors
- * of B but not ancestors of A but further limits the result to those
- * that are descendants of A.  This takes the list of bottom commits and
- * the result of "A..B" without --ancestry-path, and limits the latter
- * further to the ones that can reach one of the commits in "bottom".
+ * "rev-list --ancestry-path=C_0 [--ancestry-path=C_1 ...] A..B"
+ * computes commits that are ancestors of B but not ancestors of A but
+ * further limits the result to those that have any of C in their
+ * ancestry path (i.e. are either ancestors of any of C, descendants
+ * of any of C, or are any of C). If --ancestry-path is specified with
+ * no commit, we use all bottom commits for C.
+ *
+ * Before this function is called, ancestors of C will have already
+ * been marked with ANCESTRY_PATH previously.
+ *
+ * This takes the list of bottom commits and the result of "A..B"
+ * without --ancestry-path, and limits the latter further to the ones
+ * that have any of C in their ancestry path. Since the ancestors of C
+ * have already been marked (a prerequisite of this function), we just
+ * need to mark the descendants, then exclude any commit that does not
+ * have any of these marks.
  */
-static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *list)
+static void limit_to_ancestry(struct commit_list *bottoms, struct commit_list *list)
 {
 	struct commit_list *p;
 	struct commit_list *rlist = NULL;
@@ -1323,7 +1334,7 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
 	for (p = list; p; p = p->next)
 		commit_list_insert(p->item, &rlist);
 
-	for (p = bottom; p; p = p->next)
+	for (p = bottoms; p; p = p->next)
 		p->item->object.flags |= TMP_MARK;
 
 	/*
@@ -1356,38 +1367,39 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
 	 */
 
 	/*
-	 * The ones that are not marked with TMP_MARK are uninteresting
+	 * The ones that are not marked with either TMP_MARK or
+	 * ANCESTRY_PATH are uninteresting
 	 */
 	for (p = list; p; p = p->next) {
 		struct commit *c = p->item;
-		if (c->object.flags & TMP_MARK)
+		if (c->object.flags & (TMP_MARK | ANCESTRY_PATH))
 			continue;
 		c->object.flags |= UNINTERESTING;
 	}
 
-	/* We are done with the TMP_MARK */
+	/* We are done with TMP_MARK and ANCESTRY_PATH */
 	for (p = list; p; p = p->next)
-		p->item->object.flags &= ~TMP_MARK;
-	for (p = bottom; p; p = p->next)
-		p->item->object.flags &= ~TMP_MARK;
+		p->item->object.flags &= ~(TMP_MARK | ANCESTRY_PATH);
+	for (p = bottoms; p; p = p->next)
+		p->item->object.flags &= ~(TMP_MARK | ANCESTRY_PATH);
 	free_commit_list(rlist);
 }
 
 /*
- * Before walking the history, keep the set of "negative" refs the
- * caller has asked to exclude.
+ * Before walking the history, add the set of "negative" refs the
+ * caller has asked to exclude to the bottom list.
  *
  * This is used to compute "rev-list --ancestry-path A..B", as we need
  * to filter the result of "A..B" further to the ones that can actually
  * reach A.
  */
-static struct commit_list *collect_bottom_commits(struct commit_list *list)
+static void collect_bottom_commits(struct commit_list *list,
+				   struct commit_list **bottom)
 {
-	struct commit_list *elem, *bottom = NULL;
+	struct commit_list *elem;
 	for (elem = list; elem; elem = elem->next)
 		if (elem->item->object.flags & BOTTOM)
-			commit_list_insert(elem->item, &bottom);
-	return bottom;
+			commit_list_insert(elem->item, bottom);
 }
 
 /* Assumes either left_only or right_only is set */
@@ -1414,12 +1426,12 @@ static int limit_list(struct rev_info *revs)
 	struct commit_list *original_list = revs->commits;
 	struct commit_list *newlist = NULL;
 	struct commit_list **p = &newlist;
-	struct commit_list *bottom = NULL;
 	struct commit *interesting_cache = NULL;
 
-	if (revs->ancestry_path) {
-		bottom = collect_bottom_commits(original_list);
-		if (!bottom)
+	if (revs->ancestry_path_implicit_bottoms) {
+		collect_bottom_commits(original_list,
+				       &revs->ancestry_path_bottoms);
+		if (!revs->ancestry_path_bottoms)
 			die("--ancestry-path given but there are no bottom commits");
 	}
 
@@ -1464,9 +1476,8 @@ static int limit_list(struct rev_info *revs)
 	if (revs->left_only || revs->right_only)
 		limit_left_right(newlist, revs);
 
-	if (bottom)
-		limit_to_ancestry(bottom, newlist);
-	free_commit_list(bottom);
+	if (revs->ancestry_path)
+		limit_to_ancestry(revs->ancestry_path_bottoms, newlist);
 
 	/*
 	 * Check if any commits have become TREESAME by some of their parents
@@ -2213,7 +2224,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 			       const struct setup_revision_opt* opt)
 {
 	const char *arg = argv[0];
-	const char *optarg;
+	const char *optarg = NULL;
 	int argcount;
 	const unsigned hexsz = the_hash_algo->hexsz;
 
@@ -2284,6 +2295,23 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->ancestry_path = 1;
 		revs->simplify_history = 0;
 		revs->limited = 1;
+		revs->ancestry_path_implicit_bottoms = 1;
+	} else if (skip_prefix(arg, "--ancestry-path=", &optarg)) {
+		struct commit *c;
+		struct object_id oid;
+		const char *msg = _("could not get commit for ancestry-path argument %s");
+
+		revs->ancestry_path = 1;
+		revs->simplify_history = 0;
+		revs->limited = 1;
+
+		if (repo_get_oid_committish(revs->repo, optarg, &oid))
+			return error(msg, optarg);
+		get_reference(revs, optarg, &oid, ANCESTRY_PATH);
+		c = lookup_commit_reference(revs->repo, &oid);
+		if (!c)
+			return error(msg, optarg);
+		commit_list_insert(c, &revs->ancestry_path_bottoms);
 	} else if (!strcmp(arg, "-g") || !strcmp(arg, "--walk-reflogs")) {
 		init_reflog_walk(&revs->reflog_info);
 	} else if (!strcmp(arg, "--default")) {
@@ -2991,6 +3019,7 @@ static void release_revisions_topo_walk_info(struct topo_walk_info *info);
 void release_revisions(struct rev_info *revs)
 {
 	free_commit_list(revs->commits);
+	free_commit_list(revs->ancestry_path_bottoms);
 	object_array_clear(&revs->pending);
 	object_array_clear(&revs->boundary_commits);
 	release_revisions_cmdline(&revs->cmdline);
diff --git a/revision.h b/revision.h
index e576845cdd1..a10520300ce 100644
--- a/revision.h
+++ b/revision.h
@@ -48,6 +48,7 @@
  */
 #define NOT_USER_GIVEN	(1u<<25)
 #define TRACK_LINEAR	(1u<<26)
+#define ANCESTRY_PATH	(1u<<27)
 #define ALL_REV_FLAGS	(((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR | PULL_MERGE)
 
 #define DECORATE_SHORT_REFS	1
@@ -164,6 +165,13 @@ struct rev_info {
 			cherry_mark:1,
 			bisect:1,
 			ancestry_path:1,
+
+			/* True if --ancestry-path was specified without an
+			 * argument. The bottom revisions are implicitly
+			 * the arguments in this case.
+			 */
+			ancestry_path_implicit_bottoms:1,
+
 			first_parent_only:1,
 			exclude_first_parent_only:1,
 			line_level_traverse:1,
@@ -306,6 +314,7 @@ struct rev_info {
 	struct saved_parents *saved_parents_slab;
 
 	struct commit_list *previous_parents;
+	struct commit_list *ancestry_path_bottoms;
 	const char *break_bar;
 
 	struct revision_sources *sources;
diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
index 5bd787a3c0a..738da23628b 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -8,8 +8,13 @@ test_description='--ancestry-path'
 #   /                     \
 #  A-------K---------------L--M
 #
-#  D..M                 == E F G H I J K L M
-#  --ancestry-path D..M == E F H I J L M
+#  D..M                                     == E F G H I J K L M
+#  --ancestry-path                     D..M == E F   H I J   L M
+#  --ancestry-path=F                   D..M == E F       J   L M
+#  --ancestry-path=G                   D..M ==     G H I J   L M
+#  --ancestry-path=H                   D..M == E   G H I J   L M
+#  --ancestry-path=K                   D..M ==             K L M
+#  --ancestry-path=K --ancestry-path=F D..M == E F       J K L M
 #
 #  D..M -- M.t                 == M
 #  --ancestry-path D..M -- M.t == M
@@ -70,6 +75,11 @@ test_ancestry () {
 test_ancestry "D..M" "E F G H I J K L M"
 
 test_ancestry "--ancestry-path D..M" "E F H I J L M"
+test_ancestry "--ancestry-path=F D..M" "E F J L M"
+test_ancestry "--ancestry-path=G D..M" "G H I J L M"
+test_ancestry "--ancestry-path=H D..M" "E G H I J L M"
+test_ancestry "--ancestry-path=K D..M" "K L M"
+test_ancestry "--ancestry-path=F --ancestry-path=K D..M" "E F J K L M"
 
 test_ancestry "D..M -- M.t" "M"
 test_ancestry "--ancestry-path D..M -- M.t" "M"
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/3] Allow --ancestry-path to take an argument
  2022-08-19  4:28   ` [PATCH v3 0/3] " Elijah Newren via GitGitGadget
                       ` (2 preceding siblings ...)
  2022-08-19  4:28     ` [PATCH v3 3/3] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
@ 2022-08-19 12:53     ` Derrick Stolee
  2022-08-19 21:08       ` Junio C Hamano
  3 siblings, 1 reply; 29+ messages in thread
From: Derrick Stolee @ 2022-08-19 12:53 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget, git
  Cc: Elijah Newren, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, Jonathan Tan

On 8/19/2022 12:28 AM, Elijah Newren via GitGitGadget wrote:
> Changes since v2:
> 
>  * Incorporated Stolee's suggested patch as a preliminary cleanup of t6019.
>    Slightly modified his patch by:
>    * fixing the "repetitive" typo pointed out by Eric
>    * removed the unnecessary "return 1" pointed out by Ævar
>    * switched "rev-list" to "log" since we are using --format anyway, in
>      order to remove the need to call "sed" afterward

Thanks! I appreciate you fixing all of my mistakes and
incorporating all of the existing feedback. It also looks
nicer as your patch 2 than as a third on top.

This version looks good to me.

-Stolee

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] revision: allow --ancestry-path to take an argument
  2022-08-19  1:23       ` Elijah Newren
@ 2022-08-19 17:25         ` Jonathan Tan
  0 siblings, 0 replies; 29+ messages in thread
From: Jonathan Tan @ 2022-08-19 17:25 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Jonathan Tan, Elijah Newren via GitGitGadget, Git Mailing List

Elijah Newren <newren@gmail.com> writes:
> > Optional: Besides that, from what I can tell, sometimes the C commits
> > themselves are marked with ANCESTRY_PATH (when they are explicitly
> > specified) and sometimes they are not (when they are not explicitly
> > specified). It's not a bug here, but it might be worth handling that in
> > the ancestry_path_need_bottoms codepath (instead of explicitly setting
> > TMP_MARK on the bottoms in limit_to_ancestry() - that way, I think we
> > can also use ANCESTRY_PATH instead of TMP_MARK throughout the ancestry
> > path codepaths, but I haven't tested it), at least to prevent possible
> > future bugs.
> 
> That sounds like you're trying to duplicate the bug in my first
> attempt at this patch.  If you try to coalesce ANCESTRY_PATH and
> TMP_MARK, then you not only get all descendants of C, you also get all
> descendants of any ancestor of C, which defeats the whole point of my
> changes.

Ah, yes you're right.

> It's true that I don't mark implicit C commits with ANCESTRY_PATH, but
> those are always bottom commits that are the excluded end of a range
> anyway.  While those could be marked without causing problems, it
> would always be a waste of effort.

Yes, that's true.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/3] revision: allow --ancestry-path to take an argument
  2022-08-19  4:28     ` [PATCH v3 3/3] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
@ 2022-08-19 17:54       ` Junio C Hamano
  2022-08-20  0:10         ` Elijah Newren
  0 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2022-08-19 17:54 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Elijah Newren, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Jonathan Tan

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  			ancestry_path:1,
> +
> +			/* True if --ancestry-path was specified without an
> +			 * argument. The bottom revisions are implicitly
> +			 * the arguments in this case.
> +			 */
> +			ancestry_path_implicit_bottoms:1,

The above comment explains why this is called "implicit" very well.
The earlier round used "need", but the word missed the essense (i.e.
ancestry_path argument is implicit and not given, so we need to
compute bottoms and use them instead).  The new name is certainly
better.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/3] Allow --ancestry-path to take an argument
  2022-08-19 12:53     ` [PATCH v3 0/3] Allow " Derrick Stolee
@ 2022-08-19 21:08       ` Junio C Hamano
  2022-08-20  0:13         ` Elijah Newren
  0 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2022-08-19 21:08 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Elijah Newren via GitGitGadget, git, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Jonathan Tan

Derrick Stolee <derrickstolee@github.com> writes:

> On 8/19/2022 12:28 AM, Elijah Newren via GitGitGadget wrote:
>> Changes since v2:
>> 
>>  * Incorporated Stolee's suggested patch as a preliminary cleanup of t6019.
>>    Slightly modified his patch by:
>>    * fixing the "repetitive" typo pointed out by Eric
>>    * removed the unnecessary "return 1" pointed out by Ævar
>>    * switched "rev-list" to "log" since we are using --format anyway, in
>>      order to remove the need to call "sed" afterward
>
> Thanks! I appreciate you fixing all of my mistakes and
> incorporating all of the existing feedback. It also looks
> nicer as your patch 2 than as a third on top.
>
> This version looks good to me.

Yup, it looks good to me, too.  Thanks, both.

Let's mark it for 'next' and merge it down soonish.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/3] revision: allow --ancestry-path to take an argument
  2022-08-19 17:54       ` Junio C Hamano
@ 2022-08-20  0:10         ` Elijah Newren
  0 siblings, 0 replies; 29+ messages in thread
From: Elijah Newren @ 2022-08-20  0:10 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Jonathan Tan

On Fri, Aug 19, 2022 at 10:54 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> >                       ancestry_path:1,
> > +
> > +                     /* True if --ancestry-path was specified without an
> > +                      * argument. The bottom revisions are implicitly
> > +                      * the arguments in this case.
> > +                      */
> > +                     ancestry_path_implicit_bottoms:1,
>
> The above comment explains why this is called "implicit" very well.
> The earlier round used "need", but the word missed the essense (i.e.
> ancestry_path argument is implicit and not given, so we need to
> compute bottoms and use them instead).  The new name is certainly
> better.

Yep, I agree, but all credit for this improvement goes to Jonathan;
this was one of many good suggestions in his careful review.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 0/3] Allow --ancestry-path to take an argument
  2022-08-19 21:08       ` Junio C Hamano
@ 2022-08-20  0:13         ` Elijah Newren
  0 siblings, 0 replies; 29+ messages in thread
From: Elijah Newren @ 2022-08-20  0:13 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee, Elijah Newren via GitGitGadget, Git Mailing List,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Jonathan Tan

On Fri, Aug 19, 2022 at 2:08 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Derrick Stolee <derrickstolee@github.com> writes:
>
> > On 8/19/2022 12:28 AM, Elijah Newren via GitGitGadget wrote:
> >> Changes since v2:
> >>
> >>  * Incorporated Stolee's suggested patch as a preliminary cleanup of t6019.
> >>    Slightly modified his patch by:
> >>    * fixing the "repetitive" typo pointed out by Eric
> >>    * removed the unnecessary "return 1" pointed out by Ævar
> >>    * switched "rev-list" to "log" since we are using --format anyway, in
> >>      order to remove the need to call "sed" afterward
> >
> > Thanks! I appreciate you fixing all of my mistakes and
> > incorporating all of the existing feedback. It also looks
> > nicer as your patch 2 than as a third on top.
> >
> > This version looks good to me.
>
> Yup, it looks good to me, too.  Thanks, both.

And thanks to all the other reviewers as well!  Good suggestions came
in from several different folks.

> Let's mark it for 'next' and merge it down soonish.

Sounds good.

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2022-08-20  0:13 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-17  2:48 [PATCH 0/2] Allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
2022-08-17  2:48 ` [PATCH 1/2] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
2022-08-17  2:48 ` [PATCH 2/2] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
2022-08-17 22:42   ` Junio C Hamano
2022-08-18  4:01     ` Elijah Newren
2022-08-18  6:17 ` [PATCH v2 0/2] Allow " Elijah Newren via GitGitGadget
2022-08-18  6:17   ` [PATCH v2 1/2] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
2022-08-18  6:17   ` [PATCH v2 2/2] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
2022-08-18 15:30     ` Derrick Stolee
2022-08-18 15:50       ` Ævar Arnfjörð Bjarmason
2022-08-18 16:51         ` Derrick Stolee
2022-08-18 16:56         ` Eric Sunshine
2022-08-19  1:12         ` Elijah Newren
2022-08-19  2:45           ` Ævar Arnfjörð Bjarmason
2022-08-18 16:53       ` Eric Sunshine
2022-08-19  1:01       ` Elijah Newren
2022-08-18 22:24     ` Jonathan Tan
2022-08-19  1:23       ` Elijah Newren
2022-08-19 17:25         ` Jonathan Tan
2022-08-18 16:32   ` [PATCH v2 0/2] Allow " Junio C Hamano
2022-08-19  4:28   ` [PATCH v3 0/3] " Elijah Newren via GitGitGadget
2022-08-19  4:28     ` [PATCH v3 1/3] rev-list-options.txt: fix simple typo Elijah Newren via GitGitGadget
2022-08-19  4:28     ` [PATCH v3 2/3] t6019: modernize tests with helper Derrick Stolee via GitGitGadget
2022-08-19  4:28     ` [PATCH v3 3/3] revision: allow --ancestry-path to take an argument Elijah Newren via GitGitGadget
2022-08-19 17:54       ` Junio C Hamano
2022-08-20  0:10         ` Elijah Newren
2022-08-19 12:53     ` [PATCH v3 0/3] Allow " Derrick Stolee
2022-08-19 21:08       ` Junio C Hamano
2022-08-20  0:13         ` Elijah Newren

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).