All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] rev-list: allow missing tips with --missing
@ 2024-02-08 13:50 Christian Couder
  2024-02-08 13:50 ` [PATCH v2 1/4] revision: clarify a 'return NULL' in get_reference() Christian Couder
                   ` (5 more replies)
  0 siblings, 6 replies; 30+ messages in thread
From: Christian Couder @ 2024-02-08 13:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Linus Arver,
	Christian Couder

# Intro

A recent patch series, kn/rev-list-missing-fix [1] extended the
`--missing` option in git-rev-list(1) to support commit objects.

Unfortunately, git-rev-list(1) with `--missing` set to something other
than 'error' still fails, usually with a "fatal: bad object <oid>"
error message, when a missing object is passed as an argument.

This patch series removes this limitation and when using
`--missing=print` allows all missing objects to be printed including
those that are passed as arguments.

[1] https://lore.kernel.org/git/20231026101109.43110-1-karthik.188@gmail.com/

# Patch overview

Patches 1/4 (revision: clarify a 'return NULL' in get_reference()),
2/4 (oidset: refactor oidset_insert_from_set()) and 
3/4 (t6022: fix 'test' style and 'even though' typo) are very small
preparatory cleanups.

Patch 4/4 (rev-list: allow missing tips with --missing=[print|allow*])
allows git-rev-list(1) with `--missing=<arg>` when <arg> is not
'error' to not fail if some tips it is passed are missing.

# Changes since V1

Thanks to Linus Arver, Eric Sunshine and Junio who commented on V1!
The changes since V1 are the following:

  - In patch 1/4 (revision: clarify a 'return NULL' in
    get_reference()), some 's/explicitely/explicitly/' typos were fixed
    in the commit message. Thanks to a suggestion from Eric. 

  - Patch 2/4 (oidset: refactor oidset_insert_from_set()) is new. It
    refactors some code into "oidset.{c,h}" to avoid duplicating code
    to copy elements from one oidset into another one. Thanks to a
    suggestion from Linus.

  - Patch 3/4 (t6022: fix 'test' style and 'even though' typo) used to
    fix only an 'even though' typo, but while at it I made it use
    `if test ...` instead of `if [ ... ]` too. Thanks to a suggestion
    from Junio.

  - Patch 4/4 (rev-list: allow missing tips with
    --missing=[print|allow*]) was changed so that missing tips are
    always allowed when `--missing=[print|allow*]` is used, as
    suggested by Junio. So:
      - no new `--allow-missing-tips` option is implemented,
      - no ugly early parsing loop is added,
      - no new 'do_not_die_on_missing_tips' flag is added into
        'struct rev_info',
      - the 'do_not_die_on_missing_objects' is used more instead,
      - the commit message as been changed accordingly.

  - In patch 4/4 (rev-list: allow missing tips with
    --missing=[print|allow*]), a code comment has been added before
    `if (get_oid_with_context(...))` in "revision.c::get_reference()"
    as suggested by Linus.

  - In patch 4/4 (rev-list: allow missing tips with
    --missing=[print|allow*]), a big NEEDSWORK comment has been added
    before the ugly early parsing loops for the
    `--exclude-promisor-objects` and `--missing=...` options in
    "builtin/rev-list.c".

  - In patch 4/4 (rev-list: allow missing tips with
    --missing=[print|allow*]), a code comment now uses "missing tips"
    instead of "missing commits" in "builtin/rev-list.c", as
    suggested by Linus.
    
  - In patch 4/4 (rev-list: allow missing tips with
    --missing=[print|allow*]), the added tests in t6022 have the
    following changes:
      - variables 'obj' and 'tip' have been renamed to 'missing_tip'
        and 'existing_tip' respectively as suggested by Linus,
      - a comment explaining how the 'existing_tip' variable is useful
        has been added as suggested by Linus,
      - `if test ...` is used instead of `if [ ... ]`, as suggested by
        Junio.

  - Also in patch 4/4 (rev-list: allow missing tips with
    --missing=[print|allow*]), the documentation of the `--missing=...`
    option has been improved to talk about missing tips.

# Range-diff since V1

Unfortunately there has been many changes in patch 4/4, so the
range-diff shows different patches:

1:  b8abbc1d42 ! 1:  5233a83181 revision: clarify a 'return NULL' in get_reference()
    @@ Commit message
         revision: clarify a 'return NULL' in get_reference()
     
         In general when we know a pointer variable is NULL, it's clearer to
    -    explicitely return NULL than to return that variable.
    +    explicitly return NULL than to return that variable.
     
         In get_reference() when 'object' is NULL, we already return NULL
         when 'revs->exclude_promisor_objects && is_promisor_object(oid)' is
         true, but we return 'object' when 'revs->ignore_missing' is true.
     
    -    Let's make the code clearer and more uniform by also explicitely
    +    Let's make the code clearer and more uniform by also explicitly
         returning NULL when 'revs->ignore_missing' is true.
     
    +    Helped-by: Eric Sunshine <sunshine@sunshineco.com>
         Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
     
      ## revision.c ##
-:  ---------- > 2:  cfa72c8cf1 oidset: refactor oidset_insert_from_set()
2:  208d43eb81 ! 3:  5668340516 t6022: fix 'even though' typo in comment
    @@ Metadata
     Author: Christian Couder <chriscool@tuxfamily.org>
     
      ## Commit message ##
    -    t6022: fix 'even though' typo in comment
    +    t6022: fix 'test' style and 'even though' typo
     
         Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
     
    @@ t/t6022-rev-list-missing.sh: do
     -                  # Blobs are shared by all commits, so evethough a commit/tree
     +                  # Blobs are shared by all commits, so even though a commit/tree
                        # might be skipped, its blob must be accounted for.
    -                   if [ $obj != "HEAD:1.t" ]; then
    +-                  if [ $obj != "HEAD:1.t" ]; then
    ++                  if test $obj != "HEAD:1.t"
    ++                  then
                                echo $(git rev-parse HEAD:1.t) >>expect.raw &&
    +                           echo $(git rev-parse HEAD:2.t) >>expect.raw
    +                   fi &&
3:  8be34ce359 < -:  ---------- rev-list: add --allow-missing-tips to be used with --missing=...
-:  ---------- > 4:  55792110ca rev-list: allow missing tips with --missing=[print|allow*]


Christian Couder (4):
  revision: clarify a 'return NULL' in get_reference()
  oidset: refactor oidset_insert_from_set()
  t6022: fix 'test' style and 'even though' typo
  rev-list: allow missing tips with --missing=[print|allow*]

 Documentation/rev-list-options.txt |  4 ++
 builtin/rev-list.c                 | 15 +++++++-
 list-objects-filter.c              | 11 +-----
 oidset.c                           | 10 +++++
 oidset.h                           |  6 +++
 revision.c                         | 16 ++++++--
 t/t6022-rev-list-missing.sh        | 61 +++++++++++++++++++++++++++++-
 7 files changed, 106 insertions(+), 17 deletions(-)

-- 
2.43.0.565.g97b5fd12a3.dirty


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 1/4] revision: clarify a 'return NULL' in get_reference()
  2024-02-08 13:50 [PATCH v2 0/4] rev-list: allow missing tips with --missing Christian Couder
@ 2024-02-08 13:50 ` Christian Couder
  2024-02-08 13:50 ` [PATCH v2 2/4] oidset: refactor oidset_insert_from_set() Christian Couder
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2024-02-08 13:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Linus Arver,
	Christian Couder, Eric Sunshine, Christian Couder

In general when we know a pointer variable is NULL, it's clearer to
explicitly return NULL than to return that variable.

In get_reference() when 'object' is NULL, we already return NULL
when 'revs->exclude_promisor_objects && is_promisor_object(oid)' is
true, but we return 'object' when 'revs->ignore_missing' is true.

Let's make the code clearer and more uniform by also explicitly
returning NULL when 'revs->ignore_missing' is true.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 revision.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index 2424c9bd67..4c5cd7c3ce 100644
--- a/revision.c
+++ b/revision.c
@@ -385,7 +385,7 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
 
 	if (!object) {
 		if (revs->ignore_missing)
-			return object;
+			return NULL;
 		if (revs->exclude_promisor_objects && is_promisor_object(oid))
 			return NULL;
 		die("bad object %s", name);
-- 
2.43.0.565.g97b5fd12a3.dirty


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 2/4] oidset: refactor oidset_insert_from_set()
  2024-02-08 13:50 [PATCH v2 0/4] rev-list: allow missing tips with --missing Christian Couder
  2024-02-08 13:50 ` [PATCH v2 1/4] revision: clarify a 'return NULL' in get_reference() Christian Couder
@ 2024-02-08 13:50 ` Christian Couder
  2024-02-08 17:33   ` Junio C Hamano
  2024-02-13 21:02   ` Linus Arver
  2024-02-08 13:50 ` [PATCH v2 3/4] t6022: fix 'test' style and 'even though' typo Christian Couder
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 30+ messages in thread
From: Christian Couder @ 2024-02-08 13:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Linus Arver,
	Christian Couder, Christian Couder

In a following commit, we will need to add all the oids from a set into
another set. In "list-objects-filter.c", there is already a static
function called add_all() to do that.

Let's rename this function oidset_insert_from_set() and move it into
oidset.{c,h} to make it generally available.

While at it, let's remove a useless `!= NULL`.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 list-objects-filter.c | 11 +----------
 oidset.c              | 10 ++++++++++
 oidset.h              |  6 ++++++
 3 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/list-objects-filter.c b/list-objects-filter.c
index da287cc8e0..4346f8da45 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -711,15 +711,6 @@ static void filter_combine__free(void *filter_data)
 	free(d);
 }
 
-static void add_all(struct oidset *dest, struct oidset *src) {
-	struct oidset_iter iter;
-	struct object_id *src_oid;
-
-	oidset_iter_init(src, &iter);
-	while ((src_oid = oidset_iter_next(&iter)) != NULL)
-		oidset_insert(dest, src_oid);
-}
-
 static void filter_combine__finalize_omits(
 	struct oidset *omits,
 	void *filter_data)
@@ -728,7 +719,7 @@ static void filter_combine__finalize_omits(
 	size_t sub;
 
 	for (sub = 0; sub < d->nr; sub++) {
-		add_all(omits, &d->sub[sub].omits);
+		oidset_insert_from_set(omits, &d->sub[sub].omits);
 		oidset_clear(&d->sub[sub].omits);
 	}
 }
diff --git a/oidset.c b/oidset.c
index d1e5376316..91d1385910 100644
--- a/oidset.c
+++ b/oidset.c
@@ -23,6 +23,16 @@ int oidset_insert(struct oidset *set, const struct object_id *oid)
 	return !added;
 }
 
+void oidset_insert_from_set(struct oidset *dest, struct oidset *src)
+{
+	struct oidset_iter iter;
+	struct object_id *src_oid;
+
+	oidset_iter_init(src, &iter);
+	while ((src_oid = oidset_iter_next(&iter)))
+		oidset_insert(dest, src_oid);
+}
+
 int oidset_remove(struct oidset *set, const struct object_id *oid)
 {
 	khiter_t pos = kh_get_oid_set(&set->set, *oid);
diff --git a/oidset.h b/oidset.h
index ba4a5a2cd3..262f4256d6 100644
--- a/oidset.h
+++ b/oidset.h
@@ -47,6 +47,12 @@ int oidset_contains(const struct oidset *set, const struct object_id *oid);
  */
 int oidset_insert(struct oidset *set, const struct object_id *oid);
 
+/**
+ * Insert all the oids that are in set 'src' into set 'dest'; a copy
+ * is made of each oid inserted into set 'dest'.
+ */
+void oidset_insert_from_set(struct oidset *dest, struct oidset *src);
+
 /**
  * Remove the oid from the set.
  *
-- 
2.43.0.565.g97b5fd12a3.dirty


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 3/4] t6022: fix 'test' style and 'even though' typo
  2024-02-08 13:50 [PATCH v2 0/4] rev-list: allow missing tips with --missing Christian Couder
  2024-02-08 13:50 ` [PATCH v2 1/4] revision: clarify a 'return NULL' in get_reference() Christian Couder
  2024-02-08 13:50 ` [PATCH v2 2/4] oidset: refactor oidset_insert_from_set() Christian Couder
@ 2024-02-08 13:50 ` Christian Couder
  2024-02-08 13:50 ` [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*] Christian Couder
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2024-02-08 13:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Linus Arver,
	Christian Couder, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t6022-rev-list-missing.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
index 211672759a..5f1be7abb5 100755
--- a/t/t6022-rev-list-missing.sh
+++ b/t/t6022-rev-list-missing.sh
@@ -46,9 +46,10 @@ do
 			git rev-list --objects --no-object-names \
 				HEAD ^$obj >expect.raw &&
 
-			# Blobs are shared by all commits, so evethough a commit/tree
+			# Blobs are shared by all commits, so even though a commit/tree
 			# might be skipped, its blob must be accounted for.
-			if [ $obj != "HEAD:1.t" ]; then
+			if test $obj != "HEAD:1.t"
+			then
 				echo $(git rev-parse HEAD:1.t) >>expect.raw &&
 				echo $(git rev-parse HEAD:2.t) >>expect.raw
 			fi &&
-- 
2.43.0.565.g97b5fd12a3.dirty


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*]
  2024-02-08 13:50 [PATCH v2 0/4] rev-list: allow missing tips with --missing Christian Couder
                   ` (2 preceding siblings ...)
  2024-02-08 13:50 ` [PATCH v2 3/4] t6022: fix 'test' style and 'even though' typo Christian Couder
@ 2024-02-08 13:50 ` Christian Couder
  2024-02-08 17:44   ` Junio C Hamano
  2024-02-13 22:33   ` Linus Arver
  2024-02-08 23:15 ` [PATCH v2 0/4] rev-list: allow missing tips with --missing Junio C Hamano
  2024-02-14 14:25 ` [PATCH v3 0/5] " Christian Couder
  5 siblings, 2 replies; 30+ messages in thread
From: Christian Couder @ 2024-02-08 13:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Linus Arver,
	Christian Couder, Christian Couder

In 9830926c7d (rev-list: add commit object support in `--missing`
option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
so that it works with with missing commits, not just blobs/trees.

Unfortunately, such a command would still fail with a "fatal: bad
object <oid>" if it is passed a missing commit, blob or tree as an
argument (before the rev walking even begins).

When such a command is used to find the dependencies of some objects,
for example the dependencies of quarantined objects (see the
"QUARANTINE ENVIRONMENT" section in the git-receive-pack(1)
documentation), it would be better if the command would instead
consider such missing objects, especially commits, in the same way as
other missing objects.

If, for example `--missing=print` is used, it would be nice for some
use cases if the missing tips passed as arguments were reported in
the same way as other missing objects instead of the command just
failing.

We could introduce a new option to make it work like this, but most
users are likely to prefer the command to have this behavior as the
default one. Introducing a new option would require another dumb loop
to look for that options early, which isn't nice.

Also we made `git rev-list` work with missing commits very recently
and the command is most often passed commits as arguments. So let's
consider this as a bug fix related to these previous change.

While at it let's add a NEEDSWORK comment to say that we should get
rid of the existing ugly dumb loops that parse the
`--exclude-promisor-objects` and `--missing=...` options early.

Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/rev-list-options.txt |  4 +++
 builtin/rev-list.c                 | 15 +++++++-
 revision.c                         | 14 ++++++--
 t/t6022-rev-list-missing.sh        | 56 ++++++++++++++++++++++++++++++
 4 files changed, 85 insertions(+), 4 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index a583b52c61..bb753b6292 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -1019,6 +1019,10 @@ Unexpected missing objects will raise an error.
 +
 The form '--missing=print' is like 'allow-any', but will also print a
 list of the missing objects.  Object IDs are prefixed with a ``?'' character.
++
+If some tips passed to the traversal are missing, they will be
+considered as missing too, and the traversal will ignore them. In case
+we cannot get their Object ID though, an error will be raised.
 
 --exclude-promisor-objects::
 	(For internal use only.)  Prefilter object traversal at
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index b3f4783858..ec9556f135 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -545,6 +545,15 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	 *
 	 * Let "--missing" to conditionally set fetch_if_missing.
 	 */
+	/*
+	 * NEEDSWORK: These dump loops to look for some options early
+	 * are ugly. We really need setup_revisions() to have a
+	 * mechanism to allow and disallow some sets of options for
+	 * different commands (like rev-list, replay, etc). Such
+	 * mechanism should do an early parsing of option and be able
+	 * to manage the `--exclude-promisor-objects` and `--missing=...`
+	 * options below.
+	 */
 	for (i = 1; i < argc; i++) {
 		const char *arg = argv[i];
 		if (!strcmp(arg, "--exclude-promisor-objects")) {
@@ -753,8 +762,12 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 
 	if (arg_print_omitted)
 		oidset_init(&omitted_objects, DEFAULT_OIDSET_SIZE);
-	if (arg_missing_action == MA_PRINT)
+	if (arg_missing_action == MA_PRINT) {
 		oidset_init(&missing_objects, DEFAULT_OIDSET_SIZE);
+		/* Already add missing tips */
+		oidset_insert_from_set(&missing_objects, &revs.missing_commits);
+		oidset_clear(&revs.missing_commits);
+	}
 
 	traverse_commit_list_filtered(
 		&revs, show_commit, show_object, &info,
diff --git a/revision.c b/revision.c
index 4c5cd7c3ce..80c349d347 100644
--- a/revision.c
+++ b/revision.c
@@ -388,6 +388,10 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
 			return NULL;
 		if (revs->exclude_promisor_objects && is_promisor_object(oid))
 			return NULL;
+		if (revs->do_not_die_on_missing_objects) {
+			oidset_insert(&revs->missing_commits, oid);
+			return NULL;
+		}
 		die("bad object %s", name);
 	}
 	object->flags |= flags;
@@ -1947,6 +1951,7 @@ void repo_init_revisions(struct repository *r,
 	init_display_notes(&revs->notes_opt);
 	list_objects_filter_init(&revs->filter);
 	init_ref_exclusions(&revs->ref_excludes);
+	oidset_init(&revs->missing_commits, 0);
 }
 
 static void add_pending_commit_list(struct rev_info *revs,
@@ -2178,13 +2183,18 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl
 	if (revarg_opt & REVARG_COMMITTISH)
 		get_sha1_flags |= GET_OID_COMMITTISH;
 
+	/*
+	 * Even if revs->do_not_die_on_missing_objects is set, we
+	 * should error out if we can't even get an oid, as
+	 * `--missing=print` should be able to report missing oids.
+	 */
 	if (get_oid_with_context(revs->repo, arg, get_sha1_flags, &oid, &oc))
 		return revs->ignore_missing ? 0 : -1;
 	if (!cant_be_filename)
 		verify_non_filename(revs->prefix, arg);
 	object = get_reference(revs, arg, &oid, flags ^ local_flags);
 	if (!object)
-		return revs->ignore_missing ? 0 : -1;
+		return (revs->ignore_missing || revs->do_not_die_on_missing_objects) ? 0 : -1;
 	add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
 	add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
 	free(oc.path);
@@ -3830,8 +3840,6 @@ int prepare_revision_walk(struct rev_info *revs)
 				       FOR_EACH_OBJECT_PROMISOR_ONLY);
 	}
 
-	oidset_init(&revs->missing_commits, 0);
-
 	if (!revs->reflog_info)
 		prepare_to_use_bloom_filter(revs);
 	if (!revs->unsorted_input)
diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
index 5f1be7abb5..78387eebb3 100755
--- a/t/t6022-rev-list-missing.sh
+++ b/t/t6022-rev-list-missing.sh
@@ -78,4 +78,60 @@ do
 	done
 done
 
+for missing_tip in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
+do
+	# We want to check that things work when both
+	#   - all the tips passed are missing (case existing_tip = ""), and
+	#   - there is one missing tip and one existing tip (case existing_tip = "HEAD")
+	for existing_tip in "" "HEAD"
+	do
+		for action in "allow-any" "print"
+		do
+			test_expect_success "--missing=$action with tip '$missing_tip' missing and tip '$existing_tip'" '
+				oid="$(git rev-parse $missing_tip)" &&
+				path=".git/objects/$(test_oid_to_path $oid)" &&
+
+				# Before the object is made missing, we use rev-list to
+				# get the expected oids.
+				if test "$existing_tip" = "HEAD"
+				then
+					git rev-list --objects --no-object-names \
+						HEAD ^$missing_tip >expect.raw
+				else
+					>expect.raw
+				fi &&
+
+				# Blobs are shared by all commits, so even though a commit/tree
+				# might be skipped, its blob must be accounted for.
+				if test "$existing_tip" = "HEAD" && test $missing_tip != "HEAD:1.t"
+				then
+					echo $(git rev-parse HEAD:1.t) >>expect.raw &&
+					echo $(git rev-parse HEAD:2.t) >>expect.raw
+				fi &&
+
+				mv "$path" "$path.hidden" &&
+				test_when_finished "mv $path.hidden $path" &&
+
+				git rev-list --missing=$action --objects --no-object-names \
+				     $oid $existing_tip >actual.raw &&
+
+				# When the action is to print, we should also add the missing
+				# oid to the expect list.
+				case $action in
+				allow-any)
+					;;
+				print)
+					grep ?$oid actual.raw &&
+					echo ?$oid >>expect.raw
+					;;
+				esac &&
+
+				sort actual.raw >actual &&
+				sort expect.raw >expect &&
+				test_cmp expect actual
+			'
+		done
+	done
+done
+
 test_done
-- 
2.43.0.565.g97b5fd12a3.dirty


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 2/4] oidset: refactor oidset_insert_from_set()
  2024-02-08 13:50 ` [PATCH v2 2/4] oidset: refactor oidset_insert_from_set() Christian Couder
@ 2024-02-08 17:33   ` Junio C Hamano
  2024-02-13 21:02   ` Linus Arver
  1 sibling, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2024-02-08 17:33 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Patrick Steinhardt, John Cai, Linus Arver, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> In a following commit, we will need to add all the oids from a set into
> another set. In "list-objects-filter.c", there is already a static
> function called add_all() to do that.
>
> Let's rename this function oidset_insert_from_set() and move it into
> oidset.{c,h} to make it generally available.
>
> While at it, let's remove a useless `!= NULL`.

Makes sense.  

My initial reaction was that copying underlying bits may even be
more efficient, but that was only because silly-me did not realize
that dst can be non-empty when the operation starts.  The name of
the function is not "copy oidset" but "insert from set" that makes
it crystal clear what is going on.  Good.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*]
  2024-02-08 13:50 ` [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*] Christian Couder
@ 2024-02-08 17:44   ` Junio C Hamano
  2024-02-13 22:38     ` Linus Arver
  2024-02-14 14:39     ` Christian Couder
  2024-02-13 22:33   ` Linus Arver
  1 sibling, 2 replies; 30+ messages in thread
From: Junio C Hamano @ 2024-02-08 17:44 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Patrick Steinhardt, John Cai, Linus Arver, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

>  The form '--missing=print' is like 'allow-any', but will also print a
>  list of the missing objects.  Object IDs are prefixed with a ``?'' character.
> ++
> +If some tips passed to the traversal are missing, they will be
> +considered as missing too, and the traversal will ignore them. In case
> +we cannot get their Object ID though, an error will be raised.

Makes sense.

> diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> index b3f4783858..ec9556f135 100644
> --- a/builtin/rev-list.c
> +++ b/builtin/rev-list.c
> @@ -545,6 +545,15 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
>  	 *
>  	 * Let "--missing" to conditionally set fetch_if_missing.
>  	 */
> +	/*
> +	 * NEEDSWORK: These dump loops to look for some options early
> +	 * are ugly. We really need setup_revisions() to have a

I would drop the "dump" and "ugly" from here.  It OK to make value
judgement with such words in casual conversations on a proposed
patch, but when we make a request to future developers to fix our
mess, we should be more objective and stick to the techincal facts,
so that they have better understanding on why we think this area
needs more work.

Perhaps something like:

    These loops that attempt to find presence of options without
    understanding what the options they are skipping are broken
    (e.g., it would not know "--grep --exclude-promisor-objects" is
    not triggering "--exclude-promisor-objects" option).

Everything after "We really need" is good (modulo possible grammos),
I think.  Thanks for writing it.

> +	 * mechanism to allow and disallow some sets of options for
> +	 * different commands (like rev-list, replay, etc). Such
> +	 * mechanism should do an early parsing of option and be able
> +	 * to manage the `--exclude-promisor-objects` and `--missing=...`
> +	 * options below.
> +	 */
>  	for (i = 1; i < argc; i++) {
>  		const char *arg = argv[i];
>  		if (!strcmp(arg, "--exclude-promisor-objects")) {
> @@ -753,8 +762,12 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
>  
>  	if (arg_print_omitted)
>  		oidset_init(&omitted_objects, DEFAULT_OIDSET_SIZE);
> -	if (arg_missing_action == MA_PRINT)
> +	if (arg_missing_action == MA_PRINT) {
>  		oidset_init(&missing_objects, DEFAULT_OIDSET_SIZE);
> +		/* Already add missing tips */
> +		oidset_insert_from_set(&missing_objects, &revs.missing_commits);
> +		oidset_clear(&revs.missing_commits);
> +	}

It is unclear what "already" here refers to, at least to me.

Thanks.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 0/4] rev-list: allow missing tips with --missing
  2024-02-08 13:50 [PATCH v2 0/4] rev-list: allow missing tips with --missing Christian Couder
                   ` (3 preceding siblings ...)
  2024-02-08 13:50 ` [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*] Christian Couder
@ 2024-02-08 23:15 ` Junio C Hamano
  2024-02-14 14:26   ` Christian Couder
  2024-02-14 14:25 ` [PATCH v3 0/5] " Christian Couder
  5 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2024-02-08 23:15 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Patrick Steinhardt, John Cai, Linus Arver

Christian Couder <christian.couder@gmail.com> writes:

> # Patch overview
>
> Patches 1/4 (revision: clarify a 'return NULL' in get_reference()),
> 2/4 (oidset: refactor oidset_insert_from_set()) and 
> 3/4 (t6022: fix 'test' style and 'even though' typo) are very small
> preparatory cleanups.
>
> Patch 4/4 (rev-list: allow missing tips with --missing=[print|allow*])
> allows git-rev-list(1) with `--missing=<arg>` when <arg> is not
> 'error' to not fail if some tips it is passed are missing.

Thanks.  There is an existing test that makes a bad assumption and
fails with these patches.  We may need something like this patch as
a preliminary fix before these four patches.

----- >8 ---------- >8 ---------- >8 ---------- >8 -----
Subject: [PATCH] t9210: do not rely on lazy fetching to fail

With "rev-list --missing=print $start", where "$start" is a 40-hex
object name, the object may or may not be lazily fetched from the
promisor.  Make sure it fails by forcing dereference of "$start"
at that point.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t9210-scalar.sh | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
index 4432a30d10..428339e342 100755
--- a/t/t9210-scalar.sh
+++ b/t/t9210-scalar.sh
@@ -154,7 +154,14 @@ test_expect_success 'scalar clone' '
 		test_cmp expect actual &&
 
 		test_path_is_missing 1/2 &&
-		test_must_fail git rev-list --missing=print $second &&
+
+		# This relies on the fact that the presence of "--missing"
+		# on the command line forces lazy fetching off before
+		# "$second^{blob}" gets parsed.  Without "^{blob}", a
+		# bare object name "$second" is taken into the queue and
+		# the command may not fail with a fixed "rev-list --missing".
+		test_must_fail git rev-list --missing=print "$second^{blob}" -- &&
+
 		git rev-list $second &&
 		git cat-file blob $second >actual &&
 		echo "second" >expect &&
-- 
2.43.0-581-g5216f8f5c4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 2/4] oidset: refactor oidset_insert_from_set()
  2024-02-08 13:50 ` [PATCH v2 2/4] oidset: refactor oidset_insert_from_set() Christian Couder
  2024-02-08 17:33   ` Junio C Hamano
@ 2024-02-13 21:02   ` Linus Arver
  2024-02-14 14:33     ` Christian Couder
  1 sibling, 1 reply; 30+ messages in thread
From: Linus Arver @ 2024-02-13 21:02 UTC (permalink / raw)
  To: Christian Couder, git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Christian Couder,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> In a following commit, we will need to add all the oids from a set into
> another set. In "list-objects-filter.c", there is already a static
> function called add_all() to do that.

Nice find.

> Let's rename this function oidset_insert_from_set() and move it into
> oidset.{c,h} to make it generally available.

At some point (I don't ask for it in this series) we should add unit
tests for this newly-exposed function. Presumably the stuff around
object/oid handling is stable enough to receive unit tests.

> While at it, let's remove a useless `!= NULL`.

Nice cleanup. It would have been fine to also put this in a separate
patch, but as it is so simple I think it's also fine to keep it mixed in
with the move as you did here.

> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
>  list-objects-filter.c | 11 +----------
>  oidset.c              | 10 ++++++++++
>  oidset.h              |  6 ++++++
>  3 files changed, 17 insertions(+), 10 deletions(-)
>
> diff --git a/list-objects-filter.c b/list-objects-filter.c
> index da287cc8e0..4346f8da45 100644
> --- a/list-objects-filter.c
> +++ b/list-objects-filter.c
> @@ -711,15 +711,6 @@ static void filter_combine__free(void *filter_data)
>  	free(d);
>  }
>  
> -static void add_all(struct oidset *dest, struct oidset *src) {
> -	struct oidset_iter iter;
> -	struct object_id *src_oid;
> -
> -	oidset_iter_init(src, &iter);
> -	while ((src_oid = oidset_iter_next(&iter)) != NULL)
> -		oidset_insert(dest, src_oid);
> -}
> -
>  static void filter_combine__finalize_omits(
>  	struct oidset *omits,
>  	void *filter_data)
> @@ -728,7 +719,7 @@ static void filter_combine__finalize_omits(
>  	size_t sub;
>  
>  	for (sub = 0; sub < d->nr; sub++) {
> -		add_all(omits, &d->sub[sub].omits);
> +		oidset_insert_from_set(omits, &d->sub[sub].omits);
>  		oidset_clear(&d->sub[sub].omits);
>  	}
>  }
> diff --git a/oidset.c b/oidset.c
> index d1e5376316..91d1385910 100644
> --- a/oidset.c
> +++ b/oidset.c
> @@ -23,6 +23,16 @@ int oidset_insert(struct oidset *set, const struct object_id *oid)
>  	return !added;
>  }
>  
> +void oidset_insert_from_set(struct oidset *dest, struct oidset *src)
> +{
> +	struct oidset_iter iter;
> +	struct object_id *src_oid;
> +
> +	oidset_iter_init(src, &iter);
> +	while ((src_oid = oidset_iter_next(&iter)))

Are the extra parentheses necessary?

> +		oidset_insert(dest, src_oid);
> +}
> +
>  int oidset_remove(struct oidset *set, const struct object_id *oid)
>  {
>  	khiter_t pos = kh_get_oid_set(&set->set, *oid);
> diff --git a/oidset.h b/oidset.h
> index ba4a5a2cd3..262f4256d6 100644
> --- a/oidset.h
> +++ b/oidset.h
> @@ -47,6 +47,12 @@ int oidset_contains(const struct oidset *set, const struct object_id *oid);
>   */
>  int oidset_insert(struct oidset *set, const struct object_id *oid);
>  
> +/**
> + * Insert all the oids that are in set 'src' into set 'dest'; a copy
> + * is made of each oid inserted into set 'dest'.
> + */

Just above in oid_insert() there is already a comment about needing to
copy each oid.

    /**
     * Insert the oid into the set; a copy is made, so "oid" does not need
     * to persist after this function is called.
     *
     * Returns 1 if the oid was already in the set, 0 otherwise. This can be used
     * to perform an efficient check-and-add.
     */

so perhaps the following wording is simpler?

    Like oid_insert(), but insert all oids found in 'src'. Calls
    oid_insert() internally.

> +void oidset_insert_from_set(struct oidset *dest, struct oidset *src);

Perhaps "oidset_insert_all" would be a simpler name? I generally prefer
to reuse any descriptors in comments to guide the names. Plus this
function used to be called "add_all()" so keeping the "all" naming style
feels right.

> +
>  /**
>   * Remove the oid from the set.
>   *
> -- 
> 2.43.0.565.g97b5fd12a3.dirty

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*]
  2024-02-08 13:50 ` [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*] Christian Couder
  2024-02-08 17:44   ` Junio C Hamano
@ 2024-02-13 22:33   ` Linus Arver
  2024-02-14 14:38     ` Christian Couder
  1 sibling, 1 reply; 30+ messages in thread
From: Linus Arver @ 2024-02-13 22:33 UTC (permalink / raw)
  To: Christian Couder, git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Christian Couder,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> In 9830926c7d (rev-list: add commit object support in `--missing`
> option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
> so that it works with with missing commits, not just blobs/trees.
>
> Unfortunately, such a command would still fail with a "fatal: bad
> object <oid>" if it is passed a missing commit, blob or tree as an
> argument (before the rev walking even begins).
>
> When such a command is used to find the dependencies of some objects,
> for example the dependencies of quarantined objects (see the
> "QUARANTINE ENVIRONMENT" section in the git-receive-pack(1)
> documentation), it would be better if the command would instead
> consider such missing objects, especially commits, in the same way as
> other missing objects.

Looks good.

> If, for example `--missing=print` is used, it would be nice for some
> use cases if the missing tips passed as arguments were reported in
> the same way as other missing objects instead of the command just
> failing.

Nit: this paragraph sounds a bit redundant but it is giving an explicit
example of `--missing=print` which was not discussed so far, so I think
it's fine as is.

> We could introduce a new option to make it work like this, but most
> users are likely to prefer the command to have this behavior as the
> default one. Introducing a new option would require another dumb loop
> to look for that options early, which isn't nice.

s/options/option

> Also we made `git rev-list` work with missing commits very recently
> and the command is most often passed commits as arguments. So let's
> consider this as a bug fix related to these previous change.

s/previous change/recent changes

> While at it let's add a NEEDSWORK comment to say that we should get
> rid of the existing ugly dumb loops that parse the
> `--exclude-promisor-objects` and `--missing=...` options early.
>
> Helped-by: Linus Arver <linusa@google.com>
> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
>  Documentation/rev-list-options.txt |  4 +++
>  builtin/rev-list.c                 | 15 +++++++-
>  revision.c                         | 14 ++++++--
>  t/t6022-rev-list-missing.sh        | 56 ++++++++++++++++++++++++++++++
>  4 files changed, 85 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> index a583b52c61..bb753b6292 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -1019,6 +1019,10 @@ Unexpected missing objects will raise an error.
>  +
>  The form '--missing=print' is like 'allow-any', but will also print a
>  list of the missing objects.  Object IDs are prefixed with a ``?'' character.
> ++
> +If some tips passed to the traversal are missing, they will be
> +considered as missing too, and the traversal will ignore them. In case
> +we cannot get their Object ID though, an error will be raised.

The only other mention of the term "tips" is for the "--alternate-refs"
flag on line 215 which uses "ref tips". Perhaps this is obvious to
veteran Git developers but I do wonder if we need to somehow define it
(however briefly) the first time we mention it (either in the document
as a whole, or just within this newly added paragraph).

Here's an alternate wording

    Ref tips given on the command line are a special case. They are
    first dereferenced to Object IDs (if this is not possible, an error
    will be raised). Then these IDs are checked to see if the objects
    they refer to exist. If so, the traversal happens starting with
    these tips. Otherwise, then such tips will not be used for
    traversal.

    Even though such missing tips won't be included for traversal, for
    purposes of the `--missing` flag they will be treated the same way
    as those objects that did get traversed (and were determined to be
    missing). For example, if `--missing=print` is given then the Object
    IDs of these tips will be printed just like all other missing
    objects encountered during traversal.

But also, I realize that these documentation touch-ups might be better
served by an overall pass over the whole document, so it's fine if we
decide not to take this suggestion at this time.

Aside: unfortunately we don't seem to define the relationship between
ref tips (e.g., "HEAD"), object IDs (40-char hex string), and the actual
objects (the real data that we traverse over). It's probably another
thing that could be fixed up in the docs in the future.

>  --exclude-promisor-objects::
>  	(For internal use only.)  Prefilter object traversal at
> diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> index b3f4783858..ec9556f135 100644
> --- a/builtin/rev-list.c
> +++ b/builtin/rev-list.c
> @@ -545,6 +545,15 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
>  	 *
>  	 * Let "--missing" to conditionally set fetch_if_missing.
>  	 */
> +	/*
> +	 * NEEDSWORK: These dump loops to look for some options early
> +	 * are ugly.

I agree with Junio's suggestion to use more objective language.

> We really need setup_revisions() to have a
> +	 * mechanism to allow and disallow some sets of options for
> +	 * different commands (like rev-list, replay, etc). Such

s/Such/Such a

> +	 * mechanism should do an early parsing of option and be able

s/option/options

> +	 * to manage the `--exclude-promisor-objects` and `--missing=...`
> +	 * options below.
> +	 */
>  	for (i = 1; i < argc; i++) {
>  		const char *arg = argv[i];
>  		if (!strcmp(arg, "--exclude-promisor-objects")) {
> 
> [...]
> 
> @@ -2178,13 +2183,18 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl
>  	if (revarg_opt & REVARG_COMMITTISH)
>  		get_sha1_flags |= GET_OID_COMMITTISH;
>  
> +	/*
> +	 * Even if revs->do_not_die_on_missing_objects is set, we
> +	 * should error out if we can't even get an oid, ...
> +	 */

Perhaps this wording is more precise?

    If we can't even get an oid, we are forced to error out (regardless
    of revs->do_not_die_on_missing_objects) because a valid traversal
    must start from *some* valid oid. OTOH we ignore the ref tip
    altogether with revs->ignore_missing.

> +	 * ... as
> +	 * `--missing=print` should be able to report missing oids.

I think this comment would be better placed ...

>  	if (get_oid_with_context(revs->repo, arg, get_sha1_flags, &oid, &oc))
>  		return revs->ignore_missing ? 0 : -1;
>  	if (!cant_be_filename)
>  		verify_non_filename(revs->prefix, arg);
>  	object = get_reference(revs, arg, &oid, flags ^ local_flags);

... around here.

>  	if (!object)
> -		return revs->ignore_missing ? 0 : -1;
> +		return (revs->ignore_missing || revs->do_not_die_on_missing_objects) ? 0 : -1;
>  	add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
>  	add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
>  	free(oc.path);
> @@ -3830,8 +3840,6 @@ int prepare_revision_walk(struct rev_info *revs)
>  				       FOR_EACH_OBJECT_PROMISOR_ONLY);
>  	}
>  
> -	oidset_init(&revs->missing_commits, 0);
> -
>  	if (!revs->reflog_info)
>  		prepare_to_use_bloom_filter(revs);
>  	if (!revs->unsorted_input)
> diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
> index 5f1be7abb5..78387eebb3 100755
> --- a/t/t6022-rev-list-missing.sh
> +++ b/t/t6022-rev-list-missing.sh
> @@ -78,4 +78,60 @@ do
>  	done
>  done
>  
> +for missing_tip in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
> +do
> +	# We want to check that things work when both
> +	#   - all the tips passed are missing (case existing_tip = ""), and
> +	#   - there is one missing tip and one existing tip (case existing_tip = "HEAD")
> +	for existing_tip in "" "HEAD"
> +	do

Though I am biased, these new variable names do make this test that much
easier to read. Thanks.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*]
  2024-02-08 17:44   ` Junio C Hamano
@ 2024-02-13 22:38     ` Linus Arver
  2024-02-14 14:34       ` Christian Couder
  2024-02-14 14:39     ` Christian Couder
  1 sibling, 1 reply; 30+ messages in thread
From: Linus Arver @ 2024-02-13 22:38 UTC (permalink / raw)
  To: Junio C Hamano, Christian Couder
  Cc: git, Patrick Steinhardt, John Cai, Christian Couder

Junio C Hamano <gitster@pobox.com> writes:

> Christian Couder <christian.couder@gmail.com> writes:
>
>> +	 * mechanism to allow and disallow some sets of options for
>> +	 * different commands (like rev-list, replay, etc). Such
>> +	 * mechanism should do an early parsing of option and be able
>> +	 * to manage the `--exclude-promisor-objects` and `--missing=...`
>> +	 * options below.
>> +	 */
>>  	for (i = 1; i < argc; i++) {
>>  		const char *arg = argv[i];
>>  		if (!strcmp(arg, "--exclude-promisor-objects")) {
>> @@ -753,8 +762,12 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
>>  
>>  	if (arg_print_omitted)
>>  		oidset_init(&omitted_objects, DEFAULT_OIDSET_SIZE);
>> -	if (arg_missing_action == MA_PRINT)
>> +	if (arg_missing_action == MA_PRINT) {
>>  		oidset_init(&missing_objects, DEFAULT_OIDSET_SIZE);
>> +		/* Already add missing tips */
>> +		oidset_insert_from_set(&missing_objects, &revs.missing_commits);
>> +		oidset_clear(&revs.missing_commits);
>> +	}
>
> It is unclear what "already" here refers to, at least to me.

It's grammatically correct but perhaps a bit "over eager" (gives the
impression that we get these missing tips all the time and is a common
"happy" path). I do still think my earlier v1 comments

    Did you mean "Add already-missing commits"? Perhaps even more explicit
    would be "Add missing tips"?

are relevant here.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 0/5] rev-list: allow missing tips with --missing
  2024-02-08 13:50 [PATCH v2 0/4] rev-list: allow missing tips with --missing Christian Couder
                   ` (4 preceding siblings ...)
  2024-02-08 23:15 ` [PATCH v2 0/4] rev-list: allow missing tips with --missing Junio C Hamano
@ 2024-02-14 14:25 ` Christian Couder
  2024-02-14 14:25   ` [PATCH v3 1/5] t9210: do not rely on lazy fetching to fail Christian Couder
                     ` (6 more replies)
  5 siblings, 7 replies; 30+ messages in thread
From: Christian Couder @ 2024-02-14 14:25 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Linus Arver,
	Eric Sunshine, Christian Couder

# Intro

A recent patch series, kn/rev-list-missing-fix [1] extended the
`--missing` option in git-rev-list(1) to support commit objects.

Unfortunately, git-rev-list(1) with `--missing` set to something other
than 'error' still fails, usually with a "fatal: bad object <oid>"
error message, when a missing object is passed as an argument.

This patch series removes this limitation and when using
`--missing=print` allows all missing objects to be printed including
those that are passed as arguments.

[1] https://lore.kernel.org/git/20231026101109.43110-1-karthik.188@gmail.com/

# Patch overview

Patch 1/5 (t9210: do not rely on lazy fetching to fail) is a test fix
suggested by Junio, so that a mostly unrelated test will not wrongly
fail when this series is merged.

Patches 2/5 (revision: clarify a 'return NULL' in get_reference()),
3/5 (oidset: refactor oidset_insert_from_set()) and
4/5 (t6022: fix 'test' style and 'even though' typo) are very small
preparatory cleanups.

Patch 5/5 (rev-list: allow missing tips with --missing=[print|allow*])
allows git-rev-list(1) with `--missing=<arg>` when <arg> is not
'error' to not fail if some tips it is passed are missing.

# Changes since V2

Thanks to Linus Arver, and Junio who commented on V2!

The changes since V2 are the following:

  - Patch 1/5 (t9210: do not rely on lazy fetching to fail) was added
    to fix a test that wrongly failed when this series was applied,
    thanks to Junio who authored it.

  - In patch 5/5 (rev-list: allow missing tips with
    --missing=[print|allow*]), some grammos and typos were fixed in the
    commit message, thanks to Junio and Linus.

  - In patch 5/5 (rev-list: allow missing tips with
    --missing=[print|allow*]), the NEEDSWORK comment was improved, thanks
    to Junio. In particular, it doesn't use "ugly" and "dumb" anymore
    and gives an example of what's broken.

  - In patch 5/5 (rev-list: allow missing tips with
    --missing=[print|allow*]), a code comment as been clarified by removing
    "already" from it.

# Range-diff since V2

-:  ---------- > 1:  6e6f2cc26b t9210: do not rely on lazy fetching to fail
1:  5233a83181 = 2:  733c78144e revision: clarify a 'return NULL' in get_reference()
2:  cfa72c8cf1 = 3:  4c9e032456 oidset: refactor oidset_insert_from_set()
3:  5668340516 = 4:  35ca6e7c3d t6022: fix 'test' style and 'even though' typo
4:  55792110ca ! 5:  da36843b44 rev-list: allow missing tips with --missing=[print|allow*]
    @@ Commit message
         We could introduce a new option to make it work like this, but most
         users are likely to prefer the command to have this behavior as the
         default one. Introducing a new option would require another dumb loop
    -    to look for that options early, which isn't nice.
    +    to look for that option early, which isn't nice.
     
         Also we made `git rev-list` work with missing commits very recently
         and the command is most often passed commits as arguments. So let's
    -    consider this as a bug fix related to these previous change.
    +    consider this as a bug fix related to these recent changes.
     
         While at it let's add a NEEDSWORK comment to say that we should get
         rid of the existing ugly dumb loops that parse the
    @@ builtin/rev-list.c: int cmd_rev_list(int argc, const char **argv, const char *pr
         * Let "--missing" to conditionally set fetch_if_missing.
         */
     +  /*
    -+   * NEEDSWORK: These dump loops to look for some options early
    -+   * are ugly. We really need setup_revisions() to have a
    -+   * mechanism to allow and disallow some sets of options for
    -+   * different commands (like rev-list, replay, etc). Such
    -+   * mechanism should do an early parsing of option and be able
    -+   * to manage the `--exclude-promisor-objects` and `--missing=...`
    -+   * options below.
    ++   * NEEDSWORK: These loops that attempt to find presence of
    ++   * options without understanding that the options they are
    ++   * skipping are broken (e.g., it would not know "--grep
    ++   * --exclude-promisor-objects" is not triggering
    ++   * "--exclude-promisor-objects" option).  We really need
    ++   * setup_revisions() to have a mechanism to allow and disallow
    ++   * some sets of options for different commands (like rev-list,
    ++   * replay, etc). Such a mechanism should do an early parsing
    ++   * of options and be able to manage the `--missing=...` and
    ++   * `--exclude-promisor-objects` options below.
     +   */
        for (i = 1; i < argc; i++) {
                const char *arg = argv[i];
    @@ builtin/rev-list.c: int cmd_rev_list(int argc, const char **argv, const char *pr
     -  if (arg_missing_action == MA_PRINT)
     +  if (arg_missing_action == MA_PRINT) {
                oidset_init(&missing_objects, DEFAULT_OIDSET_SIZE);
    -+          /* Already add missing tips */
    ++          /* Add missing tips */
     +          oidset_insert_from_set(&missing_objects, &revs.missing_commits);
     +          oidset_clear(&revs.missing_commits);
     +  }


Christian Couder (4):
  revision: clarify a 'return NULL' in get_reference()
  oidset: refactor oidset_insert_from_set()
  t6022: fix 'test' style and 'even though' typo
  rev-list: allow missing tips with --missing=[print|allow*]

Junio C Hamano (1):
  t9210: do not rely on lazy fetching to fail

 Documentation/rev-list-options.txt |  4 ++
 builtin/rev-list.c                 | 18 ++++++++-
 list-objects-filter.c              | 11 +-----
 oidset.c                           | 10 +++++
 oidset.h                           |  6 +++
 revision.c                         | 16 ++++++--
 t/t6022-rev-list-missing.sh        | 61 +++++++++++++++++++++++++++++-
 t/t9210-scalar.sh                  |  9 ++++-
 8 files changed, 117 insertions(+), 18 deletions(-)

-- 
2.44.0.rc0.51.gda36843b44

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 1/5] t9210: do not rely on lazy fetching to fail
  2024-02-14 14:25 ` [PATCH v3 0/5] " Christian Couder
@ 2024-02-14 14:25   ` Christian Couder
  2024-02-14 14:25   ` [PATCH v3 2/5] revision: clarify a 'return NULL' in get_reference() Christian Couder
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2024-02-14 14:25 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Linus Arver, Eric Sunshine

From: Junio C Hamano <gitster@pobox.com>

With "rev-list --missing=print $start", where "$start" is a 40-hex
object name, the object may or may not be lazily fetched from the
promisor.  Make sure it fails by forcing dereference of "$start"
at that point.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t9210-scalar.sh | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
index 4432a30d10..428339e342 100755
--- a/t/t9210-scalar.sh
+++ b/t/t9210-scalar.sh
@@ -154,7 +154,14 @@ test_expect_success 'scalar clone' '
 		test_cmp expect actual &&
 
 		test_path_is_missing 1/2 &&
-		test_must_fail git rev-list --missing=print $second &&
+
+		# This relies on the fact that the presence of "--missing"
+		# on the command line forces lazy fetching off before
+		# "$second^{blob}" gets parsed.  Without "^{blob}", a
+		# bare object name "$second" is taken into the queue and
+		# the command may not fail with a fixed "rev-list --missing".
+		test_must_fail git rev-list --missing=print "$second^{blob}" -- &&
+
 		git rev-list $second &&
 		git cat-file blob $second >actual &&
 		echo "second" >expect &&
-- 
2.44.0.rc0.51.gda36843b44


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 2/5] revision: clarify a 'return NULL' in get_reference()
  2024-02-14 14:25 ` [PATCH v3 0/5] " Christian Couder
  2024-02-14 14:25   ` [PATCH v3 1/5] t9210: do not rely on lazy fetching to fail Christian Couder
@ 2024-02-14 14:25   ` Christian Couder
  2024-02-14 14:25   ` [PATCH v3 3/5] oidset: refactor oidset_insert_from_set() Christian Couder
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2024-02-14 14:25 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Linus Arver,
	Eric Sunshine, Christian Couder, Christian Couder

In general when we know a pointer variable is NULL, it's clearer to
explicitly return NULL than to return that variable.

In get_reference() when 'object' is NULL, we already return NULL
when 'revs->exclude_promisor_objects && is_promisor_object(oid)' is
true, but we return 'object' when 'revs->ignore_missing' is true.

Let's make the code clearer and more uniform by also explicitly
returning NULL when 'revs->ignore_missing' is true.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 revision.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index 2424c9bd67..4c5cd7c3ce 100644
--- a/revision.c
+++ b/revision.c
@@ -385,7 +385,7 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
 
 	if (!object) {
 		if (revs->ignore_missing)
-			return object;
+			return NULL;
 		if (revs->exclude_promisor_objects && is_promisor_object(oid))
 			return NULL;
 		die("bad object %s", name);
-- 
2.44.0.rc0.51.gda36843b44


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 3/5] oidset: refactor oidset_insert_from_set()
  2024-02-14 14:25 ` [PATCH v3 0/5] " Christian Couder
  2024-02-14 14:25   ` [PATCH v3 1/5] t9210: do not rely on lazy fetching to fail Christian Couder
  2024-02-14 14:25   ` [PATCH v3 2/5] revision: clarify a 'return NULL' in get_reference() Christian Couder
@ 2024-02-14 14:25   ` Christian Couder
  2024-02-14 14:25   ` [PATCH v3 4/5] t6022: fix 'test' style and 'even though' typo Christian Couder
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2024-02-14 14:25 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Linus Arver,
	Eric Sunshine, Christian Couder, Christian Couder

In a following commit, we will need to add all the oids from a set into
another set. In "list-objects-filter.c", there is already a static
function called add_all() to do that.

Let's rename this function oidset_insert_from_set() and move it into
oidset.{c,h} to make it generally available.

While at it, let's remove a useless `!= NULL`.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 list-objects-filter.c | 11 +----------
 oidset.c              | 10 ++++++++++
 oidset.h              |  6 ++++++
 3 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/list-objects-filter.c b/list-objects-filter.c
index da287cc8e0..4346f8da45 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -711,15 +711,6 @@ static void filter_combine__free(void *filter_data)
 	free(d);
 }
 
-static void add_all(struct oidset *dest, struct oidset *src) {
-	struct oidset_iter iter;
-	struct object_id *src_oid;
-
-	oidset_iter_init(src, &iter);
-	while ((src_oid = oidset_iter_next(&iter)) != NULL)
-		oidset_insert(dest, src_oid);
-}
-
 static void filter_combine__finalize_omits(
 	struct oidset *omits,
 	void *filter_data)
@@ -728,7 +719,7 @@ static void filter_combine__finalize_omits(
 	size_t sub;
 
 	for (sub = 0; sub < d->nr; sub++) {
-		add_all(omits, &d->sub[sub].omits);
+		oidset_insert_from_set(omits, &d->sub[sub].omits);
 		oidset_clear(&d->sub[sub].omits);
 	}
 }
diff --git a/oidset.c b/oidset.c
index d1e5376316..91d1385910 100644
--- a/oidset.c
+++ b/oidset.c
@@ -23,6 +23,16 @@ int oidset_insert(struct oidset *set, const struct object_id *oid)
 	return !added;
 }
 
+void oidset_insert_from_set(struct oidset *dest, struct oidset *src)
+{
+	struct oidset_iter iter;
+	struct object_id *src_oid;
+
+	oidset_iter_init(src, &iter);
+	while ((src_oid = oidset_iter_next(&iter)))
+		oidset_insert(dest, src_oid);
+}
+
 int oidset_remove(struct oidset *set, const struct object_id *oid)
 {
 	khiter_t pos = kh_get_oid_set(&set->set, *oid);
diff --git a/oidset.h b/oidset.h
index ba4a5a2cd3..262f4256d6 100644
--- a/oidset.h
+++ b/oidset.h
@@ -47,6 +47,12 @@ int oidset_contains(const struct oidset *set, const struct object_id *oid);
  */
 int oidset_insert(struct oidset *set, const struct object_id *oid);
 
+/**
+ * Insert all the oids that are in set 'src' into set 'dest'; a copy
+ * is made of each oid inserted into set 'dest'.
+ */
+void oidset_insert_from_set(struct oidset *dest, struct oidset *src);
+
 /**
  * Remove the oid from the set.
  *
-- 
2.44.0.rc0.51.gda36843b44


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 4/5] t6022: fix 'test' style and 'even though' typo
  2024-02-14 14:25 ` [PATCH v3 0/5] " Christian Couder
                     ` (2 preceding siblings ...)
  2024-02-14 14:25   ` [PATCH v3 3/5] oidset: refactor oidset_insert_from_set() Christian Couder
@ 2024-02-14 14:25   ` Christian Couder
  2024-02-14 14:25   ` [PATCH v3 5/5] rev-list: allow missing tips with --missing=[print|allow*] Christian Couder
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2024-02-14 14:25 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Linus Arver,
	Eric Sunshine, Christian Couder, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t6022-rev-list-missing.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
index 211672759a..5f1be7abb5 100755
--- a/t/t6022-rev-list-missing.sh
+++ b/t/t6022-rev-list-missing.sh
@@ -46,9 +46,10 @@ do
 			git rev-list --objects --no-object-names \
 				HEAD ^$obj >expect.raw &&
 
-			# Blobs are shared by all commits, so evethough a commit/tree
+			# Blobs are shared by all commits, so even though a commit/tree
 			# might be skipped, its blob must be accounted for.
-			if [ $obj != "HEAD:1.t" ]; then
+			if test $obj != "HEAD:1.t"
+			then
 				echo $(git rev-parse HEAD:1.t) >>expect.raw &&
 				echo $(git rev-parse HEAD:2.t) >>expect.raw
 			fi &&
-- 
2.44.0.rc0.51.gda36843b44


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 5/5] rev-list: allow missing tips with --missing=[print|allow*]
  2024-02-14 14:25 ` [PATCH v3 0/5] " Christian Couder
                     ` (3 preceding siblings ...)
  2024-02-14 14:25   ` [PATCH v3 4/5] t6022: fix 'test' style and 'even though' typo Christian Couder
@ 2024-02-14 14:25   ` Christian Couder
  2024-02-16 21:56   ` [PATCH v3 0/5] rev-list: allow missing tips with --missing Linus Arver
  2024-02-28  9:10   ` [PATCH] revision: fix --missing=[print|allow*] for annotated tags Christian Couder
  6 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2024-02-14 14:25 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Linus Arver,
	Eric Sunshine, Christian Couder, Christian Couder

In 9830926c7d (rev-list: add commit object support in `--missing`
option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
so that it works with with missing commits, not just blobs/trees.

Unfortunately, such a command would still fail with a "fatal: bad
object <oid>" if it is passed a missing commit, blob or tree as an
argument (before the rev walking even begins).

When such a command is used to find the dependencies of some objects,
for example the dependencies of quarantined objects (see the
"QUARANTINE ENVIRONMENT" section in the git-receive-pack(1)
documentation), it would be better if the command would instead
consider such missing objects, especially commits, in the same way as
other missing objects.

If, for example `--missing=print` is used, it would be nice for some
use cases if the missing tips passed as arguments were reported in
the same way as other missing objects instead of the command just
failing.

We could introduce a new option to make it work like this, but most
users are likely to prefer the command to have this behavior as the
default one. Introducing a new option would require another dumb loop
to look for that option early, which isn't nice.

Also we made `git rev-list` work with missing commits very recently
and the command is most often passed commits as arguments. So let's
consider this as a bug fix related to these recent changes.

While at it let's add a NEEDSWORK comment to say that we should get
rid of the existing ugly dumb loops that parse the
`--exclude-promisor-objects` and `--missing=...` options early.

Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/rev-list-options.txt |  4 +++
 builtin/rev-list.c                 | 18 +++++++++-
 revision.c                         | 14 ++++++--
 t/t6022-rev-list-missing.sh        | 56 ++++++++++++++++++++++++++++++
 4 files changed, 88 insertions(+), 4 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index a583b52c61..bb753b6292 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -1019,6 +1019,10 @@ Unexpected missing objects will raise an error.
 +
 The form '--missing=print' is like 'allow-any', but will also print a
 list of the missing objects.  Object IDs are prefixed with a ``?'' character.
++
+If some tips passed to the traversal are missing, they will be
+considered as missing too, and the traversal will ignore them. In case
+we cannot get their Object ID though, an error will be raised.
 
 --exclude-promisor-objects::
 	(For internal use only.)  Prefilter object traversal at
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index b3f4783858..ec455aa972 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -545,6 +545,18 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	 *
 	 * Let "--missing" to conditionally set fetch_if_missing.
 	 */
+	/*
+	 * NEEDSWORK: These loops that attempt to find presence of
+	 * options without understanding that the options they are
+	 * skipping are broken (e.g., it would not know "--grep
+	 * --exclude-promisor-objects" is not triggering
+	 * "--exclude-promisor-objects" option).  We really need
+	 * setup_revisions() to have a mechanism to allow and disallow
+	 * some sets of options for different commands (like rev-list,
+	 * replay, etc). Such a mechanism should do an early parsing
+	 * of options and be able to manage the `--missing=...` and
+	 * `--exclude-promisor-objects` options below.
+	 */
 	for (i = 1; i < argc; i++) {
 		const char *arg = argv[i];
 		if (!strcmp(arg, "--exclude-promisor-objects")) {
@@ -753,8 +765,12 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 
 	if (arg_print_omitted)
 		oidset_init(&omitted_objects, DEFAULT_OIDSET_SIZE);
-	if (arg_missing_action == MA_PRINT)
+	if (arg_missing_action == MA_PRINT) {
 		oidset_init(&missing_objects, DEFAULT_OIDSET_SIZE);
+		/* Add missing tips */
+		oidset_insert_from_set(&missing_objects, &revs.missing_commits);
+		oidset_clear(&revs.missing_commits);
+	}
 
 	traverse_commit_list_filtered(
 		&revs, show_commit, show_object, &info,
diff --git a/revision.c b/revision.c
index 4c5cd7c3ce..80c349d347 100644
--- a/revision.c
+++ b/revision.c
@@ -388,6 +388,10 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
 			return NULL;
 		if (revs->exclude_promisor_objects && is_promisor_object(oid))
 			return NULL;
+		if (revs->do_not_die_on_missing_objects) {
+			oidset_insert(&revs->missing_commits, oid);
+			return NULL;
+		}
 		die("bad object %s", name);
 	}
 	object->flags |= flags;
@@ -1947,6 +1951,7 @@ void repo_init_revisions(struct repository *r,
 	init_display_notes(&revs->notes_opt);
 	list_objects_filter_init(&revs->filter);
 	init_ref_exclusions(&revs->ref_excludes);
+	oidset_init(&revs->missing_commits, 0);
 }
 
 static void add_pending_commit_list(struct rev_info *revs,
@@ -2178,13 +2183,18 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl
 	if (revarg_opt & REVARG_COMMITTISH)
 		get_sha1_flags |= GET_OID_COMMITTISH;
 
+	/*
+	 * Even if revs->do_not_die_on_missing_objects is set, we
+	 * should error out if we can't even get an oid, as
+	 * `--missing=print` should be able to report missing oids.
+	 */
 	if (get_oid_with_context(revs->repo, arg, get_sha1_flags, &oid, &oc))
 		return revs->ignore_missing ? 0 : -1;
 	if (!cant_be_filename)
 		verify_non_filename(revs->prefix, arg);
 	object = get_reference(revs, arg, &oid, flags ^ local_flags);
 	if (!object)
-		return revs->ignore_missing ? 0 : -1;
+		return (revs->ignore_missing || revs->do_not_die_on_missing_objects) ? 0 : -1;
 	add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
 	add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
 	free(oc.path);
@@ -3830,8 +3840,6 @@ int prepare_revision_walk(struct rev_info *revs)
 				       FOR_EACH_OBJECT_PROMISOR_ONLY);
 	}
 
-	oidset_init(&revs->missing_commits, 0);
-
 	if (!revs->reflog_info)
 		prepare_to_use_bloom_filter(revs);
 	if (!revs->unsorted_input)
diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
index 5f1be7abb5..78387eebb3 100755
--- a/t/t6022-rev-list-missing.sh
+++ b/t/t6022-rev-list-missing.sh
@@ -78,4 +78,60 @@ do
 	done
 done
 
+for missing_tip in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
+do
+	# We want to check that things work when both
+	#   - all the tips passed are missing (case existing_tip = ""), and
+	#   - there is one missing tip and one existing tip (case existing_tip = "HEAD")
+	for existing_tip in "" "HEAD"
+	do
+		for action in "allow-any" "print"
+		do
+			test_expect_success "--missing=$action with tip '$missing_tip' missing and tip '$existing_tip'" '
+				oid="$(git rev-parse $missing_tip)" &&
+				path=".git/objects/$(test_oid_to_path $oid)" &&
+
+				# Before the object is made missing, we use rev-list to
+				# get the expected oids.
+				if test "$existing_tip" = "HEAD"
+				then
+					git rev-list --objects --no-object-names \
+						HEAD ^$missing_tip >expect.raw
+				else
+					>expect.raw
+				fi &&
+
+				# Blobs are shared by all commits, so even though a commit/tree
+				# might be skipped, its blob must be accounted for.
+				if test "$existing_tip" = "HEAD" && test $missing_tip != "HEAD:1.t"
+				then
+					echo $(git rev-parse HEAD:1.t) >>expect.raw &&
+					echo $(git rev-parse HEAD:2.t) >>expect.raw
+				fi &&
+
+				mv "$path" "$path.hidden" &&
+				test_when_finished "mv $path.hidden $path" &&
+
+				git rev-list --missing=$action --objects --no-object-names \
+				     $oid $existing_tip >actual.raw &&
+
+				# When the action is to print, we should also add the missing
+				# oid to the expect list.
+				case $action in
+				allow-any)
+					;;
+				print)
+					grep ?$oid actual.raw &&
+					echo ?$oid >>expect.raw
+					;;
+				esac &&
+
+				sort actual.raw >actual &&
+				sort expect.raw >expect &&
+				test_cmp expect actual
+			'
+		done
+	done
+done
+
 test_done
-- 
2.44.0.rc0.51.gda36843b44


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 0/4] rev-list: allow missing tips with --missing
  2024-02-08 23:15 ` [PATCH v2 0/4] rev-list: allow missing tips with --missing Junio C Hamano
@ 2024-02-14 14:26   ` Christian Couder
  0 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2024-02-14 14:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Patrick Steinhardt, John Cai, Linus Arver

On Fri, Feb 9, 2024 at 12:15 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > # Patch overview
> >
> > Patches 1/4 (revision: clarify a 'return NULL' in get_reference()),
> > 2/4 (oidset: refactor oidset_insert_from_set()) and
> > 3/4 (t6022: fix 'test' style and 'even though' typo) are very small
> > preparatory cleanups.
> >
> > Patch 4/4 (rev-list: allow missing tips with --missing=[print|allow*])
> > allows git-rev-list(1) with `--missing=<arg>` when <arg> is not
> > 'error' to not fail if some tips it is passed are missing.
>
> Thanks.  There is an existing test that makes a bad assumption and
> fails with these patches.  We may need something like this patch as
> a preliminary fix before these four patches.

Thanks, I have added your patch to the V3 I just sent.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 2/4] oidset: refactor oidset_insert_from_set()
  2024-02-13 21:02   ` Linus Arver
@ 2024-02-14 14:33     ` Christian Couder
  2024-02-16  1:10       ` Linus Arver
  0 siblings, 1 reply; 30+ messages in thread
From: Christian Couder @ 2024-02-14 14:33 UTC (permalink / raw)
  To: Linus Arver
  Cc: git, Junio C Hamano, Patrick Steinhardt, John Cai, Christian Couder

On Tue, Feb 13, 2024 at 10:02 PM Linus Arver <linusa@google.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > In a following commit, we will need to add all the oids from a set into
> > another set. In "list-objects-filter.c", there is already a static
> > function called add_all() to do that.
>
> Nice find.
>
> > Let's rename this function oidset_insert_from_set() and move it into
> > oidset.{c,h} to make it generally available.
>
> At some point (I don't ask for it in this series) we should add unit
> tests for this newly-exposed function. Presumably the stuff around
> object/oid handling is stable enough to receive unit tests.

Yeah, ideally there should be unit tests for oidset and all its
features, but it seems to me that there aren't any. Also oidset is
based on khash.h which was originally imported from
https://github.com/attractivechaos/klib/ without tests. So I think
it's a different topic to add tests from scratch to oidset, khash.h or
both.

Actually after taking another look, it looks like khash.h or some of
its features are tested through "helper/test-oidmap.c" and
"t0016-oidmap.sh". I still think it's another topic to test oidset.

> > +void oidset_insert_from_set(struct oidset *dest, struct oidset *src)
> > +{
> > +     struct oidset_iter iter;
> > +     struct object_id *src_oid;
> > +
> > +     oidset_iter_init(src, &iter);
> > +     while ((src_oid = oidset_iter_next(&iter)))
>
> Are the extra parentheses necessary?

Yes. Without them gcc errors out with:

oidset.c: In function ‘oidset_insert_from_set’:
oidset.c:32:16: error: suggest parentheses around assignment used as
truth value [-Werror=parentheses]
  32 |         while (src_oid = oidset_iter_next(&iter))
     |                ^~~~~~~

Having extra parentheses is a way to tell the compiler that we do want
to use '=' and not '=='. This helps avoid the very common mistake of
using '=' where '==' was intended.

> > +/**
> > + * Insert all the oids that are in set 'src' into set 'dest'; a copy
> > + * is made of each oid inserted into set 'dest'.
> > + */
>
> Just above in oid_insert() there is already a comment about needing to
> copy each oid.

(It's "oidset_insert()" not "oid_insert()".)

>     /**
>      * Insert the oid into the set; a copy is made, so "oid" does not need
>      * to persist after this function is called.
>      *
>      * Returns 1 if the oid was already in the set, 0 otherwise. This can be used
>      * to perform an efficient check-and-add.
>      */
>
> so perhaps the following wording is simpler?
>
>     Like oid_insert(), but insert all oids found in 'src'. Calls
>     oid_insert() internally.

(What you suggest would need s/oid_insert/oidset_insert/)

Yeah, it's a bit simpler and shorter, but on the other hand a reader
might have to read both this and the oidset_insert() doc, so in the
end I am not sure it's a big win for readability. And if they don't
read the oidset_insert() doc, they might miss the fact that we are
copying the oids we insert, which might result in a bug.

Also your wording ties the implementation with oidset_insert(), which
we might not want if we could find something more performant. See
Junio's comment on this patch saying his initial reaction was that
copying underlying bits may even be more efficient.

So I prefer not to change this.

> > +void oidset_insert_from_set(struct oidset *dest, struct oidset *src);
>
> Perhaps "oidset_insert_all" would be a simpler name? I generally prefer
> to reuse any descriptors in comments to guide the names. Plus this
> function used to be called "add_all()" so keeping the "all" naming style
> feels right.

We already have other related types like 'struct oid-array' and
'struct oidmap' to store oids, as well as code that inserts many oids
into an oidset from a 'struct ref *' linked list or array in a tight
loop. So if we want to add functions inserting all the oids from
instances of such types, how should we call them?

I would say we should use suffixes like: "_from_set", "_from_map",
"from_array", "_from_ref_list", "_from_ref_array", etc.

If we start using just "_all" for oidset, then what should we use for
the other types? I don't see a good answer to that, so I prefer to
stick with "_from_set" for oidset.

Thanks.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*]
  2024-02-13 22:38     ` Linus Arver
@ 2024-02-14 14:34       ` Christian Couder
  2024-02-14 16:49         ` Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: Christian Couder @ 2024-02-14 14:34 UTC (permalink / raw)
  To: Linus Arver
  Cc: Junio C Hamano, git, Patrick Steinhardt, John Cai, Christian Couder

On Tue, Feb 13, 2024 at 11:38 PM Linus Arver <linusa@google.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Christian Couder <christian.couder@gmail.com> writes:
> >
> >> +     * mechanism to allow and disallow some sets of options for
> >> +     * different commands (like rev-list, replay, etc). Such
> >> +     * mechanism should do an early parsing of option and be able
> >> +     * to manage the `--exclude-promisor-objects` and `--missing=...`
> >> +     * options below.
> >> +     */
> >>      for (i = 1; i < argc; i++) {
> >>              const char *arg = argv[i];
> >>              if (!strcmp(arg, "--exclude-promisor-objects")) {
> >> @@ -753,8 +762,12 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
> >>
> >>      if (arg_print_omitted)
> >>              oidset_init(&omitted_objects, DEFAULT_OIDSET_SIZE);
> >> -    if (arg_missing_action == MA_PRINT)
> >> +    if (arg_missing_action == MA_PRINT) {
> >>              oidset_init(&missing_objects, DEFAULT_OIDSET_SIZE);
> >> +            /* Already add missing tips */
> >> +            oidset_insert_from_set(&missing_objects, &revs.missing_commits);
> >> +            oidset_clear(&revs.missing_commits);
> >> +    }
> >
> > It is unclear what "already" here refers to, at least to me.

I wanted to hint that we already have some missing objects that we can
add to the set. But it's not an important detail and I agree it can be
confusing.

> It's grammatically correct but perhaps a bit "over eager" (gives the
> impression that we get these missing tips all the time and is a common
> "happy" path). I do still think my earlier v1 comments
>
>     Did you mean "Add already-missing commits"? Perhaps even more explicit
>     would be "Add missing tips"?

"Add missing tips" is used in the V3 I just sent. Thanks.

> are relevant here.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*]
  2024-02-13 22:33   ` Linus Arver
@ 2024-02-14 14:38     ` Christian Couder
  2024-02-16  1:24       ` Linus Arver
  0 siblings, 1 reply; 30+ messages in thread
From: Christian Couder @ 2024-02-14 14:38 UTC (permalink / raw)
  To: Linus Arver
  Cc: git, Junio C Hamano, Patrick Steinhardt, John Cai, Christian Couder

On Tue, Feb 13, 2024 at 11:33 PM Linus Arver <linusa@google.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:

> > We could introduce a new option to make it work like this, but most
> > users are likely to prefer the command to have this behavior as the
> > default one. Introducing a new option would require another dumb loop
> > to look for that options early, which isn't nice.
>
> s/options/option

Fixed in the V3 I just sent.

> > Also we made `git rev-list` work with missing commits very recently
> > and the command is most often passed commits as arguments. So let's
> > consider this as a bug fix related to these previous change.
>
> s/previous change/recent changes

Fixed in V3, thanks.

> > While at it let's add a NEEDSWORK comment to say that we should get
> > rid of the existing ugly dumb loops that parse the
> > `--exclude-promisor-objects` and `--missing=...` options early.

> > @@ -1019,6 +1019,10 @@ Unexpected missing objects will raise an error.
> >  +
> >  The form '--missing=print' is like 'allow-any', but will also print a
> >  list of the missing objects.  Object IDs are prefixed with a ``?'' character.
> > ++
> > +If some tips passed to the traversal are missing, they will be
> > +considered as missing too, and the traversal will ignore them. In case
> > +we cannot get their Object ID though, an error will be raised.
>
> The only other mention of the term "tips" is for the "--alternate-refs"
> flag on line 215 which uses "ref tips". Perhaps this is obvious to
> veteran Git developers but I do wonder if we need to somehow define it
> (however briefly) the first time we mention it (either in the document
> as a whole, or just within this newly added paragraph).

I did a quick grep in Documentation/git*.txt and found more than 130
instances of the 'tip' word. So I think it is quite common in the
whole Git documentation and our glossary would likely be the right
place to document it if we decide to do so. Anyway I think that topic
is independent from this small series.

> Here's an alternate wording
>
>     Ref tips given on the command line are a special case.

`git rev-list` has a `--stdin` mode which makes it accept tips from
stdin, so talking about the command line is not enough. Also maybe one
day some config option could be added that makes the command include
additional tips.

> They are
>     first dereferenced to Object IDs (if this is not possible, an error
>     will be raised). Then these IDs are checked to see if the objects
>     they refer to exist. If so, the traversal happens starting with
>     these tips. Otherwise, then such tips will not be used for
>     traversal.
>
>     Even though such missing tips won't be included for traversal, for
>     purposes of the `--missing` flag they will be treated the same way
>     as those objects that did get traversed (and were determined to be
>     missing). For example, if `--missing=print` is given then the Object
>     IDs of these tips will be printed just like all other missing
>     objects encountered during traversal.

Your wording describes what happens correctly, but I don't see much
added value for this specific patch compared to my wording which is
shorter.

Here I think, we only need to describe the result of the command in
the special case that the patch is fixing. We don't need to go into
details of how the command or even --missing works. It could be
interesting to go into details of how things work, but I think it's a
separate topic. Or perhaps it's even actually counter productive to go
into too much detail as it could prevent us from finding other ways to
make it work better. Anyway it seems to me to be a separate topic to
discuss.

> But also, I realize that these documentation touch-ups might be better
> served by an overall pass over the whole document, so it's fine if we
> decide not to take this suggestion at this time.

Right, I agree. Thanks for telling this.

> Aside: unfortunately we don't seem to define the relationship between
> ref tips (e.g., "HEAD"), object IDs (40-char hex string), and the actual
> objects (the real data that we traverse over). It's probably another
> thing that could be fixed up in the docs in the future.

Yeah, and for rev-list a tip could also be a tree or a blob. It
doesn't need to be a "ref tip". (Even though a ref can point to a tree
or a blog, it's very rare in practice.)

> >  --exclude-promisor-objects::
> >       (For internal use only.)  Prefilter object traversal at
> > diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> > index b3f4783858..ec9556f135 100644
> > --- a/builtin/rev-list.c
> > +++ b/builtin/rev-list.c
> > @@ -545,6 +545,15 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
> >        *
> >        * Let "--missing" to conditionally set fetch_if_missing.
> >        */
> > +     /*
> > +      * NEEDSWORK: These dump loops to look for some options early
> > +      * are ugly.
>
> I agree with Junio's suggestion to use more objective language.
>
> > We really need setup_revisions() to have a
> > +      * mechanism to allow and disallow some sets of options for
> > +      * different commands (like rev-list, replay, etc). Such
>
> s/Such/Such a

Fixed in V3

> > +      * mechanism should do an early parsing of option and be able
>
> s/option/options

Fixed in V3, thanks.

> > +      * to manage the `--exclude-promisor-objects` and `--missing=...`
> > +      * options below.
> > +      */
> >       for (i = 1; i < argc; i++) {
> >               const char *arg = argv[i];
> >               if (!strcmp(arg, "--exclude-promisor-objects")) {
> >
> > [...]
> >
> > @@ -2178,13 +2183,18 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl
> >       if (revarg_opt & REVARG_COMMITTISH)
> >               get_sha1_flags |= GET_OID_COMMITTISH;
> >
> > +     /*
> > +      * Even if revs->do_not_die_on_missing_objects is set, we
> > +      * should error out if we can't even get an oid, ...
> > +      */
>
> Perhaps this wording is more precise?
>
>     If we can't even get an oid, we are forced to error out (regardless
>     of revs->do_not_die_on_missing_objects) because a valid traversal
>     must start from *some* valid oid. OTOH we ignore the ref tip
>     altogether with revs->ignore_missing.

This uses "valid oid" and "valid traversal", but I am not sure it's
easy to understand what "valid" means in both of these expressions.

Also if all the tips passed to the command are missing, the traversal
doesn't need to actually start. The command, assuming
`--missing=print` is passed, just needs to output the oids of the tips
as missing oids.

I also think that "ref tip" might be misleading as trees and blos can
be passed as tips.

So I prefer to keep the wording I used.

> > +      * ... as
> > +      * `--missing=print` should be able to report missing oids.
>
> I think this comment would be better placed ...
>
> >       if (get_oid_with_context(revs->repo, arg, get_sha1_flags, &oid, &oc))
> >               return revs->ignore_missing ? 0 : -1;
> >       if (!cant_be_filename)
> >               verify_non_filename(revs->prefix, arg);
> >       object = get_reference(revs, arg, &oid, flags ^ local_flags);
>
> ... around here.

In a previous round, I was asked to put such a comment before `if
(get_oid_with_context(...))`. So I prefer to avoid some back and forth
here.

> > +++ b/t/t6022-rev-list-missing.sh
> > @@ -78,4 +78,60 @@ do
> >       done
> >  done
> >
> > +for missing_tip in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
> > +do
> > +     # We want to check that things work when both
> > +     #   - all the tips passed are missing (case existing_tip = ""), and
> > +     #   - there is one missing tip and one existing tip (case existing_tip = "HEAD")
> > +     for existing_tip in "" "HEAD"
> > +     do
>
> Though I am biased, these new variable names do make this test that much
> easier to read. Thanks.

Thanks for suggesting them and for your reviews.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*]
  2024-02-08 17:44   ` Junio C Hamano
  2024-02-13 22:38     ` Linus Arver
@ 2024-02-14 14:39     ` Christian Couder
  1 sibling, 0 replies; 30+ messages in thread
From: Christian Couder @ 2024-02-14 14:39 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Patrick Steinhardt, John Cai, Linus Arver, Christian Couder

On Thu, Feb 8, 2024 at 6:44 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:

> > diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> > index b3f4783858..ec9556f135 100644
> > --- a/builtin/rev-list.c
> > +++ b/builtin/rev-list.c
> > @@ -545,6 +545,15 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
> >        *
> >        * Let "--missing" to conditionally set fetch_if_missing.
> >        */
> > +     /*
> > +      * NEEDSWORK: These dump loops to look for some options early
> > +      * are ugly. We really need setup_revisions() to have a
>
> I would drop the "dump" and "ugly" from here.  It OK to make value
> judgement with such words in casual conversations on a proposed
> patch, but when we make a request to future developers to fix our
> mess, we should be more objective and stick to the techincal facts,
> so that they have better understanding on why we think this area
> needs more work.
>
> Perhaps something like:
>
>     These loops that attempt to find presence of options without
>     understanding what the options they are skipping are broken
>     (e.g., it would not know "--grep --exclude-promisor-objects" is
>     not triggering "--exclude-promisor-objects" option).

I have used what you suggest in V3, except for s/what/that/.

> Everything after "We really need" is good (modulo possible grammos),
> I think.  Thanks for writing it.
>
> > +      * mechanism to allow and disallow some sets of options for
> > +      * different commands (like rev-list, replay, etc). Such
> > +      * mechanism should do an early parsing of option and be able
> > +      * to manage the `--exclude-promisor-objects` and `--missing=...`
> > +      * options below.
> > +      */
> >       for (i = 1; i < argc; i++) {
> >               const char *arg = argv[i];
> >               if (!strcmp(arg, "--exclude-promisor-objects")) {
> > @@ -753,8 +762,12 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
> >
> >       if (arg_print_omitted)
> >               oidset_init(&omitted_objects, DEFAULT_OIDSET_SIZE);
> > -     if (arg_missing_action == MA_PRINT)
> > +     if (arg_missing_action == MA_PRINT) {
> >               oidset_init(&missing_objects, DEFAULT_OIDSET_SIZE);
> > +             /* Already add missing tips */
> > +             oidset_insert_from_set(&missing_objects, &revs.missing_commits);
> > +             oidset_clear(&revs.missing_commits);
> > +     }
>
> It is unclear what "already" here refers to, at least to me.

I removed it and changed the comment to just "/* Add missing tips */"
in the V3 I just sent.

Thanks.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*]
  2024-02-14 14:34       ` Christian Couder
@ 2024-02-14 16:49         ` Junio C Hamano
  0 siblings, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2024-02-14 16:49 UTC (permalink / raw)
  To: Christian Couder
  Cc: Linus Arver, git, Patrick Steinhardt, John Cai, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

>> >> +            /* Already add missing tips */
>> >> +            oidset_insert_from_set(&missing_objects, &revs.missing_commits);
>> >> +            oidset_clear(&revs.missing_commits);
>> >> +    }
>> >
>> > It is unclear what "already" here refers to, at least to me.
>
> I wanted to hint that we already have some missing objects that we can
> add to the set. But it's not an important detail and I agree it can be
> confusing.

I was confused primarily because "already" was not sitting next to
"missing".  I would have understood "Add already-missing tips" just
fine ;-)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 2/4] oidset: refactor oidset_insert_from_set()
  2024-02-14 14:33     ` Christian Couder
@ 2024-02-16  1:10       ` Linus Arver
  2024-02-16 10:38         ` Christian Couder
  0 siblings, 1 reply; 30+ messages in thread
From: Linus Arver @ 2024-02-16  1:10 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Patrick Steinhardt, John Cai, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> On Tue, Feb 13, 2024 at 10:02 PM Linus Arver <linusa@google.com> wrote:
>>
>> Christian Couder <christian.couder@gmail.com> writes:
>>
>> > In a following commit, we will need to add all the oids from a set into
>> > another set. In "list-objects-filter.c", there is already a static
>> > function called add_all() to do that.
>>
>> Nice find.
>>
>> > Let's rename this function oidset_insert_from_set() and move it into
>> > oidset.{c,h} to make it generally available.
>>
>> At some point (I don't ask for it in this series) we should add unit
>> tests for this newly-exposed function. Presumably the stuff around
>> object/oid handling is stable enough to receive unit tests.
>
> Yeah, ideally there should be unit tests for oidset and all its
> features, but it seems to me that there aren't any. Also oidset is
> based on khash.h which was originally imported from
> https://github.com/attractivechaos/klib/ without tests. So I think
> it's a different topic to add tests from scratch to oidset, khash.h or
> both.
>
> Actually after taking another look, it looks like khash.h or some of
> its features are tested through "helper/test-oidmap.c" and
> "t0016-oidmap.sh". I still think it's another topic to test oidset.

Agreed.

>> > +void oidset_insert_from_set(struct oidset *dest, struct oidset *src)
>> > +{
>> > +     struct oidset_iter iter;
>> > +     struct object_id *src_oid;
>> > +
>> > +     oidset_iter_init(src, &iter);
>> > +     while ((src_oid = oidset_iter_next(&iter)))
>>
>> Are the extra parentheses necessary?
>
> Yes. Without them gcc errors out with:
>
> oidset.c: In function ‘oidset_insert_from_set’:
> oidset.c:32:16: error: suggest parentheses around assignment used as
> truth value [-Werror=parentheses]
>   32 |         while (src_oid = oidset_iter_next(&iter))
>      |                ^~~~~~~
>
> Having extra parentheses is a way to tell the compiler that we do want
> to use '=' and not '=='. This helps avoid the very common mistake of
> using '=' where '==' was intended.

Ah, so it is a "please trust me gcc, I know what I am doing" thing and
not a "this is required in C" thing. Makes sense, thanks for clarifying.

Sorry for the noise.

>> > +/**
>> > + * Insert all the oids that are in set 'src' into set 'dest'; a copy
>> > + * is made of each oid inserted into set 'dest'.
>> > + */
>>
>> Just above in oid_insert() there is already a comment about needing to
>> copy each oid.
>
> (It's "oidset_insert()" not "oid_insert()".)

Oops, yes, sorry for the typo.

>>     /**
>>      * Insert the oid into the set; a copy is made, so "oid" does not need
>>      * to persist after this function is called.
>>      *
>>      * Returns 1 if the oid was already in the set, 0 otherwise. This can be used
>>      * to perform an efficient check-and-add.
>>      */
>>
>> so perhaps the following wording is simpler?
>>
>>     Like oid_insert(), but insert all oids found in 'src'. Calls
>>     oid_insert() internally.
>
> (What you suggest would need s/oid_insert/oidset_insert/)
>
> Yeah, it's a bit simpler and shorter, but on the other hand a reader
> might have to read both this and the oidset_insert() doc, so in the
> end I am not sure it's a big win for readability. And if they don't
> read the oidset_insert() doc, they might miss the fact that we are
> copying the oids we insert, which might result in a bug.

When functions are built on top of other functions, I think it is good
practice to point readers to those underlying functions. In this case
the new function is a wrapper around oidset_insert() which does all the
real work. Plus the helper function already has some documentation about
copying behavior that we already thought was important enough to call
out explicitly.

So, tying this definition to that (foundational) helper function sounds
like a good idea to me in terms of readability. IOW we can inform
readers "hey, we're just a wrapper around this other important function
--- go there if you're curious about internals" and emphasizing that
sort of relationship which may not be immediately obvious to those not
familiar with this area would be nice.

Alternatively, we could repeat the same comment WRT copying here but
that seems redundant and prone to maintenance burdens down the road (if
we ever change this behavior we have to change the comment in multiple
functions, possibly).

> Also your wording ties the implementation with oidset_insert(), which
> we might not want if we could find something more performant. See
> Junio's comment on this patch saying his initial reaction was that
> copying underlying bits may even be more efficient.
>
> So I prefer not to change this.

OK.

>> > +void oidset_insert_from_set(struct oidset *dest, struct oidset *src);
>>
>> Perhaps "oidset_insert_all" would be a simpler name? I generally prefer
>> to reuse any descriptors in comments to guide the names. Plus this
>> function used to be called "add_all()" so keeping the "all" naming style
>> feels right.
>
> We already have other related types like 'struct oid-array' and
> 'struct oidmap' to store oids, as well as code that inserts many oids
> into an oidset from a 'struct ref *' linked list or array in a tight
> loop.

Thank you for the additional context I was not aware of.

> So if we want to add functions inserting all the oids from
> instances of such types, how should we call them?
>
> I would say we should use suffixes like: "_from_set", "_from_map",
> "from_array", "_from_ref_list", "_from_ref_array", etc.

I agree.

However, I would like to point out that the function being added in this
patch is a bit special: it is inserting from one "oidset" into another
"oidset". IOW the both the dest and src types are the same.

For the cases where the types are different, I totally agree that using
the suffixes (to encode the type information of the src into the
function name itself) is a good idea.

So I think it's still fine to use "oidset_insert_all" because the only
type in the parameter list is an oidset.

BUT, maybe in our codebase we already use suffixes like this even for
cases where the types are the same? I don't know the answer to this
question. However if we really wanted to be consistent then maybe we
should be using the name oidset_insert_from_oidset() and not
oidset_insert_from_set().

> If we start using just "_all" for oidset, then what should we use for
> the other types? I don't see a good answer to that, so I prefer to
> stick with "_from_set" for oidset.
>
> Thanks.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*]
  2024-02-14 14:38     ` Christian Couder
@ 2024-02-16  1:24       ` Linus Arver
  0 siblings, 0 replies; 30+ messages in thread
From: Linus Arver @ 2024-02-16  1:24 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Patrick Steinhardt, John Cai, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> On Tue, Feb 13, 2024 at 11:33 PM Linus Arver <linusa@google.com> wrote:
>>
>> Christian Couder <christian.couder@gmail.com> writes:
>> > While at it let's add a NEEDSWORK comment to say that we should get
>> > rid of the existing ugly dumb loops that parse the
>> > `--exclude-promisor-objects` and `--missing=...` options early.
>
>> > @@ -1019,6 +1019,10 @@ Unexpected missing objects will raise an error.
>> >  +
>> >  The form '--missing=print' is like 'allow-any', but will also print a
>> >  list of the missing objects.  Object IDs are prefixed with a ``?'' character.
>> > ++
>> > +If some tips passed to the traversal are missing, they will be
>> > +considered as missing too, and the traversal will ignore them. In case
>> > +we cannot get their Object ID though, an error will be raised.
>>
>> The only other mention of the term "tips" is for the "--alternate-refs"
>> flag on line 215 which uses "ref tips". Perhaps this is obvious to
>> veteran Git developers but I do wonder if we need to somehow define it
>> (however briefly) the first time we mention it (either in the document
>> as a whole, or just within this newly added paragraph).
>
> I did a quick grep in Documentation/git*.txt and found more than 130
> instances of the 'tip' word. So I think it is quite common in the
> whole Git documentation and our glossary would likely be the right
> place to document it if we decide to do so.

SGTM.

> Anyway I think that topic
> is independent from this small series.

Agreed.

>> Here's an alternate wording
>>
>>     Ref tips given on the command line are a special case.
>
> `git rev-list` has a `--stdin` mode which makes it accept tips from
> stdin

Ah, thanks for the context.

> , so talking about the command line is not enough. Also maybe one
> day some config option could be added that makes the command include
> additional tips.

>> They are
>>     first dereferenced to Object IDs (if this is not possible, an error
>>     will be raised). Then these IDs are checked to see if the objects
>>     they refer to exist. If so, the traversal happens starting with
>>     these tips. Otherwise, then such tips will not be used for
>>     traversal.
>>
>>     Even though such missing tips won't be included for traversal, for
>>     purposes of the `--missing` flag they will be treated the same way
>>     as those objects that did get traversed (and were determined to be
>>     missing). For example, if `--missing=print` is given then the Object
>>     IDs of these tips will be printed just like all other missing
>>     objects encountered during traversal.
>
> Your wording describes what happens correctly, but I don't see much
> added value for this specific patch compared to my wording which is
> shorter.
>
> Here I think, we only need to describe the result of the command in
> the special case that the patch is fixing. We don't need to go into
> details of how the command or even --missing works. It could be
> interesting to go into details of how things work, but I think it's a
> separate topic. Or perhaps it's even actually counter productive to go
> into too much detail as it could prevent us from finding other ways to
> make it work better. Anyway it seems to me to be a separate topic to
> discuss.

Fair enough.

>> But also, I realize that these documentation touch-ups might be better
>> served by an overall pass over the whole document, so it's fine if we
>> decide not to take this suggestion at this time.
>
> Right, I agree. Thanks for telling this.
>
>> Aside: unfortunately we don't seem to define the relationship between
>> ref tips (e.g., "HEAD"), object IDs (40-char hex string), and the actual
>> objects (the real data that we traverse over). It's probably another
>> thing that could be fixed up in the docs in the future.
>
> Yeah, and for rev-list a tip could also be a tree or a blob. It
> doesn't need to be a "ref tip". (Even though a ref can point to a tree
> or a blog, it's very rare in practice.)

Interesting, thanks for the info.

BTW I appreciate you (and others on the list too) taking the time to
explain such subtleties. Although I've been using Git since 2008 a lot
of the terms used around in the codebase can feel quite foreign to me.
So, thanks again.

>> > +      * to manage the `--exclude-promisor-objects` and `--missing=...`
>> > +      * options below.
>> > +      */
>> >       for (i = 1; i < argc; i++) {
>> >               const char *arg = argv[i];
>> >               if (!strcmp(arg, "--exclude-promisor-objects")) {
>> >
>> > [...]
>> >
>> > @@ -2178,13 +2183,18 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl
>> >       if (revarg_opt & REVARG_COMMITTISH)
>> >               get_sha1_flags |= GET_OID_COMMITTISH;
>> >
>> > +     /*
>> > +      * Even if revs->do_not_die_on_missing_objects is set, we
>> > +      * should error out if we can't even get an oid, ...
>> > +      */
>>
>> Perhaps this wording is more precise?
>>
>>     If we can't even get an oid, we are forced to error out (regardless
>>     of revs->do_not_die_on_missing_objects) because a valid traversal
>>     must start from *some* valid oid. OTOH we ignore the ref tip
>>     altogether with revs->ignore_missing.
>
> This uses "valid oid" and "valid traversal", but I am not sure it's
> easy to understand what "valid" means in both of these expressions.
>
> Also if all the tips passed to the command are missing, the traversal
> doesn't need to actually start. The command, assuming
> `--missing=print` is passed, just needs to output the oids of the tips
> as missing oids.
>
> I also think that "ref tip" might be misleading as trees and blos can
> be passed as tips.
>
> So I prefer to keep the wording I used.

Makes sense, SGTM.

>> > +      * ... as
>> > +      * `--missing=print` should be able to report missing oids.
>>
>> I think this comment would be better placed ...
>>
>> >       if (get_oid_with_context(revs->repo, arg, get_sha1_flags, &oid, &oc))
>> >               return revs->ignore_missing ? 0 : -1;
>> >       if (!cant_be_filename)
>> >               verify_non_filename(revs->prefix, arg);
>> >       object = get_reference(revs, arg, &oid, flags ^ local_flags);
>>
>> ... around here.
>
> In a previous round, I was asked to put such a comment before `if
> (get_oid_with_context(...))`.

Sorry, I missed that.

> So I prefer to avoid some back and forth
> here.

SGTM.

>> > +++ b/t/t6022-rev-list-missing.sh
>> > @@ -78,4 +78,60 @@ do
>> >       done
>> >  done
>> >
>> > +for missing_tip in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
>> > +do
>> > +     # We want to check that things work when both
>> > +     #   - all the tips passed are missing (case existing_tip = ""), and
>> > +     #   - there is one missing tip and one existing tip (case existing_tip = "HEAD")
>> > +     for existing_tip in "" "HEAD"
>> > +     do
>>
>> Though I am biased, these new variable names do make this test that much
>> easier to read. Thanks.
>
> Thanks for suggesting them and for your reviews.

You're welcome!

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 2/4] oidset: refactor oidset_insert_from_set()
  2024-02-16  1:10       ` Linus Arver
@ 2024-02-16 10:38         ` Christian Couder
  2024-02-16 20:27           ` Linus Arver
  0 siblings, 1 reply; 30+ messages in thread
From: Christian Couder @ 2024-02-16 10:38 UTC (permalink / raw)
  To: Linus Arver
  Cc: git, Junio C Hamano, Patrick Steinhardt, John Cai, Christian Couder

On Fri, Feb 16, 2024 at 2:10 AM Linus Arver <linusa@google.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > On Tue, Feb 13, 2024 at 10:02 PM Linus Arver <linusa@google.com> wrote:

> >> Are the extra parentheses necessary?
> >
> > Yes. Without them gcc errors out with:
> >
> > oidset.c: In function ‘oidset_insert_from_set’:
> > oidset.c:32:16: error: suggest parentheses around assignment used as
> > truth value [-Werror=parentheses]
> >   32 |         while (src_oid = oidset_iter_next(&iter))
> >      |                ^~~~~~~
> >
> > Having extra parentheses is a way to tell the compiler that we do want
> > to use '=' and not '=='. This helps avoid the very common mistake of
> > using '=' where '==' was intended.
>
> Ah, so it is a "please trust me gcc, I know what I am doing" thing and
> not a "this is required in C" thing. Makes sense, thanks for clarifying.
>
> Sorry for the noise.

No worries. It's better to ask in case of doubt.

> >> so perhaps the following wording is simpler?
> >>
> >>     Like oid_insert(), but insert all oids found in 'src'. Calls
> >>     oid_insert() internally.
> >
> > (What you suggest would need s/oid_insert/oidset_insert/)
> >
> > Yeah, it's a bit simpler and shorter, but on the other hand a reader
> > might have to read both this and the oidset_insert() doc, so in the
> > end I am not sure it's a big win for readability. And if they don't
> > read the oidset_insert() doc, they might miss the fact that we are
> > copying the oids we insert, which might result in a bug.
>
> When functions are built on top of other functions, I think it is good
> practice to point readers to those underlying functions. In this case
> the new function is a wrapper around oidset_insert() which does all the
> real work. Plus the helper function already has some documentation about
> copying behavior that we already thought was important enough to call
> out explicitly.
>
> So, tying this definition to that (foundational) helper function sounds
> like a good idea to me in terms of readability. IOW we can inform
> readers "hey, we're just a wrapper around this other important function
> --- go there if you're curious about internals" and emphasizing that
> sort of relationship which may not be immediately obvious to those not
> familiar with this area would be nice.
>
> Alternatively, we could repeat the same comment WRT copying here but
> that seems redundant and prone to maintenance burdens down the road (if
> we ever change this behavior we have to change the comment in multiple
> functions, possibly).
>
> > Also your wording ties the implementation with oidset_insert(), which
> > we might not want if we could find something more performant. See
> > Junio's comment on this patch saying his initial reaction was that
> > copying underlying bits may even be more efficient.
> >
> > So I prefer not to change this.
>
> OK.

I must say that in cases like this, it's difficult to be right for
sure because it's mostly with enough hindsight that we can tell what
turned out to be a good decision. Here for example, it might be that
someone will find something more performant soon or it might turn out
that the function will never change. We just can't know.

So as long as the wording is clear and good enough, I think there is
no point in trying to improve it as much as possible. Here both your
wording and my wording seem clear and good enough to me. Junio might
change his mind but so far it seems that he found my wording good
enough too. So in cases like this, it's just simpler to keep current
wording. This way I think there is a higher chance that things can be
merged sooner and that we can all be more efficient.

> >> > +void oidset_insert_from_set(struct oidset *dest, struct oidset *src);
> >>
> >> Perhaps "oidset_insert_all" would be a simpler name? I generally prefer
> >> to reuse any descriptors in comments to guide the names. Plus this
> >> function used to be called "add_all()" so keeping the "all" naming style
> >> feels right.
> >
> > We already have other related types like 'struct oid-array' and
> > 'struct oidmap' to store oids, as well as code that inserts many oids
> > into an oidset from a 'struct ref *' linked list or array in a tight
> > loop.
>
> Thank you for the additional context I was not aware of.
>
> > So if we want to add functions inserting all the oids from
> > instances of such types, how should we call them?
> >
> > I would say we should use suffixes like: "_from_set", "_from_map",
> > "from_array", "_from_ref_list", "_from_ref_array", etc.
>
> I agree.
>
> However, I would like to point out that the function being added in this
> patch is a bit special: it is inserting from one "oidset" into another
> "oidset". IOW the both the dest and src types are the same.
>
> For the cases where the types are different, I totally agree that using
> the suffixes (to encode the type information of the src into the
> function name itself) is a good idea.
>
> So I think it's still fine to use "oidset_insert_all" because the only
> type in the parameter list is an oidset.

Yeah, here also I think both "oidset_insert_from_set" and
"oidset_insert_all" are clear and good enough.

> BUT, maybe in our codebase we already use suffixes like this even for
> cases where the types are the same? I don't know the answer to this
> question.

I agree that it could be a good thing to be consistent with similar
structs, but so far for oidmap there is only oidmap_put(), and, for
oid-array, only oid_array_append(). (And no, I didn't look further
than this.)

> However if we really wanted to be consistent then maybe we
> should be using the name oidset_insert_from_oidset() and not
> oidset_insert_from_set().

Yeah, "oidset_insert_from_oidset" and perhaps
"oidset_insert_all_from_oidset" would probably be fine too. Junio
found my wording good enough though, so I think it's just simpler not
to change it.

Also it's not like it can't be improved later if there is a good
reason like consistency with other oid related structs that might get
oidmap_put_all() or oid_array_append_all(). But again we can't predict
what will happen, so...

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 2/4] oidset: refactor oidset_insert_from_set()
  2024-02-16 10:38         ` Christian Couder
@ 2024-02-16 20:27           ` Linus Arver
  0 siblings, 0 replies; 30+ messages in thread
From: Linus Arver @ 2024-02-16 20:27 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Patrick Steinhardt, John Cai, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> On Fri, Feb 16, 2024 at 2:10 AM Linus Arver <linusa@google.com> wrote:
>>
>> Christian Couder <christian.couder@gmail.com> writes:
>> >> so perhaps the following wording is simpler?
>> >>
>> >>     Like oid_insert(), but insert all oids found in 'src'. Calls
>> >>     oid_insert() internally.
>> >
>> > (What you suggest would need s/oid_insert/oidset_insert/)
>> >
>> > Yeah, it's a bit simpler and shorter, but on the other hand a reader
>> > might have to read both this and the oidset_insert() doc, so in the
>> > end I am not sure it's a big win for readability. And if they don't
>> > read the oidset_insert() doc, they might miss the fact that we are
>> > copying the oids we insert, which might result in a bug.
>>
>> When functions are built on top of other functions, I think it is good
>> practice to point readers to those underlying functions. In this case
>> the new function is a wrapper around oidset_insert() which does all the
>> real work. Plus the helper function already has some documentation about
>> copying behavior that we already thought was important enough to call
>> out explicitly.
>>
>> So, tying this definition to that (foundational) helper function sounds
>> like a good idea to me in terms of readability. IOW we can inform
>> readers "hey, we're just a wrapper around this other important function
>> --- go there if you're curious about internals" and emphasizing that
>> sort of relationship which may not be immediately obvious to those not
>> familiar with this area would be nice.
>>
>> Alternatively, we could repeat the same comment WRT copying here but
>> that seems redundant and prone to maintenance burdens down the road (if
>> we ever change this behavior we have to change the comment in multiple
>> functions, possibly).
>>
>> > Also your wording ties the implementation with oidset_insert(), which
>> > we might not want if we could find something more performant. See
>> > Junio's comment on this patch saying his initial reaction was that
>> > copying underlying bits may even be more efficient.
>> >
>> > So I prefer not to change this.
>>
>> OK.
>
> I must say that in cases like this, it's difficult to be right for
> sure because it's mostly with enough hindsight that we can tell what
> turned out to be a good decision. Here for example, it might be that
> someone will find something more performant soon or it might turn out
> that the function will never change. We just can't know.
>
> So as long as the wording is clear and good enough, I think there is
> no point in trying to improve it as much as possible. Here both your
> wording and my wording seem clear and good enough to me. Junio might
> change his mind but so far it seems that he found my wording good
> enough too. So in cases like this, it's just simpler to keep current
> wording.

Sounds very reasonable.

> This way I think there is a higher chance that things can be
> merged sooner and that we can all be more efficient.

Thank you for pointing this out. There is definitely a balance between
trying to find the best possible solution (which may require a much
deeper analysis of the codebase, existing usage patterns, future
prospects in this area, etc) and getting something that's good enough.

Somehow I was under the impression that we always wanted the best
possible thing during the review process (regardless of the number of
rerolls), but you make a good point about "code review ergonomics", if
you will. And on top of that I fully agree with all of your other
comments below, so, SGTM. Thanks.

>> >> > +void oidset_insert_from_set(struct oidset *dest, struct oidset *src);
>> >>
>> >> Perhaps "oidset_insert_all" would be a simpler name? I generally prefer
>> >> to reuse any descriptors in comments to guide the names. Plus this
>> >> function used to be called "add_all()" so keeping the "all" naming style
>> >> feels right.
>> >
>> > We already have other related types like 'struct oid-array' and
>> > 'struct oidmap' to store oids, as well as code that inserts many oids
>> > into an oidset from a 'struct ref *' linked list or array in a tight
>> > loop.
>>
>> Thank you for the additional context I was not aware of.
>>
>> > So if we want to add functions inserting all the oids from
>> > instances of such types, how should we call them?
>> >
>> > I would say we should use suffixes like: "_from_set", "_from_map",
>> > "from_array", "_from_ref_list", "_from_ref_array", etc.
>>
>> I agree.
>>
>> However, I would like to point out that the function being added in this
>> patch is a bit special: it is inserting from one "oidset" into another
>> "oidset". IOW the both the dest and src types are the same.
>>
>> For the cases where the types are different, I totally agree that using
>> the suffixes (to encode the type information of the src into the
>> function name itself) is a good idea.
>>
>> So I think it's still fine to use "oidset_insert_all" because the only
>> type in the parameter list is an oidset.
>
> Yeah, here also I think both "oidset_insert_from_set" and
> "oidset_insert_all" are clear and good enough.
>
>> BUT, maybe in our codebase we already use suffixes like this even for
>> cases where the types are the same? I don't know the answer to this
>> question.
>
> I agree that it could be a good thing to be consistent with similar
> structs, but so far for oidmap there is only oidmap_put(), and, for
> oid-array, only oid_array_append(). (And no, I didn't look further
> than this.)
>
>> However if we really wanted to be consistent then maybe we
>> should be using the name oidset_insert_from_oidset() and not
>> oidset_insert_from_set().
>
> Yeah, "oidset_insert_from_oidset" and perhaps
> "oidset_insert_all_from_oidset" would probably be fine too. Junio
> found my wording good enough though, so I think it's just simpler not
> to change it.
>
> Also it's not like it can't be improved later if there is a good
> reason like consistency with other oid related structs that might get
> oidmap_put_all() or oid_array_append_all(). But again we can't predict
> what will happen, so...

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3 0/5] rev-list: allow missing tips with --missing
  2024-02-14 14:25 ` [PATCH v3 0/5] " Christian Couder
                     ` (4 preceding siblings ...)
  2024-02-14 14:25   ` [PATCH v3 5/5] rev-list: allow missing tips with --missing=[print|allow*] Christian Couder
@ 2024-02-16 21:56   ` Linus Arver
  2024-02-28  9:10   ` [PATCH] revision: fix --missing=[print|allow*] for annotated tags Christian Couder
  6 siblings, 0 replies; 30+ messages in thread
From: Linus Arver @ 2024-02-16 21:56 UTC (permalink / raw)
  To: Christian Couder, git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Eric Sunshine,
	Christian Couder

This v3 LGTM.

Reviewed-by: Linus Arver <linusa@google.com>

Thanks!

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] revision: fix --missing=[print|allow*] for annotated tags
  2024-02-14 14:25 ` [PATCH v3 0/5] " Christian Couder
                     ` (5 preceding siblings ...)
  2024-02-16 21:56   ` [PATCH v3 0/5] rev-list: allow missing tips with --missing Linus Arver
@ 2024-02-28  9:10   ` Christian Couder
  2024-02-28 17:46     ` Junio C Hamano
  6 siblings, 1 reply; 30+ messages in thread
From: Christian Couder @ 2024-02-28  9:10 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, John Cai, Linus Arver,
	Eric Sunshine, Christian Couder, Christian Couder

In 9830926c7d (rev-list: add commit object support in `--missing`
option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
so that it works with missing commits, not just blobs/trees.

Unfortunately, such a command was still failing with a "fatal: bad
object <oid>" if it was passed a missing commit, blob or tree as an
argument (before the rev walking even begins). This was fixed in a
recent commit.

That fix still doesn't work when an argument passed to the command is
an annotated tag pointing to a missing commit though. In that case
`git rev-list --missing=...` still errors out with a "fatal: bad
object <oid>" error where <oid> is the object ID of the missing
commit.

Let's fix this issue, and also, while at it, let's add tests not just
for annotated tags but also for regular tags and branches.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---

This is a follow up to cc/rev-list-allow-missing-tips which is in
'next' and scheduled to be merged to 'master'. So it depends on that
branch.

 revision.c                  |  8 +++++++-
 t/t6022-rev-list-missing.sh | 24 ++++++++++++++++++------
 2 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/revision.c b/revision.c
index 0c7266b1eb..8f0d638af1 100644
--- a/revision.c
+++ b/revision.c
@@ -419,15 +419,21 @@ static struct commit *handle_commit(struct rev_info *revs,
 	 */
 	while (object->type == OBJ_TAG) {
 		struct tag *tag = (struct tag *) object;
+		struct object_id *oid;
 		if (revs->tag_objects && !(flags & UNINTERESTING))
 			add_pending_object(revs, object, tag->tag);
-		object = parse_object(revs->repo, get_tagged_oid(tag));
+		oid = get_tagged_oid(tag);
+		object = parse_object(revs->repo, oid);
 		if (!object) {
 			if (revs->ignore_missing_links || (flags & UNINTERESTING))
 				return NULL;
 			if (revs->exclude_promisor_objects &&
 			    is_promisor_object(&tag->tagged->oid))
 				return NULL;
+			if (revs->do_not_die_on_missing_objects && oid) {
+				oidset_insert(&revs->missing_commits, oid);
+				return NULL;
+			}
 			die("bad object %s", oid_to_hex(&tag->tagged->oid));
 		}
 		object->flags |= flags;
diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
index 78387eebb3..127180e1c9 100755
--- a/t/t6022-rev-list-missing.sh
+++ b/t/t6022-rev-list-missing.sh
@@ -10,7 +10,10 @@ TEST_PASSES_SANITIZE_LEAK=true
 test_expect_success 'create repository and alternate directory' '
 	test_commit 1 &&
 	test_commit 2 &&
-	test_commit 3
+	test_commit 3 &&
+	git tag -m "tag message" annot_tag HEAD~1 &&
+	git tag regul_tag HEAD~1 &&
+	git branch a_branch HEAD~1
 '
 
 # We manually corrupt the repository, which means that the commit-graph may
@@ -78,7 +81,7 @@ do
 	done
 done
 
-for missing_tip in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
+for missing_tip in "annot_tag" "regul_tag" "a_branch" "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
 do
 	# We want to check that things work when both
 	#   - all the tips passed are missing (case existing_tip = ""), and
@@ -88,9 +91,6 @@ do
 		for action in "allow-any" "print"
 		do
 			test_expect_success "--missing=$action with tip '$missing_tip' missing and tip '$existing_tip'" '
-				oid="$(git rev-parse $missing_tip)" &&
-				path=".git/objects/$(test_oid_to_path $oid)" &&
-
 				# Before the object is made missing, we use rev-list to
 				# get the expected oids.
 				if test "$existing_tip" = "HEAD"
@@ -109,11 +109,23 @@ do
 					echo $(git rev-parse HEAD:2.t) >>expect.raw
 				fi &&
 
+				missing_oid="$(git rev-parse $missing_tip)" &&
+
+				if test "$missing_tip" = "annot_tag"
+				then
+					oid="$(git rev-parse $missing_tip^{commit})" &&
+					echo "$missing_oid" >>expect.raw
+				else
+					oid="$missing_oid"
+				fi &&
+
+				path=".git/objects/$(test_oid_to_path $oid)" &&
+
 				mv "$path" "$path.hidden" &&
 				test_when_finished "mv $path.hidden $path" &&
 
 				git rev-list --missing=$action --objects --no-object-names \
-				     $oid $existing_tip >actual.raw &&
+				     $missing_oid $existing_tip >actual.raw &&
 
 				# When the action is to print, we should also add the missing
 				# oid to the expect list.
-- 
2.44.0.297.g8ee07f1da4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] revision: fix --missing=[print|allow*] for annotated tags
  2024-02-28  9:10   ` [PATCH] revision: fix --missing=[print|allow*] for annotated tags Christian Couder
@ 2024-02-28 17:46     ` Junio C Hamano
  0 siblings, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2024-02-28 17:46 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Patrick Steinhardt, John Cai, Linus Arver, Eric Sunshine,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> diff --git a/revision.c b/revision.c
> index 0c7266b1eb..8f0d638af1 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -419,15 +419,21 @@ static struct commit *handle_commit(struct rev_info *revs,
>  	 */
>  	while (object->type == OBJ_TAG) {
>  		struct tag *tag = (struct tag *) object;
> +		struct object_id *oid;
>  		if (revs->tag_objects && !(flags & UNINTERESTING))
>  			add_pending_object(revs, object, tag->tag);
> -		object = parse_object(revs->repo, get_tagged_oid(tag));
> +		oid = get_tagged_oid(tag);
> +		object = parse_object(revs->repo, oid);

This is locally a no-op but we need it because we will use oid
later, OK.

>  		if (!object) {
>  			if (revs->ignore_missing_links || (flags & UNINTERESTING))
>  				return NULL;
>  			if (revs->exclude_promisor_objects &&
>  			    is_promisor_object(&tag->tagged->oid))
>  				return NULL;
> +			if (revs->do_not_die_on_missing_objects && oid) {
> +				oidset_insert(&revs->missing_commits, oid);
> +				return NULL;
> +			}

And we recover from the "oh, that is not an object" by doing the
usual "add to missing-objects list".  OK.

At this point we do not know the type of the tagged object (the tag
itself may hint what the tagged object is, though).  We might want
to rename .missing_commits to .missing_objects later after the dust
settles.  revision.c:get_reference() already adds anything that is
pointed at by a ref to this oidset already, so it is not a new
problem with this patch, though.

> diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
> index 78387eebb3..127180e1c9 100755
> --- a/t/t6022-rev-list-missing.sh
> +++ b/t/t6022-rev-list-missing.sh
> @@ -10,7 +10,10 @@ TEST_PASSES_SANITIZE_LEAK=true
>  test_expect_success 'create repository and alternate directory' '
>  	test_commit 1 &&
>  	test_commit 2 &&
> -	test_commit 3
> +	test_commit 3 &&
> +	git tag -m "tag message" annot_tag HEAD~1 &&
> +	git tag regul_tag HEAD~1 &&
> +	git branch a_branch HEAD~1
>  '
>  
>  # We manually corrupt the repository, which means that the commit-graph may
> @@ -78,7 +81,7 @@ do
>  	done
>  done
>  
> -for missing_tip in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
> +for missing_tip in "annot_tag" "regul_tag" "a_branch" "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
>  do
>  	# We want to check that things work when both
>  	#   - all the tips passed are missing (case existing_tip = ""), and
> @@ -88,9 +91,6 @@ do
>  		for action in "allow-any" "print"
>  		do
>  			test_expect_success "--missing=$action with tip '$missing_tip' missing and tip '$existing_tip'" '
> -				oid="$(git rev-parse $missing_tip)" &&
> -				path=".git/objects/$(test_oid_to_path $oid)" &&
> -
>  				# Before the object is made missing, we use rev-list to
>  				# get the expected oids.
>  				if test "$existing_tip" = "HEAD"
> @@ -109,11 +109,23 @@ do
>  					echo $(git rev-parse HEAD:2.t) >>expect.raw
>  				fi &&
>  
> +				missing_oid="$(git rev-parse $missing_tip)" &&
> +
> +				if test "$missing_tip" = "annot_tag"
> +				then
> +					oid="$(git rev-parse $missing_tip^{commit})" &&
> +					echo "$missing_oid" >>expect.raw
> +				else
> +					oid="$missing_oid"
> +				fi &&
> +
> +				path=".git/objects/$(test_oid_to_path $oid)" &&
> +
>  				mv "$path" "$path.hidden" &&
>  				test_when_finished "mv $path.hidden $path" &&

Hmph, this might be OK for now, but recently I saw Dscho used a nice
trick to prepare a packfile that excludes certain objects in a
separate directory and use that directory as GIT_OBJECT_DIRECTORY to
simulate a situation where some objects are missing without touching
this level of implementation details.  We may want to clean things
up.

Perhaps somebody will write a shell helper function that creates
such an object directory that contains all objects in the current
repository, except for ones that are specified.  And then we add it
to t/test-lib-functions.sh, so that it can be used to update various
tests (#leftoverbits).

>  				git rev-list --missing=$action --objects --no-object-names \
> -				     $oid $existing_tip >actual.raw &&
> +				     $missing_oid $existing_tip >actual.raw &&
>  
>  				# When the action is to print, we should also add the missing
>  				# oid to the expect list.

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2024-02-28 17:46 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-08 13:50 [PATCH v2 0/4] rev-list: allow missing tips with --missing Christian Couder
2024-02-08 13:50 ` [PATCH v2 1/4] revision: clarify a 'return NULL' in get_reference() Christian Couder
2024-02-08 13:50 ` [PATCH v2 2/4] oidset: refactor oidset_insert_from_set() Christian Couder
2024-02-08 17:33   ` Junio C Hamano
2024-02-13 21:02   ` Linus Arver
2024-02-14 14:33     ` Christian Couder
2024-02-16  1:10       ` Linus Arver
2024-02-16 10:38         ` Christian Couder
2024-02-16 20:27           ` Linus Arver
2024-02-08 13:50 ` [PATCH v2 3/4] t6022: fix 'test' style and 'even though' typo Christian Couder
2024-02-08 13:50 ` [PATCH v2 4/4] rev-list: allow missing tips with --missing=[print|allow*] Christian Couder
2024-02-08 17:44   ` Junio C Hamano
2024-02-13 22:38     ` Linus Arver
2024-02-14 14:34       ` Christian Couder
2024-02-14 16:49         ` Junio C Hamano
2024-02-14 14:39     ` Christian Couder
2024-02-13 22:33   ` Linus Arver
2024-02-14 14:38     ` Christian Couder
2024-02-16  1:24       ` Linus Arver
2024-02-08 23:15 ` [PATCH v2 0/4] rev-list: allow missing tips with --missing Junio C Hamano
2024-02-14 14:26   ` Christian Couder
2024-02-14 14:25 ` [PATCH v3 0/5] " Christian Couder
2024-02-14 14:25   ` [PATCH v3 1/5] t9210: do not rely on lazy fetching to fail Christian Couder
2024-02-14 14:25   ` [PATCH v3 2/5] revision: clarify a 'return NULL' in get_reference() Christian Couder
2024-02-14 14:25   ` [PATCH v3 3/5] oidset: refactor oidset_insert_from_set() Christian Couder
2024-02-14 14:25   ` [PATCH v3 4/5] t6022: fix 'test' style and 'even though' typo Christian Couder
2024-02-14 14:25   ` [PATCH v3 5/5] rev-list: allow missing tips with --missing=[print|allow*] Christian Couder
2024-02-16 21:56   ` [PATCH v3 0/5] rev-list: allow missing tips with --missing Linus Arver
2024-02-28  9:10   ` [PATCH] revision: fix --missing=[print|allow*] for annotated tags Christian Couder
2024-02-28 17:46     ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.