* [PATCH 0/9] get_short_oid UI improvements
@ 2018-04-30 22:07 Ævar Arnfjörð Bjarmason
  2018-04-30 22:07 ` [PATCH 1/9] sha1-name.c: remove stray newline Ævar Arnfjörð Bjarmason
                   ` (24 more replies)
  0 siblings, 25 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-04-30 22:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson,
	Ævar Arnfjörð Bjarmason

I started out just wanting to do 04/09 so I'd get prettier output, but
then noticed that ^{tag}, ^{commit}, ^{blob} and ^{tree} didn't behave
as expected with the disambiguation output, and that core.disambiguate
had never been documented.

Ævar Arnfjörð Bjarmason (9):
  sha1-name.c: remove stray newline
  sha1-array.h: align function arguments
  sha1-name.c: move around the collect_ambiguous() function
  get_short_oid: sort ambiguous objects by type, then SHA-1
  get_short_oid: learn to disambiguate by ^{tag}
  get_short_oid: learn to disambiguate by ^{blob}
  get_short_oid / peel_onion: ^{tree} should mean tree, not treeish
  get_short_oid / peel_onion: ^{tree} should mean commit, not commitish
  config doc: document core.disambiguate

 Documentation/config.txt            | 14 ++++++
 cache.h                             |  5 ++-
 sha1-array.c                        | 15 +++++++
 sha1-array.h                        |  7 ++-
 sha1-name.c                         | 69 ++++++++++++++++++++++++-----
 t/t1512-rev-parse-disambiguation.sh | 32 ++++++++++---
 6 files changed, 120 insertions(+), 22 deletions(-)

-- 
2.17.0.290.gded63e768a



* [PATCH 1/9] sha1-name.c: remove stray newline
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
@ 2018-04-30 22:07 ` Ævar Arnfjörð Bjarmason
  2018-04-30 22:07 ` [PATCH 2/9] sha1-array.h: align function arguments Ævar Arnfjörð Bjarmason
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-04-30 22:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson,
	Ævar Arnfjörð Bjarmason

This stray newline was accidentally introduced in
d2b7d9c7ed ("sha1_name: convert disambiguate_hint_fn to take
object_id", 2017-03-26).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/sha1-name.c b/sha1-name.c
index 5b93bf8da3..cd3b133aae 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -346,7 +346,6 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	struct strbuf desc = STRBUF_INIT;
 	int type;
 
-
 	if (ds->fn && !ds->fn(oid, ds->cb_data))
 		return 0;
 
-- 
2.17.0.290.gded63e768a



* [PATCH 2/9] sha1-array.h: align function arguments
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
  2018-04-30 22:07 ` [PATCH 1/9] sha1-name.c: remove stray newline Ævar Arnfjörð Bjarmason
@ 2018-04-30 22:07 ` Ævar Arnfjörð Bjarmason
  2018-04-30 22:07 ` [PATCH 3/9] sha1-name.c: move around the collect_ambiguous() function Ævar Arnfjörð Bjarmason
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-04-30 22:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson,
	Ævar Arnfjörð Bjarmason

The arguments weren't lined up with the opening parenthesis. Fixes up
code added in cff38a5e11 ("receive-pack: eliminate duplicate .have
refs", 2011-05-19).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-array.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sha1-array.h b/sha1-array.h
index 04b0756334..1e1d24b009 100644
--- a/sha1-array.h
+++ b/sha1-array.h
@@ -17,7 +17,7 @@ void oid_array_clear(struct oid_array *array);
 typedef int (*for_each_oid_fn)(const struct object_id *oid,
 			       void *data);
 int oid_array_for_each_unique(struct oid_array *array,
-			       for_each_oid_fn fn,
-			       void *data);
+			      for_each_oid_fn fn,
+			      void *data);
 
 #endif /* SHA1_ARRAY_H */
-- 
2.17.0.290.gded63e768a



* [PATCH 3/9] sha1-name.c: move around the collect_ambiguous() function
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
  2018-04-30 22:07 ` [PATCH 1/9] sha1-name.c: remove stray newline Ævar Arnfjörð Bjarmason
  2018-04-30 22:07 ` [PATCH 2/9] sha1-array.h: align function arguments Ævar Arnfjörð Bjarmason
@ 2018-04-30 22:07 ` Ævar Arnfjörð Bjarmason
  2018-04-30 22:07 ` [PATCH 4/9] get_short_oid: sort ambiguous objects by type, then SHA-1 Ævar Arnfjörð Bjarmason
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-04-30 22:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson,
	Ævar Arnfjörð Bjarmason

A subsequent change will make use of this static function in the
get_short_oid() function, which is defined above where the
collect_ambiguous() function is now. Without this move that would
result in a compilation error, since the function would be used
before being declared, unless a forward declaration were added.
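
The ordering constraint can be illustrated with a minimal standalone
sketch. The names collect(), run() and struct counter are hypothetical
and not taken from sha1-name.c; the point is only that a static
function has to be defined (or at least declared) before its first use:

```c
#include <stddef.h>

/* Hypothetical stand-in for an oid_array: just counts appended items. */
struct counter {
	size_t nr;
};

/*
 * Defined before its first use. If this definition lived below
 * run(), the compiler would need a forward declaration such as
 *   static int collect(int item, void *data);
 * ahead of run() instead.
 */
static int collect(int item, void *data)
{
	struct counter *c = data;

	(void)item;
	c->nr++;
	return 0;
}

/* Calls the callback for each item; stops on a non-zero return. */
static int run(const int *items, size_t nr, void *data)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		int ret = collect(items[i], data);
		if (ret)
			return ret;
	}
	return 0;
}
```

Moving the definition up, as done here, avoids carrying a separate
prototype that has to be kept in sync with the definition.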

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index cd3b133aae..9d7bbd3e96 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -372,6 +372,12 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	return 0;
 }
 
+static int collect_ambiguous(const struct object_id *oid, void *data)
+{
+	oid_array_append(data, oid);
+	return 0;
+}
+
 static int get_short_oid(const char *name, int len, struct object_id *oid,
 			  unsigned flags)
 {
@@ -421,12 +427,6 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 	return status;
 }
 
-static int collect_ambiguous(const struct object_id *oid, void *data)
-{
-	oid_array_append(data, oid);
-	return 0;
-}
-
 int for_each_abbrev(const char *prefix, each_abbrev_fn fn, void *cb_data)
 {
 	struct oid_array collect = OID_ARRAY_INIT;
-- 
2.17.0.290.gded63e768a



* [PATCH 4/9] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (2 preceding siblings ...)
  2018-04-30 22:07 ` [PATCH 3/9] sha1-name.c: move around the collect_ambiguous() function Ævar Arnfjörð Bjarmason
@ 2018-04-30 22:07 ` Ævar Arnfjörð Bjarmason
  2018-05-01 11:11   ` Derrick Stolee
  2018-04-30 22:07 ` [PATCH 5/9] get_short_oid: learn to disambiguate by ^{tag} Ævar Arnfjörð Bjarmason
                   ` (20 subsequent siblings)
  24 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-04-30 22:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson,
	Ævar Arnfjörð Bjarmason

Change the output emitted when an ambiguous object is encountered so
that we show tags first, then commits, followed by trees, and finally
blobs. Within each type we show objects in hashcmp() order. Before this
change the objects were only ordered by hashcmp().

The reason for doing this is that the output looks better as a result,
e.g. as of the v2.17.0 tag, "git show e8f2" would display the
following before this change:

    hint: The candidates are:
    hint:   e8f2093055 tree
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f25a3a50 tree
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f2867228 blob
    hint:   e8f28d537c tree
    hint:   e8f2a35526 blob
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2cf6ec0 tree

Now we'll instead show:

    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob

Since we show the commit data in the output that's nicely aligned once
we sort by object type. The decision to show tags before commits is
pretty arbitrary, but it's much less likely that we'll display a tag,
so if there is one it makes sense to show it first.
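
As a rough standalone illustration of the ordering described above
(not the code in this patch), the following sketch sorts a small array
with qsort(). The struct candidate type and the helper names are
assumptions made for the example, and the enum merely mirrors the
commit=1, tree=2, blob=3, tag=4 numbering that the modulus trick
relies on:

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the object type enum; only the numbering matters here. */
enum obj_type { OBJ_COMMIT = 1, OBJ_TREE = 2, OBJ_BLOB = 3, OBJ_TAG = 4 };

/* Hypothetical candidate: a 10-character abbreviated name plus its type. */
struct candidate {
	char hex[11];
	enum obj_type type;
};

/* Maps tag -> 0, commit -> 1, tree -> 2, blob -> 3. */
static int type_rank(enum obj_type type)
{
	return type % 4;
}

/* qsort() comparator: by type rank first, then by name within a type. */
static int cmp_candidate(const void *a, const void *b)
{
	const struct candidate *ca = a, *cb = b;
	int ra = type_rank(ca->type), rb = type_rank(cb->type);

	if (ra != rb)
		return ra < rb ? -1 : 1;
	return strcmp(ca->hex, cb->hex);
}
```

Calling qsort(c, nr, sizeof(c[0]), cmp_candidate) on an array of such
candidates groups them as tag, commit, tree, blob. Comparing the hex
strings stands in for comparing the raw hashes, which gives the same
order for equal-length lowercase hex.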

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-array.c | 15 +++++++++++++++
 sha1-array.h |  3 +++
 sha1-name.c  | 37 ++++++++++++++++++++++++++++++++++++-
 3 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/sha1-array.c b/sha1-array.c
index 838b3bf847..48bd9e9230 100644
--- a/sha1-array.c
+++ b/sha1-array.c
@@ -41,6 +41,21 @@ void oid_array_clear(struct oid_array *array)
 	array->sorted = 0;
 }
 
+
+int oid_array_for_each(struct oid_array *array,
+		       for_each_oid_fn fn,
+		       void *data)
+{
+	int i;
+
+	for (i = 0; i < array->nr; i++) {
+		int ret = fn(array->oid + i, data);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
 int oid_array_for_each_unique(struct oid_array *array,
 				for_each_oid_fn fn,
 				void *data)
diff --git a/sha1-array.h b/sha1-array.h
index 1e1d24b009..232bf95017 100644
--- a/sha1-array.h
+++ b/sha1-array.h
@@ -16,6 +16,9 @@ void oid_array_clear(struct oid_array *array);
 
 typedef int (*for_each_oid_fn)(const struct object_id *oid,
 			       void *data);
+int oid_array_for_each(struct oid_array *array,
+		       for_each_oid_fn fn,
+		       void *data);
 int oid_array_for_each_unique(struct oid_array *array,
 			      for_each_oid_fn fn,
 			      void *data);
diff --git a/sha1-name.c b/sha1-name.c
index 9d7bbd3e96..46d8b1afa6 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -378,6 +378,34 @@ static int collect_ambiguous(const struct object_id *oid, void *data)
 	return 0;
 }
 
+static int sort_ambiguous(const void *a, const void *b)
+{
+	int a_type = oid_object_info(a, NULL);
+	int b_type = oid_object_info(b, NULL);
+	int a_type_sort;
+	int b_type_sort;
+
+	/*
+	 * Sorts by hash within the same object type, just as
+	 * oid_array_for_each_unique() would do.
+	 */
+	if (a_type == b_type)
+		return oidcmp(a, b);
+
+	/*
+	 * Between object types show tags, then commits, and finally
+	 * trees and blobs.
+	 *
+	 * The object_type enum is commit, tree, blob, tag, but we
+	 * want tag, commit, tree blob. Cleverly (perhaps too
+	 * cleverly) do that with modulus, since the enum assigns 1 to
+	 * commit, so tag becomes 0.
+	 */
+	a_type_sort = a_type % 4;
+	b_type_sort = b_type % 4;
+	return a_type_sort > b_type_sort ? 1 : -1;
+}
+
 static int get_short_oid(const char *name, int len, struct object_id *oid,
 			  unsigned flags)
 {
@@ -409,6 +437,8 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 	status = finish_object_disambiguation(&ds, oid);
 
 	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
+		struct oid_array collect = OID_ARRAY_INIT;
+
 		error(_("short SHA1 %s is ambiguous"), ds.hex_pfx);
 
 		/*
@@ -421,7 +451,12 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 			ds.fn = NULL;
 
 		advise(_("The candidates are:"));
-		for_each_abbrev(ds.hex_pfx, show_ambiguous_object, &ds);
+		for_each_abbrev(ds.hex_pfx, collect_ambiguous, &collect);
+		QSORT(collect.oid, collect.nr, sort_ambiguous);
+
+		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
+			BUG("show_ambiguous_object shouldn't return non-zero");
+		oid_array_clear(&collect);
 	}
 
 	return status;
-- 
2.17.0.290.gded63e768a



* [PATCH 5/9] get_short_oid: learn to disambiguate by ^{tag}
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (3 preceding siblings ...)
  2018-04-30 22:07 ` [PATCH 4/9] get_short_oid: sort ambiguous objects by type, then SHA-1 Ævar Arnfjörð Bjarmason
@ 2018-04-30 22:07 ` Ævar Arnfjörð Bjarmason
  2018-04-30 22:07 ` [PATCH 6/9] get_short_oid: learn to disambiguate by ^{blob} Ævar Arnfjörð Bjarmason
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-04-30 22:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson,
	Ævar Arnfjörð Bjarmason

Add support for ^{tag} to the disambiguation logic. Before this ^{tag}
would simply be ignored:

    $ git rev-parse e8f2^{tag}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob
    e8f2^{tag}

Now the logic added in ed1ca6025f ("peel_onion: disambiguate to favor
tree-ish when we know we want a tree-ish", 2013-03-31) has been
extended to support it.

    $ git rev-parse e8f2^{tag}
    e8f2650052f3ff646023725e388ea1112b020e79
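
The effect of a type-only predicate can be sketched in isolation. This
is a simplified model, not the disambiguate_state machinery: the
candidates are reduced to their types, and pick_unique() and its
return codes are hypothetical names chosen for the example:

```c
#include <stddef.h>

enum obj_type { OBJ_COMMIT = 1, OBJ_TREE = 2, OBJ_BLOB = 3, OBJ_TAG = 4 };

typedef int (*hint_fn)(enum obj_type type);

/* Accepts only tag objects, analogous in spirit to a "tag only" hint. */
static int tag_only(enum obj_type type)
{
	return type == OBJ_TAG;
}

/*
 * Returns the index of the only candidate accepted by fn, -1 if no
 * candidate is accepted, or -2 if more than one is (still ambiguous).
 * A NULL fn accepts every candidate.
 */
static int pick_unique(const enum obj_type *types, size_t nr, hint_fn fn)
{
	int found = -1;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (fn && !fn(types[i]))
			continue;
		if (found >= 0)
			return -2;
		found = (int)i;
	}
	return found;
}
```

With one tag among several commits, trees and blobs, the filtered
lookup yields a single match, while the unfiltered one stays
ambiguous.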

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 cache.h                             |  5 +++--
 sha1-name.c                         | 13 ++++++++++++-
 t/t1512-rev-parse-disambiguation.sh |  5 ++++-
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index 77b7acebb6..3f6a292ba6 100644
--- a/cache.h
+++ b/cache.h
@@ -1322,8 +1322,9 @@ struct object_context {
 #define GET_OID_TREE             010
 #define GET_OID_TREEISH          020
 #define GET_OID_BLOB             040
-#define GET_OID_FOLLOW_SYMLINKS 0100
-#define GET_OID_RECORD_PATH     0200
+#define GET_OID_TAG             0100
+#define GET_OID_FOLLOW_SYMLINKS 0200
+#define GET_OID_RECORD_PATH     0400
 #define GET_OID_ONLY_TO_DIE    04000
 
 #define GET_OID_DISAMBIGUATORS \
diff --git a/sha1-name.c b/sha1-name.c
index 46d8b1afa6..68d5f65362 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -221,6 +221,12 @@ static int finish_object_disambiguation(struct disambiguate_state *ds,
 	return 0;
 }
 
+static int disambiguate_tag_only(const struct object_id *oid, void *cb_data_unused)
+{
+	int kind = oid_object_info(oid, NULL);
+	return kind == OBJ_TAG;
+}
+
 static int disambiguate_commit_only(const struct object_id *oid, void *cb_data_unused)
 {
 	int kind = oid_object_info(oid, NULL);
@@ -288,7 +294,8 @@ int set_disambiguate_hint_config(const char *var, const char *value)
 		{ "committish", disambiguate_committish_only },
 		{ "tree", disambiguate_tree_only },
 		{ "treeish", disambiguate_treeish_only },
-		{ "blob", disambiguate_blob_only }
+		{ "blob", disambiguate_blob_only },
+		{ "tag", disambiguate_tag_only }
 	};
 	int i;
 
@@ -429,6 +436,8 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 		ds.fn = disambiguate_treeish_only;
 	else if (flags & GET_OID_BLOB)
 		ds.fn = disambiguate_blob_only;
+	else if (flags & GET_OID_TAG)
+		ds.fn = disambiguate_tag_only;
 	else
 		ds.fn = default_disambiguate_hint;
 
@@ -958,6 +967,8 @@ static int peel_onion(const char *name, int len, struct object_id *oid,
 	lookup_flags &= ~GET_OID_DISAMBIGUATORS;
 	if (expected_type == OBJ_COMMIT)
 		lookup_flags |= GET_OID_COMMITTISH;
+	else if (expected_type == OBJ_TAG)
+		lookup_flags |= GET_OID_TAG;
 	else if (expected_type == OBJ_TREE)
 		lookup_flags |= GET_OID_TREEISH;
 
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 711704ba5a..c7ceda2f21 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -334,7 +334,10 @@ test_expect_success C_LOCALE_OUTPUT 'ambiguity hints respect type' '
 	test_must_fail git rev-parse 000000000^{commit} 2>stderr &&
 	grep ^hint: stderr >hints &&
 	# 5 commits, 1 tag (which is a commitish), plus intro line
-	test_line_count = 7 hints
+	test_line_count = 7 hints &&
+	git rev-parse 000000000^{tag} >stdout &&
+	test_line_count = 1 stdout &&
+	grep -q ^0000000000f8f stdout
 '
 
 test_expect_success C_LOCALE_OUTPUT 'failed type-selector still shows hint' '
-- 
2.17.0.290.gded63e768a



* [PATCH 6/9] get_short_oid: learn to disambiguate by ^{blob}
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (4 preceding siblings ...)
  2018-04-30 22:07 ` [PATCH 5/9] get_short_oid: learn to disambiguate by ^{tag} Ævar Arnfjörð Bjarmason
@ 2018-04-30 22:07 ` Ævar Arnfjörð Bjarmason
  2018-04-30 22:07 ` [PATCH 7/9] get_short_oid / peel_onion: ^{tree} should mean tree, not treeish Ævar Arnfjörð Bjarmason
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-04-30 22:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson,
	Ævar Arnfjörð Bjarmason

The disambiguation logic had all the pieces necessary to only print
out those blobs that were ambiguous, but they hadn't been connected.

The initial logic was added in daba53aeaf ("sha1_name.c: add support
for disambiguating other types", 2012-07-02), and when the flags were
propagated in 8a10fea49b ("get_sha1: propagate flags to child
functions", 2016-09-26) GET_OID_BLOB wasn't added to lookup_flags.

Before this change requests for blobs were simply ignored:

    $ git rev-parse e8f2^{blob}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob
    e8f2^{blob}
    [...]

But now we'll do the right thing and only print the blobs:

    $ git rev-parse e8f2^{blob}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob
    e8f2^{blob}
    [...]
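
The flag handling can be modelled on its own as follows. This is only
a sketch of the pattern as of this patch: the HINT_* names and bit
values are made up for the example (the real GET_OID_* values live in
cache.h), and flags_for_peel() is a hypothetical helper rather than a
function in sha1-name.c:

```c
/* Illustrative flag bits, not the real GET_OID_* definitions. */
#define HINT_COMMITTISH   04u
#define HINT_TREEISH     020u
#define HINT_BLOB        040u
#define HINT_TAG        0100u
#define HINT_MASK (HINT_COMMITTISH | HINT_TREEISH | HINT_BLOB | HINT_TAG)

enum obj_type {
	OBJ_NONE = 0, OBJ_COMMIT = 1, OBJ_TREE = 2, OBJ_BLOB = 3, OBJ_TAG = 4
};

/*
 * Drop any hint inherited from the caller, then set the one implied
 * by the ^{<type>} suffix. OBJ_NONE leaves the flags without a hint.
 */
static unsigned flags_for_peel(unsigned flags, enum obj_type expected)
{
	flags &= ~HINT_MASK;
	if (expected == OBJ_COMMIT)
		flags |= HINT_COMMITTISH;
	else if (expected == OBJ_TAG)
		flags |= HINT_TAG;
	else if (expected == OBJ_TREE)
		flags |= HINT_TREEISH;
	else if (expected == OBJ_BLOB)
		flags |= HINT_BLOB;
	return flags;
}
```

Without the OBJ_BLOB branch the blob case would fall through with no
hint set, which corresponds to the "simply ignored" behavior shown
above.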

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c                         | 2 ++
 t/t1512-rev-parse-disambiguation.sh | 6 +++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/sha1-name.c b/sha1-name.c
index 68d5f65362..023f9471a8 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -971,6 +971,8 @@ static int peel_onion(const char *name, int len, struct object_id *oid,
 		lookup_flags |= GET_OID_TAG;
 	else if (expected_type == OBJ_TREE)
 		lookup_flags |= GET_OID_TREEISH;
+	else if (expected_type == OBJ_BLOB)
+		lookup_flags |= GET_OID_BLOB;
 
 	if (get_oid_1(name, sp - name - 2, &outer, lookup_flags))
 		return -1;
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index c7ceda2f21..08ae73e2a5 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -337,7 +337,11 @@ test_expect_success C_LOCALE_OUTPUT 'ambiguity hints respect type' '
 	test_line_count = 7 hints &&
 	git rev-parse 000000000^{tag} >stdout &&
 	test_line_count = 1 stdout &&
-	grep -q ^0000000000f8f stdout
+	grep -q ^0000000000f8f stdout &&
+	test_must_fail git rev-parse 000000000^{blob} 2>stderr &&
+	grep ^hint: stderr >hints &&
+	# 5 blobs plus intro line
+	test_line_count = 6 hints
 '
 
 test_expect_success C_LOCALE_OUTPUT 'failed type-selector still shows hint' '
-- 
2.17.0.290.gded63e768a



* [PATCH 7/9] get_short_oid / peel_onion: ^{tree} should mean tree, not treeish
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (5 preceding siblings ...)
  2018-04-30 22:07 ` [PATCH 6/9] get_short_oid: learn to disambiguate by ^{blob} Ævar Arnfjörð Bjarmason
@ 2018-04-30 22:07 ` Ævar Arnfjörð Bjarmason
  2018-05-01  1:13   ` brian m. carlson
  2018-04-30 22:07 ` [PATCH 8/9] get_short_oid / peel_onion: ^{tree} should mean commit, not commitish Ævar Arnfjörð Bjarmason
                   ` (17 subsequent siblings)
  24 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-04-30 22:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson,
	Ævar Arnfjörð Bjarmason

After the recent series of patches ^{tag} and ^{blob} now work to get
just the tags and blobs, but ^{tree} will still list any
tree-ish (commits, tags and trees).

The previous behavior was added in ed1ca6025f ("peel_onion:
disambiguate to favor tree-ish when we know we want a tree-ish",
2013-03-31). I may have missed some special-case but this makes more
sense to me.

Now "$sha1:" can be used as before to mean treeish:

    $ git rev-parse e8f2:
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    [...]

But ^{tree} now shows just the trees, whereas it would previously be
equivalent to the above:

    $ git rev-parse e8f2^{tree}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    [...]
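
The difference between the two listings can be checked with a small
type-level model. It is a deliberate simplification: it treats every
non-blob object as tree-ish without peeling tags, and is_tree(),
is_treeish() and count_matching() are hypothetical names, not
functions from sha1-name.c:

```c
#include <stddef.h>

enum obj_type { OBJ_COMMIT = 1, OBJ_TREE = 2, OBJ_BLOB = 3, OBJ_TAG = 4 };

/* Strict: only tree objects qualify. */
static int is_tree(enum obj_type type)
{
	return type == OBJ_TREE;
}

/* Simplified tree-ish: commits, tags and trees, i.e. anything but a blob. */
static int is_treeish(enum obj_type type)
{
	return type != OBJ_BLOB;
}

/* Counts the candidates accepted by the given predicate. */
static size_t count_matching(const enum obj_type *types, size_t nr,
			     int (*fn)(enum obj_type))
{
	size_t i, count = 0;

	for (i = 0; i < nr; i++)
		if (fn(types[i]))
			count++;
	return count;
}
```

For the e8f2 candidates used in these examples (1 tag, 3 commits,
4 trees, 4 blobs) the tree-ish predicate accepts 8 objects and the
tree predicate 4, matching the two hint listings above.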

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c                         |  2 +-
 t/t1512-rev-parse-disambiguation.sh | 18 ++++++++++++++----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index 023f9471a8..b61c0558d9 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -970,7 +970,7 @@ static int peel_onion(const char *name, int len, struct object_id *oid,
 	else if (expected_type == OBJ_TAG)
 		lookup_flags |= GET_OID_TAG;
 	else if (expected_type == OBJ_TREE)
-		lookup_flags |= GET_OID_TREEISH;
+		lookup_flags |= GET_OID_TREE;
 	else if (expected_type == OBJ_BLOB)
 		lookup_flags |= GET_OID_BLOB;
 
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 08ae73e2a5..2acf564714 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -159,9 +159,13 @@ test_expect_failure 'two semi-ambiguous commit-ish' '
 	git log 0000000000...
 '
 
-test_expect_failure 'three semi-ambiguous tree-ish' '
+test_expect_success 'three semi-ambiguous tree-ish' '
 	# Likewise for tree-ish.  HEAD, v1.0.0 and HEAD^{tree} share
 	# the prefix but peeling them to tree yields the same thing
+	test_must_fail git rev-parse --verify 0000000000: &&
+
+	# For ^{tree} we can disambiguate because HEAD and v1.0.0 will
+	# be excluded.
 	git rev-parse --verify 0000000000^{tree}
 '
 
@@ -267,8 +271,12 @@ test_expect_success 'ambiguous commit-ish' '
 # There are three objects with this prefix: a blob, a tree, and a tag. We know
 # the blob will not pass as a treeish, but the tree and tag should (and thus
 # cause an error).
-test_expect_success 'ambiguous tags peel to treeish' '
-	test_must_fail git rev-parse 0000000000f^{tree}
+test_expect_success 'ambiguous tags peel to treeish or tree' '
+	test_must_fail git rev-parse 0000000000f: &&
+	git rev-parse 0000000000f^{tree} >stdout &&
+	test_line_count = 1 stdout &&
+	grep -q ^0000000000fd8bcc56 stdout
+
 '
 
 test_expect_success 'rev-parse --disambiguate' '
@@ -365,7 +373,9 @@ test_expect_success 'core.disambiguate config can prefer types' '
 test_expect_success 'core.disambiguate does not override context' '
 	# treeish ambiguous between tag and tree
 	test_must_fail \
-		git -c core.disambiguate=committish rev-parse $sha1^{tree}
+		git -c core.disambiguate=committish rev-parse $sha1: &&
+	# tree not ambiguous between tag and tree
+	git -c core.disambiguate=committish rev-parse $sha1^{tree}
 '
 
 test_done
-- 
2.17.0.290.gded63e768a



* [PATCH 8/9] get_short_oid / peel_onion: ^{tree} should mean commit, not commitish
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (6 preceding siblings ...)
  2018-04-30 22:07 ` [PATCH 7/9] get_short_oid / peel_onion: ^{tree} should mean tree, not treeish Ævar Arnfjörð Bjarmason
@ 2018-04-30 22:07 ` Ævar Arnfjörð Bjarmason
  2018-04-30 23:22   ` Eric Sunshine
  2018-04-30 22:07 ` [PATCH 9/9] config doc: document core.disambiguate Ævar Arnfjörð Bjarmason
                   ` (16 subsequent siblings)
  24 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-04-30 22:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson,
	Ævar Arnfjörð Bjarmason

Continue the untangling of peel disambiguation syntax. Before this
e8f2^{commit} would show the v2.17.0 tag, but now it'll just show
ambiguous commits:

    $ git rev-parse e8f2^{commit}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    [...]

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c                         | 2 +-
 t/t1512-rev-parse-disambiguation.sh | 7 ++++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index b61c0558d9..1d2a74a29c 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -966,7 +966,7 @@ static int peel_onion(const char *name, int len, struct object_id *oid,
 
 	lookup_flags &= ~GET_OID_DISAMBIGUATORS;
 	if (expected_type == OBJ_COMMIT)
-		lookup_flags |= GET_OID_COMMITTISH;
+		lookup_flags |= GET_OID_COMMIT;
 	else if (expected_type == OBJ_TAG)
 		lookup_flags |= GET_OID_TAG;
 	else if (expected_type == OBJ_TREE)
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 2acf564714..b3ef236db8 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -341,8 +341,8 @@ test_expect_success C_LOCALE_OUTPUT 'ambiguity hints' '
 test_expect_success C_LOCALE_OUTPUT 'ambiguity hints respect type' '
 	test_must_fail git rev-parse 000000000^{commit} 2>stderr &&
 	grep ^hint: stderr >hints &&
-	# 5 commits, 1 tag (which is a commitish), plus intro line
-	test_line_count = 7 hints &&
+	# 5 commits plus intro line
+	test_line_count = 6 hints &&
 	git rev-parse 000000000^{tag} >stdout &&
 	test_line_count = 1 stdout &&
 	grep -q ^0000000000f8f stdout &&
@@ -366,7 +366,8 @@ test_expect_success 'core.disambiguate config can prefer types' '
 	# ambiguous between tree and tag
 	sha1=0000000000f &&
 	test_must_fail git rev-parse $sha1 &&
-	git rev-parse $sha1^{commit} &&
+	# there is no commit so ^{commit} comes up empty
+	test_must_fail git rev-parse $sha1^{commit} &&
 	git -c core.disambiguate=committish rev-parse $sha1
 '
 
-- 
2.17.0.290.gded63e768a



* [PATCH 9/9] config doc: document core.disambiguate
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (7 preceding siblings ...)
  2018-04-30 22:07 ` [PATCH 8/9] get_short_oid / peel_onion: ^{tree} should mean commit, not commitish Ævar Arnfjörð Bjarmason
@ 2018-04-30 22:07 ` Ævar Arnfjörð Bjarmason
  2018-04-30 22:34 ` [PATCH 0/9] get_short_oid UI improvements Stefan Beller
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-04-30 22:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson,
	Ævar Arnfjörð Bjarmason

The core.disambiguate variable was added in
5b33cb1fd7 ("get_short_sha1: make default disambiguation
configurable", 2016-09-27) but never documented.
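
How such a setting maps a name to a hint can be sketched with a plain
lookup table. This is an illustration, not the parser in sha1-name.c:
the enum, parse_disambiguate() and the exact, case-sensitive matching
are assumptions of the example; only the spellings of the values come
from this series:

```c
#include <stddef.h>
#include <string.h>

enum hint {
	HINT_NONE, HINT_COMMIT, HINT_COMMITTISH, HINT_TREE,
	HINT_TREEISH, HINT_BLOB, HINT_TAG, HINT_INVALID
};

/* Returns the hint for a config value, or HINT_INVALID if unknown. */
static enum hint parse_disambiguate(const char *value)
{
	static const struct {
		const char *name;
		enum hint hint;
	} table[] = {
		{ "none", HINT_NONE },
		{ "commit", HINT_COMMIT },
		{ "committish", HINT_COMMITTISH },
		{ "tree", HINT_TREE },
		{ "treeish", HINT_TREEISH },
		{ "blob", HINT_BLOB },
		{ "tag", HINT_TAG },
	};
	size_t i;

	if (!value)
		return HINT_INVALID;
	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (!strcmp(value, table[i].name))
			return table[i].hint;
	return HINT_INVALID;
}
```

Note that the table spells the value "committish" with a double "t";
in this sketch a single-"t" spelling is rejected as unknown.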

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config.txt | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2659153cb3..6fee67c12d 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -910,6 +910,20 @@ core.abbrev::
 	abbreviated object names to stay unique for some time.
 	The minimum length is 4.
 
+core.disambiguate::
+	If Git is given a SHA-1 that's ambiguous it'll suggest what
+	objects you might mean. By default it'll print out all
+	potential objects with that prefix regardless of their
+	type. This setting, along with the `^{<type>}` peel syntax
+	(see linkgit:gitrevisions[7]), allows for narrowing that down.
+
++
+Is set to `none` by default, can also be `commit` (peel syntax:
+`$sha1^{commit}`), `committish` (commits and tags), `tree` (peel:
+`$sha1^{tree}`), `treeish` (everything except blobs), `blob` (peel:
+`$sha1^{blob}`) or `tag` (peel: `$sha1^{tag}`). The peel syntax will
+override any config value.
+
 add.ignoreErrors::
 add.ignore-errors (deprecated)::
 	Tells 'git add' to continue adding files when some files cannot be
-- 
2.17.0.290.gded63e768a



* Re: [PATCH 0/9] get_short_oid UI improvements
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (8 preceding siblings ...)
  2018-04-30 22:07 ` [PATCH 9/9] config doc: document core.disambiguate Ævar Arnfjörð Bjarmason
@ 2018-04-30 22:34 ` Stefan Beller
  2018-05-01  1:27 ` brian m. carlson
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Stefan Beller @ 2018-04-30 22:34 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jeff King, brian m . carlson

On Mon, Apr 30, 2018 at 3:07 PM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> I started out just wanting to do 04/09 so I'd get prettier output, but
> then noticed that ^{tag}, ^{commit}, ^{blob} and ^{tree} didn't behave
> as expected with the disambiguation output, and that core.disambiguate
> had never been documented.
>

This whole series, including the comment in which you wonder
if the code is overly smart, is

Reviewed-by: Stefan Beller <sbeller@google.com>


* Re: [PATCH 8/9] get_short_oid / peel_onion: ^{tree} should mean commit, not commitish
  2018-04-30 22:07 ` [PATCH 8/9] get_short_oid / peel_onion: ^{tree} should mean commit, not commitish Ævar Arnfjörð Bjarmason
@ 2018-04-30 23:22   ` Eric Sunshine
  0 siblings, 0 replies; 99+ messages in thread
From: Eric Sunshine @ 2018-04-30 23:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Jeff King, brian m . carlson

On Mon, Apr 30, 2018 at 6:07 PM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> get_short_oid / peel_onion: ^{tree} should mean commit, not commitish

s/tree/commit/

> Continue the untangling of peel disambiguation syntax. Before this
> e8f2^{commit} would show the v2.17.0 tag, but now it'll just show
> ambiguous commits:
>
>     $ git rev-parse e8f2^{commit}
>     error: short SHA1 e8f2 is ambiguous
>     hint: The candidates are:
>     hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
>     hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
>     hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
>     [...]
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>


* Re: [PATCH 7/9] get_short_oid / peel_onion: ^{tree} should mean tree, not treeish
  2018-04-30 22:07 ` [PATCH 7/9] get_short_oid / peel_onion: ^{tree} should mean tree, not treeish Ævar Arnfjörð Bjarmason
@ 2018-05-01  1:13   ` brian m. carlson
  0 siblings, 0 replies; 99+ messages in thread
From: brian m. carlson @ 2018-05-01  1:13 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, Jeff King

[-- Attachment #1: Type: text/plain, Size: 870 bytes --]

On Mon, Apr 30, 2018 at 10:07:32PM +0000, Ævar Arnfjörð Bjarmason wrote:
> diff --git a/sha1-name.c b/sha1-name.c
> index 023f9471a8..b61c0558d9 100644
> --- a/sha1-name.c
> +++ b/sha1-name.c
> @@ -970,7 +970,7 @@ static int peel_onion(const char *name, int len, struct object_id *oid,
>  	else if (expected_type == OBJ_TAG)
>  		lookup_flags |= GET_OID_TAG;
>  	else if (expected_type == OBJ_TREE)
> -		lookup_flags |= GET_OID_TREEISH;
> +		lookup_flags |= GET_OID_TREE;
>  	else if (expected_type == OBJ_BLOB)
>  		lookup_flags |= GET_OID_BLOB;
>  

I was concerned at first that this might lead to some sort of wrong
behavior when we do something like "git rev-parse v2.17.0^{tree}", but
looking at the code I've mostly convinced myself that that should still
work.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204



* Re: [PATCH 0/9] get_short_oid UI improvements
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (9 preceding siblings ...)
  2018-04-30 22:34 ` [PATCH 0/9] get_short_oid UI improvements Stefan Beller
@ 2018-05-01  1:27 ` brian m. carlson
  2018-05-01 11:16 ` Derrick Stolee
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: brian m. carlson @ 2018-05-01  1:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, Jeff King

[-- Attachment #1: Type: text/plain, Size: 1249 bytes --]

On Mon, Apr 30, 2018 at 10:07:25PM +0000, Ævar Arnfjörð Bjarmason wrote:
> I started out just wanting to do 04/09 so I'd get prettier output, but
> then noticed that ^{tag}, ^{commit}, ^{blob} and ^{tree} didn't behave
> as expected with the disambiguation output, and that core.disambiguate
> had never been documented.
> 
> Ævar Arnfjörð Bjarmason (9):
>   sha1-name.c: remove stray newline
>   sha1-array.h: align function arguments
>   sha1-name.c: move around the collect_ambiguous() function
>   get_short_oid: sort ambiguous objects by type, then SHA-1
>   get_short_oid: learn to disambiguate by ^{tag}
>   get_short_oid: learn to disambiguate by ^{blob}
>   get_short_oid / peel_onion: ^{tree} should mean tree, not treeish
>   get_short_oid / peel_onion: ^{tree} should mean commit, not commitish
>   config doc: document core.disambiguate

As mentioned, I'm a bit unsure that patches 7 and 8 are entirely
correct.  I've mostly convinced myself that they are after looking at
peel_onion, but I'm still harboring lingering doubts for some reason.

The rest of the series looked fine to me.  Thanks for cleaning up my
stray newline.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204



* Re: [PATCH 4/9] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-04-30 22:07 ` [PATCH 4/9] get_short_oid: sort ambiguous objects by type, then SHA-1 Ævar Arnfjörð Bjarmason
@ 2018-05-01 11:11   ` Derrick Stolee
  2018-05-01 11:27     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 99+ messages in thread
From: Derrick Stolee @ 2018-05-01 11:11 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jeff King, brian m . carlson

On 4/30/2018 6:07 PM, Ævar Arnfjörð Bjarmason wrote:
> Change the output emitted when an ambiguous object is encountered so
> that we show tags first, then commits, followed by trees, and finally
> blobs. Within each type we show objects in hashcmp(). Before this
> change the objects were only ordered by hashcmp().
>
> The reason for doing this is that the output looks better as a result,
> e.g. the v2.17.0 tag before this change on "git show e8f2" would
> display:
>
>      hint: The candidates are:
>      hint:   e8f2093055 tree
>      hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
>      hint:   e8f21d02f7 blob
>      hint:   e8f21d577c blob
>      hint:   e8f25a3a50 tree
>      hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
>      hint:   e8f2650052 tag v2.17.0
>      hint:   e8f2867228 blob
>      hint:   e8f28d537c tree
>      hint:   e8f2a35526 blob
>      hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
>      hint:   e8f2cf6ec0 tree
>
> Now we'll instead show:
>
>      hint:   e8f2650052 tag v2.17.0
>      hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
>      hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
>      hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
>      hint:   e8f2093055 tree
>      hint:   e8f25a3a50 tree
>      hint:   e8f28d537c tree
>      hint:   e8f2cf6ec0 tree
>      hint:   e8f21d02f7 blob
>      hint:   e8f21d577c blob
>      hint:   e8f2867228 blob
>      hint:   e8f2a35526 blob
>
> Since we show the commit data in the output that's nicely aligned once
> we sort by object type. The decision to show tags before commits is
> pretty arbitrary, but it's much less likely that we'll display a tag,
> so if there is one it makes sense to show it first.

Here's a non-arbitrary reason: the object types are ordered 
topologically (ignoring self-references):

tag -> commit, tree, blob
commit -> tree
tree -> blob

> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   sha1-array.c | 15 +++++++++++++++
>   sha1-array.h |  3 +++
>   sha1-name.c  | 37 ++++++++++++++++++++++++++++++++++++-
>   3 files changed, 54 insertions(+), 1 deletion(-)
>
> diff --git a/sha1-array.c b/sha1-array.c
> index 838b3bf847..48bd9e9230 100644
> --- a/sha1-array.c
> +++ b/sha1-array.c
> @@ -41,6 +41,21 @@ void oid_array_clear(struct oid_array *array)
>   	array->sorted = 0;
>   }
>   
> +
> +int oid_array_for_each(struct oid_array *array,
> +		       for_each_oid_fn fn,
> +		       void *data)
> +{
> +	int i;
> +
> +	for (i = 0; i < array->nr; i++) {
> +		int ret = fn(array->oid + i, data);
> +		if (ret)
> +			return ret;
> +	}
> +	return 0;
> +}
> +
>   int oid_array_for_each_unique(struct oid_array *array,
>   				for_each_oid_fn fn,
>   				void *data)
> diff --git a/sha1-array.h b/sha1-array.h
> index 1e1d24b009..232bf95017 100644
> --- a/sha1-array.h
> +++ b/sha1-array.h
> @@ -16,6 +16,9 @@ void oid_array_clear(struct oid_array *array);
>   
>   typedef int (*for_each_oid_fn)(const struct object_id *oid,
>   			       void *data);
> +int oid_array_for_each(struct oid_array *array,
> +		       for_each_oid_fn fn,
> +		       void *data);
>   int oid_array_for_each_unique(struct oid_array *array,
>   			      for_each_oid_fn fn,
>   			      void *data);
> diff --git a/sha1-name.c b/sha1-name.c
> index 9d7bbd3e96..46d8b1afa6 100644
> --- a/sha1-name.c
> +++ b/sha1-name.c
> @@ -378,6 +378,34 @@ static int collect_ambiguous(const struct object_id *oid, void *data)
>   	return 0;
>   }
>   
> +static int sort_ambiguous(const void *a, const void *b)
> +{
> +	int a_type = oid_object_info(a, NULL);
> +	int b_type = oid_object_info(b, NULL);
> +	int a_type_sort;
> +	int b_type_sort;
> +
> +	/*
> +	 * Sorts by hash within the same object type, just as
> +	 * oid_array_for_each_unique() would do.
> +	 */
> +	if (a_type == b_type)
> +		return oidcmp(a, b);
> +
> +	/*
> +	 * Between object types show tags, then commits, and finally
> +	 * trees and blobs.
> +	 *
> +	 * The object_type enum is commit, tree, blob, tag, but we
> +	 * want tag, commit, tree, blob. Cleverly (perhaps too
> +	 * cleverly) do that with modulus, since the enum assigns 1 to
> +	 * commit, so tag becomes 0.
> +	 */

I appreciate this comment. Clever things should be marked as such.
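
For anyone reading along, here is a minimal standalone sketch of the modulus
trick described in that comment. It is not code from the patch: the demo_*
names are made up for illustration, and the enum only mirrors the numeric
values of the four "real" object types.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative copy of the numeric values used by git's enum object_type. */
enum demo_type {
	DEMO_COMMIT = 1,
	DEMO_TREE = 2,
	DEMO_BLOB = 3,
	DEMO_TAG = 4
};

/*
 * tag(4) % 4 == 0, so tags rank before commit(1), tree(2) and blob(3),
 * which keep their original relative order.
 */
static int demo_type_rank(int type)
{
	assert(type >= DEMO_COMMIT && type <= DEMO_TAG);
	return type % 4;
}

/* qsort() comparator over an array of int type values. */
static int demo_cmp_type(const void *a, const void *b)
{
	int x = demo_type_rank(*(const int *)a);
	int y = demo_type_rank(*(const int *)b);

	return (x > y) - (x < y);
}

/* Hypothetical helper: order types as tag, commit, tree, blob. */
void demo_sort_types(int *types, size_t nr)
{
	qsort(types, nr, sizeof(*types), demo_cmp_type);
}
```

Unlike the quoted sort_ambiguous(), this comparator returns 0 for equal types
because it has no hash to fall back on.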

> +	a_type_sort = a_type % 4;
> +	b_type_sort = b_type % 4;
> +	return a_type_sort > b_type_sort ? 1 : -1;
> +}
> +
>   static int get_short_oid(const char *name, int len, struct object_id *oid,
>   			  unsigned flags)
>   {
> @@ -409,6 +437,8 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
>   	status = finish_object_disambiguation(&ds, oid);
>   
>   	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
> +		struct oid_array collect = OID_ARRAY_INIT;
> +
>   		error(_("short SHA1 %s is ambiguous"), ds.hex_pfx);
>   
>   		/*
> @@ -421,7 +451,12 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
>   			ds.fn = NULL;
>   
>   		advise(_("The candidates are:"));
> -		for_each_abbrev(ds.hex_pfx, show_ambiguous_object, &ds);
> +		for_each_abbrev(ds.hex_pfx, collect_ambiguous, &collect);
> +		QSORT(collect.oid, collect.nr, sort_ambiguous);

I was wondering how the old code sorted by SHA even when the ambiguous 
objects were loaded from different sources (multiple pack-files, loose 
objects). Turns out that for_each_abbrev() does its own sort after 
collecting the SHAs and then calls the given function pointer only once 
per distinct object. This avoids multiple instances of the same object, 
which may appear multiple times across pack-files.

I only ask because now we are doing two sorts. I wonder if it would be 
more elegant to provide your sorting algorithm to for_each_abbrev() and 
let it call show_ambiguous_object as before.

Another question is if we should use this sort generally for all calls 
to for_each_abbrev(). The only other case I see is in builtin/rev-parse.c.

> +
> +		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
> +			BUG("show_ambiguous_object shouldn't return non-zero");
> +		oid_array_clear(&collect);
>   	}
>   
>   	return status;



* Re: [PATCH 0/9] get_short_oid UI improvements
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (10 preceding siblings ...)
  2018-05-01  1:27 ` brian m. carlson
@ 2018-05-01 11:16 ` Derrick Stolee
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Derrick Stolee @ 2018-05-01 11:16 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jeff King, brian m . carlson

On 4/30/2018 6:07 PM, Ævar Arnfjörð Bjarmason wrote:
> I started out just wanting to do 04/09 so I'd get prettier output, but
> then noticed that ^{tag}, ^{commit}, ^{blob} and ^{tree} didn't behave
> as expected with the disambiguation output, and that core.disambiguate
> had never been documented.
>
> Ævar Arnfjörð Bjarmason (9):
>    sha1-name.c: remove stray newline
>    sha1-array.h: align function arguments
>    sha1-name.c: move around the collect_ambiguous() function
>    get_short_oid: sort ambiguous objects by type, then SHA-1
>    get_short_oid: learn to disambiguate by ^{tag}
>    get_short_oid: learn to disambiguate by ^{blob}
>    get_short_oid / peel_onion: ^{tree} should mean tree, not treeish
>    get_short_oid / peel_onion: ^{tree} should mean commit, not commitish
>    config doc: document core.disambiguate
>
>   Documentation/config.txt            | 14 ++++++
>   cache.h                             |  5 ++-
>   sha1-array.c                        | 15 +++++++
>   sha1-array.h                        |  7 ++-
>   sha1-name.c                         | 69 ++++++++++++++++++++++++-----
>   t/t1512-rev-parse-disambiguation.sh | 32 ++++++++++---
>   6 files changed, 120 insertions(+), 22 deletions(-)
>

This is a good series. Please take a look at my suggestion in Patch 4/9, 
but feel free to keep this series as written.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>


* Re: [PATCH 4/9] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-01 11:11   ` Derrick Stolee
@ 2018-05-01 11:27     ` Ævar Arnfjörð Bjarmason
  2018-05-01 12:26       ` Derrick Stolee
  0 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 11:27 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, Junio C Hamano, Jeff King, brian m . carlson


On Tue, May 01 2018, Derrick Stolee wrote:

> On 4/30/2018 6:07 PM, Ævar Arnfjörð Bjarmason wrote:
>> Since we show the commit data in the output that's nicely aligned once
>> we sort by object type. The decision to show tags before commits is
>> pretty arbitrary, but it's much less likely that we'll display a tag,
>> so if there is one it makes sense to show it first.
>
> Here's a non-arbitrary reason: the object types are ordered
> topologically (ignoring self-references):
>
> tag -> commit, tree, blob
> commit -> tree
> tree -> blob

Thanks. I'll add a patch with that comment to v2.

>> @@ -421,7 +451,12 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
>>   			ds.fn = NULL;
>>     		advise(_("The candidates are:"));
>> -		for_each_abbrev(ds.hex_pfx, show_ambiguous_object, &ds);
>> +		for_each_abbrev(ds.hex_pfx, collect_ambiguous, &collect);
>> +		QSORT(collect.oid, collect.nr, sort_ambiguous);
>
> I was wondering how the old code sorted by SHA even when the ambiguous
> objects were loaded from different sources (multiple pack-files, loose
> objects). Turns out that for_each_abbrev() does its own sort after
> collecting the SHAs and then calls the given function pointer only
> once per distinct object. This avoids multiple instances of the same
> object, which may appear multiple times across pack-files.
>
> I only ask because now we are doing two sorts. I wonder if it would be
> more elegant to provide your sorting algorithm to for_each_abbrev()
> and let it call show_ambiguous_object as before.
>
> Another question is if we should use this sort generally for all calls
> to for_each_abbrev(). The only other case I see is in
> builtin/rev-parse.c.

When preparing v2 I realized how confusing this was, so I'd added this
to the commit message of my WIP re-roll which should explain this:

    A note on the implementation: I started out with something much
    simpler which just replaced oid_array_sort() in sha1-array.c with a
    custom sort function before calling oid_array_for_each_unique(). But
    then dumbly noticed that it doesn't work because the output function
    was tangled up with the code added in fad6b9e590 ("for_each_abbrev:
    drop duplicate objects", 2016-09-26) to ensure we don't display
    duplicate objects.
    
    That's why we're doing two passes here, first we need to sort the list
    and de-duplicate the objects, then sort them in our custom order, and
    finally output them without re-sorting them. I suppose we could also
    make oid_array_for_each_unique() maintain a hashmap of emitted
    objects, but that would increase its memory profile and wouldn't be
    worth the complexity for this one-off use-case;
    oid_array_for_each_unique() is used in many other places.
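
To make the two passes concrete, here is a rough standalone sketch. It is not
the code from the series: struct demo_entry, demo_dedupe_then_sort() and the
use of strcmp() on hex strings in place of oidcmp() are all assumptions made
so the example compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an object: hex name plus type (1=commit .. 4=tag). */
struct demo_entry {
	const char *hex;
	int type;
};

/* Pass 1 comparator: plain hash order, so duplicates end up adjacent. */
static int demo_cmp_hex(const void *a, const void *b)
{
	const struct demo_entry *x = a, *y = b;

	return strcmp(x->hex, y->hex);
}

/* Pass 2 comparator: tag, commit, tree, blob; hash order within a type. */
static int demo_cmp_type_then_hex(const void *a, const void *b)
{
	const struct demo_entry *x = a, *y = b;
	int xr = x->type % 4, yr = y->type % 4;

	if (xr != yr)
		return xr < yr ? -1 : 1;
	return strcmp(x->hex, y->hex);
}

/*
 * Sort by hash and drop adjacent duplicates, then re-sort the survivors
 * in the custom order. Returns the number of unique entries, which are
 * left at the front of the array ready to be printed without re-sorting.
 */
size_t demo_dedupe_then_sort(struct demo_entry *e, size_t nr)
{
	size_t i, out = 0;

	assert(e != NULL || nr == 0);
	if (!nr)
		return 0;

	qsort(e, nr, sizeof(*e), demo_cmp_hex);
	for (i = 0; i < nr; i++)
		if (!out || strcmp(e[out - 1].hex, e[i].hex))
			e[out++] = e[i];

	qsort(e, out, sizeof(*e), demo_cmp_type_then_hex);
	return out;
}
```

The first sort exists only so that duplicates can be found by comparing
neighbours; the second sort decides what the user actually sees.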


* [PATCH v2 00/12] get_short_oid UI improvements
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (11 preceding siblings ...)
  2018-05-01 11:16 ` Derrick Stolee
@ 2018-05-01 12:06 ` Ævar Arnfjörð Bjarmason
  2018-05-01 13:03   ` [PATCH v2 06/11] get_short_oid: sort ambiguous objects by type, then SHA-1 Derrick Stolee
                     ` (13 more replies)
  2018-05-01 12:06 ` [PATCH v2 01/12] sha1-name.c: remove stray newline Ævar Arnfjörð Bjarmason
                   ` (11 subsequent siblings)
  24 siblings, 14 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

A v2 addressing feedback so far. Comments inline per-patch.

Ævar Arnfjörð Bjarmason (12):
  sha1-name.c: remove stray newline
  sha1-array.h: align function arguments

No changes.

  git-p4: change "commitish" typo to "committish"

New, I fixed my own "commitish" elsewhere, fixing it here in this
unrelated area while I'm at it.

  cache.h: add comment explaining the order in object_type

New: Derrick Stolee pointed out why the object type enum used later is
ordered that way, explain that with a comment.

  sha1-name.c: move around the collect_ambiguous() function

Trivial grammar correction in commit message:
    -    collect_ambiguous() function is now, which would result in a
    +    collect_ambiguous() function is now. Without this we'd then have a

  get_short_oid: sort ambiguous objects by type, then SHA-1

* Grammar fixes in commit message
* Add docs to api-oid-array.txt documenting the new oid_array_for_each()
* Document in the commit message why we sort twice
* Note inline in sha1-array.c why oid_array_for_each() doesn't sort
  with a pointer to the API docs.
* Add test to assert that we sort objects in the order we expect, and
  that they're hash sorted within the object types.

  get_short_oid: learn to disambiguate by ^{tag}
  get_short_oid: learn to disambiguate by ^{blob}

No changes.

  get_short_oid / peel_onion: ^{tree} should be tree, not treeish

s/mean/be/ in subject line (to avoid wrapping in E-Mail).

  get_short_oid / peel_onion: ^{commit} should be commit, not committish

Rewrite commit message, now assumes less context from the rest of the
series & is easier to read stand-alone.

  config doc: document core.disambiguate

Change commitish to committish, and note the `$sha1:` peel syntax.

  get_short_oid: document & warn if we ignore the type selector

New: Explain why we ignore e.g. $sha1^{blob} if there's no blobs with
the $sha1 prefix.

 Documentation/config.txt                  | 17 +++++
 Documentation/technical/api-oid-array.txt | 17 +++--
 cache.h                                   | 13 +++-
 git-p4.py                                 |  6 +-
 sha1-array.c                              | 17 +++++
 sha1-array.h                              |  7 +-
 sha1-name.c                               | 80 +++++++++++++++++++----
 t/t1512-rev-parse-disambiguation.sh       | 58 +++++++++++++---
 8 files changed, 182 insertions(+), 33 deletions(-)

-- 
2.17.0.290.gded63e768a



* [PATCH v2 01/12] sha1-name.c: remove stray newline
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (12 preceding siblings ...)
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
@ 2018-05-01 12:06 ` Ævar Arnfjörð Bjarmason
  2018-05-01 12:06 ` [PATCH v2 02/12] sha1-array.h: align function arguments Ævar Arnfjörð Bjarmason
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

This stray newline was accidentally introduced in
d2b7d9c7ed ("sha1_name: convert disambiguate_hint_fn to take
object_id", 2017-03-26).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/sha1-name.c b/sha1-name.c
index 5b93bf8da3..cd3b133aae 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -346,7 +346,6 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	struct strbuf desc = STRBUF_INIT;
 	int type;
 
-
 	if (ds->fn && !ds->fn(oid, ds->cb_data))
 		return 0;
 
-- 
2.17.0.290.gded63e768a



* [PATCH v2 02/12] sha1-array.h: align function arguments
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (13 preceding siblings ...)
  2018-05-01 12:06 ` [PATCH v2 01/12] sha1-name.c: remove stray newline Ævar Arnfjörð Bjarmason
@ 2018-05-01 12:06 ` Ævar Arnfjörð Bjarmason
  2018-05-01 12:06 ` [PATCH v2 03/12] git-p4: change "commitish" typo to "committish" Ævar Arnfjörð Bjarmason
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

The arguments weren't lined up with the opening parenthesis. Fixes up
code added in cff38a5e11 ("receive-pack: eliminate duplicate .have
refs", 2011-05-19).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-array.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sha1-array.h b/sha1-array.h
index 04b0756334..1e1d24b009 100644
--- a/sha1-array.h
+++ b/sha1-array.h
@@ -17,7 +17,7 @@ void oid_array_clear(struct oid_array *array);
 typedef int (*for_each_oid_fn)(const struct object_id *oid,
 			       void *data);
 int oid_array_for_each_unique(struct oid_array *array,
-			       for_each_oid_fn fn,
-			       void *data);
+			      for_each_oid_fn fn,
+			      void *data);
 
 #endif /* SHA1_ARRAY_H */
-- 
2.17.0.290.gded63e768a



* [PATCH v2 03/12] git-p4: change "commitish" typo to "committish"
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (14 preceding siblings ...)
  2018-05-01 12:06 ` [PATCH v2 02/12] sha1-array.h: align function arguments Ævar Arnfjörð Bjarmason
@ 2018-05-01 12:06 ` Ævar Arnfjörð Bjarmason
  2018-05-01 12:06 ` [PATCH v2 04/12] cache.h: add comment explaining the order in object_type Ævar Arnfjörð Bjarmason
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

This was the only occurrence of "commitish" in the tree, but as the
log will reveal we've had others in the past. Fixes up code added in
00ad6e3182 ("git-p4: work with a detached head", 2015-11-21).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 git-p4.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 7bb9cadc69..1afa87cd9d 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2099,11 +2099,11 @@ class P4Submit(Command, P4UserMap):
 
         commits = []
         if self.master:
-            commitish = self.master
+            committish = self.master
         else:
-            commitish = 'HEAD'
+            committish = 'HEAD'
 
-        for line in read_pipe_lines(["git", "rev-list", "--no-merges", "%s..%s" % (self.origin, commitish)]):
+        for line in read_pipe_lines(["git", "rev-list", "--no-merges", "%s..%s" % (self.origin, committish)]):
             commits.append(line.strip())
         commits.reverse()
 
-- 
2.17.0.290.gded63e768a



* [PATCH v2 04/12] cache.h: add comment explaining the order in object_type
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (15 preceding siblings ...)
  2018-05-01 12:06 ` [PATCH v2 03/12] git-p4: change "commitish" typo to "committish" Ævar Arnfjörð Bjarmason
@ 2018-05-01 12:06 ` Ævar Arnfjörð Bjarmason
  2018-05-01 12:06 ` [PATCH v2 05/12] sha1-name.c: move around the collect_ambiguous() function Ævar Arnfjörð Bjarmason
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

The order in the enum might seem arbitrary, and isn't explained by
72518e9c26 ("more lightweight revalidation while reusing deflated
stream in packing", 2006-09-03) which added it, but Derrick Stolee
suggested that it's ordered topologically in
5f8b1ec1-258d-1acc-133e-a7c248b4083e@gmail.com. Makes sense to me, add
that as a comment.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 cache.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/cache.h b/cache.h
index 77b7acebb6..354903c3ea 100644
--- a/cache.h
+++ b/cache.h
@@ -376,6 +376,14 @@ extern void free_name_hash(struct index_state *istate);
 enum object_type {
 	OBJ_BAD = -1,
 	OBJ_NONE = 0,
+	/*
+	 * Why have our "real" object types in this order? They're
+	 * ordered topologically:
+	 *
+	 * tag(4)    -> commit(1), tree(2), blob(3)
+	 * commit(1) -> tree(2)
+	 * tree(2)   -> blob(3)
+	 */
 	OBJ_COMMIT = 1,
 	OBJ_TREE = 2,
 	OBJ_BLOB = 3,
-- 
2.17.0.290.gded63e768a



* [PATCH v2 05/12] sha1-name.c: move around the collect_ambiguous() function
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (16 preceding siblings ...)
  2018-05-01 12:06 ` [PATCH v2 04/12] cache.h: add comment explaining the order in object_type Ævar Arnfjörð Bjarmason
@ 2018-05-01 12:06 ` Ævar Arnfjörð Bjarmason
  2018-05-01 12:06 ` [PATCH v2 06/12] get_short_oid: sort ambiguous objects by type, then SHA-1 Ævar Arnfjörð Bjarmason
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

A subsequent change will make use of this static function in the
get_short_oid() function, which is defined above where the
collect_ambiguous() function is now. Without this we'd then have a
compilation error due to a missing forward declaration.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index cd3b133aae..9d7bbd3e96 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -372,6 +372,12 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	return 0;
 }
 
+static int collect_ambiguous(const struct object_id *oid, void *data)
+{
+	oid_array_append(data, oid);
+	return 0;
+}
+
 static int get_short_oid(const char *name, int len, struct object_id *oid,
 			  unsigned flags)
 {
@@ -421,12 +427,6 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 	return status;
 }
 
-static int collect_ambiguous(const struct object_id *oid, void *data)
-{
-	oid_array_append(data, oid);
-	return 0;
-}
-
 int for_each_abbrev(const char *prefix, each_abbrev_fn fn, void *cb_data)
 {
 	struct oid_array collect = OID_ARRAY_INIT;
-- 
2.17.0.290.gded63e768a



* [PATCH v2 06/12] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (17 preceding siblings ...)
  2018-05-01 12:06 ` [PATCH v2 05/12] sha1-name.c: move around the collect_ambiguous() function Ævar Arnfjörð Bjarmason
@ 2018-05-01 12:06 ` Ævar Arnfjörð Bjarmason
  2018-05-01 12:06 ` [PATCH v2 07/12] get_short_oid: learn to disambiguate by ^{tag} Ævar Arnfjörð Bjarmason
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Change the output emitted when an ambiguous object is encountered so
that we show tags first, then commits, followed by trees, and finally
blobs. Within each type we show objects in hashcmp() order. Before
this change the objects were only ordered by hashcmp().

The reason for doing this is that the output looks better as a result,
e.g. the v2.17.0 tag before this change on "git show e8f2" would
display:

    hint: The candidates are:
    hint:   e8f2093055 tree
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f25a3a50 tree
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f2867228 blob
    hint:   e8f28d537c tree
    hint:   e8f2a35526 blob
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2cf6ec0 tree

Now we'll instead show:

    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob

Since we show the commit data in the output that's nicely aligned once
we sort by object type. The decision to show tags before commits is
pretty arbitrary. I don't want to order by object_type since there
tags come last after blobs, which doesn't make sense if we want to
show the most important things first.

I could display them after commits, but it's much less likely that
we'll display a tag, so if there is one it makes sense to show it
prominently at the top.

A note on the implementation: I started out with something much
simpler which just replaced oid_array_sort() in sha1-array.c with a
custom sort function before calling oid_array_for_each_unique(). But
then dumbly noticed that it doesn't work because the output function
was tangled up with the code added in fad6b9e590 ("for_each_abbrev:
drop duplicate objects", 2016-09-26) to ensure we don't display
duplicate objects.

That's why we're doing two passes here: first we need to sort the list
and de-duplicate the objects, then sort them in our custom order, and
finally output them without re-sorting them. I suppose we could also
make oid_array_for_each_unique() maintain a hashmap of emitted
objects, but that would increase its memory profile and wouldn't be
worth the complexity for this one-off use-case, since
oid_array_for_each_unique() is used in many other places.
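
The two passes can be sketched in isolation roughly as follows. This
is a toy model rather than the code in this patch: struct demo_obj and
the demo_ functions are hypothetical, each element carries its type
directly instead of looking it up, and plain qsort() stands in for the
oid_array helpers.

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_obj {
	unsigned id;	/* stands in for the object ID */
	int type;	/* 1 = commit, 2 = tree, 3 = blob, 4 = tag */
};

/* Plain ID order, the analogue of sorting by hash. */
static int demo_cmp_id(const void *a, const void *b)
{
	const struct demo_obj *x = a, *y = b;

	return (x->id > y->id) - (x->id < y->id);
}

/* Display order: tag, commit, tree, blob, then by ID within a type. */
static int demo_cmp_display(const void *a, const void *b)
{
	const struct demo_obj *x = a, *y = b;
	int x_rank = x->type % 4, y_rank = y->type % 4;

	if (x_rank != y_rank)
		return x_rank > y_rank ? 1 : -1;
	return demo_cmp_id(a, b);
}

/*
 * Pass 1: sort by ID and drop adjacent duplicates in place.
 * Pass 2: re-sort the survivors into display order.
 * Returns the number of unique elements left in objs.
 */
static size_t demo_dedupe_and_order(struct demo_obj *objs, size_t nr)
{
	size_t i, out = 0;

	qsort(objs, nr, sizeof(*objs), demo_cmp_id);
	for (i = 0; i < nr; i++)
		if (!out || objs[out - 1].id != objs[i].id)
			objs[out++] = objs[i];
	qsort(objs, out, sizeof(*objs), demo_cmp_display);
	return out;
}
```

After demo_dedupe_and_order() returns, a caller would walk the first
"out" elements in array order and print them without sorting again,
which is the role oid_array_for_each() plays in the patch.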

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/technical/api-oid-array.txt | 17 +++++++----
 sha1-array.c                              | 17 +++++++++++
 sha1-array.h                              |  3 ++
 sha1-name.c                               | 37 ++++++++++++++++++++++-
 t/t1512-rev-parse-disambiguation.sh       | 21 +++++++++++++
 5 files changed, 88 insertions(+), 7 deletions(-)

diff --git a/Documentation/technical/api-oid-array.txt b/Documentation/technical/api-oid-array.txt
index b0c11f868d..94b529722c 100644
--- a/Documentation/technical/api-oid-array.txt
+++ b/Documentation/technical/api-oid-array.txt
@@ -35,13 +35,18 @@ Functions
 	Free all memory associated with the array and return it to the
 	initial, empty state.
 
+`oid_array_for_each`::
+	Iterate over each element of the list, executing the callback
+	function for each one. Does not sort the list, so any custom
+	hash order is retained. If the callback returns a non-zero
+	value, the iteration ends immediately and the callback's
+	return is propagated; otherwise, 0 is returned.
+
 `oid_array_for_each_unique`::
-	Efficiently iterate over each unique element of the list,
-	executing the callback function for each one. If the array is
-	not sorted, this function has the side effect of sorting it. If
-	the callback returns a non-zero value, the iteration ends
-	immediately and the callback's return is propagated; otherwise,
-	0 is returned.
+	Iterate over each unique element of the list in sort order,
+	but otherwise behaves like `oid_array_for_each`. If the array
+	is not sorted, this function has the side effect of sorting
+	it.
 
 Examples
 --------
diff --git a/sha1-array.c b/sha1-array.c
index 838b3bf847..5b2793615b 100644
--- a/sha1-array.c
+++ b/sha1-array.c
@@ -41,6 +41,23 @@ void oid_array_clear(struct oid_array *array)
 	array->sorted = 0;
 }
 
+
+int oid_array_for_each(struct oid_array *array,
+		       for_each_oid_fn fn,
+		       void *data)
+{
+	int i;
+
+	/* No oid_array_sort() here! See the api-oid-array.txt docs! */
+
+	for (i = 0; i < array->nr; i++) {
+		int ret = fn(array->oid + i, data);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
 int oid_array_for_each_unique(struct oid_array *array,
 				for_each_oid_fn fn,
 				void *data)
diff --git a/sha1-array.h b/sha1-array.h
index 1e1d24b009..232bf95017 100644
--- a/sha1-array.h
+++ b/sha1-array.h
@@ -16,6 +16,9 @@ void oid_array_clear(struct oid_array *array);
 
 typedef int (*for_each_oid_fn)(const struct object_id *oid,
 			       void *data);
+int oid_array_for_each(struct oid_array *array,
+		       for_each_oid_fn fn,
+		       void *data);
 int oid_array_for_each_unique(struct oid_array *array,
 			      for_each_oid_fn fn,
 			      void *data);
diff --git a/sha1-name.c b/sha1-name.c
index 9d7bbd3e96..46d8b1afa6 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -378,6 +378,34 @@ static int collect_ambiguous(const struct object_id *oid, void *data)
 	return 0;
 }
 
+static int sort_ambiguous(const void *a, const void *b)
+{
+	int a_type = oid_object_info(a, NULL);
+	int b_type = oid_object_info(b, NULL);
+	int a_type_sort;
+	int b_type_sort;
+
+	/*
+	 * Sorts by hash within the same object type, just as
+	 * oid_array_for_each_unique() would do.
+	 */
+	if (a_type == b_type)
+		return oidcmp(a, b);
+
+	/*
+	 * Between object types show tags, then commits, and finally
+	 * trees and blobs.
+	 *
+	 * The object_type enum is commit, tree, blob, tag, but we
+	 * want tag, commit, tree, blob. Cleverly (perhaps too
+	 * cleverly) do that with modulus, since the enum assigns 1 to
+	 * commit, so tag becomes 0.
+	 */
+	a_type_sort = a_type % 4;
+	b_type_sort = b_type % 4;
+	return a_type_sort > b_type_sort ? 1 : -1;
+}
+
 static int get_short_oid(const char *name, int len, struct object_id *oid,
 			  unsigned flags)
 {
@@ -409,6 +437,8 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 	status = finish_object_disambiguation(&ds, oid);
 
 	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
+		struct oid_array collect = OID_ARRAY_INIT;
+
 		error(_("short SHA1 %s is ambiguous"), ds.hex_pfx);
 
 		/*
@@ -421,7 +451,12 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 			ds.fn = NULL;
 
 		advise(_("The candidates are:"));
-		for_each_abbrev(ds.hex_pfx, show_ambiguous_object, &ds);
+		for_each_abbrev(ds.hex_pfx, collect_ambiguous, &collect);
+		QSORT(collect.oid, collect.nr, sort_ambiguous);
+
+		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
+			BUG("show_ambiguous_object shouldn't return non-zero");
+		oid_array_clear(&collect);
 	}
 
 	return status;
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 711704ba5a..2701462041 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -361,4 +361,25 @@ test_expect_success 'core.disambiguate does not override context' '
 		git -c core.disambiguate=committish rev-parse $sha1^{tree}
 '
 
+test_expect_success C_LOCALE_OUTPUT 'ambiguous commits are printed by type first, then hash order' '
+	test_must_fail git rev-parse 0000 2>stderr &&
+	grep ^hint: stderr >hints &&
+	grep 0000 hints >objects &&
+	cat >expected <<-\EOF &&
+	tag
+	commit
+	tree
+	blob
+	EOF
+	awk "{print \$3}" <objects >objects.types &&
+	uniq <objects.types >objects.types.uniq &&
+	test_cmp expected objects.types.uniq &&
+	for type in tag commit tree blob
+	do
+		grep $type objects >$type.objects &&
+		sort $type.objects >$type.objects.sorted &&
+		test_cmp $type.objects.sorted $type.objects
+	done
+'
+
 test_done
-- 
2.17.0.290.gded63e768a



* [PATCH v2 07/12] get_short_oid: learn to disambiguate by ^{tag}
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (18 preceding siblings ...)
  2018-05-01 12:06 ` [PATCH v2 06/12] get_short_oid: sort ambiguous objects by type, then SHA-1 Ævar Arnfjörð Bjarmason
@ 2018-05-01 12:06 ` Ævar Arnfjörð Bjarmason
  2018-05-01 12:06 ` [PATCH v2 08/12] get_short_oid: learn to disambiguate by ^{blob} Ævar Arnfjörð Bjarmason
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Add support for ^{tag} to the disambiguation logic. Before this ^{tag}
would simply be ignored:

    $ git rev-parse e8f2^{tag}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob
    e8f2^{tag}

Now the logic added in ed1ca6025f ("peel_onion: disambiguate to favor
tree-ish when we know we want a tree-ish", 2013-03-31) has been
extended to support it.

    $ git rev-parse e8f2^{tag}
    e8f2650052f3ff646023725e388ea1112b020e79

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 cache.h                             |  5 +++--
 sha1-name.c                         | 13 ++++++++++++-
 t/t1512-rev-parse-disambiguation.sh |  5 ++++-
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index 354903c3ea..a141995cc7 100644
--- a/cache.h
+++ b/cache.h
@@ -1330,8 +1330,9 @@ struct object_context {
 #define GET_OID_TREE             010
 #define GET_OID_TREEISH          020
 #define GET_OID_BLOB             040
-#define GET_OID_FOLLOW_SYMLINKS 0100
-#define GET_OID_RECORD_PATH     0200
+#define GET_OID_TAG             0100
+#define GET_OID_FOLLOW_SYMLINKS 0200
+#define GET_OID_RECORD_PATH     0400
 #define GET_OID_ONLY_TO_DIE    04000
 
 #define GET_OID_DISAMBIGUATORS \
diff --git a/sha1-name.c b/sha1-name.c
index 46d8b1afa6..68d5f65362 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -221,6 +221,12 @@ static int finish_object_disambiguation(struct disambiguate_state *ds,
 	return 0;
 }
 
+static int disambiguate_tag_only(const struct object_id *oid, void *cb_data_unused)
+{
+	int kind = oid_object_info(oid, NULL);
+	return kind == OBJ_TAG;
+}
+
 static int disambiguate_commit_only(const struct object_id *oid, void *cb_data_unused)
 {
 	int kind = oid_object_info(oid, NULL);
@@ -288,7 +294,8 @@ int set_disambiguate_hint_config(const char *var, const char *value)
 		{ "committish", disambiguate_committish_only },
 		{ "tree", disambiguate_tree_only },
 		{ "treeish", disambiguate_treeish_only },
-		{ "blob", disambiguate_blob_only }
+		{ "blob", disambiguate_blob_only },
+		{ "tag", disambiguate_tag_only }
 	};
 	int i;
 
@@ -429,6 +436,8 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 		ds.fn = disambiguate_treeish_only;
 	else if (flags & GET_OID_BLOB)
 		ds.fn = disambiguate_blob_only;
+	else if (flags & GET_OID_TAG)
+		ds.fn = disambiguate_tag_only;
 	else
 		ds.fn = default_disambiguate_hint;
 
@@ -958,6 +967,8 @@ static int peel_onion(const char *name, int len, struct object_id *oid,
 	lookup_flags &= ~GET_OID_DISAMBIGUATORS;
 	if (expected_type == OBJ_COMMIT)
 		lookup_flags |= GET_OID_COMMITTISH;
+	else if (expected_type == OBJ_TAG)
+		lookup_flags |= GET_OID_TAG;
 	else if (expected_type == OBJ_TREE)
 		lookup_flags |= GET_OID_TREEISH;
 
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 2701462041..74e7d9c178 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -334,7 +334,10 @@ test_expect_success C_LOCALE_OUTPUT 'ambiguity hints respect type' '
 	test_must_fail git rev-parse 000000000^{commit} 2>stderr &&
 	grep ^hint: stderr >hints &&
 	# 5 commits, 1 tag (which is a commitish), plus intro line
-	test_line_count = 7 hints
+	test_line_count = 7 hints &&
+	git rev-parse 000000000^{tag} >stdout &&
+	test_line_count = 1 stdout &&
+	grep -q ^0000000000f8f stdout
 '
 
 test_expect_success C_LOCALE_OUTPUT 'failed type-selector still shows hint' '
-- 
2.17.0.290.gded63e768a



* [PATCH v2 08/12] get_short_oid: learn to disambiguate by ^{blob}
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (19 preceding siblings ...)
  2018-05-01 12:06 ` [PATCH v2 07/12] get_short_oid: learn to disambiguate by ^{tag} Ævar Arnfjörð Bjarmason
@ 2018-05-01 12:06 ` Ævar Arnfjörð Bjarmason
  2018-05-01 12:06 ` [PATCH v2 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish Ævar Arnfjörð Bjarmason
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

The disambiguation logic had all the pieces necessary to only print
out those blobs that were ambiguous, but they hadn't been connected.

The initial logic was added in daba53aeaf ("sha1_name.c: add support
for disambiguating other types", 2012-07-02), and when the flags were
propagated in 8a10fea49b ("get_sha1: propagate flags to child
functions", 2016-09-26) GET_OID_BLOB wasn't added to lookup_flags.

Before this change requests for blobs were simply ignored:

    $ git rev-parse e8f2^{blob}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob
    e8f2^{blob}
    [...]

But now we'll do the right thing and only print the blobs:

    $ git rev-parse e8f2^{blob}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob
    e8f2^{blob}
    [...]

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c                         | 2 ++
 t/t1512-rev-parse-disambiguation.sh | 6 +++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/sha1-name.c b/sha1-name.c
index 68d5f65362..023f9471a8 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -971,6 +971,8 @@ static int peel_onion(const char *name, int len, struct object_id *oid,
 		lookup_flags |= GET_OID_TAG;
 	else if (expected_type == OBJ_TREE)
 		lookup_flags |= GET_OID_TREEISH;
+	else if (expected_type == OBJ_BLOB)
+		lookup_flags |= GET_OID_BLOB;
 
 	if (get_oid_1(name, sp - name - 2, &outer, lookup_flags))
 		return -1;
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 74e7d9c178..9ce9cc3bc3 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -337,7 +337,11 @@ test_expect_success C_LOCALE_OUTPUT 'ambiguity hints respect type' '
 	test_line_count = 7 hints &&
 	git rev-parse 000000000^{tag} >stdout &&
 	test_line_count = 1 stdout &&
-	grep -q ^0000000000f8f stdout
+	grep -q ^0000000000f8f stdout &&
+	test_must_fail git rev-parse 000000000^{blob} 2>stderr &&
+	grep ^hint: stderr >hints &&
+	# 5 blobs plus intro line
+	test_line_count = 6 hints
 '
 
 test_expect_success C_LOCALE_OUTPUT 'failed type-selector still shows hint' '
-- 
2.17.0.290.gded63e768a



* [PATCH v2 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (20 preceding siblings ...)
  2018-05-01 12:06 ` [PATCH v2 08/12] get_short_oid: learn to disambiguate by ^{blob} Ævar Arnfjörð Bjarmason
@ 2018-05-01 12:06 ` Ævar Arnfjörð Bjarmason
  2018-05-01 12:06 ` [PATCH v2 10/12] get_short_oid / peel_onion: ^{commit} should be commit, not committish Ævar Arnfjörð Bjarmason
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

After the recent series of patches ^{tag} and ^{blob} now work to get
just the tags and blobs, but ^{tree} will still list any
tree-ish (commits, tags and trees).

The previous behavior was added in ed1ca6025f ("peel_onion:
disambiguate to favor tree-ish when we know we want a tree-ish",
2013-03-31). I may have missed some special-case but this makes more
sense to me.

Now "$sha1:" can be used as before to mean treeish:

    $ git rev-parse e8f2:
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    [...]

With ^{tree} we now show just the trees, whereas it would previously
have been equivalent to the above:

    $ git rev-parse e8f2^{tree}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    [...]

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c                         |  2 +-
 t/t1512-rev-parse-disambiguation.sh | 18 ++++++++++++++----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index 023f9471a8..b61c0558d9 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -970,7 +970,7 @@ static int peel_onion(const char *name, int len, struct object_id *oid,
 	else if (expected_type == OBJ_TAG)
 		lookup_flags |= GET_OID_TAG;
 	else if (expected_type == OBJ_TREE)
-		lookup_flags |= GET_OID_TREEISH;
+		lookup_flags |= GET_OID_TREE;
 	else if (expected_type == OBJ_BLOB)
 		lookup_flags |= GET_OID_BLOB;
 
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 9ce9cc3bc3..81076449a2 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -159,9 +159,13 @@ test_expect_failure 'two semi-ambiguous commit-ish' '
 	git log 0000000000...
 '
 
-test_expect_failure 'three semi-ambiguous tree-ish' '
+test_expect_success 'three semi-ambiguous tree-ish' '
 	# Likewise for tree-ish.  HEAD, v1.0.0 and HEAD^{tree} share
 	# the prefix but peeling them to tree yields the same thing
+	test_must_fail git rev-parse --verify 0000000000: &&
+
+	# For ^{tree} we can disambiguate because HEAD and v1.0.0 will
+	# be excluded.
 	git rev-parse --verify 0000000000^{tree}
 '
 
@@ -267,8 +271,12 @@ test_expect_success 'ambiguous commit-ish' '
 # There are three objects with this prefix: a blob, a tree, and a tag. We know
 # the blob will not pass as a treeish, but the tree and tag should (and thus
 # cause an error).
-test_expect_success 'ambiguous tags peel to treeish' '
-	test_must_fail git rev-parse 0000000000f^{tree}
+test_expect_success 'ambiguous tags peel to treeish or tree' '
+	test_must_fail git rev-parse 0000000000f: &&
+	git rev-parse 0000000000f^{tree} >stdout &&
+	test_line_count = 1 stdout &&
+	grep -q ^0000000000fd8bcc56 stdout
+
 '
 
 test_expect_success 'rev-parse --disambiguate' '
@@ -365,7 +373,9 @@ test_expect_success 'core.disambiguate config can prefer types' '
 test_expect_success 'core.disambiguate does not override context' '
 	# treeish ambiguous between tag and tree
 	test_must_fail \
-		git -c core.disambiguate=committish rev-parse $sha1^{tree}
+		git -c core.disambiguate=committish rev-parse $sha1: &&
+	# tree not ambiguous between tag and tree
+	git -c core.disambiguate=committish rev-parse $sha1^{tree}
 '
 
 test_expect_success C_LOCALE_OUTPUT 'ambiguous commits are printed by type first, then hash order' '
-- 
2.17.0.290.gded63e768a



* [PATCH v2 10/12] get_short_oid / peel_onion: ^{commit} should be commit, not committish
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (21 preceding siblings ...)
  2018-05-01 12:06 ` [PATCH v2 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish Ævar Arnfjörð Bjarmason
@ 2018-05-01 12:06 ` Ævar Arnfjörð Bjarmason
  2018-05-01 12:06 ` [PATCH v2 11/12] config doc: document core.disambiguate Ævar Arnfjörð Bjarmason
  2018-05-01 12:06 ` [PATCH v2 12/12] get_short_oid: document & warn if we ignore the type selector Ævar Arnfjörð Bjarmason
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Change the ^{commit} syntax to mean just commits instead of committish
for the purpose of disambiguation. Before this e8f2^{commit} would
show the v2.17.0 tag as a disambiguation candidate, but now it'll just
show ambiguous commits:

    $ git rev-parse e8f2^{commit}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    [...]
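
Taken together with the preceding patches in this series, the mapping
from the peel type to a lookup flag in peel_onion() ends up roughly as
in the sketch below. This is illustrative only: the DEMO_ names are
hypothetical, the tree, treeish, blob and tag bit values are copied
from the cache.h hunk earlier in the series, and the commit and
committish values as well as the contents of the mask are assumptions.

```c
/* Hypothetical type codes: commit = 1, tree = 2, blob = 3, tag = 4. */
enum demo_type { DEMO_COMMIT = 1, DEMO_TREE, DEMO_BLOB, DEMO_TAG };

#define DEMO_GET_COMMIT       02	/* assumed value */
#define DEMO_GET_COMMITTISH   04	/* assumed value */
#define DEMO_GET_TREE        010
#define DEMO_GET_TREEISH     020
#define DEMO_GET_BLOB        040
#define DEMO_GET_TAG        0100

/* Assumed mask covering every type selector bit above. */
#define DEMO_DISAMBIGUATORS \
	(DEMO_GET_COMMIT | DEMO_GET_COMMITTISH | DEMO_GET_TREE | \
	 DEMO_GET_TREEISH | DEMO_GET_BLOB | DEMO_GET_TAG)

/*
 * Clear any selector inherited from the caller, then set the exact
 * selector for the requested peel type. Unknown types set nothing.
 */
static unsigned demo_peel_flags(unsigned flags, int expected_type)
{
	flags &= ~(unsigned)DEMO_DISAMBIGUATORS;
	switch (expected_type) {
	case DEMO_COMMIT:
		return flags | DEMO_GET_COMMIT;
	case DEMO_TAG:
		return flags | DEMO_GET_TAG;
	case DEMO_TREE:
		return flags | DEMO_GET_TREE;
	case DEMO_BLOB:
		return flags | DEMO_GET_BLOB;
	default:
		return flags;
	}
}
```

The point of clearing the mask first is that a selector passed in from
an outer context never combines with the one implied by the peel
syntax; unrelated flag bits pass through untouched.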

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c                         | 2 +-
 t/t1512-rev-parse-disambiguation.sh | 7 ++++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index b61c0558d9..1d2a74a29c 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -966,7 +966,7 @@ static int peel_onion(const char *name, int len, struct object_id *oid,
 
 	lookup_flags &= ~GET_OID_DISAMBIGUATORS;
 	if (expected_type == OBJ_COMMIT)
-		lookup_flags |= GET_OID_COMMITTISH;
+		lookup_flags |= GET_OID_COMMIT;
 	else if (expected_type == OBJ_TAG)
 		lookup_flags |= GET_OID_TAG;
 	else if (expected_type == OBJ_TREE)
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 81076449a2..b17973a266 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -341,8 +341,8 @@ test_expect_success C_LOCALE_OUTPUT 'ambiguity hints' '
 test_expect_success C_LOCALE_OUTPUT 'ambiguity hints respect type' '
 	test_must_fail git rev-parse 000000000^{commit} 2>stderr &&
 	grep ^hint: stderr >hints &&
-	# 5 commits, 1 tag (which is a commitish), plus intro line
-	test_line_count = 7 hints &&
+	# 5 commits plus intro line
+	test_line_count = 6 hints &&
 	git rev-parse 000000000^{tag} >stdout &&
 	test_line_count = 1 stdout &&
 	grep -q ^0000000000f8f stdout &&
@@ -366,7 +366,8 @@ test_expect_success 'core.disambiguate config can prefer types' '
 	# ambiguous between tree and tag
 	sha1=0000000000f &&
 	test_must_fail git rev-parse $sha1 &&
-	git rev-parse $sha1^{commit} &&
+	# there is no commit so ^{commit} comes up empty
+	test_must_fail git rev-parse $sha1^{commit} &&
 	git -c core.disambiguate=committish rev-parse $sha1
 '
 
-- 
2.17.0.290.gded63e768a



* [PATCH v2 11/12] config doc: document core.disambiguate
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (22 preceding siblings ...)
  2018-05-01 12:06 ` [PATCH v2 10/12] get_short_oid / peel_onion: ^{commit} should be commit, not committish Ævar Arnfjörð Bjarmason
@ 2018-05-01 12:06 ` Ævar Arnfjörð Bjarmason
  2018-05-01 12:06 ` [PATCH v2 12/12] get_short_oid: document & warn if we ignore the type selector Ævar Arnfjörð Bjarmason
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

The core.disambiguate variable was added in
5b33cb1fd7 ("get_short_sha1: make default disambiguation
configurable", 2016-09-27) but never documented.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config.txt | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2659153cb3..14a3d57e77 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -910,6 +910,19 @@ core.abbrev::
 	abbreviated object names to stay unique for some time.
 	The minimum length is 4.
 
+core.disambiguate::
+	If Git is given a SHA-1 that's ambiguous it'll suggest what
+	objects you might mean. By default it'll print out all
+	potential objects with that prefix regardless of their
+	type. This setting, along with the `^{<type>}` peel syntax
+	(see linkgit:gitrevisions[7]), allows for narrowing that down.
++
+Is set to `none` by default to show all object types. Can also be
+`commit` (peel syntax: `$sha1^{commit}`), `committish` (commits and
+tags), `tree` (peel: `$sha1^{tree}`), `treeish` (everything except
+blobs, peel syntax: `$sha1:`), `blob` (peel: `$sha1^{blob}`) or `tag`
+(peel: `$sha1^{tag}`). The peel syntax will override any config value.
+
 add.ignoreErrors::
 add.ignore-errors (deprecated)::
 	Tells 'git add' to continue adding files when some files cannot be
-- 
2.17.0.290.gded63e768a



* [PATCH v2 12/12] get_short_oid: document & warn if we ignore the type selector
  2018-04-30 22:07 [PATCH 0/9] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                   ` (23 preceding siblings ...)
  2018-05-01 12:06 ` [PATCH v2 11/12] config doc: document core.disambiguate Ævar Arnfjörð Bjarmason
@ 2018-05-01 12:06 ` Ævar Arnfjörð Bjarmason
  24 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

The SHA1 prefix 06fa currently matches no blobs in git.git. When
disambiguating short SHA1s we've been quietly ignoring the user's type
selector as a fallback mechanism. This was intentionally added in
1ffa26c461 ("get_short_sha1: list ambiguous objects on error",
2016-09-26).

I think that behavior makes sense: it's not very useful to just show
nothing because a preference has been expressed via core.disambiguate,
but it's bad that we're quietly doing this. The user might think that
we just didn't understand what e.g. 06fa^{blob} meant.

Now we'll instead print a warning if no objects of the requested type
were found:

    $ git rev-parse 06fa^{blob}
    error: short SHA1 06fa is ambiguous
    hint: The candidates are:
    [... no blobs listed ...]
    warning: Your hint (via core.disambiguate or peel syntax) was ignored, we fell
    back to showing all object types since no object of the requested type
    matched the provide short SHA1 06fa
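
The fallback described above can be modelled with a small sketch. It
is a simplification and not the code in this patch: the demo_ names
are hypothetical, candidates are reduced to bare type codes, and a
NULL filter stands for "show every object type".

```c
#include <stddef.h>

/* Hypothetical type codes: commit = 1, tree = 2, blob = 3, tag = 4. */
enum { DEMO_COMMIT = 1, DEMO_TREE = 2, DEMO_BLOB = 3, DEMO_TAG = 4 };

typedef int (*demo_hint_fn)(int type);

/* Example hint: accept blobs only. */
static int demo_blob_only(int type)
{
	return type == DEMO_BLOB;
}

/*
 * Decide which filter to use when listing candidates. If the hint
 * rejects every candidate, drop it (return NULL, meaning "show all")
 * and set *ignored_hint so the caller can warn about the fallback.
 */
static demo_hint_fn demo_pick_filter(const int *types, size_t nr,
				     demo_hint_fn hint, int *ignored_hint)
{
	size_t i;

	*ignored_hint = 0;
	if (!hint)
		return NULL;
	for (i = 0; i < nr; i++)
		if (hint(types[i]))
			return hint;
	*ignored_hint = 1;
	return NULL;
}
```

A caller would list the candidates using the returned filter and then
emit the warning only when *ignored_hint was set, so the list comes
first and the explanation of why it is unfiltered comes last.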

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config.txt            |  4 ++++
 sha1-name.c                         | 11 ++++++++++-
 t/t1512-rev-parse-disambiguation.sh |  5 ++++-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 14a3d57e77..e14f2c0492 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -922,6 +922,10 @@ Is set to `none` by default to show all object types. Can also be
 tags), `tree` (peel: `$sha1^{tree}`), `treeish` (everything except
 blobs, peel syntax: `$sha1:`), `blob` (peel: `$sha1^{blob}`) or `tag`
 (peel: `$sha1^{tag}`). The peel syntax will override any config value.
++
+If no objects of the selected type exist the disambiguation will fall
+back to `none` and print a warning indicating no objects of the
+selected type could be found for that prefix.
 
 add.ignoreErrors::
 add.ignore-errors (deprecated)::
diff --git a/sha1-name.c b/sha1-name.c
index 1d2a74a29c..9789764a38 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -447,6 +447,7 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 
 	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
 		struct oid_array collect = OID_ARRAY_INIT;
+		int ignored_hint = 0;
 
 		error(_("short SHA1 %s is ambiguous"), ds.hex_pfx);
 
@@ -456,8 +457,10 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 		 * that case, we still want to show them, so disable the hint
 		 * function entirely.
 		 */
-		if (!ds.ambiguous)
+		if (!ds.ambiguous) {
 			ds.fn = NULL;
+			ignored_hint = 1;
+		}
 
 		advise(_("The candidates are:"));
 		for_each_abbrev(ds.hex_pfx, collect_ambiguous, &collect);
@@ -466,6 +469,12 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
 			BUG("show_ambiguous_object shouldn't return non-zero");
 		oid_array_clear(&collect);
+
+		if (ignored_hint) {
+			warning(_("Your hint (via core.disambiguate or peel syntax) was ignored, we fell\n"
+				  "back to showing all object types since no object of the requested type\n"
+				  "matched the provide short SHA1 %s"), ds.hex_pfx);
+		}
 	}
 
 	return status;
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index b17973a266..940f323ee9 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -359,7 +359,10 @@ test_expect_success C_LOCALE_OUTPUT 'failed type-selector still shows hint' '
 	echo 872 | git hash-object --stdin -w &&
 	test_must_fail git rev-parse ee3d^{commit} 2>stderr &&
 	grep ^hint: stderr >hints &&
-	test_line_count = 3 hints
+	test_line_count = 3 hints &&
+	grep ^warning stderr >warnings &&
+	grep -q "Your hint.*was ignored" warnings &&
+	grep -q "the provide short SHA1 ee3d" stderr
 '
 
 test_expect_success 'core.disambiguate config can prefer types' '
-- 
2.17.0.290.gded63e768a



* Re: [PATCH 4/9] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-01 11:27     ` Ævar Arnfjörð Bjarmason
@ 2018-05-01 12:26       ` Derrick Stolee
  2018-05-01 12:36         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 99+ messages in thread
From: Derrick Stolee @ 2018-05-01 12:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jeff King, brian m . carlson

On 5/1/2018 7:27 AM, Ævar Arnfjörð Bjarmason wrote:
> On Tue, May 01 2018, Derrick Stolee wrote:
>
>> On 4/30/2018 6:07 PM, Ævar Arnfjörð Bjarmason wrote:
>>> Since we show the commit data in the output that's nicely aligned once
>>> we sort by object type. The decision to show tags before commits is
>>> pretty arbitrary, but it's much less likely that we'll display a tag,
>>> so if there is one it makes sense to show it first.
>> Here's a non-arbitrary reason: the object types are ordered
>> topologically (ignoring self-references):
>>
>> tag -> commit, tree, blob
>> commit -> tree
>> tree -> blob
> Thanks. I'll add a patch with that comment to v2.
>
>>> @@ -421,7 +451,12 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
>>>    			ds.fn = NULL;
>>>      		advise(_("The candidates are:"));
>>> -		for_each_abbrev(ds.hex_pfx, show_ambiguous_object, &ds);
>>> +		for_each_abbrev(ds.hex_pfx, collect_ambiguous, &collect);
>>> +		QSORT(collect.oid, collect.nr, sort_ambiguous);
>> I was wondering how the old code sorted by SHA even when the ambiguous
>> objects were loaded from different sources (multiple pack-files, loose
>> objects). Turns out that for_each_abbrev() does its own sort after
>> collecting the SHAs and then calls the given function pointer only
>> once per distinct object. This avoids multiple instances of the same
>> object, which may appear multiple times across pack-files.
>>
>> I only ask because now we are doing two sorts. I wonder if it would be
>> more elegant to provide your sorting algorithm to for_each_abbrev()
>> and let it call show_ambiguous_object as before.
>>
>> Another question is if we should use this sort generally for all calls
>> to for_each_abbrev(). The only other case I see is in
>> builtin/revparse.c.
> When preparing v2 I realized how confusing this was, so I'd added this
> to the commit message of my WIP re-roll which should explain this:
>
>      A note on the implementation: I started out with something much
>      simpler which just replaced oid_array_sort() in sha1-array.c with a
>      custom sort function before calling oid_array_for_each_unique(). But
>      then dumbly noticed that it doesn't work because the output function
>      was tangled up with the code added in fad6b9e590 ("for_each_abbrev:
>      drop duplicate objects", 2016-09-26) to ensure we don't display
>      duplicate objects.
>      
>      That's why we're doing two passes here, first we need to sort the list
>      and de-duplicate the objects, then sort them in our custom order, and
>      finally output them without re-sorting them. I suppose we could also
>      make oid_array_for_each_unique() maintain a hashmap of emitted
>      objects, but that would increase its memory profile and wouldn't be
>      worth the complexity for this one-off use-case,
>      oid_array_for_each_unique() is used in many other places.

How would sorting in our custom order before de-duplicating fail the 
de-duplication? We will still pair identical OIDs as consecutive 
elements and oid_array_for_each_unique only cares about consecutive 
elements having distinct OIDs, not lex-ordered OIDs.

Perhaps the noise is because we rely on oid_array_sort() to mark the 
array as sorted inside oid_array_for_each_unique(), but that could be 
remedied by calling our QSORT() inside for_each_abbrev() and marking the 
array as sorted before calling oid_array_for_each_unique().

(Again, my comments are not meant to block this series.)

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 4/9] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-01 12:26       ` Derrick Stolee
@ 2018-05-01 12:36         ` Ævar Arnfjörð Bjarmason
  2018-05-01 13:05           ` Derrick Stolee
  0 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 12:36 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, Junio C Hamano, Jeff King, brian m . carlson


On Tue, May 01 2018, Derrick Stolee wrote:

> On 5/1/2018 7:27 AM, Ævar Arnfjörð Bjarmason wrote:
>> On Tue, May 01 2018, Derrick Stolee wrote:
>>
>>> On 4/30/2018 6:07 PM, Ævar Arnfjörð Bjarmason wrote:
>>>> Since we show the commit data in the output that's nicely aligned once
>>>> we sort by object type. The decision to show tags before commits is
>>>> pretty arbitrary, but it's much less likely that we'll display a tag,
>>>> so if there is one it makes sense to show it first.
>>> Here's a non-arbitrary reason: the object types are ordered
>>> topologically (ignoring self-references):
>>>
>>> tag -> commit, tree, blob
>>> commit -> tree
>>> tree -> blob
>> Thanks. I'll add a patch with that comment to v2.
>>
>>>> @@ -421,7 +451,12 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
>>>>    			ds.fn = NULL;
>>>>      		advise(_("The candidates are:"));
>>>> -		for_each_abbrev(ds.hex_pfx, show_ambiguous_object, &ds);
>>>> +		for_each_abbrev(ds.hex_pfx, collect_ambiguous, &collect);
>>>> +		QSORT(collect.oid, collect.nr, sort_ambiguous);
>>> I was wondering how the old code sorted by SHA even when the ambiguous
>>> objects were loaded from different sources (multiple pack-files, loose
>>> objects). Turns out that for_each_abbrev() does its own sort after
>>> collecting the SHAs and then calls the given function pointer only
>>> once per distinct object. This avoids multiple instances of the same
>>> object, which may appear multiple times across pack-files.
>>>
>>> I only ask because now we are doing two sorts. I wonder if it would be
>>> more elegant to provide your sorting algorithm to for_each_abbrev()
>>> and let it call show_ambiguous_object as before.
>>>
>>> Another question is if we should use this sort generally for all calls
>>> to for_each_abbrev(). The only other case I see is in
>>> builtin/revparse.c.
>> When preparing v2 I realized how confusing this was, so I'd added this
>> to the commit message of my WIP re-roll which should explain this:
>>
>>      A note on the implementation: I started out with something much
>>      simpler which just replaced oid_array_sort() in sha1-array.c with a
>>      custom sort function before calling oid_array_for_each_unique(). But
>>      then dumbly noticed that it doesn't work because the output function
>>      was tangled up with the code added in fad6b9e590 ("for_each_abbrev:
>>      drop duplicate objects", 2016-09-26) to ensure we don't display
>>      duplicate objects.
>>
>>      That's why we're doing two passes here, first we need to sort the list
>>      and de-duplicate the objects, then sort them in our custom order, and
>>      finally output them without re-sorting them. I suppose we could also
>>      make oid_array_for_each_unique() maintain a hashmap of emitted
>>      objects, but that would increase its memory profile and wouldn't be
>>      worth the complexity for this one-off use-case,
>>      oid_array_for_each_unique() is used in many other places.
>
> How would sorting in our custom order before de-duplicating fail the
> de-duplication? We will still pair identical OIDs as consecutive
> elements and oid_array_for_each_unique only cares about consecutive
> elements having distinct OIDs, not lex-ordered OIDs.

Because there's no de-duplication without the array first being sorted
in oidcmp() order, which oid_array_for_each_unique() checks for and
re-sorts if !array->sorted. I.e. its de-duplication is just a state
machine where it won't call the callback if the currently processed
element has the same SHA1 as the last one.
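
To make that concrete, here is a minimal sketch of such a state machine. It
uses a toy ID type rather than struct object_id, every name in it is
hypothetical, and it is not the code in sha1-array.c. It only ever compares
an element against the one right before it, so whether duplicates are
dropped depends entirely on whether the chosen order keeps equal IDs next
to each other:

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct object_id: a short hex string, not a real hash. */
struct toy_oid {
	char hex[8];
};

/*
 * Copy each element of `in` to `out` unless it equals the element
 * immediately before it. Returns the number of elements written.
 * No sorting happens here; the caller decides the order.
 */
static size_t toy_emit_unique(const struct toy_oid *in, size_t nr,
			      struct toy_oid *out)
{
	size_t i, n = 0;

	for (i = 0; i < nr; i++) {
		if (i > 0 && !strcmp(in[i].hex, in[i - 1].hex))
			continue; /* same as the previous entry: skip it */
		out[n++] = in[i];
	}
	return n;
}
```

Feeding it "aa11 aa11 bb22" emits two entries, while "aa11 bb22 aa11" emits
all three, because the repeated ID is no longer adjacent to its twin.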

> Perhaps the noise is because we rely on oid_array_sort() to mark the
> array as sorted inside oid_array_for_each_unique(), but that could be
> remedied by calling our QSORT() inside for_each_abbrev() and marking
> the array as sorted before calling oid_array_for_each_unique().

As noted above this won't work, because the function inherently relies
on the array being sorted to be able to de-duplicate. Doing this will
yield duplicate entries.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* [PATCH v2 06/11] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
@ 2018-05-01 13:03   ` Derrick Stolee
  2018-05-01 13:39     ` Ævar Arnfjörð Bjarmason
  2018-05-01 18:40   ` [PATCH v3 00/12] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                     ` (12 subsequent siblings)
  13 siblings, 1 reply; 99+ messages in thread
From: Derrick Stolee @ 2018-05-01 13:03 UTC (permalink / raw)
  To: git, avarab; +Cc: stolee, Derrick Stolee

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Here is what I mean by sorting during for_each_abbrev(). This seems to work for
me, so I don't know what the issue is with this one-pass approach.

Thanks,
-Stolee

-- >8 --

Change the output emitted when an ambiguous object is encountered so
that we show tags first, then commits, followed by trees, and finally
blobs. Within each type we show objects in hashcmp() order. Before
this change the objects were only ordered by hashcmp().

The reason for doing this is that the output looks better as a result,
e.g. the v2.17.0 tag before this change on "git show e8f2" would
display:

    hint: The candidates are:
    hint:   e8f2093055 tree
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f25a3a50 tree
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f2867228 blob
    hint:   e8f28d537c tree
    hint:   e8f2a35526 blob
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2cf6ec0 tree

Now we'll instead show:

    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob

Since we show the commit data in the output that's nicely aligned once
we sort by object type. The decision to show tags before commits is
pretty arbitrary. I don't want to order by object_type since there
tags come last after blobs, which doesn't make sense if we want to
show the most important things first.

I could display them after commits, but it's much less likely that
we'll display a tag, so if there is one it makes sense to show it
prominently at the top.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 sha1-name.c                         | 31 +++++++++++++++++++++++++++++
 t/t1512-rev-parse-disambiguation.sh | 21 +++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/sha1-name.c b/sha1-name.c
index 5deebab56d..0336630c64 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -385,6 +385,34 @@ static int collect_ambiguous(const struct object_id *oid, void *data)
 	return 0;
 }
 
+static int sort_ambiguous(const void *a, const void *b)
+{
+	int a_type = oid_object_info(a, NULL);
+	int b_type = oid_object_info(b, NULL);
+	int a_type_sort;
+	int b_type_sort;
+
+	/*
+	 * Sorts by hash within the same object type, just as
+	 * oid_array_for_each_unique() would do.
+	 */
+	if (a_type == b_type)
+		return oidcmp(a, b);
+
+	/*
+	 * Between object types show tags, then commits, and finally
+	 * trees and blobs.
+	 *
+	 * The object_type enum is commit, tree, blob, tag, but we
+	 * want tag, commit, tree blob. Cleverly (perhaps too
+	 * cleverly) do that with modulus, since the enum assigns 1 to
+	 * commit, so tag becomes 0.
+	 */
+	a_type_sort = a_type % 4;
+	b_type_sort = b_type % 4;
+	return a_type_sort > b_type_sort ? 1 : -1;
+}
+
 static int get_short_oid(const char *name, int len, struct object_id *oid,
 			  unsigned flags)
 {
@@ -451,6 +479,9 @@ int for_each_abbrev(const char *prefix, each_abbrev_fn fn, void *cb_data)
 	find_short_object_filename(&ds);
 	find_short_packed_object(&ds);
 
+	QSORT(collect.oid, collect.nr, sort_ambiguous);
+	collect.sorted = 1;
+
 	ret = oid_array_for_each_unique(&collect, fn, cb_data);
 	oid_array_clear(&collect);
 	return ret;
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index c7ceda2f21..74e7d9c178 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -364,4 +364,25 @@ test_expect_success 'core.disambiguate does not override context' '
 		git -c core.disambiguate=committish rev-parse $sha1^{tree}
 '
 
+test_expect_success C_LOCALE_OUTPUT 'ambiguous commits are printed by type first, then hash order' '
+	test_must_fail git rev-parse 0000 2>stderr &&
+	grep ^hint: stderr >hints &&
+	grep 0000 hints >objects &&
+	cat >expected <<-\EOF &&
+	tag
+	commit
+	tree
+	blob
+	EOF
+	awk "{print \$3}" <objects >objects.types &&
+	uniq <objects.types >objects.types.uniq &&
+	test_cmp expected objects.types.uniq &&
+	for type in tag commit tree blob
+	do
+		grep $type objects >$type.objects &&
+		sort $type.objects >$type.objects.sorted &&
+		test_cmp $type.objects.sorted $type.objects
+	done
+'
+
 test_done
-- 
2.17.0.39.g685157f7fb


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* Re: [PATCH 4/9] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-01 12:36         ` Ævar Arnfjörð Bjarmason
@ 2018-05-01 13:05           ` Derrick Stolee
  0 siblings, 0 replies; 99+ messages in thread
From: Derrick Stolee @ 2018-05-01 13:05 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jeff King, brian m . carlson

On 5/1/2018 8:36 AM, Ævar Arnfjörð Bjarmason wrote:
> On Tue, May 01 2018, Derrick Stolee wrote:
>
>> How would sorting in our custom order before de-duplicating fail the
>> de-duplication? We will still pair identical OIDs as consecutive
>> elements and oid_array_for_each_unique only cares about consecutive
>> elements having distinct OIDs, not lex-ordered OIDs.
> Because there's no de-duplication without the array first being sorted
> in oidcmp() order, which oid_array_for_each_unique() checks for and
> re-sorts if !array->sorted. I.e. its de-duplication is just a state
> machine where it won't call the callback if the currently processed
> element has the same SHA1 as the last one.
>
>> Perhaps the noise is because we rely on oid_array_sort() to mark the
>> array as sorted inside oid_array_for_each_unique(), but that could be
>> remedied by calling our QSORT() inside for_each_abbrev() and marking
>> the array as sorted before calling oid_array_for_each_unique().
> As noted above this won't work, because the function inherently relies
> on the array being sorted to be able to de-duplicate. Doing this will
> yield duplicate entries.

I'm confused as to why my suggestion doesn't work, so I made it 
concrete. I sent an alternate commit 6/12 to your v2 series [1].

Thanks,
-Stolee

[1] 
https://public-inbox.org/git/20180501130318.58251-1-dstolee@microsoft.com/T/#u

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v2 06/11] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-01 13:03   ` [PATCH v2 06/11] get_short_oid: sort ambiguous objects by type, then SHA-1 Derrick Stolee
@ 2018-05-01 13:39     ` Ævar Arnfjörð Bjarmason
  2018-05-01 13:44       ` Derrick Stolee
  0 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 13:39 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, stolee


On Tue, May 01 2018, Derrick Stolee wrote:

> From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>
> Here is what I mean by sorting during for_each_abbrev(). This seems to work for
> me, so I don't know what the issue is with this one-pass approach.
> [...]
> +static int sort_ambiguous(const void *a, const void *b)
> +{
> +	int a_type = oid_object_info(a, NULL);
> +	int b_type = oid_object_info(b, NULL);
> +	int a_type_sort;
> +	int b_type_sort;
> +
> +	/*
> +	 * Sorts by hash within the same object type, just as
> +	 * oid_array_for_each_unique() would do.
> +	 */
> +	if (a_type == b_type)
> +		return oidcmp(a, b);
> +
> +	/*
> +	 * Between object types show tags, then commits, and finally
> +	 * trees and blobs.
> +	 *
> +	 * The object_type enum is commit, tree, blob, tag, but we
> +	 * want tag, commit, tree blob. Cleverly (perhaps too
> +	 * cleverly) do that with modulus, since the enum assigns 1 to
> +	 * commit, so tag becomes 0.
> +	 */
> +	a_type_sort = a_type % 4;
> +	b_type_sort = b_type % 4;
> +	return a_type_sort > b_type_sort ? 1 : -1;
> +}
> +
>  static int get_short_oid(const char *name, int len, struct object_id *oid,
>  			  unsigned flags)
>  {
> @@ -451,6 +479,9 @@ int for_each_abbrev(const char *prefix, each_abbrev_fn fn, void *cb_data)
>  	find_short_object_filename(&ds);
>  	find_short_packed_object(&ds);
>
> +	QSORT(collect.oid, collect.nr, sort_ambiguous);
> +	collect.sorted = 1;
> +

Yes this works. You're right. I wasn't trying to intentionally omit
stuff in my recent 878t93zh60.fsf@evledraar.gmail.com, I'd just written
this code some days ago and forgotten why I did what I was doing (and
this is hard to test for), but it's all coming back to me now.

The actual requirement for oid_array_for_each_unique() working properly
is that you've got to feed it in hash order, but my new sort_ambiguous()
still does that (barring any SHA-1 collisions, at which point we have
bigger problems), so two passes aren't needed. So yes, this approach
works and is one-pass.

But that's just an implementation detail of the current sort method,
when I wrote this I was initially playing with other sort orders,
e.g. sorting SHAs regardless of type by the mtime of the file I found
them in. With this approach I'd start printing duplicates if I changed
the internals of sort_ambiguous() like that.

But I think it's extremely implausible that we'll start sorting things
like that, so I'll just take this method of doing it and add some
comment saying we must hashcmp() the entries in our own sort function
for the de-duplication to work, I don't see us ever changing that.
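
For reference, a standalone sketch of that ordering rule, assuming toy
structs in place of git's object IDs (the toy_* names are made up for
illustration, and the enum values only mirror what the patch comment
describes: commit is 1, tag is 4). The point is that the comparator falls
back to comparing the IDs themselves, so two entries with the same ID and
type always end up adjacent:

```c
#include <stdlib.h>
#include <string.h>

/* Values follow the order in the patch comment: commit is 1, tag is 4. */
enum toy_type { TOY_COMMIT = 1, TOY_TREE = 2, TOY_BLOB = 3, TOY_TAG = 4 };

struct toy_obj {
	enum toy_type type;
	char hex[8]; /* toy ID, not a real hash */
};

/* The modulus trick: tag -> 0, commit -> 1, tree -> 2, blob -> 3. */
static int toy_type_rank(enum toy_type t)
{
	return t % 4;
}

/* qsort() comparator: order by type rank first, then by ID. */
static int toy_sort_ambiguous(const void *va, const void *vb)
{
	const struct toy_obj *a = va, *b = vb;
	int ra = toy_type_rank(a->type), rb = toy_type_rank(b->type);

	if (ra != rb)
		return ra < rb ? -1 : 1;
	return strcmp(a->hex, b->hex);
}
```

Calling qsort(objs, nr, sizeof(objs[0]), toy_sort_ambiguous) on a mixed
list puts tags first, then commits, trees and blobs, with each group in ID
order. A sort key unrelated to the ID (such as the mtime idea above) would
not give that adjacency guarantee.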

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v2 06/11] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-01 13:39     ` Ævar Arnfjörð Bjarmason
@ 2018-05-01 13:44       ` Derrick Stolee
  2018-05-01 14:10         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 99+ messages in thread
From: Derrick Stolee @ 2018-05-01 13:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Derrick Stolee; +Cc: git



On 5/1/2018 9:39 AM, Ævar Arnfjörð Bjarmason wrote:
> On Tue, May 01 2018, Derrick Stolee wrote:
>
>> From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>>
>> Here is what I mean by sorting during for_each_abbrev(). This seems to work for
>> me, so I don't know what the issue is with this one-pass approach.
>> [...]
>> +static int sort_ambiguous(const void *a, const void *b)
>> +{
>> +	int a_type = oid_object_info(a, NULL);
>> +	int b_type = oid_object_info(b, NULL);
>> +	int a_type_sort;
>> +	int b_type_sort;
>> +
>> +	/*
>> +	 * Sorts by hash within the same object type, just as
>> +	 * oid_array_for_each_unique() would do.
>> +	 */
>> +	if (a_type == b_type)
>> +		return oidcmp(a, b);
>> +
>> +	/*
>> +	 * Between object types show tags, then commits, and finally
>> +	 * trees and blobs.
>> +	 *
>> +	 * The object_type enum is commit, tree, blob, tag, but we
>> +	 * want tag, commit, tree blob. Cleverly (perhaps too
>> +	 * cleverly) do that with modulus, since the enum assigns 1 to
>> +	 * commit, so tag becomes 0.
>> +	 */
>> +	a_type_sort = a_type % 4;
>> +	b_type_sort = b_type % 4;
>> +	return a_type_sort > b_type_sort ? 1 : -1;
>> +}
>> +
>>   static int get_short_oid(const char *name, int len, struct object_id *oid,
>>   			  unsigned flags)
>>   {
>> @@ -451,6 +479,9 @@ int for_each_abbrev(const char *prefix, each_abbrev_fn fn, void *cb_data)
>>   	find_short_object_filename(&ds);
>>   	find_short_packed_object(&ds);
>>
>> +	QSORT(collect.oid, collect.nr, sort_ambiguous);
>> +	collect.sorted = 1;
>> +
> Yes this works. You're right. I wasn't trying to intentionally omit
> stuff in my recent 878t93zh60.fsf@evledraar.gmail.com, I'd just written
> this code some days ago and forgotten why I did what I was doing (and
> this is hard to test for), but it's all coming back to me now.
>
> The actual requirement for oid_array_for_each_unique() working properly
> is that you've got to feed it in hash order,

To work properly, duplicate entries must be consecutive. Since duplicate 
entries have the same type, our sort satisfies this condition.

> but my new sort_ambiguous()
> still does that (barring any SHA-1 collisions, at which point we have
> bigger problems), so two passes aren't needed. So yes, this approach
> works and is one-pass.
>
> But that's just an implementation detail of the current sort method,
> when I wrote this I was initially playing with other sort orders,
> e.g. sorting SHAs regardless of type by the mtime of the file I found
> them in. With this approach I'd start printing duplicates if I changed
> the internals of sort_ambiguous() like that.

That makes sense.

> But I think it's extremely implausible that we'll start sorting things
> like that, so I'll just take this method of doing it and add some
> comment saying we must hashcmp() the entries in our own sort function
> for the de-duplication to work, I don't see us ever changing that.

Sounds good.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v2 06/11] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-01 13:44       ` Derrick Stolee
@ 2018-05-01 14:10         ` Ævar Arnfjörð Bjarmason
  2018-05-01 14:15           ` Derrick Stolee
  0 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 14:10 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Derrick Stolee, git


On Tue, May 01 2018, Derrick Stolee wrote:

> On 5/1/2018 9:39 AM, Ævar Arnfjörð Bjarmason wrote:
>> On Tue, May 01 2018, Derrick Stolee wrote:
>>
>>> From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>>>
>>> Here is what I mean by sorting during for_each_abbrev(). This seems to work for
>>> me, so I don't know what the issue is with this one-pass approach.
>>> [...]
>>> +static int sort_ambiguous(const void *a, const void *b)
>>> +{
>>> +	int a_type = oid_object_info(a, NULL);
>>> +	int b_type = oid_object_info(b, NULL);
>>> +	int a_type_sort;
>>> +	int b_type_sort;
>>> +
>>> +	/*
>>> +	 * Sorts by hash within the same object type, just as
>>> +	 * oid_array_for_each_unique() would do.
>>> +	 */
>>> +	if (a_type == b_type)
>>> +		return oidcmp(a, b);
>>> +
>>> +	/*
>>> +	 * Between object types show tags, then commits, and finally
>>> +	 * trees and blobs.
>>> +	 *
>>> +	 * The object_type enum is commit, tree, blob, tag, but we
>>> +	 * want tag, commit, tree blob. Cleverly (perhaps too
>>> +	 * cleverly) do that with modulus, since the enum assigns 1 to
>>> +	 * commit, so tag becomes 0.
>>> +	 */
>>> +	a_type_sort = a_type % 4;
>>> +	b_type_sort = b_type % 4;
>>> +	return a_type_sort > b_type_sort ? 1 : -1;
>>> +}
>>> +
>>>   static int get_short_oid(const char *name, int len, struct object_id *oid,
>>>   			  unsigned flags)
>>>   {
>>> @@ -451,6 +479,9 @@ int for_each_abbrev(const char *prefix, each_abbrev_fn fn, void *cb_data)
>>>   	find_short_object_filename(&ds);
>>>   	find_short_packed_object(&ds);
>>>
>>> +	QSORT(collect.oid, collect.nr, sort_ambiguous);
>>> +	collect.sorted = 1;
>>> +
>> Yes this works. You're right. I wasn't trying to intentionally omit
>> stuff in my recent 878t93zh60.fsf@evledraar.gmail.com, I'd just written
>> this code some days ago and forgotten why I did what I was doing (and
>> this is hard to test for), but it's all coming back to me now.
>>
>> The actual requirement for oid_array_for_each_unique() working properly
>> is that you've got to feed it in hash order,
>
> To work properly, duplicate entries must be consecutive. Since
> duplicate entries have the same type, our sort satisfies this
> condition.
>
>> but my new sort_ambiguous()
>> still does that (barring any SHA-1 collisions, at which point we have
>> bigger problems), so two passes aren't needed. So yes, this approach
>> works and is one-pass.
>>
>> But that's just an implementation detail of the current sort method,
>> when I wrote this I was initially playing with other sort orders,
>> e.g. sorting SHAs regardless of type by the mtime of the file I found
>> them in. With this approach I'd start printing duplicates if I changed
>> the internals of sort_ambiguous() like that.
>
> That makes sense.
>
>> But I think it's extremely implausible that we'll start sorting things
>> like that, so I'll just take this method of doing it and add some
>> comment saying we must hashcmp() the entries in our own sort function
>> for the de-duplication to work, I don't see us ever changing that.
>
> Sounds good.

Actually I'm having second thoughts about that and thinking I might keep
my original approach (with a better explanation).

A few more lines of code seems worthwhile in order to not break the
assumptions a documented API is making, no matter how briefly, so I set
about documenting this case and supporting it, since
e.g. oid_array_lookup() will completely fail with the hack of setting
the .sorted member, and came up with this:

diff --git a/Documentation/technical/api-oid-array.txt b/Documentation/technical/api-oid-array.txt
index b0c11f868d..ff87260220 100644
--- a/Documentation/technical/api-oid-array.txt
+++ b/Documentation/technical/api-oid-array.txt
@@ -16,6 +16,20 @@ Data Structures
 	the actual data. The `nr` member contains the number of items in
 	the set.  The `alloc` and `sorted` members are used internally,
 	and should not be needed by API callers.
++
+Both the `oid_array_lookup` and `oid_array_for_each_unique` functions
+rely on the array being sorted. For the former it's an absolute
+requirenment that the internal `oid_array_sort` function has been
+called on it, bu for the latter it's enough that the elements are
+ordered in such a way as to guarantee that identical object IDs are
+adjacent in the array.
++
+This is useful e.g. to print output where commits, tags etc. are
+grouped together (barring a hash collision they won't have the same
+object ID), in such cases the `custom_sorted` member can be set to `1`
+before calling `oid_array_for_each_unique`, and it'll skip its own
+sorting. Once it's been set calling e.g. `oid_array_lookup` without it
+being cleared again will cause an internal panic, so use it carefully.

 Functions
 ---------
diff --git a/sha1-array.c b/sha1-array.c
index 466a926aa3..cbae07ff78 100644
--- a/sha1-array.c
+++ b/sha1-array.c
@@ -18,6 +18,7 @@ static void oid_array_sort(struct oid_array *array)
 {
 	QSORT(array->oid, array->nr, void_hashcmp);
 	array->sorted = 1;
+	array->custom_sorted = 0;
 }

 static const unsigned char *sha1_access(size_t index, void *table)
@@ -28,6 +29,13 @@ static const unsigned char *sha1_access(size_t index, void *table)

 int oid_array_lookup(struct oid_array *array, const struct object_id *oid)
 {
+	if (array->custom_sorted)
+		/*
+		 * We could also just clear custom_sorted here, but if
+		 * the caller is custom sorting and then calling this
+		 * that's likely something they'd like to know about.
+		 */
+		BUG("PANIC: Cannot lookup OIDs in arrays with a custom sort!");
 	if (!array->sorted)
 		oid_array_sort(array);
 	return sha1_pos(oid->hash, array->oid, array->nr, sha1_access);
@@ -39,6 +47,7 @@ void oid_array_clear(struct oid_array *array)
 	array->nr = 0;
 	array->alloc = 0;
 	array->sorted = 0;
+	array->custom_sorted = 0;
 }

 int oid_array_for_each_unique(struct oid_array *array,
@@ -47,7 +56,7 @@ int oid_array_for_each_unique(struct oid_array *array,
 {
 	int i;

-	if (!array->sorted)
+	if (!array->sorted && !array->custom_sorted)
 		oid_array_sort(array);

 	for (i = 0; i < array->nr; i++) {
diff --git a/sha1-array.h b/sha1-array.h
index 1e1d24b009..bfa77ba1e4 100644
--- a/sha1-array.h
+++ b/sha1-array.h
@@ -6,6 +6,7 @@ struct oid_array {
 	int nr;
 	int alloc;
 	int sorted;
+	int custom_sorted;
 };

 #define OID_ARRAY_INIT { NULL, 0, 0, 0 }
diff --git a/sha1-name.c b/sha1-name.c
index b81e07adbb..d190800db0 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -490,9 +490,11 @@ int for_each_abbrev(const char *prefix, each_abbrev_fn fn, void *cb_data)
 	find_short_packed_object(&ds);

 	QSORT(collect.oid, collect.nr, sort_ambiguous);
-	collect.sorted = 1;

+	collect.custom_sorted = 1;
 	ret = oid_array_for_each_unique(&collect, fn, cb_data);
+	collect.custom_sorted = 0;
+
 	oid_array_clear(&collect);
 	return ret;
 }

So maybe I should just stop worrying and YOLO it, it just seems wrong to
leave such a fragile setup in place where we set .sorted=1 and some
future refactoring reasonably tries to call oid_array_lookup() on it and
silently fails.
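
To illustrate the failure mode I'm worried about, here's a rough sketch
with a toy array of integer IDs. None of these names exist in git, and
where the patch above calls BUG() the sketch returns an error code so that
it can be exercised without aborting. A binary search over an array that
claims to be sorted but is really grouped in some other order can miss an
element that is present, whereas a separate flag lets the lookup refuse
outright:

```c
#include <stddef.h>

/* Hypothetical toy array; IDs are plain integers rather than hashes. */
struct toy_array {
	const unsigned *id;
	size_t nr;
	int sorted;        /* elements are in ascending ID order */
	int custom_sorted; /* elements are grouped some other way */
};

/* Plain binary search; only meaningful on ascending input. */
static int toy_bsearch(const unsigned *id, size_t nr, unsigned want)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (id[mid] == want)
			return (int)mid;
		if (id[mid] < want)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1; /* not found */
}

#define TOY_ERR_CUSTOM_SORTED (-2)

/* Refuse to search an array that is flagged as custom-sorted. */
static int toy_lookup(const struct toy_array *a, unsigned want)
{
	if (a->custom_sorted)
		return TOY_ERR_CUSTOM_SORTED;
	return toy_bsearch(a->id, a->nr, want);
}
```

With the IDs stored as { 50, 10, 30, 20 } and only .sorted set, looking up
50 reports "not found" even though it is the first element. With
.custom_sorted set instead, the same call reports the misuse.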

What do you think?

^ permalink raw reply related	[flat|nested] 99+ messages in thread

* Re: [PATCH v2 06/11] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-01 14:10         ` Ævar Arnfjörð Bjarmason
@ 2018-05-01 14:15           ` Derrick Stolee
  0 siblings, 0 replies; 99+ messages in thread
From: Derrick Stolee @ 2018-05-01 14:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Derrick Stolee, git

On 5/1/2018 10:10 AM, Ævar Arnfjörð Bjarmason wrote:
> Actually I'm having second thoughts about that and thinking I might keep
> my original approach (with a better explanation).
>
> A few more lines of code seems worthwhile in order to not break the
> assumptions a documented API is making, no matter how briefly, so I set
> about documenting this case and supporting it, since
> e.g. oid_array_lookup() will completely fail with the hack of setting
> the .sorted member, and came up with this:
>
> diff --git a/Documentation/technical/api-oid-array.txt b/Documentation/technical/api-oid-array.txt
> index b0c11f868d..ff87260220 100644
> --- a/Documentation/technical/api-oid-array.txt
> +++ b/Documentation/technical/api-oid-array.txt
> @@ -16,6 +16,20 @@ Data Structures
>   	the actual data. The `nr` member contains the number of items in
>   	the set.  The `alloc` and `sorted` members are used internally,
>   	and should not be needed by API callers.
> ++
> +Both the `oid_array_lookup` and `oid_array_for_each_unique` functions
> +rely on the array being sorted. For the former it's an absolute
> +requirenment that the internal `oid_array_sort` function has been
> +called on it, bu for the latter it's enough that the elements are
> +ordered in such a way as to guarantee that identical object IDs are
> +adjacent in the array.

s/bu/but/

> ++
> +This is useful e.g. to print output where commits, tags etc. are
> +grouped together (barring a hash collision they won't have the same
> +object ID), in such cases the `custom_sorted` member can be set to `1`
> +before calling `oid_array_for_each_unique`, and it'll skip its own
> +sorting. Once it's been set calling e.g. `oid_array_lookup` without it
> +being cleared again will cause an internal panic, so use it carefully.
>
>   Functions
>   ---------
> diff --git a/sha1-array.c b/sha1-array.c
> index 466a926aa3..cbae07ff78 100644
> --- a/sha1-array.c
> +++ b/sha1-array.c
> @@ -18,6 +18,7 @@ static void oid_array_sort(struct oid_array *array)
>   {
>   	QSORT(array->oid, array->nr, void_hashcmp);
>   	array->sorted = 1;
> +	array->custom_sorted = 0;
>   }
>
>   static const unsigned char *sha1_access(size_t index, void *table)
> @@ -28,6 +29,13 @@ static const unsigned char *sha1_access(size_t index, void *table)
>
>   int oid_array_lookup(struct oid_array *array, const struct object_id *oid)
>   {
> +	if (array->custom_sorted)
> +		/*
> +		 * We could also just clear custom_sorted here, but if
> +		 * the caller is custom sorting and then calling this
> +		 * that's likely something they'd like to know about.
> +		 */
> +		BUG("PANIC: Cannot lookup OIDs in arrays with a custom sort!");

Probably don't need the "PANIC: " here.

>   	if (!array->sorted)
>   		oid_array_sort(array);
>   	return sha1_pos(oid->hash, array->oid, array->nr, sha1_access);
> @@ -39,6 +47,7 @@ void oid_array_clear(struct oid_array *array)
>   	array->nr = 0;
>   	array->alloc = 0;
>   	array->sorted = 0;
> +	array->custom_sorted = 0;
>   }
>
>   int oid_array_for_each_unique(struct oid_array *array,
> @@ -47,7 +56,7 @@ int oid_array_for_each_unique(struct oid_array *array,
>   {
>   	int i;
>
> -	if (!array->sorted)
> +	if (!array->sorted && !array->custom_sorted)
>   		oid_array_sort(array);
>
>   	for (i = 0; i < array->nr; i++) {
> diff --git a/sha1-array.h b/sha1-array.h
> index 1e1d24b009..bfa77ba1e4 100644
> --- a/sha1-array.h
> +++ b/sha1-array.h
> @@ -6,6 +6,7 @@ struct oid_array {
>   	int nr;
>   	int alloc;
>   	int sorted;
> +	int custom_sorted;
>   };
>
>   #define OID_ARRAY_INIT { NULL, 0, 0, 0 }
> diff --git a/sha1-name.c b/sha1-name.c
> index b81e07adbb..d190800db0 100644
> --- a/sha1-name.c
> +++ b/sha1-name.c
> @@ -490,9 +490,11 @@ int for_each_abbrev(const char *prefix, each_abbrev_fn fn, void *cb_data)
>   	find_short_packed_object(&ds);
>
>   	QSORT(collect.oid, collect.nr, sort_ambiguous);
> -	collect.sorted = 1;
>
> +	collect.custom_sorted = 1;
>   	ret = oid_array_for_each_unique(&collect, fn, cb_data);
> +	collect.custom_sorted = 0;
> +
>   	oid_array_clear(&collect);
>   	return ret;
>   }
>
> So maybe I should just stop worrying and YOLO it, it just seems wrong to
> leave such a fragile setup in place where we set .sorted=1 and some
> future refactoring reasonably tries to call oid_array_lookup() on it and
> silently fails.
>
> What do you think?

I think this extra custom_sorted check is worth it to keep the API stable
against future changes.

Thanks,
-Stolee


* [PATCH v3 00/12] get_short_oid UI improvements
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
  2018-05-01 13:03   ` [PATCH v2 06/11] get_short_oid: sort ambiguous objects by type, then SHA-1 Derrick Stolee
@ 2018-05-01 18:40   ` Ævar Arnfjörð Bjarmason
  2018-05-02 12:42     ` Derrick Stolee
  2018-05-01 18:40   ` [PATCH v3 01/12] sha1-name.c: remove stray newline Ævar Arnfjörð Bjarmason
                     ` (11 subsequent siblings)
  13 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 18:40 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Comments inline:

Ævar Arnfjörð Bjarmason (12):
  sha1-name.c: remove stray newline

No changes.

  sha1-array.h: align function arguments

Mention the correct commit to blame for the misalignment in the commit
message, and also fix it in the *.c file.

  git-p4: change "commitish" typo to "committish"

No changes.

  cache.h: add comment explaining the order in object_type

Trivial commit message rewording.

  sha1-name.c: move around the collect_ambiguous() function

No changes.

  get_short_oid: sort ambiguous objects by type, then SHA-1

The biggest change in v3 is no change at all to the code, but a
lengthy explanation of why I didn't go for Derrick's simpler
implementation. Maybe I'm wrong about that, but I felt uneasy
offloading undocumented (or if I documented it, it would only be for
this one edge-case) magic on the oid_array API. Instead I'm just
making this patch a bit more complex.

  get_short_oid: learn to disambiguate by ^{tag}
  get_short_oid: learn to disambiguate by ^{blob}
  get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  get_short_oid / peel_onion: ^{commit} should be commit, not committish
  config doc: document core.disambiguate
  get_short_oid: document & warn if we ignore the type selector

No changes except one trivial commit message formatting fix.

 Documentation/config.txt                  | 17 +++++
 Documentation/technical/api-oid-array.txt | 17 +++--
 cache.h                                   | 13 +++-
 git-p4.py                                 |  6 +-
 sha1-array.c                              | 21 +++++-
 sha1-array.h                              |  7 +-
 sha1-name.c                               | 80 +++++++++++++++++++----
 t/t1512-rev-parse-disambiguation.sh       | 58 +++++++++++++---
 8 files changed, 184 insertions(+), 35 deletions(-)

-- 
2.17.0.290.gded63e768a



* [PATCH v3 01/12] sha1-name.c: remove stray newline
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
  2018-05-01 13:03   ` [PATCH v2 06/11] get_short_oid: sort ambiguous objects by type, then SHA-1 Derrick Stolee
  2018-05-01 18:40   ` [PATCH v3 00/12] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
@ 2018-05-01 18:40   ` Ævar Arnfjörð Bjarmason
  2018-05-01 18:40   ` [PATCH v3 02/12] sha1-array.h: align function arguments Ævar Arnfjörð Bjarmason
                     ` (10 subsequent siblings)
  13 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 18:40 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

This stray newline was accidentally introduced in
d2b7d9c7ed ("sha1_name: convert disambiguate_hint_fn to take
object_id", 2017-03-26).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/sha1-name.c b/sha1-name.c
index 5b93bf8da3..cd3b133aae 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -346,7 +346,6 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	struct strbuf desc = STRBUF_INIT;
 	int type;
 
-
 	if (ds->fn && !ds->fn(oid, ds->cb_data))
 		return 0;
 
-- 
2.17.0.290.gded63e768a



* [PATCH v3 02/12] sha1-array.h: align function arguments
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
                     ` (2 preceding siblings ...)
  2018-05-01 18:40   ` [PATCH v3 01/12] sha1-name.c: remove stray newline Ævar Arnfjörð Bjarmason
@ 2018-05-01 18:40   ` Ævar Arnfjörð Bjarmason
  2018-05-01 18:40   ` [PATCH v3 03/12] git-p4: change "commitish" typo to "committish" Ævar Arnfjörð Bjarmason
                     ` (9 subsequent siblings)
  13 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 18:40 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

The arguments weren't lined up with the opening parenthesis. Fixes up
code added in aae0caf19e ("sha1-array.h: align function arguments",
2018-04-30).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-array.c | 4 ++--
 sha1-array.h | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/sha1-array.c b/sha1-array.c
index 838b3bf847..466a926aa3 100644
--- a/sha1-array.c
+++ b/sha1-array.c
@@ -42,8 +42,8 @@ void oid_array_clear(struct oid_array *array)
 }
 
 int oid_array_for_each_unique(struct oid_array *array,
-				for_each_oid_fn fn,
-				void *data)
+			      for_each_oid_fn fn,
+			      void *data)
 {
 	int i;
 
diff --git a/sha1-array.h b/sha1-array.h
index 04b0756334..1e1d24b009 100644
--- a/sha1-array.h
+++ b/sha1-array.h
@@ -17,7 +17,7 @@ void oid_array_clear(struct oid_array *array);
 typedef int (*for_each_oid_fn)(const struct object_id *oid,
 			       void *data);
 int oid_array_for_each_unique(struct oid_array *array,
-			       for_each_oid_fn fn,
-			       void *data);
+			      for_each_oid_fn fn,
+			      void *data);
 
 #endif /* SHA1_ARRAY_H */
-- 
2.17.0.290.gded63e768a



* [PATCH v3 03/12] git-p4: change "commitish" typo to "committish"
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
                     ` (3 preceding siblings ...)
  2018-05-01 18:40   ` [PATCH v3 02/12] sha1-array.h: align function arguments Ævar Arnfjörð Bjarmason
@ 2018-05-01 18:40   ` Ævar Arnfjörð Bjarmason
  2018-05-01 18:40   ` [PATCH v3 04/12] cache.h: add comment explaining the order in object_type Ævar Arnfjörð Bjarmason
                     ` (8 subsequent siblings)
  13 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 18:40 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

This was the only occurrence of "commitish" in the tree, but as the
log will reveal we've had others in the past. Fixes up code added in
00ad6e3182 ("git-p4: work with a detached head", 2015-11-21).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 git-p4.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 7bb9cadc69..1afa87cd9d 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2099,11 +2099,11 @@ class P4Submit(Command, P4UserMap):
 
         commits = []
         if self.master:
-            commitish = self.master
+            committish = self.master
         else:
-            commitish = 'HEAD'
+            committish = 'HEAD'
 
-        for line in read_pipe_lines(["git", "rev-list", "--no-merges", "%s..%s" % (self.origin, commitish)]):
+        for line in read_pipe_lines(["git", "rev-list", "--no-merges", "%s..%s" % (self.origin, committish)]):
             commits.append(line.strip())
         commits.reverse()
 
-- 
2.17.0.290.gded63e768a



* [PATCH v3 04/12] cache.h: add comment explaining the order in object_type
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
                     ` (4 preceding siblings ...)
  2018-05-01 18:40   ` [PATCH v3 03/12] git-p4: change "commitish" typo to "committish" Ævar Arnfjörð Bjarmason
@ 2018-05-01 18:40   ` Ævar Arnfjörð Bjarmason
  2018-05-03  5:05     ` Junio C Hamano
  2018-05-08 15:35     ` Duy Nguyen
  2018-05-01 18:40   ` [PATCH v3 05/12] sha1-name.c: move around the collect_ambiguous() function Ævar Arnfjörð Bjarmason
                     ` (7 subsequent siblings)
  13 siblings, 2 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 18:40 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

The order in the enum might seem arbitrary, and isn't explained by
72518e9c26 ("more lightweight revalidation while reusing deflated
stream in packing", 2006-09-03) which added it.

Derrick Stolee suggested that it's ordered topologically in
5f8b1ec1-258d-1acc-133e-a7c248b4083e@gmail.com. Makes sense to me, add
that as a comment.

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 cache.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/cache.h b/cache.h
index 77b7acebb6..354903c3ea 100644
--- a/cache.h
+++ b/cache.h
@@ -376,6 +376,14 @@ extern void free_name_hash(struct index_state *istate);
 enum object_type {
 	OBJ_BAD = -1,
 	OBJ_NONE = 0,
+	/*
+	 * Why have our "real" object types in this order? They're
+	 * ordered topologically:
+	 *
+	 * tag(4)    -> commit(1), tree(2), blob(3)
+	 * commit(1) -> tree(2)
+	 * tree(2)   -> blob(3)
+	 */
 	OBJ_COMMIT = 1,
 	OBJ_TREE = 2,
 	OBJ_BLOB = 3,
-- 
2.17.0.290.gded63e768a



* [PATCH v3 05/12] sha1-name.c: move around the collect_ambiguous() function
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
                     ` (5 preceding siblings ...)
  2018-05-01 18:40   ` [PATCH v3 04/12] cache.h: add comment explaining the order in object_type Ævar Arnfjörð Bjarmason
@ 2018-05-01 18:40   ` Ævar Arnfjörð Bjarmason
  2018-05-01 18:40   ` [PATCH v3 06/12] get_short_oid: sort ambiguous objects by type, then SHA-1 Ævar Arnfjörð Bjarmason
                     ` (6 subsequent siblings)
  13 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 18:40 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

A subsequent change will make use of this static function in the
get_short_oid() function, which is defined above where the
collect_ambiguous() function is now. Without this we'd then have a
compilation error due to a missing forward declaration.
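
As a rough illustration of the ordering constraint (a minimal sketch
with hypothetical names, not code from sha1-name.c): a static function
has to be visible before its first use, so the definition either moves
up, as done here, or needs a forward declaration.

```c
#include <stddef.h>

/* Hypothetical stand-in for an oid_array: it only counts appended items. */
struct counter {
	size_t nr;
};

/*
 * Defined before its first use, so no forward declaration is needed.
 * This mirrors moving collect_ambiguous() above get_short_oid().
 */
static int collect(const int *item, void *data)
{
	struct counter *c = data;

	(void)item;
	c->nr++;
	return 0;
}

/* Calls collect(), which is already declared at this point in the file. */
static size_t count_items(const int *items, size_t nr)
{
	struct counter c = { 0 };
	size_t i;

	for (i = 0; i < nr; i++)
		collect(&items[i], &c);
	return c.nr;
}
```

Had collect() been defined below count_items() without a prior
declaration, the call would be an implicit declaration, which modern
compilers reject.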

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index cd3b133aae..9d7bbd3e96 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -372,6 +372,12 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	return 0;
 }
 
+static int collect_ambiguous(const struct object_id *oid, void *data)
+{
+	oid_array_append(data, oid);
+	return 0;
+}
+
 static int get_short_oid(const char *name, int len, struct object_id *oid,
 			  unsigned flags)
 {
@@ -421,12 +427,6 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 	return status;
 }
 
-static int collect_ambiguous(const struct object_id *oid, void *data)
-{
-	oid_array_append(data, oid);
-	return 0;
-}
-
 int for_each_abbrev(const char *prefix, each_abbrev_fn fn, void *cb_data)
 {
 	struct oid_array collect = OID_ARRAY_INIT;
-- 
2.17.0.290.gded63e768a



* [PATCH v3 06/12] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
                     ` (6 preceding siblings ...)
  2018-05-01 18:40   ` [PATCH v3 05/12] sha1-name.c: move around the collect_ambiguous() function Ævar Arnfjörð Bjarmason
@ 2018-05-01 18:40   ` Ævar Arnfjörð Bjarmason
  2018-05-03  5:13     ` Junio C Hamano
  2018-05-08 14:44     ` Jeff King
  2018-05-01 18:40   ` [PATCH v3 07/12] get_short_oid: learn to disambiguate by ^{tag} Ævar Arnfjörð Bjarmason
                     ` (5 subsequent siblings)
  13 siblings, 2 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 18:40 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Change the output emitted when an ambiguous object is encountered so
that we show tags first, then commits, followed by trees, and finally
blobs. Within each type we show objects in hashcmp() order. Before
this change the objects were only ordered by hashcmp().

The reason for doing this is that the output looks better as a result,
e.g. at the v2.17.0 tag, "git show e8f2" would display the following
before this change:

    hint: The candidates are:
    hint:   e8f2093055 tree
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f25a3a50 tree
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f2867228 blob
    hint:   e8f28d537c tree
    hint:   e8f2a35526 blob
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2cf6ec0 tree

Now we'll instead show:

    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob

Since we show the commit data in the output that's nicely aligned once
we sort by object type. The decision to show tags before commits is
pretty arbitrary. I don't want to order by object_type since there
tags come last after blobs, which doesn't make sense if we want to
show the most important things first.

I could display them after commits, but it's much less likely that
we'll display a tag, so if there is one it makes sense to show it
prominently at the top.
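
The ordering described above can be sketched standalone as follows.
This is only an illustration under stated assumptions: the struct, the
T_* names and strcmp() on abbreviated hex strings are stand-ins, while
the real patch looks types up with oid_object_info() and compares with
oidcmp().

```c
#include <stdlib.h>
#include <string.h>

/* Same numbering as the "real" types in git's enum object_type. */
enum obj_type { T_COMMIT = 1, T_TREE = 2, T_BLOB = 3, T_TAG = 4 };

/* Hypothetical candidate: a type plus a 10-character abbreviated ID. */
struct candidate {
	enum obj_type type;
	char hex[11];
};

static int cmp_candidate(const void *va, const void *vb)
{
	const struct candidate *a = va, *b = vb;
	int a_rank, b_rank;

	/* Within one type, fall back to plain ID order. */
	if (a->type == b->type)
		return strcmp(a->hex, b->hex);

	/* tag(4) % 4 == 0, so tags rank before commit(1), tree(2), blob(3). */
	a_rank = a->type % 4;
	b_rank = b->type % 4;
	return a_rank > b_rank ? 1 : -1;
}

/* Sort candidates into display order: tags, commits, trees, blobs. */
static void sort_candidates(struct candidate *c, size_t nr)
{
	qsort(c, nr, sizeof(*c), cmp_candidate);
}
```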

A note on the implementation: Derrick rightly pointed out[1] that
we're bending over backwards here in get_short_oid() to first
de-duplicate the list, and then emit it, but could simply do it in one
step.

The reason for that is that oid_array_for_each_unique() doesn't
actually require that the array be sorted by oid_array_sort(), it just
needs to be sorted in some order that guarantees that all objects with
the same ID are adjacent to one another, which (barring a hash
collision, which'll be someone else's problem) the sort_ambiguous()
function does.
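
A minimal sketch of why adjacency is enough, using plain ints as a
stand-in for object IDs (the names here are hypothetical and this is
not the oid-array code itself):

```c
#include <stddef.h>

typedef int (*each_fn)(int value, void *data);

/*
 * Call fn once per run of equal adjacent values. De-duplication is
 * correct as long as equal values sit next to each other; the overall
 * order of the runs does not matter.
 */
static int for_each_unique(const int *v, size_t nr, each_fn fn, void *data)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		int ret;

		if (i > 0 && v[i] == v[i - 1])
			continue;
		ret = fn(v[i], data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: count how many times we were called. */
static int count_cb(int value, void *data)
{
	(void)value;
	++*(size_t *)data;
	return 0;
}
```

An input such as { 7, 7, 3, 3, 3, 9 } is not in ascending order, yet
each value is still visited exactly once.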

I agree that would be simpler for this code, and had forgotten why I
initially wrote it like this[2]. But on further reflection I think
it's better to do more work here just so we're not underhandedly using
the oid-array API where we lie about the list being sorted. That would
break any subsequent use of oid_array_lookup() in subtle ways.

I could get around that by hacking the API itself to support this
use-case and documenting it, which I did as a WIP patch in [3], but I
think it's too much code smell just for this one call site. It's
simpler for the API to just introduce an oid_array_for_each() function
to eagerly spew out the list without sorting or de-duplication, and
then do the de-duplication and sorting in two passes.
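
The two-pass idea can be sketched roughly like this. It is a
simplified illustration only: struct obj, its integer id and the rank
field are assumptions standing in for object IDs and object types, and
the de-duplication is done in place rather than through a callback.

```c
#include <stdlib.h>

/* Hypothetical object: an ID (stand-in for the hash) and a type rank. */
struct obj {
	int id;
	int rank;	/* lower rank is shown first */
};

static int cmp_id(const void *a, const void *b)
{
	const struct obj *x = a, *y = b;

	return (x->id > y->id) - (x->id < y->id);
}

static int cmp_rank_then_id(const void *a, const void *b)
{
	const struct obj *x = a, *y = b;

	if (x->rank != y->rank)
		return (x->rank > y->rank) - (x->rank < y->rank);
	return cmp_id(a, b);
}

/*
 * Pass 1: sort by ID and drop adjacent duplicates in place.
 * Pass 2: re-sort the unique entries into display order.
 * Returns the number of unique entries left at the front of objs.
 */
static size_t dedup_then_sort(struct obj *objs, size_t nr)
{
	size_t i, out = 0;

	qsort(objs, nr, sizeof(*objs), cmp_id);
	for (i = 0; i < nr; i++)
		if (!out || objs[out - 1].id != objs[i].id)
			objs[out++] = objs[i];

	qsort(objs, out, sizeof(*objs), cmp_rank_then_id);
	return out;
}
```

After the second pass the array can be walked front to back without
any further sorting, which is all a plain "for each" iterator needs
to do.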

1. https://public-inbox.org/git/20180501130318.58251-1-dstolee@microsoft.com/
2. https://public-inbox.org/git/876047ze9v.fsf@evledraar.gmail.com/
3. https://public-inbox.org/git/874ljrzctc.fsf@evledraar.gmail.com/

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/technical/api-oid-array.txt | 17 +++++++----
 sha1-array.c                              | 17 +++++++++++
 sha1-array.h                              |  3 ++
 sha1-name.c                               | 37 ++++++++++++++++++++++-
 t/t1512-rev-parse-disambiguation.sh       | 21 +++++++++++++
 5 files changed, 88 insertions(+), 7 deletions(-)

diff --git a/Documentation/technical/api-oid-array.txt b/Documentation/technical/api-oid-array.txt
index b0c11f868d..94b529722c 100644
--- a/Documentation/technical/api-oid-array.txt
+++ b/Documentation/technical/api-oid-array.txt
@@ -35,13 +35,18 @@ Functions
 	Free all memory associated with the array and return it to the
 	initial, empty state.
 
+`oid_array_for_each`::
+	Iterate over each element of the list, executing the callback
+	function for each one. Does not sort the list, so any custom
+	hash order is retained. If the callback returns a non-zero
+	value, the iteration ends immediately and the callback's
+	return is propagated; otherwise, 0 is returned.
+
 `oid_array_for_each_unique`::
-	Efficiently iterate over each unique element of the list,
-	executing the callback function for each one. If the array is
-	not sorted, this function has the side effect of sorting it. If
-	the callback returns a non-zero value, the iteration ends
-	immediately and the callback's return is propagated; otherwise,
-	0 is returned.
+	Iterate over each unique element of the list in sort order,
+	but otherwise behaves like `oid_array_for_each`. If the array
+	is not sorted, this function has the side effect of sorting
+	it.
 
 Examples
 --------
diff --git a/sha1-array.c b/sha1-array.c
index 466a926aa3..265941fbf4 100644
--- a/sha1-array.c
+++ b/sha1-array.c
@@ -41,6 +41,23 @@ void oid_array_clear(struct oid_array *array)
 	array->sorted = 0;
 }
 
+
+int oid_array_for_each(struct oid_array *array,
+		       for_each_oid_fn fn,
+		       void *data)
+{
+	int i;
+
+	/* No oid_array_sort() here! See the api-oid-array.txt docs! */
+
+	for (i = 0; i < array->nr; i++) {
+		int ret = fn(array->oid + i, data);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
 int oid_array_for_each_unique(struct oid_array *array,
 			      for_each_oid_fn fn,
 			      void *data)
diff --git a/sha1-array.h b/sha1-array.h
index 1e1d24b009..232bf95017 100644
--- a/sha1-array.h
+++ b/sha1-array.h
@@ -16,6 +16,9 @@ void oid_array_clear(struct oid_array *array);
 
 typedef int (*for_each_oid_fn)(const struct object_id *oid,
 			       void *data);
+int oid_array_for_each(struct oid_array *array,
+		       for_each_oid_fn fn,
+		       void *data);
 int oid_array_for_each_unique(struct oid_array *array,
 			      for_each_oid_fn fn,
 			      void *data);
diff --git a/sha1-name.c b/sha1-name.c
index 9d7bbd3e96..46d8b1afa6 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -378,6 +378,34 @@ static int collect_ambiguous(const struct object_id *oid, void *data)
 	return 0;
 }
 
+static int sort_ambiguous(const void *a, const void *b)
+{
+	int a_type = oid_object_info(a, NULL);
+	int b_type = oid_object_info(b, NULL);
+	int a_type_sort;
+	int b_type_sort;
+
+	/*
+	 * Sorts by hash within the same object type, just as
+	 * oid_array_for_each_unique() would do.
+	 */
+	if (a_type == b_type)
+		return oidcmp(a, b);
+
+	/*
+	 * Between object types show tags, then commits, and finally
+	 * trees and blobs.
+	 *
+	 * The object_type enum is commit, tree, blob, tag, but we
+	 * want tag, commit, tree, blob. Cleverly (perhaps too
+	 * cleverly) do that with modulus, since the enum assigns 1 to
+	 * commit, so tag becomes 0.
+	 */
+	a_type_sort = a_type % 4;
+	b_type_sort = b_type % 4;
+	return a_type_sort > b_type_sort ? 1 : -1;
+}
+
 static int get_short_oid(const char *name, int len, struct object_id *oid,
 			  unsigned flags)
 {
@@ -409,6 +437,8 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 	status = finish_object_disambiguation(&ds, oid);
 
 	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
+		struct oid_array collect = OID_ARRAY_INIT;
+
 		error(_("short SHA1 %s is ambiguous"), ds.hex_pfx);
 
 		/*
@@ -421,7 +451,12 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 			ds.fn = NULL;
 
 		advise(_("The candidates are:"));
-		for_each_abbrev(ds.hex_pfx, show_ambiguous_object, &ds);
+		for_each_abbrev(ds.hex_pfx, collect_ambiguous, &collect);
+		QSORT(collect.oid, collect.nr, sort_ambiguous);
+
+		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
+			BUG("show_ambiguous_object shouldn't return non-zero");
+		oid_array_clear(&collect);
 	}
 
 	return status;
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 711704ba5a..2701462041 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -361,4 +361,25 @@ test_expect_success 'core.disambiguate does not override context' '
 		git -c core.disambiguate=committish rev-parse $sha1^{tree}
 '
 
+test_expect_success C_LOCALE_OUTPUT 'ambiguous commits are printed by type first, then hash order' '
+	test_must_fail git rev-parse 0000 2>stderr &&
+	grep ^hint: stderr >hints &&
+	grep 0000 hints >objects &&
+	cat >expected <<-\EOF &&
+	tag
+	commit
+	tree
+	blob
+	EOF
+	awk "{print \$3}" <objects >objects.types &&
+	uniq <objects.types >objects.types.uniq &&
+	test_cmp expected objects.types.uniq &&
+	for type in tag commit tree blob
+	do
+		grep $type objects >$type.objects &&
+		sort $type.objects >$type.objects.sorted &&
+		test_cmp $type.objects.sorted $type.objects
+	done
+'
+
 test_done
-- 
2.17.0.290.gded63e768a



* [PATCH v3 07/12] get_short_oid: learn to disambiguate by ^{tag}
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
                     ` (7 preceding siblings ...)
  2018-05-01 18:40   ` [PATCH v3 06/12] get_short_oid: sort ambiguous objects by type, then SHA-1 Ævar Arnfjörð Bjarmason
@ 2018-05-01 18:40   ` Ævar Arnfjörð Bjarmason
  2018-05-01 18:40   ` [PATCH v3 08/12] get_short_oid: learn to disambiguate by ^{blob} Ævar Arnfjörð Bjarmason
                     ` (4 subsequent siblings)
  13 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 18:40 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Add support for ^{tag} to the disambiguation logic. Before this ^{tag}
would simply be ignored:

    $ git rev-parse e8f2^{tag}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob
    e8f2^{tag}

Now the logic added in ed1ca6025f ("peel_onion: disambiguate to favor
tree-ish when we know we want a tree-ish", 2013-03-31) has been
extended to support it.

    $ git rev-parse e8f2^{tag}
    e8f2650052f3ff646023725e388ea1112b020e79
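
Conceptually the type hint acts as a filter over the candidates that
share the prefix. The following is a simplified sketch of that idea,
not the actual disambiguate_state machinery; struct candidate,
resolve() and the T_* names are hypothetical.

```c
#include <stddef.h>

enum obj_type { T_COMMIT = 1, T_TREE = 2, T_BLOB = 3, T_TAG = 4 };

/* Hypothetical candidate sharing the ambiguous prefix. */
struct candidate {
	const char *hex;
	enum obj_type type;
};

typedef int (*hint_fn)(const struct candidate *c);

/* Accept only tag objects, in the spirit of disambiguate_tag_only(). */
static int tag_only(const struct candidate *c)
{
	return c->type == T_TAG;
}

/*
 * Return the single candidate accepted by fn, or NULL if no candidate
 * or more than one candidate passes the filter. A NULL fn accepts all.
 */
static const struct candidate *resolve(const struct candidate *c,
				       size_t nr, hint_fn fn)
{
	const struct candidate *found = NULL;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (fn && !fn(&c[i]))
			continue;
		if (found)
			return NULL;	/* still ambiguous */
		found = &c[i];
	}
	return found;
}
```

With one tag and several commits sharing a prefix, the unfiltered
lookup stays ambiguous while the tag-only filter leaves exactly one
match.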

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 cache.h                             |  5 +++--
 sha1-name.c                         | 13 ++++++++++++-
 t/t1512-rev-parse-disambiguation.sh |  5 ++++-
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index 354903c3ea..a141995cc7 100644
--- a/cache.h
+++ b/cache.h
@@ -1330,8 +1330,9 @@ struct object_context {
 #define GET_OID_TREE             010
 #define GET_OID_TREEISH          020
 #define GET_OID_BLOB             040
-#define GET_OID_FOLLOW_SYMLINKS 0100
-#define GET_OID_RECORD_PATH     0200
+#define GET_OID_TAG             0100
+#define GET_OID_FOLLOW_SYMLINKS 0200
+#define GET_OID_RECORD_PATH     0400
 #define GET_OID_ONLY_TO_DIE    04000
 
 #define GET_OID_DISAMBIGUATORS \
diff --git a/sha1-name.c b/sha1-name.c
index 46d8b1afa6..68d5f65362 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -221,6 +221,12 @@ static int finish_object_disambiguation(struct disambiguate_state *ds,
 	return 0;
 }
 
+static int disambiguate_tag_only(const struct object_id *oid, void *cb_data_unused)
+{
+	int kind = oid_object_info(oid, NULL);
+	return kind == OBJ_TAG;
+}
+
 static int disambiguate_commit_only(const struct object_id *oid, void *cb_data_unused)
 {
 	int kind = oid_object_info(oid, NULL);
@@ -288,7 +294,8 @@ int set_disambiguate_hint_config(const char *var, const char *value)
 		{ "committish", disambiguate_committish_only },
 		{ "tree", disambiguate_tree_only },
 		{ "treeish", disambiguate_treeish_only },
-		{ "blob", disambiguate_blob_only }
+		{ "blob", disambiguate_blob_only },
+		{ "tag", disambiguate_tag_only }
 	};
 	int i;
 
@@ -429,6 +436,8 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 		ds.fn = disambiguate_treeish_only;
 	else if (flags & GET_OID_BLOB)
 		ds.fn = disambiguate_blob_only;
+	else if (flags & GET_OID_TAG)
+		ds.fn = disambiguate_tag_only;
 	else
 		ds.fn = default_disambiguate_hint;
 
@@ -958,6 +967,8 @@ static int peel_onion(const char *name, int len, struct object_id *oid,
 	lookup_flags &= ~GET_OID_DISAMBIGUATORS;
 	if (expected_type == OBJ_COMMIT)
 		lookup_flags |= GET_OID_COMMITTISH;
+	else if (expected_type == OBJ_TAG)
+		lookup_flags |= GET_OID_TAG;
 	else if (expected_type == OBJ_TREE)
 		lookup_flags |= GET_OID_TREEISH;
 
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 2701462041..74e7d9c178 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -334,7 +334,10 @@ test_expect_success C_LOCALE_OUTPUT 'ambiguity hints respect type' '
 	test_must_fail git rev-parse 000000000^{commit} 2>stderr &&
 	grep ^hint: stderr >hints &&
 	# 5 commits, 1 tag (which is a commitish), plus intro line
-	test_line_count = 7 hints
+	test_line_count = 7 hints &&
+	git rev-parse 000000000^{tag} >stdout &&
+	test_line_count = 1 stdout &&
+	grep -q ^0000000000f8f stdout
 '
 
 test_expect_success C_LOCALE_OUTPUT 'failed type-selector still shows hint' '
-- 
2.17.0.290.gded63e768a



* [PATCH v3 08/12] get_short_oid: learn to disambiguate by ^{blob}
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
                     ` (8 preceding siblings ...)
  2018-05-01 18:40   ` [PATCH v3 07/12] get_short_oid: learn to disambiguate by ^{tag} Ævar Arnfjörð Bjarmason
@ 2018-05-01 18:40   ` Ævar Arnfjörð Bjarmason
  2018-05-01 18:40   ` [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish Ævar Arnfjörð Bjarmason
                     ` (3 subsequent siblings)
  13 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 18:40 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

The disambiguation logic had all the pieces necessary to only print
out those blobs that were ambiguous, but they hadn't been connected.

The initial logic was added in daba53aeaf ("sha1_name.c: add support
for disambiguating other types", 2012-07-02), and when the flags were
propagated in 8a10fea49b ("get_sha1: propagate flags to child
functions", 2016-09-26) GET_OID_BLOB wasn't added to lookup_flags.
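
The missing piece amounts to one more branch when translating the
requested type into a lookup flag. A minimal sketch of that mapping
follows; the WANT_* names and their values are made up for this
example and are not the GET_OID_* definitions from cache.h.

```c
enum obj_type { T_NONE = 0, T_COMMIT = 1, T_TREE = 2, T_BLOB = 3, T_TAG = 4 };

/* Illustrative flag bits; names and values are not git's. */
#define WANT_COMMITTISH 01u
#define WANT_TREEISH    02u
#define WANT_BLOB       04u
#define WANT_TAG       010u
#define WANT_QUIET     020u
#define WANT_ANY_TYPE (WANT_COMMITTISH | WANT_TREEISH | WANT_BLOB | WANT_TAG)

/*
 * Clear any inherited type hint, then set the one matching the type
 * requested by a ^{type} suffix. Unrelated flags are preserved.
 */
static unsigned type_to_lookup_flags(unsigned flags, enum obj_type expected)
{
	flags &= ~WANT_ANY_TYPE;

	switch (expected) {
	case T_COMMIT:
		flags |= WANT_COMMITTISH;
		break;
	case T_TAG:
		flags |= WANT_TAG;
		break;
	case T_TREE:
		flags |= WANT_TREEISH;
		break;
	case T_BLOB:
		/* The kind of branch that was previously missing. */
		flags |= WANT_BLOB;
		break;
	default:
		break;
	}
	return flags;
}
```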

Before this change requests for blobs were simply ignored:

    $ git rev-parse e8f2^{blob}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob
    e8f2^{blob}
    [...]

But now we'll do the right thing and only print the blobs:

    $ git rev-parse e8f2^{blob}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob
    e8f2^{blob}
    [...]

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c                         | 2 ++
 t/t1512-rev-parse-disambiguation.sh | 6 +++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/sha1-name.c b/sha1-name.c
index 68d5f65362..023f9471a8 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -971,6 +971,8 @@ static int peel_onion(const char *name, int len, struct object_id *oid,
 		lookup_flags |= GET_OID_TAG;
 	else if (expected_type == OBJ_TREE)
 		lookup_flags |= GET_OID_TREEISH;
+	else if (expected_type == OBJ_BLOB)
+		lookup_flags |= GET_OID_BLOB;
 
 	if (get_oid_1(name, sp - name - 2, &outer, lookup_flags))
 		return -1;
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 74e7d9c178..9ce9cc3bc3 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -337,7 +337,11 @@ test_expect_success C_LOCALE_OUTPUT 'ambiguity hints respect type' '
 	test_line_count = 7 hints &&
 	git rev-parse 000000000^{tag} >stdout &&
 	test_line_count = 1 stdout &&
-	grep -q ^0000000000f8f stdout
+	grep -q ^0000000000f8f stdout &&
+	test_must_fail git rev-parse 000000000^{blob} 2>stderr &&
+	grep ^hint: stderr >hints &&
+	# 5 blobs plus intro line &&
+	test_line_count = 6 hints
 '
 
 test_expect_success C_LOCALE_OUTPUT 'failed type-selector still shows hint' '
-- 
2.17.0.290.gded63e768a



* [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
                     ` (9 preceding siblings ...)
  2018-05-01 18:40   ` [PATCH v3 08/12] get_short_oid: learn to disambiguate by ^{blob} Ævar Arnfjörð Bjarmason
@ 2018-05-01 18:40   ` Ævar Arnfjörð Bjarmason
  2018-05-03  5:28     ` Junio C Hamano
                       ` (7 more replies)
  2018-05-01 18:40   ` [PATCH v3 10/12] get_short_oid / peel_onion: ^{commit} should be commit, not committish Ævar Arnfjörð Bjarmason
                     ` (2 subsequent siblings)
  13 siblings, 8 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 18:40 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

After the recent series of patches ^{tag} and ^{blob} now work to get
just the tags and blobs, but ^{tree} will still list any
tree-ish (commits, tags and trees).

The previous behavior was added in ed1ca6025f ("peel_onion:
disambiguate to favor tree-ish when we know we want a tree-ish",
2013-03-31). I may have missed some special-case but this makes more
sense to me.

Now "$sha1:" can be used as before to mean treeish:

    $ git rev-parse e8f2:
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    [...]

But ^{tree} now shows just the trees, whereas previously it was
equivalent to the above:

    $ git rev-parse e8f2^{tree}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    [...]

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c                         |  2 +-
 t/t1512-rev-parse-disambiguation.sh | 18 ++++++++++++++----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index 023f9471a8..b61c0558d9 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -970,7 +970,7 @@ static int peel_onion(const char *name, int len, struct object_id *oid,
 	else if (expected_type == OBJ_TAG)
 		lookup_flags |= GET_OID_TAG;
 	else if (expected_type == OBJ_TREE)
-		lookup_flags |= GET_OID_TREEISH;
+		lookup_flags |= GET_OID_TREE;
 	else if (expected_type == OBJ_BLOB)
 		lookup_flags |= GET_OID_BLOB;
 
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 9ce9cc3bc3..81076449a2 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -159,9 +159,13 @@ test_expect_failure 'two semi-ambiguous commit-ish' '
 	git log 0000000000...
 '
 
-test_expect_failure 'three semi-ambiguous tree-ish' '
+test_expect_success 'three semi-ambiguous tree-ish' '
 	# Likewise for tree-ish.  HEAD, v1.0.0 and HEAD^{tree} share
 	# the prefix but peeling them to tree yields the same thing
+	test_must_fail git rev-parse --verify 0000000000: &&
+
+	# For ^{tree} we can disambiguate because HEAD and v1.0.0 will
+	# be excluded.
 	git rev-parse --verify 0000000000^{tree}
 '
 
@@ -267,8 +271,12 @@ test_expect_success 'ambiguous commit-ish' '
 # There are three objects with this prefix: a blob, a tree, and a tag. We know
 # the blob will not pass as a treeish, but the tree and tag should (and thus
 # cause an error).
-test_expect_success 'ambiguous tags peel to treeish' '
-	test_must_fail git rev-parse 0000000000f^{tree}
+test_expect_success 'ambiguous tags peel to treeish or tree' '
+	test_must_fail git rev-parse 0000000000f: &&
+	git rev-parse 0000000000f^{tree} >stdout &&
+	test_line_count = 1 stdout &&
+	grep -q ^0000000000fd8bcc56 stdout
+
 '
 
 test_expect_success 'rev-parse --disambiguate' '
@@ -365,7 +373,9 @@ test_expect_success 'core.disambiguate config can prefer types' '
 test_expect_success 'core.disambiguate does not override context' '
 	# treeish ambiguous between tag and tree
 	test_must_fail \
-		git -c core.disambiguate=committish rev-parse $sha1^{tree}
+		git -c core.disambiguate=committish rev-parse $sha1: &&
+	# tree not ambiguous between tag and tree
+	git -c core.disambiguate=committish rev-parse $sha1^{tree}
 '
 
 test_expect_success C_LOCALE_OUTPUT 'ambiguous commits are printed by type first, then hash order' '
-- 
2.17.0.290.gded63e768a


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [PATCH v3 10/12] get_short_oid / peel_onion: ^{commit} should be commit, not committish
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
                     ` (10 preceding siblings ...)
  2018-05-01 18:40   ` [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish Ævar Arnfjörð Bjarmason
@ 2018-05-01 18:40   ` Ævar Arnfjörð Bjarmason
  2018-05-01 18:40   ` [PATCH v3 11/12] config doc: document core.disambiguate Ævar Arnfjörð Bjarmason
  2018-05-01 18:40   ` [PATCH v3 12/12] get_short_oid: document & warn if we ignore the type selector Ævar Arnfjörð Bjarmason
  13 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 18:40 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Change the ^{commit} syntax to mean just commits instead of committish
for the purpose of disambiguation. Before this e8f2^{commit} would
show the v2.17.0 tag as a disambiguation candidate, but now it'll just
show ambiguous commits:

    $ git rev-parse e8f2^{commit}
    error: short SHA1 e8f2 is ambiguous
    hint: The candidates are:
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    [...]

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c                         | 2 +-
 t/t1512-rev-parse-disambiguation.sh | 7 ++++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index b61c0558d9..1d2a74a29c 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -966,7 +966,7 @@ static int peel_onion(const char *name, int len, struct object_id *oid,
 
 	lookup_flags &= ~GET_OID_DISAMBIGUATORS;
 	if (expected_type == OBJ_COMMIT)
-		lookup_flags |= GET_OID_COMMITTISH;
+		lookup_flags |= GET_OID_COMMIT;
 	else if (expected_type == OBJ_TAG)
 		lookup_flags |= GET_OID_TAG;
 	else if (expected_type == OBJ_TREE)
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 81076449a2..b17973a266 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -341,8 +341,8 @@ test_expect_success C_LOCALE_OUTPUT 'ambiguity hints' '
 test_expect_success C_LOCALE_OUTPUT 'ambiguity hints respect type' '
 	test_must_fail git rev-parse 000000000^{commit} 2>stderr &&
 	grep ^hint: stderr >hints &&
-	# 5 commits, 1 tag (which is a commitish), plus intro line
-	test_line_count = 7 hints &&
+	# 5 commits plus intro line
+	test_line_count = 6 hints &&
 	git rev-parse 000000000^{tag} >stdout &&
 	test_line_count = 1 stdout &&
 	grep -q ^0000000000f8f stdout &&
@@ -366,7 +366,8 @@ test_expect_success 'core.disambiguate config can prefer types' '
 	# ambiguous between tree and tag
 	sha1=0000000000f &&
 	test_must_fail git rev-parse $sha1 &&
-	git rev-parse $sha1^{commit} &&
+	# there is no commit so ^{commit} comes up empty
+	test_must_fail git rev-parse $sha1^{commit} &&
 	git -c core.disambiguate=committish rev-parse $sha1
 '
 
-- 
2.17.0.290.gded63e768a


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [PATCH v3 11/12] config doc: document core.disambiguate
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
                     ` (11 preceding siblings ...)
  2018-05-01 18:40   ` [PATCH v3 10/12] get_short_oid / peel_onion: ^{commit} should be commit, not committish Ævar Arnfjörð Bjarmason
@ 2018-05-01 18:40   ` Ævar Arnfjörð Bjarmason
  2018-05-08 14:41     ` Jeff King
  2018-05-01 18:40   ` [PATCH v3 12/12] get_short_oid: document & warn if we ignore the type selector Ævar Arnfjörð Bjarmason
  13 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 18:40 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

The core.disambiguate variable was added in
5b33cb1fd7 ("get_short_sha1: make default disambiguation
configurable", 2016-09-27) but never documented.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config.txt | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2659153cb3..14a3d57e77 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -910,6 +910,19 @@ core.abbrev::
 	abbreviated object names to stay unique for some time.
 	The minimum length is 4.
 
+core.disambiguate::
+	If Git is given a SHA-1 that's ambiguous it'll suggest what
+	objects you might mean. By default it'll print out all
+	potential objects with that prefix regardless of their
+	type. This setting, along with the `^{<type>}` peel syntax
+	(see linkgit:gitrevisions[7]), allows for narrowing that down.
++
+Is set to `none` by default to show all object types. Can also be
+`commit` (peel syntax: `$sha1^{commit}`), `committish` (commits and
+tags), `tree` (peel: `$sha1^{tree}`), `treeish` (everything except
+blobs, peel syntax: `$sha1:`), `blob` (peel: `$sha1^{blob}`) or `tag`
+(peel: `$sha1^{tag}`). The peel syntax will override any config value.
+
 add.ignoreErrors::
 add.ignore-errors (deprecated)::
 	Tells 'git add' to continue adding files when some files cannot be
-- 
2.17.0.290.gded63e768a


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [PATCH v3 12/12] get_short_oid: document & warn if we ignore the type selector
  2018-05-01 12:06 ` [PATCH v2 00/12] " Ævar Arnfjörð Bjarmason
                     ` (12 preceding siblings ...)
  2018-05-01 18:40   ` [PATCH v3 11/12] config doc: document core.disambiguate Ævar Arnfjörð Bjarmason
@ 2018-05-01 18:40   ` Ævar Arnfjörð Bjarmason
  13 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-01 18:40 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

The SHA1 prefix 06fa currently matches no blobs in git.git. When
disambiguating short SHA1s we've been quietly ignoring the user's type
selector as a fallback mechanism; this was intentionally added in
1ffa26c461 ("get_short_sha1: list ambiguous objects on error",
2016-09-26).

I think that behavior makes sense, since it's not very useful to just
show nothing because a preference has been expressed via
core.disambiguate, but it's bad that we're quietly doing this. The user
might think that we just didn't understand what e.g. 06fa^{blob} meant.

Now we'll instead print a warning if no objects of the requested type
were found:

    $ git rev-parse 06fa^{blob}
    error: short SHA1 06fa is ambiguous
    hint: The candidates are:
    [... no blobs listed ...]
    warning: Your hint (via core.disambiguate or peel syntax) was ignored, we fell
    back to showing all object types since no object of the requested type
    matched the provide short SHA1 06fa
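
A minimal C sketch of the fallback described above, not the code in
sha1-name.c: the `candidate` struct, the `select_candidates()` helper and
its parameters are hypothetical names used only for illustration. It
lists the candidates of the requested type and, if none match, lists
everything and reports that the hint was ignored.

```c
#include <stddef.h>

/* Object types, numbered as in git's enum object_type. */
enum obj_type { T_COMMIT = 1, T_TREE = 2, T_BLOB = 3, T_TAG = 4 };

/* Hypothetical candidate record: an abbreviated name and its type. */
struct candidate {
	const char *hex;
	enum obj_type type;
};

/*
 * Pick the candidates to list for an ambiguous prefix.
 *
 * `want` is the requested type, or 0 for "no preference". The indexes
 * of the selected candidates are written to `out`, which must have
 * room for `n` entries. Returns the number selected. *ignored_hint is
 * set to 1 when nothing matched `want` and all candidates were listed
 * instead, which is the case where a warning would be printed.
 */
static size_t select_candidates(const struct candidate *c, size_t n,
				int want, size_t *out, int *ignored_hint)
{
	size_t i, nr = 0;

	*ignored_hint = 0;
	for (i = 0; i < n; i++)
		if (!want || (int)c[i].type == want)
			out[nr++] = i;

	if (want && !nr) {
		/* Nothing of the requested type: fall back to everything. */
		for (i = 0; i < n; i++)
			out[nr++] = i;
		*ignored_hint = 1;
	}
	return nr;
}
```

With one commit and two trees as candidates, asking for T_TREE selects
the two trees and leaves the flag at 0, while asking for T_BLOB selects
all three and sets the flag to 1.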

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config.txt            |  4 ++++
 sha1-name.c                         | 11 ++++++++++-
 t/t1512-rev-parse-disambiguation.sh |  5 ++++-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 14a3d57e77..e14f2c0492 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -922,6 +922,10 @@ Is set to `none` by default to show all object types. Can also be
 tags), `tree` (peel: `$sha1^{tree}`), `treeish` (everything except
 blobs, peel syntax: `$sha1:`), `blob` (peel: `$sha1^{blob}`) or `tag`
 (peel: `$sha1^{tag}`). The peel syntax will override any config value.
++
+If no objects of the selected type exist the disambiguation will fall
+back to `none` and print a warning indicating no objects of the
+selected type could be found for that prefix.
 
 add.ignoreErrors::
 add.ignore-errors (deprecated)::
diff --git a/sha1-name.c b/sha1-name.c
index 1d2a74a29c..9789764a38 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -447,6 +447,7 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 
 	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
 		struct oid_array collect = OID_ARRAY_INIT;
+		int ignored_hint = 0;
 
 		error(_("short SHA1 %s is ambiguous"), ds.hex_pfx);
 
@@ -456,8 +457,10 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 		 * that case, we still want to show them, so disable the hint
 		 * function entirely.
 		 */
-		if (!ds.ambiguous)
+		if (!ds.ambiguous) {
 			ds.fn = NULL;
+			ignored_hint = 1;
+		}
 
 		advise(_("The candidates are:"));
 		for_each_abbrev(ds.hex_pfx, collect_ambiguous, &collect);
@@ -466,6 +469,12 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
 			BUG("show_ambiguous_object shouldn't return non-zero");
 		oid_array_clear(&collect);
+
+		if (ignored_hint) {
+			warning(_("Your hint (via core.disambiguate or peel syntax) was ignored, we fell\n"
+				  "back to showing all object types since no object of the requested type\n"
+				  "matched the provide short SHA1 %s"), ds.hex_pfx);
+		}
 	}
 
 	return status;
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index b17973a266..940f323ee9 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -359,7 +359,10 @@ test_expect_success C_LOCALE_OUTPUT 'failed type-selector still shows hint' '
 	echo 872 | git hash-object --stdin -w &&
 	test_must_fail git rev-parse ee3d^{commit} 2>stderr &&
 	grep ^hint: stderr >hints &&
-	test_line_count = 3 hints
+	test_line_count = 3 hints &&
+	grep ^warning stderr >warnings &&
+	grep -q "Your hint.*was ignored" warnings &&
+	grep -q "the provide short SHA1 ee3d" stderr
 '
 
 test_expect_success 'core.disambiguate config can prefer types' '
-- 
2.17.0.290.gded63e768a


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 00/12] get_short_oid UI improvements
  2018-05-01 18:40   ` [PATCH v3 00/12] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
@ 2018-05-02 12:42     ` Derrick Stolee
  2018-05-02 13:45       ` Derrick Stolee
  0 siblings, 1 reply; 99+ messages in thread
From: Derrick Stolee @ 2018-05-02 12:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Stefan Beller,
	Eric Sunshine

On 5/1/2018 2:40 PM, Ævar Arnfjörð Bjarmason wrote:
> The biggest change in v3 is the no change at all to the code, but a
> lengthy explanation of why I didn't go for Derrick's simpler
> implementation. Maybe I'm wrong about that, but I felt uneasy
> offloading undocumented (or if I documented it, it would only be for
> this one edge-case) magic on the oid_array API. Instead I'm just
> making this patch a bit more complex.

I think that's fair. Thanks for going along with me on the thought 
experiment.

-Stolee

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 00/12] get_short_oid UI improvements
  2018-05-02 12:42     ` Derrick Stolee
@ 2018-05-02 13:45       ` Derrick Stolee
  2018-05-03  6:43         ` Jacob Keller
  0 siblings, 1 reply; 99+ messages in thread
From: Derrick Stolee @ 2018-05-02 13:45 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Stefan Beller,
	Eric Sunshine

On 5/2/2018 8:42 AM, Derrick Stolee wrote:
> On 5/1/2018 2:40 PM, Ævar Arnfjörð Bjarmason wrote:
>> The biggest change in v3 is the no change at all to the code, but a
>> lengthy explanation of why I didn't go for Derrick's simpler
>> implementation. Maybe I'm wrong about that, but I felt uneasy
>> offloading undocumented (or if I documented it, it would only be for
>> this one edge-case) magic on the oid_array API. Instead I'm just
>> making this patch a bit more complex.
>
> I think that's fair. Thanks for going along with me on the thought 
> experiment.

Also, v3 looks good to me.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 04/12] cache.h: add comment explaining the order in object_type
  2018-05-01 18:40   ` [PATCH v3 04/12] cache.h: add comment explaining the order in object_type Ævar Arnfjörð Bjarmason
@ 2018-05-03  5:05     ` Junio C Hamano
  2018-05-08 15:35     ` Duy Nguyen
  1 sibling, 0 replies; 99+ messages in thread
From: Junio C Hamano @ 2018-05-03  5:05 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, brian m . carlson, Derrick Stolee, Stefan Beller,
	Eric Sunshine

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> The order in the enum might seem arbitrary, and isn't explained by
> 72518e9c26 ("more lightweight revalidation while reusing deflated
> stream in packing", 2006-09-03) which added it.
>
> Derrick Stolee suggested that it's ordered topologically in
> 5f8b1ec1-258d-1acc-133e-a7c248b4083e@gmail.com. Makes sense to me, add
> that as a comment.

When referring to a message-id, please do not omit surrounding <>,
which is part of the message-id string.  That's like writing your
e-mail address as avarab gmail.com without the at sign.

>  enum object_type {
>  	OBJ_BAD = -1,
>  	OBJ_NONE = 0,
> +	/*
> +	 * Why have our "real" object types in this order? They're
> +	 * ordered topologically:
> +	 *
> +	 * tag(4)    -> commit(1), tree(2), blob(3)
> +	 * commit(1) -> tree(2)
> +	 * tree(2)   -> blob(3)
> +	 */

I am not sure if the above makes sense at all in explaining tag.
With all others, type with smaller type id can refer to another type
that is with equal or larger type id (tree can refer to another
tree).  If tag had the smallest ID among all, it would have made
sense, though.

Before anybody confused raises noise, a gitlink records a commit in
a tree, which may seem to contradict the rule even more, but it
merely records a commit, without "referring" to it in the same sense
as other reference that require connectivity.

>  	OBJ_COMMIT = 1,
>  	OBJ_TREE = 2,
>  	OBJ_BLOB = 3,
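
The ordering rule discussed above (a type may only refer to a type with
an equal or larger id) can be checked mechanically. This is only a
sketch: the `edge` table and `count_violations()` are hypothetical names,
the table adds commit -> commit (parents) and tag -> tag to the
references listed in the quoted comment, and gitlinks are left out for
the reason given above.

```c
#include <stddef.h>

/* Object types, numbered as in the quoted enum object_type. */
enum obj_type { T_COMMIT = 1, T_TREE = 2, T_BLOB = 3, T_TAG = 4 };

/* One "objects of type `from` can refer to objects of type `to`" entry. */
struct edge {
	enum obj_type from, to;
};

static const struct edge edges[] = {
	{ T_COMMIT, T_COMMIT },	/* parent commits */
	{ T_COMMIT, T_TREE },
	{ T_TREE, T_TREE },	/* subtrees */
	{ T_TREE, T_BLOB },
	{ T_TAG, T_COMMIT },
	{ T_TAG, T_TREE },
	{ T_TAG, T_BLOB },
	{ T_TAG, T_TAG },	/* a tag pointing at another tag */
};

/*
 * Count the references from `from` that point at a type with a
 * smaller id, i.e. the ones that break the "equal or larger" rule.
 */
static int count_violations(enum obj_type from)
{
	size_t i;
	int bad = 0;

	for (i = 0; i < sizeof(edges) / sizeof(edges[0]); i++)
		if (edges[i].from == from && edges[i].to < from)
			bad++;
	return bad;
}
```

For commit, tree and blob the count is zero. For tag it is three (commit,
tree and blob all have smaller ids), which is why the rule would only
hold for tags if they had the smallest id.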

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 06/12] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-01 18:40   ` [PATCH v3 06/12] get_short_oid: sort ambiguous objects by type, then SHA-1 Ævar Arnfjörð Bjarmason
@ 2018-05-03  5:13     ` Junio C Hamano
  2018-05-08 14:44     ` Jeff King
  1 sibling, 0 replies; 99+ messages in thread
From: Junio C Hamano @ 2018-05-03  5:13 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, brian m . carlson, Derrick Stolee, Stefan Beller,
	Eric Sunshine

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> +	/*
> +	 * Between object types show tags, then commits, and finally
> +	 * trees and blobs.
> +	 *
> +	 * The object_type enum is commit, tree, blob, tag, but we
> +	 * want tag, commit, tree blob. Cleverly (perhaps too

The missing comma between "tree blob"  on the second line made me
read this comment twice, which made me notice the lack of "and"
before "tag" on the previous line.

Assignment is "commit, tree, blob, and then tag" but we want "tag,
commit, tree and then blob", perhaps?
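
For illustration, here is one way to get the "tag, commit, tree and then
blob" display order out of the enum values, sketched as a qsort()
comparator. The `candidate` struct, `display_rank()` and
`cmp_candidates()` are hypothetical names, and comparing the hex strings
with strcmp() stands in for comparing object ids; this is not claimed to
be the code in the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Object types, numbered as in git's enum object_type. */
enum obj_type { T_COMMIT = 1, T_TREE = 2, T_BLOB = 3, T_TAG = 4 };

/* Hypothetical candidate record: a lowercase hex name and its type. */
struct candidate {
	const char *hex;
	enum obj_type type;
};

/*
 * Map the enum order (commit, tree, blob, tag) to the display order
 * (tag, commit, tree, blob): tag is 4, so 4 % 4 == 0 sorts it first,
 * while 1, 2 and 3 keep their relative order.
 */
static int display_rank(enum obj_type type)
{
	return type % 4;
}

/* qsort() comparator: by display rank first, then by hex name. */
static int cmp_candidates(const void *a_, const void *b_)
{
	const struct candidate *a = a_;
	const struct candidate *b = b_;
	int ra = display_rank(a->type);
	int rb = display_rank(b->type);

	if (ra != rb)
		return ra < rb ? -1 : 1;
	return strcmp(a->hex, b->hex);
}
```

Sorting an array of candidates with qsort(c, n, sizeof(c[0]),
cmp_candidates) then puts tags first and blobs last, with ties broken by
name.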

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  2018-05-01 18:40   ` [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish Ævar Arnfjörð Bjarmason
@ 2018-05-03  5:28     ` Junio C Hamano
  2018-05-03  7:28       ` Ævar Arnfjörð Bjarmason
  2018-05-10 12:42     ` [PATCH v4 0/6] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
                       ` (6 subsequent siblings)
  7 siblings, 1 reply; 99+ messages in thread
From: Junio C Hamano @ 2018-05-03  5:28 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, brian m . carlson, Derrick Stolee, Stefan Beller,
	Eric Sunshine

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> But ^{tree} shows just the trees, but would previously be equivalent
> to the above:
>
>     $ git rev-parse e8f2^{tree}
>     error: short SHA1 e8f2 is ambiguous
>     hint: The candidates are:
>     hint:   e8f2093055 tree
>     hint:   e8f25a3a50 tree
>     hint:   e8f28d537c tree
>     hint:   e8f2cf6ec0 tree
>     [...]

When a user says "git $cmd e8f2^{tree}", the user is telling Git
that s/he knows e8f2 *is* a tree-ish, but for whatever reason $cmd
wants a tree and does not accept an arbitrary tree-ish---that is the
whole point of appending ^{tree} as a suffix.  A useful hint in such
a case would be "oh, you said e8f2 is a tree-ish, but there are more
than one tree-ish, so let me show them to you to help you decide
which one among them is the one you meant".  When $cmd is rev-parse,
I would even say that the user is saying "I know e8f2 is a tree-ish,
and I know it is not a tree--it merely is a tree-ish.  I want the tree
that e8f2 thing points at".

Limiting that hint to show only real trees does not make any sense
to me.  I do not think we care _too_ deeply, because most of the
time, command line location that expects a tree-ish can be given a
tree-ish, so there is not much reason to use ^{tree} suffix these
days.  But in a case where it _does_ matter, I think this change
makes the "hint" almost useless.

Or am I misreading what you wanted to achieve with this patch?

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 00/12] get_short_oid UI improvements
  2018-05-02 13:45       ` Derrick Stolee
@ 2018-05-03  6:43         ` Jacob Keller
  0 siblings, 0 replies; 99+ messages in thread
From: Jacob Keller @ 2018-05-03  6:43 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Ævar Arnfjörð Bjarmason, Git mailing list,
	Junio C Hamano, Jeff King, brian m . carlson, Stefan Beller,
	Eric Sunshine

On Wed, May 2, 2018 at 6:45 AM, Derrick Stolee <stolee@gmail.com> wrote:
> On 5/2/2018 8:42 AM, Derrick Stolee wrote:
>>
>> On 5/1/2018 2:40 PM, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> The biggest change in v3 is the no change at all to the code, but a
>>> lengthy explanation of why I didn't go for Derrick's simpler
>>> implementation. Maybe I'm wrong about that, but I felt uneasy
>>> offloading undocumented (or if I documented it, it would only be for
>>> this one edge-case) magic on the oid_array API. Instead I'm just
>>> making this patch a bit more complex.
>>
>>
>> I think that's fair. Thanks for going along with me on the thought
>> experiment.
>
>
> Also, v3 looks good to me.
>
> Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
>

I also reviewed this, and it looks good to me as well.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  2018-05-03  5:28     ` Junio C Hamano
@ 2018-05-03  7:28       ` Ævar Arnfjörð Bjarmason
  2018-05-04  2:19         ` Junio C Hamano
  0 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-03  7:28 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, brian m . carlson, Derrick Stolee, Stefan Beller,
	Eric Sunshine


On Thu, May 03 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> But ^{tree} shows just the trees, but would previously be equivalent
>> to the above:
>>
>>     $ git rev-parse e8f2^{tree}
>>     error: short SHA1 e8f2 is ambiguous
>>     hint: The candidates are:
>>     hint:   e8f2093055 tree
>>     hint:   e8f25a3a50 tree
>>     hint:   e8f28d537c tree
>>     hint:   e8f2cf6ec0 tree
>>     [...]
>
> When a user says "git $cmd e8f2^{tree}", the user is telling Git
> that s/he knows e8f2 *is* a tree-ish, but for whatever reason $cmd
> wants a tree and does not accept an arbitrary tree-ish---that is the
> whole point of appending ^{tree} as a suffix.  A useful hint in such
> a case would be "oh, you said e8f2 is a tree-ish, but there are more
> than one tree-ish, so let me show them to you to help you decide
> which one among them is the one you meant".  When $cmd is rev-parse,
> I would even say that the user is saying "I know e8f2 is a tree-ish,
> and I know it is not a tree--it merely is a tree-ish.  I want the tree
> that e8f2 thing points at".
>
> Limiting that hint to show only real trees does not make any sense
> to me.  I do not think we care _too_ deeply, because most of the
> time, command line location that expects a tree-ish can be given a
> tree-ish, so there is not much reason to use ^{tree} suffix these
> days.  But in a case where it _does_ matter, I think this change
> makes the "hint" almost useless.
>
> Or am I misreading what you wanted to achieve with this patch?

The reason I'm doing this is because I found it confusing that I can't
do:

    for t in tag commit tree blob; do ./git --exec-path=$PWD rev-parse 7452^{$t}; done

And get, respectively, only the SHAs that match the respective type, but
currently (with released git) you can do:

    for t in tag commit committish treeish tree blob; do git -c core.disambiguate=$t rev-parse 7452; done

And while =tag doesn't work the others do (including =blob), so
core.disambiguate=tree gives you just trees, but ^{tree} gives you
treeish.

Why should ^{tree} be giving me ^{treeish} but =tree be giving me trees,
and =treeish be synonymous with ^{tree}?

There are no other cases I know of where the ^{<type>} peel syntax won't
give you *only* the <type> you asked for. See peel_onion() ->
peel_to_type() and how get_oid_1() will short-circuit if it has an
answer, and then finally fall back to this get_short_oid() codepath.

Looking at the code & git log maybe it'll do that internally, but when
you peel a tag or commit ^{tree} will only ever find one thing, unlike
this disambiguation case where we can match multiple things.

So:

1) Am I missing some subtlety or am I correct that there was no way to
get git to return more than one SHA-1 for ^{commit} or ^{tree} before
this disambiguation feature was added?

2) I think the behavior I've implemented is consistent with how the peel
syntax has been documented in revisions.txt:

    '<rev>{caret}{<type>}', e.g. 'v0.99.8{caret}\{commit\}'::
      A suffix '{caret}' followed by an object type name enclosed in
      brace pair means dereference the object at '<rev>' recursively until
      an object of type '<type>' is found or the object cannot be
      dereferenced anymore (in which case, barf).
      For example, if '<rev>' is a commit-ish, '<rev>{caret}\{commit\}'
      describes the corresponding commit object.
      Similarly, if '<rev>' is a tree-ish, '<rev>{caret}\{tree\}'
      describes the corresponding tree object.
      '<rev>{caret}0'
      is a short-hand for '<rev>{caret}\{commit\}'.

Note "until an object of type '<type>' is found". I.e. my mental model
of this has been that yes you can *start* the search at a different
object (e.g. tag -> tree), but it'll only ever return the tree. The
disambiguation implementation has been inconsistent with this as
documented, because it hasn't been drilling down to an object of '<type>'
given a request like $short^{<type>}, but rather returning something
matching $short because it could be a container for <type>.
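
The two readings can be put side by side in a small C sketch. The
predicate names and `count_matching()` are hypothetical, and the "-ish"
predicates assume every tag peels to a commit or a tree, which a real
implementation would have to check by dereferencing the tag.

```c
#include <stddef.h>

/* Object types, numbered as in git's enum object_type. */
enum obj_type { T_COMMIT = 1, T_TREE = 2, T_BLOB = 3, T_TAG = 4 };

/* Exact-type reading: only objects that already are of the type. */
static int is_commit(enum obj_type t) { return t == T_COMMIT; }
static int is_tree(enum obj_type t) { return t == T_TREE; }

/*
 * "-ish" reading: anything that could be peeled to the type.
 * Assumption: a tag always counts, as if it peeled to a commit.
 */
static int is_committish(enum obj_type t)
{
	return t == T_COMMIT || t == T_TAG;
}

static int is_treeish(enum obj_type t)
{
	return t == T_TREE || t == T_COMMIT || t == T_TAG;
}

/* Count how many candidate types satisfy the predicate `pred`. */
static size_t count_matching(const enum obj_type *types, size_t n,
			     int (*pred)(enum obj_type))
{
	size_t i, nr = 0;

	for (i = 0; i < n; i++)
		if (pred(types[i]))
			nr++;
	return nr;
}
```

For a prefix matching one tag, three commits, four trees and four blobs,
is_treeish selects eight candidates and is_tree selects four, which is
the difference between the "$sha1:" and "$sha1^{tree}" listings in the
patch.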

Anyway, I'm not saying I'm right. This is the first time I really look
at sha1-name.c in any detail. But the above describes the thought
process behind this patch.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  2018-05-03  7:28       ` Ævar Arnfjörð Bjarmason
@ 2018-05-04  2:19         ` Junio C Hamano
  2018-05-04  8:42           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 99+ messages in thread
From: Junio C Hamano @ 2018-05-04  2:19 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, brian m . carlson, Derrick Stolee, Stefan Beller,
	Eric Sunshine

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> The reason I'm doing this is because I found it confusing that I can't
> do:
>
>     for t in tag commit tree blob; do ./git --exec-path=$PWD rev-parse 7452^{$t}; done
>
> And get, respectively, only the SHAs that match the respective type, but
> currently (with released git) you can do:
>
>     for t in tag commit committish treeish tree blob; do git -c core.disambiguate=$t rev-parse 7452; done

Exactly.  The former asks "I (think I) know 7452 can be used to name
an object of type $t, with peeling if necessary--give me the underlying
object of type $t".  In short, the fact that you can write "$X^{$t}"
says that $X is a $t-ish (otherwise $X cannot be used as a stand-in
for an object of type $t) and that you fully expect that $X can merely
be of type $t-ish and not exactly $t (otherwise you wouldn't be
making sure to coerce $X into $t with ^{$t} notation).

In *THAT* context, disambiguation help that lists objects whose name
begins with "7452" you gave, hoping that it is a unique enough
prefix when it wasn't in reality, *MUST* give $t-ish; restricting it
to $t makes the help mostly useless.

> 1) Am I missing some subtlety or am I correct that there was no way to
> get git to return more than one SHA-1 for ^{commit} or ^{tree} before
> this disambiguation feature was added?

There is no such feature either before or after the disambiguation
help.  I am not saying there shouldn't exist such a feature.  I am
saying that breaking the existing feature and making it useless is
not the way to add such a feature.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  2018-05-04  2:19         ` Junio C Hamano
@ 2018-05-04  8:42           ` Ævar Arnfjörð Bjarmason
  2018-05-07  4:08             ` Junio C Hamano
  0 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-04  8:42 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, brian m . carlson, Derrick Stolee, Stefan Beller,
	Eric Sunshine


On Fri, May 04 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> The reason I'm doing this is because I found it confusing that I can't
>> do:
>>
>>     for t in tag commit tree blob; do ./git --exec-path=$PWD rev-parse 7452^{$t}; done
>>
>> And get, respectively, only the SHAs that match the respective type, but
>> currently (with released git) you can do:
>>
>>     for t in tag commit committish treeish tree blob; do git -c core.disambiguate=$t rev-parse 7452; done
>
> Exactly.  The former asks "I (think I) know 7452 can be used to name
> an object of type $t, with peeling if necessary--give me the underlying
> object of type $t".

Right, and I'm with you so far, this makes sense to me for all existing
uses of the peel syntax, otherwise v2.17.0^{tree} wouldn't be the same
as rev-parse v2.17.0^{tree}^{tree}...

> In short, the fact that you can write "$X^{$t}"
> says that $X is a $t-ish (otherwise $X cannot be used as a stand-in
> for an object of type $t) and that you fully expect that $X can merely
> be of type $t-ish and not exactly $t (otherwise you wouldn't be
> making sure to coerce $X into $t with ^{$t} notation).
>
> In *THAT* context, disambiguation help that lists objects whose name
> begins with "7452" you gave, hoping that it is a unique enough
> prefix when it wasn't in reality, *MUST* give $t-ish; restricting it
> to $t makes the help mostly useless.
>
>> 1) Am I missing some subtlety or am I correct that there was no way to
>> get git to return more than one SHA-1 for ^{commit} or ^{tree} before
>> this disambiguation feature was added?
>
> There is no such feature either before or after the disambiguation
> help.  I am not saying there shouldn't exist such a feature.  I am
> saying that breaking the existing feature and making it useless is
> not the way to add such a feature.

I still don't get how what you're proposing is going to be consistent,
but let's fully enumerate the output of 7452 with my patch to take that
case-by-case[1]:

    ^{tag}:
    7452b4b5786778d5d87f5c90a94fab8936502e20
    ^{commit}:
    hint:   74521eee4c commit 2007-12-01 - git-gui: install-sh from automake does not like -m755
    hint:   745224e04a commit 2014-06-18 - refs.c: SSE2 optimizations for check_refname_component
    ^{tree}:
    hint:   7452336aa3 tree
    hint:   74524f384d tree
    hint:   7452813bcd tree
    hint:   7452b1a701 tree
    hint:   7452b73c42 tree
    hint:   7452ca1557 tree
    ^{blob}:
    hint:   7452001351 blob
    hint:   745254665d blob
    hint:   7452a572c1 blob
    hint:   7452b9fd21 blob
    hint:   7452db13c8 blob
    hint:   7452fce0da blob

And[2]:

    core.disambiguate=tag:
    [same as ^{tag}]
    core.disambiguate=commit:
    [same as ^{commit}]
    core.disambiguate=committish:
    hint:   7452b4b578 tag v2.1.0
    hint:   74521eee4c commit 2007-12-01 - git-gui: install-sh from automake does not like -m755
    hint:   745224e04a commit 2014-06-18 - refs.c: SSE2 optimizations for check_refname_component
    core.disambiguate=tree:
    [same as ^{tree}]
    core.disambiguate=treeish (same as $sha1:)
    hint:   7452b4b578 tag v2.1.0
    hint:   74521eee4c commit 2007-12-01 - git-gui: install-sh from automake does not like -m755
    hint:   745224e04a commit 2014-06-18 - refs.c: SSE2 optimizations for check_refname_component
    hint:   7452336aa3 tree
    hint:   74524f384d tree
    hint:   7452813bcd tree
    hint:   7452b1a701 tree
    hint:   7452b73c42 tree
    hint:   7452ca1557 tree
    core.disambiguate=blob:
    [same as ^{blob}]

So from my understanding of what you're saying you'd like to list tag,
commits and trees given $sha1^{tree}, because they're all types that can
be used to reach a tree.

I don't think that's very useful, yes it would "break" existing
disambiguations, but this is such an obscure (and purely manual)
use-case that I think that's fine.

Because I think to the extent anyone's going to use this it's because
they know they have e.g. a short blob, commit etc. SHA-1; they're not
going to use it because they have some short $SHA they know is a tree,
and then want all SHA-1s of that *and* random tag & commit objects that
happen to have the same object prefix just because tags and commits can
also point to trees.

How does that make any sense? The entire reason for using the normal
peel syntax is because you e.g. have v2.17.0 and want to get to the
^{tree} or the ^{commit} that v2.17.0 directly points to. That's entirely
orthogonal to what the disambiguation is doing. There with your proposed
semantics you're peeling 7452 as 7452^{tree} because (IMO) you're
looking for trees, just to get some entirely unrelated commits and tags.

But *leaving that aside*, i.e. I don't see why the use-case would make
sense. What I *don't* get is why, if you think that, you only want to
apply that rule to ^{tree}. I.e. wouldn't it then be consistent to say:

    # a)
    ^{tag}    = tag
    ^{commit} = tag, commit
    ^{tree}   = tag, commit, tree
    ^{blob}   = tag, blob (blobish)

Whereas my patch now does:

    # b)
    ^{tag}    = tag
    ^{commit} = commit
    ^{tree}   = tree
    ^{blob}   = blob

But from what you seem to be proposing (or maybe you just didn't have a
chance to critique the ^{blob} and ^{commit} patches):

    # c)
    ^{tag}    = tag
    ^{commit} = commit
    ^{tree}   = tag, tree, commit
    ^{blob}   = blob
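
The three rule sets above can be compared side by side with a small
sketch. This is only an illustration in Python, not how sha1-name.c
does it: the names POLICIES, CANDIDATES and hint_candidates are made up
for this example, and the candidate list is a subset of the 7452
objects shown earlier in this message.

```python
# Hypothetical illustration of rule sets a), b) and c); not git's code.
# Each policy maps a peel type to the object types that would be
# listed as hints for an ambiguous short SHA-1.
POLICIES = {
    "a": {
        "tag": {"tag"},
        "commit": {"tag", "commit"},
        "tree": {"tag", "commit", "tree"},
        "blob": {"tag", "blob"},
    },
    "b": {
        "tag": {"tag"},
        "commit": {"commit"},
        "tree": {"tree"},
        "blob": {"blob"},
    },
    "c": {
        "tag": {"tag"},
        "commit": {"commit"},
        "tree": {"tag", "tree", "commit"},
        "blob": {"blob"},
    },
}

# A subset of the objects whose names begin with 7452.
CANDIDATES = [
    ("7452b4b578", "tag"),
    ("74521eee4c", "commit"),
    ("745224e04a", "commit"),
    ("7452336aa3", "tree"),
    ("7452001351", "blob"),
]


def hint_candidates(policy, peel_type, candidates=CANDIDATES):
    """Return the short object names a policy would hint for ^{peel_type}."""
    allowed = POLICIES[policy][peel_type]
    return [oid for oid, otype in candidates if otype in allowed]
```

With these inputs, policy b) lists only the tree for ^{tree}, while
a) and c) also list the tag and both commits.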

1. for type in tag commit tree blob; do echo "^{$type}:" && ./git --exec-path=$PWD rev-parse 7452^{$type} 2>&1|grep -E -e ^hint -e '^[0-9a-f]{40}$' |grep -v are:; done
2. for cfg in tag commit committish tree treeish blob; do echo "core.disambiguate=$cfg:" && ./git --exec-path=$PWD -c core.disambiguate=$cfg rev-parse 7452 2>&1|grep -E -e ^hint -e '^[0-9a-f]{40}$' |grep -v are:; done

* Re: [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  2018-05-04  8:42           ` Ævar Arnfjörð Bjarmason
@ 2018-05-07  4:08             ` Junio C Hamano
  2018-05-08 14:34               ` Jeff King
  0 siblings, 1 reply; 99+ messages in thread
From: Junio C Hamano @ 2018-05-07  4:08 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, brian m . carlson, Derrick Stolee, Stefan Beller,
	Eric Sunshine

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> Right, and I'm with you so far, this makes sense to me for all existing
> uses of the peel syntax, otherwise v2.17.0^{tree} wouldn't be the same
> as rev-parse v2.17.0^{tree}^{tree}...

More importantly, you could spell v2.17.0 part of the above with a
short hexadecimal string.  And that string should be naming some
tree-ish, the most important thing being that it is *NOT* required
to be a tree (and practically, it is likely that the user has a
tree-ish that is *NOT* a tree).

I guess I have a reaction to the title

    "get_short_oid/peel_onion: ^{tree} should be tree"

"X^{tree}" should *RESULT* in a tree, but it should *REQUIRE* X to
be a tree-ish.  It is unclear whether "should be tree" is about the former
and I read (perhaps mis-read) it as saying "it should require X to
be a tree"---that statement is utterly incorrect as we agreed above.

> case-by-case[1]:
>
>     ^{tag}:
>     7452b4b5786778d5d87f5c90a94fab8936502e20

I take it as "git rev-parse 7452^{tag}" output (similarly ^{$type}
for the rest)?  That probably is desirable, as blobs, trees and
commits cannot be peeled down to a tag.

>     ^{commit}:
>     hint:   74521eee4c commit 2007-12-01 - git-gui: install-sh from automake does not like -m755
>     hint:   745224e04a commit 2014-06-18 - refs.c: SSE2 optimizations for check_refname_component

If a tag 7452 points at a commit, that tag itself should also be given as
a possible object the user may have meant in the "hint" thing.  I
agree it is a good idea to exclude trees and blobs from the hint,
for the same reason why I think it makes sense to exclude blobs,
trees and commits from hints for a X in "X^{tag}" above.

>     ^{tree}:
>     hint:   7452336aa3 tree
>     hint:   74524f384d tree
>     hint:   7452813bcd tree
>     hint:   7452b1a701 tree
>     hint:   7452b73c42 tree
>     hint:   7452ca1557 tree

Again, if there is a commit or a tag (that points at a commit or a
tree) whose name begins with 7452, it should be included in the hint
above.  Not having blobs in the hint of course makes sense, as a
blob cannot be X in "X^{tree}".

> And[2]:
>
>     core.disambiguate=tag:
>     [same as ^{tag}]
>     core.disambiguate=commit:
>     [same as ^{commit}]

When core.disambiguate tells us to "interpret hexadecimal literals
to name commit objects only", giving only commits in hints: section
makes sense, because we are explicitly saying that "when I say 7452,
I do not mean any tag whose name begins with 7452", so "sorry, your
request is not explicit enough---there are two commits and a tag
that begin with that prefix" is not helpful---it should stop at "you
may have meant one of these two commits" and not mention any tag.

* Re: [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  2018-05-07  4:08             ` Junio C Hamano
@ 2018-05-08 14:34               ` Jeff King
  2018-05-08 18:53                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 99+ messages in thread
From: Jeff King @ 2018-05-08 14:34 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, git, brian m . carlson,
	Derrick Stolee, Stefan Beller, Eric Sunshine

On Mon, May 07, 2018 at 01:08:46PM +0900, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> 
> > Right, and I'm with you so far, this makes sense to me for all existing
> > uses of the peel syntax, otherwise v2.17.0^{tree} wouldn't be the same
> > as rev-parse v2.17.0^{tree}^{tree}...
> 
> More importantly, you could spell v2.17.0 part of the above with a
> short hexadecimal string.  And that string should be naming some
> tree-ish, the most important thing being that it is *NOT* required
> to be a tree (and practically, it is likely that the user has a
> tree-ish that is *NOT* a tree).
> 
> I guess I have a reaction to the title
> 
>     "get_short_oid/peel_onion: ^{tree} should be tree"
> 
> "X^{tree}" should *RESULT* in a tree, but it should *REQUIRE* X to
> be a tree-ish.  It is unclear whether "should be tree" is about the former
> and I read (perhaps mis-read) it as saying "it should require X to
> be a tree"---that statement is utterly incorrect as we agreed above.

FWIW, I had the same feeling as you when reading this, that this commit
(and the one after) are doing the wrong thing. And these paragraphs sum
it up. The "^{tree}" is about asking us to peel to a tree, not about
resolving X in the first place. We can use it as a _hint_ when resolving
X, but the correct hint is "something that can be peeled to a tree", not
"is definitely a tree".

-Peff

* Re: [PATCH v3 11/12] config doc: document core.disambiguate
  2018-05-01 18:40   ` [PATCH v3 11/12] config doc: document core.disambiguate Ævar Arnfjörð Bjarmason
@ 2018-05-08 14:41     ` Jeff King
  0 siblings, 0 replies; 99+ messages in thread
From: Jeff King @ 2018-05-08 14:41 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine

On Tue, May 01, 2018 at 06:40:15PM +0000, Ævar Arnfjörð Bjarmason wrote:

> The core.disambiguate variable was added in
> 5b33cb1fd7 ("get_short_sha1: make default disambiguation
> configurable", 2016-09-27) but never documented.

Thanks, this seems reasonable. It was originally added as a tool to let
people experiment with different defaults, and I never really expected
it to be something normal people would set. But I'm not sure if anybody
really did much experimentation (I still suspect that setting it to
"commit" or "committish" would make most people happy).

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 2659153cb3..14a3d57e77 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -910,6 +910,19 @@ core.abbrev::
>  	abbreviated object names to stay unique for some time.
>  	The minimum length is 4.
>  
> +core.disambiguate::
> +	If Git is given a SHA-1 that's ambiguous it'll suggest what
> +	objects you might mean. By default it'll print out all
> +	potential objects with that prefix regardless of their
> +	type. This setting, along with the `^{<type>}` peel syntax
> +	(see linkgit:gitrevisions[7]), allows for narrowing that down.

This isn't just about what we print, but also about excluding objects
from consideration that don't match.

> +Is set to `none` by default to show all object types. Can also be
> +`commit` (peel syntax: `$sha1^{commit}`), `committish` (commits and
> +tags), `tree` (peel: `$sha1^{tree}`), `treeish` (everything except
> +blobs, peel syntax: `$sha1:`), `blob` (peel: `$sha1^{blob}`) or `tag`
> +(peel: `$sha1^{tag}`). The peel syntax will override any config value.

These peel references would need updating pending the discussion over
the earlier patches.

I suspect there are other things besides peel syntax which may override
this. It's really just the fallback when the caller does not give the
lookup machinery any other context. Certainly the peel specifiers are
one way to get context, but I think there are others. Grepping for
GET_OID_, I see that the revision dot syntax infers committish context,
as does anything that passes REVARG_COMMITTISH (so git-log, for
example).

-Peff

* Re: [PATCH v3 06/12] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-01 18:40   ` [PATCH v3 06/12] get_short_oid: sort ambiguous objects by type, then SHA-1 Ævar Arnfjörð Bjarmason
  2018-05-03  5:13     ` Junio C Hamano
@ 2018-05-08 14:44     ` Jeff King
  1 sibling, 0 replies; 99+ messages in thread
From: Jeff King @ 2018-05-08 14:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine

On Tue, May 01, 2018 at 06:40:10PM +0000, Ævar Arnfjörð Bjarmason wrote:

> Change the output emitted when an ambiguous object is encountered so
> that we show tags first, then commits, followed by trees, and finally
> blobs. Within each type we show objects in hashcmp() order. Before
> this change the objects were only ordered by hashcmp().
> 
> The reason for doing this is that the output looks better as a result,
> e.g. the v2.17.0 tag before this change on "git show e8f2" would
> display:

FWIW, I agree that this gives a big improvement to readability.

-Peff

* Re: [PATCH v3 04/12] cache.h: add comment explaining the order in object_type
  2018-05-01 18:40   ` [PATCH v3 04/12] cache.h: add comment explaining the order in object_type Ævar Arnfjörð Bjarmason
  2018-05-03  5:05     ` Junio C Hamano
@ 2018-05-08 15:35     ` Duy Nguyen
  2018-05-08 15:56       ` [PATCH] pack-format.txt: more details on pack file format Nguyễn Thái Ngọc Duy
  1 sibling, 1 reply; 99+ messages in thread
From: Duy Nguyen @ 2018-05-08 15:35 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, Junio C Hamano, Jeff King, brian m . carlson,
	Derrick Stolee, Stefan Beller, Eric Sunshine

On Tue, May 1, 2018 at 8:40 PM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> The order in the enum might seem arbitrary, and isn't explained by
> 72518e9c26 ("more lightweight revalidation while reusing deflated
> stream in packing", 2006-09-03) which added it.
>
> Derrick Stolee suggested that it's ordered topologically in
> 5f8b1ec1-258d-1acc-133e-a7c248b4083e@gmail.com. Makes sense to me, add
> that as a comment.
>
> Helped-by: Derrick Stolee <dstolee@microsoft.com>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  cache.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/cache.h b/cache.h
> index 77b7acebb6..354903c3ea 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -376,6 +376,14 @@ extern void free_name_hash(struct index_state *istate);
>  enum object_type {
>         OBJ_BAD = -1,
>         OBJ_NONE = 0,
> +       /*
> +        * Why have our "real" object types in this order? They're
> +        * ordered topologically:
> +        *
> +        * tag(4)    -> commit(1), tree(2), blob(3)
> +        * commit(1) -> tree(2)
> +        * tree(2)   -> blob(3)
> +        */

I think it's more important that these constants are part of the pack
file format. Even if it follows some order now, when a new object type
comes, you cannot just reorder to keep things looking nice because then
you break pack file access.

I'm afraid this comment suggests that these numbers are just about
order, which is very wrong.

>         OBJ_COMMIT = 1,
>         OBJ_TREE = 2,
>         OBJ_BLOB = 3,
-- 
Duy

* [PATCH] pack-format.txt: more details on pack file format
  2018-05-08 15:35     ` Duy Nguyen
@ 2018-05-08 15:56       ` Nguyễn Thái Ngọc Duy
  2018-05-08 17:23         ` Stefan Beller
                           ` (2 more replies)
  0 siblings, 3 replies; 99+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-05-08 15:56 UTC (permalink / raw)
  To: pclouds; +Cc: avarab, git, gitster, peff, sandals, sbeller, stolee, sunshine

The current document mentions OBJ_* constants without their actual
values. A git developer would know these are from cache.h but that's
not very friendly to a person who wants to read this file to implement
a pack file parser.

Similarly, the deltified representation is not documented at all (the
"document" is basically patch-delta.c). Translate that C code into
English.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 I noticed that these object type values are not documented in
 pack-format.txt so here's my attempt to improve it.

 While at it, I also add some text about this obscure delta format.
 We occasionally have questions about this on the mailing list if I
 remember correctly.

 Documentation/technical/pack-format.txt | 41 +++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt
index 8e5bf60be3..2c7d5c0e74 100644
--- a/Documentation/technical/pack-format.txt
+++ b/Documentation/technical/pack-format.txt
@@ -36,6 +36,47 @@ Git pack format
 
   - The trailer records 20-byte SHA-1 checksum of all of the above.
 
+Valid object types are:
+
+- OBJ_COMMIT (1)
+- OBJ_TREE (2)
+- OBJ_BLOB (3)
+- OBJ_TAG (4)
+- OBJ_OFS_DELTA (6)
+- OBJ_REF_DELTA (7)
+
+Type 5 is reserved for future expansion.
+
+Deltified representation is a sequence of one byte command optionally
+followed by more data for the command. The following commands are
+recognized:
+
+- If bit 7 is set, the remaining bits in the command byte specify
+  how to extract copy offset and size to copy. The following must be
+  evaluated in this exact order:
+  - If bit 0 is set, the following byte contains bits 0-7 of the copy
+    offset (this also resets all other bits in the copy offset to
+    zero).
+  - If bit 1 is set, the following byte contains bits 8-15 of the copy
+    offset.
+  - If bit 2 is set, the following byte contains bits 16-23 of the
+    copy offset.
+  - If bit 3 is set, the following byte contains bits 24-31 of the
+    copy offset.
+  - If bit 4 is set, the following byte contains bits 0-7 of the copy
+    size (this also resets all other bits in the copy size to zero).
+  - If bit 5 is set, the following byte contains bits 8-15 of the copy
+    size.
+  - If bit 6 is set, the following byte contains bits 16-23 of the
+    copy size.
+
+  Copy size zero means 0x10000 bytes. The data from source object at
+  the given copy offset is copied back to the destination buffer.
+
+- If bit 7 is not set, it is the copy size in bytes. The following
+  bytes are copied to destination buffer
+- Command byte zero is reserved for future expansion.
+
 == Original (version 1) pack-*.idx files have the following format:
 
   - The header consists of 256 4-byte network byte order
-- 
2.17.0.705.g3525833791


* Re: [PATCH] pack-format.txt: more details on pack file format
  2018-05-08 15:56       ` [PATCH] pack-format.txt: more details on pack file format Nguyễn Thái Ngọc Duy
@ 2018-05-08 17:23         ` Stefan Beller
  2018-05-08 18:22           ` Duy Nguyen
  2018-05-08 18:21         ` Ævar Arnfjörð Bjarmason
  2018-05-10 15:09         ` [PATCH v2] " Nguyễn Thái Ngọc Duy
  2 siblings, 1 reply; 99+ messages in thread
From: Stefan Beller @ 2018-05-08 17:23 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: Ævar Arnfjörð Bjarmason, git, Junio C Hamano,
	Jeff King, brian m. carlson, Derrick Stolee, Eric Sunshine

On Tue, May 8, 2018 at 8:56 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> The current document mentions OBJ_* constants without their actual
> values. A git developer would know these are from cache.h but that's
> not very friendly to a person who wants to read this file to implement
> a pack file parser.
>
> Similarly, the deltified representation is not documented at all (the
> "document" is basically patch-delta.c). Translate that C code into
> English.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  I noticed that these object type values are not documented in
>  pack-format.txt so here's my attempt to improve it.

Thanks for sending this patch!

>
>  While at it, I also add some text about this obscure delta format.
>  We occasionally have questions about this on the mailing list if I
>  remember correctly.

Let me see if I can understand it, as I am not well versed in the
delta format, so ideally I would understand it from the patch here?

>
>  Documentation/technical/pack-format.txt | 41 +++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
>
> diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt
> index 8e5bf60be3..2c7d5c0e74 100644
> --- a/Documentation/technical/pack-format.txt
> +++ b/Documentation/technical/pack-format.txt
> @@ -36,6 +36,47 @@ Git pack format
>
>    - The trailer records 20-byte SHA-1 checksum of all of the above.
>
> +Valid object types are:
> +
> +- OBJ_COMMIT (1)
> +- OBJ_TREE (2)
> +- OBJ_BLOB (3)
> +- OBJ_TAG (4)
> +- OBJ_OFS_DELTA (6)
> +- OBJ_REF_DELTA (7)
> +
> +Type 5 is reserved for future expansion.

and type 0 as well, but that is not spelled out?

> +Deltified representation

Does this refer to OFS delta as well as REF deltas?

> is a sequence of one byte command optionally
> +followed by more data for the command. The following commands are
> +recognized:

So a Deltified representation of an object is a 6 or 7 in the 3 bit type
and then the length. Then a command is shown how to construct
the object based on other objects. Can there be more commands?

> +- If bit 7 is set, the remaining bits in the command byte specify
> +  how to extract copy offset and size to copy. The following must be
> +  evaluated in this exact order:

So there are 2 modes, and the high bit indicates which mode is used.
You start describing the more complicated mode first,
maybe give names to both of them? "direct copy" (below) and
"compressed copy with offset" ?


> +  - If bit 0 is set, the following byte contains bits 0-7 of the copy
> +    offset (this also resets all other bits in the copy offset to
> +    zero).
> +  - If bit 1 is set, the following byte contains bits 8-15 of the copy
> +    offset.
> +  - If bit 2 is set, the following byte contains bits 16-23 of the
> +    copy offset.
> +  - If bit 3 is set, the following byte contains bits 24-31 of the
> +    copy offset.

I assume these bits are exclusive, i.e. if bit 3 is set, bits 0-2 are not
allowed to be set. What happens if they are set, do we care?

If bit 3 is set, then the following byte contains 24-31 of the copy offset,
where is the rest? Do I wait for another command byte with
bits 2,1,0 to learn about the body offsets, or do they follow the
following byte? Something like:

  "If bit 3 is set, then the next 4 bytes are the copy offset,
  in network byte order"


> +  - If bit 4 is set, the following byte contains bits 0-7 of the copy
> +    size (this also resets all other bits in the copy size to zero).
> +  - If bit 5 is set, the following byte contains bits 8-15 of the copy
> +    size.
> +  - If bit 6 is set, the following byte contains bits 16-23 of the
> +    copy size.

bits 4-7 seem to be another group of mutually exclusive bits.
The same question as above:
If bit 6 is set, where are bits 0-15 of the copy size?

> +
> +  Copy size zero means 0x10000 bytes.

This is an interesting caveat. So we can only copy 1-0x10000 bytes,
and cannot express to copy 0 bytes?

> The data from source object at
> +  the given copy offset is copied back to the destination buffer.
> +
> +- If bit 7 is not set, it is the copy size in bytes. The following
> +  bytes are copied to destination buffer
> +- Command byte zero is reserved for future expansion.

Thanks,
Stefan

* Re: [PATCH] pack-format.txt: more details on pack file format
  2018-05-08 15:56       ` [PATCH] pack-format.txt: more details on pack file format Nguyễn Thái Ngọc Duy
  2018-05-08 17:23         ` Stefan Beller
@ 2018-05-08 18:21         ` Ævar Arnfjörð Bjarmason
  2018-05-08 18:24           ` Duy Nguyen
  2018-05-10 15:09         ` [PATCH v2] " Nguyễn Thái Ngọc Duy
  2 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-08 18:21 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, gitster, peff, sandals, sbeller, stolee, sunshine


On Tue, May 08 2018, Nguyễn Thái Ngọc Duy wrote:

> The current document mentions OBJ_* constants without their actual
> values. A git developer would know these are from cache.h but that's
> not very friendly to a person who wants to read this file to implement
> a pack file parser.
>
> Similarly, the deltified representation is not documented at all (the
> "document" is basically patch-delta.c). Translate that C code into
> English.

Thanks, will drop my version from v4, although a comment for the enum in
cache.h pointing the reader to these docs would be very useful.

* Re: [PATCH] pack-format.txt: more details on pack file format
  2018-05-08 17:23         ` Stefan Beller
@ 2018-05-08 18:22           ` Duy Nguyen
  2018-05-08 18:58             ` Stefan Beller
  0 siblings, 1 reply; 99+ messages in thread
From: Duy Nguyen @ 2018-05-08 18:22 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Ævar Arnfjörð Bjarmason, git, Junio C Hamano,
	Jeff King, brian m. carlson, Derrick Stolee, Eric Sunshine

On Tue, May 8, 2018 at 7:23 PM, Stefan Beller <sbeller@google.com> wrote:
>>  While at it, I also add some text about this obscure delta format.
>>  We occasionally have questions about this on the mailing list if I
>>  remember correctly.
>
> Let me see if I can understand it, as I am not well versed in the
> delta format, so ideally I would understand it from the patch here?

Well yes. I don't expect my first version to be that easy to
understand. This is where you come in to help ;-)

>> +Valid object types are:
>> +
>> +- OBJ_COMMIT (1)
>> +- OBJ_TREE (2)
>> +- OBJ_BLOB (3)
>> +- OBJ_TAG (4)
>> +- OBJ_OFS_DELTA (6)
>> +- OBJ_REF_DELTA (7)
>> +
>> +Type 5 is reserved for future expansion.
>
> and type 0 as well, but that is not spelled out?

type 0 is invalid. I think in some encoding it's not even possible to
encode zero. Anyway yes it should be spelled out.
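
As a rough sketch of how a parser might treat the values listed in the
patch, including the two that are not valid object types: the class
name PackObjectType and the helper classify_type are invented for this
example, and only the numeric values come from the list above.

```python
from enum import IntEnum


class PackObjectType(IntEnum):
    """Object type values as listed in the proposed pack-format.txt text."""
    OBJ_COMMIT = 1
    OBJ_TREE = 2
    OBJ_BLOB = 3
    OBJ_TAG = 4
    OBJ_OFS_DELTA = 6
    OBJ_REF_DELTA = 7


def classify_type(value):
    """Map a 3-bit type field to a PackObjectType, rejecting 0 and 5."""
    if not 0 <= value <= 7:
        raise ValueError("type field is only 3 bits wide")
    if value == 0:
        raise ValueError("type 0 is invalid")
    if value == 5:
        raise ValueError("type 5 is reserved for future expansion")
    return PackObjectType(value)
```

Keeping the numbers explicit makes it obvious that they cannot be
reordered without changing what existing pack files mean.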

>
>> +Deltified representation
>
> Does this refer to OFS delta as well as REF deltas?

Yes. Both OFS and REF deltas have the same "body" which is what this
part is about. The differences between OFS and REF deltas are not
described (in fact I don't think we describe what OFS and REF deltas
are at all).

>> is a sequence of one byte command optionally
>> +followed by more data for the command. The following commands are
>> +recognized:
>
> So a Deltified representation of an object is a 6 or 7 in the 3 bit type
> and then the length. Then a command is shown how to construct
> the object based on other objects. Can there be more commands?
>
>> +- If bit 7 is set, the remaining bits in the command byte specify
>> +  how to extract copy offset and size to copy. The following must be
>> +  evaluated in this exact order:
>
> So there are 2 modes, and the high bit indicates which mode is used.
> You start describing the more complicated mode first,
> maybe give names to both of them? "direct copy" (below) and
> "compressed copy with offset" ?

I started to update this more because this text is hard to get, even
for me. So let's get the background first.

We have a source object somewhere (the object name comes from ofs/ref
delta's header), basically we have the whole content. This delta
thingy tells us how to use that source object to create a new (target)
object.

The delta is actually a sequence of instructions (of variable length).
One is for copying from the source object. The other copies from the
delta itself (e.g. this is new data in the target which is not
available anywhere in the source object to copy from). The most
significant bit (bit 7) of the first byte determines what instruction
type it is.


>> +  - If bit 0 is set, the following byte contains bits 0-7 of the copy
>> +    offset (this also resets all other bits in the copy offset to
>> +    zero).
>> +  - If bit 1 is set, the following byte contains bits 8-15 of the copy
>> +    offset.
>> +  - If bit 2 is set, the following byte contains bits 16-23 of the
>> +    copy offset.
>> +  - If bit 3 is set, the following byte contains bits 24-31 of the
>> +    copy offset.
>
> I assume these bits are exclusive, i.e. if bit 3 is set, bits 0-2 are not
> allowed to be set. What happens if they are set, do we care?
>
> If bit 3 is set, then the following byte contains 24-31 of the copy offset,
> where is the rest? Do I wait for another command byte with
> bits 2,1,0 to learn about the body offsets, or do they follow the
> following byte? Something like:
>
>   "If bit 3 is set, then the next 4 bytes are the copy offset,
>   in network byte order"

My first attempt at "translating to English" is like constructing C
from assembly: it's horrible.

The instruction looks like this

        bit      0        1        2        3       4      5      6
  +----------+--------+--------+--------+--------+------+------+------+
  | 1xxxxxxx | offset | offset | offset | offset | size | size | size |
  +----------+--------+--------+--------+--------+------+------+------+

Here you can see it in its full form, each box represents a byte. The
first byte has bit 7 set as mentioned. We can see here that the offset
(where to copy from in the source object) takes 4 bytes and the size
(how many bytes to copy) takes 3. Offset and size are in LSB order.

The "xxxxxxx" part lets us shrink this down. If the offset can fit in
16 bits, there's no reason to waste the last two bytes describing
zero. Each 'x' marks whether the corresponding byte is present. The
bit number is in the first row. So if you have offset 255 and size 1,
the instruction is three bytes 10010001b, 255, 1. The octets on "bit
column" 1, 2, 3, 5 and 6 are missing because the corresponding bit in
the first byte is not set.
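
To make the layout concrete, here is a minimal sketch of decoding one
such copy instruction. It is written in Python for illustration and is
not the code in patch-delta.c; the function name decode_copy is made
up, and treating an absent byte as zero is an assumption of the sketch.

```python
def decode_copy(buf, pos=0):
    """Decode one copy instruction starting at buf[pos].

    Returns (offset, size, next_pos). Bytes whose flag bit is clear in
    the command byte are treated as zero (an assumption of this sketch).
    """
    cmd = buf[pos]
    if not cmd & 0x80:
        raise ValueError("bit 7 is not set: not a copy instruction")
    pos += 1

    offset = 0
    for i in range(4):  # bits 0-3 select the offset bytes, LSB first
        if cmd & (1 << i):
            offset |= buf[pos] << (8 * i)
            pos += 1

    size = 0
    for i in range(3):  # bits 4-6 select the size bytes, LSB first
        if cmd & (1 << (4 + i)):
            size |= buf[pos] << (8 * i)
            pos += 1

    return offset, size, pos
```

Feeding it the three bytes from the example, 10010001b, 255, 1, gives
offset 255 and size 1, with three bytes consumed.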

>> +  - If bit 4 is set, the following byte contains bits 0-7 of the copy
>> +    size (this also resets all other bits in the copy size to zero).
>> +  - If bit 5 is set, the following byte contains bits 8-15 of the copy
>> +    size.
>> +  - If bit 6 is set, the following byte contains bits 16-23 of the
>> +    copy size.
>
> bits 4-7 seem to be another group of mutually exclusive bits.
> The same question as above:
> If bit 6 is set, where are bits 0-15 of the copy size?

I think this is a corner case in this format. I think Nico meant to
specify consecutive bytes: if size is 2 bytes then you have to specify
_both_ of them even if the first byte could be zero and omitted.

The implementation detail is, if bit 6 is set but bit 4 is not, then
the size value is pretty much random. It's only when bit 4 is set that
we first clear out "size" and start adding bits to it.

>
>> +
>> +  Copy size zero means 0x10000 bytes.
>
> This is an interesting caveat. So we can only copy 1-0x10000 bytes,
> and cannot express to copy 0 bytes?

Yes. There's no point in copying nothing. And it saves space to not
specify "size" at all. I think this is meant to copy a very large part
from the source, so you just continue to copy a series of 0x10000
chunks.
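
Putting both instruction types and the size-zero rule together, a
rough sketch of replaying an instruction stream could look like the
following. It is Python for illustration only, not patch-delta.c. The
name apply_instructions is hypothetical, the input is assumed to be
just the instruction stream (anything that precedes it in a real delta
is assumed to have been stripped), and absent offset/size bytes are
assumed to be zero.

```python
def apply_instructions(source, delta):
    """Rebuild the target from `source` by replaying `delta` instructions."""
    out = bytearray()
    pos = 0
    while pos < len(delta):
        cmd = delta[pos]
        pos += 1
        if cmd & 0x80:
            # Copy from the source object: optional offset and size bytes.
            offset = 0
            size = 0
            for i in range(4):
                if cmd & (1 << i):
                    offset |= delta[pos] << (8 * i)
                    pos += 1
            for i in range(3):
                if cmd & (0x10 << i):
                    size |= delta[pos] << (8 * i)
                    pos += 1
            if size == 0:
                size = 0x10000  # copy size zero means 0x10000 bytes
            if offset + size > len(source):
                raise ValueError("copy runs past the end of the source")
            out += source[offset:offset + size]
        elif cmd:
            # Insert new data: the next `cmd` bytes come from the delta.
            if pos + cmd > len(delta):
                raise ValueError("insert runs past the end of the delta")
            out += delta[pos:pos + cmd]
            pos += cmd
        else:
            raise ValueError("command byte zero is reserved")
    return bytes(out)
```

For example, with source b"hello world", the stream "copy 5 bytes at
offset 0, insert b' big', copy 6 bytes at offset 5" rebuilds
b"hello big world".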

>
>> The data from source object at
>> +  the given copy offset is copied back to the destination buffer.
>> +
>> +- If bit 7 is not set, it is the copy size in bytes. The following
>> +  bytes are copied to destination buffer
>> +- Command byte zero is reserved for future expansion.
>
> Thanks,
> Stefan



-- 
Duy

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH] pack-format.txt: more details on pack file format
  2018-05-08 18:21         ` Ævar Arnfjörð Bjarmason
@ 2018-05-08 18:24           ` Duy Nguyen
  0 siblings, 0 replies; 99+ messages in thread
From: Duy Nguyen @ 2018-05-08 18:24 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, Junio C Hamano, Jeff King, brian m. carlson,
	Stefan Beller, Derrick Stolee, Eric Sunshine

On Tue, May 8, 2018 at 8:21 PM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Tue, May 08 2018, Nguyễn Thái Ngọc Duy wrote:
>
>> The current document mentions OBJ_* constants without their actual
>> values. A git developer would know these are from cache.h but that's
>> not very friendly to a person who wants to read this file to implement
>> a pack file parser.
>>
>> Similarly, the deltified representation is not documented at all (the
>> "document" is basically patch-delta.c). Translate that C code in
>> English.
>
> Thanks, will drop my version from v4, although a comment for the enum in
> cache.h pointing the reader to these docs would be very useful.

True. I will add some together with the pack-format.txt update.
-- 
Duy

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  2018-05-08 14:34               ` Jeff King
@ 2018-05-08 18:53                 ` Ævar Arnfjörð Bjarmason
  2018-05-09  7:56                   ` Jeff King
  0 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-08 18:53 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, git, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine


On Tue, May 08 2018, Jeff King wrote:

> On Mon, May 07, 2018 at 01:08:46PM +0900, Junio C Hamano wrote:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>
>> > Right, and I'm with you so far, this makes sense to me for all existing
>> > uses of the peel syntax, otherwise v2.17.0^{tree} wouldn't be the same
>> > as rev-parse v2.17.0^{tree}^{tree}...
>>
>> More importantly, you could spell v2.17.0 part of the above with a
>> short hexadecimal string.  And that string should be naming some
>> tree-ish, the most important thing being that it is *NOT* required
>> to be a tree (and practically, it is likely that the user has a
>> tree-ish that is *NOT* a tree).
>>
>> I guess I have a reaction to the title
>>
>>     "get_short_oid/peel_onion: ^{tree} should be tree"
>>
>> "X^{tree}" should *RESULT* in a tree, but it should *REQUIRE* X to
>> be a tree-ish.  It is unclear "should be tree" is about the former
>> and I read (perhaps mis-read) it as saying "it should require X to
>> be a tree"---that statement is utterly incorrect as we agreed above.
>
> FWIW, I had the same feeling as you when reading this, that this commit
> (and the one after) are doing the wrong thing. And these paragraphs sum
> it up. The "^{tree}" is about asking us to peel to a tree, not about
> resolving X in the first place. We can use it as a _hint_ when resolving
> X, but the correct hint is "something that can be peeled to a tree", not
> "is definitely a tree".

Maybe I'm just being dense, but I still don't get from this & Junio's
E-Mails what the general rule should be.

I think a response to the part after "leaving that aside" of my upthread
E-Mail
(https://public-inbox.org/git/87lgczyfq6.fsf@evledraar.gmail.com/) would
help me out.

Not to belabor the point, but here's a patch I came up with to
revisions.txt that's a WIP version of something that would describe the
worldview after this v3:

    diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt
    index dfcc49c72c..0bf68f4ad2 100644
    --- a/Documentation/revisions.txt
    +++ b/Documentation/revisions.txt
    @@ -143,29 +143,52 @@ thing no matter the case.
       '<rev>{caret}1{caret}1{caret}1'.  See below for an illustration of
       the usage of this form.

     '<rev>{caret}{<type>}', e.g. 'v0.99.8{caret}\{commit\}'::
       A suffix '{caret}' followed by an object type name enclosed in
       brace pair means dereference the object at '<rev>' recursively until
       an object of type '<type>' is found or the object cannot be
    -  dereferenced anymore (in which case, barf).
    +  dereferenced anymore (in which case either return that object type, or barf).
       For example, if '<rev>' is a commit-ish, '<rev>{caret}\{commit\}'
       describes the corresponding commit object.
       Similarly, if '<rev>' is a tree-ish, '<rev>{caret}\{tree\}'
       describes the corresponding tree object.
       '<rev>{caret}0'
       is a short-hand for '<rev>{caret}\{commit\}'.
     +
     'rev{caret}\{object\}' can be used to make sure 'rev' names an
     object that exists, without requiring 'rev' to be a tag, and
     without dereferencing 'rev'; because a tag is already an object,
     it does not have to be dereferenced even once to get to an object.
     +
     'rev{caret}\{tag\}' can be used to ensure that 'rev' identifies an
     existing tag object.
    ++
    +Object types that don't have a 1=1 mapping to other object types
    +cannot be dereferenced with the peel syntax, and will return an
    +error. E.g. '<treeid>{caret}{commit}' or '<treeid>{caret}{tree}' is
    +allowed because a tag can only point to one commit, and a commit can
    +only point to one tree. But '<treeid>{caret}{blob}' will always
    +produce an error since trees in general don't 1=1 map to blobs, even
    +though the specific '<treeid>' in question might only contain one
    +blob. Note that '<tagid>{caret}{blob}' is not an error if '<tagid>' is
    +a tag that points directly to a blob, since that again becomes
    +unambiguous.
    ++
    +'<rev>{caret}{<type>}' takes on a different meaning when '<rev>' is a
    +SHA-1 that's ambiguous within the object store. In that case we don't
    +have a 1=1 mapping anymore. E.g. e8f2 in git.git can refer to multiple
    +objects of all the different object types. In that case
    +{caret}{<type>} should always be an error to be consistent with the
    +logic above, but that wouldn't be useful to anybody. Instead it'll
    +fall back to being selector syntax for the given object types,
    +e.g. e8f2{caret}{tag} will (as of writing this) return the v2.17.0
    +tag, and {caret}{commit}, {caret}{tree} and {caret}{blob} will return
    +commit, tree and blob objects, respectively.
    +
    [...]

My understanding of what you two are saying is that somehow the peel
semantics should be preserved when we take this beyond the 1=1 mapping
case, but I don't see how, if we run with that, we wouldn't need to
introduce the concept of blobish for consistency, as I noted upthread.

So it would be very useful to me if you or someone who understands the
behavior you & Junio seem to want could write a version of the patch I
have above where the last paragraph is different, and describes the
desired semantics, because I still don't get it. Why would we 1=many
peel commits to trees as a special case, but not 1=many do the same for
trees & blobs?

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH] pack-format.txt: more details on pack file format
  2018-05-08 18:22           ` Duy Nguyen
@ 2018-05-08 18:58             ` Stefan Beller
  0 siblings, 0 replies; 99+ messages in thread
From: Stefan Beller @ 2018-05-08 18:58 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Ævar Arnfjörð Bjarmason, git, Junio C Hamano,
	Jeff King, brian m. carlson, Derrick Stolee, Eric Sunshine

>>> +Deltified representation
>>
>> Does this refer to OFS delta as well as REF deltas?
>
> Yes. Both OFS and REF deltas have the same "body" which is what this
> part is about. The differences between OFS and REF deltas are not
> described (in fact I don't think we describe what OFS and REF deltas
> are at all).

Maybe we should?

>
>>> is a sequence of one byte command optionally
>>> +followed by more data for the command. The following commands are
>>> +recognized:
>>
>> So a Deltified representation of an object is a 6 or 7 in the 3 bit type
>> and then the length. Then a command is shown how to construct
>> the object based on other objects. Can there be more commands?
>>
>>> +- If bit 7 is set, the remaining bits in the command byte specifies
>>> +  how to extract copy offset and size to copy. The following must be
>>> +  evaluated in this exact order:
>>
>> So there are 2 modes, and the high bit indicates which mode is used.
>> You start describing the more complicated mode first,
>> maybe give names to both of them? "direct copy" (below) and
>> "compressed copy with offset" ?
>
> I started to update this more because even this text is hard to get
> even to me. So let's get the background first.
>
> We have a source object somewhere (the object name comes from ofs/ref
> delta's header), basically we have the whole content. This delta
> thingy tells us how to use that source object to create a new (target)
> object.
>
> The delta is actually a sequence of instructions (of variable length).

The previous paragraph and this sentence are great for my understanding.
thanks! (Maybe keep it in a similar form around?)

> One is for copying from the source object.

ok that makes sense. I can think of it as a "HTTP range request", just
optimized for packfiles and the source is inside the same pack.
So it would say "Goto object <sha1> and copy bytes 13-168 here"

> The other copies from the
> delta itself

itself means the same object here, that we are describing here?
or does it mean other deltas?

> (e.g. this is new data in the target which is not
> available anywhere in the source object to copy from).




>
> The instruction looks like this
>
>         bit      0        1        2        3       4      5      6
>   +----------+--------+--------+--------+--------+------+------+------+
>   | 1xxxxxxx | offset | offset | offset | offset | size | size | size |
>   +----------+--------+--------+--------+--------+------+------+------+
>
> Here you can see it in its full form, each box represents a byte. The
> first byte has bit 7 set as mentioned. We can see here that offsets
> (where to copy from in the source object) takes 4 bytes and size (how
> many bytes to copy) takes 3. Offset size size is in LSB order.
>
> The "xxxxxxx" part lets us shrink this down.

.. by indicating how much prefix we can skip and assume it be all zero(?)

> If the offset can fit in
> 16 bits, there's no reason to waste the last two bytes describing
> zero. Each 'x' marks whether the corresponding byte is present.

So for a full instruction (as above), we'd have

1 1111 111 <4 bytes offset> <3 bytes size>

for smaller instructions we have

1 1100 100 <2 bytes offset> <1 byte size>
and here the offset is in range 0..64k and
the size is 1-255 or 0x10000 ?


Modes to skip bytes in between are not allowed, e.g.
1 1101 101 < 3 bytes of offsets> <2 bytes of size>
and the missing bytes would be assumed to be 0?

> The
> bit number is in the first row. So if you have offset 255 and size 1,
> the instruction is three bytes 10010001b, 255,

Oh it is the other way round, the size will be just one byte,
indicating we can have a range of 1-255 or 0x10000 and an
offset of 0..255.

>
> I think this is a corner case in this format. I think Nico meant to
> specify consecutive bytes: if size is 2 bytes then you have to specify
> _both_ of them even if the first byte could be zero and omitted.

So it is not a mutually exclusive group, but a sequence (similar as in
git-bisect), where we start with 0 and end with exactly one edge
in between (sort of, we can also start with 1, then we have to have
all 1s)

> The implementation detail is, if bit 6 is set but bit 4 is not, then
> the size value is pretty much random. It's only when bit 4 is set that
> we first clear out "size" and start adding bits to it.

That sounds similar to what I spelled out above.

Thanks for taking on the documentation here.
The box with numbers really helped me!

Stefan

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  2018-05-08 18:53                 ` Ævar Arnfjörð Bjarmason
@ 2018-05-09  7:56                   ` Jeff King
  2018-05-09 10:48                     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 99+ messages in thread
From: Jeff King @ 2018-05-09  7:56 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, git, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine

On Tue, May 08, 2018 at 08:53:10PM +0200, Ævar Arnfjörð Bjarmason wrote:

> >> "X^{tree}" should *RESULT* in a tree, but it should *REQUIRE* X to
> >> be a tree-ish.  It is unclear "should be tree" is about the former
> >> and I read (perhaps mis-read) it as saying "it should require X to
> >> be a tree"---that statement is utterly incorrect as we agreed above.
> >
> > FWIW, I had the same feeling as you when reading this, that this commit
> > (and the one after) are doing the wrong thing. And these paragraphs sum
> > it up. The "^{tree}" is about asking us to peel to a tree, not about
> > resolving X in the first place. We can use it as a _hint_ when resolving
> > X, but the correct hint is "something that can be peeled to a tree", not
> > "is definitely a tree".
> 
> Maybe I'm just being dense, but I still don't get from this & Junio's
> E-Mails what the general rule should be.

Let me try to lay out my thinking a bit more clearly, and then I'll try
to respond to the points you laid out below.

Before we had any disambiguation code, resolving X^{tree} really was two
independent steps: resolve X, and then peel it to a tree. When we added
the disambiguation code, the goal was to provide a hint to the first
step in such a way that we could never eliminate any resolutions that
the user _might_ have meant. But it's OK to take a situation where every
case but one would result in an error, and assume the user meant that
case. Sort of a "do no harm" rule.

By disambiguating with just a tree and not a tree-ish, that hint is now
eliminating possibilities that would have worked in the second step,
which violates the rule.

Does thinking about it that way make more sense?

> I think a response to the part after "leaving that aside" of my upthread
> E-Mail
> (https://public-inbox.org/git/87lgczyfq6.fsf@evledraar.gmail.com/) would
> help me out.

I'll quote that bit here:

> But *leaving that aside*, i.e. I don't see why the use-case would make
> sense. What I *don't* get is why, if you think that, you only want to
> apply that rule to ^{tree}. I.e. wouldn't it then be consistent to say:
> 
>     # a)
>     ^{tag}    = tag
>     ^{commit} = tag, commit
>     ^{tree}   = tag, commit, tree
>     ^{blob}   = tag, blob (blobish)

Yes, that makes sense to me conceptually, and would follow the rule I
gave above. And I think that's what we do now, with the exception that
there is no blobish disambiguation. Presumably nobody ever bothered
because tagged blobs are pretty rare (and obviously
though trees point to blobs, you cannot disambiguate that way since
there's no one-to-one correspondence).
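
The table in a) can be written down as a small predicate. This is only
a sketch based on the object type alone; the enum and the function name
may_peel_to are invented for illustration and are not git's
GET_OID_* machinery:

```c
/* Hypothetical object kinds, deliberately not git's OBJ_* constants. */
enum kind { KIND_COMMIT, KIND_TREE, KIND_BLOB, KIND_TAG };

/*
 * Could an object of kind "have" possibly peel to kind "want"?
 * Judged by type only: a tag may point at anything, a commit points at
 * exactly one tree, and trees do not map one-to-one onto blobs.
 */
static int may_peel_to(enum kind have, enum kind want)
{
	if (have == want)
		return 1;
	if (have == KIND_TAG)
		return 1;
	if (have == KIND_COMMIT && want == KIND_TREE)
		return 1;
	return 0;
}
```

Under this rule a blob is never a candidate for ^{tree}, while a tag is
a candidate for everything.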

So I doubt anybody really cares in practice, but I agree that it would
improve consistency to write a patch to introduce GET_OID_BLOBISH and
have "^{blob}" parsing use it.  And possibly add "blobish" to
core.disambiguate (or is it "blobbish"?), though that's almost certainly
something nobody would ever use.

> My understanding of what you two are saying is that somehow the peel
> semantics should be preserved when we take this beyond the 1=1 mapping
> case, but I don't see how if we run with that how we wouldn't need to
> introduce the concept of blobish for consistency as I noted upthread.

Yeah, I think the lack of blobish is a bug, just one that nobody has
ever really cared about.

> So it would be very useful to me if you or someone who understands the
> behavior you & Junio seem to want could write a version of the patch I
> have above where the last paragraph is different, and describes the
> desired semantics, because I still don't get it. Why would we 1=many
> peel commits to trees as a special case, but not 1=many do the same for
> trees & blobs?

I'm not sure I understand the mention of trees in the final sentence.
AFAICT tree disambiguation is consistent with the peeling rules.

-Peff

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  2018-05-09  7:56                   ` Jeff King
@ 2018-05-09 10:48                     ` Ævar Arnfjörð Bjarmason
  2018-05-10  4:21                       ` Junio C Hamano
  0 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-09 10:48 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, git, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine


On Wed, May 09 2018, Jeff King wrote:

> On Tue, May 08, 2018 at 08:53:10PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> >> "X^{tree}" should *RESULT* in a tree, but it should *REQUIRE* X to
>> >> be a tree-ish.  It is unclear "should be tree" is about the former
>> >> and I read (perhaps mis-read) it as saying "it should require X to
>> >> be a tree"---that statement is utterly incorrect as we agreed above.
>> >
>> > FWIW, I had the same feeling as you when reading this, that this commit
>> > (and the one after) are doing the wrong thing. And these paragraphs sum
>> > it up. The "^{tree}" is about asking us to peel to a tree, not about
>> > resolving X in the first place. We can use it as a _hint_ when resolving
>> > X, but the correct hint is "something that can be peeled to a tree", not
>> > "is definitely a tree".
>>
>> Maybe I'm just being dense, but I still don't get from this & Junio's
>> E-Mails what the general rule should be.
>
> Let me try to lay out my thinking a bit more clearly, and then I'll try
> to respond to the points you laid out below.
>
> Before we had any disambiguation code, resolving X^{tree} really was two
> independent steps: resolve X, and then peel it to a tree. When we added
> the disambiguation code, the goal was to provide a hint to the first
> step in such a way that we could never eliminate any resolutions that
> the user _might_ have meant. But it's OK to take a situation where every
> case but one would result in an error, and assume the user meant that
> case. Sort of a "do no harm" rule.
>
> By disambiguating with just a tree and not a tree-ish, that hint is now
> eliminating possibilities that would have worked in the second step,
> which violates the rule.
>
> Does thinking about it that way make more sense?

Okay, so to rephrase that to make sure I understand it. It would be
documented as something like this:

    When the short SHA-1 X is ambiguous, X^{<type>} doesn't mean do the
    peel itself on X in any way; rather, it means list all those objects
    matching X where a subsequent X^{<type>} wouldn't be an error.

    I.e. X^{commit} will list tags and commits, since both can be peeled
    to reveal a commit, X^{tree} will similarly list tags, commits and
    trees, and X^{blob} will list tags and blobs[1], and X^{tag} will
    only list tags.

    But core.disambiguate=[tag|commit|tree|blob] is not at all like
    ^{[tag|commit|tree|blob]} and, unlike the peel syntax, is only going
    to list the objects of the respective type. The config synonyms for
    the peel syntax are committish, treeish, and the nonexistent blobish.

>> I think a response to the part after "leaving that aside" of my upthread
>> E-Mail
>> (https://public-inbox.org/git/87lgczyfq6.fsf@evledraar.gmail.com/) would
>> help me out.
>
> I'll quote that bit here:
>
>> But *leaving that aside*, i.e. I don't see why the use-case would make
>> sense. What I *don't* get is why, if you think that, you only want to
>> apply that rule to ^{tree}. I.e. wouldn't it then be consistent to say:
>>
>>     # a)
>>     ^{tag}    = tag
>>     ^{commit} = tag, commit
>>     ^{tree}   = tag, commit, tree
>>     ^{blob}   = tag, blob (blobish)
>
> Yes, that makes sense to me conceptually, and would follow the rule I
> gave above. And I think that's what we do now, with the exception that
> there is no blobish disambiguation. Presumably nobody ever bothered
> because probably because tagged blobs are pretty rare (and obviously
> though trees point to blobs, you cannot disambiguate that way since
> there's no one-to-one correspondence).
>
> So I doubt anybody really cares in practice, but I agree that it would
> improve consistency to write a patch to introduce GET_OID_BLOBISH and
> have "^{blob}" parsing use it.  And possibly add "blobish" to
> core.disambiguate (or is it "blobbish"?), though that's almost certainly
> something nobody would ever use.

Yeah, I'll introduce it for consistency. To clarify I wasn't trying to
make some argument on the basis that we didn't have it, but I was
confused because I couldn't see how the general rule would apply to
^{tree} and not ^{blob}.

>> My understanding of what you two are saying is that somehow the peel
>> semantics should be preserved when we take this beyond the 1=1 mapping
>> case, but I don't see how if we run with that how we wouldn't need to
>> introduce the concept of blobish for consistency as I noted upthread.
>
> Yeah, I think the lack of blobish is a bug, just one that nobody has
> ever really cared about.
>
>> So it would be very useful to me if you or someone who understands the
>> behavior you & Junio seem to want could write a version of the patch I
>> have above where the last paragraph is different, and describes the
>> desired semantics, because I still don't get it. Why would we 1=many
>> peel commits to trees as a special case, but not 1=many do the same for
>> trees & blobs?
>
> I'm not sure I understand the mention of trees in the final sentence.
> AFAICT tree disambiguation is consistent with the peeling rules.

Yeah, never mind that, I was imagining some semantics where because we
dropped the 1=1 mapping ^{tree} would list blobs, but in the worldview
you describe above (if I got it right) that doesn't make sense.

1. Not currently, but I should amend my ^{blob} patch to work like that.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  2018-05-09 10:48                     ` Ævar Arnfjörð Bjarmason
@ 2018-05-10  4:21                       ` Junio C Hamano
  2018-05-10  6:50                         ` Jeff King
  0 siblings, 1 reply; 99+ messages in thread
From: Junio C Hamano @ 2018-05-10  4:21 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff King, git, brian m . carlson, Derrick Stolee, Stefan Beller,
	Eric Sunshine

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> Before we had any disambiguation code, resolving X^{tree} really was two
>> independent steps: resolve X, and then peel it to a tree. When we added
>> the disambiguation code, the goal was to provide a hint to the first
>> step in such a way that we could never eliminate any resolutions that
>> the user _might_ have meant. But it's OK to take a situation where every
>> case but one would result in an error, and assume the user meant that
>> case. Sort of a "do no harm" rule.
>>
>> By disambiguating with just a tree and not a tree-ish, that hint is now
>> eliminating possibilities that would have worked in the second step,
>> which violates the rule.
>>
>> Does thinking about it that way make more sense?
>
> Okey, so to rephrase that to make sure I understand it. It would be
> documented as something like this:
>
>     When the short SHA-1 X is ambiguous X^{<type>} doesn't mean do the
>     peel itself in X any way, rather it means list all those objects
>     matching X where a subsequent X^{<type>} wouldn't be an error.

With the understanding that this paragraph is written primarily for
your own enlightenment, I wouldn't complain too much, but if you
meant this to become part of end-user documentation, I have a strong
issue with the verb "list" used here.

X^{<type>} never means to "list" anything (FWIW just X does not mean
to "list" anything, either).  It just means that the user wants to
specify a single object whose object name is X^{<type>}, i.e. the
user expects that X names a single object, that can be peeled to
<type>.

Now, it is an error when (1) X does not specify a single object in
the first place, or (2) the single object cannot be peeled to <type>.

When diagnosing such an error, we would give hints.  The hint would
show possible objects that the user could have meant with X.  The
^{<type>} suffix given to it may be used to limit the hints to
subset of the objects that the user could have meant with X;
e.g. when there is an object of each of type blob, tree, commit, and
tag, whose name begins with 7777, the short and ambiguous prefix
7777 could mean any of these four objects, but when given with
suffix, e.g. 7777^{tree}, it is useless for the hint to include
the blob object, as it can never peel down to a tree object.

If the tag whose name begins with 7777 in this example points
directly to a blob, excluding that tag from the hint would make the
hint more useful.  I do not offhand know what the code does right
now.  I wouldn't call it a bug if such a tag is included in the
hint, but if a change stops such a tag from getting included, I
would call such a change an improvement.

>     I.e. X^{commit} will list tags and commits, since both can be peeled
>     to reveal a commit, X^{tree} will similarly list tags, commits and
>     trees, and X^{blob} will list tags and blobs[1], and X^{tag} will
>     only list tags.

Modulo the use of "list", which I have trouble with as it makes it
sound as if listing something is the purpose of the notation, I
think we are on the same page up to this point.

I think the best way to explain core.disambiguate to readers is to
make them rethink what I meant with "the user expects that X names a
single object" in the early part of the above response.

Without constraint, Git understood that the user used 7777 to name
any one of the objects of all four types.  With core.disambiguate,
the user can tell Git "when I give potentially ambiguous object name
with a short prefix, assume that only a commit or a tag whose name
begins with the prefix is what I expected the short prefix to name
uniquely", so Git understood that the user wanted to name either a
commit or a tag.  It would still trigger an error as it does not
uniquely name an object (for which an attempt to apply the ^{tree}
peeling would further be made).


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
  2018-05-10  4:21                       ` Junio C Hamano
@ 2018-05-10  6:50                         ` Jeff King
  0 siblings, 0 replies; 99+ messages in thread
From: Jeff King @ 2018-05-10  6:50 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, git, brian m . carlson,
	Derrick Stolee, Stefan Beller, Eric Sunshine

On Thu, May 10, 2018 at 01:21:19PM +0900, Junio C Hamano wrote:

> When diagnosing such an error, we would give hints.  The hint would
> show possible objects that the user could have meant with X.  The
> ^{<type>} suffix given to it may be used to limit the hints to
> subset of the objects that the user could have meant with X;
> e.g. when there is an object of each of type blob, tree, commit, and
> tag, whose name begins with 7777, the short and ambiguous prefix
> 7777 could mean any of these four objects, but when given with
> suffix, e.g. 7777^{tree}, it makes useless for the hint to include
> the blob object, as it can never peel down to a tree object.
> 
> If the tag whose name begins with 7777 in this example points
> directly to a blob, excluding that tag from the hint would make the
> hint more useful.  I do not offhand know what the code does right
> now.  I wouldn't call it a bug if such a tag is included in the
> hint, but if a change stops such a tag from getting included, I
> would call such a change an improvement.

I actually wondered this while writing an earlier response, and so I
happen to know: when we are looking for a treeish, the disambiguator
will actually peel a candidate tag and only accept one that peels to a
tree or commit. So we would omit the tag-to-blob entirely from
consideration (both as a candidate for ambiguity, and in the hint list).
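
A minimal sketch of that refinement, assuming each candidate carries
both its own type and the type it finally peels to. The struct and
function names are hypothetical and this is not the code in
sha1-name.c:

```c
#include <stddef.h>

/* Hypothetical object kinds, deliberately not git's OBJ_* constants. */
enum kind { KIND_COMMIT, KIND_TREE, KIND_BLOB, KIND_TAG };

struct candidate {
	enum kind type;		/* type of the object itself */
	enum kind peeled;	/* for a tag: what it ultimately points at */
};

/* A candidate is tree-ish if it, or what its tag peels to, is a commit or tree. */
static int is_treeish(const struct candidate *c)
{
	enum kind k = (c->type == KIND_TAG) ? c->peeled : c->type;

	return k == KIND_COMMIT || k == KIND_TREE;
}

/* Count the candidates that survive the tree-ish hint. */
static size_t count_treeish(const struct candidate *c, size_t n)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (is_treeish(&c[i]))
			hits++;
	return hits;
}
```

If exactly one candidate survives, the short name can be resolved;
otherwise only the survivors need to be shown in the hint.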

-Peff

^ permalink raw reply	[flat|nested] 99+ messages in thread

* [PATCH v4 0/6] get_short_oid UI improvements
  2018-05-01 18:40   ` [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish Ævar Arnfjörð Bjarmason
  2018-05-03  5:28     ` Junio C Hamano
@ 2018-05-10 12:42     ` Ævar Arnfjörð Bjarmason
  2018-05-10 16:04       ` Jeff King
  2018-05-10 12:42     ` [PATCH v4 1/6] sha1-name.c: remove stray newline Ævar Arnfjörð Bjarmason
                       ` (5 subsequent siblings)
  7 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-10 12:42 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine, Duy Nguyen,
	Ævar Arnfjörð Bjarmason

This is like v3 except that all the patches to the peel syntax & docs,
which were controversial, have been dropped.

I think it's worthwhile to re-work that, but I don't have time for
that now, so I'm submitting this. Maybe I'll have time in the future
to re-work the rest, but then I can base it on top of this.

Ævar Arnfjörð Bjarmason (6):
  sha1-name.c: remove stray newline
  sha1-array.h: align function arguments
  git-p4: change "commitish" typo to "committish"
  sha1-name.c: move around the collect_ambiguous() function
  get_short_oid: sort ambiguous objects by type, then SHA-1
  get_short_oid: document & warn if we ignore the type selector

 Documentation/technical/api-oid-array.txt | 17 ++++---
 git-p4.py                                 |  6 +--
 sha1-array.c                              | 21 +++++++-
 sha1-array.h                              |  7 ++-
 sha1-name.c                               | 61 +++++++++++++++++++----
 t/t1512-rev-parse-disambiguation.sh       | 26 +++++++++-
 6 files changed, 115 insertions(+), 23 deletions(-)

-- 
2.17.0.410.g4ac3413cc8


^ permalink raw reply	[flat|nested] 99+ messages in thread

* [PATCH v4 1/6] sha1-name.c: remove stray newline
  2018-05-01 18:40   ` [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish Ævar Arnfjörð Bjarmason
  2018-05-03  5:28     ` Junio C Hamano
  2018-05-10 12:42     ` [PATCH v4 0/6] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
@ 2018-05-10 12:42     ` Ævar Arnfjörð Bjarmason
  2018-05-10 12:42     ` [PATCH v4 2/6] sha1-array.h: align function arguments Ævar Arnfjörð Bjarmason
                       ` (4 subsequent siblings)
  7 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-10 12:42 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine, Duy Nguyen,
	Ævar Arnfjörð Bjarmason

This stray newline was accidentally introduced in
d2b7d9c7ed ("sha1_name: convert disambiguate_hint_fn to take
object_id", 2017-03-26).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-name.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/sha1-name.c b/sha1-name.c
index 5b93bf8da3..cd3b133aae 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -346,7 +346,6 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	struct strbuf desc = STRBUF_INIT;
 	int type;
 
-
 	if (ds->fn && !ds->fn(oid, ds->cb_data))
 		return 0;
 
-- 
2.17.0.410.g4ac3413cc8



* [PATCH v4 2/6] sha1-array.h: align function arguments
  2018-05-01 18:40   ` [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish Ævar Arnfjörð Bjarmason
                       ` (2 preceding siblings ...)
  2018-05-10 12:42     ` [PATCH v4 1/6] sha1-name.c: remove stray newline Ævar Arnfjörð Bjarmason
@ 2018-05-10 12:42     ` Ævar Arnfjörð Bjarmason
  2018-05-10 15:06       ` Jeff King
  2018-05-10 12:43     ` [PATCH v4 3/6] git-p4: change "commitish" typo to "committish" Ævar Arnfjörð Bjarmason
                       ` (3 subsequent siblings)
  7 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-10 12:42 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine, Duy Nguyen,
	Ævar Arnfjörð Bjarmason

The arguments weren't lined up with the opening parenthesis. Fixes up
code added in aae0caf19e ("sha1-array.h: align function arguments",
2018-04-30).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 sha1-array.c | 4 ++--
 sha1-array.h | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/sha1-array.c b/sha1-array.c
index 838b3bf847..466a926aa3 100644
--- a/sha1-array.c
+++ b/sha1-array.c
@@ -42,8 +42,8 @@ void oid_array_clear(struct oid_array *array)
 }
 
 int oid_array_for_each_unique(struct oid_array *array,
-				for_each_oid_fn fn,
-				void *data)
+			      for_each_oid_fn fn,
+			      void *data)
 {
 	int i;
 
diff --git a/sha1-array.h b/sha1-array.h
index 04b0756334..1e1d24b009 100644
--- a/sha1-array.h
+++ b/sha1-array.h
@@ -17,7 +17,7 @@ void oid_array_clear(struct oid_array *array);
 typedef int (*for_each_oid_fn)(const struct object_id *oid,
 			       void *data);
 int oid_array_for_each_unique(struct oid_array *array,
-			       for_each_oid_fn fn,
-			       void *data);
+			      for_each_oid_fn fn,
+			      void *data);
 
 #endif /* SHA1_ARRAY_H */
-- 
2.17.0.410.g4ac3413cc8



* [PATCH v4 3/6] git-p4: change "commitish" typo to "committish"
  2018-05-01 18:40   ` [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish Ævar Arnfjörð Bjarmason
                       ` (3 preceding siblings ...)
  2018-05-10 12:42     ` [PATCH v4 2/6] sha1-array.h: align function arguments Ævar Arnfjörð Bjarmason
@ 2018-05-10 12:43     ` Ævar Arnfjörð Bjarmason
  2018-05-10 15:00       ` Luke Diamand
  2018-05-10 12:43     ` [PATCH v4 4/6] sha1-name.c: move around the collect_ambiguous() function Ævar Arnfjörð Bjarmason
                       ` (2 subsequent siblings)
  7 siblings, 1 reply; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-10 12:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine, Duy Nguyen,
	Ævar Arnfjörð Bjarmason

This was the only occurrence of "commitish" in the tree, but as the
log will reveal we've had others in the past. Fixes up code added in
00ad6e3182 ("git-p4: work with a detached head", 2015-11-21).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 git-p4.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 7bb9cadc69..1afa87cd9d 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2099,11 +2099,11 @@ class P4Submit(Command, P4UserMap):
 
         commits = []
         if self.master:
-            commitish = self.master
+            committish = self.master
         else:
-            commitish = 'HEAD'
+            committish = 'HEAD'
 
-        for line in read_pipe_lines(["git", "rev-list", "--no-merges", "%s..%s" % (self.origin, commitish)]):
+        for line in read_pipe_lines(["git", "rev-list", "--no-merges", "%s..%s" % (self.origin, committish)]):
             commits.append(line.strip())
         commits.reverse()
 
-- 
2.17.0.410.g4ac3413cc8



* [PATCH v4 4/6] sha1-name.c: move around the collect_ambiguous() function
  2018-05-01 18:40   ` [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish Ævar Arnfjörð Bjarmason
                       ` (4 preceding siblings ...)
  2018-05-10 12:43     ` [PATCH v4 3/6] git-p4: change "commitish" typo to "committish" Ævar Arnfjörð Bjarmason
@ 2018-05-10 12:43     ` Ævar Arnfjörð Bjarmason
  2018-05-10 12:43     ` [PATCH v4 5/6] get_short_oid: sort ambiguous objects by type, then SHA-1 Ævar Arnfjörð Bjarmason
  2018-05-10 12:43     ` [PATCH v4 6/6] get_short_oid: document & warn if we ignore the type selector Ævar Arnfjörð Bjarmason
  7 siblings, 0 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-10 12:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine, Duy Nguyen,
	Ævar Arnfjörð Bjarmason

A subsequent change will make use of this static function in the
get_short_oid() function, which is defined above where the
collect_ambiguous() function is now. Without this we'd then have a
compilation error due to the missing forward declaration.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
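Not part of the patch, just an illustration of the alternative that
was not taken: instead of moving the definition, a forward declaration
would also have satisfied the compiler. The sketch below is standalone
and does not use git's code; struct int_list, collect() and
collect_all() are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define LIST_MAX 8

/* Hypothetical stand-in for an oid_array: a tiny fixed-size list. */
struct int_list {
	int items[LIST_MAX];
	size_t nr;
};

/* Forward declaration: lets collect_all() call collect() before its body. */
static int collect(int value, void *data);

/* Caller defined above the callback, like get_short_oid() in the patch. */
size_t collect_all(const int *values, size_t nr, struct int_list *out)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (collect(values[i], out))
			break;
	return out->nr;
}

/* Callback in the style of collect_ambiguous(): append and return 0. */
static int collect(int value, void *data)
{
	struct int_list *list = data;

	assert(list->nr < LIST_MAX);
	list->items[list->nr++] = value;
	return 0;
}
```

Moving the function, as done here, avoids carrying the extra
declaration line for a small static helper.
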
 sha1-name.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index cd3b133aae..9d7bbd3e96 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -372,6 +372,12 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	return 0;
 }
 
+static int collect_ambiguous(const struct object_id *oid, void *data)
+{
+	oid_array_append(data, oid);
+	return 0;
+}
+
 static int get_short_oid(const char *name, int len, struct object_id *oid,
 			  unsigned flags)
 {
@@ -421,12 +427,6 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 	return status;
 }
 
-static int collect_ambiguous(const struct object_id *oid, void *data)
-{
-	oid_array_append(data, oid);
-	return 0;
-}
-
 int for_each_abbrev(const char *prefix, each_abbrev_fn fn, void *cb_data)
 {
 	struct oid_array collect = OID_ARRAY_INIT;
-- 
2.17.0.410.g4ac3413cc8



* [PATCH v4 5/6] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-01 18:40   ` [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish Ævar Arnfjörð Bjarmason
                       ` (5 preceding siblings ...)
  2018-05-10 12:43     ` [PATCH v4 4/6] sha1-name.c: move around the collect_ambiguous() function Ævar Arnfjörð Bjarmason
@ 2018-05-10 12:43     ` Ævar Arnfjörð Bjarmason
  2018-05-10 15:22       ` Jeff King
  2018-05-11  5:36       ` Junio C Hamano
  2018-05-10 12:43     ` [PATCH v4 6/6] get_short_oid: document & warn if we ignore the type selector Ævar Arnfjörð Bjarmason
  7 siblings, 2 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-10 12:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine, Duy Nguyen,
	Ævar Arnfjörð Bjarmason

Change the output emitted when an ambiguous object is encountered so
that we show tags first, then commits, followed by trees, and finally
blobs. Within each type we show objects in hashcmp() order. Before
this change the objects were only ordered by hashcmp().

The reason for doing this is that the output looks better as a result.
E.g. at the v2.17.0 tag, "git show e8f2" would display the following
before this change:

    hint: The candidates are:
    hint:   e8f2093055 tree
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f25a3a50 tree
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f2867228 blob
    hint:   e8f28d537c tree
    hint:   e8f2a35526 blob
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2cf6ec0 tree

Now we'll instead show:

    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob

Since we show the commit data in the output that's nicely aligned once
we sort by object type. The decision to show tags before commits is
pretty arbitrary. I don't want to order by object_type since there
tags come last after blobs, which doesn't make sense if we want to
show the most important things first.

I could display them after commits, but it's much less likely that
we'll display a tag, so if there is one it makes sense to show it
prominently at the top.

A note on the implementation: Derrick rightly pointed out[1] that
we're bending over backwards here in get_short_oid() to first
de-duplicate the list, and then emit it, but could simply do it in one
step.

The reason for that is that oid_array_for_each_unique() doesn't
actually require that the array be sorted by oid_array_sort(), it just
needs to be sorted in some order that guarantees that all objects with
the same ID are adjacent to one another, which (barring a hash
collision, which'll be someone else's problem) the sort_ambiguous()
function does.

I agree that would be simpler for this code, and had forgotten why I
initially wrote it like this[2]. But on further reflection I think
it's better to do more work here just so we're not underhandedly using
the oid-array API where we lie about the list being sorted. That would
break any subsequent use of oid_array_lookup() in subtle ways.

I could get around that by hacking the API itself to support this
use-case and documenting it, which I did as a WIP patch in [3], but I
think it's too much code smell just for this one call site. It's
simpler for the API to just introduce an oid_array_for_each() function
to eagerly spew out the list without sorting or de-duplication, and
then do the de-duplication and sorting in two passes.

1. https://public-inbox.org/git/20180501130318.58251-1-dstolee@microsoft.com/
2. https://public-inbox.org/git/876047ze9v.fsf@evledraar.gmail.com/
3. https://public-inbox.org/git/874ljrzctc.fsf@evledraar.gmail.com/

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
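For readers who want to play with the ordering outside of git, here is
a rough standalone sketch of the "type first, then hash" comparison
described above. It is an illustration under assumptions, not the code
in the diff below: struct candidate, type_rank(), cmp_candidates() and
sort_candidates() are hypothetical names, and abbreviated hex strings
stand in for real object IDs. Only the numbering of the four types
(commit=1, tree=2, blob=3, tag=4) is taken from the object_type enum.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same numbering as the four real types in git's object_type enum. */
enum obj_type { T_COMMIT = 1, T_TREE = 2, T_BLOB = 3, T_TAG = 4 };

/* Hypothetical stand-in for an object: type plus 10-char hex prefix. */
struct candidate {
	enum obj_type type;
	char hex[11];
};

/* The modulus trick: tag -> 0, commit -> 1, tree -> 2, blob -> 3. */
int type_rank(enum obj_type type)
{
	assert(type >= T_COMMIT && type <= T_TAG);
	return type % 4;
}

/* qsort() comparator: by type rank first, then by hex within a type. */
int cmp_candidates(const void *a_, const void *b_)
{
	const struct candidate *a = a_;
	const struct candidate *b = b_;
	int a_rank = type_rank(a->type);
	int b_rank = type_rank(b->type);

	if (a_rank == b_rank)
		return strcmp(a->hex, b->hex);
	return a_rank > b_rank ? 1 : -1;
}

void sort_candidates(struct candidate *list, size_t nr)
{
	qsort(list, nr, sizeof(*list), cmp_candidates);
}
```

For equal-length lowercase hex strings, strcmp() gives the same order
as comparing the raw hash bytes, which is why it can stand in for
oidcmp() in this sketch.
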
 Documentation/technical/api-oid-array.txt | 17 +++++++----
 sha1-array.c                              | 17 +++++++++++
 sha1-array.h                              |  3 ++
 sha1-name.c                               | 37 ++++++++++++++++++++++-
 t/t1512-rev-parse-disambiguation.sh       | 21 +++++++++++++
 5 files changed, 88 insertions(+), 7 deletions(-)

diff --git a/Documentation/technical/api-oid-array.txt b/Documentation/technical/api-oid-array.txt
index b0c11f868d..94b529722c 100644
--- a/Documentation/technical/api-oid-array.txt
+++ b/Documentation/technical/api-oid-array.txt
@@ -35,13 +35,18 @@ Functions
 	Free all memory associated with the array and return it to the
 	initial, empty state.
 
+`oid_array_for_each`::
+	Iterate over each element of the list, executing the callback
+	function for each one. Does not sort the list, so any custom
+	hash order is retained. If the callback returns a non-zero
+	value, the iteration ends immediately and the callback's
+	return is propagated; otherwise, 0 is returned.
+
 `oid_array_for_each_unique`::
-	Efficiently iterate over each unique element of the list,
-	executing the callback function for each one. If the array is
-	not sorted, this function has the side effect of sorting it. If
-	the callback returns a non-zero value, the iteration ends
-	immediately and the callback's return is propagated; otherwise,
-	0 is returned.
+	Iterate over each unique element of the list in sort order ,
+	but otherwise behaves like `oid_array_for_each`. If the array
+	is not sorted, this function has the side effect of sorting
+	it.
 
 Examples
 --------
diff --git a/sha1-array.c b/sha1-array.c
index 466a926aa3..265941fbf4 100644
--- a/sha1-array.c
+++ b/sha1-array.c
@@ -41,6 +41,23 @@ void oid_array_clear(struct oid_array *array)
 	array->sorted = 0;
 }
 
+
+int oid_array_for_each(struct oid_array *array,
+		       for_each_oid_fn fn,
+		       void *data)
+{
+	int i;
+
+	/* No oid_array_sort() here! See the api-oid-array.txt docs! */
+
+	for (i = 0; i < array->nr; i++) {
+		int ret = fn(array->oid + i, data);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
 int oid_array_for_each_unique(struct oid_array *array,
 			      for_each_oid_fn fn,
 			      void *data)
diff --git a/sha1-array.h b/sha1-array.h
index 1e1d24b009..232bf95017 100644
--- a/sha1-array.h
+++ b/sha1-array.h
@@ -16,6 +16,9 @@ void oid_array_clear(struct oid_array *array);
 
 typedef int (*for_each_oid_fn)(const struct object_id *oid,
 			       void *data);
+int oid_array_for_each(struct oid_array *array,
+		       for_each_oid_fn fn,
+		       void *data);
 int oid_array_for_each_unique(struct oid_array *array,
 			      for_each_oid_fn fn,
 			      void *data);
diff --git a/sha1-name.c b/sha1-name.c
index 9d7bbd3e96..46d8b1afa6 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -378,6 +378,34 @@ static int collect_ambiguous(const struct object_id *oid, void *data)
 	return 0;
 }
 
+static int sort_ambiguous(const void *a, const void *b)
+{
+	int a_type = oid_object_info(a, NULL);
+	int b_type = oid_object_info(b, NULL);
+	int a_type_sort;
+	int b_type_sort;
+
+	/*
+	 * Sorts by hash within the same object type, just as
+	 * oid_array_for_each_unique() would do.
+	 */
+	if (a_type == b_type)
+		return oidcmp(a, b);
+
+	/*
+	 * Between object types show tags, then commits, and finally
+	 * trees and blobs.
+	 *
+	 * The object_type enum is commit, tree, blob, tag, but we
+	 * want tag, commit, tree blob. Cleverly (perhaps too
+	 * cleverly) do that with modulus, since the enum assigns 1 to
+	 * commit, so tag becomes 0.
+	 */
+	a_type_sort = a_type % 4;
+	b_type_sort = b_type % 4;
+	return a_type_sort > b_type_sort ? 1 : -1;
+}
+
 static int get_short_oid(const char *name, int len, struct object_id *oid,
 			  unsigned flags)
 {
@@ -409,6 +437,8 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 	status = finish_object_disambiguation(&ds, oid);
 
 	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
+		struct oid_array collect = OID_ARRAY_INIT;
+
 		error(_("short SHA1 %s is ambiguous"), ds.hex_pfx);
 
 		/*
@@ -421,7 +451,12 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 			ds.fn = NULL;
 
 		advise(_("The candidates are:"));
-		for_each_abbrev(ds.hex_pfx, show_ambiguous_object, &ds);
+		for_each_abbrev(ds.hex_pfx, collect_ambiguous, &collect);
+		QSORT(collect.oid, collect.nr, sort_ambiguous);
+
+		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
+			BUG("show_ambiguous_object shouldn't return non-zero");
+		oid_array_clear(&collect);
 	}
 
 	return status;
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 711704ba5a..2701462041 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -361,4 +361,25 @@ test_expect_success 'core.disambiguate does not override context' '
 		git -c core.disambiguate=committish rev-parse $sha1^{tree}
 '
 
+test_expect_success C_LOCALE_OUTPUT 'ambiguous commits are printed by type first, then hash order' '
+	test_must_fail git rev-parse 0000 2>stderr &&
+	grep ^hint: stderr >hints &&
+	grep 0000 hints >objects &&
+	cat >expected <<-\EOF &&
+	tag
+	commit
+	tree
+	blob
+	EOF
+	awk "{print \$3}" <objects >objects.types &&
+	uniq <objects.types >objects.types.uniq &&
+	test_cmp expected objects.types.uniq &&
+	for type in tag commit tree blob
+	do
+		grep $type objects >$type.objects &&
+		sort $type.objects >$type.objects.sorted &&
+		test_cmp $type.objects.sorted $type.objects
+	done
+'
+
 test_done
-- 
2.17.0.410.g4ac3413cc8



* [PATCH v4 6/6] get_short_oid: document & warn if we ignore the type selector
  2018-05-01 18:40   ` [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish Ævar Arnfjörð Bjarmason
                       ` (6 preceding siblings ...)
  2018-05-10 12:43     ` [PATCH v4 5/6] get_short_oid: sort ambiguous objects by type, then SHA-1 Ævar Arnfjörð Bjarmason
@ 2018-05-10 12:43     ` Ævar Arnfjörð Bjarmason
  2018-05-10 13:15       ` Martin Ågren
  2018-05-10 16:03       ` Jeff King
  7 siblings, 2 replies; 99+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-05-10 12:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine, Duy Nguyen,
	Ævar Arnfjörð Bjarmason

The SHA1 prefix 06fa currently matches no blobs in git.git. When
disambiguating short SHA1s we've been quietly ignoring the user's type
selector as a fallback mechanism. This was intentionally added in
1ffa26c461 ("get_short_sha1: list ambiguous objects on error",
2016-09-26).

I think that behavior makes sense: it's not very useful to just show
nothing because a preference has been expressed via core.disambiguate,
but it's bad that we're quietly doing this. The user might think that
we just didn't understand what e.g. 06fa^{blob} meant.

Now we'll instead print a warning if no objects of the requested type
were found:

    $ git rev-parse 06fa^{blob}
    error: short SHA1 06fa is ambiguous
    hint: The candidates are:
    [... no blobs listed ...]
    warning: Your hint (via core.disambiguate or peel syntax) was ignored, we fell
    back to showing all object types since no object of the requested type
    matched the provide short SHA1 06fa

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
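As an aside, the fallback described above boils down to "filter by the
requested type; if that leaves nothing, show everything and remember
that the hint was dropped". A minimal standalone sketch of that idea
follows. It is not the sha1-name.c code: struct candidate,
select_candidates() and the T_ANY "no hint" value are hypothetical and
only exist for this example.

```c
#include <assert.h>
#include <stddef.h>

/* T_ANY is a made-up "no type hint given" marker for this sketch. */
enum obj_type { T_ANY = 0, T_COMMIT = 1, T_TREE = 2, T_BLOB = 3, T_TAG = 4 };

struct candidate {
	enum obj_type type;
	const char *hex;
};

/*
 * Fill out[] with the candidates to list and return how many there
 * are. *ignored_hint is set to 1 when the wanted type matched nothing
 * and we fell back to listing all candidates.
 */
size_t select_candidates(const struct candidate *all, size_t nr,
			 enum obj_type wanted,
			 const struct candidate **out, int *ignored_hint)
{
	size_t i, n = 0;

	assert(all && out && ignored_hint);
	*ignored_hint = 0;

	for (i = 0; i < nr; i++)
		if (wanted == T_ANY || all[i].type == wanted)
			out[n++] = &all[i];

	if (!n && nr) {
		/* Nothing of the wanted type: drop the filter, note that we did. */
		for (i = 0; i < nr; i++)
			out[n++] = &all[i];
		*ignored_hint = 1;
	}
	return n;
}
```

A caller would print the warning only when the flag comes back set,
which is the part this patch adds.
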
 sha1-name.c                         | 11 ++++++++++-
 t/t1512-rev-parse-disambiguation.sh |  5 ++++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index 46d8b1afa6..df33cc2dba 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -438,6 +438,7 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 
 	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
 		struct oid_array collect = OID_ARRAY_INIT;
+		int ignored_hint = 0;
 
 		error(_("short SHA1 %s is ambiguous"), ds.hex_pfx);
 
@@ -447,8 +448,10 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 		 * that case, we still want to show them, so disable the hint
 		 * function entirely.
 		 */
-		if (!ds.ambiguous)
+		if (!ds.ambiguous) {
 			ds.fn = NULL;
+			ignored_hint = 1;
+		}
 
 		advise(_("The candidates are:"));
 		for_each_abbrev(ds.hex_pfx, collect_ambiguous, &collect);
@@ -457,6 +460,12 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
 		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
 			BUG("show_ambiguous_object shouldn't return non-zero");
 		oid_array_clear(&collect);
+
+		if (ignored_hint) {
+			warning(_("Your hint (via core.disambiguate or peel syntax) was ignored, we fell\n"
+				  "back to showing all object types since no object of the requested type\n"
+				  "matched the provide short SHA1 %s"), ds.hex_pfx);
+		}
 	}
 
 	return status;
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 2701462041..1f06c1e61f 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -344,7 +344,10 @@ test_expect_success C_LOCALE_OUTPUT 'failed type-selector still shows hint' '
 	echo 872 | git hash-object --stdin -w &&
 	test_must_fail git rev-parse ee3d^{commit} 2>stderr &&
 	grep ^hint: stderr >hints &&
-	test_line_count = 3 hints
+	test_line_count = 3 hints &&
+	grep ^warning stderr >warnings &&
+	grep -q "Your hint.*was ignored" warnings &&
+	grep -q "the provide short SHA1 ee3d" stderr
 '
 
 test_expect_success 'core.disambiguate config can prefer types' '
-- 
2.17.0.410.g4ac3413cc8



* Re: [PATCH v4 6/6] get_short_oid: document & warn if we ignore the type selector
  2018-05-10 12:43     ` [PATCH v4 6/6] get_short_oid: document & warn if we ignore the type selector Ævar Arnfjörð Bjarmason
@ 2018-05-10 13:15       ` Martin Ågren
  2018-05-10 16:03       ` Jeff King
  1 sibling, 0 replies; 99+ messages in thread
From: Martin Ågren @ 2018-05-10 13:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, Junio C Hamano, Jeff King, brian m . carlson,
	Derrick Stolee, Stefan Beller, Eric Sunshine, Duy Nguyen

On 10 May 2018 at 14:43, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> The SHA1 prefix 06fa currently matches no blobs in git.git. When
> disambiguating short SHA1s we've been quietly ignoring the user's type
> selector as a fallback mechanism, this was intentionally added in
> 1ffa26c461 ("get_short_sha1: list ambiguous objects on error",
> 2016-09-26).
>
> I think that behavior makes sense, it's not very useful to just show
> nothing because a preference has been expressed via core.disambiguate,
> but it's bad that we're quietly doing this. The user might think that
> we just didn't understand what e.g. 06fa^{blob} meant.
>
> Now we'll instead print a warning if no objects of the requested type
> were found:
>
>     $ git rev-parse 06fa^{blob}
>     error: short SHA1 06fa is ambiguous
>     hint: The candidates are:
>     [... no blobs listed ...]
>     warning: Your hint (via core.disambiguate or peel syntax) was ignored, we fell
>     back to showing all object types since no object of the requested type
>     matched the provide short SHA1 06fa

s/ignored, we/ignored. We/? IMHO, it would read easier.

s/provide short/provided short/

Also: s/SHA1/object id/? That said, you add the warning. The error
message is already there and you are simply following its "SHA1".

Martin


* Re: [PATCH v4 3/6] git-p4: change "commitish" typo to "committish"
  2018-05-10 12:43     ` [PATCH v4 3/6] git-p4: change "commitish" typo to "committish" Ævar Arnfjörð Bjarmason
@ 2018-05-10 15:00       ` Luke Diamand
  0 siblings, 0 replies; 99+ messages in thread
From: Luke Diamand @ 2018-05-10 15:00 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Users, Junio C Hamano, Jeff King, brian m . carlson,
	Derrick Stolee, Stefan Beller, Eric Sunshine, Duy Nguyen

On 10 May 2018 at 13:43, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> This was the only occurrence of "commitish" in the tree, but as the
> log will reveal we've had others in the past. Fixes up code added in
> 00ad6e3182 ("git-p4: work with a detached head", 2015-11-21).
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Looks good to me!

Thanks,
Luke


> ---
>  git-p4.py | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/git-p4.py b/git-p4.py
> index 7bb9cadc69..1afa87cd9d 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -2099,11 +2099,11 @@ class P4Submit(Command, P4UserMap):
>
>          commits = []
>          if self.master:
> -            commitish = self.master
> +            committish = self.master
>          else:
> -            commitish = 'HEAD'
> +            committish = 'HEAD'
>
> -        for line in read_pipe_lines(["git", "rev-list", "--no-merges", "%s..%s" % (self.origin, commitish)]):
> +        for line in read_pipe_lines(["git", "rev-list", "--no-merges", "%s..%s" % (self.origin, committish)]):
>              commits.append(line.strip())
>          commits.reverse()
>
> --
> 2.17.0.410.g4ac3413cc8
>


* Re: [PATCH v4 2/6] sha1-array.h: align function arguments
  2018-05-10 12:42     ` [PATCH v4 2/6] sha1-array.h: align function arguments Ævar Arnfjörð Bjarmason
@ 2018-05-10 15:06       ` Jeff King
  2018-05-11  3:07         ` Junio C Hamano
  0 siblings, 1 reply; 99+ messages in thread
From: Jeff King @ 2018-05-10 15:06 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine, Duy Nguyen

On Thu, May 10, 2018 at 12:42:59PM +0000, Ævar Arnfjörð Bjarmason wrote:

> The arguments weren't lined up with the opening parenthesis. Fixes up
> code added in aae0caf19e ("sha1-array.h: align function arguments",
> 2018-04-30).

I think that's this patch. :)

Presumably you meant 910650d2f8 (Rename sha1_array to oid_array,
2017-03-31)?

-Peff


* [PATCH v2] pack-format.txt: more details on pack file format
  2018-05-08 15:56       ` [PATCH] pack-format.txt: more details on pack file format Nguyễn Thái Ngọc Duy
  2018-05-08 17:23         ` Stefan Beller
  2018-05-08 18:21         ` Ævar Arnfjörð Bjarmason
@ 2018-05-10 15:09         ` Nguyễn Thái Ngọc Duy
  2018-05-10 17:06           ` Stefan Beller
                             ` (2 more replies)
  2 siblings, 3 replies; 99+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-05-10 15:09 UTC (permalink / raw)
  To: pclouds; +Cc: avarab, git, gitster, peff, sandals, sbeller, stolee, sunshine

The current document mentions OBJ_* constants without their actual
values. A git developer would know these are from cache.h but that's
not very friendly to a person who wants to read this file to implement
a pack file parser.

Similarly, the deltified representation is not documented at all (the
"document" is basically patch-delta.c). Translate that C code to
English with a bit more about what ofs-delta and ref-delta mean.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 This is a much better description than v1. I hope.
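
 To make the copy instruction in the text below concrete, here is a
 rough decoder sketch written from the description in this patch. It
 is an illustration only, not a copy of patch-delta.c: the function
 name decode_copy_insn() is hypothetical, and it assumes the caller has
 already checked that enough input bytes are available (there is no
 bounds checking).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one copy instruction starting at insn[0]. Stores the source
 * offset and the number of bytes to copy, and returns how many
 * instruction bytes were consumed.
 */
size_t decode_copy_insn(const unsigned char *insn,
			uint32_t *offset, uint32_t *size)
{
	const unsigned char *p = insn;
	unsigned char cmd = *p++;
	uint32_t off = 0, sz = 0;
	int i;

	assert(cmd & 0x80); /* high bit set means "copy from base" */

	/* Bits 0-3 say which of offset1..offset4 follow (little-endian). */
	for (i = 0; i < 4; i++)
		if (cmd & (1u << i))
			off |= (uint32_t)*p++ << (8 * i);

	/* Bits 4-6 say which of size1..size3 follow (little-endian). */
	for (i = 0; i < 3; i++)
		if (cmd & (0x10u << i))
			sz |= (uint32_t)*p++ << (8 * i);

	/* A size of zero is the special case meaning 0x10000. */
	if (!sz)
		sz = 0x10000;

	*offset = off;
	*size = sz;
	return (size_t)(p - insn);
}
```

 With the "10000101" example from the text, an omitted offset2 leaves
 bits 8-15 as zero and offset3 still lands in bits 16-23.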

 Documentation/technical/pack-format.txt | 78 +++++++++++++++++++++++++
 cache.h                                 |  5 ++
 2 files changed, 83 insertions(+)

diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt
index 8e5bf60be3..d20bf592aa 100644
--- a/Documentation/technical/pack-format.txt
+++ b/Documentation/technical/pack-format.txt
@@ -36,6 +36,84 @@ Git pack format
 
   - The trailer records 20-byte SHA-1 checksum of all of the above.
 
+=== Object types
+
+Valid object types are:
+
+- OBJ_COMMIT (1)
+- OBJ_TREE (2)
+- OBJ_BLOB (3)
+- OBJ_TAG (4)
+- OBJ_OFS_DELTA (6)
+- OBJ_REF_DELTA (7)
+
+Type 5 is reserved for future expansion. Type 0 is invalid.
+
+=== Deltified representation
+
+Conceptually there are only four object types: commit, tree, tag and
+blob. However to save space, an object could be stored as a "delta" of
+another "base" object. These representations are assigned new types
+ofs-delta and ref-delta, which are only valid in a pack file.
+
+Both ofs-delta and ref-delta store the "delta" against another
+object. The difference between them is, ref-delta directly encodes
+20-byte base object name. If the base object is in the same pack,
+ofs-delta encodes the offset of the base object in the pack instead.
+
+The delta data is a sequence of instructions to reconstruct an object
+from the base object. Each instruction appends more and more data to
+the target object until it's complete. There are two supported
+instructions so far: one for copying a byte range from the source object
+and one for inserting new data embedded in the instruction itself.
+
+Each instruction has variable length. Instruction type is determined
+by the seventh bit of the first octet. The following diagrams follow
+the convention in RFC 1951 (Deflate compressed data format).
+
+  +----------+---------+---------+---------+---------+-------+-------+-------+
+  | 1xxxxxxx | offset1 | offset2 | offset3 | offset4 | size1 | size2 | size3 |
+  +----------+---------+---------+---------+---------+-------+-------+-------+
+
+This is the instruction format to copy a byte range from the source
+object. It encodes the offset to copy from and the number of bytes to
+copy. Offset and size are in little-endian order.
+
+All offset and size bytes are optional. This is to reduce the
+instruction size when encoding small offsets or sizes. The first seven
+bits in the first octet determine which of the next seven octets are
+present. If bit zero is set, offset1 is present. If bit one is set
+offset2 is present and so on.
+
+Note that a more compact instruction does not change offset and size
+encoding. For example, if only offset2 is omitted like below, offset3
+still contains bits 16-23. It does not become offset2 and contains
+bits 8-15 even if it's right next to offset1.
+
+  +----------+---------+---------+
+  | 10000101 | offset1 | offset3 |
+  +----------+---------+---------+
+
+In its most compact form, this instruction only takes up one byte
+(0x80) with both offset and size omitted, which will have default
+values zero. There is another exception: size zero is automatically
+converted to 0x10000.
+
+  +----------+============+
+  | 0xxxxxxx |    data    |
+  +----------+============+
+
+This is the instruction to construct target object without the base
+object. The following data is appended to the target object. The first
+seven bits of the first octet determine the size of data in
+bytes. The size must be non-zero.
+
+  +----------+============
+  | 00000000 |
+  +----------+============
+
+This is the instruction reserved for future expansion.
+
 == Original (version 1) pack-*.idx files have the following format:
 
   - The header consists of 256 4-byte network byte order
diff --git a/cache.h b/cache.h
index 77b7acebb6..ad549e258e 100644
--- a/cache.h
+++ b/cache.h
@@ -373,6 +373,11 @@ extern void free_name_hash(struct index_state *istate);
 #define read_blob_data_from_cache(path, sz) read_blob_data_from_index(&the_index, (path), (sz))
 #endif
 
+/*
+ * Values in this enum (except those outside the 3 bit range) are part
+ * of pack file format. See Documentation/technical/pack-format.txt
+ * for more information.
+ */
 enum object_type {
 	OBJ_BAD = -1,
 	OBJ_NONE = 0,
-- 
2.17.0.705.g3525833791



* Re: [PATCH v4 5/6] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-10 12:43     ` [PATCH v4 5/6] get_short_oid: sort ambiguous objects by type, then SHA-1 Ævar Arnfjörð Bjarmason
@ 2018-05-10 15:22       ` Jeff King
  2018-05-11  5:36       ` Junio C Hamano
  1 sibling, 0 replies; 99+ messages in thread
From: Jeff King @ 2018-05-10 15:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine, Duy Nguyen

On Thu, May 10, 2018 at 12:43:02PM +0000, Ævar Arnfjörð Bjarmason wrote:

> Now we'll instead show:
> 
>     hint:   e8f2650052 tag v2.17.0
>     hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
>     hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
>     hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
>     hint:   e8f2093055 tree
>     hint:   e8f25a3a50 tree
>     hint:   e8f28d537c tree
>     hint:   e8f2cf6ec0 tree
>     hint:   e8f21d02f7 blob
>     hint:   e8f21d577c blob
>     hint:   e8f2867228 blob
>     hint:   e8f2a35526 blob

I said already that I like the output, but this time I'll actually read
the code. ;)

It all looks good to me, with the exception of a few documentation nits
I'll mention below.

> A note on the implementation: Derrick rightly pointed out[1] that
> we're bending over backwards here in get_short_oid() to first
> de-duplicate the list, and then emit it, but could simply do it in one
> step.
> 
> The reason for that is that oid_array_for_each_unique() doesn't
> actually require that the array be sorted by oid_array_sort(), it just
> needs to be sorted in some order that guarantees that all objects with
> the same ID are adjacent to one another, which (barring a hash
> collision, which'll be someone else's problem) the sort_ambiguous()
> function does.

If we were to go this route, I think it would make sense to add a
sorting function pointer to "struct oid_array". I'm OK with punting on
it for now, though.
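
To make the adjacency point concrete, here is a minimal sketch of what
such a pointer could look like. Everything named toy_* is hypothetical,
including the "cmp" field; this is not the oid_array API, just an
illustration that de-duplication only needs equal IDs to end up next to
each other, whatever order is used to get them there.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_RAWSZ 4 /* toy ID width; real object IDs are longer */

struct toy_oid {
	unsigned char hash[TOY_RAWSZ];
};

typedef int (*toy_cmp_fn)(const void *a, const void *b);
typedef int (*toy_each_fn)(const struct toy_oid *oid, void *data);

struct toy_oid_array {
	struct toy_oid *oid;
	size_t nr;
	int sorted;
	toy_cmp_fn cmp; /* hypothetical: custom order, or NULL for ID order */
};

/* Default order: plain byte-wise comparison of the IDs. */
int toy_oid_cmp(const void *a, const void *b)
{
	return memcmp(a, b, TOY_RAWSZ);
}

/* Example callback: count how many times it gets called. */
int toy_count(const struct toy_oid *oid, void *data)
{
	(void)oid;
	++*(size_t *)data;
	return 0;
}

/*
 * Call fn once per distinct ID. Sorts the array first if needed; any
 * comparator works as long as it places equal IDs next to each other.
 */
int toy_for_each_unique(struct toy_oid_array *array, toy_each_fn fn,
			void *data)
{
	size_t i;

	if (!array->sorted) {
		if (array->nr)
			qsort(array->oid, array->nr, sizeof(*array->oid),
			      array->cmp ? array->cmp : toy_oid_cmp);
		array->sorted = 1;
	}
	for (i = 0; i < array->nr; i++) {
		int ret;

		/* Equal IDs are adjacent: compare with the previous entry. */
		if (i > 0 && !memcmp(&array->oid[i], &array->oid[i - 1],
				     TOY_RAWSZ))
			continue;
		ret = fn(&array->oid[i], data);
		if (ret)
			return ret;
	}
	return 0;
}
```

A type-then-ID comparator could be plugged in through "cmp" without
touching the loop, which is the property the quoted paragraph relies on.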

> diff --git a/Documentation/technical/api-oid-array.txt b/Documentation/technical/api-oid-array.txt
> index b0c11f868d..94b529722c 100644
> --- a/Documentation/technical/api-oid-array.txt
> +++ b/Documentation/technical/api-oid-array.txt
> @@ -35,13 +35,18 @@ Functions
>  	Free all memory associated with the array and return it to the
>  	initial, empty state.
>  
> +`oid_array_for_each`::
> +	Iterate over each element of the list, executing the callback
> +	function for each one. Does not sort the list, so any custom
> +	hash order is retained. If the callback returns a non-zero
> +	value, the iteration ends immediately and the callback's
> +	return is propagated; otherwise, 0 is returned.
> +
>  `oid_array_for_each_unique`::
> -	Efficiently iterate over each unique element of the list,
> -	executing the callback function for each one. If the array is
> -	not sorted, this function has the side effect of sorting it. If
> -	the callback returns a non-zero value, the iteration ends
> -	immediately and the callback's return is propagated; otherwise,
> -	0 is returned.
> +	Iterate over each unique element of the list in sort order ,
> +	but otherwise behaves like `oid_array_for_each`. If the array
> +	is not sorted, this function has the side effect of sorting
> +	it.

Extra space in "sort order ,".

I'd probably say "sorted order", but that might be a matter of
preference.

Also, your parallel verb tenses don't agree. ;) It should be "Iterate
... but otherwise behave", not "behaves".

> +	/*
> +	 * Between object types show tags, then commits, and finally
> +	 * trees and blobs.
> +	 *
> +	 * The object_type enum is commit, tree, blob, tag, but we
> +	 * want tag, commit, tree blob. Cleverly (perhaps too
> +	 * cleverly) do that with modulus, since the enum assigns 1 to
> +	 * commit, so tag becomes 0.
> +	 */
> +	a_type_sort = a_type % 4;
> +	b_type_sort = b_type % 4;
> +	return a_type_sort > b_type_sort ? 1 : -1;

This is amusingly clever, and should be very efficient. I'm glad there's
a comment at least, though.
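
For anyone who wants to convince themselves of the trick, a minimal
standalone sketch follows. The toy_* names are made up for this example
and the comparator returns 0 for equal types, unlike the quoted snippet,
which presumably runs only after the same-type case has been handled.

```c
#include <stdlib.h>

/* Same numbering as the object_type enum: commit=1 ... tag=4. */
enum toy_object_type {
	TOY_COMMIT = 1,
	TOY_TREE = 2,
	TOY_BLOB = 3,
	TOY_TAG = 4
};

/* Modulus maps tag to 0 and leaves commit=1, tree=2, blob=3. */
int toy_type_rank(enum toy_object_type type)
{
	return (int)type % 4;
}

/* qsort() comparator that orders types as tag, commit, tree, blob. */
int toy_type_cmp(const void *a, const void *b)
{
	int ra = toy_type_rank(*(const enum toy_object_type *)a);
	int rb = toy_type_rank(*(const enum toy_object_type *)b);

	return (ra > rb) - (ra < rb);
}
```

The trick only works because the four types occupy exactly 1 through 4;
any new value in that range would need the comment (and the modulus)
revisited.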

-Peff

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v4 6/6] get_short_oid: document & warn if we ignore the type selector
  2018-05-10 12:43     ` [PATCH v4 6/6] get_short_oid: document & warn if we ignore the type selector Ævar Arnfjörð Bjarmason
  2018-05-10 13:15       ` Martin Ågren
@ 2018-05-10 16:03       ` Jeff King
  2018-05-10 16:10         ` Jeff King
  2018-05-10 16:15         ` Jeff King
  1 sibling, 2 replies; 99+ messages in thread
From: Jeff King @ 2018-05-10 16:03 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine, Duy Nguyen

On Thu, May 10, 2018 at 12:43:03PM +0000, Ævar Arnfjörð Bjarmason wrote:

> The SHA1 prefix 06fa currently matches no blobs in git.git. When
> disambiguating short SHA1s we've been quietly ignoring the user's type
> selector as a fallback mechanism, this was intentionally added in
> 1ffa26c461 ("get_short_sha1: list ambiguous objects on error",
> 2016-09-26).
> 
> I think that behavior makes sense, it's not very useful to just show
> nothing because a preference has been expressed via core.disambiguate,
> but it's bad that we're quietly doing this. The user might think that
> we just didn't understand what e.g. 06fa^{blob} meant.

I had to read this through a few times to figure out what problem you
were solving. Possibly because you lead with 06fa, which is really just
an example (and also, I have an 06fa blob in my clone ;) ).

Maybe:

  If the short-sha1 disambiguation code is told to use a particular hint
  (e.g., treeish or blob) but no objects with that short-sha1 match that
  hint, we end up ignoring the hint. This can result in either:

    1. We choose the non-matching object if there is only one. This will
       typically result in an error later up the stack (since whatever
       gave us the hint is expecting a particular type).

    2. We list all objects with that short-sha1, including those with
       non-matching types.

  This second case can be confusing to the user, who might think that we
  didn't apply the hint properly (especially if the hint came from
  them). For example, in git.git there is no blob with the prefix 06fa.
  So the user may see:

    $ git rev-parse 06fa^{blob}
    hint: The candidates are:
    hint:   06fa2b7c2b tag v2.1.4
    hint:   06faf6ba64 tree
    06fa^{blob}
    fatal: ambiguous argument '06fa^{blob}': unknown revision or path not in the working tree.

  Let's help them out by issuing a warning whenever the hint is ignored.

So that at least explains it in a way that makes sense to me. But now
that I've propped up my strawman, let me take a few swings...

Your patch just covers case 2, I think. And for the error case, that's
probably OK:

  $ git rev-parse 06faf^{blob}
  error: 06faf^{blob}: expected blob type, but the object dereferences to tree type
  06faf^{blob}
  error: 06faf^{blob}: expected blob type, but the object dereferences to tree type
  fatal: ambiguous argument '06faf^{blob}': unknown revision or path not in the working tree.

(though there is a separate bug in showing the error twice).

But some cases _don't_ issue an error. For example, try this:

  $ git log ..06faf

which returns an empty output! We return the single matching tree, even
though the ".." triggers the commit hint. The revision machinery just
queues the tree, and then later when we see we're not doing an --objects
traversal, it just gets ignored. (That's a separate issue, but it shows
that the hints are just that: hints. The code that runs after does not
necessarily require a matching type).

And that example shows another issue, which is that the user does not
necessarily feed us the hint explicitly. We're using a committish hint
there, but I'm not sure if mentioning that would confuse the user or
not. Certainly this warning:

>     warning: Your hint (via core.disambiguate or peel syntax) was ignored, we fell
>     back to showing all object types since no object of the requested type
>     matched the provided short SHA1 06fa

is not accurate, because the hint came from neither of those places. ;)

So all that said together, I kind of wonder if we should consider
issuing the warning earlier, doing so for all cases, and being a bit
less chatty. Like:

  $ git rev-parse 06fa^{blob}
  warning: short object id 06fa did not match any objects of type 'blob'

If that were followed by any of:

  1. error: short SHA1 06fa is ambiguous, then a bunch of non-blobs

  2. error: expected blob but I got a tree

  3. the command proceeds and silently ignores the matched object

I think it would be helpful. We'd need to add in an extra mapping of
GET_OID_* back to a human-readable string, but I think that should be
pretty easy.
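
As a rough sketch of that mapping, something like the following would
do. The TOY_GET_OID_* constants and toy_hint_name() are hypothetical
stand-ins written for this example (their values are assumptions, not
copied from cache.h), and a real version would want the strings to match
whatever wording the warning ends up using.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the GET_OID_* disambiguation flags. */
#define TOY_GET_OID_COMMIT     02
#define TOY_GET_OID_COMMITTISH 04
#define TOY_GET_OID_TREE       010
#define TOY_GET_OID_TREEISH    020
#define TOY_GET_OID_BLOB       040

/*
 * Map a disambiguation hint back to a human-readable name for use in
 * a warning. Returns NULL when no type hint is set.
 */
const char *toy_hint_name(unsigned flags)
{
	if (flags & TOY_GET_OID_COMMIT)
		return "commit";
	if (flags & TOY_GET_OID_COMMITTISH)
		return "committish";
	if (flags & TOY_GET_OID_TREE)
		return "tree";
	if (flags & TOY_GET_OID_TREEISH)
		return "treeish";
	if (flags & TOY_GET_OID_BLOB)
		return "blob";
	return NULL;
}
```

The caller would skip the warning entirely when NULL comes back, since
there was no hint to ignore in that case.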

And finally, your 06fa example for me shows behavior that's either
buggy, or I'm just confused. I get:

  $ git rev-parse 06fa^{blob}
  error: short SHA1 06fa is ambiguous
  hint: The candidates are:
  hint:   06fa2b7c2b tag v2.1.4
  hint:   06faa52353 commit 2005-10-18 - 2005-10-18 midnight
  hint:   06fac427af blob
  hint:   06faf6ba64 tree

(That 06fac blob comes from Junio's refs/notes/amlog). Shouldn't the blob
disambiguator show me just that object?

-Peff

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v4 0/6] get_short_oid UI improvements
  2018-05-10 12:42     ` [PATCH v4 0/6] get_short_oid UI improvements Ævar Arnfjörð Bjarmason
@ 2018-05-10 16:04       ` Jeff King
  0 siblings, 0 replies; 99+ messages in thread
From: Jeff King @ 2018-05-10 16:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine, Duy Nguyen

On Thu, May 10, 2018 at 12:42:57PM +0000, Ævar Arnfjörð Bjarmason wrote:

> This is like v3 except all the patches to the peel syntax & docs have
> been dropped, which were controversial.
> 
> I think it's worthwhile to re-work that, but I don't have time for
> that now, so I'm submitting this. Maybe I'll have time in the future
> to re-work the rest, but then I can base it on top of this.

I'm not quite on-board with the final patch, but with the exception of a
few nits I sent already, the first 5 look good to me.

-Peff

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v4 6/6] get_short_oid: document & warn if we ignore the type selector
  2018-05-10 16:03       ` Jeff King
@ 2018-05-10 16:10         ` Jeff King
  2018-05-10 16:15         ` Jeff King
  1 sibling, 0 replies; 99+ messages in thread
From: Jeff King @ 2018-05-10 16:10 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine, Duy Nguyen

On Thu, May 10, 2018 at 12:03:22PM -0400, Jeff King wrote:

> But some cases _don't_ issue an error. For example, try this:
> 
>   $ git log ..06faf
> 
> which returns an empty output! We return the single matching tree, even
> though the ".." triggers the commit hint. The revision machinery just
> queues the tree, and then later when we see we're not doing an --objects
> traversal, it just gets ignored. (That's a separate issue, but it shows
> that the hints are just that: hints. The code that runs after does not
> necessarily require a matching type).

I actually have a patch that generates a warning for this case (below).
I've been running with it for about a year, but it annoyingly produces
warnings for "git log --all":

  $ git log --all
  warning: ignoring blob object in traversal: refs/tags/junio-gpg-pub

I guess ideally it would distinguish between items added explicitly and
those added by a wildcard (or perhaps the wildcard adder should be more
careful about adding useless objects).

---
diff --git a/revision.c b/revision.c
index 1cff11833e..816d6b75ee 100644
--- a/revision.c
+++ b/revision.c
@@ -215,6 +215,16 @@ void add_pending_oid(struct rev_info *revs, const char *name,
 	add_pending_object(revs, object, name);
 }
 
+static void warn_ignored_object(struct object *object, const char *name)
+{
+	if (object->flags & UNINTERESTING)
+		return;
+
+	warning(_("ignoring %s object in traversal: %s"),
+		type_name(object->type),
+		(name && *name) ? name : oid_to_hex(&object->oid));
+}
+
 static struct commit *handle_commit(struct rev_info *revs,
 				    struct object_array_entry *entry)
 {
@@ -272,8 +282,10 @@ static struct commit *handle_commit(struct rev_info *revs,
 	 */
 	if (object->type == OBJ_TREE) {
 		struct tree *tree = (struct tree *)object;
-		if (!revs->tree_objects)
+		if (!revs->tree_objects) {
+			warn_ignored_object(object, name);
 			return NULL;
+		}
 		if (flags & UNINTERESTING) {
 			mark_tree_contents_uninteresting(tree);
 			return NULL;
@@ -286,8 +298,10 @@ static struct commit *handle_commit(struct rev_info *revs,
 	 * Blob object? You know the drill by now..
 	 */
 	if (object->type == OBJ_BLOB) {
-		if (!revs->blob_objects)
+		if (!revs->blob_objects) {
+			warn_ignored_object(object, name);
 			return NULL;
+		}
 		if (flags & UNINTERESTING)
 			return NULL;
 		add_pending_object_with_path(revs, object, name, mode, path);

^ permalink raw reply related	[flat|nested] 99+ messages in thread

* Re: [PATCH v4 6/6] get_short_oid: document & warn if we ignore the type selector
  2018-05-10 16:03       ` Jeff King
  2018-05-10 16:10         ` Jeff King
@ 2018-05-10 16:15         ` Jeff King
  1 sibling, 0 replies; 99+ messages in thread
From: Jeff King @ 2018-05-10 16:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, brian m . carlson, Derrick Stolee,
	Stefan Beller, Eric Sunshine, Duy Nguyen

On Thu, May 10, 2018 at 12:03:22PM -0400, Jeff King wrote:

> And finally, your 06fa example for me shows behavior that's either
> buggy, or I'm just confused. I get:
> 
>   $ git rev-parse 06fa^{blob}
>   error: short SHA1 06fa is ambiguous
>   hint: The candidates are:
>   hint:   06fa2b7c2b tag v2.1.4
>   hint:   06faa52353 commit 2005-10-18 - 2005-10-18 midnight
>   hint:   06fac427af blob
>   hint:   06faf6ba64 tree
> 
> (That 06fac blob comes from Junio's refs/notes/amlog). Shouldn't the blob
> disambiguator show me just that object?

Ah, I see. No, "^{blob}" does not actually pass the blob disambiguator.
We only handle committish and treeish right now, and this iteration of
the series omits the function to fix that.

So I think the principle of this commit is sound without that patch, but
your example is not a good one anymore. :)

-Peff

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v2] pack-format.txt: more details on pack file format
  2018-05-10 15:09         ` [PATCH v2] " Nguyễn Thái Ngọc Duy
@ 2018-05-10 17:06           ` Stefan Beller
  2018-05-11  6:41             ` Duy Nguyen
  2018-05-11  3:54           ` Junio C Hamano
  2018-05-11  6:55           ` [PATCH v3] " Nguyễn Thái Ngọc Duy
  2 siblings, 1 reply; 99+ messages in thread
From: Stefan Beller @ 2018-05-10 17:06 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: Ævar Arnfjörð Bjarmason, git, Junio C Hamano,
	Jeff King, brian m. carlson, Derrick Stolee, Eric Sunshine

On Thu, May 10, 2018 at 8:09 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> The current document mentions OBJ_* constants without their actual
> values. A git developer would know these are from cache.h but that's
> not very friendly to a person who wants to read this file to implement
> a pack file parser.
>
> Similarly, the deltified representation is not documented at all (the
> "document" is basically patch-delta.c). Translate that C code to
> English with a bit more about what ofs-delta and ref-delta mean.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  This is a much better description than v1. I hope.
>
>  Documentation/technical/pack-format.txt | 78 +++++++++++++++++++++++++
>  cache.h                                 |  5 ++
>  2 files changed, 83 insertions(+)
>
> diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt
> index 8e5bf60be3..d20bf592aa 100644
> --- a/Documentation/technical/pack-format.txt
> +++ b/Documentation/technical/pack-format.txt
> @@ -36,6 +36,84 @@ Git pack format
>
>    - The trailer records 20-byte SHA-1 checksum of all of the above.
>
> +=== Object types
> +
> +Valid object types are:
> +
> +- OBJ_COMMIT (1)
> +- OBJ_TREE (2)
> +- OBJ_BLOB (3)
> +- OBJ_TAG (4)
> +- OBJ_OFS_DELTA (6)
> +- OBJ_REF_DELTA (7)
> +
> +Type 5 is reserved for future expansion. Type 0 is invalid.
> +
> +=== Deltified representation
> +
> +Conceptually there are only four object types: commit, tree, tag and
> +blob. However to save space, an object could be stored as a "delta" of
> +another "base" object. These representations are assigned new types
> +ofs-delta and ref-delta, which is only valid in a pack file.

...only valid...

as opposed to loose objects or as opposed to referencing cross-packs?
I would think the former, not the latter.

> +Both ofs-delta and ref-delta store the "delta" against another
> +object. The difference between them is, ref-delta directly encodes
> +20-byte base object name. If the base object is in the same pack,
> +ofs-delta encodes the offset of the base object in the pack instead.

Reading this paragraph clears up the question from before.
The ref delta is a delta to another "reference by hash id (sha1)".
What does OFS abbreviate? OFfSet?

> +The delta data is a sequence of instructions to reconstruct an object
> +from the base object.

As said before, the base object is of type 1..4; we do not do
"delta-on-delta" yet. But to construct the object we have to create the
base object first, which itself can be stored as a deltified object,
leading to a delta chain.

>     Each instruction appends more and more data to
> +the target object until it's complete. There are two supported
> +instructions so far: one for copy a byte range from the source object
> +and one for inserting new data embedded in the instruction itself.

ok. So there are 2 types of instructions, "copy from (offset, size)" and
"new data follows".

The next paragraphs seem to describe the copy instruction, maybe
add a sub-headline here?

> +Each instruction has variable length. Instruction type is determined
> +by the seventh bit of the first octet. The following diagrams follow
> +the convention in RFC 1951 (Deflate compressed data format).
> +
> +  +----------+---------+---------+---------+---------+-------+-------+-------+
> +  | 1xxxxxxx | offset1 | offset2 | offset3 | offset4 | size1 | size2 | size3 |
> +  +----------+---------+---------+---------+---------+-------+-------+-------+
> +
> +This is the instruction format to copy a byte range from the source
> +object. It encodes the offset to copy from any the number of bytes to
> +copy. Offset and size are in little-endian order.
> +
> +All offset and size bytes are optional. This is to reduce the
> +instruction size when encoding small offsets or sizes. The first seven
> +bits in the first octet determines which of the next seven octets is
> +present. If bit zero is set, offset1 is present. If bit one is set
> +offset2 is present and so on.
> +
> +Note that a more compact instruction does not change offset and size
> +encoding. For example, if only offset2 is omitted like below, offset3
> +still contains bits 16-23. It does not become offset2 and contains
> +bits 8-15 even if it's right next to offset1.
> +
> +  +----------+---------+---------+
> +  | 10000101 | offset1 | offset3 |
> +  +----------+---------+---------+

It reads very fluently up to here.

> +In its most compact form, this instruction only takes up one byte
> +(0x80) with both offset and size omitted, which will have default
> +values zero. There is another exception: size zero is automatically
> +converted to 0x10000.

This "another exception" sounds a bit tacked on, but is still understandable.
I would imagine that the size of 0 is used frequently to copy large blocks
and coincidentally it is represented using the lowest number of bytes
for size. Cute!
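
To check my reading of the copy instruction, here is a minimal decoder
sketch. The function name and signature are made up for this example; it
only follows the rules in the quoted text (presence bits, little-endian
bytes, size zero meaning 0x10000) and is not lifted from patch-delta.c.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one "copy from base" instruction starting at p[0].
 * Returns the number of bytes consumed, or 0 if the buffer is
 * truncated or p[0] does not have the copy bit (0x80) set.
 */
size_t toy_decode_copy(const unsigned char *p, size_t len,
		       uint32_t *offset, uint32_t *size)
{
	size_t used = 1;
	unsigned char cmd;
	uint32_t off = 0, sz = 0;
	int i;

	if (!len || !(p[0] & 0x80))
		return 0;
	cmd = p[0];

	/* Bits 0-3 say which of offset1..offset4 are present. */
	for (i = 0; i < 4; i++) {
		if (!(cmd & (1u << i)))
			continue;
		if (used >= len)
			return 0;
		/* A skipped byte leaves its 8 bits as zero. */
		off |= (uint32_t)p[used++] << (8 * i);
	}

	/* Bits 4-6 say which of size1..size3 are present. */
	for (i = 0; i < 3; i++) {
		if (!(cmd & (0x10u << i)))
			continue;
		if (used >= len)
			return 0;
		sz |= (uint32_t)p[used++] << (8 * i);
	}

	/* The documented exception: size zero means 0x10000. */
	if (!sz)
		sz = 0x10000;

	*offset = off;
	*size = sz;
	return used;
}
```

Feeding it the "10000101 | offset1 | offset3" example from the patch
with offset1 = 0x34 and offset3 = 0x12 should give offset 0x120034 and
size 0x10000, consuming three bytes.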

Before the next diagram we could have a sub-headline, indicating
that the other instruction "new data follows" will now be described.

> +  +----------+============+
> +  | 0xxxxxxx |    data    |
> +  +----------+============+
> +
> +This is the instruction to construct target object without the base
> +object. The following data is appended to the target object. The first
> +seven bits of the first octet determines the size of data in
> +bytes. The size must be non-zero.

This command sounds very easy.
However we can have at most 127 bytes of new data, so if someone
adds a larger part of new code, we'd have many "insert new data"
instructions, all at the max of 127, such that the overhead for instruction
bytes is 1/127, or about 0.8%. Sounds efficient.
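
The arithmetic behind that estimate, as a small sketch. The helper names
are invented for this example, and it assumes an encoder that always
fills each insert instruction to the 127-byte maximum before starting
the next one.

```c
#include <stddef.h>

#define TOY_MAX_INSERT 127 /* largest length that fits in seven bits */

/* Number of insert instructions needed for n bytes of new data. */
size_t toy_insert_count(size_t n)
{
	return (n + TOY_MAX_INSERT - 1) / TOY_MAX_INSERT;
}

/* Encoded size: one opcode byte per instruction plus the data itself. */
size_t toy_insert_encoded_size(size_t n)
{
	return n + toy_insert_count(n);
}
```

So 1000 bytes of new data need 8 instructions and 1008 encoded bytes,
i.e. one opcode byte for every 127 bytes of payload at best.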

> +  +----------+============
> +  | 00000000 |
> +  +----------+============
> +
> +This is the instruction reserved for future expansion.

Thanks for pointing this out.


>
> +/*
> + * Values in this enum (except those outside the 3 bit range) are part
> + * of pack file format. See Documentation/technical/pack-format.txt
> + * for more information.
> + */

Makes sense.
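
For context on where the "3 bit range" comes from, here is a sketch of
parsing a pack entry header as I understand the layout described
elsewhere in pack-format.txt: the first byte carries a continuation bit,
a 3-bit type and the low 4 bits of the size, and each following byte
adds 7 more size bits. The function name and error handling are made up
for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Parse the type and size at the start of a pack entry.
 * Returns the number of header bytes consumed, or 0 if the buffer
 * is truncated or the size would not fit in 64 bits.
 */
size_t toy_parse_entry_header(const unsigned char *p, size_t len,
			      unsigned *type, uint64_t *size)
{
	size_t used = 0;
	unsigned shift = 4;
	unsigned char c;
	unsigned t;
	uint64_t sz;

	if (!len)
		return 0;
	c = p[used++];
	t = (c >> 4) & 7; /* the 3-bit type field */
	sz = c & 15;      /* low 4 bits of the size */

	/* High bit set: another size byte follows. */
	while (c & 0x80) {
		if (used >= len || shift > 57)
			return 0;
		c = p[used++];
		sz |= (uint64_t)(c & 0x7f) << shift;
		shift += 7;
	}

	*type = t;
	*size = sz;
	return used;
}
```

With only three bits there is room for values 0 through 7, which is why
enum values outside that range cannot appear in a pack.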

I really like this patch very much. Thanks for writing it.
My annotations are just the cherry on the cake;
the current form is
Reviewed-by: Stefan Beller <sbeller@google.com>

Thanks!

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v4 2/6] sha1-array.h: align function arguments
  2018-05-10 15:06       ` Jeff King
@ 2018-05-11  3:07         ` Junio C Hamano
  2018-05-11  3:09           ` Junio C Hamano
  0 siblings, 1 reply; 99+ messages in thread
From: Junio C Hamano @ 2018-05-11  3:07 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, git, brian m . carlson,
	Derrick Stolee, Stefan Beller, Eric Sunshine, Duy Nguyen

Jeff King <peff@peff.net> writes:

> On Thu, May 10, 2018 at 12:42:59PM +0000, Ævar Arnfjörð Bjarmason wrote:
>
>> The arguments weren't lined up with the opening parenthesis. Fixes up
>> code added in aae0caf19e ("sha1-array.h: align function arguments",
>> 2018-04-30).
>
> I think that's this patch. :)
>
> Presumably you meant 910650d2f8 (Rename sha1_array to oid_array,
> 2017-03-31)?

Sharp eyes.  I couldn't quite tell from a cursory read of the blame
output, until I realized that the originals before that culprit were
aligned and that the renaming was what knocked them out of alignment.

But then "fixes up code added in" is not quite right, either.  It is
what the commit should have touched but didn't ;-)

Thanks.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v4 2/6] sha1-array.h: align function arguments
  2018-05-11  3:07         ` Junio C Hamano
@ 2018-05-11  3:09           ` Junio C Hamano
  0 siblings, 0 replies; 99+ messages in thread
From: Junio C Hamano @ 2018-05-11  3:09 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, git, brian m . carlson,
	Derrick Stolee, Stefan Beller, Eric Sunshine, Duy Nguyen

Junio C Hamano <gitster@pobox.com> writes:

> Jeff King <peff@peff.net> writes:
>
>> On Thu, May 10, 2018 at 12:42:59PM +0000, Ævar Arnfjörð Bjarmason wrote:
>>
>>> The arguments weren't lined up with the opening parenthesis. Fixes up
>>> code added in aae0caf19e ("sha1-array.h: align function arguments",
>>> 2018-04-30).
> ...
> But then "fixes up code added in" is not quite right, either.  It is
> what the commit should have touched but didn't ;-)

FWIW, I ended up with this description.

    The arguments weren't lined up with the opening parenthesis, after
    910650d2 ("Rename sha1_array to oid_array", 2017-03-31) renamed the
    function.
    

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v2] pack-format.txt: more details on pack file format
  2018-05-10 15:09         ` [PATCH v2] " Nguyễn Thái Ngọc Duy
  2018-05-10 17:06           ` Stefan Beller
@ 2018-05-11  3:54           ` Junio C Hamano
  2018-05-11  6:55           ` [PATCH v3] " Nguyễn Thái Ngọc Duy
  2 siblings, 0 replies; 99+ messages in thread
From: Junio C Hamano @ 2018-05-11  3:54 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: avarab, git, peff, sandals, sbeller, stolee, sunshine

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> +Both ofs-delta and ref-delta store the "delta" against another
> +object. The difference between them is, ref-delta directly encodes
> +20-byte base object name. If the base object is in the same pack,
> +ofs-delta encodes the offset of the base object in the pack instead.

Those of us who know how delta works would understand it, but "delta
against another object" followed by a mention of "base object" may
not necessarily click to readers that "another object" and "base
object" refer to the same concept.

	... store the 'delta' to be applied to another object
	(called 'base object') to reconstruct the object.

perhaps?

> ...
> +  +----------+---------+---------+---------+---------+-------+-------+-------+
> +  | 1xxxxxxx | offset1 | offset2 | offset3 | offset4 | size1 | size2 | size3 |
> +  +----------+---------+---------+---------+---------+-------+-------+-------+
> +
> +This is the instruction format to copy a byte range from the source
> +object. It encodes the offset to copy from any the number of bytes to
> +copy. Offset and size are in little-endian order.

"any the number"???  Ah, s/any/and/ that is.

Other than that, looks good to me.

Thanks.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v4 5/6] get_short_oid: sort ambiguous objects by type, then SHA-1
  2018-05-10 12:43     ` [PATCH v4 5/6] get_short_oid: sort ambiguous objects by type, then SHA-1 Ævar Arnfjörð Bjarmason
  2018-05-10 15:22       ` Jeff King
@ 2018-05-11  5:36       ` Junio C Hamano
  1 sibling, 0 replies; 99+ messages in thread
From: Junio C Hamano @ 2018-05-11  5:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, brian m . carlson, Derrick Stolee, Stefan Beller,
	Eric Sunshine, Duy Nguyen

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> diff --git a/sha1-name.c b/sha1-name.c
> index 9d7bbd3e96..46d8b1afa6 100644
> --- a/sha1-name.c
> +++ b/sha1-name.c
> @@ -409,6 +437,8 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
>  	status = finish_object_disambiguation(&ds, oid);
>  
>  	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
> +		struct oid_array collect = OID_ARRAY_INIT;
> +
>  		error(_("short SHA1 %s is ambiguous"), ds.hex_pfx);
>  
>  		/*
> @@ -421,7 +451,12 @@ static int get_short_oid(const char *name, int len, struct object_id *oid,
>  			ds.fn = NULL;
>  
>  		advise(_("The candidates are:"));
> -		for_each_abbrev(ds.hex_pfx, show_ambiguous_object, &ds);

So we used to let for_each_abbrev() enumerate the object names that
share the prefix, in object name order, and feed them to
show_ambiguous_object(), which was why the output was not grouped by
type.  Now you collect them into another oid_array and sort by type,
relying on the fact that for_each_abbrev() in the "collect" phase
already does the de-duping.

Sounds good.

> +		for_each_abbrev(ds.hex_pfx, collect_ambiguous, &collect);
> +		QSORT(collect.oid, collect.nr, sort_ambiguous);
> +
> +		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
> +			BUG("show_ambiguous_object shouldn't return non-zero");
> +		oid_array_clear(&collect);
>  	}


> diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
> index 711704ba5a..2701462041 100755
> --- a/t/t1512-rev-parse-disambiguation.sh
> +++ b/t/t1512-rev-parse-disambiguation.sh
> @@ -361,4 +361,25 @@ test_expect_success 'core.disambiguate does not override context' '
>  		git -c core.disambiguate=committish rev-parse $sha1^{tree}
>  '
>  
> +test_expect_success C_LOCALE_OUTPUT 'ambiguous commits are printed by type first, then hash order' '
> +	test_must_fail git rev-parse 0000 2>stderr &&
> +	grep ^hint: stderr >hints &&
> +	grep 0000 hints >objects &&
> +	cat >expected <<-\EOF &&
> +	tag
> +	commit
> +	tree
> +	blob
> +	EOF
> +	awk "{print \$3}" <objects >objects.types &&
> +	uniq <objects.types >objects.types.uniq &&

Eww, that is a somewhat tricky (but correct) use of "uniq": POSIX
not only mandates that adjacent duplicates be removed, but also
forbids removing duplicates that are not adjacent to each other.  So
if the objects in the "hints" file are not grouped by type, we will
fail to see exactly these four lines.

> +	test_cmp expected objects.types.uniq &&
> +	for type in tag commit tree blob
> +	do
> +		grep $type objects >$type.objects &&
> +		sort $type.objects >$type.objects.sorted &&
> +		test_cmp $type.objects.sorted $type.objects

We not only want to see objects grouped by type (and types shown in
a desired order), but within the same type we want them ordered by
object name.

OK.

> +	done
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v2] pack-format.txt: more details on pack file format
  2018-05-10 17:06           ` Stefan Beller
@ 2018-05-11  6:41             ` Duy Nguyen
  0 siblings, 0 replies; 99+ messages in thread
From: Duy Nguyen @ 2018-05-11  6:41 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Ævar Arnfjörð Bjarmason, git, Junio C Hamano,
	Jeff King, brian m. carlson, Derrick Stolee, Eric Sunshine

On Thu, May 10, 2018 at 7:06 PM, Stefan Beller <sbeller@google.com> wrote:
>> +=== Deltified representation
>> +
>> +Conceptually there are only four object types: commit, tree, tag and
>> +blob. However to save space, an object could be stored as a "delta" of
>> +another "base" object. These representations are assigned new types
>> +ofs-delta and ref-delta, which is only valid in a pack file.
>
> ...only valid...
>
> as opposed to loose objects or as opposed to referencing cross-packs?
> I would think the former, not the latter.

Yeah. This is pretty much an implementation detail of a pack. The
"real" type is always blob/commit/tree/tag. But you only see this when
you dig deep down in pack-related code.

>> +Both ofs-delta and ref-delta store the "delta" against another
>> +object. The difference between them is, ref-delta directly encodes
>> +20-byte base object name. If the base object is in the same pack,
>> +ofs-delta encodes the offset of the base object in the pack instead.
>
> Reading this paragraph clears up the question from before.
> The ref delta is a delta to another "reference by hash id (sha1)".
> What abbreviation is OFS? OFfSet ?

I guess so. I never bothered to track down the source for that.

>> +The delta data is a sequence of instructions to reconstruct an object
>> +from the base object.
>
> As said before the base object is of type 1..4, we do not "delta-on-delta"
> yet, but to construct the object we have to create the base object first,
> which itself can be represented as a deltified object leading to a delta
> chain.

Yeah that's the delta chain concept. I'll just make a note here about
base object potentially being a delta object as well.
-- 
Duy

^ permalink raw reply	[flat|nested] 99+ messages in thread

* [PATCH v3] pack-format.txt: more details on pack file format
  2018-05-10 15:09         ` [PATCH v2] " Nguyễn Thái Ngọc Duy
  2018-05-10 17:06           ` Stefan Beller
  2018-05-11  3:54           ` Junio C Hamano
@ 2018-05-11  6:55           ` Nguyễn Thái Ngọc Duy
  2 siblings, 0 replies; 99+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-05-11  6:55 UTC (permalink / raw)
  To: pclouds; +Cc: avarab, git, gitster, peff, sandals, sbeller, stolee, sunshine

The current document mentions OBJ_* constants without their actual
values. A git developer would know these are from cache.h but that's
not very friendly to a person who wants to read this file to implement
a pack file parser.

Similarly, the deltified representation is not documented at all (the
"document" is basically patch-delta.c). Translate that C code to
English with a bit more about what ofs-delta and ref-delta mean.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Diff from v2

    diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt
    index 00351cb822..70a99fd142 100644
    --- a/Documentation/technical/pack-format.txt
    +++ b/Documentation/technical/pack-format.txt
    @@ -56,27 +56,37 @@ blob. However to save space, an object could be stored as a "delta" of
     another "base" object. These representations are assigned new types
     ofs-delta and ref-delta, which is only valid in a pack file.
     
    -Both ofs-delta and ref-delta store the "delta" against another
    -object. The difference between them is, ref-delta directly encodes
    -20-byte base object name. If the base object is in the same pack,
    -ofs-delta encodes the offset of the base object in the pack instead.
    +Both ofs-delta and ref-delta store the "delta" to be applied to
    +another object (called 'base object') to reconstruct the object. The
    +difference between them is, ref-delta directly encodes 20-byte base
    +object name. If the base object is in the same pack, ofs-delta encodes
    +the offset of the base object in the pack instead.
    +
    +The base object could also be deltified if it's in the same pack.
    +Ref-delta can also refer to an object outside the pack (i.e. the
    +so-called "thin pack"). When stored on disk however, the pack should
    +be self-contained to avoid cyclic dependencies.
     
     The delta data is a sequence of instructions to reconstruct an object
    -from the base object. Each instruction appends more and more data to
    -the target object until it's complete. There are two supported
    -instructions so far: one for copy a byte range from the source object
    -and one for inserting new data embedded in the instruction itself.
    +from the base object. If the base object is deltified, it must be
    +converted to canonical form first. Each instruction appends more and
    +more data to the target object until it's complete. There are two
    +supported instructions so far: one for copying a byte range from the
    +source object and one for inserting new data embedded in the
    +instruction itself.
     
     Each instruction has variable length. Instruction type is determined
     by the seventh bit of the first octet. The following diagrams follow
     the convention in RFC 1951 (Deflate compressed data format).
     
    +==== Instruction to copy from base object
    +
       +----------+---------+---------+---------+---------+-------+-------+-------+
       | 1xxxxxxx | offset1 | offset2 | offset3 | offset4 | size1 | size2 | size3 |
       +----------+---------+---------+---------+---------+-------+-------+-------+
     
     This is the instruction format to copy a byte range from the source
    -object. It encodes the offset to copy from any the number of bytes to
    +object. It encodes the offset to copy from and the number of bytes to
     copy. Offset and size are in little-endian order.
     
     All offset and size bytes are optional. This is to reduce the
    @@ -99,6 +109,8 @@ In its most compact form, this instruction only takes up one byte
     values zero. There is another exception: size zero is automatically
     converted to 0x10000.
     
    +==== Instruction to add new data
    +
       +----------+============+
       | 0xxxxxxx |    data    |
       +----------+============+
    @@ -108,6 +120,8 @@ object. The following data is appended to the target object. The first
     seven bits of the first octet determine the size of data in
     bytes. The size must be non-zero.
     
    +==== Reserved instruction
    +
       +----------+============
       | 00000000 |
       +----------+============

 Documentation/technical/pack-format.txt | 92 +++++++++++++++++++++++++
 cache.h                                 |  5 ++
 2 files changed, 97 insertions(+)

diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt
index 8e5bf60be3..70a99fd142 100644
--- a/Documentation/technical/pack-format.txt
+++ b/Documentation/technical/pack-format.txt
@@ -36,6 +36,98 @@ Git pack format
 
   - The trailer records 20-byte SHA-1 checksum of all of the above.
 
+=== Object types
+
+Valid object types are:
+
+- OBJ_COMMIT (1)
+- OBJ_TREE (2)
+- OBJ_BLOB (3)
+- OBJ_TAG (4)
+- OBJ_OFS_DELTA (6)
+- OBJ_REF_DELTA (7)
+
+Type 5 is reserved for future expansion. Type 0 is invalid.
+
+=== Deltified representation
+
+Conceptually there are only four object types: commit, tree, tag and
+blob. However to save space, an object could be stored as a "delta" of
+another "base" object. These representations are assigned new types
+ofs-delta and ref-delta, which are only valid in a pack file.
+
+Both ofs-delta and ref-delta store the "delta" to be applied to
+another object (called 'base object') to reconstruct the object. The
+difference between them is, ref-delta directly encodes 20-byte base
+object name. If the base object is in the same pack, ofs-delta encodes
+the offset of the base object in the pack instead.
+
+The base object could also be deltified if it's in the same pack.
+Ref-delta can also refer to an object outside the pack (i.e. the
+so-called "thin pack"). When stored on disk however, the pack should
+be self-contained to avoid cyclic dependencies.
+
+The delta data is a sequence of instructions to reconstruct an object
+from the base object. If the base object is deltified, it must be
+converted to canonical form first. Each instruction appends more and
+more data to the target object until it's complete. There are two
+supported instructions so far: one for copying a byte range from the
+source object and one for inserting new data embedded in the
+instruction itself.
+
+Each instruction has variable length. Instruction type is determined
+by the seventh bit of the first octet. The following diagrams follow
+the convention in RFC 1951 (Deflate compressed data format).
+
+==== Instruction to copy from base object
+
+  +----------+---------+---------+---------+---------+-------+-------+-------+
+  | 1xxxxxxx | offset1 | offset2 | offset3 | offset4 | size1 | size2 | size3 |
+  +----------+---------+---------+---------+---------+-------+-------+-------+
+
+This is the instruction format to copy a byte range from the source
+object. It encodes the offset to copy from and the number of bytes to
+copy. Offset and size are in little-endian order.
+
+All offset and size bytes are optional. This is to reduce the
+instruction size when encoding small offsets or sizes. The first seven
+bits in the first octet determine which of the next seven octets are
+present. If bit zero is set, offset1 is present. If bit one is set,
+offset2 is present and so on.
+
+Note that a more compact instruction does not change offset and size
+encoding. For example, if only offset2 is omitted like below, offset3
+still contains bits 16-23. It does not become offset2 and does not
+contain bits 8-15, even though it is right next to offset1.
+
+  +----------+---------+---------+
+  | 10000101 | offset1 | offset3 |
+  +----------+---------+---------+
+
+In its most compact form, this instruction only takes up one byte
+(0x80) with both offset and size omitted, which will have default
+values zero. There is another exception: size zero is automatically
+converted to 0x10000.
+
+==== Instruction to add new data
+
+  +----------+============+
+  | 0xxxxxxx |    data    |
+  +----------+============+
+
+This is the instruction to construct target object without the base
+object. The following data is appended to the target object. The first
+seven bits of the first octet determine the size of data in
+bytes. The size must be non-zero.
+
+==== Reserved instruction
+
+  +----------+============
+  | 00000000 |
+  +----------+============
+
+This is the instruction reserved for future expansion.
+
 == Original (version 1) pack-*.idx files have the following format:
 
   - The header consists of 256 4-byte network byte order
diff --git a/cache.h b/cache.h
index 77b7acebb6..ad549e258e 100644
--- a/cache.h
+++ b/cache.h
@@ -373,6 +373,11 @@ extern void free_name_hash(struct index_state *istate);
 #define read_blob_data_from_cache(path, sz) read_blob_data_from_index(&the_index, (path), (sz))
 #endif
 
+/*
+ * Values in this enum (except those outside the 3 bit range) are part
+ * of pack file format. See Documentation/technical/pack-format.txt
+ * for more information.
+ */
 enum object_type {
 	OBJ_BAD = -1,
 	OBJ_NONE = 0,
-- 
2.17.0.705.g3525833791
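
For anyone reading the above in order to write a parser, the two
instructions described in the patch can be sketched roughly as below.
This is only an illustration under stated assumptions, not the code in
patch-delta.c: the function name apply_delta_instructions() and its
return convention are made up for this example, and it assumes the
caller hands it only the instruction stream (any header that precedes
the instructions in the delta data is assumed to have been consumed
already) together with a base object that is already in canonical form.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not git's API): apply a stream of delta
 * instructions to "base" and write the result to "out".
 * Returns the number of bytes written, or -1 on malformed input.
 */
static long apply_delta_instructions(const unsigned char *base, size_t base_len,
				     const unsigned char *delta, size_t delta_len,
				     unsigned char *out, size_t out_cap)
{
	size_t pos = 0, written = 0;

	while (pos < delta_len) {
		unsigned char cmd = delta[pos++];

		if (cmd & 0x80) {
			/* Copy from base: bits 0-3 select offset1..offset4 */
			size_t offset = 0, size = 0;
			int i;

			for (i = 0; i < 4; i++) {
				if (cmd & (1 << i)) {
					if (pos >= delta_len)
						return -1;
					/* an omitted byte leaves its bits at zero */
					offset |= (size_t)delta[pos++] << (8 * i);
				}
			}
			/* bits 4-6 select size1..size3 */
			for (i = 0; i < 3; i++) {
				if (cmd & (0x10 << i)) {
					if (pos >= delta_len)
						return -1;
					size |= (size_t)delta[pos++] << (8 * i);
				}
			}
			if (size == 0)
				size = 0x10000; /* size zero means 0x10000 */
			if (offset > base_len || size > base_len - offset)
				return -1;
			if (size > out_cap - written)
				return -1;
			memcpy(out + written, base + offset, size);
			written += size;
		} else if (cmd) {
			/* Add new data: the low seven bits are the length */
			size_t size = cmd;

			if (size > delta_len - pos || size > out_cap - written)
				return -1;
			memcpy(out + written, delta + pos, size);
			pos += size;
			written += size;
		} else {
			/* 0x00 is reserved for future expansion */
			return -1;
		}
	}
	return (long)written;
}
```

As a usage example, with the 11-byte base "hello world", the
instruction bytes 0x91 0x06 0x05 (copy 5 bytes from offset 6),
0x02 ',' ' ' (add two bytes) and 0x90 0x05 (copy 5 bytes, with the
offset omitted and therefore zero) should produce "world, hello".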



