All of lore.kernel.org
 help / color / mirror / Atom feed
* rev-list and "ambiguous" IDs
@ 2019-11-14  4:35 Bryan Turner
  2019-11-14  5:59 ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Bryan Turner @ 2019-11-14  4:35 UTC (permalink / raw)
  To: Git Users

When using a command like `git rev-list dc41e --`, it's possible to
get output like this: (from newer Git versions)
error: short SHA1 dc41e is ambiguous
hint: The candidates are:
hint:   dc41eeb01ba commit 2012-11-23 - Stuff from the commit message
hint:   dc41e0d508b tree
hint:   dc41e5bef41 tree
hint:   dc41e11ee18 blob
fatal: bad revision 'dc41e'

Is there any way to ask rev-list to be a little...pickier about what
it considers a candidate? Almost without question the two trees and
the blob aren't what I'm asking for, which means there's actually only
one real candidate.

Also, while considering this, I noticed that `git rev-list
dc41e11ee18` (the blob from the output above) doesn't fail. It
silently exits, nothing written to stdout or stderr, with 0 status. A
little surprising; I would have expected rev-list to complain that
dc41e11ee18 isn't a valid commit-ish value.

Bryan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: rev-list and "ambiguous" IDs
  2019-11-14  4:35 rev-list and "ambiguous" IDs Bryan Turner
@ 2019-11-14  5:59 ` Jeff King
  2019-11-15  0:12   ` Thomas Braun
  2019-11-15  1:19   ` Bryan Turner
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff King @ 2019-11-14  5:59 UTC (permalink / raw)
  To: Bryan Turner; +Cc: Git Users

On Wed, Nov 13, 2019 at 08:35:47PM -0800, Bryan Turner wrote:

> When using a command like `git rev-list dc41e --`, it's possible to
> get output like this: (from newer Git versions)
> error: short SHA1 dc41e is ambiguous
> hint: The candidates are:
> hint:   dc41eeb01ba commit 2012-11-23 - Stuff from the commit message
> hint:   dc41e0d508b tree
> hint:   dc41e5bef41 tree
> hint:   dc41e11ee18 blob
> fatal: bad revision 'dc41e'
>
> Is there any way to ask rev-list to be a little...pickier about what
> it considers a candidate? Almost without question the two trees and
> the blob aren't what I'm asking for, which means there's actually only
> one real candidate.

Try "dc41e^{commit}", which will realize that trees and blobs cannot
peel to a commit (there would still be an ambiguity with a tag).

I think one could argue that without "--objects" in play, rev-list
should automatically disambiguate in favor of a committish. But that's
not true for every command.

You can also set core.disambiguate to "committish" (or even "commit").
At the time we added that option (and started reporting the list of
candidates), we pondered whether it might make sense to make that the
default. That would probably help in a lot of cases, but the argument
against it is that when it goes wrong, it may be quite confusing (so
we're better off with the current message, which punts back to the
user).

I think it also comes up fairly rarely these days, as short sha1s we
print have some headroom built in (as you can see above; the one you've
input is really quite short compared to anything Git would have printed
in that repo).

> Also, while considering this, I noticed that `git rev-list
> dc41e11ee18` (the blob from the output above) doesn't fail. It
> silently exits, nothing written to stdout or stderr, with 0 status. A
> little surprising; I would have expected rev-list to complain that
> dc41e11ee18 isn't a valid commit-ish value.

Yeah, this is a separate issue. If the revision machinery has pending
trees or blobs but isn't asked to show them via "--objects", then it
just ignores them.

I've been running with the patch below for several years; it just adds a
warning when we ignore such an object. I've been tempted to send it for
inclusion, but it has some rough edges:

  - there are some fast-export calls in the test scripts that trigger
    this. I don't remember the details, and what the fix would look
    like.

  - it makes wildcards like "rev-list --all" complain, because they may
    add a tag-of-blob, for example (in git.git, junio-gpg-pub triggers
    this). Things like "--all" would probably need to get smarter, and
    avoid adding non-commits in the first place (when --objects is not
    in use, of course)

---
 revision.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/revision.c b/revision.c
index 0e39b2b8a5..7dc2d9a822 100644
--- a/revision.c
+++ b/revision.c
@@ -393,6 +393,16 @@ void add_pending_oid(struct rev_info *revs, const char *name,
 	add_pending_object(revs, object, name);
 }
 
+static void warn_ignored_object(struct object *object, const char *name)
+{
+	if (object->flags & UNINTERESTING)
+		return;
+
+	warning(_("ignoring %s object in traversal: %s"),
+		type_name(object->type),
+		(name && *name) ? name : oid_to_hex(&object->oid));
+}
+
 static struct commit *handle_commit(struct rev_info *revs,
 				    struct object_array_entry *entry)
 {
@@ -458,8 +468,10 @@ static struct commit *handle_commit(struct rev_info *revs,
 	 */
 	if (object->type == OBJ_TREE) {
 		struct tree *tree = (struct tree *)object;
-		if (!revs->tree_objects)
+		if (!revs->tree_objects) {
+			warn_ignored_object(object, name);
 			return NULL;
+		}
 		if (flags & UNINTERESTING) {
 			mark_tree_contents_uninteresting(revs->repo, tree);
 			return NULL;
@@ -472,8 +484,10 @@ static struct commit *handle_commit(struct rev_info *revs,
 	 * Blob object? You know the drill by now..
 	 */
 	if (object->type == OBJ_BLOB) {
-		if (!revs->blob_objects)
+		if (!revs->blob_objects) {
+			warn_ignored_object(object, name);
 			return NULL;
+		}
 		if (flags & UNINTERESTING)
 			return NULL;
 		add_pending_object_with_path(revs, object, name, mode, path);
-- 
2.24.0.739.gb5632e4929


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: rev-list and "ambiguous" IDs
  2019-11-14  5:59 ` Jeff King
@ 2019-11-15  0:12   ` Thomas Braun
  2019-11-15  3:49     ` Jeff King
  2019-11-15  5:07     ` Junio C Hamano
  2019-11-15  1:19   ` Bryan Turner
  1 sibling, 2 replies; 13+ messages in thread
From: Thomas Braun @ 2019-11-15  0:12 UTC (permalink / raw)
  To: Jeff King, Bryan Turner; +Cc: Git Users

Am 14.11.2019 um 06:59 schrieb Jeff King:

[...]

> You can also set core.disambiguate to "committish" (or even "commit").
> At the time we added that option (and started reporting the list of
> candidates), we pondered whether it might make sense to make that the
> default.

I did not know this setting. Thanks!

> That would probably help in a lot of cases, but the argument
> against it is that when it goes wrong, it may be quite confusing (so
> we're better off with the current message, which punts back to the
> user).

Just out of curiosity: Is there a use case for inspecting non-commit
objects with git log?

If I do (in the git repo)

$ git log 1231

I get

error: short SHA1 1231 is ambiguous
hint: The candidates are:
hint:   123139fc89 tree
hint:   12316a1673 tree
hint:   123144fe8a blob
fatal: ambiguous argument '1231': unknown revision or path not in the
working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

with
$ git --version
git version 2.24.0.windows.2

and all of these candidates are no commits.

> I think it also comes up fairly rarely these days, as short sha1s we
> print have some headroom built in (as you can see above; the one you've
> input is really quite short compared to anything Git would have printed
> in that repo).
> 
>> Also, while considering this, I noticed that `git rev-list
>> dc41e11ee18` (the blob from the output above) doesn't fail. It
>> silently exits, nothing written to stdout or stderr, with 0 status. A
>> little surprising; I would have expected rev-list to complain that
>> dc41e11ee18 isn't a valid commit-ish value.
> 
> Yeah, this is a separate issue. If the revision machinery has pending
> trees or blobs but isn't asked to show them via "--objects", then it
> just ignores them.
> 
> I've been running with the patch below for several years; it just adds a
> warning when we ignore such an object. I've been tempted to send it for
> inclusion, but it has some rough edges:
> 
>   - there are some fast-export calls in the test scripts that trigger
>     this. I don't remember the details, and what the fix would look
>     like.
> 
>   - it makes wildcards like "rev-list --all" complain, because they may
>     add a tag-of-blob, for example (in git.git, junio-gpg-pub triggers
>     this). Things like "--all" would probably need to get smarter, and
>     avoid adding non-commits in the first place (when --objects is not
>     in use, of course)
> 
> ---
>  revision.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/revision.c b/revision.c
> index 0e39b2b8a5..7dc2d9a822 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -393,6 +393,16 @@ void add_pending_oid(struct rev_info *revs, const char *name,
>  	add_pending_object(revs, object, name);
>  }
>  
> +static void warn_ignored_object(struct object *object, const char *name)
> +{
> +	if (object->flags & UNINTERESTING)
> +		return;
> +
> +	warning(_("ignoring %s object in traversal: %s"),
> +		type_name(object->type),
> +		(name && *name) ? name : oid_to_hex(&object->oid));
> +}
> +
>  static struct commit *handle_commit(struct rev_info *revs,
>  				    struct object_array_entry *entry)
>  {
> @@ -458,8 +468,10 @@ static struct commit *handle_commit(struct rev_info *revs,
>  	 */
>  	if (object->type == OBJ_TREE) {
>  		struct tree *tree = (struct tree *)object;
> -		if (!revs->tree_objects)
> +		if (!revs->tree_objects) {
> +			warn_ignored_object(object, name);
>  			return NULL;
> +		}
>  		if (flags & UNINTERESTING) {
>  			mark_tree_contents_uninteresting(revs->repo, tree);
>  			return NULL;
> @@ -472,8 +484,10 @@ static struct commit *handle_commit(struct rev_info *revs,
>  	 * Blob object? You know the drill by now..
>  	 */
>  	if (object->type == OBJ_BLOB) {
> -		if (!revs->blob_objects)
> +		if (!revs->blob_objects) {
> +			warn_ignored_object(object, name);
>  			return NULL;
> +		}
>  		if (flags & UNINTERESTING)
>  			return NULL;
>  		add_pending_object_with_path(revs, object, name, mode, path);
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: rev-list and "ambiguous" IDs
  2019-11-14  5:59 ` Jeff King
  2019-11-15  0:12   ` Thomas Braun
@ 2019-11-15  1:19   ` Bryan Turner
  2019-11-15  3:57     ` Jeff King
  1 sibling, 1 reply; 13+ messages in thread
From: Bryan Turner @ 2019-11-15  1:19 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Users

On Wed, Nov 13, 2019 at 9:59 PM Jeff King <peff@peff.net> wrote:
>
> On Wed, Nov 13, 2019 at 08:35:47PM -0800, Bryan Turner wrote:
>
> > When using a command like `git rev-list dc41e --`, it's possible to
> > get output like this: (from newer Git versions)
> > error: short SHA1 dc41e is ambiguous
> > hint: The candidates are:
> > hint:   dc41eeb01ba commit 2012-11-23 - Stuff from the commit message
> > hint:   dc41e0d508b tree
> > hint:   dc41e5bef41 tree
> > hint:   dc41e11ee18 blob
> > fatal: bad revision 'dc41e'
> >
> > Is there any way to ask rev-list to be a little...pickier about what
> > it considers a candidate? Almost without question the two trees and
> > the blob aren't what I'm asking for, which means there's actually only
> > one real candidate.
>
> Try "dc41e^{commit}", which will realize that trees and blobs cannot
> peel to a commit (there would still be an ambiguity with a tag).

Slick!

>
> I think one could argue that without "--objects" in play, rev-list
> should automatically disambiguate in favor of a committish. But that's
> not true for every command.
>
> You can also set core.disambiguate to "committish" (or even "commit").
> At the time we added that option (and started reporting the list of
> candidates), we pondered whether it might make sense to make that the
> default. That would probably help in a lot of cases, but the argument
> against it is that when it goes wrong, it may be quite confusing (so
> we're better off with the current message, which punts back to the
> user).

Having no disambiguation by default seems fine. Both of the approaches
here seem easy enough to activate explicitly in cases where the caller
(in this case Bitbucket Server; more on that later) knows they're
looking for a commit.

>
> I think it also comes up fairly rarely these days, as short sha1s we
> print have some headroom built in (as you can see above; the one you've
> input is really quite short compared to anything Git would have printed
> in that repo).

Just to provide a little context, this isn't coming up as something I
myself hit. Rather, it's a fairly common issue reported by Bitbucket
Server end users, and I would assume it happens with other hosting
providers as well: A user URL-hacks an ambiguous (or "ambiguous", in
cases like this) short hash and is disappointed when the system
doesn't manage to find the commit they were looking for. I'm just
investigating possible avenues for improving how Bitbucket Server
handles these cases. One option is to (essentially) parse the "hint",
if it's present, to get the candidates, and include them on the error
message we display. But in cases like the above it gets weird because
there's only one _commit_ candidate, and having our error message
include trees and blobs seems likely to be confusing/unexpected. I
suspect most Bitbucket Server users would say "The answer's obvious!
Why didn't you just use the commit?!", and I can sort of get behind
that view. The combination of using the disambiguation mechanism, so
single-commit ambiguities are resolved automatically, and parsing the
hint seems like it would produce the most logical behavior.

Where users get the short hashes they try is an interesting question.
As you say, Git wouldn't display a 5 character short hash, at least by
default, and Bitbucket Server doesn't either; it shows a flat 11
characters. I'm not sure, on that point.

Thanks for your insights! Learned a new Git trick today.

Best regards,
Bryan Turner



>
> > Also, while considering this, I noticed that `git rev-list
> > dc41e11ee18` (the blob from the output above) doesn't fail. It
> > silently exits, nothing written to stdout or stderr, with 0 status. A
> > little surprising; I would have expected rev-list to complain that
> > dc41e11ee18 isn't a valid commit-ish value.
>
> Yeah, this is a separate issue. If the revision machinery has pending
> trees or blobs but isn't asked to show them via "--objects", then it
> just ignores them.
>
> I've been running with the patch below for several years; it just adds a
> warning when we ignore such an object. I've been tempted to send it for
> inclusion, but it has some rough edges:
>
>   - there are some fast-export calls in the test scripts that trigger
>     this. I don't remember the details, and what the fix would look
>     like.
>
>   - it makes wildcards like "rev-list --all" complain, because they may
>     add a tag-of-blob, for example (in git.git, junio-gpg-pub triggers
>     this). Things like "--all" would probably need to get smarter, and
>     avoid adding non-commits in the first place (when --objects is not
>     in use, of course)
>
> ---
>  revision.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/revision.c b/revision.c
> index 0e39b2b8a5..7dc2d9a822 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -393,6 +393,16 @@ void add_pending_oid(struct rev_info *revs, const char *name,
>         add_pending_object(revs, object, name);
>  }
>
> +static void warn_ignored_object(struct object *object, const char *name)
> +{
> +       if (object->flags & UNINTERESTING)
> +               return;
> +
> +       warning(_("ignoring %s object in traversal: %s"),
> +               type_name(object->type),
> +               (name && *name) ? name : oid_to_hex(&object->oid));
> +}
> +
>  static struct commit *handle_commit(struct rev_info *revs,
>                                     struct object_array_entry *entry)
>  {
> @@ -458,8 +468,10 @@ static struct commit *handle_commit(struct rev_info *revs,
>          */
>         if (object->type == OBJ_TREE) {
>                 struct tree *tree = (struct tree *)object;
> -               if (!revs->tree_objects)
> +               if (!revs->tree_objects) {
> +                       warn_ignored_object(object, name);
>                         return NULL;
> +               }
>                 if (flags & UNINTERESTING) {
>                         mark_tree_contents_uninteresting(revs->repo, tree);
>                         return NULL;
> @@ -472,8 +484,10 @@ static struct commit *handle_commit(struct rev_info *revs,
>          * Blob object? You know the drill by now..
>          */
>         if (object->type == OBJ_BLOB) {
> -               if (!revs->blob_objects)
> +               if (!revs->blob_objects) {
> +                       warn_ignored_object(object, name);
>                         return NULL;
> +               }
>                 if (flags & UNINTERESTING)
>                         return NULL;
>                 add_pending_object_with_path(revs, object, name, mode, path);
> --
> 2.24.0.739.gb5632e4929
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: rev-list and "ambiguous" IDs
  2019-11-15  0:12   ` Thomas Braun
@ 2019-11-15  3:49     ` Jeff King
  2019-11-15 23:38       ` Thomas Braun
  2019-11-15  5:07     ` Junio C Hamano
  1 sibling, 1 reply; 13+ messages in thread
From: Jeff King @ 2019-11-15  3:49 UTC (permalink / raw)
  To: Thomas Braun; +Cc: Bryan Turner, Git Users

On Fri, Nov 15, 2019 at 01:12:47AM +0100, Thomas Braun wrote:

> > That would probably help in a lot of cases, but the argument
> > against it is that when it goes wrong, it may be quite confusing (so
> > we're better off with the current message, which punts back to the
> > user).
> 
> Just out of curiosity: Is there a use case for inspecting non-commit
> objects with git log?

Not that I can think of. You can't even say "--objects" there.

And indeed, "git log" already prefers commits for disambiguation, since
d5f6b1d756 (revision.c: the "log" family, except for "show", takes
committish, 2012-07-02).

But...

> If I do (in the git repo)
> 
> $ git log 1231
> 
> I get
> 
> error: short SHA1 1231 is ambiguous
> hint: The candidates are:
> hint:   123139fc89 tree
> hint:   12316a1673 tree
> hint:   123144fe8a blob
> fatal: ambiguous argument '1231': unknown revision or path not in the
> working tree.
> Use '--' to separate paths from revisions, like this:
> 'git <command> [<revision>...] -- [<file>...]'
> 
> with
> $ git --version
> git version 2.24.0.windows.2
> 
> and all of these candidates are no commits.

...remember that the disambiguation code is just about preferring one
object to the other. If the rule in effect doesn't have a preference,
it's still ambiguous. On my system, "1231" actually _does_ have a
commit:

  $ git show 1231
  error: short SHA1 1231 is ambiguous
  hint: The candidates are:
  hint:   12319e3bf2 commit 2017-03-25 - Merge 'git-gui-add-2nd-line' into HEAD
  hint:   123139fc89 tree
  hint:   12315b58b8 tree
  hint:   12316a1673 tree
  hint:   12317ab2d9 tree
  hint:   123193f802 tree
  hint:   123144fe8a blob
  fatal: ambiguous argument '1231': unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git <command> [<revision>...] -- [<file>...]'

That's ambiguous because git-show can handle trees and blobs, too. But
if I feed that sha1 to git-log:

  $ git log --oneline -1 1231
  12319e3bf2 Merge 'git-gui-add-2nd-line' into HEAD

it's perfectly fine, because git-log knows to disambiguate the commit.
But if I choose another prefix that has no commits at all, it's
ambiguous under either, because the "committish" rule has no way to
decide:

  $ git show abcd2
  error: short SHA1 abcd2 is ambiguous
  hint: The candidates are:
  hint:   abcd22f55e tree
  hint:   abcd238df0 tree
  hint:   abcd2b1cc8 blob
  
  $ git log abcd2
  error: short SHA1 abcd2 is ambiguous
  hint: The candidates are:
  hint:   abcd22f55e tree
  hint:   abcd238df0 tree
  hint:   abcd2b1cc8 blob

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: rev-list and "ambiguous" IDs
  2019-11-15  1:19   ` Bryan Turner
@ 2019-11-15  3:57     ` Jeff King
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2019-11-15  3:57 UTC (permalink / raw)
  To: Bryan Turner; +Cc: Git Users

On Thu, Nov 14, 2019 at 05:19:39PM -0800, Bryan Turner wrote:

> Just to provide a little context, this isn't coming up as something I
> myself hit. Rather, it's a fairly common issue reported by Bitbucket
> Server end users, and I would assume it happens with other hosting
> providers as well: A user URL-hacks an ambiguous (or "ambiguous", in
> cases like this) short hash and is disappointed when the system
> doesn't manage to find the commit they were looking for. I'm just
> investigating possible avenues for improving how Bitbucket Server
> handles these cases. One option is to (essentially) parse the "hint",
> if it's present, to get the candidates, and include them on the error
> message we display. But in cases like the above it gets weird because
> there's only one _commit_ candidate, and having our error message
> include trees and blobs seems likely to be confusing/unexpected. I
> suspect most Bitbucket Server users would say "The answer's obvious!
> Why didn't you just use the commit?!", and I can sort of get behind
> that view. The combination of using the disambiguation mechanism, so
> single-commit ambiguities are resolved automatically, and parsing the
> hint seems like it would produce the most logical behavior.

It depends on your URL scheme obviously, but on GitHub for example, it
would make sense for https://github.com/user/repo/commit/1234abcd to use
the "^{commit}" trick. I don't think it currently does, though.

> Where users get the short hashes they try is an interesting question.
> As you say, Git wouldn't display a 5 character short hash, at least by
> default, and Bitbucket Server doesn't either; it shows a flat 11
> characters. I'm not sure, on that point.

GitHub often produces 7-char short hashes, because it's abbreviating
them in presentation code that doesn't want to spend the round trip to
talk to the repo (to find out if it's unique, or how many objects are in
the repo). It _usually_ shouldn't matter much, because we try to produce
abbreviated hashes where users might read them, and long hashes when we
generate URLs. But of course people sometimes generate URLs themselves
from who knows where. :) I've been lightly lobbying to bump our default
to something higher, like 12.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: rev-list and "ambiguous" IDs
  2019-11-15  0:12   ` Thomas Braun
  2019-11-15  3:49     ` Jeff King
@ 2019-11-15  5:07     ` Junio C Hamano
  2019-11-15  8:16       ` Jeff King
  1 sibling, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2019-11-15  5:07 UTC (permalink / raw)
  To: Thomas Braun; +Cc: Jeff King, Bryan Turner, Git Users

Thomas Braun <thomas.braun@virtuell-zuhause.de> writes:

> Just out of curiosity: Is there a use case for inspecting non-commit
> objects with git log?

I do not think there is (rev-list is a different story, given that
you can pass --objects), and it probably is not too difficult to
teach "git log" and friends that they only want commit-ish.

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: rev-list and "ambiguous" IDs
  2019-11-15  5:07     ` Junio C Hamano
@ 2019-11-15  8:16       ` Jeff King
  2019-11-15 11:23         ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2019-11-15  8:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Braun, Bryan Turner, Git Users

On Fri, Nov 15, 2019 at 02:07:13PM +0900, Junio C Hamano wrote:

> Thomas Braun <thomas.braun@virtuell-zuhause.de> writes:
> 
> > Just out of curiosity: Is there a use case for inspecting non-commit
> > objects with git log?
> 
> I do not think there is (rev-list is a different story, given that
> you can pass --objects), and it probably is not too difficult to
> teach "git log" and friends that they only want commit-ish.

I think you already did; see my other reply. ;)

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: rev-list and "ambiguous" IDs
  2019-11-15  8:16       ` Jeff King
@ 2019-11-15 11:23         ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2019-11-15 11:23 UTC (permalink / raw)
  To: Jeff King; +Cc: Thomas Braun, Bryan Turner, Git Users

Jeff King <peff@peff.net> writes:

> On Fri, Nov 15, 2019 at 02:07:13PM +0900, Junio C Hamano wrote:
>
>> Thomas Braun <thomas.braun@virtuell-zuhause.de> writes:
>> 
>> > Just out of curiosity: Is there a use case for inspecting non-commit
>> > objects with git log?
>> 
>> I do not think there is (rev-list is a different story, given that
>> you can pass --objects), and it probably is not too difficult to
>> teach "git log" and friends that they only want commit-ish.
>
> I think you already did; see my other reply. ;)

Heh, indeed I did.  I just did not recall.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: rev-list and "ambiguous" IDs
  2019-11-15  3:49     ` Jeff King
@ 2019-11-15 23:38       ` Thomas Braun
  2019-11-16  3:47         ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Braun @ 2019-11-15 23:38 UTC (permalink / raw)
  To: Jeff King; +Cc: Bryan Turner, Git Users

Am 15.11.2019 um 04:49 schrieb Jeff King:
> On Fri, Nov 15, 2019 at 01:12:47AM +0100, Thomas Braun wrote:
> 
>>> That would probably help in a lot of cases, but the argument
>>> against it is that when it goes wrong, it may be quite confusing (so
>>> we're better off with the current message, which punts back to the
>>> user).
>>
>> Just out of curiosity: Is there a use case for inspecting non-commit
>> objects with git log?
> 
> Not that I can think of. You can't even say "--objects" there.
> 
> And indeed, "git log" already prefers commits for disambiguation, since
> d5f6b1d756 (revision.c: the "log" family, except for "show", takes
> committish, 2012-07-02).
> 
> But...
> 
>> If I do (in the git repo)
>>
>> $ git log 1231
>>
>> I get
>>
>> error: short SHA1 1231 is ambiguous
>> hint: The candidates are:
>> hint:   123139fc89 tree
>> hint:   12316a1673 tree
>> hint:   123144fe8a blob
>> fatal: ambiguous argument '1231': unknown revision or path not in the
>> working tree.
>> Use '--' to separate paths from revisions, like this:
>> 'git <command> [<revision>...] -- [<file>...]'
>>
>> with
>> $ git --version
>> git version 2.24.0.windows.2
>>
>> and all of these candidates are no commits.
> 
> ...remember that the disambiguation code is just about preferring one
> object to the other. If the rule in effect doesn't have a preference,
> it's still ambiguous. On my system, "1231" actually _does_ have a
> commit:
> 
>   $ git show 1231
>   error: short SHA1 1231 is ambiguous
>   hint: The candidates are:
>   hint:   12319e3bf2 commit 2017-03-25 - Merge 'git-gui-add-2nd-line' into HEAD
>   hint:   123139fc89 tree
>   hint:   12315b58b8 tree
>   hint:   12316a1673 tree
>   hint:   12317ab2d9 tree
>   hint:   123193f802 tree
>   hint:   123144fe8a blob
>   fatal: ambiguous argument '1231': unknown revision or path not in the working tree.
>   Use '--' to separate paths from revisions, like this:
>   'git <command> [<revision>...] -- [<file>...]'
> 
> That's ambiguous because git-show can handle trees and blobs, too. But
> if I feed that sha1 to git-log:
> 
>   $ git log --oneline -1 1231
>   12319e3bf2 Merge 'git-gui-add-2nd-line' into HEAD
> 
> it's perfectly fine, because git-log knows to disambiguate the commit.
> But if I choose another prefix that has no commits at all, it's
> ambiguous under either, because the "committish" rule has no way to
> decide:
> 
>   $ git show abcd2
>   error: short SHA1 abcd2 is ambiguous
>   hint: The candidates are:
>   hint:   abcd22f55e tree
>   hint:   abcd238df0 tree
>   hint:   abcd2b1cc8 blob
>   
>   $ git log abcd2
>   error: short SHA1 abcd2 is ambiguous
>   hint: The candidates are:
>   hint:   abcd22f55e tree
>   hint:   abcd238df0 tree
>   hint:   abcd2b1cc8 blob

I would have expected that git log did just tell me that it could not
find something commitish, instead it told me that there are multiple
candidates, all of them being no commit.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: rev-list and "ambiguous" IDs
  2019-11-15 23:38       ` Thomas Braun
@ 2019-11-16  3:47         ` Junio C Hamano
  2019-11-18 12:03           ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2019-11-16  3:47 UTC (permalink / raw)
  To: Thomas Braun; +Cc: Jeff King, Bryan Turner, Git Users

Thomas Braun <thomas.braun@virtuell-zuhause.de> writes:

>> But if I choose another prefix that has no commits at all, it's
>> ambiguous under either, because the "committish" rule has no way to
>> decide:
>> 
>>   $ git show abcd2
>>   error: short SHA1 abcd2 is ambiguous
>>   hint: The candidates are:
>>   hint:   abcd22f55e tree
>>   hint:   abcd238df0 tree
>>   hint:   abcd2b1cc8 blob
>>   
>>   $ git log abcd2
>>   error: short SHA1 abcd2 is ambiguous
>>   hint: The candidates are:
>>   hint:   abcd22f55e tree
>>   hint:   abcd238df0 tree
>>   hint:   abcd2b1cc8 blob
>
> I would have expected that git log did just tell me that it could not
> find something commitish, instead it told me that there are multiple
> candidates, all of them being no commit.

With this, I 100% agree with.   The latter should instead say

    $ git log abcd2 [--]
    error: bad revision 'abcd2'

just like the case where no object has abcd2 as prefix.

When we ask for commit-ish or any specific type in general, there
are a few possible cases.

 - There is only one such object that has the prefix and is
   compatible with the type.  We handle this correctly---yield that
   object and do not complain about ambiguity.

 - There are two or more such objects, or there is no such object.
   We show all objects that share the prefix, regardless of the
   type, which is way suboptimal.

An improvement can be localized to sha1-name.c::get_short_oid(), I
would think.  We know what type we want (e.g. GET_OID_COMMITTISH)
in the function, so we should be able to teach collect_ambiguous() 
to discard an object with the given prefix but of a wrong type.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: rev-list and "ambiguous" IDs
  2019-11-16  3:47         ` Junio C Hamano
@ 2019-11-18 12:03           ` Jeff King
  2019-11-19  1:24             ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2019-11-18 12:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Braun, Bryan Turner, Git Users

On Sat, Nov 16, 2019 at 12:47:20PM +0900, Junio C Hamano wrote:

> > I would have expected that git log did just tell me that it could not
> > find something commitish, instead it told me that there are multiple
> > candidates, all of them being no commit.
> 
> With this, I 100% agree with.   The latter should instead say
> 
>     $ git log abcd2 [--]
>     error: bad revision 'abcd2'
> 
> just like the case where no object has abcd2 as prefix.
> 
> When we ask for commit-ish or any specific type in general, there
> are a few possible cases.
> 
>  - There is only one such object that has the prefix and is
>    compatible with the type.  We handle this correctly---yield that
>    object and do not complain about ambiguity.
> 
>  - There are two or more such objects, or there is no such object.
>    We show all objects that share the prefix, regardless of the
>    type, which is way suboptimal.
> 
> An improvement can be localized to sha1-name.c::get_short_oid(), I
> would think.  We know what type we want (e.g. GET_OID_COMMITTISH)
> in the function, so we should be able to teach collect_ambiguous() 
> to discard an object with the given prefix but of a wrong type.

I think that changes the meaning of GET_OID_COMMITTISH, though. Right
now it means "if disambiguating, prefer a committish", but not "I can
only accept a commit". So we would still happily return an unambiguous
object that does not match that type. And that is why "git -c
core.disambiguate=committish show $short_blob" works, for example.

If you adjust your first case above to "only one such object...and the
type does not matter" then I think it is OK. I.e., the logic in
get_short_oid() becomes:

  - if there is only one, return it

  - if there is more than one, and only one matches the disambiguator,
    return it

  - otherwise, _do not_ print the ambiguous list, and return an error
    (and no object at all), letting the caller complain

where the third part is the new behavior. I think that helps in some
ways (you do not get a list of non-commits for a context that only takes
commits). But it might also hurt, because it gives the user less
information. E.g., imagine the user feeds a short sha1 that they know to
be a blob to a command expecting a committish and is told "no, that
short sha1 does not exist", even though the actual problem is that there
are two such blobs.

I think it's a bit simpler for a command which doesn't expect
non-commits at all, like "git log". But it would need to communicate
that to get_short_oid() with more than just GET_OID_COMMITTISH, so that
the latter can tell it apart from contexts which merely prefer a
committish.

I'm also not entirely sure that even that case doesn't suffer from
telling the user less information. If I say "git log 1234" knowing that
"1234" is a blob, that's a mistake. But Git may guide me in correcting
that mistake by saying "yes, we know about 1234, but it's ambiguous"
rather than "1234 is not something we know about".

Perhaps a simple fix would just be for get_short_oid()'s error message
to mention the disambiguation rule. E.g., something like:

   $ git show abcd2
   error: short SHA1 abcd2 is ambiguous
   hint: We would have preferred a commit or tag pointing to a commit,
   hint: but none were found. The candidates are:
   hint:   abcd22f55e tree
   hint:   abcd238df0 tree
   hint:   abcd2b1cc8 blob

or

  $ git show abcd2
  error: short SHA1 abcd2 is ambiguous
  hint: We preferred a commit or tag pointing to a commit to other
  hint: object types, but two candidates were found:
  hint:   abcd22f55e commit
  hint:   abcd238df0 commit
  hint:   abcd2b1cc8 blob

(optionally the second one could even not mention the blob, though I
think with the lead-in sentence it's OK).

The verbiage there isn't great (I was trying to avoid the jargon
"committish"), but hopefully you get the point.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: rev-list and "ambiguous" IDs
  2019-11-18 12:03           ` Jeff King
@ 2019-11-19  1:24             ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2019-11-19  1:24 UTC (permalink / raw)
  To: Jeff King; +Cc: Thomas Braun, Bryan Turner, Git Users

Jeff King <peff@peff.net> writes:

> I think that changes the meaning of GET_OID_COMMITTISH, though. Right
> now it means "if disambiguating, prefer a committish", but not "I can
> only accept a commit". So we would still happily return an unambiguous
> object that does not match that type.

Ah, OK, so I was stupid (not a news anymore ;-)

> And that is why "git -c
> core.disambiguate=committish show $short_blob" works, for example.

Yes, and it should work that way.

> Perhaps a simple fix would just be for get_short_oid()'s error message
> to mention the disambiguation rule. E.g., something like:
>
>    $ git show abcd2
>    error: short SHA1 abcd2 is ambiguous
>    hint: We would have preferred a commit or tag pointing to a commit,
>    hint: but none were found. The candidates are:
>    hint:   abcd22f55e tree
>    hint:   abcd238df0 tree
>    hint:   abcd2b1cc8 blob
>
> or
>
>   $ git show abcd2
>   error: short SHA1 abcd2 is ambiguous
>   hint: We preferred a commit or tag pointing to a commit to other
>   hint: object types, but two candidates were found:
>   hint:   abcd22f55e commit
>   hint:   abcd238df0 commit
>   hint:   abcd2b1cc8 blob
>
> (optionally the second one could even not mention the blob, though I
> think with the lead-in sentence it's OK).
>
> The verbiage there isn't great (I was trying to avoid the jargon
> "committish"), but hopefully you get the point.

Yup, if we were to do anything, this is a much more sensible thing
to do than make GET_OID_<TYPE> reject objects that are not of <TYPE>,
I think.

Thanks for a dose of sanity.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-11-19  1:24 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-14  4:35 rev-list and "ambiguous" IDs Bryan Turner
2019-11-14  5:59 ` Jeff King
2019-11-15  0:12   ` Thomas Braun
2019-11-15  3:49     ` Jeff King
2019-11-15 23:38       ` Thomas Braun
2019-11-16  3:47         ` Junio C Hamano
2019-11-18 12:03           ` Jeff King
2019-11-19  1:24             ` Junio C Hamano
2019-11-15  5:07     ` Junio C Hamano
2019-11-15  8:16       ` Jeff King
2019-11-15 11:23         ` Junio C Hamano
2019-11-15  1:19   ` Bryan Turner
2019-11-15  3:57     ` Jeff King

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.