* rev-list and "ambiguous" IDs @ 2019-11-14 4:35 Bryan Turner 2019-11-14 5:59 ` Jeff King 0 siblings, 1 reply; 13+ messages in thread From: Bryan Turner @ 2019-11-14 4:35 UTC (permalink / raw) To: Git Users When using a command like `git rev-list dc41e --`, it's possible to get output like this: (from newer Git versions) error: short SHA1 dc41e is ambiguous hint: The candidates are: hint: dc41eeb01ba commit 2012-11-23 - Stuff from the commit message hint: dc41e0d508b tree hint: dc41e5bef41 tree hint: dc41e11ee18 blob fatal: bad revision 'dc41e' Is there any way to ask rev-list to be a little...pickier about what it considers a candidate? Almost without question the two trees and the blob aren't what I'm asking for, which means there's actually only one real candidate. Also, while considering this, I noticed that `git rev-list dc41e11ee18` (the blob from the output above) doesn't fail. It silently exits, nothing written to stdout or stderr, with 0 status. A little surprising; I would have expected rev-list to complain that dc41e11ee18 isn't a valid commit-ish value. Bryan ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: rev-list and "ambiguous" IDs 2019-11-14 4:35 rev-list and "ambiguous" IDs Bryan Turner @ 2019-11-14 5:59 ` Jeff King 2019-11-15 0:12 ` Thomas Braun 2019-11-15 1:19 ` Bryan Turner 0 siblings, 2 replies; 13+ messages in thread From: Jeff King @ 2019-11-14 5:59 UTC (permalink / raw) To: Bryan Turner; +Cc: Git Users On Wed, Nov 13, 2019 at 08:35:47PM -0800, Bryan Turner wrote: > When using a command like `git rev-list dc41e --`, it's possible to > get output like this: (from newer Git versions) > error: short SHA1 dc41e is ambiguous > hint: The candidates are: > hint: dc41eeb01ba commit 2012-11-23 - Stuff from the commit message > hint: dc41e0d508b tree > hint: dc41e5bef41 tree > hint: dc41e11ee18 blob > fatal: bad revision 'dc41e' > > Is there any way to ask rev-list to be a little...pickier about what > it considers a candidate? Almost without question the two trees and > the blob aren't what I'm asking for, which means there's actually only > one real candidate. Try "dc41e^{commit}", which will realize that trees and blobs cannot peel to a commit (there would still be an ambiguity with a tag). I think one could argue that without "--objects" in play, rev-list should automatically disambiguate in favor of a committish. But that's not true for every command. You can also set core.disambiguate to "committish" (or even "commit"). At the time we added that option (and started reporting the list of candidates), we pondered whether it might make sense to make that the default. That would probably help in a lot of cases, but the argument against it is that when it goes wrong, it may be quite confusing (so we're better off with the current message, which punts back to the user). I think it also comes up fairly rarely these days, as short sha1s we print have some headroom built in (as you can see above; the one you've input is really quite short compared to anything Git would have printed in that repo). > Also, while considering this, I noticed that `git rev-list > dc41e11ee18` (the blob from the output above) doesn't fail. It > silently exits, nothing written to stdout or stderr, with 0 status. A > little surprising; I would have expected rev-list to complain that > dc41e11ee18 isn't a valid commit-ish value. Yeah, this is a separate issue. If the revision machinery has pending trees or blobs but isn't asked to show them via "--objects", then it just ignores them. I've been running with the patch below for several years; it just adds a warning when we ignore such an object. I've been tempted to send it for inclusion, but it has some rough edges: - there are some fast-export calls in the test scripts that trigger this. I don't remember the details, and what the fix would look like. - it makes wildcards like "rev-list --all" complain, because they may add a tag-of-blob, for example (in git.git, junio-gpg-pub triggers this). Things like "--all" would probably need to get smarter, and avoid adding non-commits in the first place (when --objects is not in use, of course) --- revision.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/revision.c b/revision.c index 0e39b2b8a5..7dc2d9a822 100644 --- a/revision.c +++ b/revision.c @@ -393,6 +393,16 @@ void add_pending_oid(struct rev_info *revs, const char *name, add_pending_object(revs, object, name); } +static void warn_ignored_object(struct object *object, const char *name) +{ + if (object->flags & UNINTERESTING) + return; + + warning(_("ignoring %s object in traversal: %s"), + type_name(object->type), + (name && *name) ? name : oid_to_hex(&object->oid)); +} + static struct commit *handle_commit(struct rev_info *revs, struct object_array_entry *entry) { @@ -458,8 +468,10 @@ static struct commit *handle_commit(struct rev_info *revs, */ if (object->type == OBJ_TREE) { struct tree *tree = (struct tree *)object; - if (!revs->tree_objects) + if (!revs->tree_objects) { + warn_ignored_object(object, name); return NULL; + } if (flags & UNINTERESTING) { mark_tree_contents_uninteresting(revs->repo, tree); return NULL; @@ -472,8 +484,10 @@ static struct commit *handle_commit(struct rev_info *revs, * Blob object? You know the drill by now.. */ if (object->type == OBJ_BLOB) { - if (!revs->blob_objects) + if (!revs->blob_objects) { + warn_ignored_object(object, name); return NULL; + } if (flags & UNINTERESTING) return NULL; add_pending_object_with_path(revs, object, name, mode, path); -- 2.24.0.739.gb5632e4929 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: rev-list and "ambiguous" IDs 2019-11-14 5:59 ` Jeff King @ 2019-11-15 0:12 ` Thomas Braun 2019-11-15 3:49 ` Jeff King 2019-11-15 5:07 ` Junio C Hamano 2019-11-15 1:19 ` Bryan Turner 1 sibling, 2 replies; 13+ messages in thread From: Thomas Braun @ 2019-11-15 0:12 UTC (permalink / raw) To: Jeff King, Bryan Turner; +Cc: Git Users Am 14.11.2019 um 06:59 schrieb Jeff King: [...] > You can also set core.disambiguate to "committish" (or even "commit"). > At the time we added that option (and started reporting the list of > candidates), we pondered whether it might make sense to make that the > default. I did not know this setting. Thanks! > That would probably help in a lot of cases, but the argument > against it is that when it goes wrong, it may be quite confusing (so > we're better off with the current message, which punts back to the > user). Just out of curiosity: Is there a use case for inspecting non-commit objects with git log? If I do (in the git repo) $ git log 1231 I get error: short SHA1 1231 is ambiguous hint: The candidates are: hint: 123139fc89 tree hint: 12316a1673 tree hint: 123144fe8a blob fatal: ambiguous argument '1231': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git <command> [<revision>...] -- [<file>...]' with $ git --version git version 2.24.0.windows.2 and all of these candidates are no commits. > I think it also comes up fairly rarely these days, as short sha1s we > print have some headroom built in (as you can see above; the one you've > input is really quite short compared to anything Git would have printed > in that repo). > >> Also, while considering this, I noticed that `git rev-list >> dc41e11ee18` (the blob from the output above) doesn't fail. It >> silently exits, nothing written to stdout or stderr, with 0 status. A >> little surprising; I would have expected rev-list to complain that >> dc41e11ee18 isn't a valid commit-ish value. > > Yeah, this is a separate issue. If the revision machinery has pending > trees or blobs but isn't asked to show them via "--objects", then it > just ignores them. > > I've been running with the patch below for several years; it just adds a > warning when we ignore such an object. I've been tempted to send it for > inclusion, but it has some rough edges: > > - there are some fast-export calls in the test scripts that trigger > this. I don't remember the details, and what the fix would look > like. > > - it makes wildcards like "rev-list --all" complain, because they may > add a tag-of-blob, for example (in git.git, junio-gpg-pub triggers > this). Things like "--all" would probably need to get smarter, and > avoid adding non-commits in the first place (when --objects is not > in use, of course) > > --- > revision.c | 18 ++++++++++++++++-- > 1 file changed, 16 insertions(+), 2 deletions(-) > > diff --git a/revision.c b/revision.c > index 0e39b2b8a5..7dc2d9a822 100644 > --- a/revision.c > +++ b/revision.c > @@ -393,6 +393,16 @@ void add_pending_oid(struct rev_info *revs, const char *name, > add_pending_object(revs, object, name); > } > > +static void warn_ignored_object(struct object *object, const char *name) > +{ > + if (object->flags & UNINTERESTING) > + return; > + > + warning(_("ignoring %s object in traversal: %s"), > + type_name(object->type), > + (name && *name) ? name : oid_to_hex(&object->oid)); > +} > + > static struct commit *handle_commit(struct rev_info *revs, > struct object_array_entry *entry) > { > @@ -458,8 +468,10 @@ static struct commit *handle_commit(struct rev_info *revs, > */ > if (object->type == OBJ_TREE) { > struct tree *tree = (struct tree *)object; > - if (!revs->tree_objects) > + if (!revs->tree_objects) { > + warn_ignored_object(object, name); > return NULL; > + } > if (flags & UNINTERESTING) { > mark_tree_contents_uninteresting(revs->repo, tree); > return NULL; > @@ -472,8 +484,10 @@ static struct commit *handle_commit(struct rev_info *revs, > * Blob object? You know the drill by now.. > */ > if (object->type == OBJ_BLOB) { > - if (!revs->blob_objects) > + if (!revs->blob_objects) { > + warn_ignored_object(object, name); > return NULL; > + } > if (flags & UNINTERESTING) > return NULL; > add_pending_object_with_path(revs, object, name, mode, path); > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: rev-list and "ambiguous" IDs 2019-11-15 0:12 ` Thomas Braun @ 2019-11-15 3:49 ` Jeff King 2019-11-15 23:38 ` Thomas Braun 2019-11-15 5:07 ` Junio C Hamano 1 sibling, 1 reply; 13+ messages in thread From: Jeff King @ 2019-11-15 3:49 UTC (permalink / raw) To: Thomas Braun; +Cc: Bryan Turner, Git Users On Fri, Nov 15, 2019 at 01:12:47AM +0100, Thomas Braun wrote: > > That would probably help in a lot of cases, but the argument > > against it is that when it goes wrong, it may be quite confusing (so > > we're better off with the current message, which punts back to the > > user). > > Just out of curiosity: Is there a use case for inspecting non-commit > objects with git log? Not that I can think of. You can't even say "--objects" there. And indeed, "git log" already prefers commits for disambiguation, since d5f6b1d756 (revision.c: the "log" family, except for "show", takes committish, 2012-07-02). But... > If I do (in the git repo) > > $ git log 1231 > > I get > > error: short SHA1 1231 is ambiguous > hint: The candidates are: > hint: 123139fc89 tree > hint: 12316a1673 tree > hint: 123144fe8a blob > fatal: ambiguous argument '1231': unknown revision or path not in the > working tree. > Use '--' to separate paths from revisions, like this: > 'git <command> [<revision>...] -- [<file>...]' > > with > $ git --version > git version 2.24.0.windows.2 > > and all of these candidates are no commits. ...remember that the disambiguation code is just about preferring one object to the other. If the rule in effect doesn't have a preference, it's still ambiguous. On my system, "1231" actually _does_ have a commit: $ git show 1231 error: short SHA1 1231 is ambiguous hint: The candidates are: hint: 12319e3bf2 commit 2017-03-25 - Merge 'git-gui-add-2nd-line' into HEAD hint: 123139fc89 tree hint: 12315b58b8 tree hint: 12316a1673 tree hint: 12317ab2d9 tree hint: 123193f802 tree hint: 123144fe8a blob fatal: ambiguous argument '1231': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git <command> [<revision>...] -- [<file>...]' That's ambiguous because git-show can handle trees and blobs, too. But if I feed that sha1 to git-log: $ git log --oneline -1 1231 12319e3bf2 Merge 'git-gui-add-2nd-line' into HEAD it's perfectly fine, because git-log knows to disambiguate the commit. But if I choose another prefix that has no commits at all, it's ambiguous under either, because the "committish" rule has no way to decide: $ git show abcd2 error: short SHA1 abcd2 is ambiguous hint: The candidates are: hint: abcd22f55e tree hint: abcd238df0 tree hint: abcd2b1cc8 blob $ git log abcd2 error: short SHA1 abcd2 is ambiguous hint: The candidates are: hint: abcd22f55e tree hint: abcd238df0 tree hint: abcd2b1cc8 blob -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: rev-list and "ambiguous" IDs 2019-11-15 3:49 ` Jeff King @ 2019-11-15 23:38 ` Thomas Braun 2019-11-16 3:47 ` Junio C Hamano 0 siblings, 1 reply; 13+ messages in thread From: Thomas Braun @ 2019-11-15 23:38 UTC (permalink / raw) To: Jeff King; +Cc: Bryan Turner, Git Users Am 15.11.2019 um 04:49 schrieb Jeff King: > On Fri, Nov 15, 2019 at 01:12:47AM +0100, Thomas Braun wrote: > >>> That would probably help in a lot of cases, but the argument >>> against it is that when it goes wrong, it may be quite confusing (so >>> we're better off with the current message, which punts back to the >>> user). >> >> Just out of curiosity: Is there a use case for inspecting non-commit >> objects with git log? > > Not that I can think of. You can't even say "--objects" there. > > And indeed, "git log" already prefers commits for disambiguation, since > d5f6b1d756 (revision.c: the "log" family, except for "show", takes > committish, 2012-07-02). > > But... > >> If I do (in the git repo) >> >> $ git log 1231 >> >> I get >> >> error: short SHA1 1231 is ambiguous >> hint: The candidates are: >> hint: 123139fc89 tree >> hint: 12316a1673 tree >> hint: 123144fe8a blob >> fatal: ambiguous argument '1231': unknown revision or path not in the >> working tree. >> Use '--' to separate paths from revisions, like this: >> 'git <command> [<revision>...] -- [<file>...]' >> >> with >> $ git --version >> git version 2.24.0.windows.2 >> >> and all of these candidates are no commits. > > ...remember that the disambiguation code is just about preferring one > object to the other. If the rule in effect doesn't have a preference, > it's still ambiguous. On my system, "1231" actually _does_ have a > commit: > > $ git show 1231 > error: short SHA1 1231 is ambiguous > hint: The candidates are: > hint: 12319e3bf2 commit 2017-03-25 - Merge 'git-gui-add-2nd-line' into HEAD > hint: 123139fc89 tree > hint: 12315b58b8 tree > hint: 12316a1673 tree > hint: 12317ab2d9 tree > hint: 123193f802 tree > hint: 123144fe8a blob > fatal: ambiguous argument '1231': unknown revision or path not in the working tree. > Use '--' to separate paths from revisions, like this: > 'git <command> [<revision>...] -- [<file>...]' > > That's ambiguous because git-show can handle trees and blobs, too. But > if I feed that sha1 to git-log: > > $ git log --oneline -1 1231 > 12319e3bf2 Merge 'git-gui-add-2nd-line' into HEAD > > it's perfectly fine, because git-log knows to disambiguate the commit. > But if I choose another prefix that has no commits at all, it's > ambiguous under either, because the "committish" rule has no way to > decide: > > $ git show abcd2 > error: short SHA1 abcd2 is ambiguous > hint: The candidates are: > hint: abcd22f55e tree > hint: abcd238df0 tree > hint: abcd2b1cc8 blob > > $ git log abcd2 > error: short SHA1 abcd2 is ambiguous > hint: The candidates are: > hint: abcd22f55e tree > hint: abcd238df0 tree > hint: abcd2b1cc8 blob I would have expected that git log did just tell me that it could not find something commitish, instead it told me that there are multiple candidates, all of them being no commit. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: rev-list and "ambiguous" IDs 2019-11-15 23:38 ` Thomas Braun @ 2019-11-16 3:47 ` Junio C Hamano 2019-11-18 12:03 ` Jeff King 0 siblings, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2019-11-16 3:47 UTC (permalink / raw) To: Thomas Braun; +Cc: Jeff King, Bryan Turner, Git Users Thomas Braun <thomas.braun@virtuell-zuhause.de> writes: >> But if I choose another prefix that has no commits at all, it's >> ambiguous under either, because the "committish" rule has no way to >> decide: >> >> $ git show abcd2 >> error: short SHA1 abcd2 is ambiguous >> hint: The candidates are: >> hint: abcd22f55e tree >> hint: abcd238df0 tree >> hint: abcd2b1cc8 blob >> >> $ git log abcd2 >> error: short SHA1 abcd2 is ambiguous >> hint: The candidates are: >> hint: abcd22f55e tree >> hint: abcd238df0 tree >> hint: abcd2b1cc8 blob > > I would have expected that git log did just tell me that it could not > find something commitish, instead it told me that there are multiple > candidates, all of them being no commit. With this, I 100% agree with. The latter should instead say $ git log abcd2 [--] error: bad revision 'abcd2' just like the case where no object has abcd2 as prefix. When we ask for commit-ish or any specific type in general, there are a few possible cases. - There is only one such object that has the prefix and is compatible with the type. We handle this correctly---yield that object and do not complain about ambiguity. - There are two or more such objects, or there is no such object. We show all objects that share the prefix, regardless of the type, which is way suboptimal. An improvement can be localized to sha1-name.c::get_short_oid(), I would think. We know what type we want (e.g. GET_OID_COMMITTISH) in the function, so we should be able to teach collect_ambiguous() to discard an object with the given prefix but of a wrong type. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: rev-list and "ambiguous" IDs 2019-11-16 3:47 ` Junio C Hamano @ 2019-11-18 12:03 ` Jeff King 2019-11-19 1:24 ` Junio C Hamano 0 siblings, 1 reply; 13+ messages in thread From: Jeff King @ 2019-11-18 12:03 UTC (permalink / raw) To: Junio C Hamano; +Cc: Thomas Braun, Bryan Turner, Git Users On Sat, Nov 16, 2019 at 12:47:20PM +0900, Junio C Hamano wrote: > > I would have expected that git log did just tell me that it could not > > find something commitish, instead it told me that there are multiple > > candidates, all of them being no commit. > > With this, I 100% agree with. The latter should instead say > > $ git log abcd2 [--] > error: bad revision 'abcd2' > > just like the case where no object has abcd2 as prefix. > > When we ask for commit-ish or any specific type in general, there > are a few possible cases. > > - There is only one such object that has the prefix and is > compatible with the type. We handle this correctly---yield that > object and do not complain about ambiguity. > > - There are two or more such objects, or there is no such object. > We show all objects that share the prefix, regardless of the > type, which is way suboptimal. > > An improvement can be localized to sha1-name.c::get_short_oid(), I > would think. We know what type we want (e.g. GET_OID_COMMITTISH) > in the function, so we should be able to teach collect_ambiguous() > to discard an object with the given prefix but of a wrong type. I think that changes the meaning of GET_OID_COMMITTISH, though. Right now it means "if disambiguating, prefer a committish", but not "I can only accept a commit". So we would still happily return an unambiguous object that does not match that type. And that is why "git -c core.disambiguate=committish show $short_blob" works, for example. If you adjust your first case above to "only one such object...and the type does not matter" then I think it is OK. I.e., the logic in get_short_oid() becomes: - if there is only one, return it - if there is more than one, and only one matches the disambiguator, return it - otherwise, _do not_ print the ambiguous list, and return an error (and no object at all), letting the caller complain where the third part is the new behavior. I think that helps in some ways (you do not get a list of non-commits for a context that only takes commits). But it might also hurt, because it gives the user less information. E.g., imagine the user feeds a short sha1 that they know to be a blob to a command expecting a committish and is told "no, that short sha1 does not exist", even though the actual problem is that there are two such blobs. I think it's a bit simpler for a command which doesn't expect non-commits at all, like "git log". But it would need to communicate that to get_short_oid() with more than just GET_OID_COMMITTISH, so that the latter can tell it apart from contexts which merely prefer a committish. I'm also not entirely sure that even that case doesn't suffer from telling the user less information. If I say "git log 1234" knowing that "1234" is a blob, that's a mistake. But Git may guide me in correcting that mistake by saying "yes, we know about 1234, but it's ambiguous" rather than "1234 is not something we know about". Perhaps a simple fix would just be for get_short_oid()'s error message to mention the disambiguation rule. E.g., something like: $ git show abcd2 error: short SHA1 abcd2 is ambiguous hint: We would have preferred a commit or tag pointing to a commit, hint: but none were found. The candidates are: hint: abcd22f55e tree hint: abcd238df0 tree hint: abcd2b1cc8 blob or $ git show abcd2 error: short SHA1 abcd2 is ambiguous hint: We preferred a commit or tag pointing to a commit to other hint: object types, but two candidates were found: hint: abcd22f55e commit hint: abcd238df0 commit hint: abcd2b1cc8 blob (optionally the second one could even not mention the blob, though I think with the lead-in sentence it's OK). The verbiage there isn't great (I was trying to avoid the jargon "committish"), but hopefully you get the point. -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: rev-list and "ambiguous" IDs 2019-11-18 12:03 ` Jeff King @ 2019-11-19 1:24 ` Junio C Hamano 0 siblings, 0 replies; 13+ messages in thread From: Junio C Hamano @ 2019-11-19 1:24 UTC (permalink / raw) To: Jeff King; +Cc: Thomas Braun, Bryan Turner, Git Users Jeff King <peff@peff.net> writes: > I think that changes the meaning of GET_OID_COMMITTISH, though. Right > now it means "if disambiguating, prefer a committish", but not "I can > only accept a commit". So we would still happily return an unambiguous > object that does not match that type. Ah, OK, so I was stupid (not a news anymore ;-) > And that is why "git -c > core.disambiguate=committish show $short_blob" works, for example. Yes, and it should work that way. > Perhaps a simple fix would just be for get_short_oid()'s error message > to mention the disambiguation rule. E.g., something like: > > $ git show abcd2 > error: short SHA1 abcd2 is ambiguous > hint: We would have preferred a commit or tag pointing to a commit, > hint: but none were found. The candidates are: > hint: abcd22f55e tree > hint: abcd238df0 tree > hint: abcd2b1cc8 blob > > or > > $ git show abcd2 > error: short SHA1 abcd2 is ambiguous > hint: We preferred a commit or tag pointing to a commit to other > hint: object types, but two candidates were found: > hint: abcd22f55e commit > hint: abcd238df0 commit > hint: abcd2b1cc8 blob > > (optionally the second one could even not mention the blob, though I > think with the lead-in sentence it's OK). > > The verbiage there isn't great (I was trying to avoid the jargon > "committish"), but hopefully you get the point. Yup, if we were to do anything, this is a much more sensible thing to do than make GET_OID_<TYPE> reject objects that are not of <TYPE>, I think. Thanks for a dose of sanity. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: rev-list and "ambiguous" IDs 2019-11-15 0:12 ` Thomas Braun 2019-11-15 3:49 ` Jeff King @ 2019-11-15 5:07 ` Junio C Hamano 2019-11-15 8:16 ` Jeff King 1 sibling, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2019-11-15 5:07 UTC (permalink / raw) To: Thomas Braun; +Cc: Jeff King, Bryan Turner, Git Users Thomas Braun <thomas.braun@virtuell-zuhause.de> writes: > Just out of curiosity: Is there a use case for inspecting non-commit > objects with git log? I do not think there is (rev-list is a different story, given that you can pass --objects), and it probably is not too difficult to teach "git log" and friends that they only want commit-ish. Thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: rev-list and "ambiguous" IDs 2019-11-15 5:07 ` Junio C Hamano @ 2019-11-15 8:16 ` Jeff King 2019-11-15 11:23 ` Junio C Hamano 0 siblings, 1 reply; 13+ messages in thread From: Jeff King @ 2019-11-15 8:16 UTC (permalink / raw) To: Junio C Hamano; +Cc: Thomas Braun, Bryan Turner, Git Users On Fri, Nov 15, 2019 at 02:07:13PM +0900, Junio C Hamano wrote: > Thomas Braun <thomas.braun@virtuell-zuhause.de> writes: > > > Just out of curiosity: Is there a use case for inspecting non-commit > > objects with git log? > > I do not think there is (rev-list is a different story, given that > you can pass --objects), and it probably is not too difficult to > teach "git log" and friends that they only want commit-ish. I think you already did; see my other reply. ;) -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: rev-list and "ambiguous" IDs 2019-11-15 8:16 ` Jeff King @ 2019-11-15 11:23 ` Junio C Hamano 0 siblings, 0 replies; 13+ messages in thread From: Junio C Hamano @ 2019-11-15 11:23 UTC (permalink / raw) To: Jeff King; +Cc: Thomas Braun, Bryan Turner, Git Users Jeff King <peff@peff.net> writes: > On Fri, Nov 15, 2019 at 02:07:13PM +0900, Junio C Hamano wrote: > >> Thomas Braun <thomas.braun@virtuell-zuhause.de> writes: >> >> > Just out of curiosity: Is there a use case for inspecting non-commit >> > objects with git log? >> >> I do not think there is (rev-list is a different story, given that >> you can pass --objects), and it probably is not too difficult to >> teach "git log" and friends that they only want commit-ish. > > I think you already did; see my other reply. ;) Heh, indeed I did. I just did not recall. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: rev-list and "ambiguous" IDs 2019-11-14 5:59 ` Jeff King 2019-11-15 0:12 ` Thomas Braun @ 2019-11-15 1:19 ` Bryan Turner 2019-11-15 3:57 ` Jeff King 1 sibling, 1 reply; 13+ messages in thread From: Bryan Turner @ 2019-11-15 1:19 UTC (permalink / raw) To: Jeff King; +Cc: Git Users On Wed, Nov 13, 2019 at 9:59 PM Jeff King <peff@peff.net> wrote: > > On Wed, Nov 13, 2019 at 08:35:47PM -0800, Bryan Turner wrote: > > > When using a command like `git rev-list dc41e --`, it's possible to > > get output like this: (from newer Git versions) > > error: short SHA1 dc41e is ambiguous > > hint: The candidates are: > > hint: dc41eeb01ba commit 2012-11-23 - Stuff from the commit message > > hint: dc41e0d508b tree > > hint: dc41e5bef41 tree > > hint: dc41e11ee18 blob > > fatal: bad revision 'dc41e' > > > > Is there any way to ask rev-list to be a little...pickier about what > > it considers a candidate? Almost without question the two trees and > > the blob aren't what I'm asking for, which means there's actually only > > one real candidate. > > Try "dc41e^{commit}", which will realize that trees and blobs cannot > peel to a commit (there would still be an ambiguity with a tag). Slick! > > I think one could argue that without "--objects" in play, rev-list > should automatically disambiguate in favor of a committish. But that's > not true for every command. > > You can also set core.disambiguate to "committish" (or even "commit"). > At the time we added that option (and started reporting the list of > candidates), we pondered whether it might make sense to make that the > default. That would probably help in a lot of cases, but the argument > against it is that when it goes wrong, it may be quite confusing (so > we're better off with the current message, which punts back to the > user). Having no disambiguation by default seems fine. Both of the approaches here seem easy enough to activate explicitly in cases where the caller (in this case Bitbucket Server; more on that later) knows they're looking for a commit. > > I think it also comes up fairly rarely these days, as short sha1s we > print have some headroom built in (as you can see above; the one you've > input is really quite short compared to anything Git would have printed > in that repo). Just to provide a little context, this isn't coming up as something I myself hit. Rather, it's a fairly common issue reported by Bitbucket Server end users, and I would assume it happens with other hosting providers as well: A user URL-hacks an ambiguous (or "ambiguous", in cases like this) short hash and is disappointed when the system doesn't manage to find the commit they were looking for. I'm just investigating possible avenues for improving how Bitbucket Server handles these cases. One option is to (essentially) parse the "hint", if it's present, to get the candidates, and include them on the error message we display. But in cases like the above it gets weird because there's only one _commit_ candidate, and having our error message include trees and blobs seems likely to be confusing/unexpected. I suspect most Bitbucket Server users would say "The answer's obvious! Why didn't you just use the commit?!", and I can sort of get behind that view. The combination of using the disambiguation mechanism, so single-commit ambiguities are resolved automatically, and parsing the hint seems like it would produce the most logical behavior. Where users get the short hashes they try is an interesting question. As you say, Git wouldn't display a 5 character short hash, at least by default, and Bitbucket Server doesn't either; it shows a flat 11 characters. I'm not sure, on that point. Thanks for your insights! Learned a new Git trick today. Best regards, Bryan Turner > > > Also, while considering this, I noticed that `git rev-list > > dc41e11ee18` (the blob from the output above) doesn't fail. It > > silently exits, nothing written to stdout or stderr, with 0 status. A > > little surprising; I would have expected rev-list to complain that > > dc41e11ee18 isn't a valid commit-ish value. > > Yeah, this is a separate issue. If the revision machinery has pending > trees or blobs but isn't asked to show them via "--objects", then it > just ignores them. > > I've been running with the patch below for several years; it just adds a > warning when we ignore such an object. I've been tempted to send it for > inclusion, but it has some rough edges: > > - there are some fast-export calls in the test scripts that trigger > this. I don't remember the details, and what the fix would look > like. > > - it makes wildcards like "rev-list --all" complain, because they may > add a tag-of-blob, for example (in git.git, junio-gpg-pub triggers > this). Things like "--all" would probably need to get smarter, and > avoid adding non-commits in the first place (when --objects is not > in use, of course) > > --- > revision.c | 18 ++++++++++++++++-- > 1 file changed, 16 insertions(+), 2 deletions(-) > > diff --git a/revision.c b/revision.c > index 0e39b2b8a5..7dc2d9a822 100644 > --- a/revision.c > +++ b/revision.c > @@ -393,6 +393,16 @@ void add_pending_oid(struct rev_info *revs, const char *name, > add_pending_object(revs, object, name); > } > > +static void warn_ignored_object(struct object *object, const char *name) > +{ > + if (object->flags & UNINTERESTING) > + return; > + > + warning(_("ignoring %s object in traversal: %s"), > + type_name(object->type), > + (name && *name) ? name : oid_to_hex(&object->oid)); > +} > + > static struct commit *handle_commit(struct rev_info *revs, > struct object_array_entry *entry) > { > @@ -458,8 +468,10 @@ static struct commit *handle_commit(struct rev_info *revs, > */ > if (object->type == OBJ_TREE) { > struct tree *tree = (struct tree *)object; > - if (!revs->tree_objects) > + if (!revs->tree_objects) { > + warn_ignored_object(object, name); > return NULL; > + } > if (flags & UNINTERESTING) { > mark_tree_contents_uninteresting(revs->repo, tree); > return NULL; > @@ -472,8 +484,10 @@ static struct commit *handle_commit(struct rev_info *revs, > * Blob object? You know the drill by now.. > */ > if (object->type == OBJ_BLOB) { > - if (!revs->blob_objects) > + if (!revs->blob_objects) { > + warn_ignored_object(object, name); > return NULL; > + } > if (flags & UNINTERESTING) > return NULL; > add_pending_object_with_path(revs, object, name, mode, path); > -- > 2.24.0.739.gb5632e4929 > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: rev-list and "ambiguous" IDs 2019-11-15 1:19 ` Bryan Turner @ 2019-11-15 3:57 ` Jeff King 0 siblings, 0 replies; 13+ messages in thread From: Jeff King @ 2019-11-15 3:57 UTC (permalink / raw) To: Bryan Turner; +Cc: Git Users On Thu, Nov 14, 2019 at 05:19:39PM -0800, Bryan Turner wrote: > Just to provide a little context, this isn't coming up as something I > myself hit. Rather, it's a fairly common issue reported by Bitbucket > Server end users, and I would assume it happens with other hosting > providers as well: A user URL-hacks an ambiguous (or "ambiguous", in > cases like this) short hash and is disappointed when the system > doesn't manage to find the commit they were looking for. I'm just > investigating possible avenues for improving how Bitbucket Server > handles these cases. One option is to (essentially) parse the "hint", > if it's present, to get the candidates, and include them on the error > message we display. But in cases like the above it gets weird because > there's only one _commit_ candidate, and having our error message > include trees and blobs seems likely to be confusing/unexpected. I > suspect most Bitbucket Server users would say "The answer's obvious! > Why didn't you just use the commit?!", and I can sort of get behind > that view. The combination of using the disambiguation mechanism, so > single-commit ambiguities are resolved automatically, and parsing the > hint seems like it would produce the most logical behavior. It depends on your URL scheme obviously, but on GitHub for example, it would make sense for https://github.com/user/repo/commit/1234abcd to use the "^{commit}" trick. I don't think it currently does, though. > Where users get the short hashes they try is an interesting question. > As you say, Git wouldn't display a 5 character short hash, at least by > default, and Bitbucket Server doesn't either; it shows a flat 11 > characters. I'm not sure, on that point. GitHub often produces 7-char short hashes, because it's abbreviating them in presentation code that doesn't want to spend the round trip to talk to the repo (to find out if it's unique, or how many objects are in the repo). It _usually_ shouldn't matter much, because we try to produce abbreviated hashes where users might read them, and long hashes when we generate URLs. But of course people sometimes generate URLs themselves from who knows where. :) I've been lightly lobbying to bump our default to something higher, like 12. -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2019-11-19 1:24 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-11-14 4:35 rev-list and "ambiguous" IDs Bryan Turner 2019-11-14 5:59 ` Jeff King 2019-11-15 0:12 ` Thomas Braun 2019-11-15 3:49 ` Jeff King 2019-11-15 23:38 ` Thomas Braun 2019-11-16 3:47 ` Junio C Hamano 2019-11-18 12:03 ` Jeff King 2019-11-19 1:24 ` Junio C Hamano 2019-11-15 5:07 ` Junio C Hamano 2019-11-15 8:16 ` Jeff King 2019-11-15 11:23 ` Junio C Hamano 2019-11-15 1:19 ` Bryan Turner 2019-11-15 3:57 ` Jeff King
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.