From: Bryan Turner <bturner@atlassian.com>
To: Jeff King <peff@peff.net>
Cc: Git Users <git@vger.kernel.org>
Subject: Re: rev-list and "ambiguous" IDs
Date: Thu, 14 Nov 2019 17:19:39 -0800
Message-ID: <CAGyf7-GTWsQEYH9mkM8TkY1PusMimtYcSaKhHubN_KsOtMRiBA@mail.gmail.com>
In-Reply-To: <20191114055906.GA10643@sigill.intra.peff.net>

On Wed, Nov 13, 2019 at 9:59 PM Jeff King <peff@peff.net> wrote:
>
> On Wed, Nov 13, 2019 at 08:35:47PM -0800, Bryan Turner wrote:
>
> > When using a command like `git rev-list dc41e --`, it's possible to
> > get output like this: (from newer Git versions)
> > error: short SHA1 dc41e is ambiguous
> > hint: The candidates are:
> > hint:   dc41eeb01ba commit 2012-11-23 - Stuff from the commit message
> > hint:   dc41e0d508b tree
> > hint:   dc41e5bef41 tree
> > hint:   dc41e11ee18 blob
> > fatal: bad revision 'dc41e'
> >
> > Is there any way to ask rev-list to be a little...pickier about what
> > it considers a candidate? Almost without question the two trees and
> > the blob aren't what I'm asking for, which means there's actually only
> > one real candidate.
>
> Try "dc41e^{commit}", which will realize that trees and blobs cannot
> peel to a commit (there would still be an ambiguity with a tag).

Slick!

>
> I think one could argue that without "--objects" in play, rev-list
> should automatically disambiguate in favor of a committish. But that's
> not true for every command.
>
> You can also set core.disambiguate to "committish" (or even "commit").
> At the time we added that option (and started reporting the list of
> candidates), we pondered whether it might make sense to make that the
> default. That would probably help in a lot of cases, but the argument
> against it is that when it goes wrong, it may be quite confusing (so
> we're better off with the current message, which punts back to the
> user).

Having no disambiguation by default seems fine. Both of the approaches
here seem easy enough to activate explicitly in cases where the caller
(in this case Bitbucket Server; more on that later) knows they're
looking for a commit.
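
For reference, here is a minimal sketch of what activating either approach explicitly might look like from the caller's side. It's Python purely for illustration; the helper names (`peeled_rev_args`, `disambiguated_rev_args`, `resolve_commit`) are made up for this example and are not Bitbucket Server or Git APIs. The only Git features it leans on are the `^{commit}` peel syntax, `core.disambiguate`, and `rev-parse --verify --quiet`. The hex check is my own assumption about what a sensible caller would do with URL-supplied input.

```python
import re
import subprocess
from typing import List, Optional

# Assumption: callers only accept plain hex abbreviations from the URL.
_HEX_ID = re.compile(r"[0-9a-fA-F]{4,64}")


def _checked(short_id: str) -> str:
    """Reject anything that is not a bare hex abbreviation."""
    if not _HEX_ID.fullmatch(short_id):
        raise ValueError("not a hex object ID: " + repr(short_id))
    return short_id


def peeled_rev_args(short_id: str) -> List[str]:
    """Approach 1: append ^{commit} so trees and blobs cannot match."""
    return ["git", "rev-parse", "--verify", "--quiet",
            _checked(short_id) + "^{commit}"]


def disambiguated_rev_args(short_id: str, kind: str = "committish") -> List[str]:
    """Approach 2: set core.disambiguate for this one invocation only."""
    if kind not in ("commit", "committish"):
        raise ValueError("unsupported core.disambiguate value: " + kind)
    return ["git", "-c", "core.disambiguate=" + kind,
            "rev-parse", "--verify", "--quiet", _checked(short_id)]


def resolve_commit(repo_dir: str, short_id: str) -> Optional[str]:
    """Hypothetical helper: run approach 1 and return the full object ID,
    or None when git exits non-zero (unknown or still-ambiguous ID)."""
    result = subprocess.run(peeled_rev_args(short_id), cwd=repo_dir,
                            capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else None
```

Either argument list leaves the repository's own configuration untouched, since `-c` only applies to the single invocation. As noted above, an abbreviation shared by a commit and a tag would still be reported as ambiguous.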

>
> I think it also comes up fairly rarely these days, as short sha1s we
> print have some headroom built in (as you can see above; the one you've
> input is really quite short compared to anything Git would have printed
> in that repo).

Just to provide a little context, this isn't coming up as something I
myself hit. Rather, it's a fairly common issue reported by Bitbucket
Server end users, and I would assume it happens with other hosting
providers as well: A user URL-hacks an ambiguous (or "ambiguous", in
cases like this) short hash and is disappointed when the system
doesn't manage to find the commit they were looking for. I'm just
investigating possible avenues for improving how Bitbucket Server
handles these cases. One option is to (essentially) parse the "hint",
if it's present, to get the candidates, and include them in the error
message we display. But in cases like the above it gets weird because
there's only one _commit_ candidate, and having our error message
include trees and blobs seems likely to be confusing/unexpected. I
suspect most Bitbucket Server users would say "The answer's obvious!
Why didn't you just use the commit?!", and I can sort of get behind
that view. Combining the two seems like it would produce the most
logical behavior: use the disambiguation mechanism so single-commit
ambiguities are resolved automatically, and parse the hint for the
cases that remain ambiguous.
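
To make the hint-parsing idea concrete, here is a rough sketch of pulling the candidates out of stderr and picking the commit when there is exactly one. The line layout is assumed from the output quoted at the top of this thread; it is meant for humans, so it may differ between Git versions or locales, and the function names are hypothetical. Treating a tag candidate as a competing committish (so nothing is picked automatically) is my own conservative assumption.

```python
import re
from typing import List, Optional, Tuple

# Matches lines such as "hint:   dc41eeb01ba commit 2012-11-23 - Stuff".
# Assumed layout: "hint:", whitespace, abbreviated ID, object type, optional text.
_CANDIDATE = re.compile(
    r"^hint:\s+([0-9a-f]{4,})\s+(commit|tag|tree|blob)\b(?:\s+(.*))?$")


def parse_candidates(stderr_text: str) -> List[Tuple[str, str, str]]:
    """Return (abbreviated ID, object type, trailing description) per candidate."""
    found = []
    for line in stderr_text.splitlines():
        match = _CANDIDATE.match(line)
        if match:
            found.append((match.group(1), match.group(2), match.group(3) or ""))
    return found


def pick_single_commit(stderr_text: str) -> Optional[str]:
    """Return the lone commit candidate, or None if there are zero or several
    committish candidates (commits or tags)."""
    committish = [(oid, kind) for oid, kind, _ in parse_candidates(stderr_text)
                  if kind in ("commit", "tag")]
    if len(committish) == 1 and committish[0][1] == "commit":
        return committish[0][0]
    return None
```

When `pick_single_commit` returns None, the commit entries from `parse_candidates` could still be listed in the error message, leaving the trees and blobs out.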

Where users get the short hashes they try is an interesting question.
As you say, Git wouldn't display a 5-character short hash, at least by
default, and Bitbucket Server doesn't either; it always shows 11
characters. I don't know where they're coming from.

Thanks for your insights! Learned a new Git trick today.

Best regards,
Bryan Turner



>
> > Also, while considering this, I noticed that `git rev-list
> > dc41e11ee18` (the blob from the output above) doesn't fail. It
> > silently exits, nothing written to stdout or stderr, with 0 status. A
> > little surprising; I would have expected rev-list to complain that
> > dc41e11ee18 isn't a valid commit-ish value.
>
> Yeah, this is a separate issue. If the revision machinery has pending
> trees or blobs but isn't asked to show them via "--objects", then it
> just ignores them.
>
> I've been running with the patch below for several years; it just adds a
> warning when we ignore such an object. I've been tempted to send it for
> inclusion, but it has some rough edges:
>
>   - there are some fast-export calls in the test scripts that trigger
>     this. I don't remember the details, and what the fix would look
>     like.
>
>   - it makes wildcards like "rev-list --all" complain, because they may
>     add a tag-of-blob, for example (in git.git, junio-gpg-pub triggers
>     this). Things like "--all" would probably need to get smarter, and
>     avoid adding non-commits in the first place (when --objects is not
>     in use, of course)
>
> ---
>  revision.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/revision.c b/revision.c
> index 0e39b2b8a5..7dc2d9a822 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -393,6 +393,16 @@ void add_pending_oid(struct rev_info *revs, const char *name,
>         add_pending_object(revs, object, name);
>  }
>
> +static void warn_ignored_object(struct object *object, const char *name)
> +{
> +       if (object->flags & UNINTERESTING)
> +               return;
> +
> +       warning(_("ignoring %s object in traversal: %s"),
> +               type_name(object->type),
> +               (name && *name) ? name : oid_to_hex(&object->oid));
> +}
> +
>  static struct commit *handle_commit(struct rev_info *revs,
>                                     struct object_array_entry *entry)
>  {
> @@ -458,8 +468,10 @@ static struct commit *handle_commit(struct rev_info *revs,
>          */
>         if (object->type == OBJ_TREE) {
>                 struct tree *tree = (struct tree *)object;
> -               if (!revs->tree_objects)
> +               if (!revs->tree_objects) {
> +                       warn_ignored_object(object, name);
>                         return NULL;
> +               }
>                 if (flags & UNINTERESTING) {
>                         mark_tree_contents_uninteresting(revs->repo, tree);
>                         return NULL;
> @@ -472,8 +484,10 @@ static struct commit *handle_commit(struct rev_info *revs,
>          * Blob object? You know the drill by now..
>          */
>         if (object->type == OBJ_BLOB) {
> -               if (!revs->blob_objects)
> +               if (!revs->blob_objects) {
> +                       warn_ignored_object(object, name);
>                         return NULL;
> +               }
>                 if (flags & UNINTERESTING)
>                         return NULL;
>                 add_pending_object_with_path(revs, object, name, mode, path);
> --
> 2.24.0.739.gb5632e4929
>
