All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Philip Oakley" <philipoakley@iee.org>
To: "Stefan Beller" <sbeller@google.com>, <git@vger.kernel.org>
Cc: <gitster@pobox.com>, <jonathantanmy@google.com>,
	"Stefan Beller" <sbeller@google.com>
Subject: Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
Date: Mon, 20 Nov 2017 15:10:53 -0000	[thread overview]
Message-ID: <B03A23EA75104E99BADBB420B6BEA614@PhilipOakley> (raw)
In-Reply-To: 20171116020039.17810-8-sbeller@google.com

From: "Stefan Beller" <sbeller@google.com>
Sent: Thursday, November 16, 2017 2:00 AM

[in catch up mode..]

> Sometimes users are given a hash of an object and they want to
> identify it further (ex.: Use verify-pack to find the largest blobs,
> but what are these? or [1])
>
> When describing commits, we try to anchor them to tags or refs, as these
> are conceptually on a higher level than the commit. And if there is no ref
> or tag that matches exactly, we're out of luck.  So we employ a heuristic
> to make up a name for the commit. These names are ambiguous, there might
> be different tags or refs to anchor to, and there might be different
> path in the DAG to travel to arrive at the commit precisely.
>
> When describing a blob, we want to describe the blob from a higher layer
> as well, which is a tuple of (commit, deep/path) as the tree objects
> involved are rather uninteresting.  The same blob can be referenced by
> multiple commits, so how we decide which commit to use?  This patch
> implements a rather naive approach on this: As there are no back pointers
> from blobs to commits in which the blob occurs, we'll start walking from
> any tips available, listing the blobs in-order of the commit and once we
> found the blob, we'll take the first commit that listed the blob. For
> example
>
>  git describe --tags v0.99:Makefile
>  conversion-901-g7672db20c2:Makefile
>
> tells us the Makefile as it was in v0.99 was introduced in commit 
> 7672db20.
>
> The walking is performed in reverse order to show the introduction of a
> blob rather than its last occurrence.
>
> [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> Documentation/git-describe.txt | 18 ++++++++++--
> builtin/describe.c             | 62 
> ++++++++++++++++++++++++++++++++++++++----
> t/t6120-describe.sh            | 34 +++++++++++++++++++++++
> 3 files changed, 107 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/git-describe.txt 
> b/Documentation/git-describe.txt
> index c924c945ba..e027fb8c4b 100644
> --- a/Documentation/git-describe.txt
> +++ b/Documentation/git-describe.txt
> @@ -3,14 +3,14 @@ git-describe(1)
>
> NAME
> ----
> -git-describe - Describe a commit using the most recent tag reachable from 
> it
> -
> +git-describe - Give an object a human readable name based on an available 
> ref
>
> SYNOPSIS
> --------
> [verse]
> 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] 
> [<commit-ish>...]
> 'git describe' [--all] [--tags] [--contains] 
> [--abbrev=<n>] --dirty[=<mark>]
> +'git describe' <blob>
>
> DESCRIPTION
> -----------
> @@ -24,6 +24,12 @@ By default (without --all or --tags) `git describe` 
> only shows
> annotated tags.  For more information about creating annotated tags
> see the -a and -s options to linkgit:git-tag[1].
>
> +If the given object refers to a blob, it will be described
> +as `<commit-ish>:<path>`, such that the blob can be found
> +at `<path>` in the `<commit-ish>`, which itself describes the
> +first commit in which this blob occurs in a reverse revision walk
> +from HEAD.
> +
> OPTIONS
> -------
> <commit-ish>...::
> @@ -186,6 +192,14 @@ selected and output.  Here fewest commits different 
> is defined as
> the number of commits which would be shown by `git log tag..input`
> will be the smallest number of commits possible.
>
> +BUGS
> +----
> +
> +Tree objects as well as tag objects not pointing at commits, cannot be 
> described.

Is this true? Is it stand alone from the describing of a blob? If so should 
it be its own patchlet. - I thought I'd read that within the series there is 
now a tree / tag (of blob/trees) description capability.

I'd prefer that we don't start with the "can't" view (relative to the 
subsequent sentences of the paragraph). It puts off the reader - we are 
about to say what can be described but in a limited way - the limitation 
being the bug. Maybe just swap the line to form a second paragraph.

> +When describing blobs, the lightweight tags pointing at blobs are 
> ignored,
> +but the blob is still described as <committ-ish>:<path> despite the 
> lightweight
> +tag being favorable.
> +
--
Philip

> GIT
> ---
> Part of the linkgit:git[1] suite
> diff --git a/builtin/describe.c b/builtin/describe.c
> index 9e9a5ed5d4..5b4bfaba3f 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -3,6 +3,7 @@
> #include "lockfile.h"
> #include "commit.h"
> #include "tag.h"
> +#include "blob.h"
> #include "refs.h"
> #include "builtin.h"
> #include "exec_cmd.h"
> @@ -11,8 +12,9 @@
> #include "hashmap.h"
> #include "argv-array.h"
> #include "run-command.h"
> +#include "revision.h"
> +#include "list-objects.h"
>
> -#define SEEN (1u << 0)
> #define MAX_TAGS (FLAG_BITS - 1)
>
> static const char * const describe_usage[] = {
> @@ -434,6 +436,53 @@ static void describe_commit(struct object_id *oid, 
> struct strbuf *dst)
>  strbuf_addstr(dst, suffix);
> }
>
> +struct process_commit_data {
> + struct object_id current_commit;
> + struct object_id looking_for;
> + struct strbuf *dst;
> + struct rev_info *revs;
> +};
> +
> +static void process_commit(struct commit *commit, void *data)
> +{
> + struct process_commit_data *pcd = data;
> + pcd->current_commit = commit->object.oid;
> +}
> +
> +static void process_object(struct object *obj, const char *path, void 
> *data)
> +{
> + struct process_commit_data *pcd = data;
> +
> + if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) {
> + reset_revision_walk();
> + describe_commit(&pcd->current_commit, pcd->dst);
> + strbuf_addf(pcd->dst, ":%s", path);
> + free_commit_list(pcd->revs->commits);
> + pcd->revs->commits = NULL;
> + }
> +}
> +
> +static void describe_blob(struct object_id oid, struct strbuf *dst)
> +{
> + struct rev_info revs;
> + struct argv_array args = ARGV_ARRAY_INIT;
> + struct process_commit_data pcd = { null_oid, oid, dst, &revs};
> +
> + argv_array_pushl(&args, "internal: The first arg is not parsed",
> + "--objects", "--in-commit-order", "--reverse", "HEAD",
> + NULL);
> +
> + init_revisions(&revs, NULL);
> + if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1)
> + BUG("setup_revisions could not handle all args?");
> +
> + if (prepare_revision_walk(&revs))
> + die("revision walk setup failed");
> +
> + traverse_commit_list(&revs, process_commit, process_object, &pcd);
> + reset_revision_walk();
> +}
> +
> static void describe(const char *arg, int last_one)
> {
>  struct object_id oid;
> @@ -445,11 +494,14 @@ static void describe(const char *arg, int last_one)
>
>  if (get_oid(arg, &oid))
>  die(_("Not a valid object name %s"), arg);
> - cmit = lookup_commit_reference(&oid);
> - if (!cmit)
> - die(_("%s is not a valid '%s' object"), arg, commit_type);
> + cmit = lookup_commit_reference_gently(&oid, 1);
>
> - describe_commit(&oid, &sb);
> + if (cmit)
> + describe_commit(&oid, &sb);
> + else if (lookup_blob(&oid))
> + describe_blob(oid, &sb);
> + else
> + die(_("%s is neither a commit nor blob"), arg);
>
>  puts(sb.buf);
>
> diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
> index c8b7ed82d9..4668f0058e 100755
> --- a/t/t6120-describe.sh
> +++ b/t/t6120-describe.sh
> @@ -310,6 +310,40 @@ test_expect_success 'describe ignoring a broken 
> submodule' '
>  grep broken out
> '
>
> +test_expect_success 'describe a blob at a directly tagged commit' '
> + echo "make it a unique blob" >file &&
> + git add file && git commit -m "content in file" &&
> + git tag -a -m "latest annotated tag" unique-file &&
> + git describe HEAD:file >actual &&
> + echo "unique-file:file" >expect &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'describe a blob with its first introduction' '
> + git commit --allow-empty -m "empty commit" &&
> + git rm file &&
> + git commit -m "delete blob" &&
> + git revert HEAD &&
> + git commit --allow-empty -m "empty commit" &&
> + git describe HEAD:file >actual &&
> + echo "unique-file:file" >expect &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'describe directly tagged blob' '
> + git tag test-blob unique-file:file &&
> + git describe test-blob >actual &&
> + echo "unique-file:file" >expect &&
> + # suboptimal: we rather want to see "test-blob"
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'describe tag object' '
> + git tag test-blob-1 -a -m msg unique-file:file &&
> + test_must_fail git describe test-blob-1 2>actual &&
> + grep "fatal: test-blob-1 is neither a commit nor blob" actual
> +'
> +
> test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
>  i=1 &&
>  while test $i -lt 8000
> -- 
> 2.15.0.128.gcadd42da22
> 


  parent reply	other threads:[~2017-11-20 17:26 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-16  2:00 [PATCHv5 0/7] describe blob Stefan Beller
2017-11-16  2:00 ` [PATCHv5 1/7] t6120: fix typo in test name Stefan Beller
2017-11-16  2:00 ` [PATCHv5 2/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller
2017-11-16  2:00 ` [PATCHv5 3/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller
2017-11-16  2:00 ` [PATCHv5 4/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller
2017-11-16  2:00 ` [PATCHv5 5/7] builtin/describe.c: print debug statements earlier Stefan Beller
2017-11-16  2:00 ` [PATCHv5 6/7] builtin/describe.c: factor out describe_commit Stefan Beller
2017-11-16  2:00 ` [PATCHv5 7/7] builtin/describe.c: describe a blob Stefan Beller
2017-11-16  3:24   ` Junio C Hamano
2017-11-16 19:34     ` Stefan Beller
2017-11-22  7:55       ` Junio C Hamano
2017-11-22 17:00         ` Stefan Beller
2017-11-24  7:18           ` Junio C Hamano
2017-11-27 19:38             ` Stefan Beller
2017-11-27 23:37               ` Junio C Hamano
2017-11-20 15:10   ` Philip Oakley [this message]
2017-12-19 19:22   ` Junio C Hamano
2017-12-19 19:39     ` Stefan Beller
2017-12-22 23:10       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=B03A23EA75104E99BADBB420B6BEA614@PhilipOakley \
    --to=philipoakley@iee.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.