All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Improve errors from 'git diff --no-index'.
@ 2011-05-22 19:17 Anthony Foiani
       [not found] ` <7vlixyw4cx.fsf@alter.siamese.dyndns.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Anthony Foiani @ 2011-05-22 19:17 UTC (permalink / raw)
  To: anthony.foiani

I accidentally tried to use "git diff" when I wasn't within a git
repository, only to be confused by getting a usage message with no
particular error output.

This patch tries to provide better explanations to the user when the diff
cannot be done.

Signed-off-by: Anthony Foiani <anthony.foiani@gmail.com>
---
 diff-no-index.c |  122 ++++++++++++++++++++++++++++++++++++++++++------------
 1 files changed, 95 insertions(+), 27 deletions(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index 3a36144..3172788 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -149,21 +149,56 @@ static int queue_diff(struct diff_options *o,
 	}
 }
 
-static int path_outside_repo(const char *path)
+/**
+ * path_in_repo() - Determine whether @path is within a git directory/repo.
+ * @path: pathspec to test
+ *
+ * Returns true if @path is equal to, or lies within, the git tree
+ * returned by get_git_work_tree.
+ */
+static int path_in_repo(const char *path)
 {
-	const char *work_tree;
-	size_t len;
+	const char *tree;
+	const char *abs_path;
+	const char *abs_tree;
+	char norm_path[PATH_MAX+1];
+	char norm_tree[PATH_MAX+1];
+	size_t tree_len;
+	size_t path_len;
+
+	/* Paranoia. */
+	if (!path)
+		return 0;
 
-	if (!is_absolute_path(path))
+	/* If there's no tree, then there's no repo to be in. */
+	tree = get_git_work_tree();
+	if (!tree)
 		return 0;
-	work_tree = get_git_work_tree();
-	if (!work_tree)
-		return 1;
-	len = strlen(work_tree);
-	if (strncmp(path, work_tree, len) ||
-	    (path[len] != '\0' && path[len] != '/'))
-		return 1;
-	return 0;
+
+	/* Get the full paths and normalize them.. */
+	abs_path = real_path(path);
+	if (normalize_path_copy(norm_path, abs_path))
+		die("could not normalize path '%s'", abs_path);
+	abs_tree = real_path(tree);
+	if (normalize_path_copy(norm_tree, abs_tree))
+		die("could not normalize tree '%s'", abs_tree);
+
+	/* See if the work tree is a prefix *directory* of the given path. */
+	path_len = strlen(norm_path);
+	tree_len = strlen(norm_tree);
+
+	if (path_len < tree_len)
+		return 0; /* Can't be a subdir if it's shorter. */
+
+	if (strncmp(norm_path, norm_tree, tree_len))
+		return 0; /* Nor if the first 'tree_len' bytes don't match. */
+
+	if (norm_path[tree_len] == '\0')
+		return 1; /* The path and tree are the same. */
+	if (norm_path[tree_len] == '/')
+		return 1; /* The path is a subdir of the tree. */
+
+	return 0; /* Nope. */
 }
 
 void diff_no_index(struct rev_info *revs,
@@ -186,22 +221,55 @@ void diff_no_index(struct rev_info *revs,
 			break;
 	}
 
-	if (!no_index && !nongit) {
-		/*
-		 * Inside a git repository, without --no-index.  Only
-		 * when a path outside the repository is given,
-		 * e.g. "git diff /var/tmp/[12]", or "git diff
-		 * Makefile /var/tmp/Makefile", allow it to be used as
-		 * a colourful "diff" replacement.
-		 */
-		if ((argc != i + 2) ||
-		    (!path_outside_repo(argv[i]) &&
-		     !path_outside_repo(argv[i+1])))
-			return;
+	/* How many pathspecs were given on the command line? */
+	const int n_paths = (argc - i);
+
+	/* If --no-index wasn't specified explicitly, check to see if
+	 * we need to force it. */
+	if (!no_index) {
+
+		/* The first pathspec is the "left hand side" */
+		const char *lhs = (n_paths >= 1) ? argv[i] : NULL;
+		const int lhs_in_repo = lhs && path_in_repo(lhs);
+
+		/* And the second is the "right hand side" */
+		const char *rhs = (n_paths >= 2) ? argv[i+1] : NULL;
+		const int rhs_in_repo = rhs && path_in_repo(rhs);
+
+		int force_no_index = 0;
+
+		if (nongit) {
+			warning("'git diff' outside a repo, forcing --no-index");
+			force_no_index = 1;
+		}
+		else {
+			if (lhs_in_repo || rhs_in_repo)
+				return; /* Normal git diff, let core handle it. */
+
+			const int lhs_untracked = (lhs && !lhs_in_repo);
+			const int rhs_untracked = (rhs && !rhs_in_repo);
+
+			if (lhs_untracked && rhs_untracked)
+				warning("neither '%s' nor '%s' are tracked,"
+					" forcing --no-index", lhs, rhs );
+			else if (lhs_untracked)
+				warning("'%s' is not tracked,"
+					" forcing --no-index", lhs );
+			else if (rhs_untracked)
+				warning("'%s' is not tracked,"
+					" forcing --no-index", rhs );
+
+			force_no_index = lhs_untracked || rhs_untracked;
+		}
+
+		if (!force_no_index) /* Impossible? */
+			die("--no-index not set and not forced");
+	}
+
+	if (n_paths != 2) {
+		warning("no-index mode requires exactly two pathspecs");
+		usage("git diff --no-index <path> <path>");
 	}
-	if (argc != i + 2)
-		usagef("git diff %s <path> <path>",
-		       no_index ? "--no-index" : "[--no-index]");
 
 	diff_setup(&revs->diffopt);
 	for (i = 1; i < argc - 2; ) {
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] Improve errors from 'git diff --no-index'.
       [not found] ` <7vlixyw4cx.fsf@alter.siamese.dyndns.org>
@ 2011-05-23  2:35   ` Anthony Foiani
  2011-05-23  3:18     ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Anthony Foiani @ 2011-05-23  2:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio --

On Sun, May 22, 2011 at 8:14 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Anthony Foiani <anthony.foiani@gmail.com> writes:
>
>> I accidentally tried to use "git diff" when I wasn't within a git
>> repository, only to be confused by getting a usage message with no
>> particular error output.
>
> I do not understand this at all.
>
>    $ cd /
>    $ git diff
>    usage: git diff [--no-index] <path> <path>
>
> What's unclear about it?

It's hard for me to explain right now, as I just spent a few hours
figuring out exactly what "git diff" assumed would happen.  And now
that I have done that study, I would not need anything beyond the
original error message, either.

But when I screwed up earlier today, I was somewhat taken aback by the
fact that I kept on getting the same usage message with no indication
of what the heck I was doing wrong:

$ git diff -b main.c
usage: git diff [--no-index] <path> <path>
$ git diff --cached -b main.c
usage: git diff [--no-index] <path> <path>
$ git diff -b HEAD -- main.c
usage: git diff [--no-index] <path> <path>
$ git diff -b -- main.c
usage: git diff [--no-index] <path> <path>

It would have saved me a chunk of time to have one of my diagnostics
(in this case, "not in a git repo") to knock some sense into me
earlier.

*shrug* Toss it in the bit-bucket if you like, but the work's already
been done. And I happen to find the new code more readable than the
previous triple-negative and de Morgan's-expanded conditionals, but
that's of course personal taste.

For that matter, the existing code was already doing many of these
checks, but not communicating the results back in any way other than
presenting a usage string. I apparently interpreted that as "you're
calling me wrong", not "you're using me wrong".

There might be a bug even in the previous code, though: there's an odd
early exit from 'path_outside_repo' for absolute pathnames, but I got
lost in the multiple negations and could never quite convince myself
of its correctness (or lack thereof).

Either way, thanks for all your work on git!

Best regards,
Tony

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Improve errors from 'git diff --no-index'.
  2011-05-23  2:35   ` Anthony Foiani
@ 2011-05-23  3:18     ` Junio C Hamano
  2011-05-23  4:05       ` Anthony Foiani
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2011-05-23  3:18 UTC (permalink / raw)
  To: Anthony Foiani; +Cc: Junio C Hamano, git

Anthony Foiani <anthony.foiani@gmail.com> writes:

> On Sun, May 22, 2011 at 8:14 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Anthony Foiani <anthony.foiani@gmail.com> writes:
>>
>>> I accidentally tried to use "git diff" when I wasn't within a git
>>> repository, only to be confused by getting a usage message with no
>>> particular error output.
>>
>> I do not understand this at all.
>>
>>    $ cd /
>>    $ git diff
>>    usage: git diff [--no-index] <path> <path>
>>
>> What's unclear about it?
>
> It's hard for me to explain right now,...

No, it was only your 'I tried to use "git diff"' that was confusing to me.

With pathspecs or any other options, it is understandable that you were
confused, for example from an output like this:

> $ git diff -b main.c
> usage: git diff [--no-index] <path> <path>

At that point we are committed to no-index codepath, so probably a good
thing to do is to update that confusing usage message a bit better.

It already does change the behaviour when --no-index is not given by the
end user, so perhaps it would be sufficient to change this part:

	if (argc != i + 2)
        	usagef("git diff %s <path> <path>",
                	no_index ? "--no-index" : "[--no-index]");

to something like (caution: I am not typing this in my MUA):

	if (argc != i + 2) {
        	if (no_index)
			warning("Assuming '--no-index'; you are in no repository");
        	usagef("git diff %s <path> <path>",
                	no_index ? "--no-index" : "[--no-index]");
	}

without changing anything else.

That would be just a patch with 1 line removed, 4 lines added, no?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Improve errors from 'git diff --no-index'.
  2011-05-23  3:18     ` Junio C Hamano
@ 2011-05-23  4:05       ` Anthony Foiani
  2011-05-23 19:22         ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Anthony Foiani @ 2011-05-23  4:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio --

On Sun, May 22, 2011 at 9:18 PM, Junio C Hamano <gitster@pobox.com> wrote:
> No, it was only your 'I tried to use "git diff"' that was confusing to me.

Apologies!

> With pathspecs or any other options, it is understandable that you were
> confused, for example from an output like this:
>
>> $ git diff -b main.c
>> usage: git diff [--no-index] <path> <path>
>
> At that point we are committed to no-index codepath, so probably a good
> thing to do is to update that confusing usage message a bit better.
>
> It already does change the behaviour when --no-index is not given by the
> end user, so perhaps it would be sufficient to change this part:
>
>        if (argc != i + 2)
>                usagef("git diff %s <path> <path>",
>                        no_index ? "--no-index" : "[--no-index]");
>
> to something like (caution: I am not typing this in my MUA):
>
>        if (argc != i + 2) {
>                if (no_index)
>                        warning("Assuming '--no-index'; you are in no repository");
>                usagef("git diff %s <path> <path>",
>                        no_index ? "--no-index" : "[--no-index]");
>        }
>
> without changing anything else.
>
> That would be just a patch with 1 line removed, 4 lines added, no?

The "no_index" variable purely indicates whether or not "--no-index"
was given on the command line; for this particular warning, you want
"nongit" instead.

Alternately, you could simply do this above the "if (argc..." condition:

if (!no_index)
        warning("Assuming '--no-index'");

It wouldn't explain *why*, but would at least provide some sort of a
hint to the user that they're maybe not doing what they thought they
were doing.

If that's more to your taste, then I'd rather see this than nothing.  :)

You might also want to account for the fact that, even within a git
repo, if "git diff" is invoked on two paths that are both outside the
repo, then --no-index is forced. E.g.:

$ cd linux-2.6-stable/
$ git log -1 --pretty=short
commit 831d52bc153971b70e64eccfbed2b232394f22f8
Author: Suresh Siddha <suresh.b.siddha@intel.com>

    x86, mm: avoid possible bogus tlb entries by clearing prev
mm_cpumask after switching mm

$ ls -al /tmp/{foo,bar}
-rw-rw-r--. 1 tony tony 0 May 22 10:09 /tmp/bar
-rw-rw-r--. 1 tony tony 0 May 22 10:00 /tmp/foo

$ # current git tip:
$ git diff /tmp/{foo,bar}

$ # with my patch:
$ ../git/git-diff /tmp/{foo,bar}
warning: neither '/tmp/foo' nor '/tmp/bar' are tracked, forcing --no-index

Although I apparently did break something...

$ touch bar
$ git diff bar /tmp/foo
$ ../git/git-diff bar /tmp/foo
fatal: '/tmp/foo' is outside repository

Given that I broke things, and that a +100/-25 patch is a bit
excessive, consider it retracted.

Thanks once again!

Best regards,
Tony

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Improve errors from 'git diff --no-index'.
  2011-05-23  4:05       ` Anthony Foiani
@ 2011-05-23 19:22         ` Junio C Hamano
  2011-05-23 20:50           ` Jeff King
  2011-05-23 21:24           ` Anthony Foiani
  0 siblings, 2 replies; 7+ messages in thread
From: Junio C Hamano @ 2011-05-23 19:22 UTC (permalink / raw)
  To: Anthony Foiani; +Cc: git

Anthony Foiani <anthony.foiani@gmail.com> writes:

> $ ls -al /tmp/{foo,bar}
> -rw-rw-r--. 1 tony tony 0 May 22 10:09 /tmp/bar
> -rw-rw-r--. 1 tony tony 0 May 22 10:00 /tmp/foo
>
> $ # current git tip:
> $ git diff /tmp/{foo,bar}
>
> $ # with my patch:
> $ ../git/git-diff /tmp/{foo,bar}
> warning: neither '/tmp/foo' nor '/tmp/bar' are tracked, forcing --no-index

I actually consider this a regression. We are giving an output that the
user wanted to see, and I do not see a reason why we need to warn.

A tangent.

One thing that I have always been unhappy about --no-index is that it does
not really mesh well with the notion of what a "git diff" is. "git diff"
inherently is an operation to take two collections of contents labeled
with paths, and show series of patch output between corresponding paths in
these collections, while rename/copy detection may affect the definition
of "correspoinding paths".

A typical "I know 'git diff' has a lot more features like color-words that
my platform 'diff' does not support, so let me use 'git diff' instead"
session does something like:

	$ git diff [--no-index] /tmp/foo /tmp/bar

but such a request does not compare "two collections of paths that have
corresponding paths" at all. We could say we are comparing a collection
that has tmp/foo with another that has tmp/bar, but then we should either
emit a delete patch for tmp/foo and a create patch for tmp/bar, or emit a
rename patch to create tmp/bar out of tmp/foo if we want to be consistent.

But that consistency goes totally against what the users would expect.
This inconsistency is not a fault of either the definition of "git diff"
nor the user's expectation. They are fundamentally different and the root
cause of it is that we support --no-index diff between randomly chosen two
files. I am not saying we should drop that feature, but it does not change
the fact that we had to add extra code in the output codepath only to
support this "outside git" use case to suppress rename information (and
probably other things I do not remember).

The _only_ use of --no-index that is in line with what "git diff" does is
to compare two directories as the "two collections of contents" above,
i.e.

	$ git diff --no-index old/ new/

and then support pathspecs, like this:

	$ git diff --no-index old/ new/ -- Makefile '*.c'

But I do not think the current implementation does not even support this
only sane usecase.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Improve errors from 'git diff --no-index'.
  2011-05-23 19:22         ` Junio C Hamano
@ 2011-05-23 20:50           ` Jeff King
  2011-05-23 21:24           ` Anthony Foiani
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff King @ 2011-05-23 20:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Anthony Foiani, git

On Mon, May 23, 2011 at 12:22:34PM -0700, Junio C Hamano wrote:

> > $ # with my patch:
> > $ ../git/git-diff /tmp/{foo,bar}
> > warning: neither '/tmp/foo' nor '/tmp/bar' are tracked, forcing --no-index
> 
> I actually consider this a regression. We are giving an output that the
> user wanted to see, and I do not see a reason why we need to warn.

Agreed.

> One thing that I have always been unhappy about --no-index is that it does
> not really mesh well with the notion of what a "git diff" is. "git diff"
> inherently is an operation to take two collections of contents labeled
> with paths, and show series of patch output between corresponding paths in
> these collections, while rename/copy detection may affect the definition
> of "correspoinding paths".

It's not just --no-index that breaks this abstraction. How about:

  git diff HEAD:Makefile v1.7.5:Makefile

or even

  git diff :1:Makefile :2:Makefile

?

So I think even without --no-index, we have the concept that diff
compares two "things", and that only some of those things are a
collection of paths (either a tree-ish, or possibly a directory, but as
you note below, I suspect the latter is broken).

And obviously the two "things" must either both be single items, or must
both be collections, as any other comparison doesn't really make sense.
We do catch this, but it's another place where it might be nice to have
better error messages:

  $ git diff HEAD:Makefile HEAD
  usage: git diff [<options>] [<commit> [<commit>]] [--] [<path>...]

> 	$ git diff [--no-index] /tmp/foo /tmp/bar
> 
> but such a request does not compare "two collections of paths that have
> corresponding paths" at all. We could say we are comparing a collection
> that has tmp/foo with another that has tmp/bar, but then we should either
> emit a delete patch for tmp/foo and a create patch for tmp/bar, or emit a
> rename patch to create tmp/bar out of tmp/foo if we want to be consistent.
> 
> But that consistency goes totally against what the users would expect.

Right. If I give git two single items to compare, rather than
collections, I am not going to expect renames at all, which
fundamentally involve path movement. Those are a collection thing, and I
asked for a much narrower scope.

> The _only_ use of --no-index that is in line with what "git diff" does is
> to compare two directories as the "two collections of contents" above,
> i.e.
> 
> 	$ git diff --no-index old/ new/
> 
> and then support pathspecs, like this:
> 
> 	$ git diff --no-index old/ new/ -- Makefile '*.c'
> 
> But I do not think the current implementation does not even support this
> only sane usecase.

Yeah, I would expect git to support that, but knowing the builtin/diff.c
code, I am not too surprised if it doesn't. :)

On the subject of things git diff should probably do but I doubt anybody
cares enough to implement, it would be nice to be able to do:

  git diff $merge^1 $merge^2 $merge

and get the combined diff that "git show $merge" would give you, but
without the log cruft.

I also wonder if the reverse would be useful:

  git diff $ours $theirs `git merge-base $ours $theirs`

which would produce something like a reverse combined diff, or "how have
we diverged". E.g.,:

  diff --combined file
  index $base..$ours,$theirs
  --- a/file
  --- b/file
  @@@ -1,1 +1,1 +1,1 @@@
  --what was in the base
  + what is in ours
   +what is in theirs

But that is more than just a git-diff interface problem; we would
actually have to write the reverse-combined diff code (I _think_ it
should be a pretty simple reversal of the regular combined diff, but I
haven't really thought it through).

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Improve errors from 'git diff --no-index'.
  2011-05-23 19:22         ` Junio C Hamano
  2011-05-23 20:50           ` Jeff King
@ 2011-05-23 21:24           ` Anthony Foiani
  1 sibling, 0 replies; 7+ messages in thread
From: Anthony Foiani @ 2011-05-23 21:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio --

This fundamentally all started as me confusing myself, and the patch
was the result of me scratching that itch.  As such, I'm perfectly
willing to drop it, but I did want to address a few last things:

On Mon, May 23, 2011 at 1:22 PM, Junio C Hamano <gitster@pobox.com> wrote:

>> $ # with my patch:
>> $ ../git/git-diff /tmp/{foo,bar}
>> warning: neither '/tmp/foo' nor '/tmp/bar' are tracked, forcing --no-index
>
> I actually consider this a regression. We are giving an output that the
> user wanted to see, and I do not see a reason why we need to warn.

No, we're not: the user did *not* ask for a no-index diff, but we're
giving them one anyway. That warrants a warn (or other message; if
there's a less-severe version of "warn", then that's fine) to tell
them that the software is trying to "do what you mean".

Please also keep in mind that you live and breathe git, and have done
so for years (and I'm very thankful for that!).  But not everyone has
the git mindset all the time; adding an extra message or two doesn't
seem like too much pain to help novices and dilettantes.

> But that consistency goes totally against what the users would expect.
> This inconsistency is not a fault of either the definition of "git diff"
> nor the user's expectation. They are fundamentally different and the root
> cause of it is that we support --no-index diff between randomly chosen two
> files.

Maybe we *should* drop the feature: I would have been *happier* had
the error message from my initial attempt been simply: "not in a git
repo, can't use git diff". (Since that was, fundamentally, the mistake
I made.)

Maybe install "diff-git" or even "gdiff" (although maybe that's used
by gnu diff on non-gnu platforms, dunno).

But I maintain that the current combination of options, assumptions,
and error output are insufficient, and I still feel that my patch(es)
improve that matter.  *shrug*  Your call, of course.

Best regards,
Tony

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-05-23 21:24 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-22 19:17 [PATCH] Improve errors from 'git diff --no-index' Anthony Foiani
     [not found] ` <7vlixyw4cx.fsf@alter.siamese.dyndns.org>
2011-05-23  2:35   ` Anthony Foiani
2011-05-23  3:18     ` Junio C Hamano
2011-05-23  4:05       ` Anthony Foiani
2011-05-23 19:22         ` Junio C Hamano
2011-05-23 20:50           ` Jeff King
2011-05-23 21:24           ` Anthony Foiani

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.