All of lore.kernel.org
 help / color / mirror / Atom feed
* misleading diff-hunk header
@ 2012-08-21 13:21 Tim Chase
  0 siblings, 0 replies; 15+ messages in thread
From: Tim Chase @ 2012-08-21 13:21 UTC (permalink / raw)
  To: git

[posted originally to git-users@ but advised this would be a better
forum]

diff.{type}.xfuncname seems to start searching backwards in
from the beginning of the hunk, not the first differing line.

To reproduce:

  $ mkdir tmp
  $ cd tmp
  $ git init
  $ cat > foo.c <<EOF
  int call_me(int maybe)
  {
  }

  int main()
  {
  }
  EOF
  $ git add foo.c
  $ git commit -m "Initial checkin"
  $ ed foo.c
  # main() should return 0
  $i
    return 0;
  .
  wq
  $ git diff

The diff returns a header line of

  @@ -4,4 +4,5 @@ int call_me(int maybe)

   int main()
   {
  +  return 0;
   }

misleadingly suggesting that the change occurred in the call_me()
function, rather than in main()

I'm running 1.7.2.5 from Debian Stable if that makes a difference.

Is this expected/proper behavior?

-tkc

FWIW, I stumbled across this tinkering with
diff.{typename}.xfuncname detailed in gitattributes(5)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: misleading diff-hunk header
  2012-08-26 10:43                 ` Stefano Lattarini
@ 2012-08-26 18:53                   ` Junio C Hamano
  0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2012-08-26 18:53 UTC (permalink / raw)
  To: Stefano Lattarini; +Cc: Tim Chase, Jeff King, git

Stefano Lattarini <stefano.lattarini@gmail.com> writes:

> On 08/25/2012 02:56 PM, Tim Chase wrote:
>> On 08/24/12 23:29, Junio C Hamano wrote:
>>> Tim Chase <git@tim.thechases.com> writes:
>>>> If the documented purpose of "diff -p" (and by proxy
>>>> diff.{type}.xfuncname) is to show the name of the *function*
>>>> containing the changed lines,....
>>>
>>> Yeah, the documentation is misleading, but I do not offhand think of
>>> a better phrasing. Perhaps you could send in a patch to improve it.
>>>
>>> How does GNU manual explain the option?
>> 
>> Tersely. :-)
>> 
>>        -p  --show-c-function
>>               Show which C function each change is in.
>>
> That's in the manpage, which is basically just a copy of the output from
> "diff --help".  In the texinfo manual (which is the real documentation),
> there are additional explanations, saying, among other things:
>
>     To show in which functions differences occur for C and similar languages,
>     you can use the --show-c-function (-p) option. This option automatically
>     defaults to the context output format (see Context Format), with the
>     default number of lines of context. You can override that number with
>     -C lines elsewhere in the command line. You can override both the format
>     and the number with -U lines elsewhere in the command line.
>     The -p option is equivalent to -F '^[[:alpha:]$_]' if the unified format
>     is specified, otherwise -c -F '^[[:alpha:]$_]' (see Specified Headings).
>     GNU diff provides this option for the sake of convenience.
>     ...
>     The --show-function-line (-F) option finds the nearest unchanged line
>     that precedes each hunk of differences and matches the given regular
>     expression.

So in short, if we say "Show which function each change is in" in
the documentation, that is consistent with what GNU does and that is
described consistently with what GNU says, modulo that we obviously
do more than "C" via the diff.<driver>.xfuncname mechanism.

We already document diff.<driver>.xfuncname as determining "the hunk
header", and the documentation that is referred to (i.e. gitattributes)
shows the shape of a hunk in the "diff" output:

            @@ -k,l +n,m @@ TEXT

    This is called a 'hunk header'.  The "TEXT" portion is by default a line
    that begins with an alphabet, an underscore or a dollar sign; this
    matches what GNU 'diff -p' output uses.

and then later says:

    Then, you would define a "diff.tex.xfuncname" configuration to
    specify a regular expression that matches a line that you would
    want to appear as the hunk header "TEXT".

Honestly, I do not offhand see an obvious and possible room for vast
improvement over what we already have, so my RFH to Tim that appears
at the beginning of what you quoted from my previous message still
stands ;-).

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: misleading diff-hunk header
  2012-08-25 12:56               ` Tim Chase
@ 2012-08-26 10:43                 ` Stefano Lattarini
  2012-08-26 18:53                   ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Stefano Lattarini @ 2012-08-26 10:43 UTC (permalink / raw)
  To: Tim Chase; +Cc: Junio C Hamano, Jeff King, git

On 08/25/2012 02:56 PM, Tim Chase wrote:
> On 08/24/12 23:29, Junio C Hamano wrote:
>> Tim Chase <git@tim.thechases.com> writes:
>>> If the documented purpose of "diff -p" (and by proxy
>>> diff.{type}.xfuncname) is to show the name of the *function*
>>> containing the changed lines,....
>>
>> Yeah, the documentation is misleading, but I do not offhand think of
>> a better phrasing. Perhaps you could send in a patch to improve it.
>>
>> How does GNU manual explain the option?
> 
> Tersely. :-)
> 
>        -p  --show-c-function
>               Show which C function each change is in.
>
That's in the manpage, which is basically just a copy of the output from
"diff --help".  In the texinfo manual (which is the real documentation),
there are additional explanations, saying, among other things:

    To show in which functions differences occur for C and similar languages,
    you can use the --show-c-function (-p) option. This option automatically
    defaults to the context output format (see Context Format), with the
    default number of lines of context. You can override that number with
    -C lines elsewhere in the command line. You can override both the format
    and the number with -U lines elsewhere in the command line.
    The -p option is equivalent to -F '^[[:alpha:]$_]' if the unified format
    is specified, otherwise -c -F '^[[:alpha:]$_]' (see Specified Headings).
    GNU diff provides this option for the sake of convenience.
    ...
    The --show-function-line (-F) option finds the nearest unchanged line
    that precedes each hunk of differences and matches the given regular
    expression.

You can find more information in the on-line documentation:

  <http://www.gnu.org/software/diffutils/manual/diffutils.html>

HTH,
  Stefano

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: misleading diff-hunk header
  2012-08-25  4:29             ` Junio C Hamano
@ 2012-08-25 12:56               ` Tim Chase
  2012-08-26 10:43                 ` Stefano Lattarini
  0 siblings, 1 reply; 15+ messages in thread
From: Tim Chase @ 2012-08-25 12:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, git

On 08/24/12 23:29, Junio C Hamano wrote:
> Tim Chase <git@tim.thechases.com> writes:
>> If the documented purpose of "diff -p" (and by proxy
>> diff.{type}.xfuncname) is to show the name of the *function*
>> containing the changed lines,....
> 
> Yeah, the documentation is misleading, but I do not offhand think of
> a better phrasing. Perhaps you could send in a patch to improve it.
> 
> How does GNU manual explain the option?

Tersely. :-)

       -p  --show-c-function
              Show which C function each change is in.

And that's it.  To describe the current behavior, it might be better
written as "Find and show the first function definition prior to the
hunk".  The code in diff(1) actually just uses the regexp something
like "^[a-z]" which happens to find function definitions, but can
also find module-level variable definitions, structs, etc.

-tkc

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: misleading diff-hunk header
  2012-08-25  0:41           ` Tim Chase
@ 2012-08-25  4:29             ` Junio C Hamano
  2012-08-25 12:56               ` Tim Chase
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2012-08-25  4:29 UTC (permalink / raw)
  To: Tim Chase; +Cc: Jeff King, git

Tim Chase <git@tim.thechases.com> writes:

> If the documented purpose of "diff -p" (and by proxy
> diff.{type}.xfuncname) is to show the name of the *function*
> containing the changed lines,....

Yeah, the documentation is misleading, but I do not offhand think of
a better phrasing. Perhaps you could send in a patch to improve it.

How does GNU manual explain the option?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: misleading diff-hunk header
  2012-08-24 16:44         ` Jeff King
@ 2012-08-25  0:41           ` Tim Chase
  2012-08-25  4:29             ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Tim Chase @ 2012-08-25  0:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On 08/24/12 11:44, Jeff King wrote:
> With the old code, you'd get:
> 
> 	diff --git a/old b/new
> 	index f384549..1066a25 100644
> 	--- a/old
> 	+++ b/new
> 	@@ -2,3 +2,3 @@ one
> 	 two
> 	-three
> 	+three -- modified
> 	 four
> 
> So the hunk header is showing you something useful; the element just
> above your context. But with my patch, you'd see:
> 
> 	diff --git a/old b/new
> 	index f384549..1066a25 100644
> 	--- a/old
> 	+++ b/new
> 	@@ -2,3 +2,3 @@ two
> 	 two
> 	-three
> 	+three -- modified
> 	 four
> 
> I.e., it shows the element just before the change, which is already in
> the context anyway. So it's actually less useful. Although note that the
> current behavior is not all that useful, either; it is not really giving
> you any information about the change, but rather just showing one extra
> line of context.
> 
> So I would say that which you would prefer might depend on exactly what
> you are diffing. But I would also argue that in any case where the new
> code produces a worse result, the hunk header was not all that useful to
> begin with.

If the documented purpose of "diff -p" (and by proxy
diff.{type}.xfuncname) is to show the name of the *function*
containing the changed lines, and all you have is a list of lines
with no function names, it's pretty arbitrary to call either
behavior "worse". :-)

-tkc

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: misleading diff-hunk header
  2012-08-24 14:29       ` Jeff King
  2012-08-24 15:05         ` Tim Chase
@ 2012-08-24 16:44         ` Jeff King
  2012-08-25  0:41           ` Tim Chase
  1 sibling, 1 reply; 15+ messages in thread
From: Jeff King @ 2012-08-24 16:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Tim Chase, Thomas Rast, git

On Fri, Aug 24, 2012 at 10:29:09AM -0400, Jeff King wrote:

> > Would this be sufficient?  Instead of looking for the first line that
> > matches the "beginning" pattern going backwards starting from one line
> > before the displayed context, we start our examination at the first line
> > shown in the context.
> > 
> >  xdiff/xemit.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> > 
> > diff --git a/xdiff/xemit.c b/xdiff/xemit.c
> > index 277e2ee..5f9c0e0 100644
> > --- a/xdiff/xemit.c
> > +++ b/xdiff/xemit.c
> > @@ -131,7 +131,7 @@ int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
> >  
> >  		if (xecfg->flags & XDL_EMIT_FUNCNAMES) {
> >  			long l;
> > -			for (l = s1 - 1; l >= 0 && l > funclineprev; l--) {
> > +			for (l = s1; l >= 0 && l > funclineprev; l--) {
> >  				const char *rec;
> >  				long reclen = xdl_get_rec(&xe->xdf1, l, &rec);
> >  				long newfunclen = ff(rec, reclen, funcbuf,
> 
> In the case we were discussing then, the modified function started on
> the first line of context. But as Tim's example shows, it doesn't
> necessarily have to. I think it would make more sense to start counting
> from the first modified line.

That patch would look something like this:

diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index d11dbf9..441ecf5 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -194,7 +194,7 @@ int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
 
 		if (xecfg->flags & XDL_EMIT_FUNCNAMES) {
 			get_func_line(xe, xecfg, &func_line,
-				      s1 - 1, funclineprev);
+				      xche->i1 - 1, funclineprev);
 			funclineprev = s1 - 1;
 		}
 		if (xdl_emit_hunk_hdr(s1 + 1, e1 - s1, s2 + 1, e2 - s2,

Note that this breaks a ton of tests. Some of them are just noise (e.g.,
t4042 changes line 2, so line 1 is the top of the context; before, we
would show no hunk header, since we were at the top of the file, but now
we will show line 1). Some of them are improved in the way that this
patch intends (e.g., t4051).

But some I'm not sure of. For instance, the failure in t4018.38 is odd.
I think it's because the pattern it is looking for is a somewhat odd toy
example (it's looking for a line with "s" in it, so naturally when we
shift the start-point of our search, we are likely to find some other
false positive). But it raises an interesting point: what if the pattern
is just looking for lines in a list, and not an enclosing function?

For example, imagine you have a file a list of items, one per line.
With the old code, you'd get:

	diff --git a/old b/new
	index f384549..1066a25 100644
	--- a/old
	+++ b/new
	@@ -2,3 +2,3 @@ one
	 two
	-three
	+three -- modified
	 four

So the hunk header is showing you something useful; the element just
above your context. But with my patch, you'd see:

	diff --git a/old b/new
	index f384549..1066a25 100644
	--- a/old
	+++ b/new
	@@ -2,3 +2,3 @@ two
	 two
	-three
	+three -- modified
	 four

I.e., it shows the element just before the change, which is already in
the context anyway. So it's actually less useful. Although note that the
current behavior is not all that useful, either; it is not really giving
you any information about the change, but rather just showing one extra
line of context.

So I would say that which you would prefer might depend on exactly what
you are diffing. But I would also argue that in any case where the new
code produces a worse result, the hunk header was not all that useful to
begin with.

-Peff

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: misleading diff-hunk header
  2012-08-24 14:29       ` Jeff King
@ 2012-08-24 15:05         ` Tim Chase
  2012-08-24 16:44         ` Jeff King
  1 sibling, 0 replies; 15+ messages in thread
From: Tim Chase @ 2012-08-24 15:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Thomas Rast, git

On 08/24/12 09:29, Jeff King wrote:
> On Tue, Aug 21, 2012 at 10:52:03AM -0700, Junio C Hamano wrote:
> 
>>>>> diff.{type}.xfuncname seems to start searching backwards in
>>>>> from the beginning of the hunk, not the first differing line.
>>>> [...]
>>>>>   @@ -4,4 +4,5 @@ int call_me(int maybe)
>>>>>
>>>>>    int main()
>>>>>    {
>>>>>   +  return 0;
>>>>>    }
>>>>>
>>>>> misleadingly suggesting that the change occurred in the call_me()
>>>>> function, rather than in main()
>>>>
>>>> I think that's intentional, and matches what 'diff -p' does. 
...
>>  xdiff/xemit.c |    2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/xdiff/xemit.c b/xdiff/xemit.c
>> index 277e2ee..5f9c0e0 100644
>> --- a/xdiff/xemit.c
>> +++ b/xdiff/xemit.c
>> @@ -131,7 +131,7 @@ int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
>>  
>>  		if (xecfg->flags & XDL_EMIT_FUNCNAMES) {
>>  			long l;
>> -			for (l = s1 - 1; l >= 0 && l > funclineprev; l--) {
>> +			for (l = s1; l >= 0 && l > funclineprev; l--) {
>>  				const char *rec;
>>  				long reclen = xdl_get_rec(&xe->xdf1, l, &rec);
>>  				long newfunclen = ff(rec, reclen, funcbuf,
> 
> In the case we were discussing then, the modified function started on
> the first line of context. But as Tim's example shows, it doesn't
> necessarily have to. I think it would make more sense to start counting
> from the first modified line.

Junio mentions that it matches the "diff -p" output, though I'd
consider that a bug in diff as well, since the diff(1) man/info
pages state "-p  Show which C function each change is in."  In the
above (both with "diff -p" and with git), the change was clearly in
main() but it's not showing main().  Documented behavior and
implemented behavior conflict.

Starting at the first differing line rather than the first line of
context in the hunk would ameliorate this.  It doesn't address what
happens if multiple functions were changed in the same hunk, but at
least it becomes correct for the first one.  More complex code might
be doable to split hunks if an xfuncname match occurs between two
disjoint changes in the same hunk.  But for my purposes here, the
above should suffice.

-tkc

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: misleading diff-hunk header
  2012-08-21 17:52     ` Junio C Hamano
@ 2012-08-24 14:29       ` Jeff King
  2012-08-24 15:05         ` Tim Chase
  2012-08-24 16:44         ` Jeff King
  0 siblings, 2 replies; 15+ messages in thread
From: Jeff King @ 2012-08-24 14:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Tim Chase, Thomas Rast, git

On Tue, Aug 21, 2012 at 10:52:03AM -0700, Junio C Hamano wrote:

> >>> diff.{type}.xfuncname seems to start searching backwards in
> >>> from the beginning of the hunk, not the first differing line.
> >> [...]
> >>>   @@ -4,4 +4,5 @@ int call_me(int maybe)
> >>>
> >>>    int main()
> >>>    {
> >>>   +  return 0;
> >>>    }
> >>>
> >>> misleadingly suggesting that the change occurred in the call_me()
> >>> function, rather than in main()
> >> 
> >> I think that's intentional, and matches what 'diff -p' does.  It gives
> >> you the context before the hunk.  After all, if a new function starts in
> >> the leading context lines, you can see that in the usual diff data.
> 
> Correct.  It is about "give the user _more_ hint/clue on the context
> of the hunk", in addition to what the user can see in the
> pre-context of the hunk, so it is pointless to hoist "int main()"
> there.

I don't think it is pointless. If you are skimming a diff, then the hunk
headers stand out to easily show which functions were touched. Of
course, as you mentioned later in your email, it is not an exhaustive
list, and I think for Tim's use case, he needs to actually read and
parse the whole patch.

But mentioning call_me here _is_ pointless, because it is not relevant
context at all (it was not modified; it just happens to be located near
the code in question).  So I would argue that showing main() is more
useful to a reader.

It gets even more obvious as you increase the context. Imagine I have
code like this:

   int foo(void)
   {
          return 1;
   }

   int bar(void)
   {
          return 2;
   }

   int baz(void)
   {
          return 3;
   }

and I modify "baz" to return "4" instead. With the regular diff
settings, the hunk header would claim that "bar()" is the context in the
hunk header. But if I ask for -U7, then "foo()" is mentioned in the hunk
header. To me, that doesn't make sense; the modification is exactly the
same, so why would the hunk header differ?

I suppose one could argue that the hunk header is not showing the
context of the change, but rather the context of the surrounding context
lines. But that doesn't seem useful to me.

We discussed this a while ago and you did a "how about this" patch:

  http://article.gmane.org/gmane.comp.version-control.git/181385

Gmane seems to be acting up this morning, so here is the patch (and your
comment) for reference:

> Would this be sufficient?  Instead of looking for the first line that
> matches the "beginning" pattern going backwards starting from one line
> before the displayed context, we start our examination at the first line
> shown in the context.
> 
>  xdiff/xemit.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/xdiff/xemit.c b/xdiff/xemit.c
> index 277e2ee..5f9c0e0 100644
> --- a/xdiff/xemit.c
> +++ b/xdiff/xemit.c
> @@ -131,7 +131,7 @@ int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
>  
>  		if (xecfg->flags & XDL_EMIT_FUNCNAMES) {
>  			long l;
> -			for (l = s1 - 1; l >= 0 && l > funclineprev; l--) {
> +			for (l = s1; l >= 0 && l > funclineprev; l--) {
>  				const char *rec;
>  				long reclen = xdl_get_rec(&xe->xdf1, l, &rec);
>  				long newfunclen = ff(rec, reclen, funcbuf,

In the case we were discussing then, the modified function started on
the first line of context. But as Tim's example shows, it doesn't
necessarily have to. I think it would make more sense to start counting
from the first modified line.

-Peff

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: misleading diff-hunk header
  2012-08-21 17:39     ` Johannes Sixt
@ 2012-08-21 22:29       ` Junio C Hamano
  0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2012-08-21 22:29 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Tim Chase, Thomas Rast, git

Johannes Sixt <j6t@kdbg.org> writes:

> Am 21.08.2012 17:42, schrieb Tim Chase:
>> On 08/21/12 10:22, Thomas Rast wrote:
>>>> misleadingly suggesting that the change occurred in the call_me()
>>>> function, rather than in main()
>>>
>>> I think that's intentional, and matches what 'diff -p' does...
>> 
>> Okay...I tested "diff -p" and can't argue (much) with historical
>> adherence.  It just makes it hard for me to gather some stats on the
>> functions that changed, and requires that I look in more than one
>> place (both in the header, and in the leading context) rather than
>> having a single authoritative place to grep.
>
> If it's only for stats, why not just remove the context with -U0?

I actually think you want a way to say -U<sufficiently-large> in
this case instead of unsightly -U99999.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: misleading diff-hunk header
  2012-08-21 15:42   ` Tim Chase
  2012-08-21 17:39     ` Johannes Sixt
@ 2012-08-21 17:52     ` Junio C Hamano
  2012-08-24 14:29       ` Jeff King
  1 sibling, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2012-08-21 17:52 UTC (permalink / raw)
  To: Tim Chase; +Cc: Thomas Rast, git

Tim Chase <git@tim.thechases.com> writes:

> On 08/21/12 10:22, Thomas Rast wrote:
>> Tim Chase <git@tim.thechases.com> writes:
>> 
>>> diff.{type}.xfuncname seems to start searching backwards in
>>> from the beginning of the hunk, not the first differing line.
>> [...]
>>>   @@ -4,4 +4,5 @@ int call_me(int maybe)
>>>
>>>    int main()
>>>    {
>>>   +  return 0;
>>>    }
>>>
>>> misleadingly suggesting that the change occurred in the call_me()
>>> function, rather than in main()
>> 
>> I think that's intentional, and matches what 'diff -p' does.  It gives
>> you the context before the hunk.  After all, if a new function starts in
>> the leading context lines, you can see that in the usual diff data.

Correct.  It is about "give the user _more_ hint/clue on the context
of the hunk", in addition to what the user can see in the
pre-context of the hunk, so it is pointless to hoist "int main()"
there.

> ...  It just makes it hard for me to gather some stats on the
> functions that changed, and requires that I look in more than one
> place (both in the header, and in the leading context) rather than
> having a single authoritative place to grep.

The right way to answer "which functions were touched?" question is
to ignore what you see on the hunk header "@@ .. @@" lines and only
look at the patch text, running "git diff" with larger number of
context lines as necessary.

If you have a large patch hunk that adds or removes two or more new
functions, you would have to look at the patch text _anyway_ to
learn about these two or more names---they cannot possibly both
appear on the hunk header lines, so looking at the context hint
there is pointless for the purpose for which you are using "diff"
output.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: misleading diff-hunk header
  2012-08-21 15:42   ` Tim Chase
@ 2012-08-21 17:39     ` Johannes Sixt
  2012-08-21 22:29       ` Junio C Hamano
  2012-08-21 17:52     ` Junio C Hamano
  1 sibling, 1 reply; 15+ messages in thread
From: Johannes Sixt @ 2012-08-21 17:39 UTC (permalink / raw)
  To: Tim Chase; +Cc: Thomas Rast, git

Am 21.08.2012 17:42, schrieb Tim Chase:
> On 08/21/12 10:22, Thomas Rast wrote:
>>> misleadingly suggesting that the change occurred in the call_me()
>>> function, rather than in main()
>>
>> I think that's intentional, and matches what 'diff -p' does...
> 
> Okay...I tested "diff -p" and can't argue (much) with historical
> adherence.  It just makes it hard for me to gather some stats on the
> functions that changed, and requires that I look in more than one
> place (both in the header, and in the leading context) rather than
> having a single authoritative place to grep.

If it's only for stats, why not just remove the context with -U0?

-- Hannes

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: misleading diff-hunk header
  2012-08-21 15:22 ` Thomas Rast
@ 2012-08-21 15:42   ` Tim Chase
  2012-08-21 17:39     ` Johannes Sixt
  2012-08-21 17:52     ` Junio C Hamano
  0 siblings, 2 replies; 15+ messages in thread
From: Tim Chase @ 2012-08-21 15:42 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git

On 08/21/12 10:22, Thomas Rast wrote:
> Tim Chase <git@tim.thechases.com> writes:
> 
>> diff.{type}.xfuncname seems to start searching backwards in
>> from the beginning of the hunk, not the first differing line.
> [...]
>>   @@ -4,4 +4,5 @@ int call_me(int maybe)
>>
>>    int main()
>>    {
>>   +  return 0;
>>    }
>>
>> misleadingly suggesting that the change occurred in the call_me()
>> function, rather than in main()
> 
> I think that's intentional, and matches what 'diff -p' does.  It gives
> you the context before the hunk.  After all, if a new function starts in
> the leading context lines, you can see that in the usual diff data.

Okay...I tested "diff -p" and can't argue (much) with historical
adherence.  It just makes it hard for me to gather some stats on the
functions that changed, and requires that I look in more than one
place (both in the header, and in the leading context) rather than
having a single authoritative place to grep.

Then again, "diff -p" only seems to support C functions, while git
supports bibtex, cpp, html, java, objc, pascal, php, python, ruby,
and tex out-of-the-box, with the option to build your own
function-finder, so pure adherence to history gets a little muddied.

Thanks for your thoughts,

-tkc

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: misleading diff-hunk header
  2012-08-21 12:57 Tim Chase
@ 2012-08-21 15:22 ` Thomas Rast
  2012-08-21 15:42   ` Tim Chase
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Rast @ 2012-08-21 15:22 UTC (permalink / raw)
  To: Tim Chase; +Cc: git

Tim Chase <git@tim.thechases.com> writes:

> diff.{type}.xfuncname seems to start searching backwards in
> from the beginning of the hunk, not the first differing line.
[...]
>   @@ -4,4 +4,5 @@ int call_me(int maybe)
>
>    int main()
>    {
>   +  return 0;
>    }
>
> misleadingly suggesting that the change occurred in the call_me()
> function, rather than in main()

I think that's intentional, and matches what 'diff -p' does.  It gives
you the context before the hunk.  After all, if a new function starts in
the leading context lines, you can see that in the usual diff data.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* misleading diff-hunk header
@ 2012-08-21 12:57 Tim Chase
  2012-08-21 15:22 ` Thomas Rast
  0 siblings, 1 reply; 15+ messages in thread
From: Tim Chase @ 2012-08-21 12:57 UTC (permalink / raw)
  To: git

[posted originally to git-users@ but advised this would be a better
forum]

diff.{type}.xfuncname seems to start searching backwards in
from the beginning of the hunk, not the first differing line.

To reproduce:

  $ mkdir tmp
  $ cd tmp
  $ git init
  $ cat > foo.c <<EOF
  int call_me(int maybe)
  {
  }

  int main()
  {
  }
  EOF
  $ git add foo.c
  $ git commit -m "Initial checkin"
  $ ed foo.c
  # main() should return 0
  $i
    return 0;
  .
  wq
  $ git diff

The diff returns a header line of

  @@ -4,4 +4,5 @@ int call_me(int maybe)

   int main()
   {
  +  return 0;
   }

misleadingly suggesting that the change occurred in the call_me()
function, rather than in main()

I'm running 1.7.2.5 from Debian Stable if that makes a difference.

Is this expected/proper behavior?

-tkc

FWIW, I stumbled across this tinkering with
diff.{typename}.xfuncname detailed in gitattributes(5)

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-08-26 18:53 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-21 13:21 misleading diff-hunk header Tim Chase
  -- strict thread matches above, loose matches on Subject: below --
2012-08-21 12:57 Tim Chase
2012-08-21 15:22 ` Thomas Rast
2012-08-21 15:42   ` Tim Chase
2012-08-21 17:39     ` Johannes Sixt
2012-08-21 22:29       ` Junio C Hamano
2012-08-21 17:52     ` Junio C Hamano
2012-08-24 14:29       ` Jeff King
2012-08-24 15:05         ` Tim Chase
2012-08-24 16:44         ` Jeff King
2012-08-25  0:41           ` Tim Chase
2012-08-25  4:29             ` Junio C Hamano
2012-08-25 12:56               ` Tim Chase
2012-08-26 10:43                 ` Stefano Lattarini
2012-08-26 18:53                   ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.