All of lore.kernel.org
 help / color / mirror / Atom feed
* Possible git blame bug?
@ 2017-03-13 20:11 Domagoj Stolfa
  2017-03-13 20:38 ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Domagoj Stolfa @ 2017-03-13 20:11 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1590 bytes --]

Hello,

yesterday I came across sort of a weird behaviour with git-blame. It would
appear when one queries the git blame on a specific file, such as:

$ git blame <file> --since=<year>

it will blame the entire file on some commit of that year, regardless of the
fact whether the commit has actually touched that file or not.

This seems consistent across all the tested systems: FreeBSD, macOS, Ubuntu and
Fedora.

An example of the blame can be seen:

$ git blame tcp_output.c
cd0bc69e2fdd (wollman  1995-09-22 20:05:58 +0000   29)  *       @(#)tcp_output.c        8.4 (Berkeley) 5/24/95

compared to:

$ git blame tcp_output.c --since=2017
^e19f2a27ed8 (Domagoj Stolfa 2017-03-12 20:43:01 +0100   29)  * @(#)tcp_output.c        8.4 (Berkeley) 5/24/95

$ git blame tcp_output.c --since=2016
^e4bdb83e76c (jceel    2016-03-13 19:50:17 +0000   29)  *       @(#)tcp_output.c        8.4 (Berkeley) 5/24/95

$ git blame tcp_output.c --since=2015
^d749a6e6c70 (pfg      2015-03-13 18:42:43 +0000   29)  *       @(#)tcp_output.c        8.4 (Berkeley) 5/24/95

Of course, specifiying

$ git blame --since=1.1.2017

gives correct results, as expected.

The question is whether this is a bug or not, as --since=<year> might not be a
valid filter. However, this might surprise a lot of users and mislead
development. I would personally like to see this behaviour changed to either a
blank report, a report of that year(making it a valid filter), but certainly not
blaming it on commits that have never touched that file.

-- 
Best regards,
Domagoj Stolfa

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Possible git blame bug?
  2017-03-13 20:11 Possible git blame bug? Domagoj Stolfa
@ 2017-03-13 20:38 ` Junio C Hamano
       [not found]   ` <20170313204401.GB80633@workstation>
  2017-03-13 21:29   ` Junio C Hamano
  0 siblings, 2 replies; 10+ messages in thread
From: Junio C Hamano @ 2017-03-13 20:38 UTC (permalink / raw)
  To: Domagoj Stolfa; +Cc: git

Domagoj Stolfa <domagoj.stolfa@gmail.com> writes:

> The question is whether this is a bug or not, as --since=<year> might not be a
> valid filter.

I do not think blame ever was designed to work with --since, so that
is indeed the case.  

Making it work with --since=<date> might be a worthy addition.
Patches welcome.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Possible git blame bug?
       [not found]   ` <20170313204401.GB80633@workstation>
@ 2017-03-13 20:46     ` Domagoj Stolfa
  0 siblings, 0 replies; 10+ messages in thread
From: Domagoj Stolfa @ 2017-03-13 20:46 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 895 bytes --]

Hello,
 
> > The question is whether this is a bug or not, as --since=<year> might not be a
> > valid filter.
> 
> I do not think blame ever was designed to work with --since, so that
> is indeed the case.  
> 
> Making it work with --since=<date> might be a worthy addition.
> Patches welcome.
 
Thanks for the information. I wasn't aware that it wasn't designed with that in
mind, though it would indicate to me that --since seems to work with git-blame
according to [1], but expects a date, as I have usually been using it. It also
seems to produce correct results on FreeBSD 12.0-CURRENT with git 2.11.1, except
when given a filter such as --since=<year>, in which case perhaps nothing should
be displayed?
 
Could you please clarify which bits wouldn't work with --since in git-blame?
 
[1] https://www.git-scm.com/docs/git-blame
 
-- 
Best regards,
Domagoj Stolfa

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Possible git blame bug?
  2017-03-13 20:38 ` Junio C Hamano
       [not found]   ` <20170313204401.GB80633@workstation>
@ 2017-03-13 21:29   ` Junio C Hamano
  2017-03-13 21:44     ` Domagoj Stolfa
  1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2017-03-13 21:29 UTC (permalink / raw)
  To: Domagoj Stolfa; +Cc: Git Mailing List

On Mon, Mar 13, 2017 at 1:38 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Domagoj Stolfa <domagoj.stolfa@gmail.com> writes:
>
>> The question is whether this is a bug or not, as --since=<year> might not be a
>> valid filter.
>
> I do not think blame ever was designed to work with --since, so that
> is indeed the case.

Actually, I do see that we had a cut-off based on rev->max_age since we
introduced it at cee7f245 ("git-pickaxe: blame rewritten.", 2006-10-19).

I do not know offhand if --since=2000 _means_ --since=2000-01-01 or something
completely different, though.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Possible git blame bug?
  2017-03-13 21:29   ` Junio C Hamano
@ 2017-03-13 21:44     ` Domagoj Stolfa
  2017-03-13 22:19       ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Domagoj Stolfa @ 2017-03-13 21:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 1790 bytes --]

Hello,

> >> The question is whether this is a bug or not, as --since=<year> might not be a
> >> valid filter.
> >
> > I do not think blame ever was designed to work with --since, so that
> > is indeed the case.
> 
> Actually, I do see that we had a cut-off based on rev->max_age since we
> introduced it at cee7f245 ("git-pickaxe: blame rewritten.", 2006-10-19).
> 
> I do not know offhand if --since=2000 _means_ --since=2000-01-01 or something
> completely different, though.

It seems to indicate something entirely different. The problem with it is that
it mentions commits that haven't even touched the file though. Output with
commit hashes that have touched that file would be sensible, albeit wrong in the
sense that the user did not want to see that behaviour.

For example, saying:

$ git blame time.h --since=2017
^e19f2a27ed8 (Domagoj Stolfa 2017-03-12 20:43:01 +0100  33) #ifndef _SYS_TIME_H_

$ git blame time.h --since=2016
^21613a57af9 (bz  2016-03-13 21:26:18 +0000  33) #ifndef _SYS_TIME_H_

$ git blame time.h --since=2015
^48507f436f0 (mav 2015-03-13 21:01:25 +0000  33) #ifndef _SYS_TIME_H_

and so on, with different hashes.

Looking at these commits:

(1) https://github.com/dstolfa/freebsd/commit/e19f2a27ed82f616d47dd4e0dc75722139e72957
(2) https://github.com/dstolfa/freebsd/commit/21613a57af9500acca9b3528958312dd3ae12bb4
(3) https://github.com/dstolfa/freebsd/commit/48507f436f07a9515c6d7108509a50dd4fd403b2

neither of them seems to touch time.h, yet git blame reports them to do so.
Interestingly enough, it seems to be taking the commit that is the closest to
the current date in that year, and blaming it on that one, regardless of what it
actually did and the file specified.

-- 
Best regards,
Domagoj Stolfa

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Possible git blame bug?
  2017-03-13 21:44     ` Domagoj Stolfa
@ 2017-03-13 22:19       ` Junio C Hamano
  2017-03-13 22:46         ` Junio C Hamano
  2017-03-13 23:08         ` Domagoj Stolfa
  0 siblings, 2 replies; 10+ messages in thread
From: Junio C Hamano @ 2017-03-13 22:19 UTC (permalink / raw)
  To: Domagoj Stolfa; +Cc: Git Mailing List

Domagoj Stolfa <domagoj.stolfa@gmail.com> writes:

> For example, saying:
>
> $ git blame time.h --since=2017
> ^e19f2a27ed8 (Domagoj Stolfa 2017-03-12 20:43:01 +0100  33) #ifndef _SYS_TIME_H_
>
> $ git blame time.h --since=2016
> ^21613a57af9 (bz  2016-03-13 21:26:18 +0000  33) #ifndef _SYS_TIME_H_
>
> $ git blame time.h --since=2015
> ^48507f436f0 (mav 2015-03-13 21:01:25 +0000  33) #ifndef _SYS_TIME_H_
>
> and so on, with different hashes.

The output lines "^deadbeef" does *NOT* mean that commit deadbeef
changed the revision.  It just is telling you that the hisory was
dug down to that revision and it was found that since that revision
there is no change (and you told the command not to bother looking
beyond that time range, so we do not know what happened before that
time).

It is understandable, when your history has a lot of merges, the
history traversal may stop at commits on different branches.

Imagine a case where the line in question never changed throughout
the history:

          o---o---B
         /         \
    O---o---o---A---C---o---o

Imagine A is from 2015, B is from 2016 and C is from 2017.  C's
first parent, i.e. C^1, is A and C^2 is B.

If you ask the command to stop digging when you hit a commit on or
before 2017-03-13 (03-13 is because today's date is appended to your
2017), your traversal will stop at C and you get a line that begins
with ^C.

If you ask it to stop at 2016, A won't be even looked at because it
is older.  The command will keep digging from C to find B.  If B's
parent is also newer than the cutoff, but its parent is older, then
the line will be shown with ^ and commit object name of B's parent.

If you ask it to stop at 2015, the command will first consider A
(C's earlier parent) and pass blame to the lines common between
these two commits.  In this illustration, we are pretending that the
file did not change throughout the hsitory, so blame for all lines
are passed to A and we don't even look at B.  Then we keep digging
through A to find the culprit, or hit a commit older than the
specified cut-off time.  The line will be shown with ^A or perhaps
its ancestor.

So it is entirely sane if you saw three boundary commits named with
three different time ranges.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Possible git blame bug?
  2017-03-13 22:19       ` Junio C Hamano
@ 2017-03-13 22:46         ` Junio C Hamano
  2017-03-13 23:08         ` Domagoj Stolfa
  1 sibling, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2017-03-13 22:46 UTC (permalink / raw)
  To: Domagoj Stolfa; +Cc: Git Mailing List

Junio C Hamano <gitster@pobox.com> writes:

> Domagoj Stolfa <domagoj.stolfa@gmail.com> writes:
>
>> For example, saying:
>>
>> $ git blame time.h --since=2017
>> ^e19f2a27ed8 (Domagoj Stolfa 2017-03-12 20:43:01 +0100  33) #ifndef _SYS_TIME_H_
>>
>> $ git blame time.h --since=2016
>> ^21613a57af9 (bz  2016-03-13 21:26:18 +0000  33) #ifndef _SYS_TIME_H_
>>
>> $ git blame time.h --since=2015
>> ^48507f436f0 (mav 2015-03-13 21:01:25 +0000  33) #ifndef _SYS_TIME_H_
>>
>> and so on, with different hashes.
>
> The output lines "^deadbeef" does *NOT* mean that commit deadbeef
> changed the revision.  It just is telling you that the hisory was
> dug down to that revision and it was found that since that revision
> there is no change (and you told the command not to bother looking
> beyond that time range, so we do not know what happened before that
> time).
> ...
> So it is entirely sane if you saw three boundary commits named with
> three different time ranges.

Side note.  If the displaying of the boundary commit object names
are distracting, the user can always use the -b option to blank them
out.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Possible git blame bug?
  2017-03-13 22:19       ` Junio C Hamano
  2017-03-13 22:46         ` Junio C Hamano
@ 2017-03-13 23:08         ` Domagoj Stolfa
  2017-03-13 23:15           ` Junio C Hamano
  1 sibling, 1 reply; 10+ messages in thread
From: Domagoj Stolfa @ 2017-03-13 23:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 2548 bytes --]

Hello,

> > For example, saying:
> >
> > $ git blame time.h --since=2017
> > ^e19f2a27ed8 (Domagoj Stolfa 2017-03-12 20:43:01 +0100  33) #ifndef _SYS_TIME_H_
> >
> > $ git blame time.h --since=2016
> > ^21613a57af9 (bz  2016-03-13 21:26:18 +0000  33) #ifndef _SYS_TIME_H_
> >
> > $ git blame time.h --since=2015
> > ^48507f436f0 (mav 2015-03-13 21:01:25 +0000  33) #ifndef _SYS_TIME_H_
> >
> > and so on, with different hashes.
> 
> The output lines "^deadbeef" does *NOT* mean that commit deadbeef
> changed the revision.  It just is telling you that the hisory was
> dug down to that revision and it was found that since that revision
> there is no change (and you told the command not to bother looking
> beyond that time range, so we do not know what happened before that
> time).
> 
> It is understandable, when your history has a lot of merges, the
> history traversal may stop at commits on different branches.
> 
> Imagine a case where the line in question never changed throughout
> the history:
> 
>           o---o---B
>          /         \
>     O---o---o---A---C---o---o
> 
> Imagine A is from 2015, B is from 2016 and C is from 2017.  C's
> first parent, i.e. C^1, is A and C^2 is B.
> 
> If you ask the command to stop digging when you hit a commit on or
> before 2017-03-13 (03-13 is because today's date is appended to your
> 2017), your traversal will stop at C and you get a line that begins
> with ^C.
> 
> If you ask it to stop at 2016, A won't be even looked at because it
> is older.  The command will keep digging from C to find B.  If B's
> parent is also newer than the cutoff, but its parent is older, then
> the line will be shown with ^ and commit object name of B's parent.
> 
> If you ask it to stop at 2015, the command will first consider A
> (C's earlier parent) and pass blame to the lines common between
> these two commits.  In this illustration, we are pretending that the
> file did not change throughout the hsitory, so blame for all lines
> are passed to A and we don't even look at B.  Then we keep digging
> through A to find the culprit, or hit a commit older than the
> specified cut-off time.  The line will be shown with ^A or perhaps
> its ancestor.
> 
> So it is entirely sane if you saw three boundary commits named with
> three different time ranges.

Thanks for clearing this up. Is this documented somewhere, so that if it happens
again I can point people to the docs that explain this behaviour?

-- 
Best regards,
Domagoj Stolfa

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Possible git blame bug?
  2017-03-13 23:08         ` Domagoj Stolfa
@ 2017-03-13 23:15           ` Junio C Hamano
  2017-03-13 23:19             ` Domagoj Stolfa
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2017-03-13 23:15 UTC (permalink / raw)
  To: Domagoj Stolfa; +Cc: Git Mailing List

Domagoj Stolfa <domagoj.stolfa@gmail.com> writes:

> Thanks for clearing this up. Is this documented somewhere, so that if it happens
> again I can point people to the docs that explain this behaviour?

Is this from "git blame --help" sufficient?

    When you are not interested in changes older than version
    v2.6.18, or changes older than 3 weeks, you can use revision
    range specifiers  similar to 'git rev-list':

            git blame v2.6.18.. -- foo
            git blame --since=3.weeks -- foo

    When revision range specifiers are used to limit the annotation,
    lines that have not changed since the range boundary (either the
    commit v2.6.18 or the most recent commit that is more than 3
    weeks old in the above example) are blamed for that range
    boundary commit.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Possible git blame bug?
  2017-03-13 23:15           ` Junio C Hamano
@ 2017-03-13 23:19             ` Domagoj Stolfa
  0 siblings, 0 replies; 10+ messages in thread
From: Domagoj Stolfa @ 2017-03-13 23:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 977 bytes --]

Hello,

> > Thanks for clearing this up. Is this documented somewhere, so that if it happens
> > again I can point people to the docs that explain this behaviour?
> 
> Is this from "git blame --help" sufficient?
> 
>     When you are not interested in changes older than version
>     v2.6.18, or changes older than 3 weeks, you can use revision
>     range specifiers  similar to 'git rev-list':
> 
>             git blame v2.6.18.. -- foo
>             git blame --since=3.weeks -- foo
> 
>     When revision range specifiers are used to limit the annotation,
>     lines that have not changed since the range boundary (either the
>     commit v2.6.18 or the most recent commit that is more than 3
>     weeks old in the above example) are blamed for that range
>     boundary commit.

re-reading it post your explanation, it seems to make perfect sense. Thanks
again for the detailed explanation of the behaviour!

-- 
Best regards,
Domagoj Stolfa

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-03-13 23:19 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-13 20:11 Possible git blame bug? Domagoj Stolfa
2017-03-13 20:38 ` Junio C Hamano
     [not found]   ` <20170313204401.GB80633@workstation>
2017-03-13 20:46     ` Domagoj Stolfa
2017-03-13 21:29   ` Junio C Hamano
2017-03-13 21:44     ` Domagoj Stolfa
2017-03-13 22:19       ` Junio C Hamano
2017-03-13 22:46         ` Junio C Hamano
2017-03-13 23:08         ` Domagoj Stolfa
2017-03-13 23:15           ` Junio C Hamano
2017-03-13 23:19             ` Domagoj Stolfa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.