* blame -M vs. log -p|grep -c ^+ weirdness
From: Thomas Rast @ 2009-08-11 10:16 UTC
  To: git; +Cc: Björn Steinbrink

Hi all

I think I'm fundamentally misunderstanding something about the blame
code...  The other day I wanted to see how much our local fork of
DOMjudge diverged from their upstream.  You can grab the entire
history at

  git://csa.inf.ethz.ch/domjudge-public.git

if you want to try the commands I ran.

As a first statistic I looked at how many lines are blamed to our
local team (Christoph, Florian and me) by running

  git ls-files | while read f; do git blame -M -- "$f"; done |
  perl -pe 's/^\^?[a-f0-9]* (?:[^(]* )?\(([^2]*?) *20.*/$1/' |
  sort | uniq -c | sort -n

This shows that over 8000 lines are attributed to the three of us:

      1 domjudge
      2 rob
    113 Stijn van Drongelen
    126 Jeroen Schot
    149 neus
    866 Peter van de Werken
   1245 Thomas Rast
   1752 Christoph Krautz
   5350 Florian Jug
  10293 Thijs Kinkhorst
  20397 Jaap Eldering
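
A variant of the per-author count that avoids the perl regex could look
like the sketch below.  It assumes a git version that supports
`git blame --line-porcelain`, which may be newer than the one used for
the numbers above.  The helper name count_authors is made up for
illustration.

```shell
# Count blamed lines per author from `git blame --line-porcelain`
# output read on stdin.  In that format every blamed line carries an
# "author NAME" header; file content lines start with a TAB, so they
# can never match the pattern below.
count_authors() {
  sed -n 's/^author //p' | sort | uniq -c | sort -n
}

# Assumed usage, run inside the repository:
#   git ls-files | while IFS= read -r f; do
#     git blame -M --line-porcelain -- "$f"
#   done | count_authors
```

Matching on the "author " header sidesteps author names or dates that
would confuse a regex over the human-readable blame output.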

However, sanity checking this against the diffs of the single commits
shows quite a different number:

  git log --no-merges -p upstream/2.2.. | grep '^+' | grep -v -c '^+++'

gives only 4943 '+' lines, and you can easily verify with

  git shortlog -sn upstream/2.2..

that indeed all commits in that range are ours.  So why does blame
think more lines are ours than we even added *in total*?
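
As a cross-check on the grep count, the added lines can also be summed
from `git log --numstat`, which reports added and deleted line counts
per file.  This is only a sketch; the helper name sum_added is made up
for illustration.  Note that --numstat prints "-" instead of a number
for files git treats as binary, so those contribute nothing here.

```shell
# Sum the "added" column of `git log --numstat` output read on stdin.
# Each data line is ADDED<TAB>DELETED<TAB>PATH; lines whose first
# field is not a plain number (blank lines, binary files) are skipped.
sum_added() {
  awk -F'\t' 'NF >= 3 && $1 ~ /^[0-9]+$/ { total += $1 }
              END { print total + 0 }'
}

# Assumed usage, run inside the repository:
#   git log --no-merges --numstat --pretty=format: upstream/2.2.. | sum_added
```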

Björn Steinbrink suggested on IRC that I use -M5 -C5 -C5 -C5, which
indeed reduces it to

      1 domjudge
      2 rob
    115 Stijn van Drongelen
    116 Jeroen Schot
    149 neus
    390 Florian Jug
    930 Peter van de Werken
   1209 Thomas Rast
   1612 Christoph Krautz
  11750 Thijs Kinkhorst
  24020 Jaap Eldering

Note especially the huge drop in Florian's numbers.  What's going on
here?

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


* Re: blame -M vs. log -p|grep -c ^+ weirdness
From: Thomas Rast @ 2009-08-11 11:56 UTC
  To: git; +Cc: Björn Steinbrink

Thomas Rast wrote:
>   git://csa.inf.ethz.ch/domjudge-public.git

Turns out this is locked down from the outside as of this writing, so
I mirrored it at

  git://thomasrast.ch/domjudge.git

>   git ls-files | while read f; do git blame -M -- "$f"; done |
[...]
>   git log --no-merges -p upstream/2.2.. | grep '^+' | grep -v -c '^+++'
[...]
> Björn Steinbrink suggested on IRC that I use -M5 -C5 -C5 -C5, which
> indeed reduces it to

As Björn kindly pointed out once he had access to the repo, blame -M
works on binary files too, while log -p shows no '+' lines for them.
Also, 'bin/sh-static' and 'doc/logos/DOMjudgelogo.pdf' exist under
other names in the upstream repository, so with sufficiently many -C
options blame attributes them to upstream instead.  That resolves the
mystery.
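
One way to spot such files up front is to look for entries that
`git diff --numstat` marks as binary, i.e. with "-" in both count
columns.  The sketch below assumes that output format; the helper name
binary_paths is made up for illustration.

```shell
# Print the paths that `git diff --numstat` (read on stdin) reports as
# binary.  Binary entries have "-" in both the added and the deleted
# column instead of line counts.
binary_paths() {
  awk -F'\t' '$1 == "-" && $2 == "-" { print $3 }'
}

# Assumed usage, run inside the repository:
#   git diff --numstat upstream/2.2 HEAD | binary_paths
```

This compares the two trees directly instead of walking each commit,
so it only indicates which changed files are binary, not who touched
them.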

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
