* hot to get file sizes in git log output
@ 2017-10-20 18:12 David Lang
  2017-10-20 20:44 ` Jacob Keller
  0 siblings, 1 reply; 8+ messages in thread
From: David Lang @ 2017-10-20 18:12 UTC
  To: git

I'm needing to scan through git history looking for the file sizes (looking for 
when a particular file shrunk drastically)

I'm not seeing an option in git log or git whatchanged that gives me the file 
size, am I overlooking something?

David Lang

* Re: hot to get file sizes in git log output
  2017-10-20 18:12 hot to get file sizes in git log output David Lang
@ 2017-10-20 20:44 ` Jacob Keller
  2017-10-20 21:43   ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Jacob Keller @ 2017-10-20 20:44 UTC
  To: David Lang; +Cc: Git mailing list

On Fri, Oct 20, 2017 at 11:12 AM, David Lang <david@lang.hm> wrote:
> I'm needing to scan through git history looking for the file sizes (looking
> for when a particular file shrunk drastically)
>
> I'm not seeing an option in git log or git whatchanged that gives me the
> file size, am I overlooking something?
>
> David Lang

I'm not exactly sure what you mean by size, but if you want to show
how many lines were added and removed by a given commit for each file,
you can use the "--stat" option to produce a diffstat. The "size" of
the files in each commit isn't very meaningful to the commit itself,
but a stat of how much was removed might be more accurate to what
you're looking for.
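
A minimal sketch of that suggestion, restricted to a single file. The
helper name stat_history is made up for illustration, and the --format
and --date options are only there to keep the output compact. Note that
a diffstat counts added and removed lines, not bytes.

```shell
# Hypothetical helper: print a one-line header plus a diffstat for every
# commit that touched the given file.
stat_history() {
    file=$1
    git log --stat --format='%h %ad %s' --date=short -- "$file"
}

# Usage, from inside the repository:
#   stat_history 'Default/Current Session'
```

A commit that removes most of the file should stand out as a long run of
"-" characters in the diffstat.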

Thanks,
Jake

* Re: hot to get file sizes in git log output
  2017-10-20 20:44 ` Jacob Keller
@ 2017-10-20 21:43   ` Jeff King
  2017-10-20 21:50     ` Eric Sunshine
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2017-10-20 21:43 UTC
  To: Jacob Keller; +Cc: David Lang, Git mailing list

On Fri, Oct 20, 2017 at 01:44:36PM -0700, Jacob Keller wrote:

> On Fri, Oct 20, 2017 at 11:12 AM, David Lang <david@lang.hm> wrote:
> > I'm needing to scan through git history looking for the file sizes (looking
> > for when a particular file shrunk drastically)
> >
> > I'm not seeing an option in git log or git whatchanged that gives me the
> > file size, am I overlooking something?
> >
> > David Lang
> 
> I'm not exactly sure what you mean by size, but if you want to show
> how many lines were added and removed by a given commit for each file,
> you can use the "--stat" option to produce a diffstat. The "size" of
> the files in each commit isn't very meaningful to the commit itself,
> but a stat of how much was removed might be more accurate to what
> you're looking for.

That's a good suggestion, and hopefully could help David answer his
original question.

I took the request to mean "walk through history, and for each file that
a commit touches, show its size". Which is a bit harder to do, and I
think you need to script a little:

  git rev-list HEAD |
  git diff-tree --stdin -r |
  perl -lne '
    # raw diff line, capture filename and post-image sha1
    if (/^:\S+ \S+ \S+ (\S+) \S+\t(.*)/) {
      print "$1 $commit:$2"
    }
    # otherwise it is a commit sha1
    else {
      $commit = $_;
    }
  ' |
  git cat-file --batch-check='%(objectsize) %(rest)'

That should show the size of each file along with the "commit:path" in
which it was introduced.
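
When only one file matters, the same idea can be reduced to a plain
loop. This is a minimal sketch, assuming it is run from the top level of
the repository (the "commit:path" syntax resolves paths from there). The
helper name file_size_history is made up for illustration.

```shell
# Hypothetical helper: for every commit that touched the given file, print
# "<size in bytes> <abbreviated commit> <commit date>", newest first.
file_size_history() {
    file=$1
    git rev-list HEAD -- "$file" |
    while read -r commit; do
        # cat-file -s fails when the file does not exist in this commit
        # (for example, the commit deleted it); skip those commits.
        size=$(git cat-file -s "$commit:$file" 2>/dev/null) || continue
        printf '%s %s\n' "$size" \
            "$(git log -1 --format='%h %cd' --date=iso "$commit")"
    done
}

# Usage, from the top level of the repository:
#   file_size_history 'Default/Current Session'
```

The sizes are those of the blobs as stored in the repository, so a sharp
drop between two adjacent lines points at the commit of interest.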

-Peff

* Re: hot to get file sizes in git log output
  2017-10-20 21:43   ` Jeff King
@ 2017-10-20 21:50     ` Eric Sunshine
  2017-10-20 21:54       ` Jacob Keller
  2017-10-21  2:38       ` David Lang
  0 siblings, 2 replies; 8+ messages in thread
From: Eric Sunshine @ 2017-10-20 21:50 UTC
  To: Jeff King; +Cc: Jacob Keller, David Lang, Git mailing list

On Fri, Oct 20, 2017 at 5:43 PM, Jeff King <peff@peff.net> wrote:
> On Fri, Oct 20, 2017 at 01:44:36PM -0700, Jacob Keller wrote:
>> On Fri, Oct 20, 2017 at 11:12 AM, David Lang <david@lang.hm> wrote:
>> > I'm needing to scan through git history looking for the file sizes (looking
>> > for when a particular file shrunk drastically)
>> >
>> > I'm not seeing an option in git log or git whatchanged that gives me the
>> > file size, am I overlooking something?
>>
>> I'm not exactly sure what you mean by size, but if you want to show
>> how many lines were added and removed by a given commit for each file,
>> you can use the "--stat" option to produce a diffstat. The "size" of
>> the files in each commit isn't very meaningful to the commit itself,
>> but a stat of how much was removed might be more accurate to what
>> you're looking for.
>
> That's a good suggestion, and hopefully could help David answer his
> original question.
>
> I took the request to mean "walk through history, and for each file that
> a commit touches, show its size". Which is a bit harder to do, and I
> think you need to script a little:

David's mention of "a particular file", suggests to me that something
"bad" happened to one file, and he wants to know in which commit that
"badness" happened. If so, then it sounds like a job for git-bisect.

* Re: hot to get file sizes in git log output
  2017-10-20 21:50     ` Eric Sunshine
@ 2017-10-20 21:54       ` Jacob Keller
  2017-10-21  2:38       ` David Lang
  1 sibling, 0 replies; 8+ messages in thread
From: Jacob Keller @ 2017-10-20 21:54 UTC
  To: Eric Sunshine; +Cc: Jeff King, David Lang, Git mailing list

On Fri, Oct 20, 2017 at 2:50 PM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Fri, Oct 20, 2017 at 5:43 PM, Jeff King <peff@peff.net> wrote:
>> On Fri, Oct 20, 2017 at 01:44:36PM -0700, Jacob Keller wrote:
>>> On Fri, Oct 20, 2017 at 11:12 AM, David Lang <david@lang.hm> wrote:
>>> > I'm needing to scan through git history looking for the file sizes (looking
>>> > for when a particular file shrunk drastically)
>>> >
>>> > I'm not seeing an option in git log or git whatchanged that gives me the
>>> > file size, am I overlooking something?
>>>
>>> I'm not exactly sure what you mean by size, but if you want to show
>>> how many lines were added and removed by a given commit for each file,
>>> you can use the "--stat" option to produce a diffstat. The "size" of
>>> the files in each commit isn't very meaningful to the commit itself,
>>> but a stat of how much was removed might be more accurate to what
>>> you're looking for.
>>
>> That's a good suggestion, and hopefully could help David answer his
>> original question.
>>
>> I took the request to mean "walk through history, and for each file that
>> a commit touches, show its size". Which is a bit harder to do, and I
>> think you need to script a little:
>
> David's mention of "a particular file", suggests to me that something
> "bad" happened to one file, and he wants to know in which commit that
> "badness" happened. If so, then it sounds like a job for git-bisect.

Yeah, if you have a simple script that can tell when the file is
"bad", you could pass it to git bisect run and find the right answer
quickly.
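
A minimal sketch of such a run, assuming "bad" simply means "smaller
than some byte threshold" and that a known-good commit is available. The
helper name find_shrink and its three arguments are made up for
illustration. It uses --no-checkout so the working tree is left alone,
which means the commit under test is read through BISECT_HEAD rather
than HEAD.

```shell
# Hypothetical helper: print the first commit (between <good> and HEAD) in
# which <file> is smaller than <min_bytes>.
find_shrink() {
    file=$1
    min_bytes=$2
    good=$3
    # HEAD is taken as the known-bad end of the range.
    git bisect start --no-checkout HEAD "$good" >/dev/null || return 1
    if git bisect run sh -c '
        # exit 125 asks bisect to skip commits where the file is missing
        size=$(git cat-file -s "BISECT_HEAD:$1" 2>/dev/null) || exit 125
        # exit 0 marks the commit good, exit 1 marks it bad
        [ "$size" -ge "$2" ]
    ' sh "$file" "$min_bytes" >/dev/null
    then
        # after a successful run this ref points at the first bad commit
        git rev-parse refs/bisect/bad
    fi
    git bisect reset >/dev/null 2>&1
}

# Usage, from the top level of the repository:
#   find_shrink 'Default/Current Session' 100000 <known-good-commit>
```

The progress output of git bisect run is discarded here to keep the
result to a single line; drop the redirection to watch each step.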

Thanks,
Jake

* Re: hot to get file sizes in git log output
  2017-10-20 21:50     ` Eric Sunshine
  2017-10-20 21:54       ` Jacob Keller
@ 2017-10-21  2:38       ` David Lang
  2017-10-21  2:43         ` David Lang
  2017-10-21  3:25         ` Jeff King
  1 sibling, 2 replies; 8+ messages in thread
From: David Lang @ 2017-10-21  2:38 UTC
  To: Eric Sunshine; +Cc: Jeff King, Jacob Keller, David Lang, Git mailing list

On Fri, 20 Oct 2017, Eric Sunshine wrote:

>>> I'm not exactly sure what you mean by size, but if you want to show
>>> how many lines were added and removed by a given commit for each file,
>>> you can use the "--stat" option to produce a diffstat. The "size" of
>>> the files in each commit isn't very meaningful to the commit itself,
>>> but a stat of how much was removed might be more accurate to what
>>> you're looking for.
>>
>> That's a good suggestion, and hopefully could help David answer his
>> original question.
>>
>> I took the request to mean "walk through history, and for each file that
>> a commit touches, show its size". Which is a bit harder to do, and I
>> think you need to script a little:
>
> David's mention of "a particular file", suggests to me that something
> "bad" happened to one file, and he wants to know in which commit that
> "badness" happened. If so, then it sounds like a job for git-bisect.

In this case, I have git store a copy of the state file for chromium (and do a
similar thing for firefox), so that if something bad happens and it crashes and
loses all 200-400 tabs in a way that its recovery doesn't work, I can go back
to a prior version.

This is done by having the following crontab entries, along with smudge filters 
that change the one-line json to pretty printed json before the commit.

0 * * * * dlang cd /home/dlang/.config/chromium/Default; git add *Session *Tabs Bookmarks History ; git commit -mupdate > /dev/null 2>&1

0 0 3 * * dlang cd /home/dlang/.config/chromium/Default; git gc --aggressive > /dev/null 2>&1

0 * * * * dlang cd /home/dlang/.mozilla/firefox/bux6mwl1.default/sessionstore-backups; git add *.js ; git commit -mupdate > /dev/null 2>&1

0 0 3 * * dlang cd /home/dlang/.mozilla/firefox/bux6mwl1.default/sessionstore-backups; git gc --aggressive > /dev/null 2>&1

This has saved me many times in the past. But this time I didn't recognize when 
the problem happened because instead of a crash, it just closed all the tabs 
except the one that was open. Once I realized all my other tabs were gone, I 
didn't have time to mess with it for a few days. So the problem could have 
happened anytime in the last week or two.

I'm sure that when this happened, the files shrank drastically (going from
several hundred tabs to a dozen or so will be very obvious).

But I don't have any specific line I can look at, the lines that are there 
change pretty regularly, and/or would not have changed at the transition.

git whatchanged shows commits like:

commit fb7e54c12ddc7c87c4862806d583f5c6abf3e731
Author: David Lang <david@lang.hm>
Date:   Fri Oct 20 11:00:01 2017 -0700

     update

:100644 100644 1a842ca... 290e9dd... M  Default/Bookmarks
:100644 100644 1cd745c... 388a455... M  Default/Current Session
:100644 100644 51074ad... c4dce40... M  Default/Current Tabs

If there was a way to add file size to this output, it would be perfect for what 
I'm needing.

David Lang

* Re: hot to get file sizes in git log output
  2017-10-21  2:38       ` David Lang
@ 2017-10-21  2:43         ` David Lang
  2017-10-21  3:25         ` Jeff King
  1 sibling, 0 replies; 8+ messages in thread
From: David Lang @ 2017-10-21  2:43 UTC
  To: David Lang; +Cc: Eric Sunshine, Jeff King, Jacob Keller, Git mailing list

On Fri, 20 Oct 2017, David Lang wrote:

> On Fri, 20 Oct 2017, Eric Sunshine wrote:
>
>>>> I'm not exactly sure what you mean by size, but if you want to show
>>>> how many lines were added and removed by a given commit for each file,
>>>> you can use the "--stat" option to produce a diffstat. The "size" of
>>>> the files in each commit isn't very meaningful to the commit itself,
>>>> but a stat of how much was removed might be more accurate to what
>>>> you're looking for.
>>> 
>>> That's a good suggestion, and hopefully could help David answer his
>>> original question.
>>> 
>>> I took the request to mean "walk through history, and for each file that
>>> a commit touches, show its size". Which is a bit harder to do, and I
>>> think you need to script a little:
>> 
>> David's mention of "a particular file", suggests to me that something
>> "bad" happened to one file, and he wants to know in which commit that
>> "badness" happened. If so, then it sounds like a job for git-bisect.

Summarizing (and removing the long explanation of why I'm doing this):

For each file (or each file changed in the commit), what is the byte count of
that file at the time of that commit?

git whatchanged currently reports

commit 17be1c1e1f80086e8ddda1706c8c8d6cf80d26b7
Author: David Lang <david@lang.hm>
Date:   Thu Oct 19 22:00:01 2017 -0700

     update

:100644 100644 bb9dcd3... 8635d2b... M  Default/Current Session

commit d3f94d406e0d5c6ee7b6f6bcea019adff438127c
Author: David Lang <david@lang.hm>
Date:   Thu Oct 19 21:00:01 2017 -0700

     update

:100644 100644 88ece53... bb9dcd3... M  Default/Current Session

commit fea290bd235a444bbd4bc4430fa0844501ae2b8c
Author: David Lang <david@lang.hm>
Date:   Thu Oct 19 06:00:01 2017 -0700

     update

:100644 100644 ff04089... 88ece53... M  Default/Current Session

what is the size of the file "Current Session" for each commit?

David Lang

* Re: hot to get file sizes in git log output
  2017-10-21  2:38       ` David Lang
  2017-10-21  2:43         ` David Lang
@ 2017-10-21  3:25         ` Jeff King
  1 sibling, 0 replies; 8+ messages in thread
From: Jeff King @ 2017-10-21  3:25 UTC
  To: David Lang; +Cc: Eric Sunshine, Jacob Keller, Git mailing list

On Fri, Oct 20, 2017 at 07:38:00PM -0700, David Lang wrote:

> git whatchanged shows commits like:
> 
> commit fb7e54c12ddc7c87c4862806d583f5c6abf3e731
> Author: David Lang <david@lang.hm>
> Date:   Fri Oct 20 11:00:01 2017 -0700
> 
>     update
> 
> :100644 100644 1a842ca... 290e9dd... M  Default/Bookmarks
> :100644 100644 1cd745c... 388a455... M  Default/Current Session
> :100644 100644 51074ad... c4dce40... M  Default/Current Tabs
> 
> If there was a way to add file size to this output, it would be perfect for
> what I'm needing.

There isn't. You'll have to process the output to pass the post-image
hashes to cat-file to get the size (i.e., what I showed before, though
of course you could make the output prettier if you wanted to).
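
One way to post-process the sizes for the case at hand, as a minimal
sketch: walk a single file's history oldest-first and report only the
commits in which it shrank to less than half of its previous size. The
helper name report_shrinks and the "less than half" threshold are
assumptions made for illustration, and the function expects to be run
from the top level of the repository.

```shell
# Hypothetical helper: list commits in which <file> lost more than half of
# its bytes compared with the previous version.
report_shrinks() {
    file=$1
    git rev-list --reverse HEAD -- "$file" |
    while read -r commit; do
        # skip commits in which the file does not exist (e.g. deletions)
        size=$(git cat-file -s "$commit:$file" 2>/dev/null) || continue
        printf '%s %s\n' "$size" "$commit"
    done |
    # compare each size with the one before it
    awk 'NR > 1 && $1 * 2 < prev { print $2 ": " prev " -> " $1 " bytes" }
         { prev = $1 }'
}

# Usage, from the top level of the repository:
#   report_shrinks 'Default/Current Session'
```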

-Peff
