git.vger.kernel.org archive mirror
* Diff --stat for files that differ only in whitespace
@ 2020-07-31 13:10 Matt Rogers
  2020-07-31 17:32 ` Junio C Hamano
  2020-07-31 17:41 ` Jeff King
  0 siblings, 2 replies; 7+ messages in thread
From: Matt Rogers @ 2020-07-31 13:10 UTC (permalink / raw)
  To: Git Mailing List

When using a repository with "core.autocrlf=false" I'm trying to run a diff
between two commits that have many files (~1000) changed that differ mostly
in line endings or other whitespace.  When I run
`git diff --stat --ignore-all-space <commit-1> <commit-2>` I'm getting an output
that has many files listed like:

some-file.txt | 0

This is easy enough to parse through when it's a small number of files, but
when there are ~1000 files with only maybe 1500 insertions/deletions, a list
of all 1000 files isn't really useful to me.  If there were a way to sort by
number of insertions/deletions, or to filter out the files that had 0
effective changes, that would solve my problem.


Simple example:

```
mkdir example && cd example
git init
git config --local core.autocrlf false
echo HELLO > file.txt
echo WORLD >> file.txt
git add file.txt
git commit -m "first kind of line endings"

# swap out the line endings, e.g. LF -> CRLF (an editor works too)
printf 'HELLO\r\nWORLD\r\n' > file.txt
git add file.txt
git commit -m "second kind of line endings"

git diff --stat --ignore-all-space HEAD~1 HEAD
```

Thanks,
Matthew Rogers


* Re: Diff --stat for files that differ only in whitespace
  2020-07-31 13:10 Diff --stat for files that differ only in whitespace Matt Rogers
@ 2020-07-31 17:32 ` Junio C Hamano
  2020-07-31 17:41 ` Jeff King
  1 sibling, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2020-07-31 17:32 UTC (permalink / raw)
  To: Matt Rogers; +Cc: Git Mailing List

Matt Rogers <mattr94@gmail.com> writes:

> some-file.txt | 0
>
> This is easy enough to parse through when it's a small number of files, but
> when there are ~1000 files with only maybe 1500 insertions/deletions, a list
> of all 1000 files isn't really useful to me.  If there were a way to sort by
> number of insertions/deletions, or to filter out the files that had 0
> effective changes, that would solve my problem.

Whether you are interested in whitespace or not, if you are
processing the diff output with scripts, --numstat (not --stat)
may be simpler to work with.
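
A minimal sketch of such a script, not taken from the thread: it assumes
--numstat prints "<added><TAB><deleted><TAB><path>" per file, with "-" in
both count columns for binary files, and that whitespace-only files show
up as "0 0" under --ignore-all-space.  The helper name `filter_numstat` is
hypothetical.  It drops the zero-change entries and sorts the rest by total
changed lines, largest first:

```shell
# Keep only files whose added+deleted count is non-zero, then sort by that
# total in descending order.  Binary entries ("-" counts) are skipped.
filter_numstat() {
    awk -F'\t' '$1 != "-" && ($1 + $2) > 0 { printf "%d\t%s\n", $1 + $2, $0 }' |
        sort -t "$(printf '\t')" -k1,1nr |   # numeric, reverse, on the total
        cut -f2-                             # drop the helper column again
}

# Usage (commit names are placeholders):
#   git diff --numstat --ignore-all-space <commit-1> <commit-2> | filter_numstat
```

Prepending the total as a temporary first column keeps the sort key simple
and leaves the original numstat fields untouched in the output.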


* Re: Diff --stat for files that differ only in whitespace
  2020-07-31 13:10 Diff --stat for files that differ only in whitespace Matt Rogers
  2020-07-31 17:32 ` Junio C Hamano
@ 2020-07-31 17:41 ` Jeff King
  2020-07-31 19:26   ` Matt Rogers
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff King @ 2020-07-31 17:41 UTC (permalink / raw)
  To: Matt Rogers; +Cc: Junio C Hamano, Git Mailing List

On Fri, Jul 31, 2020 at 09:10:44AM -0400, Matt Rogers wrote:

> When using a repository with "core.autocrlf=false" I'm trying to run a diff
> between two commits that have many files (~1000) changed that differ mostly
> in line endings or other whitespace.  When I run
> `git diff --stat --ignore-all-space <commit-1> <commit-2>` I'm getting an output
> that has many files listed like:
> 
> some-file.txt | 0

This seemed familiar, so I dug up some prior discussion here:

  https://lore.kernel.org/git/1484704915.2096.16.camel@mattmccutchen.net/

We didn't come to a resolution there, but there is a patch to play with,
and I think nobody was opposed to the notion that with the right
code change we could be suppressing these whitespace-only stat lines.

-Peff


* Re: Diff --stat for files that differ only in whitespace
  2020-07-31 17:41 ` Jeff King
@ 2020-07-31 19:26   ` Matt Rogers
  2020-07-31 19:48     ` Jeff King
  2020-07-31 20:25     ` Junio C Hamano
  0 siblings, 2 replies; 7+ messages in thread
From: Matt Rogers @ 2020-07-31 19:26 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git Mailing List

On Fri, Jul 31, 2020 at 1:41 PM Jeff King <peff@peff.net> wrote:
>
> This seemed familiar, so I dug up some prior discussion here:
>
>   https://lore.kernel.org/git/1484704915.2096.16.camel@mattmccutchen.net/
>
> We didn't come to a resolution there, but there is a patch to play with,
> and I think nobody was opposed to the notion that with the right
> code change we could be suppressing these whitespace-only stat lines
>
> -Peff


I think for now I'm going to feed --numstat into a script like Junio suggested.
Out of curiosity, what would it take to get that patch into git? Is it just a
matter of someone verifying it and submitting it?

--
Matthew Rogers


* Re: Diff --stat for files that differ only in whitespace
  2020-07-31 19:26   ` Matt Rogers
@ 2020-07-31 19:48     ` Jeff King
  2020-07-31 20:25     ` Junio C Hamano
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff King @ 2020-07-31 19:48 UTC (permalink / raw)
  To: Matt Rogers; +Cc: Junio C Hamano, Git Mailing List

On Fri, Jul 31, 2020 at 03:26:16PM -0400, Matt Rogers wrote:

> > This seemed familiar, so I dug up some prior discussion here:
> >
> >   https://lore.kernel.org/git/1484704915.2096.16.camel@mattmccutchen.net/
> >
> > We didn't come to a resolution there, but there is a patch to play with,
> > and I think nobody was opposed to the notion that with the right
> > code change we could be suppressing these whitespace-only stat lines
> >
> > -Peff
> 
> 
> I think for now I'm going to feed --numstat into a script like Junio suggested.
> Out of curiosity what would it take to get that patch into git? Is it just a
> matter of someone just verifying it and submitting it?

Yeah, I think it would require making an argument that the patch covers
the correct set of cases (or fixing it if it doesn't), and probably
adding some tests.

-Peff


* Re: Diff --stat for files that differ only in whitespace
  2020-07-31 19:26   ` Matt Rogers
  2020-07-31 19:48     ` Jeff King
@ 2020-07-31 20:25     ` Junio C Hamano
  2020-07-31 20:43       ` Matt Rogers
  1 sibling, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2020-07-31 20:25 UTC (permalink / raw)
  To: Matt Rogers; +Cc: Jeff King, Git Mailing List

Matt Rogers <mattr94@gmail.com> writes:

> On Fri, Jul 31, 2020 at 1:41 PM Jeff King <peff@peff.net> wrote:
>>
>> This seemed familiar, so I dug up some prior discussion here:
>>
>>   https://lore.kernel.org/git/1484704915.2096.16.camel@mattmccutchen.net/
>>
>> We didn't come to a resolution there, but there is a patch to play with,
>> and I think nobody was opposed to the notion that with the right
>> code change we could be suppressing these whitespace-only stat lines
>>
>> -Peff
>
>
> I think for now I'm going to feed --numstat into a script like Junio suggested.
> Out of curiosity what would it take to get that patch into git? Is it just a
> matter of someone just verifying it and submitting it?

After re-reading the thread, I think it takes a bit more than just
re-reading and testing, because even the author of the patch said
(and I think I still agree with the assessment) that the patch does
one thing that is not exactly what we want it to do.  So we'd need
to tweak it more to do what we want to do, I would suspect.


* Re: Diff --stat for files that differ only in whitespace
  2020-07-31 20:25     ` Junio C Hamano
@ 2020-07-31 20:43       ` Matt Rogers
  0 siblings, 0 replies; 7+ messages in thread
From: Matt Rogers @ 2020-07-31 20:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Git Mailing List

On Fri, Jul 31, 2020 at 4:25 PM Junio C Hamano <gitster@pobox.com> wrote:
>
>
> After re-reading the thread, I think it takes a bit more than just
> re-reading and testing, because even the author of the patch said
> (and I think I still agree with the assessment) that the patch does
> one thing that is not exactly what we want it to do.  So we'd need
> to tweak it more to do what we want to do, I would suspect.

Got it. I'll try to look into this more over the weekend, then, if there are
no problems with that.

-- 
Matthew Rogers

