git.vger.kernel.org archive mirror
* git stash --include-untracked walks ignored directories
From: Brian Malehorn @ 2020-06-09 19:33 UTC (permalink / raw)
  To: git

Hi,

Not sure if this is the right place to send this, but I'm here to
report a performance regression with git stash --include-untracked.

Here's a quick way to reproduce:

1. make a directory with a lot of ignored files

$ find ignored -type f | wc -l
   50000

$ cat .gitignore
ignored

2. touch foo

3. time git stash --include-untracked

git version 2.26.0:
real    0m0.094s

git version 2.27.0.83.g0313f36c6e:
real    0m1.913s

This is a much bigger pain point on my work repo, which has 1.4
million ignored files (!). As you can imagine, git stash takes a long
time to run there. While it might be valid to question why anyone would
need that many files for any purpose, the bottom line is that I told
git to ignore this directory, and it still walked it.

In the meantime I've reverted to 2.26.0 which doesn't have this
performance regression. Let me know if you want any other information
related to this issue.

Thanks,
Brian
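
Before measuring, it may be worth confirming that the ignore rule
really matches the directory, which rules out a .gitignore typo as the
cause of the slowdown. A minimal sketch using git check-ignore; the
helper name is_ignored is hypothetical and not part of the report:

```shell
#!/bin/bash

# Succeeds (exit status 0) when <path> matches an ignore rule in the
# repository that contains the current directory.
is_ignored() {
    git check-ignore -q "$1"
}

# Example, run from the top of the work tree:
#   is_ignored ignored && echo "ignored/ is excluded by an ignore rule"
#   git check-ignore -v ignored   # shows the file, line and pattern that matched
```

If the check succeeds and the stash is still slow, the rule itself is
fine and the time is being spent elsewhere.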


* Re: git stash --include-untracked walks ignored directories
From: Elijah Newren @ 2020-06-10  3:21 UTC (permalink / raw)
  To: Brian Malehorn; +Cc: Git Mailing List

I seem to be missing some important step to reproduce; what else is
needed?  Here's what I see:

<Set path to use git-2.26.0>
$ ./repro.sh
Number of files in ignored before: 50000
Saved working directory and index state WIP on master: e2b0471 initial

real 0m0.029s
user 0m0.014s
sys 0m0.014s
git version 2.26.0
Number of files in ignored after: 50000

<Set path to use git-2.27.0>
$ ./repro.sh
Number of files in ignored before: 50000
Saved working directory and index state WIP on master: 5c596b8 initial

real 0m0.052s
user 0m0.014s
sys 0m0.034s
git version 2.27.0
Number of files in ignored after: 50000


Where repro.sh is:

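#!/bin/bash

# Start from a fresh scratch repository.
rm -rf stupid
git init -q stupid
cd stupid || exit 1

# Commit one tracked file and a .gitignore that excludes "ignored".
echo ignored >.gitignore
seq 1 10 >numbers-tracked
git add numbers-tracked .gitignore
git commit -q -m initial

# Give the stash something to save: a tracked change and an untracked file.
seq 11 20 >>numbers-tracked
seq 21 30 >numbers-untracked

# Fill the ignored directory with 50000 empty files.
mkdir ignored
cd ignored || exit 1
for i in $(seq 1 50000); do >"$i"; done
cd ..

# Time the stash and check that the ignored files were left alone.
echo "Number of files in ignored before: $(find ignored -type f | wc -l)"
time git stash --include-untracked
git --version
echo "Number of files in ignored after: $(find ignored -type f | wc -l)"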
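
The two "<Set path to use git-...>" steps above can be scripted so
that the same repro.sh runs against several builds in turn. A minimal
sketch, assuming each build is installed under its own prefix; the
function name run_with_git and the prefixes in the usage comment are
hypothetical and not part of the original report:

```shell
#!/bin/bash

# Run a command with <bindir> placed first on PATH, so that any `git`
# the command invokes resolves to the build in that directory. The
# function body is a subshell, so the caller's PATH is left untouched.
run_with_git() (
    bindir=$1
    shift
    PATH="$bindir:$PATH"
    export PATH
    "$@"
)

# Example (hypothetical install prefixes):
#   for prefix in "$HOME/git-2.26.0" "$HOME/git-2.27.0"; do
#       run_with_git "$prefix/bin" ./repro.sh
#   done
```

Because repro.sh prints git --version itself, each run's output shows
which build was actually picked up.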


* Re: git stash --include-untracked walks ignored directories
From: Brian Malehorn @ 2020-06-10  5:56 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List

Ah, my original message was a bit misleading. I believe the slowdown
depends on the number of ignored directories, not the number of files.
Here's an updated version of your script that creates a lot of
directories instead of a lot of files:

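#!/bin/bash

# Start from a fresh scratch repository.
rm -rf stupid
git init -q stupid
cd stupid || exit 1

# Commit one tracked file and a .gitignore that excludes "ignored".
echo ignored >.gitignore
seq 1 10 >numbers-tracked
git add numbers-tracked .gitignore
git commit -q -m initial

# Give the stash something to save: a tracked change and an untracked file.
seq 11 20 >>numbers-tracked
seq 21 30 >numbers-untracked

# Create 50 directories under ignored/, each with 1000 empty subdirectories.
mkdir ignored
cd ignored || exit 1
for i in $(seq 1 50); do
  for j in $(seq 1 1000); do
    echo "$i/$j"
  done
done | xargs mkdir -p
cd ..

# Time the stash and check that the ignored directories were left alone.
echo "Number of directories in ignored before: $(find ignored -type d | wc -l)"
time git stash --include-untracked
git --version
echo "Number of directories in ignored after: $(find ignored -type d | wc -l)"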

----

I got it to reproduce with the versions I have handy:

$ ./repro.sh
Number of directories in ignored before: 50051
Saved working directory and index state WIP on master: 0dabddf initial

real    0m0.023s
user    0m0.012s
sys     0m0.010s
git version 2.25.1
Number of directories in ignored after: 50051


$ ./repro.sh
Number of directories in ignored before: 50051
Saved working directory and index state WIP on master: 175d4ce initial

real    0m0.619s
user    0m0.157s
sys     0m0.452s
git version 2.27.0.83.g0313f36c6e
Number of directories in ignored after: 50051
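
To check whether the time really tracks the directory count, the tree
size can be made a parameter. A minimal sketch; the helper names
make_ignored_tree and count_dirs are hypothetical, and the count
includes the root itself, which is why the runs above report
1 + 50 + 50*1000 = 50051:

```shell
#!/bin/bash

# Create <outer> directories under <root>, each holding <inner> empty
# subdirectories. Paths must not contain whitespace, because the names
# are passed through xargs.
make_ignored_tree() {
    root=$1 outer=$2 inner=$3
    mkdir -p "$root"
    for i in $(seq 1 "$outer"); do
        for j in $(seq 1 "$inner"); do
            echo "$root/$i/$j"
        done
    done | xargs mkdir -p
}

# Count directories the same way the script above does (includes <root>).
count_dirs() {
    find "$1" -type d | wc -l | tr -d ' '
}

# Example, inside a scratch repository whose .gitignore lists "ignored"
# and which has a pending change to stash:
#   for inner in 10 100 1000; do
#       rm -rf ignored
#       make_ignored_tree ignored 50 "$inner"
#       echo "dirs: $(count_dirs ignored)"
#       time git stash --include-untracked
#       git stash pop -q   # restore the change for the next run
#   done
```

If the regression is tied to directory traversal, the reported time
should grow with the directory count on the affected version and stay
roughly flat on the older one.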
