From: Kjetil Barvik <barvik@broadpark.no>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Bevan Watkiss <bevan.watkiss@cloakware.com>,
    'Alex Riesen' <raa.lkml@gmail.com>,
    Git Mailing List <git@vger.kernel.org>
Subject: 'git checkout' and unlink() calls (was: Re: )
Date: Fri, 08 May 2009 18:47:46 +0200
Message-ID: <86y6t77d8t.fsf_-_@broadpark.no>
In-Reply-To: <alpine.LFD.2.01.0905071446500.4983@localhost.localdomain>
* Linus Torvalds <torvalds@linux-foundation.org> writes:
| So here's a better patch. It should cut down the 'lstat()' calls from
| "git checkout" a lot.
|
| It looks obvious enough, and it passes testing (and now "git checkout"
| only does about as many lstat's as there are files in the repository,
| and they seem to all be properly asynchronous if 'core.preloadindex'
| is set.
I did a test by switching from v2.6.27 to v2.6.25, and now the only
"lstat()-difference" between running with and without the -q option is
2 extra lstat() calls made without -q. Compared to over 41 000 lstat()
calls in total, that is not noticeable. Very good!
| Somebody should check. It would be interesting to hear about whether
| this makes a performance impact, especially with slow filesystems
| and/or other operating systems that have a relatively higher cost for
| 'lstat()'.
Below is a table produced by running
    strace -o result -T git checkout my-v2.6.25   /* from my-v2.6.27 */
and passing the "result" file through a perl script to pretty-print it:
TOTAL       113988 100.000%  OK:107252  NOT:  6736  6.263578 sec   55 usec/call
lstat64      41114  36.069%  OK: 35829  NOT:  5285  0.710936 sec   17 usec/call
open         15027  13.183%  OK: 13872  NOT:  1155  0.559302 sec   37 usec/call
unlink       14379  12.614%  OK: 14374  NOT:     5  3.720167 sec  259 usec/call
write        14207  12.464%  OK: 14207  NOT:     0  0.754196 sec   53 usec/call
close        13872  12.170%  OK: 13872  NOT:     0  0.185572 sec   13 usec/call
fstat64      13862  12.161%  OK: 13862  NOT:     0  0.169952 sec   12 usec/call
rmdir          551   0.483%  OK:   269  NOT:   282  0.035534 sec   64 usec/call
brk            510   0.447%  OK:   510  NOT:     0  0.014804 sec   29 usec/call
mkdir          174   0.153%  OK:   174  NOT:     0  0.102625 sec  590 usec/call
mmap2          102   0.089%  OK:   102  NOT:     0  0.001725 sec   17 usec/call
read            68   0.060%  OK:    68  NOT:     0  0.000999 sec   15 usec/call
munmap          61   0.054%  OK:    61  NOT:     0  0.005037 sec   83 usec/call
access          20   0.018%  OK:    12  NOT:     8  0.000348 sec   17 usec/call
mprotect        13   0.011%  OK:    13  NOT:     0  0.000193 sec   15 usec/call
stat64           7   0.006%  OK:     7  NOT:     0  0.000109 sec   16 usec/call
getcwd           3   0.003%  OK:     3  NOT:     0  0.000053 sec   18 usec/call
chdir            3   0.003%  OK:     3  NOT:     0  0.000048 sec   16 usec/call
fcntl64          3   0.003%  OK:     3  NOT:     0  0.000036 sec   12 usec/call
rename           2   0.002%  OK:     2  NOT:     0  0.001553 sec  776 usec/call
setitimer        2   0.002%  OK:     2  NOT:     0  0.000028 sec   14 usec/call
getdents64       2   0.002%  OK:     2  NOT:     0  0.000039 sec   20 usec/call
uname            1   0.001%  OK:     1  NOT:     0  0.000013 sec   13 usec/call
time             1   0.001%  OK:     1  NOT:     0  0.000011 sec   11 usec/call
futex            1   0.001%  OK:     1  NOT:     0  0.000013 sec   13 usec/call
readlink         1   0.001%  OK:     0  NOT:     1  0.000018 sec   18 usec/call
execve           1   0.001%  OK:     1  NOT:     0  0.000256 sec  256 usec/call
getrlimit        1   0.001%  OK:     1  NOT:     0  0.000011 sec   11 usec/call
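For anyone who wants to reproduce a table like this: the perl script itself is not included here, so the following is only a rough sketch, in C, of the per-line parsing such a tool would need. It assumes the usual "strace -T" line shape of "name(args) = retval <seconds>" and treats a return value starting with "-1" as a failed call (the NOT column). The function name parse_strace_line() is made up for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not the perl script from this mail): parse one
 * line of `strace -T` output, for example
 *     unlink("foo") = 0 <0.000259>
 * Returns 1 and fills in name/ok/secs on success, or 0 if the line does
 * not look like a completed system call.
 */
static int parse_strace_line(const char *line, char *name, size_t name_len,
			     int *ok, double *secs)
{
	const char *paren = strchr(line, '(');	/* end of the syscall name */
	const char *lt = strrchr(line, '<');	/* start of the -T timing */
	const char *eq = NULL, *p = line;
	size_t n;

	if (!paren || !lt || paren == line)
		return 0;

	/* The return value follows the last ") = " before the timing field. */
	while ((p = strstr(p, ") = ")) != NULL && p < lt) {
		eq = p;
		p++;
	}
	if (!eq)
		return 0;

	n = (size_t)(paren - line);
	if (n >= name_len)
		return 0;
	memcpy(name, line, n);
	name[n] = '\0';

	/* Assumption: a failed call is reported as "-1 ERRNO (text)". */
	*ok = strncmp(eq + 4, "-1", 2) != 0;
	/* Seconds spent inside the call, as printed between < and >. */
	*secs = strtod(lt + 1, NULL);
	return 1;
}
```

A caller would read the "result" file line by line, skip lines for which this returns 0, and keep a per-name count, OK/NOT count and time sum; the usec/call column is then the time sum divided by the count.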
So, if the numbers from strace are trustworthy, 0.71 seconds is spent on
41 114 calls to lstat64(). But look at the unlink line: each call took
259 microseconds on average (= 0.259 milliseconds), and all 14 379
calls together took 3.72 seconds.
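To put the unlink line in proportion, here is a small sketch of the arithmetic, using only numbers from the table above. The helper names are made up for this example. With these inputs unlink() comes out at roughly 59% of the total traced time, even though it is only 12.6% of the calls.

```c
#include <stdio.h>

/* Average cost of one call, in microseconds. */
static double usec_per_call(double total_sec, long calls)
{
	return calls > 0 ? total_sec * 1e6 / (double)calls : 0.0;
}

/* `part` as a percentage of `whole`. */
static double percent_of(double part, double whole)
{
	return whole > 0.0 ? 100.0 * part / whole : 0.0;
}

/* Print the unlink row relative to the TOTAL row of the table. */
static void report_unlink_cost(void)
{
	const long unlink_calls = 14379, total_calls = 113988;
	const double unlink_sec = 3.720167, total_sec = 6.263578;

	printf("unlink: %.0f usec/call, %.1f%% of calls, %.1f%% of time\n",
	       usec_per_call(unlink_sec, unlink_calls),
	       percent_of((double)unlink_calls, (double)total_calls),
	       percent_of(unlink_sec, total_sec));
}
```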
It should be noted that when switching branches the other way (from .25
to .27), the unlink() calls took less time (below 160 microseconds
each). Also note that the above is based on only 3 runs, with a warm
cache, on an ext4 disk partition, with git compiled with the USE_NSEC=1
option.
Most (all?) of the unlink() calls seem to come from the following lines
in the checkout_entry() function in entry.c:
	/*
	 * We unlink the old file, to get the new one with the
	 * right permissions (including umask, which is nasty
	 * to emulate by hand - much easier to let the system
	 * just do the right thing)
	 */
	if (S_ISDIR(st.st_mode)) {
		/* If it is a gitlink, leave it alone! */
		if (S_ISGITLINK(ce->ce_mode))
			return 0;
		if (!state->force)
			return error("%s is a directory", path);
		remove_subtree(path);
	} else if (unlink(path))
		return error("unable to unlink old '%s' (%s)", path, strerror(errno));
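As a side note on why that comment wants the unlink(): open() with O_CREAT only applies the requested mode (filtered through the umask) when it actually creates the file; an existing file keeps whatever permissions it already had. The sketch below is not git code, just a stand-alone illustration of that behaviour; recreate_and_get_mode() is a name invented for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from entry.c): (re)create `path` asking for
 * `mode`, optionally unlinking any existing file first. Returns the
 * permission bits the file ends up with, or -1 on error.
 */
static int recreate_and_get_mode(const char *path, mode_t mode, int unlink_first)
{
	struct stat st;
	int fd;

	/* A missing file is fine here; any other unlink() failure is not. */
	if (unlink_first && unlink(path) && errno != ENOENT)
		return -1;

	/* `mode` is only used if this call creates the file. */
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, mode);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st)) {
		close(fd);
		return -1;
	}
	close(fd);
	return (int)(st.st_mode & 07777);
}
```

With a umask of 022 and an existing file of mode 0600, calling this with mode 0666 and unlink_first == 0 should leave the file at 0600, while unlink_first == 1 should give 0644. That is the "let the system do the right thing" effect, and the price is one unlink() per rewritten file.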
-- kjetil